Literature mining on pharmacokinetics numerical data: A feasibility study

Zhiping Wang1, Seongho Kim1, Sara K. Quinney1, Yingying Guo2, Stephen D. Hall2, Luis M. Rocha3,4, Lang Li1,*

1Biostatistics, School of Medicine, Indiana University, USA
2Eli Lilly and Company, Indianapolis, IN, USA
3School of Informatics, Indiana University, USA
4FLAD Computational Biology Collaboratorium, Instituto Gulbenkian de Ciencia, Portugal
*To whom correspondence should be addressed.

Citation: Z. Wang, S. Kim, S.K. Quinney, Y. Guo, S.D. Hall, L.M. Rocha, and L. Li [2009]. "Literature mining on pharmacokinetics numerical data: A feasibility study". Journal of Biomedical Informatics. 42 (4): 726-735. doi:10.1016/j.jbi.2009.03.010

The full text and PDF reprint are available from the Journal of Biomedical Informatics site. Because of the mathematical notation and graphics, only the abstract is presented here.


A feasibility study of literature mining is conducted on numerical drug pharmacokinetic (PK) parameter data using a sequential mining strategy. First, an entity template library is built to retrieve pharmacokinetics-relevant articles. Then a set of tagging and extraction rules is applied to retrieve PK data from the article abstracts. To estimate the population-average mean and between-study variance of each PK parameter, a linear mixed meta-analysis model and an EM algorithm are developed to describe the probability distributions of PK parameters. Finally, a cross-validation procedure is developed to identify false-positive mining results. Using this approach to mine midazolam (MDZ) PK data, an 88% precision rate and a 92% recall rate are achieved, giving an F-score of 90%. It greatly outperforms a conventional data mining approach (support vector machine), which has an F-score of 68.1%. Further investigation of 7 more drugs reveals comparable performance of our sequential mining approach.
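As a rough illustration of two quantities named in the abstract, the sketch below computes the balanced F-score from precision and recall, and fits a generic random-effects meta-analysis model by EM. The abstract does not give the paper's model details, so the model form (study mean = mu + between-study effect + within-study error with known variance), the function names `f_score` and `em_random_effects`, and any input values passed to the EM routine are assumptions made for illustration only, not the authors' implementation.

```python
def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def em_random_effects(means, variances, n_iter=200):
    """EM for an assumed random-effects model (hypothetical sketch):

        y_i = mu + b_i + e_i,  b_i ~ N(0, tau2),  e_i ~ N(0, variances[i])

    means      -- reported study-level means y_i
    variances  -- known within-study variances of those means (all > 0)
    Returns (mu, tau2): population-average mean and between-study variance.
    """
    weights = [1.0 / v for v in variances]
    # Start mu at the inverse-variance weighted mean; tau2 at an arbitrary
    # positive value.
    mu = sum(w * y for w, y in zip(weights, means)) / sum(weights)
    tau2 = 1.0
    for _ in range(n_iter):
        # E-step: posterior mean and variance of each study effect b_i.
        shrink = [tau2 / (tau2 + v) for v in variances]
        post_mean = [k * (y - mu) for k, y in zip(shrink, means)]
        post_var = [k * v for k, v in zip(shrink, variances)]
        # M-step: update mu and tau2 from the expected complete-data
        # log-likelihood.
        mu = sum(w * (y - m)
                 for w, y, m in zip(weights, means, post_mean)) / sum(weights)
        tau2 = sum(m * m + s
                   for m, s in zip(post_mean, post_var)) / len(means)
    return mu, tau2
```

With the reported midazolam figures, `f_score(0.88, 0.92)` evaluates to about 0.90, which agrees with the F-score stated above. The EM routine takes study-level means and their variances; any values supplied to it here would be synthetic, since the abstract reports no individual study data.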

Keywords: Clearance; Data mining; Entity recognition; Information extraction; Linear mixed model; Midazolam; Pharmacokinetics.

For more information contact Luis Rocha.
Last Modified: October 27, 2009