Citation: Z. Wang, S. Kim, S.K. Quinney, Y. Guo, S.D. Hall, L.M. Rocha, and L. Li. "Literature mining on pharmacokinetics numerical data: A feasibility study". Journal of Biomedical Informatics. 42 (4): 726-735. doi:10.1016/j.jbi.2009.03.010
The full text and PDF reprint are available from the Journal of Biomedical Informatics site. Because of the mathematical notation and graphics, only the abstract is presented here.
A feasibility study of literature mining is conducted on drug PK parameter numerical data using a sequential mining strategy. First, an entity template library is built to retrieve pharmacokinetics-relevant articles. Then a set of tagging and extraction rules is applied to retrieve PK data from the article abstracts. To estimate the PK parameter population-average mean and between-study variance, a linear mixed meta-analysis model and an E–M algorithm are developed to describe the probability distributions of PK parameters. Finally, a cross-validation procedure is developed to ascertain false-positive mining results. Using this approach to mine midazolam (MDZ) PK data, an 88% precision rate and a 92% recall rate are achieved, with an F-score of 90%. It greatly outperforms a conventional data mining approach (support vector machine), which has an F-score of 68.1%. Further investigation of 7 more drugs reveals comparable performance of our sequential mining approach.
Keywords: Clearance; Data mining; Entity recognition; Information extraction; Linear mixed model; Midazolam; Pharmacokinetics.
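As a rough illustration of two quantities mentioned in the abstract, the Python sketch below (1) recomputes the F-score from the reported precision and recall and (2) runs a textbook E–M iteration for a simple random-effects meta-analysis. The second part is not the authors' model, whose details are in the full text. It assumes each study reports one estimate with a known, positive sampling variance, and the function names and input values are hypothetical, chosen only for the example.

```python
import numpy as np


def f_score(precision, recall):
    """Harmonic mean of precision and recall (the balanced F-score)."""
    return 2.0 * precision * recall / (precision + recall)


def em_random_effects(y, s2, tol=1e-10, max_iter=10_000):
    """E-M estimates of (mu, tau2) for y_i ~ N(theta_i, s2_i), theta_i ~ N(mu, tau2).

    y  : per-study estimates of a PK parameter (hypothetical input)
    s2 : within-study sampling variances, assumed known and > 0
    """
    y = np.asarray(y, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    mu, tau2 = y.mean(), y.var()  # simple starting values
    for _ in range(max_iter):
        # E-step: posterior mean and variance of each study-level effect theta_i
        shrink = tau2 / (tau2 + s2)
        post_mean = mu + shrink * (y - mu)
        post_var = shrink * s2
        # M-step: update the population mean and the between-study variance
        new_mu = post_mean.mean()
        new_tau2 = np.mean((post_mean - new_mu) ** 2 + post_var)
        converged = abs(new_mu - mu) < tol and abs(new_tau2 - tau2) < tol
        mu, tau2 = new_mu, new_tau2
        if converged:
            break
    return mu, tau2


# Precision and recall reported for MDZ in the abstract
print(round(f_score(0.88, 0.92), 2))  # 0.9, consistent with the reported 90%

# Made-up study-level values, for illustration only (not data from the paper)
mu_hat, tau2_hat = em_random_effects([1.0, 2.0, 3.0, 4.0, 5.0], [0.5] * 5)
print(mu_hat, tau2_hat)
```

When all sampling variances are equal, as in the made-up call above, the maximum-likelihood solution has a closed form (the sample mean, and the population variance of the estimates minus the common sampling variance, floored at zero), so the iteration should settle near 3.0 and 1.5. That gives a quick sanity check on the update equations.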