Sun Kim
Faculty Title
Associate Professor of Informatics
Research Statement
My current research interests lie in string-pattern matching, combinatorial search, and data mining, with primary emphasis on their applications to bioinformatics. Problems in biology are computationally hard and require handling huge amounts of data, so it is crucial to develop efficient and scalable algorithms to tackle them. For years, I have been developing new algorithms in string-pattern matching, combinatorial search, and data mining, and I have successfully combined these with existing algorithms to tackle two hard problems in biology: the shotgun sequence assembly problem and the multiple genome comparison problem.
These two problems can be formulated as search problems over a huge amount of biological sequence data. Techniques from string-pattern matching and data mining, if properly used, can significantly reduce the search space. User interactions are essential for the biological data mining process. Thus, to tackle these biological problems, tools and techniques should be integrated in an exploratory framework, where user interactions can guide searches. My research will focus on building exploratory search frameworks for these two problems in biology while developing new algorithms in string matching and data mining as component tools for the frameworks.
Selected Publications
- A special issue on Advanced Signal Processing Techniques for Bioinformatics in EURASIP Journal on Applied Signal Processing, Guest Editor (with Xue-wen Chen, Vladimir Pavlovic, and David Casasent), 2006
- Sun Kim, Zhiping Wang, and Mehmet Dalkilic, ``iGibbs: Improving Gibbs Motif Sampler for Proteins by Sequence Clustering and Iterative Pattern Refinement,'' Proteins: Structure, Function, and Bioinformatics, in press
- Guangyu Chen, Jeong-Hyeon Choi, Bin Song, John Chmura, GQ Zhang, Anthony K.H. Tung, Jaewoo Kang, Sun Kim, and Jiong Yang, ``ARCS: An Aggregated Related Column Scoring Scheme for Aligned Sequences,'' Bioinformatics, in press
- Lang Li, Alfred S. Cheng, Victor X. Jin, Henry H. Paik, Meiyun Fan, Xiaoman Li, Wei Zhang, Jason Robarge, Ramana V. Davuluri, Sun Kim, Tim H.-M. Huang, Kenneth P. Nephew, ``A Mixture Model Based Discriminate Analysis for Identifying New Ordered Motif Pairs in Gene Motif Modules Directly Regulated by Estrogen Receptor-alpha,'' Bioinformatics, in press
- Sun Kim and Jason Lee, ``BAG: A Graph Theoretic Sequence Clustering Algorithm,'' International Journal of Data Mining and Bioinformatics, Vol 1 No 2, 2006, in press
- Susan H. Wei, Curtis Balch, Henry H. Paik, Yoo-Sung Kim, Rae Lynn Baldwin, Sandya Liyanarachchi, Lang Li, Zailong Wang, Joseph C. Wan, Ramana V. Davuluri, Beth Y. Karlan, Gillian Gifford, Robert Brown, Sun Kim, Tim H-M. Huang, and Kenneth P. Nephew, ``DNA Biomarkers Possessing Methylation-Predictive Sequence Patterns in Ovarian Cancer,'' Clinical Cancer Research, 2006 12: 2788-2794.
- Tsukahara Takuma, Sun Kim, and Milton Taylor, ``REFINEMENT: An Iterative Discriminant Model Refinement Approach to Regulatory Sequence Detection,'' Journal of Computational Biology and Chemistry, Volume 30, Issue 2, April 2006, Pages 134-147
- Kwangmin Choi, Youngik Yang, and Sun Kim, ``CGAS: a Comparative Genome Annotation System,'' Comparative Genomics in the Methods in Molecular Biology series, Nicholas Bergman (edited), to appear, Humana Press, 2006
- Sun Kim, JeoungHyeon Choi, Amit Saple, and Jiong Yang, ``A Hybrid Gene Team Model and Its Application to Genome Analysis,'' Journal of Bioinformatics and Computational Biology, Imperial College Press, to appear
- DoHoon Lee, Jeong-Hyeon Choi, Mehmet Dalkilic, and Sun Kim, ``COMPAM: Visualization of Combining Pairwise alignment for Multiple Genomes,'' Bioinformatics, 26(2):242-244, January 2006.
- Jeong-Hyeon Choi, Hwan-Gue Cho, and Sun Kim, ``GAME: A Simple and Efficient Whole Genome Alignment Method Using Maximal Exact Match Filtering,'' Journal of Computational Biology and Chemistry, 29(3):244-253, July 2005
- JeongHyeon Choi, Kwangmin Choi, Hwan-Gue Cho, and Sun Kim, ``Multiple Genome Alignment by Clustering Pairwise Matches,'' Jens Lagergren (Ed.): Comparative Genomics, RECOMB 2004 International Workshop, RCG 2004, Bertinoro, Italy, October 16-19, 2004, Revised Selected Papers. Lecture Notes in Computer Science 3388, Springer, 2005
- Kwangmin Choi and Sun Kim, ``Comparative Genome Annotation Systems,'' Chapter 16 in Advanced Data Mining Technologies in Bioinformatics, edited by Hui-Huang Hsu, Idea Group, Inc, 2006
- Curtis Balch, John S. Montgomery, Hyun-il Paik, Sun Kim, Tim H-M Huang, and Kenneth P. Nephew, ``New Anti-cancer strategies: Epigenetic Therapies and Biomarkers,'' Frontiers in Bioscience, 10, 1897-1931, May 1, 2005
- Kwangmin Choi, Yu Ma, JeongHyun Choi, and Sun Kim, ``PLATCOM: A Platform for Computational Comparative Genomics,'' Bioinformatics 2005 21:2514-2516, Oxford University Press.
- Kan Nobuta, Tom Ashfield, Sun Kim, and Roger W. Innes, ``Diversification of non-TIR class R genes in relation to whole genome duplication events in Arabidopsis,'' Molecular Plant-Microbe Interactions, Vol. 18, No. 2, 2005, pp. 103-109.
Old stuff
- Derek W. Wood, Joao C. Setubal, Rajinder Kaul, Dave E. Monks, Joao P. Kitajima, Vagner K. Okura, Yang Zhou, Lishan Chen, Gwendolyn E. Wood, Nalvo F. Almeida Jr., Lisa Woo, Yuching Chen, Ian T. Paulsen, Jonathan A. Eisen, Peter D. Karp, Donald Bovee Sr., Peter Chapman, James Clendenning, Glenda Deatherage, Will Gillet, Charles Grant, Tatyana Kutyavin, Ruth Levy, Meng-Jin Li, Erin McClelland, Anthony Palmieri, Christopher Raymond, Gregory Rouse, Channakhone Saenphimmachak, Zaining Wu, Pedro Romero, David Gordon, Shiping Zhang, Heayun Yoo, Yumin Tao, Phyllis Biddle, Mark Jung, William Krespan, Michael Perry, Bill Gordon-Kamm, Li Liao, Sun Kim, Carol Hendrick, Zuo-Yu Zhao, Maureen Dolan, Forrest Chumley, Scott V. Tingey, Jean-Francois Tomb, Milton P. Gordon, Maynard V. Olson, and Eugene W. Nester, ``The Genome of Agrobacterium tumefaciens C58: Insights into the evolution and biology of a natural genetic engineer,'' Science, Dec 14 2001: 2317-2323
- Sun Kim, ``A New String Matching Algorithm Using Partitioning and Hashing Efficiently,'' The ACM Journal of Experimental Algorithmics, Vol 4, 1999, http://www.jea.acm.org/1999/KimString
- Sun Kim and Alberto Maria Segre, ``AMASS: A Structured Pattern Matching Approach to Shotgun Sequence Assembly,'' Journal of Computational Biology, Vol 6 (2), Mary Ann Liebert Press (Summer 1999), pp. 163-186.
- Sun Kim and Hantao Zhang, ``ModGen: Theorem Proving by Model Generation'' Proceedings of National Conference on Artificial Intelligence (AAAI), 1994, Seattle, WA. MIT Press, pp. 162-167, VERY OLD, BUT I STILL WANT TO EXPLORE, BUT NO TIME!
From recent papers