Indiana University Bloomington

School of Informatics and Computing




Sun Kim

Faculty Title

Associate Professor of Informatics

 

Research Statement

My current research interests lie in string-pattern matching, combinatorial search, and data mining, with primary emphasis on their applications to bioinformatics. Problems in biology are computationally hard and require handling huge amounts of data, so it is crucial to develop efficient and scalable algorithms to tackle them. For years, I have been developing new algorithms in string-pattern matching, combinatorial search, and data mining, and I have successfully combined these with existing algorithms to tackle two hard problems in biology: the shotgun sequence assembly problem and the multiple genome comparison problem.

These two problems can be formulated as search problems over a huge amount of biological sequence data. Techniques from string-pattern matching and data mining, if properly used, can significantly reduce the search space. User interactions are essential for the biological data mining process. Thus, to tackle these biological problems, tools and techniques should be integrated in an exploratory framework, where user interactions can guide searches. My research will focus on building exploratory search frameworks for these two problems in biology while developing new algorithms in string matching and data mining as component tools for the frameworks.

Selected Publications

       

      From recent papers

    1. A special issue on Advanced Signal Processing Techniques for Bioinformatics in EURASIP Journal on Applied Signal Processing, Guest Editor (with Xue-wen Chen, Vladimir Pavlovic, and David Casasent), 2006
    2.  

       

    3. Sun Kim, Zhiping Wang, and Mehmet Dalkilic, ``iGibbs: Improving Gibbs Motif Sampler for Proteins by Sequence Clustering and Iterative Pattern Refinement,'' Proteins: Structure, Function, and Bioinformatics, in press

       

       

       

    4. Guangyu Chen, Jeong-Hyeon Choi, Bin Song, John Chmura, GQ Zhang, Anthony K.H. Tung, Jaewoo Kang, Sun Kim, and Jiong Yang, ``ARCS: An Aggregated Related Column Scoring Scheme for Aligned Sequences,'' Bioinformatics, in press
    5. Lang Li, Alfred S. Cheng, Victor X. Jin, Henry H. Paik, Meiyun Fan, Xiaoman Li, Wei Zhang, Jason Robarge, Ramana V. Davuluri, Sun Kim, Tim H.-M. Huang, Kenneth P. Nephew, ``A Mixture Model Based Discriminate Analysis for Identifying New Ordered Motif Pairs in Gene Motif Modules Directly Regulated by Estrogen Receptor-alpha,'' Bioinformatics, in press

       

       

       

    6. Sun Kim and Jason Lee, ``BAG: A Graph Theoretic Sequence Clustering Algorithm,'' International Journal of Data Mining and Bioinformatics, Vol 1 No 2, 2006, in press

       

       

       

    7. Susan H. Wei, Curtis Balch, Henry H. Paik, Yoo-Sung Kim, Rae Lynn Baldwin, Sandya Liyanarachchi, Lang Li, Zailong Wang, Joseph C. Wan, Ramana V. Davuluri, Beth Y. Karlan, Gillian Gifford, Robert Brown, Sun Kim, Tim H-M. Huang, and Kenneth P. Nephew, ``DNA Biomarkers Possessing Methylation-Predictive Sequence Patterns in Ovarian Cancer,'' Clinical Cancer Research, 2006 12: 2788-2794.

       

       

       

    8. Tsukahara Takuma, Sun Kim, and Milton Taylor, ``REFINEMENT: An Iterative Discriminant Model Refinement Approach to Regulatory Sequence Detection,'' Journal of Computational Biology and Chemistry, Volume 30, Issue 2, April 2006, Pages 134-147

       

       

       

    9. Kwangmin Choi, Youngik Yang, and Sun Kim, ``CGAS: a Comparative Genome Annotation System,'' Comparative Genomics in the Methods in Molecular Biology series, edited by Nicholas Bergman, Humana Press, 2006, to appear

       

       

       

    10. Sun Kim, JeoungHyeon Choi, Amit Saple, and Jiong Yang, ``A Hybrid Gene Team Model and Its Application to Genome Analysis,'' Journal of Bioinformatics and Computational Biology, Imperial College Press, to appear

       

       

       

    11. DoHoon Lee, Jeong-Hyeon Choi, Mehmet Dalkilic, and Sun Kim, ``COMPAM: Visualization of Combining Pairwise Alignment for Multiple Genomes,'' Bioinformatics, 22(2):242-244, January 2006.

       

       

       

    12. Jeong-Hyeon Choi, Hwan-Gue Cho and Sun Kim, ``GAME: A Simple and Efficient Whole Genome Alignment Method Using Maximal Exact Match Filtering,'' Journal of Computational Biology and Chemistry, 29(3):244-253, July 2005

       

       

    13. JeongHyeon Choi, Kwangmin Choi, Hwan-Gue Cho, and Sun Kim, ``Multiple Genome Alignment by Clustering Pairwise Matches,'' Jens Lagergren (Ed.): Comparative Genomics, RECOMB 2004 International Workshop, RCG 2004, Bertinoro, Italy, October 16-19, 2004, Revised Selected Papers. Lecture Notes in Computer Science 3388, Springer, 2005

       

    14. Kwangmin Choi and Sun Kim, ``Comparative Genome Annotation Systems,'' Chapter 16 in Advanced Data Mining Technologies in Bioinformatics, edited by Hui-Huang Hsu, Idea Group, Inc, 2006

       

       

       

    15. Curtis Balch, John S. Montgomery, Hyun-il Paik, Sun Kim, Tim H-M Huang, and Kenneth P. Nephew, ``New Anti-cancer strategies: Epigenetic Therapies and Biomarkers,'' Frontiers in Bioscience, 10, 1897-1931, May 1, 2005

       

       

       

    16. Kwangmin Choi, Yu Ma, JeongHyun Choi, and Sun Kim, ``PLATCOM: A Platform for Computational Comparative Genomics,'' Bioinformatics 2005 21:2514-2516, Oxford University Press.
    17. Kan Nobuta, Tom Ashfield, Sun Kim, and Roger W. Innes, ``Diversification of non-TIR class R genes in relation to whole genome duplication events in Arabidopsis,'' Molecular Plant-Microbe Interactions, Vol. 18, No. 2, 2005, pp. 103-109.

       

       



      Old stuff

    18. Derek W. Wood, Joao C. Setubal, Rajinder Kaul, Dave E. Monks, Joao P. Kitajima, Vagner K. Okura, Yang Zhou, Lishan Chen, Gwendolyn E. Wood, Nalvo F. Almeida Jr., Lisa Woo, Yuching Chen, Ian T. Paulsen, Jonathan A. Eisen, Peter D. Karp, Donald Bovee Sr., Peter Chapman, James Clendenning, Glenda Deatherage, Will Gillet, Charles Grant, Tatyana Kutyavin, Ruth Levy, Meng-Jin Li, Erin McClelland, Anthony Palmieri, Christopher Raymond, Gregory Rouse, Channakhone Saenphimmachak, Zaining Wu, Pedro Romero, David Gordon, Shiping Zhang, Heayun Yoo, Yumin Tao, Phyllis Biddle, Mark Jung, William Krespan, Michael Perry, Bill Gordon-Kamm, Li Liao, Sun Kim, Carol Hendrick, Zuo-Yu Zhao, Maureen Dolan, Forrest Chumley, Scott V. Tingey, Jean-Francois Tomb, Milton P. Gordon, Maynard V. Olson, and Eugene W. Nester, ``The Genome of Agrobacterium tumefaciens C58: Insights into the evolution and biology of a natural genetic engineer,'' Science, Dec 14 2001: 2317-2323
    19. Sun Kim, ``A New String Matching Algorithm Using Partitioning and Hashing Efficiently,'' The ACM Journal of Experimental Algorithmics, Vol 4, 1999
      http://www.jea.acm.org/1999/KimString
    20. Sun Kim and Alberto Maria Segre, ``AMASS: A Structured Pattern Matching Approach to Shotgun Sequence Assembly,'' Journal of Computational Biology, Vol 6 (2), Mary Ann Liebert Press (Summer 1999), pp. 163-186.

       

       

       

    21. Sun Kim and Hantao Zhang, ``ModGen: Theorem Proving by Model Generation,'' Proceedings of the National Conference on Artificial Intelligence (AAAI), 1994, Seattle, WA. MIT Press, pp. 162-167. VERY OLD, BUT I STILL WANT TO EXPLORE, BUT NO TIME!

 
