CATPA (Curation and Alignment Tool for Protein Analysis), at www.catpa.org, is a standalone Java application for managing, visualizing, and querying protein families. Residues, including deletions, can be annotated with images, text, and URLs. The project is with Andrew Albrecht and James Costello.
Indigene (Integration and Discovery in Gene Networks), at www.indigene.org, is a systems biology project that brings together high-throughput gene product screenings for Drosophila to build complex networks. The project is with James Costello, Justen Andrews, John Colbourne, and Brian Eads from the DGRC (Drosophila Genome Resource Center); Rupali Patwardhan and Sumit Middha from the CGB (Center for Genomics and Bioinformatics); and others.
In this work we are studying how to distinguish semantically meaningless (inauthentic) documents from meaningful (authentic) ones. Please visit www.inauthentic.org to read about and use our system. Our website has received more than 70K hits. The project is with Wyatt Clarke, James Costello, and Predrag Radivojac.
This application discovers motifs using a form of De Bruijn graphs that we call "approximate" De Bruijn graphs. Protein sequences are used to build an approximate De Bruijn graph (ADBG), and conserved paths that correspond to motifs are then identified. This work is supported in part by Microsoft. The results are currently under submission. We have extended this to deal with a
. The project is with Rupali Patwardhan, Haixu Tang, and Sun Kim.
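To make the idea of conserved paths concrete, here is a minimal sketch in Python. It builds an ordinary (exact) De Bruijn graph over k-mers of protein sequences and reports paths whose edges occur in many sequences. This page does not define how the "approximate" variant works, so nothing here should be read as the ADBG method itself; the function names, the value of k, and the support threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k=3):
    """Map each edge (k-mer, next k-mer) to the number of sequences containing it.

    Hypothetical helper: this is an exact De Bruijn graph, not the
    approximate variant (ADBG) described in the text.
    """
    edge_support = defaultdict(set)
    for idx, seq in enumerate(sequences):
        # Each (k+1)-mer in the sequence contributes one edge.
        for i in range(len(seq) - k):
            left = seq[i:i + k]
            right = seq[i + 1:i + k + 1]
            edge_support[(left, right)].add(idx)
    return {edge: len(ids) for edge, ids in edge_support.items()}


def conserved_paths(edges, min_support):
    """Spell out unbranched paths made only of well-supported edges.

    Assumption: "conserved" is modeled as "present in at least
    min_support input sequences". Paths that form a pure cycle have
    no starting node and are skipped by this simple walk.
    """
    kept = {edge for edge, n in edges.items() if n >= min_support}
    successors = defaultdict(list)
    has_predecessor = set()
    for left, right in kept:
        successors[left].append(right)
        has_predecessor.add(right)

    paths = []
    for start in sorted(successors):
        if start in has_predecessor:
            continue  # only begin walking at the head of a path
        node, motif, seen = start, start, {start}
        # Follow the path while it does not branch or revisit a node.
        while len(successors.get(node, [])) == 1 and successors[node][0] not in seen:
            node = successors[node][0]
            seen.add(node)
            motif += node[-1]
        paths.append(motif)
    return paths


if __name__ == "__main__":
    toy_sequences = ["AAGWCDEPP", "TTGWCDEQQ", "MMGWCDEKK"]  # made-up data
    graph = build_de_bruijn(toy_sequences, k=3)
    print(conserved_paths(graph, min_support=3))
```

In this toy input the three sequences share one stretch of residues, and only the edges inside that stretch reach the support threshold. An approximate graph would presumably also tolerate mismatches between similar k-mers, which this exact version does not attempt.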
BioKnOT (Biological Knowledge through Ontologies and TFIDF) is a search tool based on "semantic thumbnails". Users can quickly scan summaries of information based on micro-ontologies that capture both the major thrusts of documents and the user's interests. This site is temporarily down for upgrade. This is James Costello's master's thesis: a web-based text retrieval system that lets users gather the most timely (meaning up-to-date) and most relevant (meaning within their search criteria) articles. The system downloads journal articles from the web and then, based on term frequency and citation information, constructs a hierarchical time-based graph. From this graph, we can then find the articles most timely and relevant to the user's search.
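As a rough illustration of the term-weighting side of such a system, the sketch below computes textbook TF-IDF scores for a small set of documents in Python. It is only an assumption about how term frequency might be used; the citation information, the micro-ontologies, and the hierarchical time-based graph are not modeled, and the function name and whitespace tokenization are hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: score} dict per document using plain TF-IDF.

    Hypothetical helper: tf is the term's share of the document's tokens,
    idf is log(N / number of documents containing the term).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Count, for each term, how many documents contain it at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    abstracts = ["gene network gene", "protein motif", "gene motif"]  # made-up data
    for weights in tfidf(abstracts):
        print(sorted(weights.items(), key=lambda item: -item[1]))
```

Terms that appear in every document receive a score of zero under this weighting, so the highest-scoring terms are the ones that distinguish a document from the rest of the collection.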
A biodiversity database brings together information about organisms so that better conservation can be enacted. The challenge is to encourage data sharing among scientists who traditionally have no impetus to share data. This project, headed by Sukamol Skirwan Jakobsson, is building such a database with incentives to share data. The project is with Sukamol Jakobsson, Markus Jakobsson, and Andrew Albrecht.
Paleoinformatics is the application of IT to paleontology. Our group (Dr. Claudia Johnson, Dr. Erika Elswick, and I) is currently in the first phase of a system that uses RFID tags attached to type collection specimens to make them "smart fossils". We received a small multidisciplinary grant from Indiana University and have begun building the system with this seed money. The project also began with a bioinformatics capstone project by Troy Campbell, whom the three of us supervised. See the presentation here: PPT. This project is with Claudia Johnson, Erika Elswick, and Andrew Albrecht.
MS spectra are mass/charge signals generated from molecular fragments. We are currently studying how to predict these signals a priori from linear structures (SMILES strings). The work is funded by Microsoft and uses SQL Server 2005 and .NET technologies. Our main server is Portal. The project is with Haixu Tang, Andy Lin, Yehia Mechref, and Helena Soini.
Circle is a probabilistic classifier. Code and information about running Circle can be found here. For the time being, Circle binaries can be downloaded, and the README file in the distribution should have all the information you need to run the program. The zip file includes the current binary and documentation for the command line interface. We are currently developing a web-based interface for Circle, and you can try it out at the pre-production Circle site. We are running Circle using SQL Server 2005, C#, and .NET technologies.