Bioinformatics

The life sciences have relied on the so-called scientific method: discovery through a process of observation, hypothesis formulation, data generation, more observation, and so on. In well-established domains such as biology, data play an ancillary role to hypotheses (such research is often called “hypothesis driven”). There are long-standing traditional as well as practical reasons why data have been subordinate, and we present only a few here: data need not be present during the observation and hypothesis formulation phases of discovery; a surplus of data usually does little to enhance this process of discovery; data are often prohibitively expensive to produce or gather once, let alone many times; and data, once used, are seldom used again.

But the life sciences have been experiencing fundamental changes in the last decade, brought about by recent and rapid advances both in information technology and in technology in the broader sense. These changes have culminated in scores of massive information science projects, tied together through the internet, that share life science data. From the general public's perspective, the most recognizable of these projects is the Human Genome Project, led by the National Human Genome Research Institute (NHGRI), which has produced drafts of the human genome with its promise, for example, of understanding, anticipating, and treating diseases. A consequence of these changes has been the emergence of new interdisciplinary sciences that form from a confluence of existing parent disciplines and technology. The most notable is ‘bioinformatics’, which combines biology, computer science, statistics, databases, and the internet. Like traditional biology, bioinformatics seeks to make discoveries about life, but it is remarkable for several reasons.
First, bioinformatics brings with it a new challenge that seems to cut across all emerging, technologically driven disciplines: how to cohesively bring together mature disciplines that have no history of deep connections. Second, bioinformatics has brought profound changes to its parent disciplines: making reductionists of biologists, shifting computer science from algorithm design toward problem formalization, and prompting a rethinking of the database as a mix of quantitative methods and logic rather than logic alone. Third, there are pressing problems having to do with the data themselves: their sheer volume, heterogeneity (in both structure and kind), provenance, noise, integration, resolution, rate of generation and collection, meaning and usefulness, and management. But perhaps the most profound reason has to do with how science itself is being conducted. Bioinformatics has turned the scientific method on its head: data are generated and collected without any explicit preceding hypotheses (often called “technologically driven” data). In fact, even the linear “flow” breaks down: discovery now proceeds from data, to observation, to hypothesis, back to data, and so forth. And yet significant biological discoveries are being made, and data have become paramount. What we believe is occurring is that a life science, biology, is being transformed into an information science, ushering in wonderful possibilities and equally difficult challenges. For a scientist, and for an information scientist in particular, it is certainly an exciting time.