Indiana University Bloomington

School of Informatics and Computing



Colloquia

Machine learning and Pasteur’s Quadrant in CS: Research with Relevance as well as Rigor

by David Waltz

Columbia University

Date
Friday, April 24, 2009
Time
3:00 p.m. to 4:00 p.m.
Place
Lindley Hall

Abstract: CS research has traditionally been curiosity-driven, emphasizing rigor over relevance. Application-driven CS research, emphasizing engineering for specific tasks, carries far less academic prestige. In his 1997 book, Pasteur’s Quadrant: Basic Science and Technological Innovation, Donald Stokes argues that, contrary to usual assumptions, these two alternatives are not the ends of a single spectrum, with Edison (relevance without rigor) at one end and Bohr (rigor without relevance) at the other; rather, relevance and rigor are orthogonal dimensions. Using this insight, he makes the case for research that is high in both dimensions (Pasteur’s Quadrant). Much of the research in Machine Learning (ML) has in fact come from Pasteur’s Quadrant, e.g., algorithms for character recognition, recommender systems, web search, and protein structure prediction. This talk will present ML research from CCLS, Columbia’s Center for Computational Learning Systems, that strives for both rigor and relevance in three main areas: 1) learning systems for predictive maintenance of the electric power grid, largely done in conjunction with Con Edison; 2) learning to translate natural language, with a concentration on translating to and from Arabic, both the standard language and its dialects; and 3) predicting epileptic seizures using implanted electrode arrays. This talk will use these examples, along with others from CS, to argue for exploring Pasteur’s Quadrant in CS. It will also discuss some of the special challenges such work entails. For example, challenging applications typically involve large amounts of data that require large, and less academically rewarding, efforts in data cleaning and systems engineering, in addition to driving research on understanding, new algorithms, and valuable applications.

Biography: David L. Waltz has been Director of the Center for Computational Learning Systems (CCLS) at Columbia University since 2003. Dr. Waltz received all his degrees from MIT, including his Ph.D. for work at the MIT AI Lab. His thesis on computer vision originated the field of constraint propagation, and with Craig Stanfill, he originated the memory-based reasoning branch of CBR (Case-Based Reasoning). He was formerly President of the NEC Research Institute in Princeton, and from 1984 to 1993 was Director of Advanced Information Systems at Thinking Machines Corporation and Professor of Computer Science at Brandeis University. He had also been Professor of Electrical and Computer Engineering at the University of Illinois (CSL and ECE Department) for 11 years. Waltz served as president of AAAI (American Association for Artificial Intelligence) from 1997 to 1999, and is a Fellow of AAAI and ACM (Association for Computing Machinery), a Senior Member of IEEE (Institute of Electrical and Electronics Engineers), and former Chairman of ACM SIGART (Special Interest Group on Artificial Intelligence). He is on the Advisory Board for IEEE Intelligent Systems, the Computing Community Consortium Board of the CRA (Computing Research Association), and the NSF Computer Science Advisory Board. His current primary research interest is in machine learning applications, especially to the electric power grid. His research interests have also included massively parallel information retrieval, data mining, learning and automatic classification with applications to protein structure prediction, and natural language processing.

Colloquium Provided By:

the School of Informatics