The aim of this research is to establish a coherent framework for data mining in the relational model.
Observing that data mining depends on two partitions, the classifier and the estimator, this paper
defines the classifier/estimator (CE) framework. The classifier indicates the target of the data mining
investigation. The classifier may be difficult to express from the relational instance or may involve
an oracle beyond the extant data. The estimator is typically easy to express using the relational
instance. The degree to which the estimator refines the classifier partition can be used to measure how
well the data instance matches the concept being investigated.
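
As an illustration of the refinement idea, the following minimal sketch computes one possible degree of refinement: the fraction of tuples lying in estimator blocks that fall entirely inside a single classifier block. This particular measure, the function names `partition` and `refinement_degree`, and the toy relation are assumptions made for illustration only; they are not taken from the paper, whose own measures may differ.

```python
from collections import defaultdict

def partition(rows, key):
    """Group row indices into blocks by the value of key(row)."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[key(row)].add(i)
    return list(blocks.values())

def refinement_degree(rows, estimator, classifier):
    """Fraction of rows in estimator blocks that fit inside one classifier block.

    Illustrative measure only (an assumption, not the paper's definition).
    """
    # Map every row index to the label of its classifier block.
    class_of = {}
    for label, block in enumerate(partition(rows, classifier)):
        for i in block:
            class_of[i] = label
    # Count rows in estimator blocks whose members share one classifier label.
    pure = 0
    for block in partition(rows, estimator):
        if len({class_of[i] for i in block}) == 1:
            pure += len(block)
    return pure / len(rows) if rows else 1.0

# Hypothetical relational instance: does (dept, grade) determine promotion?
rows = [
    {"dept": "A", "grade": 1, "promoted": True},
    {"dept": "A", "grade": 1, "promoted": True},
    {"dept": "B", "grade": 2, "promoted": False},
    {"dept": "B", "grade": 2, "promoted": True},
]
degree = refinement_degree(
    rows,
    estimator=lambda r: (r["dept"], r["grade"]),
    classifier=lambda r: r["promoted"],
)
print(degree)  # 0.5
```

Under this sketch, a degree of 1.0 means every estimator block lies within a single classifier block, which is the situation in which a functional dependency from the estimator attributes to the classifier attribute holds on the instance.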
The CE framework is shown to generalize a variety of data mining and database concepts, including
rough sets, functional dependency, multivalued dependency, and association rules. Furthermore, the CE
framework suggests a wider range of data mining questions. The CE framework is shown to naturally
express qualitative and quantitative measures of the quality of approximation. Additionally, the CE
framework allows a question to be posed at a number of different conceptual scopes from local to global