Citation: Rocha, Luis M. "TalkMine: A Soft Computing Approach to Adaptive Knowledge Recommendation". In: Soft Computing Agents: New Trends for Designing Autonomous Systems. Vincenzo Loia and Salvatore Sessa (Eds.). Series on Studies in Fuzziness and Soft Computing. Physica-Verlag, Springer, pp. 89-116. LAUR-00-4914
The full paper is available in Adobe Acrobat (.pdf) format. Due to mathematical formalisms not suitable to an HTML format, only the first introductory sections (and references) are available here in HTML.
We present a soft computing recommendation system named TalkMine, to advance adaptive web and digital library technology. TalkMine leads different databases or websites to learn new and adapt existing keywords to the categories recognized by its communities of users. It uses distributed artificial intelligence algorithms and soft computing technology. TalkMine is currently being implemented for the research library of the Los Alamos National Laboratory under the Active Recommendation Project.
TalkMine is based on the integration of distributed knowledge networks using Evidence Sets, an extension of fuzzy sets. The identification of the interests of users relies on a process of combining several fuzzy sets into evidence sets, which models an ambiguous "and/or" linguistic expression. The interest of users is further fine-tuned by a human-machine conversation algorithm used for uncertainty reduction. Documents are retrieved according to the inferred user interests. Finally, the retrieval behavior of all users of the system is employed to adapt the knowledge bases of queried information resources. This adaptation allows information resources to respond well to the evolving expectations of users.
In this article the distributed architecture of TalkMine is presented together with a description of its implementation in the Active Recommendation Project. In particular, the characterization of information resources as interacting distributed memory banks is presented. Evidence sets and the operations to produce them from several fuzzy sets are detailed. The conversation and adaptation algorithms used by TalkMine to interact automatically with users are described.
Keywords: Recommendation Systems, Information Retrieval, Web-related technologies, Fuzzy Set Theory, Evidence Sets, Measures of Uncertainty, Collaborative Systems, Adaptive Systems, Distributed Artificial Intelligence, Human-machine Interaction, Communities of Agents, Knowledge Representation, Soft Computing.
Distributed Information Systems (DIS) are collections of electronic networked information resources (e.g. databases) in some kind of interaction with communities of users; examples of such systems are: the Internet, the World Wide Web, corporate intranets, databases, library information retrieval systems, etc. DIS serve large and diverse communities of users by providing access to a large set of heterogeneous electronic information resources. Information Retrieval (IR) refers to all the methods and processes for searching relevant information out of information systems (isolated or part of DIS) that contain extremely large numbers of documents. As the complexity and size of both user communities and information resources grow, the fundamental limitations of traditional information retrieval systems have become evident in modern DIS.
Traditional IR systems are based solely on keywords that index (semantically characterize) documents and a query language to retrieve documents from centralized databases according to these keywords; users need to know how to "pull" relevant information from passive databases. This setup leads to a number of flaws (Rocha and Bollen, 2000), which prevent traditional IR processes in DIS from achieving any kind of interesting coupling with users. The human-machine interaction observed in these systems is particularly rigid: most cannot pro-actively "push" relevant information to their users about related topics that they may be unaware of; there is typically no mechanism for the exchange of knowledge, or the crossover of relevant information, among users and information resources; and there is no mechanism to recombine knowledge in different information resources to infer new linguistic categories of keywords used by evolving communities of users. In other words, traditional IR keeps DIS as static, passive, and isolated repositories of data; no interesting human-machine co-evolution of knowledge or learning is achieved.
The limitations of traditional IR and DIS are even more dramatic when contrasted with biological distributed systems such as immune, neural, insect, and social networks. Biological networks function largely in a distributed manner, without recourse to central controllers, while achieving tremendous ability to respond in concerted ways to different environmental necessities. In particular, they are typically endowed with the ability to elicit appropriate responses to specific demands, to transfer and process relevant information across the network, and to adapt to a changing environment by creating novel behaviors (often from recombination of existing ones). These abilities are precisely what has been lacking in IR.
Biological networks effectively evolve in an open-ended manner; we are interested in endowing DIS with a similar open-ended capacity to evolve with their users - to achieve an open-ended semiosis with them (Rocha, 2000). In biology, open-ended evolution originates from the existence of material building blocks that self-organize non-linearly (e.g. Kauffman, 1993) and are combined via a specification control, such as the genetic system (Rocha, 1998). In contrast, computer systems were constructed precisely with rigid building blocks constrained in such a way as to allow minimum dynamic self-organization and maximum programmability, which results in no inherent evolvability (Conrad, 1990). Therefore, to attain any evolvability in current digital computer systems, we need to program in some "softer" building blocks that can be used to realize the kind of dynamical richness we encounter in biological systems (Rocha and Bollen, 2000).
The ultimate goal of IR is to produce or recommend relevant information to users. It seems obvious that the foundation of any useful recommendation should be first and foremost based on the identification of users and subject matter. In this sense, the goal of recommendation systems can be seen as similar to that of most biological systems, in particular immune systems: to recognize agents (users) and elicit appropriate responses from components of the distributed information network. Furthermore, the information network should learn and adapt to the community of agents (users) it interacts with - its environment.
Nevertheless, traditional IR does not identify users and classifies subjects only with unchanging keywords and categories. To build more flexible IR and evolving DIS, we need to design recommendation systems endowed with:
Below I describe efforts to include these design requirements for recommendation systems using Soft Computing technology. I also discuss how a useful and more natural knowledge management of DIS is achieved with these soft computing designs. Let us start with some background on IR and recommendation systems.
New approaches to IR have been proposed to improve its inflexible algorithms. Active recommendation systems, also known as Active Collaborative Filtering (Chislenko, 1998) or Knowledge Self-Organization (Johnson et al, 1998) are IR systems which rely on active computational environments that interact with and adapt to their users. They effectively "push" relevant information to users according to previous patterns of IR or individual user profiling.
Recommendation systems are typically based on human-machine interaction mediated by intelligent agents, or other decentralized components, and come in several varieties:
Content-based systems depend on single user profiles, and thus cannot effectively recommend documents about previously unrequested content to a specific user. That is, these systems cannot compare and recommend related documents characterized by keywords not previously collected into a given user's profile. Conversely, pure collaborative systems match only the profiles of users that (to a great extent) have requested exactly the same documents; for instance, different book editions or movie review web sites from different news organizations may be considered distinct documents.
The shortcoming of structural approaches is that they assume that the existing, often static, structure of an information resource contains all the relevant knowledge to be discovered. However, it is often the case that such structure is very poorly designed. On the web in particular, the hypertext links are often not created between important documents, due perhaps to the hurried way in which web sites are created. Indeed, the Web is often more a repository of isolated documents, than a good example of a hypertext fabric. The same applies to the keyword/document relations necessary for LSI.
Collective approaches have the important advantage of adapting to the collective behavior of users, even as it develops in time. This way, a poor initial structure can improve, by creating, strengthening or weakening associations among documents or between documents and keywords. Furthermore, collective recommendation systems can operate without storing individual profiles, thus offering a more private platform for recommendation. Indeed, recommendations are issued according to the adapted structure of the information resources, not according to user profiles. Users can be seen as anonymous social agents. Furthermore, as we shall discuss later, the adapted information resources allow us to capture the knowledge traded by a community of agents. Nonetheless, a disadvantage of collective approaches is that they implement a positive feedback with their communities of users, possibly leading to an excessive adaptation to the interests of a majority of users, thus reducing the diversity of knowledge by recommending only the most retrieved documents in a given area: e.g. the "best of" lists found at Web sites such as Amazon.com - this is the so-called "curse of averages".
It is clear that good recommendation systems require aspects of all approaches to avoid the shortcomings of each individual one. This is the case, for instance, of Fab (Balabanović and Shoham, 1997) and Amalthaea (Moukas and Maes, 1998), which are both content and collaborative recommendation systems. This way they can discover similar users who have not simply retrieved many of the same exact documents, but documents characterized by many of the same keywords. Furthermore, keywords from documents that users have not actually retrieved may be added to their profiles because they belong to the profiles of other similar users.
Still, neither Fab nor Amalthaea (nor similar systems) adapt the structure of their information resources with collective user behavior, nor do they use the data-mining techniques of structural algorithms to characterize the knowledge those resources store. In this sense, they cannot capture the evolving nature of the knowledge of communities of users. In other words, even though they are able to characterize the interests of individual users (both with documents and keywords), the structure of information resources (e.g. Web hyperlink structure or document/keyword matrix) remains unchanged. Furthermore, they rely on individual user profiles, and there is no explicit means to discover the knowledge categories that particular communities of users employ. Next I describe the Active Recommendation Project (Rocha and Bollen, 2000), which is building a hybrid Collective/Structural/Content recommendation system designed precisely to tackle these issues. Namely, it aims to adapt information resources to their evolving communities of users, to characterize the knowledge stored in these information resources, and to preserve diversity while not accumulating private user profiles.
The Active Recommendation Project (ARP), part of the Library Without Walls Project, at the Research Library of the Los Alamos National Laboratory is engaged in research and development of recommendation systems for digital libraries. The information resources available to ARP are large databases with academic articles. These databases contain bibliographic, citation, and sometimes abstract information about academic articles. Typical databases are SciSearch® and Biosis®; the first contains articles from scientific journals from several fields collected by ISI (Institute for Scientific Information), while the second contains more biologically oriented publications. We do not directly manipulate the records stored in these information resources; rather, we created a repository of about 3 million XML records which point us to documents stored in these databases (Rocha and Bollen, 2000).
We have compiled relational information between records (1) and keywords and among records: the semantics and the structure respectively. The semantics is formalized as a very sparse Keyword-Record Matrix A. The structure is formalized as the very sparse Citation Matrix C, which is a record-record matrix (details in Rocha and Bollen, 2000). From these matrices, we have calculated additional matrices holding measures of closeness between records and between keywords: the Inwards Structural Proximity Matrix or co-citation (Small, 1973), the Outwards Structural Proximity Matrix or bibliographic coupling (Kessler, 1963), the Record Semantic Proximity Matrix (for any two records it is defined by the number of keywords that qualify both, divided by the number of keywords that qualify either one), and the Keyword Semantic Proximity Matrix (for two keywords, it is the number of records they both qualify, over the number of records either one qualifies).
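As a rough illustration of the proximity measures just named, the sketch below computes them on toy data. The record semantic proximity follows the definition given in the text (shared keywords over keywords qualifying either record). Applying the same shared-over-either normalization to co-citation and bibliographic coupling is an assumption made here for illustration, as are the record identifiers, keywords, and function names.

```python
def jaccard(a, b):
    """|a & b| / |a | b|; returns 0.0 when both sets are empty (an assumed convention)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cited_by(citations):
    """Invert a citation map: record -> set of records that cite it."""
    inverse = {record: set() for record in citations}
    for source, targets in citations.items():
        for target in targets:
            inverse.setdefault(target, set()).add(source)
    return inverse


# Hypothetical citation data: r1 and r2 both cite r3 and r4; r2 also cites r5.
citations = {
    "r1": {"r3", "r4"},
    "r2": {"r3", "r4", "r5"},
    "r3": set(),
    "r4": set(),
    "r5": set(),
}
# Hypothetical keyword assignments for two of the records.
keywords = {
    "r1": {"fuzzy", "retrieval"},
    "r2": {"fuzzy", "retrieval", "agents"},
}

citers = cited_by(citations)

# Outwards structural proximity (bibliographic coupling): shared references.
outwards_r1_r2 = jaccard(citations["r1"], citations["r2"])
# Inwards structural proximity (co-citation): shared citing records.
inwards_r3_r4 = jaccard(citers["r3"], citers["r4"])
inwards_r3_r5 = jaccard(citers["r3"], citers["r5"])
# Record semantic proximity: shared keywords over keywords of either record.
semantic_r1_r2 = jaccard(keywords["r1"], keywords["r2"])
```

Here r3 and r4 are cited by exactly the same records, so their inwards proximity is maximal, while r5 shares only one of r3's two citing records.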
These matrices holding measures of closeness are, formally, proximity relations (Klir and Yuan, 1995; Miyamoto, 1990) because they are reflexive and symmetric fuzzy relations. Their transitive closures are known as similarity relations (Ibid). The collection of this relational information, all the proximity relations as well as A and C, is an expression of the particular knowledge an information resource conveys to its community of users. Notice that distinct information resources typically share a very large set of keywords and records. However, these are organized differently in each resource, leading to different collections of relational information. Indeed, each resource is tailored to a particular community of users, with a distinct history of utilization and deployment of information by its authors and users. For instance, the same keywords will be related differently for distinct resources. Therefore, we refer to the relational information of each information resource as a Knowledge Context. We do not mean to imply that information resources possess cognitive abilities. Rather, we note that the way records are organized in information resources is an expression of the knowledge traded by its community of users. Records and keywords are only tokens of the knowledge that is ultimately expressed in the brains of users. A knowledge context simply mirrors some of the collective knowledge relations and distinctions shared by a community of users.
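The step from a proximity relation to a similarity relation can be sketched as follows. The sketch assumes max-min transitivity, which the text does not specify, and uses a small hypothetical 3x3 relation; the function names are illustrative only. It repeatedly merges the relation with its max-min composition with itself until nothing changes.

```python
def maxmin_compose(r, s):
    """Max-min composition of two square fuzzy relations given as nested lists."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation):
    """Max-min transitive closure: iterate R <- R union (R o R) until stable."""
    current = [row[:] for row in relation]
    while True:
        composed = maxmin_compose(current, current)
        merged = [
            [max(a, b) for a, b in zip(row_current, row_composed)]
            for row_current, row_composed in zip(current, composed)
        ]
        if merged == current:
            return current
        current = merged


# Hypothetical proximity relation: reflexive and symmetric, but not transitive,
# since the direct value 0.1 is below min(0.8, 0.6) along the indirect path.
proximity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.6],
    [0.1, 0.6, 1.0],
]
similarity = transitive_closure(proximity)
```

In this toy relation the weak direct association between the first and third elements is raised to 0.6, the strength of the indirect path through the second element.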
In (Rocha and Bollen, 2000) we have discussed how these proximity relations are used in ARP. However, the ARP recommendation system described in this article (TalkMine) requires only the Keyword Semantic Proximity (KSP) matrix, obtained from A by the following formula:

$$KSP(k_i, k_j) = \frac{N(k_i, k_j)}{N(k_i) + N(k_j) - N(k_i, k_j)} \qquad (1)$$
The semantic proximity between two keywords, ki and kj, depends on the sets of records indexed by either keyword, and the intersection of these sets. N(ki) is the number of records keyword ki indexes, and N(ki, kj) the number of records both keywords index. This last quantity is the number of elements in the intersection of the sets of records that each keyword indexes. Thus, two keywords are near if they tend to index many of the same records. Table I presents the values of KSP for the 10 most common keywords in the ARP repository.
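A minimal sketch of this computation follows, reading the verbal definition as N(ki, kj) / (N(ki) + N(kj) - N(ki, kj)), that is, records indexed by both keywords over records indexed by either. The keyword names, record identifiers, and the `ksp` function are hypothetical; the ARP repository itself is not reproduced here.

```python
def ksp(records_by_keyword, ki, kj):
    """Keyword semantic proximity from the sets of records each keyword indexes."""
    records_i = records_by_keyword[ki]
    records_j = records_by_keyword[kj]
    n_i = len(records_i)               # N(ki)
    n_j = len(records_j)               # N(kj)
    n_ij = len(records_i & records_j)  # N(ki, kj)
    either = n_i + n_j - n_ij          # records indexed by either keyword
    # Assumed convention: proximity is 0 when neither keyword indexes anything.
    return n_ij / either if either else 0.0


# Hypothetical index: keyword -> set of record identifiers it qualifies.
records_by_keyword = {
    "fuzzy": {1, 2, 3, 4},
    "evidence": {3, 4, 5},
    "retrieval": {5, 6},
}

fuzzy_evidence = ksp(records_by_keyword, "fuzzy", "evidence")        # 2 / (4 + 3 - 2)
evidence_retrieval = ksp(records_by_keyword, "evidence", "retrieval")  # 1 / (3 + 2 - 1)
fuzzy_retrieval = ksp(records_by_keyword, "fuzzy", "retrieval")      # no shared records
```

Keywords that share no records get proximity 0, and any keyword that indexes at least one record has proximity 1 with itself, so the resulting relation is reflexive and symmetric.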
From the inverse of KSP we obtain a distance function between keywords:

$$d(k_i, k_j) = \frac{1}{KSP(k_i, k_j)} - 1 \qquad (2)$$
d is a distance function because it is a nonnegative, symmetric real-valued function such that d(k, k) = 0. It is not a Euclidean metric because it may violate the triangle inequality, d(k1, k2) ≤ d(k1, k3) + d(k3, k2), for some keyword k3. This means that the shortest distance between two keywords may not be the direct link but rather an indirect pathway through other keywords. Such measures of distance are referred to as semi-metrics (Galvin and Shore, 1991).
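The following sketch shows such a violation on toy data. It assumes the distance takes the form d = 1/KSP - 1, one natural reading of "the inverse of KSP" that satisfies d(k, k) = 0; the keyword names, record sets, and function names are hypothetical. The distance is computed directly from record counts, which is algebraically the same quantity and avoids rounding.

```python
import math


def semi_metric(records_i, records_j):
    """Assumed distance d = 1/KSP - 1 = (either - shared) / shared.

    Returns math.inf when the keywords share no records (KSP = 0).
    """
    shared = len(records_i & records_j)
    either = len(records_i | records_j)
    if shared == 0:
        return math.inf
    return (either - shared) / shared


def triangle_violations(index):
    """List (ki, kj, via, direct, indirect) where an indirect path is shorter."""
    keys = sorted(index)
    found = []
    for pos, ki in enumerate(keys):
        for kj in keys[pos + 1:]:
            direct = semi_metric(index[ki], index[kj])
            for via in keys:
                if via in (ki, kj):
                    continue
                indirect = (semi_metric(index[ki], index[via])
                            + semi_metric(index[via], index[kj]))
                if indirect < direct:
                    found.append((ki, kj, via, direct, indirect))
    return found


# Hypothetical index: k1 and k2 barely overlap, but both overlap heavily with k3.
index = {
    "k1": {1, 2, 3, 4},
    "k2": {4, 5, 6, 7},
    "k3": {2, 3, 4, 5, 6},
}
violations = triangle_violations(index)
```

Here the direct distance between k1 and k2 is 6, while the path through k3 has length 1 + 1 = 2, so k3 acts as a shortcut between two keywords that rarely index the same records.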
Users interact with information resources by retrieving records. We use their retrieval behavior to adapt the respective knowledge contexts of these resources (stored in the proximity relations). But before discussing this interaction, we need to characterize and define the capabilities of users: our agents. The following capabilities are implemented in enhanced "browsers" distributed to users.
Regarding point 2, the history of IR, notice that the same user may query information resources with very distinct sets of interests. For example, one day a user may search databases as a biologist looking for scientific articles, and the next as a sports fan looking for game scores. Therefore, each enhanced browser allows users to define different "personalities", each one with its distinct history of IR defined by independent knowledge contexts with distinct proximity data (see Figure 1).
Because the user history of IR is stored in personal browsers, information resources do not store user profiles. Furthermore, none of the collective behavior algorithms used in ARP requires the identity of users. When users communicate (3) with information resources, what needs to be exchanged is their present interests or query (1), and the relevant proximity data from their own knowledge context (2). In other words, users make a query, and then share the relevant knowledge they have accumulated about their query, their "world-view" or context, from a particular personality, without trading their identity. Next, the recommendation algorithms integrate the user's knowledge context with those of the queried information resources (possibly other users), resulting in appropriate recommendations. Indeed, the algorithms we use define a communication protocol between knowledge contexts, which can be very large databases, web sites, or other users. Thus, the overall architecture of the recommendation systems we use in ARP is highly distributed between information resources and all the users and their browsing personalities (see Figure 2).
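To make the exchange concrete, here is a purely illustrative sketch of what a browser might send: the query keywords plus the slice of the active personality's proximity data that touches those keywords, and nothing that identifies the user. The `QueryMessage` type, the `build_message` function, and the rule that "relevant" means "involves a query keyword" are all assumptions for illustration; the text does not specify the actual message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryMessage:
    """Hypothetical payload: present interests plus relevant proximity data."""
    keywords: tuple   # the query, item (1) in the text
    proximity: dict   # {(ki, kj): value} taken from the personality, item (2)
    # Deliberately no user identifier or profile field.


def build_message(query_keywords, personality_ksp):
    """Select the proximity pairs that involve at least one query keyword."""
    query = tuple(sorted(set(query_keywords)))
    relevant = {
        pair: value
        for pair, value in personality_ksp.items()
        if any(keyword in query for keyword in pair)
    }
    return QueryMessage(keywords=query, proximity=relevant)


# Hypothetical proximity data accumulated by one browsing personality.
personality_ksp = {
    ("evidence", "fuzzy"): 0.4,
    ("evidence", "retrieval"): 0.25,
    ("baseball", "scores"): 0.7,
}
message = build_message(["fuzzy"], personality_ksp)
```

Only the pair involving "fuzzy" is shared in this example; proximity data unrelated to the query, such as the sports keywords, stays in the browser.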
The collective behavior of all users is also aggregated to adapt the knowledge contexts of all intervening information resources and users alike. This open-ended learning process (Rocha, 2000) is enabled by the TalkMine recommendation system described below.
TalkMine is both a content-based and collaborative recommendation system based on a model of linguistic categories (Rocha, 1999), which are created from conversation between users and information resources and used to re-combine knowledge as well as adapt it to users. The model of categorization used by TalkMine is described in detail in (Rocha, 1997a, 1999, 2000). Basically, as also suggested by Clark (1993), categories are seen as representations of highly transient, context-dependent knowledge arrangements, and not as models of information storage in the brain. In this sense, in human cognition, categories are seen as linguistic constructs used to store temporary associations built up from the integration of knowledge from several neural sub-networks. The categorization process, driven by language and conversation, serves to bridge together several distributed neural networks, associating tokens of knowledge that would not otherwise be associated in the individual networks. Thus, categorization is the chief mechanism to achieve knowledge recombination in distributed networks leading to the production of new knowledge (Rocha, 1999, 2000).
TalkMine applies such a model of categorization of distributed neural networks driven by language and conversation to DIS and recommendation systems. Instead of neural networks, knowledge is stored in information resources, from which we construct the knowledge contexts with respective proximity relations described in section 2. TalkMine is used as a conversation protocol to categorize the interests of users according to the knowledge stored in information resources, thus producing appropriate recommendations and adaptation signals.
A knowledge context of an information resource (section 2.1) is not a connectionist structure in a strong sense, since keywords and records are not distributed: they can be identified with specific nodes of the network (van Gelder, 1991). However, the same keyword indexes many records, the same record is indexed by many keywords, and the same record is typically engaged in a citation (or hyperlink) relation with many other records. Losing or adding a few records or keywords does not significantly change the derived semantic and structural proximity relations (section 2) of a large network. In this sense, the knowledge conveyed by such proximity relations is distributed over the entire network of records and keywords in a highly redundant manner, as required of sparse distributed memory models (Kanerva, 1988). Furthermore, Clark (1993) proposed that connectionist memory devices work by producing metrics that relate the knowledge they store. As discussed in section 2, the distance functions obtained from proximity relations are semi-metrics, which follow all of Clark's requirements (Rocha, 2000). Therefore, we can regard a knowledge context effectively as a distributed memory bank. Below we discuss how such distributed knowledge adapts to communities of users (the environment) with Hebbian-type learning.
In the TalkMine system we use the KSP relation (formula (1)) from knowledge contexts. It conveys the knowledge stored in an information resource in terms of a measure of proximity among keywords. This proximity relation is unique to each information resource, reflecting the semantic relationships of the records stored in the latter, which in turn echo the knowledge of its community of users and authors. TalkMine is a content-based recommendation system because it uses a keyword proximity relation. Next we describe how it is also collaborative by integrating the behavior of users. A related structural algorithm, also being developed in ARP, is described in (Rocha and Bollen, 2000).
Balabanović, M. and Y. Shoham (1997)."Content-based, collaborative recommendation." Communications of the ACM. March 1997, Vol. 40, No.3, pp. 66-72.
Berry, M.W., S.T. Dumais, and G.W. O'Brien (1995)."Using linear algebra for intelligent information retrieval." SIAM Review. Vol. 37, no. 4, pp. 573-595.
Bollen, J. and F. Heylighen (1998)."A system to restructure hypertext networks into valid user models." The New Review of Hypermedia and Multimedia. Vol. 4.
Brusilovsky, P., A. Kobsa, and J. Vassileva (Eds.) (1998). Adaptive Hypertext and Hypermedia Systems. Kluwer Academic Publishers, Dordrecht, The Netherlands.
Chakrabarti, S. et al (1999)."Mining the Web's link structure." Computer. Vol. 32, No.8, pp. 60-67.
Chislenko, Alexander (1998)."Collaborative information filtering and semantic transports." WWW publication: http://www.lucifer.com/~sasha/articles/ACF.html.
Clark, Andy (1993). Associative Engines: Connectionism, Concepts, and Representational Change. MIT Press.
Conrad, Michael (1990)."The geometry of evolutions." BioSystems. Vol. 24, pp.61-81.
Eklund, J. (1998)."The value of adaptivity in hypermedia learning environments: a short review of empirical evidence." In: Proceedings of the 2nd Workshop on Adaptive Hypertext and Hypermedia (Hypertext 98). P. Brusilovsky and P. de Bra (Eds.). pp. 13-21, Pittsburgh, USA, June 1998.
Galvin, F. and S.D. Shore (1991)."Distance functions and topologies." The American Mathematical Monthly. Vol. 98, No. 7, pp. 620-623.
Good, N. et al (1999)."Combining collaborative filtering with personal agents for better recommendations." In: Proceedings of the National Conference on Artificial Intelligence. Orlando, Florida, July 1999. AAAI, pp. 439-446.
Harman, D. (1994)."Overview of the 3rd Text Retrieval Conference (TREC-3)." In: Proceedings of the 3rd Text Retrieval Conference. Gaithersburg, Md, November 1994.
Herlocker, J.L. (1999)."Algorithmic framework for performing collaborative filtering." In: Proceedings of the 22nd International Conference on Research and Development in Information Retrieval. Berkeley, California, August 1999. ACM, pp. 230-237.
Heylighen, Francis (1999)."Collective Intelligence and its Implementation on the Web: Algorithms to Develop a Collective Mental Map." Computational & Mathematical Organization Theory. Vol. 5, no. 3, pp. 253-280.
Hill, W. et al (1995)."Recommending and evaluating choices in a virtual community of use." In: Conference on Human Factors in Computing Systems (CHI'95). Denver, May, 1995.
Johnson, N., S. Rasmussen, C. Joslyn, L. Rocha, S. Smith, and M. Kantor (1998)."Symbiotic intelligence: self-organizing knowledge on distributed networks, driven by human interaction." In: Proceedings of the 6th International Conference on Artificial Life. C. Adami, R. K. Belew, H. Kitano, C. E. Taylor (Eds.). MIT Press, pp. 403-407.
Kanerva, P. (1988). Sparse Distributed Memory. MIT Press.
Kannan, R. and S. Vempala (1999)."Real-time clustering and ranking of documents on the web." Unpublished Manuscript.
Kauffman, S. (1993). The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press.
Kessler, M.M. (1963)."Bibliographic coupling between scientific papers." American Documentation. Vol. 14, pp. 10-25.
Kleinberg, J.M. (1998)."Authoritative sources in a hyperlinked environment." In: Proc. of the 9th ACM-SIAM Symposium on Discrete Algorithms. pp. 668-677.
Klir, G.J. and B. Yuan (1995). Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall.
Klir, George J. (1993)."Developments in uncertainty-based information." In: Advances in Computers. M. Yovits (Ed.). Vol. 36, pp. 255-332.
Kosko, B. (1992). Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence. Prentice-Hall.
Konstan, J.A. et al (1997)."GroupLens: applying collaborative filtering to usenet news." Communications of the ACM. Vol. 40, No. 3, pp. 77-87.
Krulwich, B. and C. Burkey (1996)."Learning user information interests through extraction of semantically significant phrases." In: Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access. Stanford, California, March 1996.
Lang, K. (1995)."Learning to filter news." In: Proceedings of the 12th International Conference on Machine Learning. Tahoe City, California, 1995.
Miyamoto, S. (1990). Fuzzy Sets in Information Retrieval and Cluster Analysis. Kluwer.
Moukas, A. and P. Maes (1998)."Amalthaea: an evolving multi-agent information filtering and discovery systems for the WWW." Autonomous agents and multi-agent systems. Vol. 1, pp. 59-88.
Nakamura, K. and S. Iwai (1982)."A representation of analogical inference by fuzzy sets and its application to information retrieval systems." In: Fuzzy Information and Decision Processes. M.M. Gupta and E. Sanchez (Eds.). North-Holland, pp. 373-386.
Resnick, P. et al (1994)."GroupLens: An open architecture for collaborative filtering of netnews." In: Proceedings of the ACM Conference on Computer-Supported Cooperative Work. Chapel Hill, North Carolina, 1994.
Rocha, Luis M. and Johan Bollen (2000)."Biologically motivated distributed designs for adaptive knowledge management." In: Design Principles for the Immune System and Other Distributed Autonomous Systems. I. Cohen and L. Segel (Eds.). Santa Fe Institute Series in the Sciences of Complexity. Oxford University Press. In Press.
Rocha, Luis M. (1994)."Cognitive categorization revisited: extending interval valued fuzzy sets as simulation tools for concept combination." Proc. of the 1994 Int. Conference of NAFIPS/IFIS/NASA. IEEE Press, pp. 400-404.
Rocha, Luis M. (1997a). Evidence Sets and Contextual Genetic Algorithms: Exploring Uncertainty, Context and Embodiment in Cognitive and Biological Systems. PhD. Dissertation. State University of New York at Binghamton. UMI Microform 9734528.
Rocha, Luis M. (1997b)."Relative uncertainty and evidence sets: a constructivist framework." International Journal of General Systems. Vol. 26, No. 1-2, pp. 35-61.
Rocha, Luis M. (1998)."Selected self-organization and the Semiotics of Evolutionary Systems." In: Evolutionary Systems: Biological and Epistemological Perspectives on Selection and Self-Organization. S. Salthe, G. Van de Vijver, and M. Delpos (Eds.). Kluwer Academic Publishers, pp. 341-358.
Rocha, Luis M. (1999)."Evidence sets: modeling subjective categories." International Journal of General Systems. Vol. 27, pp. 457-494.
Rocha, Luis M. (2000)."Adaptive Recommendation and Open-Ended Semiosis." International Journal of Human-Computer Studies. (In Press).
Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton University Press.
Shardanand, U. and P. Maes (1995)."Social information filtering: Algorithms for automating 'word of mouth'." In: Conference on Human Factors in Computing Systems (CHI'95). Denver, May, 1995.
Small, H. (1973)."Co-citation in the scientific literature: a new measure of the relationship between documents." Journal of the American Society for Information Science. Vol. 42, pp. 676-684.
Turksen, B. (1986)."Interval valued fuzzy sets based on normal forms." Fuzzy Sets and Systems. Vol. 20, pp. 191-210.
Turksen, I.B. (1996)."Non-specificity and interval-valued fuzzy sets." Fuzzy Sets and Systems. Vol. 80, pp. 87-100.
van Gelder, Tim (1991)."What is the 'D' in 'PDP': a survey of the concept of distribution." In: Philosophy and Connectionist Theory. W. Ramsey et al. Lawrence Erlbaum.
Watts, D. (1999). Small Worlds: The Dynamics of Networks between Order and Randomness. Princeton University Press.
Yager, R.R. (1979)."On the measure of fuzziness and negation. Part I: membership in the unit interval." Int. J. of General Systems. Vol. 5, pp. 221-229.
Yager, R.R. (1980)."On the measure of fuzziness and negation: Part II: lattices." Information and Control. Vol. 44, pp. 236-260.
Zadeh, Lotfi A. (1965)."Fuzzy Sets." Information and Control. Vol. 8, pp. 338-353.
1. Records contain bibliographical information about published documents. Records can be thought of as unique pointers to documents, thus, for the purposes of this article, the two terms are interchangeable.