Copyright 2000 Wallace Koehler All Rights Reserved

|Innovative Cataloging|concept|relationship|charateristic|

Relationships

Site Maps

A site map is provided. Site maps come in two general forms. The first maps the files or folders on the site, just as this one does:

[Image: site map]

The second map type is less common. It maps the hypertext links from and to the pages on the site. It can tell a very different story than the folder map. Rather than mapping the placement of general concepts, it shows the interconnections created by the Web author between and among specific ideas. For example, a folder map may identify the "location" within the site of a bibliography but the link map connects to specific citations within that bibliography.

Both of these are important. For this site, the bibliography is placed at the top level, in the top folder. That may indicate that the author feels that the bibliography either applies to the whole site or is particularly important as a site tool. There are also links from various pages to specific citations within the bibliography. These links give no indication of the "general" importance of the bibliography per se, but they testify to the importance of the specific reference.
 
 

Degrees of Separation

There is an interesting game built on the claim that no human being is more than six degrees of separation from any other. This is the classic situation of "I know someone who knows someone who knows someone ...." For example, everyone on the University of Oklahoma campus is probably no more than three degrees of separation from the President of the United States. "How is that?" you might ask. Every student at OU knows somebody on its faculty. Some of those people know David Boren. And if they do not know the University president, they know someone who does. Boren knows that other President. As a result, everyone on the OU campus is no further than four degrees of separation from nearly every leader in most countries of the world. That and a couple of dollars will get you a latte ....

Note that for the degrees of separation concept to work, you don't have to know the next contact well or even like him or her. By some rules, having met that person once is enough.

The WWW can also be classified according to degrees of separation. It is estimated that Web pages are separated, on average, by about nineteen "clicks of separation" (see http://www.seattlep-i.com/local/clik07.shtml). Distance is ambiguous in cyberspace. We could measure the physical distance (kilometers or "klicks"), but that means little. Instead, we can measure "clicks": the minimum number of hypertext links required to move from page A to page B.
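To make "clicks" concrete, here is a minimal sketch of how the minimum number of links between two pages could be counted with a breadth-first search. The page names and the `links` table are hypothetical and do not describe this site or any real crawl; the function name `clicks_of_separation` is mine.

```python
from collections import deque

def clicks_of_separation(links, start, target):
    """Return the fewest link clicks from start to target, or None if unreachable."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, clicks = queue.popleft()
        # Visit every page this page links to, nearest pages first.
        for nxt in links.get(page, ()):
            if nxt == target:
                return clicks + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, clicks + 1))
    return None

# Hypothetical link map: each page lists the pages it links to.
links = {
    "home": ["syllabus", "bibliography"],
    "syllabus": ["relationships"],
    "relationships": ["bibliography", "concepts"],
    "concepts": [],
    "bibliography": [],
}

print(clicks_of_separation(links, "home", "concepts"))
print(clicks_of_separation(links, "concepts", "home"))
```

Note that clicks are directional. In this made-up map "concepts" can be reached from "home," but there is no path of links leading back.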

I suggest we can also measure "cliques." By that I mean, how are Web pages or Web sites clustered with each other? Re-examine the site map for this course. Is there any degree of "cliquishness" or similarity of subject matter by which the site is organized?

Following this logic, the concept of degrees of separation grows more complex. "I know someone who knows someone" is as legitimate as "I know an old woman who swallowed a frog ...." It suggests that any associational process is a legitimate organizing principle and that distances between those concepts can be captured and used for information classification and retrieval.

Heterogeneous Concept Maps

Michael Buckland and others point to "an increasing population of heterogeneous repositories." These necessarily require maps across thesauri or vocabularies to permit cross-interpretation and information retrieval. Work is being done to develop methodologies that permit the use of metadata dictionaries and index terms across fields. For example, Entry Vocabulary Modules (EVMs) (Norgard) are described as "...association dictionaries that facilitate use of unfamiliar metadata vocabularies such as the classification schemes, thesauri, and vocabularies that are used to index bibliographic and other types of databases ..."
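One way to picture an association dictionary is as a table that counts how often an everyday word appears in records carrying a given index term. The sketch below is only an illustration of that idea under my own assumptions, not the EVM method itself. The records, the headings, and the function names `build_associations` and `suggest_headings` are all hypothetical.

```python
from collections import Counter, defaultdict

def build_associations(records):
    """Count how often each title word co-occurs with each assigned heading."""
    assoc = defaultdict(Counter)
    for title, heading in records:
        for word in set(title.lower().split()):
            assoc[word][heading] += 1
    return assoc

def suggest_headings(assoc, query, top=3):
    """Rank headings by the summed co-occurrence counts of the query words."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(assoc.get(word, {}))
    return [heading for heading, _ in scores.most_common(top)]

# Hypothetical training records: (document title, heading assigned by an indexer).
records = [
    ("heart attack risk factors", "Myocardial Infarction"),
    ("surviving a heart attack", "Myocardial Infarction"),
    ("heart transplant outcomes", "Heart Transplantation"),
    ("panic attack treatment", "Panic Disorder"),
]

assoc = build_associations(records)
print(suggest_headings(assoc, "heart attack"))
```

A searcher who does not know the formal vocabulary types familiar words and is pointed toward the headings an indexer would have used.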
 

Concept Clustering

Not only can words be captured and indexed; so can concepts, or "By that I mean ...". The search engine Northernlight is probably the first successful commercial venture to categorize Web documents by concept. Northernlight creates folders on the fly for its Web page retrieval set. These are grouped according to several different criteria including subject, domain, date, and so on. Their stated and somewhat tongue-in-cheek goal is:
"As a leading Internet search engine and the developer of the world's first research engine, our modest ambition, according to our CEO, David Seuss, is 'to index and classify all human knowledge to a unified consistent standard and make it available to everyone in the world in a single integrated search.'" [cite]
If they were to succeed, Northernlight would become the index to World Brain, that global encyclopedia first envisaged by H.G. Wells in 1938.

Northernlight does several things other search engines do not. First, they provide access not only to Web documents but also to their Special Collection of digitized articles and reports. This latter group is provided on a fee-paid basis. Their second innovation is the folders. Northernlight categorizes the return set from a search into folders. These folders are arranged by relevance according to hit set subject, type, source, and language. [cite]
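The folder idea can be sketched in a few lines. This is not Northernlight's method, which is not described here; it is a simple grouping of a hit list by one facet at a time. The result records, field names, and the function name `build_folders` are hypothetical.

```python
from collections import defaultdict

def build_folders(results, facet):
    """Group result titles into folders keyed by the value of one facet."""
    folders = defaultdict(list)
    for doc in results:
        # Documents missing the facet go into an "unknown" folder.
        folders[doc.get(facet, "unknown")].append(doc["title"])
    return dict(folders)

# Hypothetical hit list, already ordered by relevance.
results = [
    {"title": "Faceted classification primer", "subject": "cataloging", "language": "English"},
    {"title": "Le catalogage en ligne", "subject": "cataloging", "language": "French"},
    {"title": "Hypertext link analysis", "subject": "Web structure", "language": "English"},
]

print(build_folders(results, "subject"))
print(build_folders(results, "language"))
```

Because the hit list is scanned in relevance order, each folder keeps its documents in that order, and the same return set can be refiled under a different facet on demand.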

Classification by concept is not new. What if I said to you "I'm feeling blue"? Does that mean I'm a Druid? Or that I'm depressed? Or that I'm suffering from inadequate oxygenation? Or that I'm reaching for the sky? In order to understand the phrase "I'm feeling blue," additional context is needed. KWIC (key word in context) is one way to provide the context. Capture the terms surrounding the target term and perhaps meaning will emerge. That often works but is by no means perfect. "When I feel blue [depressed] I go outside because when I feel blue [the sky] I'm in the pink. But when skies are gray I cannot breathe and I feel blue [hypoxia]." Go forth and mix that metaphor.
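A key-word-in-context listing is easy to sketch: find each occurrence of the target term and keep a few words on either side. The function name `kwic` and the window size of three words are my own choices for illustration, not part of any standard.

```python
def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        # Compare without surrounding punctuation and without regard to case.
        if word.strip('.,;:!?"[]').lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append((left, word, right))
    return lines

text = "When I feel blue I go outside because when I feel blue I'm in the pink."

for left, key, right in kwic(text, "blue"):
    print(f"{left:>15} [{key}] {right}")
```

The listing shows the limit of the technique. Both occurrences of "blue" have the same left context, "I feel," so only the words to the right hint at which meaning is intended.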
 

Conceptual Browsing

Kent and Neuss have proposed a search technology called WAVE. WAVE is described as a referential classification system, a type of faceted classification. Referential classification is defined as "... a pragmatic and empirical system in which objects are related with reference to a chosen collection of conceptual scales. In referential classification, various external relations, user preferences, and the environment, are all important to the act of interpretation and classification." [cite] They discuss various metadata elements WAVE might capture and utilize.

WAVE is not the only conceptual browser proposed, considered, evaluated, accepted or rejected. See, for example:

Cooperative Agents for Conceptual Search and Browsing of World Wide Web Resources: Local Characterizing Agent. Available: http://homer.ittc.ukans.edu/website/agents/lca/architecture.html

The points to understand here are that (1) most search and retrieval systems are based on traditional methodologies, (2) technology permits us to easily extend that knowledge to practical applications, but (3) real innovation may come in the way the information is presented and interpreted. We've come a long way from the card catalog in just over a decade. And it's not over yet.

