
|Innovative Cataloging|concept|relationship|characteristic|
The University of Waterloo Scholarly Societies Project is notable for a second of its information management policies. Until November 1998, the Project maintained an alphabetical index of society names. It has since abandoned that index in favor of a search engine limited to the site. The Project has retained its subject guide, which spans agriculture to women's issues (http://www.lib.uwaterloo.ca/society/subjects_soc.html).
Finally, the Project published a Stability Index. It is built by assigning values to the form of the URL: a canonical URL receives the value 1.0, and an association URL that does not contain the association name in any form receives 0. Variations between these two receive values based on their perceived stability. Values by subject area are then calculated as the mean of the values in that area [cite]. According to the Web site, the actual stability and predictive value of the index have yet to be tested, but a protocol has been developed and a test is planned. Library and information science associations have been accorded an index value of 61.2% and are found in the second half of the associational listings. The Project reports a pattern for all associational groupings: "professions are most likely to have permanent URLs, followed by the sciences and social sciences, and then by the humanities." [cite]
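The arithmetic behind such an index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names and the intermediate value of 0.5 are hypothetical, since the text above gives only the two endpoint values (1.0 and 0), and the sample data is invented.

```python
from statistics import mean

# Only the endpoints (1.0 and 0.0) come from the text above. The category
# names and the intermediate value are illustrative assumptions.
URL_FORM_SCORES = {
    "canonical": 1.0,      # e.g. www.ala.org
    "name_in_path": 0.5,   # assumed intermediate form and value
    "no_name": 0.0,        # association name absent from the URL
}

def stability_index(url_forms):
    """Return the mean URL-form score for one subject area, as a percentage."""
    return 100 * mean(URL_FORM_SCORES[form] for form in url_forms)

# Invented sample: four associations in one subject area.
forms = ["canonical", "canonical", "name_in_path", "no_name"]
print(f"{stability_index(forms):.1f}%")  # prints 62.5%
```

The index for a subject area is simply the average of its members' scores, so a single area with many canonical URLs will rank high regardless of how long those URLs have actually persisted.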
The stability reported by the Scholarly Societies Project rests on two assumptions. The first is that organizations that have obtained a canonical URL are more likely to continue their existence. The second is that canonical URLs are inherently more stable than others. There are two reasons for this second assumption. The first is that once an association has adopted a URL for itself in the most abbreviated form, it has little reason to change it (www.ala.org is about as simple a URL as the American Library Association might have and still indicate identity). The second is that URLs are portable. They can be moved from host to host. The underlying IP number may change, but the URL need not. Remember too that PURLs can lend an additional element of stability.
Finally, my own research suggests that there is some truth to the stability assumptions made by the Scholarly Societies Project. Canonical or near-canonical URLs do appear to have greater lasting power than do others.
The same holds for ccTLDs. They draw upon ISO 3166, which defines (among others) two-letter codes for countries and other regions of the world. Thus Australia is indicated by .au, Chile by .cl, France by .fr, Malaysia by .my, the United States by .us, and South Africa by .za. Again there is some erosion, but the ccTLD gives a good indication of the geographic source of the information.
The second URL fragment, second from the right, is the second level domain or 2LD. These take several forms. In the ccTLD environment, in some countries they are functional tags that parallel the gTLD nomenclature. Thus academic institutions in the United Kingdom are designated ac.uk, Japanese organizations are or.jp, Mexican government pages carry gob.mx (gob for gobierno), and so on. Both these and gTLD 2LDs may also carry various names identifying the entity. These include trademarks, initials, and names. Thus, the University of Oklahoma is identified as ou.edu.
This is a fairly simplistic explanation of domain names. But suffice it to say that these fragments can be dissected and used for cataloging purposes. They may not be precise, but they do give some sense of what the site is about and often where it originated.
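As a rough illustration of how the fragments might be dissected, here is a minimal Python sketch. The function name dissect_host and the returned field names are hypothetical, and treating any two-letter top level domain as a ccTLD is a simplifying heuristic rather than a lookup against ISO 3166.

```python
from urllib.parse import urlsplit

def dissect_host(url):
    """Read a URL's host name right to left: top level domain, then 2LD."""
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    tld = labels[-1] if labels else ""
    second = labels[-2] if len(labels) >= 2 else ""
    return {
        "tld": tld,
        # Heuristic assumption: two-letter top level domains are country codes.
        "is_cctld": len(tld) == 2,
        "second_level": second,
    }

print(dissect_host("http://www.ou.edu/"))
print(dissect_host("http://www.lib.uwaterloo.ca/society/subjects_soc.html"))
```

For a gTLD address such as www.ou.edu the second level carries the entity's name, while for an address under ac.uk or or.jp it carries a functional tag, so a cataloger would need to interpret the two cases differently.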
URLs also carry file information. These files were named by the site creator and may be anything. These file names often carry with them a certain logic and it has been suggested that these names might also be captured as part of the cataloging process. Take a look at the file structure for this course. It is reflected in the site map. Is there any sense to it?
This is discussed also under URx.
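The file information in a URL can be pulled apart in much the same way as the domain name. The sketch below is illustrative only: dissect_path and its field names are hypothetical, and it assumes that a final segment containing a dot is a file name rather than a directory.

```python
from posixpath import splitext
from urllib.parse import urlsplit

def dissect_path(url):
    """Separate a URL's path into directories, file name, and extension."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    # Assumption: a last segment containing a dot is a file, not a directory.
    if segments and "." in segments[-1]:
        *directories, filename = segments
    else:
        directories, filename = segments, ""
    stem, extension = splitext(filename)
    return {
        "directories": directories,
        "file": stem,
        "extension": extension.lstrip("."),
    }

print(dissect_path("http://www.lib.uwaterloo.ca/society/subjects_soc.html"))
```

Here the directory "society" and the file name "subjects_soc" both hint at the content of the page, which is the kind of logic a cataloger might capture.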
For further discussion see:
Marisa Urgo, "A Shape for Internet Information: An Alternative Metaphor for Web Site Information," a paper presented at the ASIS Annual Meeting, Pittsburgh, PA, October 1998.
Wallace Koehler and Logan Barnett, "Domain Name Searching and World Wide Web Search Tactics," Searcher 6, 2 (February 1998), pp 54-62.
Wallace Koehler, "Unraveling the Issues, Actors, & Alphabet Soup of the Great Domain Name Debates," Searcher 7, 5 (May 1999). Available: http://www.infotoday.com/searcher/may99/koehler.htm
Wallace Koehler, "Classifying Websites and Webpages: The Use of Metrics and URL Characteristics as Markers," Journal of Librarianship and Information Science 31, 1 (March 1999), pp 21-31.
Usability is the subject of an important new book. I suggest you take a look at Jakob Nielsen's Designing Web Usability: The Practice of Simplicity before designing Web documents.
I think there are three not mutually exclusive ways to handle this change from the cataloging perspective. The first is to catalog in a very general way. By that I mean, avoid specific characterization of the page or site content. Abstracts should be just that, very abstract and general.
The second option I have suggested is to capture data on the degree of change specific Web documents undergo and to provide that information as part of their bibliographic record. This is not a particularly difficult exercise, but it does require frequent monitoring of the Web document, data capture, and repopulating of the rate-of-change metadata field in the bibliographic record. I have, for lack of a better term, labeled this change rate "omega."
The third option is to perform extensive recataloging as frequently as the Web document changes, or based on some other change criterion. Further research is needed to establish what those other change criteria might be and whether it is feasible to follow that strategy.
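The monitoring that the second option calls for might be sketched as follows. This is an assumption-laden illustration, not a definition of omega: the notes do not say how the change rate is calculated, so the sketch simply takes the fraction of consecutive visits on which a document's content differed. The names fingerprint and change_rate are hypothetical.

```python
import hashlib

def fingerprint(content: str) -> str:
    """Reduce a captured document to a short, comparable digest."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def change_rate(snapshots):
    """Assumed measure: share of consecutive visits where content differed."""
    prints = [fingerprint(s) for s in snapshots]
    if len(prints) < 2:
        return 0.0  # one visit (or none) gives no evidence of change
    changes = sum(a != b for a, b in zip(prints, prints[1:]))
    return changes / (len(prints) - 1)

# Invented captures of one document over four visits.
visits = ["version A", "version A", "version B", "version C"]
print(round(change_rate(visits), 2))  # prints 0.67
```

A measure like this records only that a document changed, not how much, so a cataloger interested in the degree of change would need a finer comparison than a digest.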