
"Metametadata"

"Metametadata," to coin a term, provides a vehicle, a mechanism, a crosswalk that permits one metadata system to "talk" to another. Metametadata schemes are mapping and translation systems that support interoperability and intersystem translation. They are not "perfect": when mapping from one system to another, information is frequently lost in the translation.

Resource Description Framework (RDF)

 
learning objective
What is metadata? Metametadata?

What is the W3C, and what does it do?

What are some current metametadata applications?

Given what you know of metadata, cataloging, and indexing - what are the strengths and weaknesses of the system?

RDF is metametadata. It permits the incorporation of various metadata schemes into a single format. RDF is written in XML and, as in Table 4, can be embedded in an HTML page. Table 1 presents the Dublin Core and metatag markup for the index page for this course. Table 2 presents the Dublin Core data, but not the metatag data, in RDF markup. Table 3 presents the triples for each metadata element. Triples restate in a different format the metadata category (creator, date, format, etc.), the source URL analyzed, and the value.
 
 
 
"Resource Description Framework (RDF) is a foundation for processing metadata; it provides interoperability between applications that exchange machine-understandable information on the Web."
W3C. http://www.w3.org/TR/PR-rdf-syntax/#intro

The purpose of RDF is for the "metamessage" to carry with it enough defining information about the metadata (hence, to coin a phrase, metametadata) for cross communication. It is therefore a crosswalk between various metadata systems in addition to a defining metadata scheme.

RDF is defined by "schemas." Dublin Core is one such schema, and note in Table 2 that Dublin Core metadata are so identified. For example, the rights statement is rendered dc:rights="Copyright Wallace Koehler 2000 All Rights Reserved". It is conceivable that a second metadata scheme might use the same "rights" label. That second metadata language, let's call it OUCore, might use the "rights" field not for a copyright statement but to define property easements. Because each prefix is bound to a different namespace, RDF would interpret the ou:rights and dc:rights statements as representing different metadata categories and would map them appropriately. Had my Web page carried PICS classifications, RDF would interpret the ratings codes.

Much of the above is derived from: http://www.w3.org/Metadata/Activity.html
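
The dc:rights versus ou:rights distinction can be tried out with any XML parser. What follows is only a minimal sketch in Python: the OUCore namespace URI and the easement text are made up for illustration, since OUCore itself is an imaginary scheme. The parser expands each prefix to its namespace, so the two "rights" labels end up with different full names.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.0/"
# Hypothetical namespace for the imaginary OUCore scheme (an assumption).
OU_NS = "http://example.org/oucore/"

RDF_XML = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}" xmlns:ou="{OU_NS}">
  <rdf:Description about="http://www.ou.edu/"
    dc:rights="Copyright Wallace Koehler 2000 All Rights Reserved"
    ou:rights="Utility easement along the north boundary"/>
</rdf:RDF>
"""


def qualified_attributes(xml_text):
    """Return {expanded attribute name: value} for the first rdf:Description."""
    root = ET.fromstring(xml_text)
    description = root.find(f"{{{RDF_NS}}}Description")
    return dict(description.attrib)


attrs = qualified_attributes(RDF_XML)
for name, value in sorted(attrs.items()):
    # Prefixed attributes appear as "{namespace}rights"; "about" has no prefix.
    print(name, "=>", value)
```

Both properties are called "rights," but their expanded names differ, which is what lets software keep the copyright statement and the easement apart.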

 

RDF Based Systems

Content Management Framework - see http://cmf.zope.org/

UniSys - see http://www.unisys.com/content/default-02.asp

Prowler - see http://www.infozone-group.org/prowlerDocs/html/proposal.html

XCM (aka eXtended Content Management) - a cooperative venture among corporate vendors that includes metadata standards for the B2C push market. See http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html or http://www.planetit.com/techcenters/docs/enterprise_apps_systems-enterprise_apps/news/PIT20001114S0020

RDF Examples

 
Table 1: Dublin Core and metatag markup for the course index page
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
   <meta name="GENERATOR" content="Mozilla/4.7 [en] (Win95; U) [Netscape]">
   <meta name="DESCRIPTION" content="Web Management Catalog and Management">
   <meta name="KEYWORDS" content="catalog index pic doi dc quality control web management">
   <meta name="Author" content="Wallace Koehler">
   <meta name="DC.Title" content="Web Document Management">
   <meta name="DC.Title.Alternative" content="LIS 5990 Summer 2000">
   <meta name="DC.Creator.PersonalName" content="Koehler, Wallace">
   <meta name="DC.Creator.PersonalName.Address" content="wkoehler@ou.edu">
   <meta name="DC.Subject" content="(SCHEME=LCCS) Z">
   <meta name="DC.Description" content="A library school web-based graduate course on bibliographic management of the WWW">
   <meta name="DC.Publisher" content="School of Library and Information Studies, University of Oklahoma">
   <meta name="DC.Date" content="(SCHEME=ISO8601) 2000-05-01">
   <meta name="DC.Type" content="Text.Index">
   <meta name="DC.Format" content="(SCHEME=IMT) text/html">
   <meta name="DC.Identifier" content="http://www.ou.edu/">
   <meta name="DC.Language" content="(SCHEME=ISO639-1) en">
   <meta name="DC.Coverage" content="metadata">
   <meta name="DC.Rights" content="Copyright Wallace Koehler 2000 All Rights Reserved">
   <meta name="DC.Date.X-MetadataLastModified" content="(SCHEME=ISO8601) 2000-02-05">
   <title>Web  Document Management</title>

 
 
Table 2: Dublin Core metadata in RDF markup
<rdf:RDF
  xmlns:biblink="http://biblink.ukoln.ac.uk/metadata/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description about="http://www.ou.edu/"
    dc:title="Web Document Management"
    dc:creator="Wallace Koehler"
    dc:subject="(SCHEME=LCCS) Z"
    dc:description="A library school web-based graduate course
    on bibliographic management of the WWW"
    dc:publisher="School of Library and Information Studies,
    University of Oklahoma"
    dc:date="(SCHEME=ISO8601) 2000-05-01"
    dc:type="Text.Index"
    dc:format="(SCHEME=IMT) text/html"
    dc:language="(SCHEME=ISO639-1) en"
    dc:coverage="metadata"
    dc:rights="Copyright Wallace Koehler 2000 All Rights
    Reserved"
    biblink:Extent="11810 bytes"
  />
</rdf:RDF>

 
 
Table 3: Triples of the data model
triple('http://purl.org/dc/elements/1.0/coverage',
       'http://www.ou.edu/',
       'metadata').
triple('http://biblink.ukoln.ac.uk/metadata/Extent',
       'http://www.ou.edu/',
       '11810 bytes').
triple('http://purl.org/dc/elements/1.0/date',
       'http://www.ou.edu/',
       '(SCHEME=ISO8601) 2000-05-01').
triple('http://purl.org/dc/elements/1.0/creator',
       'http://www.ou.edu/',
       'Wallace Koehler').
triple('http://purl.org/dc/elements/1.0/format',
       'http://www.ou.edu/',
       '(SCHEME=IMT) text/html').
triple('http://purl.org/dc/elements/1.0/subject',
       'http://www.ou.edu/',
       '(SCHEME=LCCS) Z').
triple('http://purl.org/dc/elements/1.0/description',
       'http://www.ou.edu/',
       'A library school web-based graduate course  on bibliographic management of the WWW').
triple('http://purl.org/dc/elements/1.0/type',
       'http://www.ou.edu/',
       'Text.Index').
triple('http://purl.org/dc/elements/1.0/title',
       'http://www.ou.edu/',
       'Web Document Management').
triple('http://purl.org/dc/elements/1.0/language',
       'http://www.ou.edu/',
       '(SCHEME=ISO639-1) en').
triple('http://purl.org/dc/elements/1.0/publisher',
       'http://www.ou.edu/',
       'School of Library and Information Studies,  University of Oklahoma').
triple('http://purl.org/dc/elements/1.0/rights',
       'http://www.ou.edu/',
       'Copyright Wallace Koehler 2000 All Rights  Reserved').
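
To see how the triples in Table 3 follow from the markup in Table 2, here is a minimal sketch in Python. It is not the tool that produced Table 3; the function name to_triples is my own, and the input is a shortened copy of Table 2 with only three properties. Each prefixed attribute becomes one triple: the full property URI, the resource named in the about attribute, and the value.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of Table 2 (three of the properties).
TABLE_2 = """
<rdf:RDF
  xmlns:biblink="http://biblink.ukoln.ac.uk/metadata/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description about="http://www.ou.edu/"
    dc:title="Web Document Management"
    dc:creator="Wallace Koehler"
    biblink:Extent="11810 bytes"
  />
</rdf:RDF>
"""


def to_triples(rdf_xml):
    """Return a list of (property URI, resource URL, value) tuples."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get("about")
        for name, value in description.attrib.items():
            if not name.startswith("{"):
                continue  # skip unprefixed attributes such as "about"
            namespace, local = name[1:].split("}", 1)
            triples.append((namespace + local, resource, value))
    return triples


triples = to_triples(TABLE_2)
for prop, resource, value in triples:
    # Print in the same layout as Table 3.
    print(f"triple('{prop}',\n       '{resource}',\n       '{value}').")
```

The property URI is simply the namespace declared for the prefix joined to the element name, which is why dc:title appears in Table 3 as http://purl.org/dc/elements/1.0/title.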

RDF consists of three elements: resources, properties, and statements. The resource is the thing being described and may include Web pages or parts of Web pages. Table 4 shows these elements for my ethics page. The opening and closing rdf:RDF tags delimit the RDF markup. The resource definition, the about attribute of rdf:Description, identifies the source of the information. Properties, the dc: and biblink: attributes, describe the resource. A statement combines a resource, a property, and the value of that property; the statements are everything between the opening and closing rdf:RDF tags. This metadata statement can be read by looking at the page source code (including Dublin Core markup) at http://www.ou.edu/cas/slis/ethics/EthicsBibOrg.htm
 
Table 4. RDF and the ethics page.
<HTML>
<!--This file created 11:00 AM  2/10/00 by Claris Home Page version 3.0-->
<HEAD>
   <TITLE>Ethics Links to Librarian and Information Manager Associations</TITLE>
   <META NAME=GENERATOR CONTENT="Claris Home Page 3.0">
   <X-CLARIS-WINDOW TOP=0 BOTTOM=435 LEFT=0 RIGHT=787>
   <X-CLARIS-TAGVIEW MODE=minimal>

Note that the RDF conversion ignored the metatags and converted only the Dublin Core markup (not shown here).

<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<META name="description" content="library and information science professional organization ethics statements and supporting material">
<META name="keywords" content="professional ethics, standard of practice, librarian, information scientist">
<META name="author" content="Wallace Koehler">
<META name="distribution" content="global">

<rdf:RDF
  xmlns:biblink="http://biblink.ukoln.ac.uk/metadata/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description about="http://books/valosta.edu/mlis/ethics/EthicsBibOrg.htm"
    dc:title="Ethics Links to Librarian and Information Manager
    Associations"
    dc:creator="Wallace Koehler"
    dc:subject="professional ethics; standards of practice;
    librarian; information professional"
    dc:description="A catalog of library and information
    science professional associations and related ethics
    statements."
    dc:publisher="Valdosta State University"
    dc:contributor="Kara Whatley"
    dc:date="2000-01-25"
    dc:type="Text"
    dc:format="text/html"
    dc:language="en-us"
    dc:rights="copyright Wallace Koehler 2000 All Rights
    Reserved"
    biblink:Extent="239400 bytes"
  />
</rdf:RDF>
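
The selective behavior noted above, where ordinary metatags are ignored and only Dublin Core markup is carried into RDF, can be sketched in a few lines of Python. This is an illustration only, not a reconstruction of the converter used for these tables: the class name is hypothetical, the sample tags are taken from Table 1, and the rule of dropping everything after the element name (so that a qualified name such as DC.Creator.PersonalName would become dc:creator) is my assumption.

```python
from html.parser import HTMLParser


class DublinCoreMetaParser(HTMLParser):
    """Collect <meta name="DC.xxx"> tags and ignore every other meta tag."""

    def __init__(self):
        super().__init__()
        self.elements = []  # list of (dc element name, content)

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.upper().startswith("DC."):
            # Assumption: keep only the element name, dropping any qualifiers.
            element = name.split(".")[1].lower()
            self.elements.append((element, attributes.get("content", "")))


# A few of the tags from Table 1: two ordinary metatags and three DC tags.
SAMPLE_HEAD = """
<meta name="KEYWORDS" content="catalog index pic doi dc quality control web management">
<meta name="Author" content="Wallace Koehler">
<meta name="DC.Title" content="Web Document Management">
<meta name="DC.Type" content="Text.Index">
<meta name="DC.Rights" content="Copyright Wallace Koehler 2000 All Rights Reserved">
"""

parser = DublinCoreMetaParser()
parser.feed(SAMPLE_HEAD)
for element, content in parser.elements:
    # Print each surviving element as an RDF-style attribute.
    print(f'dc:{element}="{content}"')
```

The KEYWORDS and Author tags never reach the output, which mirrors what happened to the metatags in Tables 2 and 4.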

Schemas

It appears that RDF will support a variety of XML-based Web document metadata schemes. Most descriptions are limited, and it should be noted that RDF is not a fully promulgated standard. It is a standard in development. For example, Jenkins et al. developed the Wolverhampton Core to describe documents but migrated it to Dublin Core, then converted that to RDF.

A W3C publication describes RDF Schemas for two markups: XML Serialization and Dublin Core. We must therefore consider RDF a standard in progress, but one that will be more fully developed over time. If indeed it becomes a metametadata standard, it could significantly improve the transfer of metadata, and therefore of document description, among Web systems.

An RDF converter is available at http://www.ukoln.ac.uk/cgi-bin/dcdot.pl?biblinkmode=on

Z39.50

The Z39.50 standard (ISO 23950, "Information Retrieval (Z39.50): Application Service Definition and Protocol Specification," and ANSI/NISO Z39.50) is one of the first metametadata standards to support information markup interoperability. As a protocol for searching and retrieving information between systems, Z39.50 defines system characteristics. It also provides a crosswalk between different markup schemes, for example between USMARC and UNIMARC.

ZSTARTS is a simplified variation on the Z39.50 standard designed for Web-based applications (Denenberg 1996). It can be used to support searches in metasearch engines and results filtering.
 

Metadata Encoding and Transmission Standard

The Metadata Encoding and Transmission Standard (METS) is a US Library of Congress supported standard to facilitate document description, including markup language. A METS record consists of four sections: (1) document metadata, (2) administrative metadata, (3) file groups, and (4) the structural map. Document metadata (metadata that describe document content) may be stored remotely or embedded in the document. The administrative metadata describe rights, provenance, and other related information. File groups list all electronic files that define the document. The structural map defines the hierarchy of the digital object and its relationship to other objects. For more information see http://www.loc.gov/standards/mets/METSOverview.html.
 

Open Archives Initiative (OAI)

“The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content” (Open Archives Initiative). Like RDF, it is being developed to promote interoperability among metadata markup schemes through what OAI calls “metadata harvesting.” The specifics are defined by Herbert Van de Sompel and Carl Lagoze (2001). OAI Records provide metametadata on the markup schemas used by document authors; each record contains at minimum Dublin Core and perhaps other schemes. OAI Records are XML documents resident on OAI servers. They are provided from a repository in response to a metadata protocol request.
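
A metadata protocol request is an ordinary HTTP request with a few named arguments. The sketch below, in Python, builds the URL for a GetRecord request asking for a record in Dublin Core (the oai_dc metadata prefix). The repository address and the record identifier are hypothetical, and the code only constructs the URL; it does not contact any server.

```python
from urllib.parse import urlencode


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build the URL for an OAI GetRecord request."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Hypothetical repository and record identifier (assumptions for illustration).
url = get_record_url("http://repository.example.org/oai", "oai:example.org:item-1")
print(url)
```

The repository answers such a request with an XML document containing the record in the requested metadata format.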
 

Stanford InfoBus

The Stanford University Digital Library Initiative I project addressed interoperability among various services running different metadata protocols (Paepcke et al. 1996). The InfoBus, a metametadata engine, is capable of translating one metadata standard to the InfoBus standard, then translating the InfoBus standard to the second metadata standard.
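
The two-step translation, from the first standard to a common standard and from the common standard to the second, can be sketched as follows in Python. Nothing here comes from the InfoBus project itself: the field mappings and the hub field names ("author", "summary") are invented for illustration. The sketch also shows why crosswalks lose information: a field with no counterpart on the other side is dropped.

```python
# Hypothetical field mappings; none of these names come from the InfoBus project.
TO_HUB = {
    "dublin_core": {"creator": "author", "title": "title", "rights": "rights"},
    "metatag": {"Author": "author", "DESCRIPTION": "summary"},
}
FROM_HUB = {
    "dublin_core": {"author": "creator", "title": "title", "rights": "rights"},
    "metatag": {"author": "Author", "summary": "DESCRIPTION"},
}


def translate(record, source, target):
    """Translate source -> hub -> target.

    Returns (translated record, names of fields that could not be carried across).
    """
    hub_record, lost = {}, []
    for field, value in record.items():
        hub_field = TO_HUB[source].get(field)
        if hub_field is None:
            lost.append(field)  # no hub equivalent for this source field
        else:
            hub_record[hub_field] = value
    translated = {}
    for hub_field, value in hub_record.items():
        target_field = FROM_HUB[target].get(hub_field)
        if target_field is None:
            lost.append(hub_field)  # the target scheme has no such field
        else:
            translated[target_field] = value
    return translated, lost


dc_record = {
    "creator": "Wallace Koehler",
    "title": "Web Document Management",
    "rights": "Copyright Wallace Koehler 2000 All Rights Reserved",
}
translated, lost = translate(dc_record, "dublin_core", "metatag")
print(translated)
print("lost in translation:", lost)
```

With a hub, each scheme needs only one mapping to and from the common standard rather than a separate crosswalk to every other scheme.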
 

Semantic Web

One of the more recent and exciting initiatives using RDF to catalog the Web is the "Semantic Web" concept. The idea is described in a May 2001 Scientific American article by Tim Berners-Lee, James Hendler, and Ora Lassila entitled "The Semantic Web". Berners-Lee, you will recall, is the father of the WWW.

The semantic Web concept carries interesting baggage: a combination of relational databases and concept thesauri interlinked via hypertext to provide very specific meanings and to interrelate Web documents. The semantic Web differs from what Berners-Lee, Hendler, and Lassila call the "traditional knowledge-representation systems." Traditional knowledge-representation systems are "stifling," they say, because these systems require specific and agreed definitions for the thesaurus terms. These systems are also inherently centralized and bureaucratized. Bureaucratic and centralized systems are inherently slow to change and fraught with debate and delay. Berners-Lee et al. are right. Centralized systems are slow to change; there is an ample literature that points to the experiences of, say, Dublin Core as it adapts and changes. Or consider the history of change and modification for DDC or LCSH.

Instead, the semantic Web, according to Berners-Lee et al., will accept "...that paradoxes and unanswerable questions are a price that must be paid to achieve versatility." However, "[t]he challenge of the Semantic Web, therefore, is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge-representation system to be exported onto the Web" (Berners-Lee et al. 2001). The Semantic Web, therefore, is conceived of as a middle ground between "stifling bureaucracy" and the "Wild Wild Web."

The power of RDF and, because it is an RDF-based system, of the Semantic Web lies in the use of "ontologies," the metametadata containers. Ontologies serve as metadata and data "containers" and provide the definitional guidance that standardized definitional crosswalks need in order to operate. Berners-Lee et al. provide the example of differences between "zip codes" and "postal codes" as metadata and the need for a common framework to understand their commonalities. Ontologies can be used to define much broader concepts.

So, what's good about the Semantic Web? It offers further elaboration of metadata and metametadata definition and exchange. What's wrong with it? It continues to be dependent on Web creator participation and assumes an expertise and sophistication on the part of those same Web creators that may be unwarranted.

That said, this one bears watching. It might even be smart to begin to participate in the formulation of Semantic Web standards.
 

Readings and References

W3C, Resource Description Framework (RDF) Model and Syntax Specification W3C Proposed Recommendation, PR-rdf-syntax-19990105, 05 January 1999. Available: http://www.w3.org/TR/PR-rdf-syntax/

Charlotte Jenkins, Mike Jackson, Peter Burden, Jon Wallis. (n.d.) Automatic RDF Metadata Generation for Resource Discovery. School of Computing & IT, University of Wolverhampton. Available: http://www.scit.wlv.ac.uk/~ex1253/rdf_paper/

Paepcke, Andreas, Steve B. Cousins, Hector Garcia-Molina, Scott W. Hassan, Steven P. Ketchpel, Martin Röscheisen, and Terry Winograd (1996). “Using Distributed Objects for Digital Library Interoperability.” Computer. Available: http://computer.org/computer/dli/r50061/r50061.htm

Van de Sompel, Herbert and Carl Lagoze (2001). The Open Archives Initiative Protocol for Metadata Harvesting, Protocol Version 1.1 of 2001-07-02. Available: http://www.openarchives.org/OAI/openarchivesprotocol.htm#Record

W3C, Resource Description Framework (RDF) Schema Specification, W3C Proposed Recommendation 03 March 1999, Available: http://www.w3.org/TR/1998/WD-rdf-schema/

SemanticWeb.org Portal. http://www.SemanticWeb.org/

Ontology.org.  http://www.ontology.org/ (B2B Resource)

Berners-Lee, Tim, James Hendler and Ora Lassila (2001). "The Semantic Web." Scientific American, May 2001. Available: http://www.scientificamerican.com/2001/0501issue/0501berners-lee.html

Markup Languages and Ontologies, http://www.SemanticWeb.org/knowmarkup.html

W3C, Semantic Web Activity, http://www.w3.org/2001/sw/

The SHOE [Simple HTML Ontology Extensions] FAQ. http://www.cs.umd.edu/projects/plus/SHOE/faq.html#q1.1

SHOE Ontologies. http://www.cs.umd.edu/projects/plus/SHOE/onts/index.html#base