
What is the Standard Generalized Markup Language (SGML)? The Library
of Congress defines it as follows: "SGML is a set of rules for defining and expressing
the logical structure of documents thereby enabling software products to
control the searching, retrieval, and structured display of those documents."
SGML is a parent markup language and a metalanguage; XML is derived from it. Markup languages such as SGML and its derivative XML are used to encode a body of material, giving it specific machine-readable meaning. The distinguishing characteristic of SGML is that it declares the type of a document by means of a document type definition (DTD). Documents thus have a form and characteristics that define their type. HTML is an SGML-based language.
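As a rough illustration of a document type definition, the sketch below declares a hypothetical "memo" document type in an XML internal DTD subset and reads the declared type back with Python's standard library. XML stands in for SGML here because Python ships no SGML parser; the memo type, its element names, and the function name are assumptions made for the example. Note that minidom records the declaration but does not validate the document against it.

```python
from xml.dom import minidom

# Hypothetical "memo" document type, declared in an internal DTD subset.
MEMO = """<!DOCTYPE memo [
  <!ELEMENT memo (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<memo><to>Reference desk</to><body>Shelf list attached.</body></memo>"""


def declared_type(source: str) -> str:
    """Return the document type name given in the DOCTYPE declaration."""
    doc = minidom.parseString(source)
    return doc.doctype.name


if __name__ == "__main__":
    print(declared_type(MEMO))
```

The type is read from the declaration alone, without inspecting the content of the memo.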
Because an SGML document is governed by a DTD, SGML is an electronic variant of diplomatics. Diplomatics is, according to Duranti, the bibliographic art of comprehending the purpose of a document by examining its form or format. Legal documents, for example, have a recognizable appearance or "feel." The National Enquirer and the New York Times each have their own form. A great deal can be learned about the content, and the quality of that content, at a glance and without extensive perusal of the documents. SGML provides an electronic definition of that form.
An SGML marked-up document can be used in a number of ways, assuming one has the appropriate formatter or editor. SGML can be used to navigate within documents, to modify document templates for specific applications, and, through its terms and entities, to classify document types and content.
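Navigation and classification by markup can be sketched with a small, hypothetical XML document (again standing in for SGML). The element names, the sample text, and the `outline` function are assumptions for illustration only. The root element name serves as a crude indicator of document type, and the section headings give a navigable outline.

```python
import xml.etree.ElementTree as ET

# Hypothetical marked-up report; element names are illustrative only.
REPORT = """<report>
  <title>Annual Holdings</title>
  <section id="s1"><heading>Serials</heading></section>
  <section id="s2"><heading>Monographs</heading></section>
</report>"""


def outline(source: str):
    """Return the root element name and the list of section headings."""
    root = ET.fromstring(source)
    headings = [section.findtext("heading") for section in root.iter("section")]
    return root.tag, headings


if __name__ == "__main__":
    doc_type, headings = outline(REPORT)
    print(doc_type)   # report
    print(headings)   # ['Serials', 'Monographs']
```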
SGML supports entities. Entities are defined in SGML as substitutes or
shorthands for terms or concepts. An SGML-sensitive interpreter reads the
SGML entities and substitutes the defined string for each entity. Entities
are in a sense electronic versions of jargon, abbreviations, or acronyms,
except that the latter are interpreted mentally rather than electronically.
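Entity substitution can be shown with a minimal sketch, using XML's entity mechanism, which it inherits from SGML. The entity name `loc`, the sample sentence, and the `expand` function are assumptions for the example. The parser in Python's standard library expands entities declared in the internal DTD subset as it reads the document.

```python
import xml.etree.ElementTree as ET

# The entity "loc" is a hypothetical shorthand declared in the internal subset.
NOTE = """<!DOCTYPE note [
  <!ENTITY loc "Library of Congress">
]>
<note>Definition supplied by the &loc;.</note>"""


def expand(source: str) -> str:
    """Parse the document and return its text with entities substituted."""
    return ET.fromstring(source).text


if __name__ == "__main__":
    print(expand(NOTE))  # Definition supplied by the Library of Congress.
```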
On metadata interchange, the W3C notes:
"7. Metadata Interchange
There is growing interest in the interchange of metadata (especially for databases) and in the use of metadata registries to facilitate interoperability of database design, DBMS, query, user interface, data warehousing, and report generation tools. Examples include ISO 11179 and ANSI X3.285 data registry standards, and OMG's proposed XMI standard." (W3Ca)
XML Schema defines a complex set of data types for XML (W3Cb). Each data type is defined by a 3-tuple: a value space, a lexical space, and a set of facets of the value space.
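The 3-tuple view of a data type (value space, lexical space, facets) can be sketched as follows. This is an illustrative model, not the W3C's definition: the `Datatype` class, the facet names `min_inclusive` and `max_inclusive`, and the `YEAR` type are all assumptions. The lexical space is approximated by a regular expression, the value space by a conversion function, and the facets by simple range bounds on the converted value.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Datatype:
    name: str
    lexical_pattern: str               # lexical space: the permitted literals
    to_value: Callable[[str], Any]     # maps a literal into the value space
    facets: Dict[str, Any] = field(default_factory=dict)  # constraints on values

    def parse(self, literal: str) -> Any:
        """Check the literal against the lexical space, then the facets."""
        if re.fullmatch(self.lexical_pattern, literal) is None:
            raise ValueError(f"{literal!r} is not in the lexical space of {self.name}")
        value = self.to_value(literal)
        low = self.facets.get("min_inclusive")
        high = self.facets.get("max_inclusive")
        if low is not None and value < low:
            raise ValueError(f"{value} is below min_inclusive={low}")
        if high is not None and value > high:
            raise ValueError(f"{value} is above max_inclusive={high}")
        return value


# Hypothetical four-digit year type with a bounded value space.
YEAR = Datatype("year", r"[0-9]{4}", int,
                {"min_inclusive": 1000, "max_inclusive": 2999})

if __name__ == "__main__":
    print(YEAR.parse("1999"))
```

A literal such as "0999" belongs to the lexical space but is rejected by the facets, which shows why the two are kept separate.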
For an interesting and somewhat critical review of XML as Web markup,
and therefore as an indexing language, see Elliott Pritchard's MSc thesis
at the University of Sheffield. He concludes that there will likely be more
corporate than personal support for the standard, and that while XML has
drawbacks, its advantages outweigh them.
L. Duranti. "Diplomatics: New uses for an old science." Archivaria 28, 1 (1989) pp 7-17.
Robin Cover, The SGML/XML Web Page Extensible Markup Language (XML) http://www.oasis-open.org/cover/xml.html#overview. Last modified 31 January 2000.
W3C(a), XML Schema Requirements, W3C Note 15 February 1999, NOTE-xml-schema-req-19990215. Available: http://www.w3.org/TR/NOTE-xml-schema-req
W3C(b), XML Schema Part 2: Datatypes, W3C Working Draft 17 December 1999. Available: http://www.w3.org/TR/1999/WD-xmlschema-2-19991217/datatypes.html
Elliott Pritchard, XML: the future of web markup? MSc Thesis in Information Management, 1998/1999. University of Sheffield. Available: http://panizzi.shef.ac.uk/elecdiss/edl0003/index.html