MARC21

The MARC (Machine-Readable Cataloging) record is a complex set of metadata describing any given document. There are many MARC flavors: MARC21, UNIMARC, UKMARC, DANMARC, and so on. They have much in common, but each varies to reflect local language and usage. All are Z39.50 compliant and can exchange records with each other and with other Z39.50-compliant systems.
 
 
We will not go into a long discussion of what MARC is and the meaning of its fields. It is assumed that students are already familiar with MARC and the structure of the MARC record.

For students needing a refresher, the Library of Congress provides extensive documentation in "MARC 21 Concise Formats" at http://www.loc.gov/marc/concise/concise.html#general_intro
and "Classification Data" at http://lcweb.loc.gov/marc/classification/ as well as MARC Standards at http://www.loc.gov/marc/

MARC is a metadata coded template. It is a complex set of rules, easy to understand in the abstract but difficult to apply in the concrete.  It is a set of numbers, letters, and characters that allow very careful and precise description of a "knowledge product." For a taste of MARC's complexity and subtlety, see MARC 21 Format for Bibliographic Data, National Level Record and Minimal Level Record Requirements.

MARC is an evolving standard. To use MARC to catalog an electronic document, it is necessary to understand the uses and applications of the standard. Students should acquire this competency in courses such as VSU MLIS 7300 and others. Our chief concern is with the description of electronic documents, which is accomplished in the 856 field (UKMARC has yet to incorporate the 856 field). The 856 field is a very recent addition to USMARC and therefore to MARC21. Modifications to the 856 field are being made to accommodate electronic serials; see, for example, Minutes, Committee to Study Serials Cataloging, Serials Section, 2001 Annual Conference at San Francisco, CA, Monday, June 18, 2001.

MARC21, as updated,  represents the coordination of USMARC and CanMARC into a single standard. A MARC record consists of three parts or elements:  the record structure, the content designation, and the data content of the record. This course does not delve into the intricacies.

MARC21 also makes use of the Leader field. The Leader consists of the first 24 characters of the record (positions 00-23). Position 06 (Leader/06) indicates the type of record. According to a Library of Congress publication, the Leader/06 codes are as follows:
 
 

a - Language material
Includes printed, microform, and electronic language material. 
c - Notated music
Includes microform and electronic notated music. 
d - Manuscript notated music
Includes microform manuscript music. 
e - Cartographic material
Includes maps, atlases, globes, digital maps, and other cartographic items. 
f - Manuscript cartographic material
Includes microform manuscript maps. 
g - Projected medium
Examples include: motion pictures, videorecordings (including digital video), filmstrips, slides, transparencies, or material specifically designed for projection. 
i - Nonmusical sound recording
Examples include: speech. 
j - Musical sound recording
Examples include: phonodiscs, compact discs, or cassette tapes. 
k - Two-dimensional nonprojectable graphic
Examples include: activity cards, charts, collages, computer graphics, drawings, duplication masters, flash cards, paintings, photonegatives, photoprints, pictures, photo CDs, postcards, posters, prints, spirit masters, study prints, technical drawings, photomechanical reproductions, and reproductions of any of these. 
m - Computer file
Includes the following classes of electronic resources: computer software (including programs, games, fonts), numeric data, computer-oriented multimedia, online systems or services. For these classes of materials, if there is a significant aspect that causes it to fall into another Leader/06 category, the code for that significant aspect is used instead of code m (e.g., vector data that is cartographic is not coded as numeric but cartographic). Other classes of electronic resources are coded for their most significant aspect (e.g., language material, graphic, cartographic material, sound, music, moving image). In case of doubt or if the most significant aspect cannot be determined, consider the item a computer file. 
o - Kit
Contains a mixture of components from two or more types of items, none of which is the predominant constituent of the kit. 
p - Mixed material
Indicates that there are significant materials in two or more forms that are usually related by virtue of their having been accumulated by or about a person or body. Includes archival fonds and manuscript collections of mixed forms of materials, such as text, photographs, and sound recordings. 
r - Three-dimensional artifact or naturally occurring object
Includes man-made objects, such as models, dioramas, games, puzzles, simulations, sculptures and other three-dimensional art works and their reproductions, exhibits, machines, clothing, toys, and stitchery, and naturally occurring objects, such as microscope specimens and other specimens mounted for viewing. 
t - Manuscript language material
 
                  Source: MARC 21 Concise Bibliographic: Leader and Directory, http://lcweb.loc.gov/marc/bibliographic/ecbdldrd.html#mrcbLEA
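The Leader/06 lookup described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `record_type` and the sample leader string are made up for this example and are not taken from any MARC library or real record. The code table is the one listed above.

```python
# Leader/06 "type of record" codes, as listed in the table above.
RECORD_TYPES = {
    "a": "Language material",
    "c": "Notated music",
    "d": "Manuscript notated music",
    "e": "Cartographic material",
    "f": "Manuscript cartographic material",
    "g": "Projected medium",
    "i": "Nonmusical sound recording",
    "j": "Musical sound recording",
    "k": "Two-dimensional nonprojectable graphic",
    "m": "Computer file",
    "o": "Kit",
    "p": "Mixed material",
    "r": "Three-dimensional artifact or naturally occurring object",
    "t": "Manuscript language material",
}


def record_type(leader: str) -> str:
    """Return the type of record named by position 06 of a 24-character leader."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters (positions 00-23)")
    code = leader[6]  # positions are counted from 00, so Leader/06 is index 6
    return RECORD_TYPES.get(code, f"unknown code {code!r}")


# Hypothetical leader, invented for illustration only.
sample_leader = "00714cam a2200205 a 4500"
print(record_type(sample_leader))
```

Because the positions are numbered from 00, Leader/06 is simply index 6 of the string; no further parsing is needed to find the record type.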

MARC is not limited to any given medium, language, or concept. Because MARC in its various manifestations "talks" with other versions of itself, bibliographers, catalogers, and indexers can communicate with one another and share information and records.

MARC is divided into three parts: record structure, content designation, and data content. The MARC record structure consists of a set of fields. Each field is labeled with a three-digit number (the tag). The characters, spaces, and punctuation that follow the tag are all pregnant with meaning. Content designation assigns that meaning: what does the 100 field mean or represent? Data content is the data with which the field is populated.

The following description of the 856 field is copied from the Library of Congress: 856 - ELECTRONIC LOCATION AND ACCESS (R)
 
 

Indicators

     First - Access method
     A value that defines how the rest of the data in the field will be used. The value in this indicator position determines which
     subfields are appropriate. 
          # - No information provided
          0 - Email
          Indicates that access is through the Mail Transfer Protocol (MAILTP).
          1 - FTP
          2 - Remote login (Telnet)
          3 - Dial-up
          Indicates that access to the electronic resource is through a conventional telephone line. 
          4 - HTTP
          Indicates that access to the electronic resource is through the Hypertext Transfer Protocol. 
          7 - Method specified in subfield $2
     Second - Relationship
     A value that identifies the relationship between the electronic resource at the location identified in field 856 and the entity
     described in the record. Only value # (no information provided) is used for classification records. 
          # - No information provided
          0 - Resource
          1 - Version of resource
          2 - Related resource
          8 - No display constant generated

Subfield Codes

     $a - Host name (R)
     The fully qualified domain (host name) of the electronic location. It contains a network address which is repeated if there
     is more than one address for the same host. 
     $b - Access number (R)
     The Internet Protocol (IP) numeric address associated with a host. This data changes frequently and is generated by the
     system, rather than statically stored.
     $c - Compression information (R)
     Information about the compression of a file, in particular, whether a specific program is required to decompress the file.
     $d - Path (R)
     $f - Electronic name (R)
     $h - Processor of request (NR)
     The username, or processor of the request; generally the data which precedes the "@" in the host address.
     $i - Instruction (R)
     An instruction needed for the remote host to process a request.
     $j - Bits per second (NR)
     $k - Password (NR)
     $l - Logon (NR)
     Characters needed to connect (i.e., logon, login, etc.) to an electronic resource or FTP site. This subfield is used to
     record general-use logon strings which do not require special security.
     $m - Contact for access assistance (R)
     $n - Name of location of host (NR)
     The full name of the location of the host in subfield $a, including its geographical location.
     $o - Operating system (NR)
     $p - Port (NR)
     The portion of the address that identifies the process or service in the host.
     $q - Electronic format type (NR)
     An identification of the electronic format type, which is the data representation of the resource, such as text/HTML,
     ASCII, Postscript file, executable application, or JPEG image. Electronic format type may be taken from enumerated lists such as registered Internet Media Types (MIME types).
     $r - Settings (NR)
     $s - File size (R)
     $t - Terminal emulation (R) 
     $u - Uniform Resource Identifier (R)
     The URI, which provides standard syntax for locating an object using existing Internet protocols. Field 856 is structured to allow for the creation of a URL from the concatenation of other separate 856 subfields. Subfield $u may be used
     instead of those separate subfields or in addition to them. 
     $v - Hours access method available (R)
     $w - Record control number (R)
     $x - Nonpublic note (R)
     $y - Link text (R)
     Used for display in place of the URL in subfield $u (Uniform Resource Identifier). When subfield $y is present,
     applications should use its contents as the link text, instead of subfield $u, when linking to the destination in subfield $u.
     $z - Public note (R)
     $2 - Access method (NR)
     $3 - Materials specified (NR)
     $6 - Linkage (NR) See Control Subfields 
     $8 - Field link and sequence number (NR) See Control Subfields 
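The way the two indicators and the $u, $y, and $2 subfields work together can be illustrated with a short Python sketch. It assumes the field has already been split into its indicators and a list of (code, value) subfield pairs; the function name `describe_856` and the returned dictionary layout are choices made for this example, not part of the MARC specification.

```python
# Indicator values copied from the field 856 description above.
ACCESS_METHODS = {
    "#": "No information provided",
    "0": "Email",
    "1": "FTP",
    "2": "Remote login (Telnet)",
    "3": "Dial-up",
    "4": "HTTP",
    "7": "Method specified in subfield $2",
}
RELATIONSHIPS = {
    "#": "No information provided",
    "0": "Resource",
    "1": "Version of resource",
    "2": "Related resource",
    "8": "No display constant generated",
}


def describe_856(ind1, ind2, subfields):
    """Summarize an 856 field given its indicators and (code, value) subfields."""
    urls = [value for code, value in subfields if code == "u"]
    link_texts = [value for code, value in subfields if code == "y"]
    methods = [value for code, value in subfields if code == "2"]

    access = ACCESS_METHODS.get(ind1, "unknown")
    if ind1 == "7" and methods:
        access = methods[0]  # first indicator 7 defers to subfield $2

    url = urls[0] if urls else None
    return {
        "access_method": access,
        "relationship": RELATIONSHIPS.get(ind2, "unknown"),
        "url": url,
        # $y, when present, is displayed in place of the URL in $u.
        "link_text": link_texts[0] if link_texts else url,
    }


print(describe_856("4", "#", [("u", "http://books.valdosta.edu/mlis/")]))
```

Only the first $u and $y are used here for brevity, although both subfields are repeatable (R).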
 

MARC21 is also supported in XML. New MARC DTDs were released in 2001. For a conversion utility and documentation, see the Library of Congress at http://www.loc.gov/marc/marcsgml.html

UNIMARC

UNIMARC (Universal Machine Readable Catalogue) was developed in the 1970s to permit national bibliographic institutions such as the Library of Congress to exchange records with other national bibliographic institutions. As such, it represents a metastandard. UNIMARC is maintained by the International Federation of Library Associations and Institutions (IFLA).

Crosswalks

Z39.50 Crosswalks

Because MARC is Z39.50 compliant, it can be translated to Dublin Core and to GILS formats and vice versa.

There are limits to the translation service. For example, the index page for this course contains Dublin Core elements. Translated into several different MARC formats using beta-version automatic translation software (from BIBSYS at http://www.bibsys.no/meta/d2m/), they yield:
 
 

MARC21

245  $a Web Document Management
260  $b MLIS Program Valdosta State University
     $c 2000
500  $a metadata
540  $a Copyright Wallace Koehler 2002 All Rights Reserved
856  $u http://book.valdosta.edu/mlis
     $q text/html

FINMARC

245  $a Web Document Management
519  $a A library school web-based graduate course on bibliographic
     management of the WWW
260  $b MLIS Program Valdosta State University
     $c 2000
500  $a Text.Index
500  $a metadata
856  $u http://book.valdosta.edu/mlis
     $q HTML

ISMARC

24510$a Web Document Management
720  $a Koehler 
     $h Wallace
720  $a wkoehler@valdosta.edu
65004$a Z
513  $a A library school web-based graduate course on bibliographic
     management of the WWW
260  $b MLIS Program Valdosta State University
     $c 2000
655  $a Text.Index
546  $a en
500  $a metadata
500  $a Copyright Wallace Koehler 2002 All Rights Reserved
856  $u http://http://book.valdosta.edu/mlis
     $q text/html

NORMARC

245  $a Web Document Management
100  $a Koehler, Wallace
260  $b MLIS Program Valdosta State University
     $c 2000
500  $a metadata
500  $a Copyright: Copyright Wallace Koehler 2002 All Rights Reserved
856  $u http://book.valdosta.edu/mlis
     $q text/html

SWEMARC

245  $a Web Document Management
650  $a Z
520  $a A library school web-based graduate course on bibliographic
     management of the WWW
260  $b MLIS Program Valdosta State University
     $c 2000
500  $a Text.Index
856  $q text/html
506  $a en
500  $a metadata
500  $a Copyright Wallace Koehler 2002 All Rights Reserved
856  $u http://http://book.valdosta.edu/mlis

The Dublin Core elements from which this was derived are more detailed:
 
 
   <meta name="DC.Title" content="Web Document Management">
   <meta name="DC.Title.Alternative" content="LIS 5990 Summer 2000">
   <meta name="DC.Creator.PersonalName" content="Koehler, Wallace">
   <meta name="DC.Creator.PersonalName.Address" content="wkoehler@valdosta.edu">
   <meta name="DC.Subject" content="(SCHEME=LCCS) Z">
   <meta name="DC.Description" content="A library school web-based graduate course on bibliographic management of the WWW">
   <meta name="DC.Publisher" content="MLIS Program Valdosta State University">
   <meta name="DC.Date" content="(SCHEME=ISO8601) 2000-05-01">
   <meta name="DC.Type" content="Text.Index">
   <meta name="DC.Format" content="(SCHEME=IMT) text/html">
   <meta name="DC.Identifier" content="http://www.ou.edu/">
   <meta name="DC.Language" content="(SCHEME=ISO639-1) en">
   <meta name="DC.Coverage" content="metadata">
   <meta name="DC.Rights" content="Copyright Wallace Koehler 2002 All Rights Reserved">
   <meta name="DC.Date.X-MetadataLastModified" content="(SCHEME=ISO8601) 2000-02-05">

Why did some of the fields map from Dublin Core while others did not? Most prominently missing from the MARC21 and FINMARC records is the author field, which is present in NORMARC. Note that the Dublin Core element most like the MARC 100 field is the Creator.PersonalName element. Do these translate consistently? Each time?
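One way to see why elements go missing is to treat a crosswalk as a simple lookup table: any element without an entry in the table is silently dropped. The Python sketch below is illustrative only. The mapping table is an assumption reverse-engineered from the MARC21 sample output above, not the BIBSYS software's actual rules or the official Library of Congress crosswalk, and the name `crosswalk` is hypothetical.

```python
# Illustrative mapping, inferred from the MARC21 sample record above.
# It is NOT the official crosswalk; it deliberately has no entry for the creator.
ILLUSTRATIVE_MAP = {
    "DC.Title": ("245", "a"),
    "DC.Publisher": ("260", "b"),
    "DC.Date": ("260", "c"),
    "DC.Coverage": ("500", "a"),
    "DC.Rights": ("540", "a"),
    "DC.Identifier": ("856", "u"),
    "DC.Format": ("856", "q"),
}


def crosswalk(dc_elements):
    """Split (name, content) pairs into mapped MARC subfields and dropped names."""
    mapped, dropped = [], []
    for name, content in dc_elements:
        target = ILLUSTRATIVE_MAP.get(name)
        if target is None:
            dropped.append(name)  # no rule for this element, so it is lost
        else:
            tag, code = target
            mapped.append((tag, code, content))
    return mapped, dropped


# A few of the Dublin Core elements listed above.
dc = [
    ("DC.Title", "Web Document Management"),
    ("DC.Creator.PersonalName", "Koehler, Wallace"),
    ("DC.Publisher", "MLIS Program Valdosta State University"),
    ("DC.Rights", "Copyright Wallace Koehler 2002 All Rights Reserved"),
]

mapped, dropped = crosswalk(dc)
for tag, code, content in mapped:
    print(f"{tag}  ${code} {content}")
print("Not mapped:", dropped)
```

A translator that keys on the exact element name would lose a qualified name such as DC.Creator.PersonalName unless it also has a rule for that qualifier, which is one plausible explanation for the differences among the records above.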

SGML

MARC DTDs support conversion from MARC to SGML and vice versa. The Library of Congress believes that there is growing interest in the use of MARC DTDs and that this may become a common conversion practice.
 

856 Field and MARC21

The 856 field is but a small part of the document description. If present in the MARC record, it describes the electronic access and status of a resource. A non-electronic resource will not be described using an 856 field. There are many kinds of electronic resources, and Web documents are only a subset of them; the 856 field therefore describes electronic resources in general, not Web resources alone. In fact, Web resources are a relative latecomer to the electronic landscape. Other electronic resources preceded the Web by decades.

The MARC21 856 field is used to describe electronic resources, including Internet resources. The 856 field supports inclusion of descriptive data for both the URL and the URN as well as for the means of transmission or transfer protocol. The MARC 856 field does not support URL fragment information.

The MARC21 856 field for the VSU MLIS home page could be rendered:
856  4#$uhttp://books.valdosta.edu/mlis/

Reading MARC21 is like reading any other code. At first blush, it is incomprehensible. There is, however, a logic to it, although only those thoroughly indoctrinated have any inkling what it is. There is plenty of on-line help to decipher and create MARC records, particularly for the evolving 856 field. A MARC record consists of "the record structure, the content designation, and the data content of the record."

A MARC field contains several elements or parts. The first element following the field tag is the pair of indicators. For the 856 field, the first indicator gives the access method; the second gives the relationship between the item at the specified URL or other address and the item described by the record.

Where:
"856" is the field tag.
"4" is the first indicator and shows that the access method is HTTP.
"#" is the second indicator and shows that no information about the relationship is provided.

The field also contains subfields. These are indicated by the "$" sign, followed by a character identifying the subfield. In this case, the subfield is "u", the Uniform Resource Identifier (here, a URL). The "$u" is then followed by the resource's URL.
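The reading of the example field can be made concrete with a small Python sketch that splits a field written in the display style used on this page (tag, spaces, two indicators, then "$"-delimited subfields). The function name `parse_marc_line` is hypothetical, and the sketch assumes that "$" never occurs inside the data itself; it is not a parser for the machine-readable (ISO 2709) form of a record.

```python
def parse_marc_line(line):
    """Split a displayed MARC field into tag, two indicators, and subfields."""
    head, _, body = line.partition("$")  # everything before the first "$"
    tag = head[:3]
    indicators = head[3:].strip()
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError("expected a three-digit field tag")
    if len(indicators) != 2:
        raise ValueError("expected exactly two indicator characters")

    subfields = []
    for chunk in body.split("$"):
        if chunk:
            # First character is the subfield code; the rest is its data.
            subfields.append((chunk[0], chunk[1:].strip()))
    return tag, indicators[0], indicators[1], subfields


print(parse_marc_line("856  4#$uhttp://books.valdosta.edu/mlis/"))
```

Applied to the example, the sketch returns the tag 856, first indicator 4, second indicator #, and a single subfield "u" carrying the URL.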

The Library of Congress Web page "MARC 21 Concise Classification: Location and Alternate Graphics (8XX)" at http://www.loc.gov/marc/classification/eccdloca.html describes the subfields and the element information each describes. Pay particular attention to the text in red, which indicates additions and modifications to MARC since 2000.

As remarkable as MARC is, is it remarkable enough to manage the Web? Or could MARC be augmented to better describe electronic on-line documents? In a 1999 article, McDonnell, Koehler, and Carroll [MKC] suggest that existing MARC fields can be used to better describe that material. For example, the MARC 505 field is defined for tables of contents. Web sites may provide table-of-contents analogs in one of two ways (or both). The first is to list all subordinate pages, together with their linked URLs, on one or more pages. The second is to provide a "hot" site map. Site maps or link lists can be incorporated in the bibliographic record for Web documents just as tables of contents can be included for other media types. And while the MARC 856 field captures transfer protocols, it does not provide for either PURLs (it does take URLs and URNs) or URL fragments.
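As a rough illustration of the 505 suggestion, the Python sketch below turns a site map (a list of page titles and URLs) into a single contents note in the display style used on this page. Everything specific here is an assumption: the function name `contents_note`, the example pages, the choice to put each URL in parentheses, and the use of first indicator 0 with " -- " between entries.

```python
def contents_note(pages):
    """Build a displayed 505 contents note from (title, url) pairs."""
    if not pages:
        raise ValueError("a contents note needs at least one page")
    entries = [f"{title} ({url})" for title, url in pages]
    # Entries are joined with " -- "; tag, indicators, and $a follow the
    # same display layout as the 856 example above.
    return "505  0#$a" + " -- ".join(entries)


# Hypothetical site map, invented for illustration.
site_map = [
    ("Home", "http://example.org/"),
    ("MARC", "http://example.org/marc.html"),
]
print(contents_note(site_map))
```

Whether URLs belong inside a contents note at all, rather than in repeated 856 fields, is exactly the kind of design question the discussion questions that follow invite.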
 
 
learning objective
It was our suggestion [MKC] that MARC be expanded to incorporate those elements unique to the WWW and other Internet documents not already incorporated into MARC.

Is this a useful strategy? If so, why? If not, why not?

What recommendations for MARC modification do you have?

Page References and Required/Recommended Reading


[LC] Library of Congress, Network Development and MARC Standards Office, Guidelines for the Use of Field 856,  Revised August 1999. Available: http://www.loc.gov/marc/856guide.html. And 856 - Electronic Location and Access, Available: http://lcweb.loc.gov/marc/bibliographic/ecbdhold.html#mrcb856

[MKC] Janice McDonnell, Wallace Koehler, and Bonnie Carroll, "Cataloging Challenges in an Area Studies Virtual Library Catalog (ASVLC): Results of a Case Study," Journal of Internet Cataloging 2, 2 (1999).

For a MARC overview, see: Betty Furrie, Understanding MARC Bibliographic: Machine-Readable Cataloging. http://lcweb.loc.gov/marc/umb/

MARC 21 Format for Bibliographic Data, National Level Record and Minimal Level Record Requirements. Available: http://lcweb.loc.gov/marc/bibliographic/nlr/

[Library of Congress] Network Development and MARC Standards Office, Dublin Core/MARC/GILS Crosswalk,  Last updated: 14 October 1999. Available: http://lcweb.loc.gov/marc/dccross.html

GILS, Version 2 of "APPLICATION PROFILE FOR THE GOVERNMENT INFORMATION LOCATOR SERVICE (GILS)" This document was last updated on November 24, 1997. Available: http://www.gils.net/prof_v2.html

Library of Congress, Network Development and MARC Standards Office, MARC DTDs Background and Development. Dated May 22, 1998. Available: http://lcweb.loc.gov/marc/marcdtd/marcdtdback.html

International Federation of Library Associations and Institutions, Universal Bibliographic Control and International MARC Core Programme. UNIMARC: An Introduction: Understanding the UNIMARC format. Latest Revision: March 3, 1999. Available: http://www.ifla.org/VI/3/p1996-1/unimarc.htm