Using Dublin Core
With information from the Dublin Core Metadata Initiative
This chapter describes the history of the Dublin Core Metadata Initiative and the development of the Dublin Core metadata scheme, and provides brief information on changes made to the metadata scheme over time. It includes a discussion of the main principles underlying the standard, definitions of individual properties within the metadata scheme, and guidance on using the properties for metadata creation. The chapter includes an example of a Dublin Core metadata record, exercises allowing the reader to create Dublin Core metadata, and examples of metadata that could be produced for the exercises.
Dublin Core is one of the best known and most widespread metadata initiatives. The initial element set, which consisted of 15 core elements, was developed at a joint conference between OCLC (Online Computer Library Center) and NCSA (National Center for Supercomputing Applications) in 1995 in Dublin, Ohio. The impetus for the creation of the element set was the rapid expansion of the internet, the plethora of information that was subsequently becoming available, and recognition that “the Internet … [would] contain more information than professional abstractors, indexers, and catalogers can manage using existing methods and systems” (Weibel et al., 1995).
The aim and result of the conference was the development of a set of metadata elements that were simple enough for web authors to incorporate into their HTML without needing extensive training in cataloging or indexing (Lagoze, 2001; National Information Standards Organization, 2004, p. 3). The conference also laid the groundwork for the development of the Dublin Core Metadata Initiative: an open, nonprofit organization that has maintained and further developed the initial set of metadata elements.
The initial set of metadata elements functioned in the context of three principles: the One-to-One (1:1) principle, the Dumb-Down principle, and the Appropriate Values principle. The One-to-One principle stated that a Dublin Core metadata record should describe only one manifestation or version of a resource. Therefore, in the case of a photo of the Mona Lisa, the metadata record would describe the photo and not the Mona Lisa itself.
The Dumb-Down principle stated that the user should be able to look at the information in a metadata field with refinements or qualifications and still be able to make sense of the information if the refinements or qualifications were stripped away. For example, the information contained in a title.alternative field should still make sense to the user if the “alternative” refinement were taken away.
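The Dumb-Down principle can be illustrated with a short sketch. It assumes, purely for illustration, that a record is held as a Python dictionary keyed by dotted field names in the style of title.alternative; the function name, the field names other than title.alternative, and the sample values are hypothetical and do not come from any DCMI specification.

```python
from collections import defaultdict


def dumb_down(qualified_record):
    """Collapse qualified field names (e.g. 'title.alternative') to their base element."""
    simple = defaultdict(list)
    for field, values in qualified_record.items():
        base = field.split(".", 1)[0]  # keep only the part before the first dot
        simple[base].extend(values)
    return dict(simple)


# Hypothetical sample record; the values are illustrative only.
record = {
    "title": ["The Band"],
    "title.alternative": ["NMA&MA Aggies Band"],
    "description.abstract": ["Band members with instruments on the steps of a building."],
}

simple_record = dumb_down(record)
for field, values in simple_record.items():
    print(field, values)
```

If the principle has been followed, every value in the simplified record still reads sensibly under its unqualified field name: an alternative title is still a title.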
The Appropriate Values principle stated that the metadata producer could never assume that their metadata would only be seen by a certain audience or in a certain context, so metadata should always be produced so that it would be understandable by any user in any context. For example, an early twentieth-century picture of the New Mexico State University Marching Band with the title “The Band” would be perfectly understandable for a user looking at a metadata record with an image on campus or in the nearby area; but someone in Michigan who finds the harvested metadata with no accompanying image may have no idea which band is meant. Good harvested metadata needs to stand on its own, independent of context.
Since the development of the initial Dublin Core metadata elements, the World Wide Web has grown and changed the way we live, produce, and consume information. Dublin Core has changed as well. Where it was once intended solely for the description of electronic resources created and made available on the web, it is now used for “any object that can be identified, whether electronic, real-world, or conceptual” (Dublin Core Metadata Initiative, 2011a). The simplicity of the original 15 elements led to Dublin Core becoming a common, minimum standard for the transmission of metadata, such as with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). However, this simplicity became as much of a hindrance for some metadata producers as it was an advantage for others. Over time, additional elements and granularity for the original elements were added to make it a more robust and useful standard for description.
Substantial changes have taken place to Dublin Core over the years. For a long time, the Dublin Core Metadata Element Set (DCMES) was composed only of the 15 initial elements, with the additional elements and refinements being added to the specification under the heading of “Other elements and element refinements” as they were developed. The DCMES came to be known as “Simple Dublin Core,” with the DCMES plus additional elements and refinements becoming known as “Qualified Dublin Core.” In 2008, the initial and additional elements were combined, redefined as “properties,” and joined with guidelines on the use of encoding schemes and controlled vocabularies into one large specification known as the DCMI Metadata Terms. Within the terms, there is still a specification for the DCMES, allowing people to continue using “Simple Dublin Core.”
There have also been changes to two of the three principles. The One-to-One principle is largely the same, but the Dumb-Down and Appropriate Values principles have changed and developed in response to technological advances, just as Dublin Core has. The Dumb-Down principle has slowly moved from concentrating on the simplification of metadata to promoting the use of “formal definitions . . . to align metadata description based on different vocabularies” (Dublin Core Metadata Initiative, 2011b).
The discussion about Appropriate Values has evolved to incorporate ideas on domains and ranges from the Semantic Web, specifically the Resource Description Framework (RDF). More information about RDF is provided in the last chapter in the book. Essentially, domains and ranges provide each of the DCMI properties with more information that can be interpreted by a machine to link data together across the Semantic Web. While this is an important next step for keeping Dublin Core relevant in the changing online environment, for the purposes of this book we’ll focus on the DCMI element definitions and uses.
There are four terms that serve as the building blocks for resource description in Dublin Core. These terms are properties, classes, datatypes, and vocabulary encoding schemes. Together, these building blocks allow a complete and standardized description of a resource. The following definitions come from the DCMI MediaWiki User Guide (DCMI, 2011a).
Classes ‒ Classes in Dublin Core are ways of grouping resources that have properties in common. In many cases, these classes are defined by the DCMI Type Vocabulary (http://dublincore.org/documents/dcmi-terms/#H7), making a class something like a collection, moving image, or physical object.
Datatypes ‒ Datatypes were previously known as Syntax Encoding Schemes (SES). These are rules that govern how the information in certain properties is structured. These rules are used in properties such as dates, type, and format.
Vocabulary Encoding Scheme ‒ Vocabulary Encoding Schemes were previously known as Concept Schemes. They are vocabularies whose terms should be used to structure the information in properties such as creator, contributor, and subject.
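As an illustration of how a datatype constrains the structure of a value, the sketch below checks whether a date string follows the date-only forms of W3CDTF, one of the syntax encoding schemes listed in the DCMI Metadata Terms. W3CDTF is chosen here only as an example. The function name is hypothetical, the check deliberately ignores the time-of-day forms, and it does not verify that a day exists in the given month.

```python
import re

# Date-only W3CDTF forms: YYYY, YYYY-MM, or YYYY-MM-DD.
# Simplification: time-of-day forms are omitted, and the day is only
# range-checked (01-31), not checked against the length of the month.
W3CDTF_DATE = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")


def is_w3cdtf_date(value):
    """Return True if value matches one of the date-only W3CDTF forms."""
    return W3CDTF_DATE.fullmatch(value) is not None


for candidate in ["1913", "1913-05", "1913-05-17", "May 1913", "1913-13"]:
    print(candidate, is_w3cdtf_date(candidate))
```

A free-text value such as “May 1913” may be perfectly readable to a person, but it does not conform to the datatype, so software that sorts or filters by date cannot rely on it.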
Understanding the use of the properties will be the most important part of creating Dublin Core metadata for many professionals in the cultural heritage community. For a simple starting point, this text will begin with the properties in the DCMES, listed below. It will then move on to definitions of the other properties contained in the DCMI Metadata Terms. The definitions come from the DCMES namespace (http://dublincore.org/documents/dcmi-terms/#H3) and the DCMI MediaWiki User Guide for Creating Metadata (http://wiki.dublincore.org/index.php/User_Guide/Creating_Metadata).
Comment: Examples of a contributor include a person, an organization, or a service. Typically, the name of a contributor should be used to indicate the entity. Recommended best practice is to use a controlled vocabulary such as the Library of Congress Name Authority File (LCNAF) or the Getty Union List of Artist Names (ULAN).
Comment: Spatial topic and spatial applicability may be a named place or a location specified by its geographic coordinates. Temporal topic may be a named period, date, or date range. A jurisdiction may be a named administrative entity or a geographic place to which the resource applies. Recommended best practice is to use a controlled vocabulary such as the Thesaurus of Geographic Names (TGN). Where appropriate, named places or time periods can be used in preference to numeric identifiers such as sets of coordinates or date ranges.
Comment: Examples of a creator include a person, an organization, or a service. Typically, the name of the creator should be used to indicate the entity. Recommended best practice is to use a controlled vocabulary such as the LCNAF or the ULAN.
Comment: Examples of a publisher include a person, an organization, or a service. Typically, the name of a publisher should be used to indicate the entity. Because of the One-to-One principle, the publisher may not be the publisher of a physical object portrayed by a digital object, but the party responsible for making the digital object itself available. Recommended best practice is to use a controlled vocabulary such as the LCNAF when possible.
Comment: Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system. Relationships may be described reciprocally, but this is not required.
Comment: The described resource may be derived from the related resource in whole or in part. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.
Comment: Recommended best practice is to use a controlled vocabulary such as the DCMI Type Vocabulary (DCMITYPE). To describe the file format, physical medium, or dimensions of the resource, use the format element.
Comment: Instructional Method will typically include ways of presenting instructional materials or conducting instructional activities, patterns of learner-to-learner and learner-to-instructor interactions, and mechanisms by which group and individual levels of learning are measured. Instructional methods include all aspects of the instruction and learning processes from planning and implementation through evaluation and feedback.
Definition: A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation. Comment: The statement may include a description of any changes successive custodians made to the resource.
Figure 3.1 is an example of a Dublin Core metadata record:
Description: Image shows the NMA&MA Aggies Band, with instruments, on the steps of a building. Digital image was created using Adobe Photoshop CS3 Macintosh, at 8 bits and 300 dpi.
Format: image/jpeg
Identifier: NMA&MA Aggies Band
Is Format Of: NMA&MA Aggies Band
Is Part Of: Hobson-Huntsinger University Archives
Language: eng
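For readers who want to see how such a record might look in machine-readable form, the following sketch serializes the fields of Figure 3.1 as XML using Python's standard library. The two namespace URIs are the published DCMI namespaces for the original 15 elements and for the DCMI Metadata Terms. The wrapper element named record is a hypothetical container chosen for this example, the description is shortened to its first sentence, and the mapping of Is Format Of and Is Part Of to the isFormatOf and isPartOf terms is an assumption based on the property names.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # the original 15 elements
DCTERMS = "http://purl.org/dc/terms/"     # DCMI Metadata Terms
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)

# (namespace, property name, value) for each field shown in Figure 3.1.
fields = [
    (DC, "description",
     "Image shows the NMA&MA Aggies Band, with instruments, on the steps of a building."),
    (DC, "format", "image/jpeg"),
    (DC, "identifier", "NMA&MA Aggies Band"),
    (DCTERMS, "isFormatOf", "NMA&MA Aggies Band"),
    (DCTERMS, "isPartOf", "Hobson-Huntsinger University Archives"),
    (DC, "language", "eng"),
]

record = ET.Element("record")  # hypothetical wrapper element, not part of Dublin Core
for namespace, name, value in fields:
    ET.SubElement(record, f"{{{namespace}}}{name}").text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Note that the ampersand in “NMA&MA” is escaped automatically in the serialized output, which is one reason to build XML with a library rather than by concatenating strings.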
The production of metadata can sometimes be very subjective, and the fullness of the metadata produced will depend on the amount of information available to the metadata cataloger. However, the following items provide an example of the metadata that could be produced for the above exercises.
Description: Caption on image reads, “Camel Rock near Santa Fe, New Mexico. SF-16.” Image shows Camel Rock, surrounded by bushes. Note in an older database reads, “Sold by Old Trail News Agency, Santa Fe, New Mexico Colour Picture Publication, Boston 15, Massachusetts, U.S.A.”
Description: Handwritten caption on photograph reads, “Motorcycle Machine Gun Corps, Las Cruces, 1913.” Image shows a number of motorcycles parked in a large, grassy area. Digital image was created using Adobe Photoshop CS3 Macintosh, at 8 bits and 300 dpi.
Description: Older database describes the image as, “Seven men sacking chile peppers for commercial sale,” and notes that the item is oversized. Digital image was created using Adobe Photoshop CS5 Macintosh, at 24 bits and 300 dpi.