What is Metadata?
Metadata is “data about data”: a description that makes data easier to catalogue and discover. A metadata record is a standard document reporting:
- WHO created the data?
- WHAT is the content of the data?
- WHEN were the data created?
- WHERE are the data located geographically?
- HOW were the data developed?
- WHY were the data developed?
When you read nutrition facts on food or look up a book in the library you are reading metadata!
Why is it important?
Metadata captures information about a dataset and supports not only data management but also data distribution. It helps avoid data duplication, share reliable information, and promote the work of a scientist and their contributions to a field of study. Reusing metadata saves time and resources in the long run. In that sense, metadata completes a dataset.
Metadata gives a user the ability to:
- Search, retrieve, and evaluate dataset information from both inside and outside an organization
- Find data: Determine what data exists for a topic and/or geographic location
- Determine applicability: Decide if a dataset meets a particular need
- Discover how to acquire, process, and use the identified dataset
- Understand the dataset, including definitions of column names and the numerical ranges expected in the data
Metadata can exist at both the project level and the data level. At the project level, it explains things like the aims of the study, the technology used, and who is linked to the data. At the data level, it contains more specific information such as file type, format, and notes about missing values.
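As a rough illustration of data-level metadata, the sketch below records the format, missing-value convention, column definitions, and expected numerical ranges for one file, then uses the ranges to flag suspicious values. The file name, column names, ranges, and the `out_of_range` helper are all hypothetical and only meant to show the idea.

```python
# Hypothetical data-level metadata for a single file (all values are placeholders).
data_dictionary = {
    "file": "example.csv",
    "format": "text/csv",
    "missing_values": "empty cell",
    "columns": {
        "time_s": {"definition": "Seconds since recording start", "range": (0.0, 600.0)},
        "emg_mv": {"definition": "EMG amplitude in millivolts", "range": (-5.0, 5.0)},
    },
}


def out_of_range(column, values, dictionary):
    """Return the values that fall outside the range documented for a column."""
    low, high = dictionary["columns"][column]["range"]
    # None stands in for a missing value and is skipped rather than flagged.
    return [v for v in values if v is not None and not (low <= v <= high)]


suspicious = out_of_range("emg_mv", [0.1, 7.2, None, -5.0], data_dictionary)
```

Keeping the definitions and ranges next to the data means a new user can check a file against its documentation without asking the person who recorded it.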
Getting Started with Metadata
At RITMO, we use a basic Dublin Core template for experiment-level metadata and a processing journal for data-level specifics.
Experiment Metadata
We use a template based on the Dublin Core Metadata Initiative (DCMI) element set for project-level metadata. Dublin Core is a widely used, simple, and flexible standard consisting of 15 elements. These elements might seem vague or confusing, so examples have been included as well as the link to the DCMI element descriptions. The resulting simple file answers what the project is and who is responsible for it.
- TITLE (Your Project Name)
- Element Description: The name given to the resource
- Additional Guidelines:
- Examples: Lion King, Bohemian Rhapsody, MusicLab 4, Dance Dance Revolution (DDR)
- CREATOR (Project Lead)
- Element Description: The person primarily responsible for the intellectual content of the resource; the author.
- Additional Guidelines: The main PI of the project. This can be multiple people or organization(s) if applicable.
- Examples: Leonardo da Vinci, Marie Curie, the Jackson 5
- SUBJECT (Main Topic)
- Element Description: The topic of the resource
- Additional Guidelines: Typically, the topic is represented as simply as possible using keywords, key phrases, or classification codes. Choose the most significant and unique words as keywords, avoiding those too general to describe a particular item. If necessary, multiple subject keywords may be used; separate them with semicolons.
- Examples: Dogs, Motion Capture, Hovercraft maintenance, EMG;Dance
- DESCRIPTION
- Element Description: A textual description of the content of the resource
- Additional Guidelines: Description may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource. Best practice recommendation for this element is to use full sentences.
- Examples:
- PUBLISHER (Usually UiO)
- Element Description: The entity responsible for making the resource available
- Additional Guidelines: The intent of specifying this field is to identify the entity that provides access to the resource, such as a publisher, university department, or corporate entity. The recommended practice is to use Publisher for organizations and Creator for individuals.
- Examples: University of Oslo, Motown Records, Pearson Education
- CONTRIBUTORS
- Element Description: Additional person(s) or organization(s) responsible for making contributions to the project
- Additional Guidelines: Should be those who have made significant intellectual contributions to the resource but on a secondary basis.
- Examples: Advisor, Research Assistant, Co-author
- DATE (Beginning of project)
- Element Description: The date associated with the beginning of the project.
- Additional Guidelines: It is recommended to use the standard format YYYY-MM-DD. If the full date is unknown, month and year or just year may be used.
- Examples: 1999-10-01, 2001-04-04, 2005-07
- TYPE
- Element Description: The nature or genre of the content of the resource.
- Additional Guidelines: If the resource is composed of multiple mixed types, then multiple or repeated Type elements should be used to describe the main components. Recommended best practice is to use a controlled vocabulary like the DCMI Type Vocabulary.
- Examples: Image, Dataset, Sound
- FORMAT
- Element Description: The physical or digital format of the resource
- Additional Guidelines: It is recommended to use a controlled vocabulary (media types) when describing the format. Multiple entries can be used.
- Examples: video/mp4, image/tiff, text/csv, mocap/qtm
- IDENTIFIER (Optional)
- Element Description: A string or number used to uniquely identify the resource.
- Additional Guidelines: This element should be used when the project has a link to a formal identification system.
- Examples: a URL, an ISBN, DOI
- SOURCE (Optional)
- Element Description: A related resource from which the described resource is derived
- Additional Guidelines: The described resource may be derived from the related resource in whole or in part.
- Examples: a URL, an ISBN, DOI, other project name
- LANGUAGE
- Element Description: The language of the intellectual content of the resource.
- Additional Guidelines: It is recommended to use language tags with optional subtags.
- Examples: eng, no, en-gb, Primarily English with some elements in Norwegian
- RELATION (Links to other projects)
- Element Description: A reference to a related resource
- Additional Guidelines: Recommended best practice is to reference the resource by means of a string or number conforming to a formal identification system. An additional description of the relation may also be added (IsVersionOf, hasVersion, isReplacedBy, replaces, isRequiredBy, requires, isPartOf, hasPart, isReferencedBy, references, isFormatOf, hasFormat, conformsTo).
- Examples: IsPartOf TIME, IsBasedOn Standstill, IsVersionOf Circle of Life
- COVERAGE
- Element Description: The spatial and/or temporal scope of the resource
- Additional Guidelines: Spatial topic and spatial applicability may be a named place or a location specified by its geographic coordinates. Temporal topic may be a named period, date, or date range. A jurisdiction may be a named administrative entity or a geographic place to which the resource applies. For most simple applications, place names or coverage dates might be most useful.
- Examples: Oslo Norway, 2007-2010
- RIGHTS MANAGEMENT (If applicable)
- Element Description: Information about rights held in and over the resource.
- Additional Guidelines: Typically, a Rights element will contain a rights management statement for the resource or reference a service providing such information. This can be a textual statement or a URL pointing to a copyright notice, IP statement, or other applicable category.
- Examples: cc-by license, access limited to members, public domain
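As a minimal sketch, the 15 elements above could be kept in a plain Python dictionary, checked for gaps, and written out as a simple text file. This is not RITMO's official template: the helper names (`check_date`, `missing_elements`, `format_record`) are hypothetical, the record values are placeholders borrowed from the examples above, treating every element not marked optional as expected is an assumption, and so is joining all multi-valued elements with semicolons (the guidelines above only state this for SUBJECT).

```python
import re

# The 15 element names, in the order listed above.
ELEMENTS = [
    "TITLE", "CREATOR", "SUBJECT", "DESCRIPTION", "PUBLISHER",
    "CONTRIBUTORS", "DATE", "TYPE", "FORMAT", "IDENTIFIER",
    "SOURCE", "LANGUAGE", "RELATION", "COVERAGE", "RIGHTS MANAGEMENT",
]
# Elements marked "Optional" or "If applicable" above.
OPTIONAL = {"IDENTIFIER", "SOURCE", "RIGHTS MANAGEMENT"}

# Accepts YYYY, YYYY-MM, or YYYY-MM-DD (does not check days per month).
DATE_PATTERN = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")


def check_date(value):
    """Return True if the value follows the recommended date format."""
    return DATE_PATTERN.fullmatch(value) is not None


def missing_elements(record):
    """List the non-optional elements that are absent or empty."""
    return [name for name in ELEMENTS
            if name not in OPTIONAL and not record.get(name)]


def format_record(record):
    """Render the record as one 'ELEMENT: value' line per element."""
    lines = []
    for name in ELEMENTS:
        value = record.get(name, "")
        if isinstance(value, (list, tuple)):
            value = ";".join(value)  # assumption: semicolons for all lists
        lines.append(f"{name}: {value}")
    return "\n".join(lines)


# Placeholder record; a real one would fill in every applicable element.
record = {
    "TITLE": "Your Project Name",
    "SUBJECT": ["EMG", "Dance"],
    "PUBLISHER": "University of Oslo",
    "DATE": "2005-07",
    "TYPE": "Dataset",
    "FORMAT": ["video/mp4", "text/csv"],
    "LANGUAGE": "en-gb",
}
```

Calling `missing_elements(record)` before saving gives a quick reminder of which elements still need to be filled in, and `format_record(record)` produces text that can be saved next to the data.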
Processed Data Metadata
This section covers metadata about the processed data. It takes the form of a processing journal: a plain-text (.txt) document that is simple to update whenever the data are altered, noting what was done, by whom, when, and how.
Processing Journal for Experiment XX
---Original Files---
Created by:
Date(s) Created:
Restrictions:
Notes:
---Processing Updates---
File(s):
Edited by:
Edited by:
Date:
Software Used:
What was done:
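
Since the journal is a plain text file, adding an update can also be scripted. The sketch below builds one entry with the fields from the Processing Updates section and appends it to the end of the journal. The function names, the file path, and the example values are hypothetical, and writing the date as YYYY-MM-DD is an assumption carried over from the DATE guideline in the experiment metadata.

```python
from datetime import date


def journal_entry(files, edited_by, software, what_was_done, when=None):
    """Return one processing update as text, using the journal's field labels."""
    when = when if when is not None else date.today()
    return "\n".join([
        f"File(s): {', '.join(files)}",
        f"Edited by: {edited_by}",
        f"Date: {when.isoformat()}",  # YYYY-MM-DD
        f"Software Used: {software}",
        f"What was done: {what_was_done}",
    ])


def append_entry(journal_path, entry):
    """Append an update to the end of the journal, leaving earlier entries intact."""
    with open(journal_path, "a", encoding="utf-8") as journal:
        journal.write("\n" + entry + "\n")
```

For example, `append_entry("processing_journal.txt", journal_entry(["trial01.csv"], "A. Researcher", "Python", "Removed empty rows"))` would add a dated entry. Opening the file in append mode keeps the history of earlier updates untouched.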