Metadata Overview:
To facilitate access to ocean data and information, data must be described and classified consistently through the implementation of metadata schemes. Creating metadata is like library cataloguing, except that the metadata creator needs to understand the scientific information behind the data in order to document the datasets properly. Most ocean data have a spatial component (a geographic location), and spatial metadata are used to describe spatial datasets, providing a consistent approach to the storage and retrieval of spatial data. Managing ocean data effectively therefore requires metadata systems that are consistent and interoperable.
Metadata is "data about data". INSPIRE further defines metadata as "information describing spatial datasets and spatial data services and making it possible to discover, evaluate and use them."[1] Metadata can describe just about anything. In environmental sciences such as oceanography, metadata describe the information that scientists collect, telling users the "who, what, when, where, why, and how" of a dataset or data item.[2]
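As an illustration, the "who, what, when, where, why, and how" of a dataset could be captured in a simple record such as the sketch below. The field names and all example values are hypothetical and do not follow any particular metadata standard; they only show the kind of information a metadata record carries.

```python
from dataclasses import dataclass, asdict


@dataclass
class MetadataRecord:
    """A minimal, hypothetical metadata record (not a standard schema)."""
    who: str    # organisation or person responsible for the data
    what: str   # what was measured or observed
    when: str   # time period covered by the data
    where: str  # geographic location or extent
    why: str    # purpose of the data collection
    how: str    # collection method or instrument


# Illustrative values only; they do not describe a real dataset.
record = MetadataRecord(
    who="Example Marine Institute (hypothetical)",
    what="Sea surface temperature measurements",
    when="2020-01-01 to 2020-12-31",
    where="Bounding box 10W to 5E, 45N to 60N",
    why="Monitoring seasonal temperature variability",
    how="Moored buoy sensors, hourly sampling",
)


def summarize(rec: MetadataRecord) -> str:
    """Return the record as one 'field: value' line per element."""
    return "\n".join(f"{key}: {value}" for key, value in asdict(rec).items())


print(summarize(record))
```

Note that the record describes the dataset without containing any of the measurements themselves.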
Metadata describe the data but are not the data themselves. For example, a record in the card catalogue of a local library gives brief details about a book: enough information to know what the book is called, its unique identification number, and how and where to find it. These details are metadata. In a library card catalogue they are bibliographic elements such as Author, Title and Abstract. Metadata have many applications.
Metadata help people who use marine data to find the data they need and to determine how best to use them. Metadata benefit the data-producing organisation as well. As personnel change in an organisation, undocumented data may lose their value, because new employees may have little understanding of the contents and uses of the data.
The data collected and stored by an organization are valuable assets and a substantial amount of time, money and effort has been invested in these assets. However, if users are unable to easily locate data and services, then the full value of those resources will not be realised. By investing time and effort to provide quality and consistently structured metadata, organizations can significantly increase the return on investment of their assets. In addition, lack of knowledge about other organizations' data could lead to duplication of effort.
It may seem burdensome to add the cost of generating metadata to the cost of data collection, but in the long run it is worth the effort. The information needed to create metadata is often readily available when the data are collected. A small amount of time invested at the beginning of a project may save money in the future. The initial expense of clearly documenting data is outweighed by the potential costs of duplicated or redundant data generation.
Metadata are used to create a searchable collection of dataset descriptions called a Data Directory, similar in concept to a library catalogue. The data directory contains details about all datasets that are available and whom to contact to access or acquire the data. In some cases, the directory may also contain direct links to online datasets. Some characteristics of spatial datasets make them difficult to describe in conventional bibliographic terms. A data directory therefore includes information on the geographic extent of each dataset, which permits spatial searches of the datasets it describes. A data directory allows users to evaluate which information is relevant to their needs and provides directions for accessing the desired datasets. A data directory can also be used as a vehicle for advertising products and services.
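The spatial search described above can be sketched as follows, assuming that each directory entry stores its geographic extent as a latitude/longitude bounding box. The class names, fields, and example entries are hypothetical, and the overlap test is a simplification that ignores extents crossing the 180° meridian.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BoundingBox:
    """Geographic extent in decimal degrees (hypothetical representation)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BoundingBox") -> bool:
        # Two boxes overlap when they overlap in both longitude and latitude.
        # Simplification: boxes crossing the 180-degree meridian are not handled.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class DirectoryEntry:
    """One dataset description in a data directory (illustrative fields)."""
    title: str
    contact: str
    extent: BoundingBox
    url: Optional[str] = None  # direct link, if the dataset is online


def spatial_search(directory: List[DirectoryEntry],
                   area: BoundingBox) -> List[DirectoryEntry]:
    """Return the entries whose extent overlaps the area of interest."""
    return [entry for entry in directory if entry.extent.intersects(area)]


# Illustrative entries only; they do not describe real datasets.
directory = [
    DirectoryEntry("North Sea temperature survey", "data@example.org",
                   BoundingBox(-4.0, 51.0, 9.0, 61.0)),
    DirectoryEntry("Western Indian Ocean salinity profiles", "info@example.org",
                   BoundingBox(40.0, -30.0, 100.0, 10.0)),
]

area_of_interest = BoundingBox(0.0, 50.0, 5.0, 55.0)
matches = spatial_search(directory, area_of_interest)
print([entry.title for entry in matches])
```

The search runs on the metadata alone: the extent and contact details are enough to tell a user which datasets cover the area and where to request them, without opening the datasets themselves.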
Metadata have three main functions: Discovery, Evaluation and Access. Together, these functions allow users to determine what data are available, to evaluate the suitability of the data for use, to access the data, and to transfer and process the data.