Markup Language Formats:

Background

A markup language is an artificial language using a set of annotations to text that give instructions regarding how text is to be displayed. Markup languages have been in use for centuries, and in recent years have also been used in computer typesetting and word-processing systems. [From Wikipedia: Markup language (see below)]

Generalized Markup Language (GML)

The granddaddy of all the modern markup languages, GML was developed by IBM in the 1960s. GML introduced the use of "tags" to indicate document sections and important visual elements. Documents could then be printed by devices programmed to apply the desired fonts, document layout, etc., based on these tags. The method was adopted by industry in a second-generation language, Standard Generalized Markup Language (SGML).

Hypertext Markup Language (HTML)

The official development history of HTML began in 1995, with the publication of Version 2.0 of the standard, after several years of early development based on SGML (above). Several later versions followed, leading to the current official standard, Version 4.01, published by the W3C in December 1999. ISO/IEC 15445:2000, an ISO standard based on HTML 4.01, followed in 2000. In early 2008, drafts of Version 5 began circulating.

Hyperlinks

"In computing, a hyperlink is a reference, link, or navigation element in a document to another place, such as another section of the same document or to another document that may be on or part of a (different) domain." [From Wikipedia: Hyperlink] The earliest well-known use of hyperlinks within a PC document occurred in the HELP documentation for the QuickBasic programming software from Microsoft (ca. 1988). Hyperlinks are an essential building block of HTML, XHTML, XML, etc. (see below).
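
As a small illustration of the definition above, the following sketch uses Python's standard html.parser module to list the targets of the hyperlinks in an HTML fragment. The fragment, the example.org URL and the names LinkCollector and collect_links are invented for this example; it is a minimal sketch, not a complete link extractor.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href target of every <a> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs taken from the start tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.targets.append(value)


# Invented sample: one link to a section of the same document and
# one link to a document on a different domain.
SAMPLE_HTML = """
<p>See the <a href="#xml">XML section</a> or the
<a href="https://example.org/markup.html">external overview</a>.</p>
"""


def collect_links(html_text):
    """Return the hyperlink targets found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.targets


if __name__ == "__main__":
    for target in collect_links(SAMPLE_HTML):
        kind = "same document" if target.startswith("#") else "other document"
        print(f"{target} ({kind})")
```

A target beginning with "#" points to another place in the same document; any other target here points to a separate document.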

HTML Training

Extensible Markup Language (XML)

XML stands for Extensible Markup Language; it is extensible because it is not a fixed format like HTML. XML is not a single, predefined markup language: it is a meta-language, "a language for describing other languages." It is a set of rules for creating semantic tags used to describe data.

XML is fast becoming the standard for data representation and exchange on the Internet. The basic ideas underlying XML are very simple: tags on data elements identify the meaning of the data, rather than, as with HTML, specifying how the data should be formatted, and relationships between data elements are provided via simple nesting and references. Web servers and applications encoding data in XML can quickly make the information available in a simple and usable format. As the information content is separated from information rendering, it is easy to provide multiple views of the same data.
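
The separation of content from rendering can be sketched in a few lines of Python. The example below parses one XML record with the standard xml.etree.ElementTree module and produces two different views of it. The station record, its element names and the two view functions are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented record: the tags say what each value means, not how to show it.
STATION_XML = """
<station>
  <name>Station A</name>
  <latitude>-12.5</latitude>
  <longitude>45.25</longitude>
</station>
"""


def as_text(station):
    """View 1: a one-line plain-text summary."""
    return "{} at lat {}, lon {}".format(
        station.findtext("name"),
        station.findtext("latitude"),
        station.findtext("longitude"),
    )


def as_html_row(station):
    """View 2: an HTML table row built from the same data."""
    cells = "".join(f"<td>{escape(child.text or '')}</td>" for child in station)
    return f"<tr>{cells}</tr>"


if __name__ == "__main__":
    station = ET.fromstring(STATION_XML)
    print(as_text(station))
    print(as_html_row(station))
```

Neither view changes the XML itself; each one only decides how the identified values are presented.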

As with HTML, data is identified using tags (identifiers enclosed in angle brackets, like this: <...>). Collectively, the tags are known as "markup". Unlike HTML tags, XML tags describe what the data means rather than how to display it. Where an HTML tag says something like "display this data in bold font" (<b>...</b>), an XML tag acts like a field name in a program. It puts a label on a piece of data that identifies it, for example <cruise_id>...</cruise_id>.
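
The "field name" comparison can be made concrete with a short Python sketch. The cruise record below is invented for illustration. Its element names use underscores because XML names cannot contain spaces, and read_fields is a hypothetical helper that turns each tag into a dictionary key.

```python
import xml.etree.ElementTree as ET

# Invented sample record: each tag labels the piece of data it encloses.
CRUISE_XML = """
<cruise>
  <cruise_id>EX-001</cruise_id>
  <ship>Example Vessel</ship>
  <start_date>2008-03-01</start_date>
</cruise>
"""


def read_fields(xml_text):
    """Return the child elements of the root as a {tag: text} dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    fields = read_fields(CRUISE_XML)
    # The tag name is used exactly like a field name in a program.
    print(fields["cruise_id"])
    print(fields["ship"])
```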

XML allows anyone to design a new, custom-built language. However, before a new XML language can be drafted, designers must agree on three things: which tags will be allowed, how tagged elements may nest within one another, and how they should be processed. The first two, the language's vocabulary and structure, are typically codified in a Document Type Definition, or DTD. The XML standard does not compel language designers to use a DTD, but a DTD (or an equivalent schema) is needed to formally define the relationships between the various elements that form the document.
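
To show what a DTD's vocabulary and structure look like, the sketch below embeds a small, invented DTD in a document and uses Python's standard xml.parsers.expat module to read back which child elements each declared element allows. The cruise vocabulary and the function name declared_children are assumptions for this example. Note that expat reports the declarations but does not validate the document against them.

```python
import xml.parsers.expat

# Invented document with an internal DTD: the DTD names the allowed
# elements (vocabulary) and how they nest (structure).
DOC_WITH_DTD = """<?xml version="1.0"?>
<!DOCTYPE cruise [
  <!ELEMENT cruise (cruise_id, ship)>
  <!ELEMENT cruise_id (#PCDATA)>
  <!ELEMENT ship (#PCDATA)>
]>
<cruise>
  <cruise_id>EX-001</cruise_id>
  <ship>Example Vessel</ship>
</cruise>
"""


def declared_children(xml_text):
    """Map each element declared in the DTD to the child elements it allows."""
    declared = {}

    def on_element_decl(name, model):
        # expat passes the content model as a nested tuple:
        # (type, quantifier, name, children); each child is such a tuple.
        _type, _quantifier, _name, children = model
        declared[name] = [child[2] for child in children]

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return declared


if __name__ == "__main__":
    for element, children in declared_children(DOC_WITH_DTD).items():
        print(element, "->", children or "text only")
```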

Exchange of data 
A major strength of XML is that it facilitates the exchange of data between different applications and operating systems. Because different organisations (or even different parts of the same organisation) rarely standardise on a single set of tools, it can take a significant amount of work for two groups to communicate. XML makes it easy to send structured data across the web so that nothing gets lost in translation. XML is potentially the answer for oceanographic data exchange, as long as all sides agree on the markup to use.
Extensibility 
Extensible means that it is not a fixed format like HTML. While HTML tags must follow pre-set standards, new XML tags can be created by anyone at any time. XML allows groups of people or organisations to create their own customized markup languages for exchanging information in their domain. Industry-specific XML languages already exist for music, chemistry, electronics, linguistics, engineering and mathematics.
Plain Text 
Since XML is not a binary format, files can be created and edited with a standard text editor, making it useful for storing small amounts of data. At the other end of the spectrum, an XML front end to a database makes it possible to efficiently store large amounts of XML data. XML provides scalability for anything from small configuration files to an industry-wide data repository.
Data Identification 
The XML standard specifies how to identify data, not how to display it. HTML, on the other hand, describes how things should be displayed without identifying the content. Because the different parts of the information have been identified, they can be used in different ways by different applications.
Stylability 
When display is important, the style sheet standard, XSL, can dictate how to portray the data. Since XML is inherently style-free, different style sheets can be used to produce output in PostScript, PDF, or any other format.
Hierarchical 
XML documents are hierarchical in structure. Hierarchical document structures are, in general, faster to access because you can drill down to the part you need, like stepping through a table of contents.
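
Several of the properties above (plain text, data exchange, hierarchy) can be seen together in a short Python sketch. One function plays the role of a data producer and writes a small dataset as XML text. Another plays the consumer, parses that text and drills down to a single value by path. The dataset, its element and attribute names, and the function names are all invented for this illustration.

```python
import xml.etree.ElementTree as ET


def build_dataset():
    """Producer side: build a small hierarchical dataset in memory."""
    dataset = ET.Element("dataset")
    cruise = ET.SubElement(dataset, "cruise", {"id": "EX-001"})
    station = ET.SubElement(cruise, "station", {"name": "A"})
    ET.SubElement(station, "temperature", {"units": "degC"}).text = "18.2"
    return dataset


def serialize(root):
    """Turn the tree into plain text that any application can read."""
    return ET.tostring(root, encoding="unicode")


def read_temperature(xml_text):
    """Consumer side: parse the text and drill down to one value."""
    root = ET.fromstring(xml_text)
    node = root.find("./cruise[@id='EX-001']/station[@name='A']/temperature")
    return float(node.text), node.get("units")


if __name__ == "__main__":
    text = serialize(build_dataset())
    print(text)                    # ordinary text, editable in any text editor
    print(read_temperature(text))  # value and units recovered by the consumer
```

The path in read_temperature steps from dataset to cruise to station to temperature, much like following a table of contents.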

XML in 10 Points

Well Formed XML

XML text is considered "well formed" only if it obeys all of XML's syntax rules. Tags must be spelled consistently, every start tag must be paired with a matching end tag, elements must nest properly, and attribute values must be quoted. If text is not well formed, a conforming XML parser will reject it, so XML-compatible programs cannot read it.
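
A quick way to see well-formedness in practice is to hand some text to an XML parser and check whether it is accepted. The sketch below does this with Python's standard xml.etree.ElementTree module. The sample strings and the helper name is_well_formed are invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Invented samples: one correct document and three common syntax errors.
SAMPLES = {
    "properly nested": "<cruise><ship>Example Vessel</ship></cruise>",
    "missing end tag": "<cruise><ship>Example Vessel</cruise>",
    "overlapping tags": "<a><b></a></b>",
    "unquoted attribute": "<cruise id=EX-001></cruise>",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted; the parser stops at the first syntax error in each of the others.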

Valid XML

XML text is considered "valid" only if it is well formed and also conforms to the "semantic rules" set up by the creator of the markup language. These rules are contained in an XML schema (see below) or in a document type definition (DTD; see below). They often constrain the types of data that can be placed in particular fields of a dataset, for example. In essence, XML documents obtain meaning and usability from the existence and content of schemas and DTDs.
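
The difference between well-formed and valid text can be sketched as follows. Python's standard library does not include a validating XML parser, so this example uses a small hand-written rule table as a stand-in for a schema or DTD. It is not a real schema validator. The cast record, its field names and the rules in REQUIRED_FIELDS are assumptions made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules standing in for a schema: the fields a <cast>
# record must contain and the type each field's text must convert to.
REQUIRED_FIELDS = {"station": str, "depth_m": float, "temperature_c": float}


def validation_errors(xml_text):
    """Return a list of rule violations; an empty list means 'valid'."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well formed
    errors = []
    if root.tag != "cast":
        errors.append(f"unexpected root element <{root.tag}>")
    for name, convert in REQUIRED_FIELDS.items():
        value = root.findtext(name)
        if value is None:
            errors.append(f"missing <{name}>")
            continue
        try:
            convert(value)
        except ValueError:
            errors.append(f"<{name}> is not a valid {convert.__name__}: {value!r}")
    for child in root:
        if child.tag not in REQUIRED_FIELDS:
            errors.append(f"unexpected element <{child.tag}>")
    return errors


# Both samples are well formed; only the first satisfies the rules.
GOOD = ("<cast><station>A</station><depth_m>250</depth_m>"
        "<temperature_c>11.4</temperature_c></cast>")
BAD = "<cast><station>A</station><depth_m>deep</depth_m></cast>"

if __name__ == "__main__":
    print(validation_errors(GOOD))
    print(validation_errors(BAD))
```

In real use, the rules would come from the DTD or schema published for the markup language, and a validating parser would apply them.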

XML Examples

XML Training

Extensible Hypertext Markup Language (XHTML)

XHTML can be thought of as the intersection of HTML and XML in many respects, since it is a reformulation of HTML in XML. [from Wikipedia: XHTML (see below)]
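
Because XHTML is HTML written according to XML's rules, an ordinary XML parser can read an XHTML page. The sketch below shows this with Python's standard xml.etree.ElementTree module, and contrasts it with an HTML fragment that uses an unclosed <br> tag. The page content and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Invented XHTML page: every element is closed, including <br />.
XHTML_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Markup Language Formats</title></head>
  <body>
    <p>First line<br />second line</p>
  </body>
</html>"""

# The same paragraph in HTML style, with an unclosed <br> tag.
HTML_ONLY = "<p>First line<br>second line</p>"


def parses_as_xml(text):
    """Return True if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def page_title(xhtml_text):
    """Read the <title> of an XHTML page using a namespace-qualified path."""
    root = ET.fromstring(xhtml_text)
    return root.findtext(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")


if __name__ == "__main__":
    print(parses_as_xml(XHTML_PAGE))
    print(parses_as_xml(HTML_ONLY))
    print(page_title(XHTML_PAGE))
```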

XHTML Training

Wikitext

Wikitext language or wiki markup is a markup language that offers a simplified alternative to HTML and is used to write pages in wiki websites such as Wikipedia. [From Wikipedia: Wikitext (see below)] It was used in the writing of OceanTeacher, but it does not play any role in marine data management, per se.