BUFR and GRIB Formats:

 

Background

Oceanographic analyses often involve merging observations collected by different ships and organizations in different areas and years. Because these organizations often use different data formats, merging the data is difficult. Data format changes cause delays and, occasionally, errors. Also, observations reported in “delayed-mode” via IODE use one set of data formats, while observations reported in “real-time” via IGOSS are stored in another set of formats. Many data and format conversions are required to assemble global data sets, such as the World Ocean Database. To understand and monitor climate changes in a useful time period, these data errors and delays must be minimized.

Moreover, observations may be made in one set of units while later processing may require a change of units. Each time data are converted from one set of units to another, errors may be introduced. As an example, consider a conversion of a temperature observation from one unit to another. The conversion can cause a loss of precision if the observation is recorded to whole degrees Fahrenheit and converted to whole degrees Celsius, or an artificial gain of precision if it is converted to degrees and tenths of a degree Celsius. Conversions of temperature are relatively simple compared to processing chemical and biological observations, which entail more complex definitions of parameters.
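The temperature example above can be sketched in a few lines of Python. This is only an illustration: the sample observations and the function names are invented for the purpose, and no particular archive is assumed to process data this way.

```python
def f_to_c(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def c_to_f(deg_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9.0 / 5.0 + 32.0


# Hypothetical observations recorded to whole degrees Fahrenheit.
observed_f = [60, 61, 62, 63]

# Loss of precision: four distinct readings collapse into two Celsius values.
whole_c = [round(f_to_c(f)) for f in observed_f]

# Artificial gain of precision: tenths of a degree Celsius suggest a
# resolution finer than the whole-degree Fahrenheit reading supports.
tenths_c = [round(f_to_c(f), 1) for f in observed_f]

# Converting the whole-degree Celsius values back does not recover the input.
round_trip_f = [round(c_to_f(c)) for c in whole_c]

print(whole_c)
print(tenths_c)
print(round_trip_f)
```

Storing the value in its original units and resolution, together with a description of those units, avoids both effects. That is the approach taken by the self-describing formats discussed below.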

Ocean Data Formats

Many ocean observations data formats are “fixed”, that is, they cannot be easily expanded to contain additional parameters beyond those for which they were designed. Examples are the archive formats World Ocean Database, ICES Profile Format, and US NODC SD2 Format. Special software is required to read each format. As described above, merging different data sets causes data delays and possible errors.

IOC has long recognized the need for a flexible format for ocean observations data that is self-describing and table-driven so that different data sets can be read with a single set of software. The first IOC table-driven format, General Format 3 (GF-3), was designed for use with magnetic tapes but is not convenient for use on networks. Later table-driven formats for ocean data include Blueprint and TADE developed by IDOE and FLEX developed for the IGOSS program. Blueprint, TADE, and FLEX formats were never widely used, however. Rather than develop special codes for ocean observations, it would be preferable for oceanographers to adopt observations codes used by meteorologists to promote integration and global exchange of ocean and weather observations.

Meteorological Data Formats

Meteorologists have long recognized the need for fast global data exchange. Tomorrow’s weather here depends on today’s weather somewhere else. Large amounts of weather and satellite data are exchanged each day routinely in real-time around the world. Many weather observations are transmitted to analysis and weather forecast centers in “real-time”, that is, within 3 hours of observation time so that the data can be used in operational weather forecasts. In contrast, most research and ocean observations are submitted to data collection centers in “delayed-mode” with delays of years to decades.

Alphanumeric Data Formats

Exchange of weather data started with telegraph communications and was expanded with telex communications during World War II. In 1951 the World Meteorological Organization (WMO) was formed and started development of a global telex data network, the Global Telecommunications System (GTS). With the advent of numerical forecasting in the 1950s and 60s, the GTS was greatly expanded.

GTS telex "bulletins" of meteorological observations consist of a header record and a series of reports of observed values at individual stations. The telex reports consist of a series of five-character alphanumeric code groups where the position of a character is “fixed” in a group to indicate its function. The code groups are typically five characters long because telex companies charge by the word and words average five characters in length.
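The idea of a code group whose character positions are fixed can be sketched as follows. The layout assumed here follows the air-temperature group of the FM 12 SYNOP code as it is commonly described (an indicator digit 1, a sign digit, and the temperature in tenths of a degree Celsius). The sample groups are invented and the function name is hypothetical. The WMO Manual on Codes remains the authority on the actual layouts.

```python
def decode_temperature_group(group: str) -> float:
    """Decode a five-character group of the assumed form 1sTTT.

    Position 1: group indicator, always '1'.
    Position 2: sign digit, '0' for positive or zero, '1' for negative.
    Positions 3-5: temperature in tenths of a degree Celsius.
    """
    if len(group) != 5 or not (group.isascii() and group.isdigit()):
        raise ValueError(f"garbled group: {group!r}")
    if group[0] != "1":
        raise ValueError(f"not a temperature group: {group!r}")
    if group[1] not in "01":
        raise ValueError(f"invalid sign digit in group: {group!r}")
    sign = -1.0 if group[1] == "1" else 1.0
    return sign * int(group[2:]) / 10.0


print(decode_temperature_group("10123"))  # positive temperature
print(decode_temperature_group("11045"))  # negative temperature
```

Because the meaning of each character depends entirely on its position, a single garbled or dropped character invalidates the whole group. This is why decoders for these codes need extensive checks of the kind shown in the `ValueError` branches.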

WMO developed approximately 100 alphanumeric codes over the years. Codes for oceanographic observations include:

A special alphanumeric code, GRID (FM-47), was developed for transmission of gridded, numerical analysis and forecast fields as alphanumeric characters. All these codes are described in the WMO Manual of Codes. Readers are encouraged to glance through the Manual to realize the variety and complexity of the alphanumeric codes.

Although the alphanumeric codes supported the development of today's global weather forecasting, they are cumbersome and inflexible. Changes to the codes require international agreement and thus years of preparation for suitable software changes to be made, tested and implemented at all national weather forecasting centers. Garbling of telex data is common and requires extensive software at the weather centers to detect and correct errors in the codes. The cost of developing and maintaining this software at major weather forecasting centers is comparable to the cost of developing the weather forecasting models themselves.

Table-Driven Code Forms (TDCF)

In the 1970s, when the development of error-checking and correcting transmission lines (such as the Internet) allowed error-free transmission of data files, WMO began developing two new codes to replace the alphanumeric codes. These are the Table-Driven Code Forms (TDCF) codes: BUFR for observations and GRIB for gridded fields. These codes were designed with the following elements in mind:

Although the alphanumeric codes are still used at small regional weather centers, the TDCF codes have now largely replaced the alphanumeric codes at most of the major weather forecasting centers globally, including ECMWF, FNMOC, JMA, NCEP, and UKMO. No new changes to the alphanumeric codes are being allowed. Many of these centers archive data in the TDCF codes to minimize format conversions between data transmission and archive functions.

Although the TDCF formats are used for operational meteorology, both oceanic and atmospheric researchers use the netCDF and HDF self-describing formats. These formats were designed for use with grid data and may be more convenient for large, multi-grid data sets than GRIB. These formats are, however, not ideal formats for observations. [EDITOR'S NOTE: The author's statement is very true for HDF, but perhaps somewhat less so, as it concerns NetCDF.]

Binary Universal Form for the Representation of meteorological data (BUFR FM-94)

The Binary Universal Form for the Representation of meteorological data (BUFR) is the primary format used operationally on the World Meteorological Organization (WMO) Global Telecommunications System for real-time global exchange of weather and satellite observations. BUFR is a replacement code for a variety of alphanumeric codes which are still used by small regional weather centers.

BUFR is self-describing and table-driven, that is, it is a single format that uses tables to encode a wide variety of meteorological data: land and ship observations, aircraft observations, wind profiler observations, radar data, climatological data, etc. The primary BUFR tables are Table B, which defines codes for common weather elements (element name, range, precision, bit lengths, etc.), and Table D, which defines codes for encoding common sequences of observations, such as surface weather observations from a ship.
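A minimal sketch of what "table-driven" means in practice is given below. It assumes the usual Table B convention that a stored integer is turned into a physical value by adding a reference value and dividing by a power of ten, and that a field with all bits set marks a missing value. The two table entries are illustrative stand-ins and should be checked against the official WMO tables. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableBEntry:
    name: str
    unit: str
    scale: int       # decimal scale: value = (raw + reference) / 10**scale
    reference: int   # reference value added to the stored integer
    bit_width: int   # number of bits used in the data section


# Illustrative entries keyed by (F, X, Y); not a copy of the official table.
TABLE_B = {
    (0, 12, 1): TableBEntry("Dry-bulb temperature", "K", 1, 0, 12),
    (0, 22, 42): TableBEntry("Sea temperature", "K", 1, 0, 12),
}


def decode_value(descriptor: tuple, raw: int) -> Optional[float]:
    """Turn a stored integer into a physical value using the table entry."""
    entry = TABLE_B[descriptor]
    if raw == (1 << entry.bit_width) - 1:  # all bits set: missing value
        return None
    return (raw + entry.reference) / 10 ** entry.scale


def encode_value(descriptor: tuple, value: float) -> int:
    """Turn a physical value into the integer stored in the message."""
    entry = TABLE_B[descriptor]
    return round(value * 10 ** entry.scale) - entry.reference


raw = encode_value((0, 12, 1), 288.1)
print(raw, decode_value((0, 12, 1), raw))
```

Supporting a new parameter then means adding a table entry rather than changing the decoding software, which is the main practical advantage over the fixed formats described earlier.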

A BUFR message consists of 6 sections, numbered 0 to 5, of which the first and last sections are header information. Section 3 of the format provides a template of the bit stream of observations data which follows in Section 4. The template consists of a series of 16-bit data descriptors of the form F X Y, where F (2 bits) indicates the type of descriptor (for example, a single element, a replication of a block of descriptors, or a Table D sequence), X (6 bits) denotes the class of observation in Table B, and Y (8 bits) denotes the element within that class. A good description of BUFR messages is given in Wikipedia. More detailed descriptions and template examples are available from WMO, FNMOC, and NCEP. Sample BUFR tables can be downloaded from EUMETNET OPERA.
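The 2 + 6 + 8 bit layout of a descriptor can be unpacked with a few shifts and masks. The sketch below assumes the descriptor is stored as a big-endian 16-bit integer with F in the two most significant bits. The function names are hypothetical.

```python
def unpack_descriptor(word: int) -> tuple:
    """Split a 16-bit descriptor into its F (2 bits), X (6 bits), Y (8 bits)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("descriptor must fit in 16 bits")
    f = word >> 14           # top 2 bits
    x = (word >> 8) & 0x3F   # next 6 bits
    y = word & 0xFF          # low 8 bits
    return f, x, y


def pack_descriptor(f: int, x: int, y: int) -> int:
    """Combine F, X and Y into a single 16-bit descriptor."""
    if not (0 <= f < 4 and 0 <= x < 64 and 0 <= y < 256):
        raise ValueError("F, X or Y out of range")
    return (f << 14) | (x << 8) | y


def format_descriptor(word: int) -> str:
    """Render a descriptor in the conventional 'F XX YYY' notation."""
    f, x, y = unpack_descriptor(word)
    return f"{f} {x:02d} {y:03d}"


# Two bytes as they might appear in Section 3, read as one big-endian integer.
word = int.from_bytes(b"\x0c\x01", "big")
print(format_descriptor(word))
print(format_descriptor(pack_descriptor(3, 1, 11)))
```

A decoder walks the list of descriptors in Section 3, looks each one up in Table B or expands it through Table D, and in this way learns how many bits to read from Section 4 and how to scale them.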

WMO Manual on Codes (including BUFR tables and templates) - from WMO

BUFR templates are used within the oceanographic community to meet the requirements of specific oceanographic observations, such as data from subsurface profiling floats (Argo), XBT temperature profiles, and wave data from buoys.

Character form for the Representation and EXchange of data (CREX FM-95)

While packed binary formats such as BUFR are efficient for computers and data transmission, they are not convenient for manual encoding and decoding. WMO has developed a character version of BUFR, CREX, for manual use with small data sets and as an interim format until all weather forecasting centers implement BUFR software.

GRIdded Binary (GRIB FM-92)

The GRIdded Binary format, GRIB, is the primary World Meteorological Organization format for the storage and transmission of two-dimensional weather and climate grids, including the all-important numerical weather forecasts. GRIB replaces an older alphanumeric code, GRID. Like BUFR, GRIB is table-driven: tables define weather elements, grid dimensions, scaling factors, map projections, bit lengths, etc. The original version of GRIB, GRIB1, has largely been displaced by the newer GRIB2. Like BUFR, GRIB is little used within the oceanographic community, except by modelers, who have used it for many years as both input to and output from models that couple the ocean and atmosphere. They necessarily used GRIB as input, because that was how they obtained surface forcing data, and they used GRIB as output in order to use the powerful graphics display programs that are available.
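The role of the scaling factors and bit lengths can be illustrated with GRIB "simple packing", in which each grid value Y is commonly described as being recovered from a stored non-negative integer X as Y = (R + X * 2^E) / 10^D, where R is a reference value, E a binary scale factor, and D a decimal scale factor. The sketch below assumes that formula. The function names and sample values are hypothetical, and a real encoder would also choose E and the bit length to suit the data.

```python
def pack_simple(values, decimal_scale, binary_scale=0):
    """Return (reference, packed integers) for a list of grid values."""
    scaled = [v * 10.0 ** decimal_scale for v in values]
    reference = min(scaled)  # R: makes every packed integer non-negative
    packed = [round((s - reference) / 2.0 ** binary_scale) for s in scaled]
    return reference, packed


def unpack_simple(packed, reference, decimal_scale, binary_scale=0):
    """Recover grid values with Y = (R + X * 2**E) / 10**D."""
    return [
        (reference + x * 2.0 ** binary_scale) / 10.0 ** decimal_scale
        for x in packed
    ]


# Hypothetical temperatures in kelvin, kept to one decimal place (D = 1).
field = [271.3, 272.0, 280.5]
reference, packed = pack_simple(field, decimal_scale=1)
bits_needed = max(packed).bit_length()
restored = unpack_simple(packed, reference, decimal_scale=1)

print(packed, bits_needed)
print(restored)
```

Subtracting the reference value keeps the stored integers small, so a field that spans only a few degrees needs only a few bits per grid point.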

Detailed information on GRIB is available at many web sites.

GRIB1

Guide to WMO Code Form GRIB1 - from WMO

WMO Manual on Codes, Volume 1.2, Part B (Binary Codes), Part C (Common Features to Binary and Alphanumeric Codes)

NOAA Office Note 388, GRIB (Edition 1)

GRIB2

FM 92 - GRIB Edition 2 - from WMO

Guide to the WMO Table Driven Code Form Used for the Representation and Exchange of Regularly Spaced Data in Binary Form (PDF) - from WMO

NCEP Additions to GRIB Edition 2 Pre-Operational Approved by CBS for Full Implementation on 7 November 2007

TDCF Software

While writing software to encode and decode BUFR and GRIB is not a simple task, it is something that can be done once by specialists and then used widely by others. BUFR and GRIB encode/decode software is available from many sources, including the following: David Taylor, EUMETNET OPERA, ECMWF, US Navy Master Environmental Library, NOAA NCEP, and the NOAA National Digital Forecast Database. The IGES GrADS program and the Unidata Integrated Data Viewer (IDV) program are more complex programs for plotting and analysis of BUFR and GRIB data, along with netCDF and many types of satellite data.
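Before passing a file to one of these packages, it can be useful to confirm what kind of message it contains. The sketch below assumes the commonly documented framing in which a message begins with the four ASCII characters "BUFR" or "GRIB", carries its edition number in the eighth octet, and ends with the four characters "7777". The function names are hypothetical, and the sample bytes are a made-up frame rather than a valid message.

```python
def identify_message(data: bytes) -> tuple:
    """Return (format name, edition number) from the first 8 octets."""
    if len(data) < 8:
        raise ValueError("too short to hold an indicator section")
    marker = data[:4]
    if marker not in (b"BUFR", b"GRIB"):
        raise ValueError(f"unrecognized start marker: {marker!r}")
    edition = data[7]  # octet 8, counting from 1
    return marker.decode("ascii"), edition


def has_end_marker(data: bytes) -> bool:
    """Check for the '7777' terminator that closes a message."""
    return data.endswith(b"7777")


# Made-up 12-octet frame: marker, 3-octet total length, edition, terminator.
sample = b"BUFR" + (12).to_bytes(3, "big") + bytes([4]) + b"7777"
print(identify_message(sample), has_end_marker(sample))
```

A check of this kind only identifies the container. Decoding the contents still requires the tables and one of the software packages listed above.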

Use of TDCF Codes in Oceanography

Expanded use of the TDCF codes for oceanographic observations would have several benefits. If oceanographers were to use the WMO TDCF codes for ocean data, it would:

1. Allow local and regional ocean observations to be semi-automatically merged into global operational data streams so that local conditions can be seen in a larger time-space context,

2. Promote cooperation between oceanographers and meteorologists,

3. Promote “End-to-End” data management using common data formats for both data reporting and archive and thus reduce data delays and errors,

4. Promote common use of communications lines to conserve funds and software development, and,

5. Promote operational oceanography, real-time applications of ocean data, and development of integrated models of the atmosphere and ocean.

As a step towards use of TDCF codes for ocean observations, the GF-3 ocean data descriptors and other ocean variables, such as XBT fall rate parameters, have been added to the BUFR tables. http://www.ices.dk/ocean/codes/parameter.tx