Semantic elements in the form of metadata describe the meaning of data, as explained by Nativi et al. (2008). For this purpose, the NetCDF data model has been extended by a set of conventions that allow the different scientific communities using NetCDF to describe their data fully (Nativi et al., 2008). Naming and attribute conventions were established to make NetCDF data self-describing not only for humans but also for machines, as Caron (2004) and Rew et al. (2010) explain. NetCDF conventions define important names for dimensions and variables and specify relevant standard attributes with defined values. The purpose of agreeing on conventions in NetCDF is to obtain standardized data that contain the desired metadata and can easily be shared within a scientific community (Hartnett, 2009).
Figure 2.6: UML diagram of the NCML Metadata Object Model (Caron et al., 2002)
Conventions determine not only the meaning of variables and dimensions but, in particular, also specify the coordinate system employed, so that data can be correctly georeferenced. The coordinate system of a NetCDF dataset is defined by a set of coordinate variables (Caron, 2004). However, complex integrated georeferencing information is not part of the data models and APIs of NetCDF and OPeNDAP, and it is usually not declared explicitly, for example in the form of attributes. Since this information is inferred, it depends on general agreements within conventions (Caron, 2006; Caron & Domenico, 2006; Caron, 2008; Nativi et al., 2008; Rew, 2004; Rew et al., 2010). As a result, interoperability with Geographic Information Systems may suffer from a lack of georeferencing metadata in a NetCDF file. Rew et al. (2010) argue in this regard that specific additions to the NetCDF data model would make some parts of these conventions unnecessary or would allow some defined forms of metadata, but that this would complicate the model and make it less general. By identifying standardized variable names for the coordinate variables, NetCDF associates the distinct values of the coordinate variables with the corresponding coordinate axes. Apart from this coordinate variable convention of Unidata, nothing depends on the names of variables. The axes can also be distinguished by NetCDF if special keywords are used in defined attributes of the coordinate variables (Caron, 2004; Gregory, 2003; Rew et al., 2010).
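The way coordinate variables and axes are recognized can be illustrated with a minimal Python sketch. It does not use a NetCDF library: the dictionary `dataset` is a hypothetical in-memory stand-in for a file header, and the helper names `coordinate_variables` and `guess_axis` are introduced here for illustration only. The sketch assumes the common rule that a coordinate variable is a one-dimensional variable sharing its name with its dimension, and it uses a simplified subset of the attribute keywords known from the CF and COARDS conventions (`axis`, and the `units` values `degrees_north`, `degrees_east` and "... since ..."). The actual conventions contain more rules than are shown here.

```python
# Hypothetical in-memory stand-in for a NetCDF file header (assumption:
# a real file would be read through a NetCDF library instead).
dataset = {
    "dimensions": {"time": 4, "lat": 3, "lon": 5},
    "variables": {
        "time": {"dims": ("time",), "attrs": {"units": "days since 2000-01-01"}},
        "lat": {"dims": ("lat",), "attrs": {"units": "degrees_north"}},
        "lon": {"dims": ("lon",), "attrs": {"units": "degrees_east", "axis": "X"}},
        "temp": {"dims": ("time", "lat", "lon"), "attrs": {"units": "K"}},
    },
}


def coordinate_variables(ds):
    """Return the names of 1-D variables that are named after their dimension."""
    return [
        name
        for name, var in ds["variables"].items()
        if var["dims"] == (name,) and name in ds["dimensions"]
    ]


def guess_axis(attrs):
    """Guess the axis type from attribute keywords; None if nothing matches.

    Simplified assumption: only the `axis` attribute and three `units`
    patterns are checked.
    """
    axis = attrs.get("axis")
    if axis in ("X", "Y", "Z", "T"):
        return axis
    units = attrs.get("units", "")
    if units == "degrees_north":
        return "Y"
    if units == "degrees_east":
        return "X"
    if " since " in units:
        return "T"
    return None


# Map every coordinate variable to the axis inferred from its attributes.
axes = {
    name: guess_axis(dataset["variables"][name]["attrs"])
    for name in coordinate_variables(dataset)
}
print(axes)
```

In this sketch the data variable `temp` is not treated as a coordinate variable, because its name matches no dimension, while `time`, `lat` and `lon` are assigned to axes purely on the basis of agreed names and attribute values. This is the kind of inference that depends on conventions rather than on the data model itself.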
Table 2.1 on page 40 lists the current conventions for NetCDF 3 that are registered on Unidata’s conventions web page (see Unidata Inc. n.d.c). Conventions for NetCDF 4 are still under development and consequently not yet mature (Unidata Inc. n.d.c). If a convention is employed for a NetCDF file, it is declared in the file’s global attribute Conventions (Caron, 2004; Unidata Inc. n.d.c).
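A reader of a NetCDF file can use the global attribute Conventions to decide which rules to apply. The following Python sketch shows one possible way to evaluate it. The function name `parse_conventions` and the example attribute values are hypothetical, and the sketch assumes that several convention names may be listed in one string, separated either by commas or by blanks; a file that follows a single convention simply yields a one-element list.

```python
def parse_conventions(global_attrs):
    """Split the value of the global attribute `Conventions` into names.

    Assumption: names are comma-separated if the string contains a comma,
    otherwise blank-separated. A missing attribute yields an empty list.
    """
    value = global_attrs.get("Conventions", "")
    parts = value.split(",") if "," in value else value.split()
    return [part.strip() for part in parts if part.strip()]


# Hypothetical global attributes of three files.
print(parse_conventions({"Conventions": "CF-1.0"}))
print(parse_conventions({"Conventions": "COARDS, CF-1.0"}))
print(parse_conventions({"title": "no convention declared"}))
```

Returning an empty list for a file without the attribute lets the caller fall back to generic handling instead of failing.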
One of these NetCDF conventions is the Climate and Forecast (CF) metadata convention (see CF Metadata Inc. n.d.b), which is one of the most popular conventions for NetCDF 3, as stated by Nativi et al. (2008). This convention, established in 2003, is recommended by Unidata for gridded data within the climate and meteorological community and is quite loose in order to ensure backward compatibility with the earlier COARDS convention. Since it was designed with metadata design in general in mind, it can also be implemented in other formats, such as XML. CF also includes georeferencing conventions and defines various projections and transformation parameters. Any community member can suggest improvements and modifications and report problems, which are then discussed publicly; this makes CF an open standard. Features are added to the convention only when they are required, not in anticipation of future needs. This process is overseen by volunteer committees. Used in conjunction with OPeNDAP, the CF convention provides all necessary specifications for accessing remote data. NetCDF conforming to the Climate and Forecast conventions, in combination with OPeNDAP, was approved as a recommended standard for gridded data by the Steering Team for the Data Management and Communications (DMAC) subsystem of the Integrated Ocean Observing System (IOOS) of the United States in 2008 (Balaji et al., 2008; Caron, 2006, 2008, 2010, 2011; Eaton et al., 2009; Gregory, 2003, 2005; Hankin et al., 2009; Hartnett, 2009; Rew, 2004; Rew et al., 2010).
The CF Convention was designed for the use of metadata, not for dataset discovery, since it consists only of metadata about where and how data were produced, as Rew (2004) remarks. For discovery purposes, the NetCDF Attribute Convention for Dataset Discovery was developed. It is used for discovery systems, such as digital libraries like THREDDS, and draws on several elements of the CF Convention (Caron, 2011). However, it defines only attributes and cannot be regarded as a NetCDF convention in the narrower sense, since coordinate axes and other important elements are not defined (Unidata Inc. n.d.c). For point observations in the form of point, time series, trajectory, or profile observation datasets, Unidata recommends the Unidata Observation Dataset Convention (Caron, 2006). This convention, however, will be deprecated in favor of a new CF Convention for Point Observations as soon as the latter is released (Unidata Inc. n.d.c).
The COARDS (Cooperative Ocean/Atmosphere Research Data Service) convention from 1995 is an older, established NetCDF standard that is still widely used for global atmospheric and oceanographic data. The CF convention is a backward-compatible successor of COARDS that extends and generalizes this older standard. In this context, the COARDS conventions can be seen as a subset of the CF conventions, though some features are deprecated (Balaji et al., 2008; Eaton et al., 2009; Unidata Inc. n.d.c).
The PMEL-EPIC convention was developed by NOAA for oceanographic profile or time-series in-situ data. This convention is designed for creating NetCDF files with EPIC’s I/O System Library, which is layered on top of the NetCDF library. It is not intended to provide the full functionality of NetCDF, but to simplify the production of standardized oceanographic NetCDF files (PMEL/EPIC Inc. n.d.). The ARGO GDAC and ARGO NODC NetCDF conventions were also developed for oceanographic in-situ data, in this case for the ARGO broad-scale global array of temperature/salinity floats. The Argo NetCDF convention was extended to be compatible with the CF convention (Hankin et al., 2009; Unidata Inc. n.d.c).