0.85 Subsistemas de largueros
IV. RESULTADOS
4.1.1. Propiedades físicas
databases
THEORYSurname First name Home Address
Telephone No.
Lessons taken
Figure 4.3 Card index record from ski school manual database
Database management systems
The data in a computer database are managed and accessed through a database management system (DBMS). Individual application programs will access the data in the database through the DBMS. For example, to book a new client’s ski lessons, the book-ing clerk will use an application produced usbook-ing capabilities offered by the DBMS. This will instruct them to fill in a data entry form, which will then automatically update data in the database. The clerk will not need to interact with the database directly or understand how data are structured within the database. A conceptual view of the Happy Valley data as they might be organized for such a computer database, and the role of the DBMS, is shown in Figure 4.4 (adapted from Reeve, 1996).
There are many definitions of a DBMS. Dale and McLaughlin (1988) define a DBMS as a computer program to control the storage, retrieval and modi-fication of data (in a database). Stern and Stern (1993) consider that a DBMS will allow users to join, manipulate or otherwise access the data in any number of database files. A DBMS must allow the definition of data and their attributes and relation-ships, as well as providing security and an interface between the end users and their applications and the data itself. From such definitions the functions of a DBMS can be summarized as:
file handling and file management (for creating, modifying or deleting the database structure);
adding, updating and deleting records;
the extraction of information from data (sorting, summarizing and querying data);
maintenance of data security and integrity (housekeeping, logs, backup); and
application building.
The overall objective of a DBMS is to allow users to deal with data without needing to know how the data are physically stored and structured in the computer. To achieve this, DBMS usually comprise software tools for structuring, relating and querying data; tools for the design of data entry and report forms; and application generators for the creation of customized applications.
A DBMS manages data that are organized using a database data model. This is analogous to the way in which spatial data are organized in a GIS according to a spatial data model (for example, raster or vector). Database data models for GIS are similar to those used for databases elsewhere.
DATABASE DATA MODELS
There are a number of different database data models.
Amongst those that have been used for attribute data in GIS are the hierarchical, network, relational, object-relational and object-oriented data models. Of these the relational data model has become the most widely used and will be considered in detail here, whilst the models based on objects growing in popu-larity. Further details of the hierarchical and network models can be found in older GIS texts (Aronoff, 1991;
Bernhardsen, 1999; DeMers, 2002).
The Relational Database Model
At present the relational database model dominates GIS. Many GIS software packages link directly to commercial relational database packages, and others include their own custom-designed relational data-base software. Some GIS use a relational datadata-base to handle spatial as well as attribute data.
The relational data model is based on concepts proposed by Codd (1970). Data are organized in a series of two-dimensional tables, each of which con-tains records for one entity. These tables are linked by common data known as keys. Queries are possi-ble on individual tapossi-bles or on groups of tapossi-bles. For the Happy Valley data, Figure 4.5 illustrates an example of one such table.
Travel
Figure 4.4 The database approach to data handling
ApplicationData
Each table in a relational database contains data for one entity. In the example in Figure 4.5 this entity is ‘hotel’. The data are organized into rows and columns, with the columns containing the attributes of the entity. Each of the columns has a distinctive name, and each of the entries in a single column must be drawn from the same domain (where a domain may be all integer values, or dates or text). Within a table, the order of the columns has no special significance. Other characteristics are listed by Reeve (1996). There can be only one entry per cell; each row must be distinctive (so that keys that use unique row entries are possible – in GIS location is often the key); and null values are possi-ble where data values are not known.
The terminology of relational databases can be confusing, since different software vendors have adopted different terms for the same thing. Table 4.1 illustrates the relationship between relational data-base terminology and the traditional table, or simple computer file. Figure 4.6 applies this terminology to the table suggested for the Happy Valley database. A useful shorthand way of describing a table is using its
‘intension’. For the table in Figure 4.5 this would be:
HOTEL (Hotel ID, Name, Address, No. rooms, Standard).
The data in a relational database are stored as a set of base tables with the characteristics described above.
Other tables are created as the database is queried and these represent virtual views. The table structure is extremely flexible and allows a wide variety of queries on the data. Queries are possible on one table at a time (for example, you might ask ‘which hotels have more than 14 rooms?’ or ‘which hotels are luxury standard?’), or on more than one table by linking through key fields (for instance, ‘which passengers originating from the UK are staying in luxury hotels?’
or ‘which ski lessons have pupils who are over 50
Hotel ID 10 Ski School Road
Number of rooms
Figure 4.5 Relational database table data for Happy Valley
Hotel ID 10 Ski School Road
Number of rooms
Figure 4.6 Database terminology applied to Happy Valley table
TABLE 4.1 Relational database terminology
Paper version File version RDBMS
Table File Relation
Row Record/case Tuple
Column Field Attribute
Number of Number of Degree
columns fields
Number of rows Number of cases Cardinality Unique ID Primary key Index
Possible values Domain Source: Adapted from Date, 1986
years of age?’). Queries generate further tables, but these new tables are not usually stored. There are few restrictions on the types of query possible.
With many relational databases querying is facili-tated by menu systems and icons, or ‘query by example’ systems. Frequently, queries are built up of expressions based on relational algebra, using com-mands such as SELECT (to select a subset of rows), PROJECT (to select a subset of columns) or JOIN (to join tables based on key fields). SQL (standard query language) has been developed to facilitate the query-ing of relational databases. The advantages of SQL for database users are its completeness, simplicity, pseudo English-language style and wide application.
However, SQL has not really developed to handle geographical concepts such as ‘near to’, ‘far from’ or
‘connected to’.
The availability of SQL is one of the advantages of the relational database model. Additionally, the model has a sound theoretical base in mathematics, and a simple logical data model that is easy to under-stand. The relational database model is more flexible than either of the previously used hierarchical or network models. However, the model will always
produce some data redundancy and can be slow and difficult to implement. There are also problems with the handling of complex objects such as those found in GIS (and CAD and knowledge-based applications) as there is a limited range of data types, and difficul-ties with the handling of time. Seaborn (1995) considers that many of the limitations of relational databases in GIS stem from the fact that they were developed to handle simple business data, and not complex multi-dimensional spatial data. However, they have been widely adopted and successfully used.
Relational databases are predominantly used for the handling of attribute data in GIS. For example, ESRI’s ArcGIS maintains an attribute table in rela-tional database software, using a unique ID to link this to spatial data.
CREATING A DATABASE
Database design and implementation are guided by the relationships between the data to be stored in the database. The database design process is con-cerned with expressing these relationships, then
The steps involved in database creation, suggested by Oxborrow (1989) and Reeve (1996), are summarized in Table 4.2 and described below.
1 Data investigation is the ‘fact finding’ stage of data base creation. Here the task is to consider the type, quantity and qualities of data to be included in the data-base. The nature of entities and attributes is decided.
2 Data modelling is the process of forming a con-ceptual model of data by examining the relationships between entities and the characteristics of entities and attributes. This stage, like the data investigation stage, can be carried out independently of the soft-ware to be used.
3 Database design is the creation of a practical design for the database. This will depend on the data-base software being used, and its data model. This is the process of translating the logical design for the database (produced during the data modelling stage) into a design for the chosen DBMS. Field names, types and structure are decided. In practice, the design will be a compromise to fit the database design model with the chosen DBMS.
4 Database implementation is the procedure of populating the database with attribute data, and this is always followed by monitoring and upkeep, includ-ing fine tuninclud-ing, modification and updatinclud-ing.