UNIT 11 - SPATIAL OBJECTS AND DATABASE MODELS

Compiled with assistance from Timothy L. Nyerges, University of Washington

• A. INTRODUCTION
• B. POINT DATA
• C. LINE DATA
• D. AREA DATA
• E. REPRESENTATION OF CONTINUOUS SURFACES
• REFERENCES
• EXAM AND DISCUSSION QUESTIONS
• NOTES

This unit continues the development of basic concepts about representing reality as spatial data. Here we look at how entities in the real world are represented by three kinds of spatial objects: points, lines and areas.

A. INTRODUCTION

• the objects in a spatial database are representations of real-world entities with associated attributes

• the power of a GIS comes from its ability to look at entities in their geographical context and examine relationships between entities

• thus a GIS database is much more than a collection of objects and attributes

• in this unit we look at the ways a spatial database can be assembled from simple objects
• e.g. how are lines linked together to form complex hydrologic or transportation networks
• e.g. how can points, lines or areas be used to represent more complex entities like surfaces?

B. POINT DATA

• the simplest type of spatial object

• choice of entities which will be represented as points depends on the scale of the map/study
• e.g. on a large scale map - encode building structures as point locations
• e.g. on a small scale map - encode cities as point locations

• the coordinates of each point can be stored as two additional attributes

• information on a set of points can be viewed as an extended attribute table
• each row is a point - all information about the point is contained in the row
• each column is an attribute
• two of the columns are the coordinates

overhead - Point data attribute table
• here northing and easting represent y and x coordinates

• each point is independent of every other point, represented as a separate row in the database model
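
The extended attribute table described above can be sketched as a list of rows. This is only a minimal illustration: the field names (`point_id`, `name`, `easting`, `northing`), the sample values and the helper `points_in_window` are hypothetical and are not taken from the overhead.

```python
# Hypothetical point table: one row per point, one column per attribute.
# Two of the columns (easting, northing) hold the x and y coordinates.
points = [
    {"point_id": 1, "name": "well A", "easting": 500.0, "northing": 1200.0},
    {"point_id": 2, "name": "well B", "easting": 650.0, "northing": 1175.0},
    {"point_id": 3, "name": "well C", "easting": 480.0, "northing": 900.0},
]


def points_in_window(rows, xmin, ymin, xmax, ymax):
    """Return the rows whose coordinates fall inside a rectangular window."""
    return [
        r for r in rows
        if xmin <= r["easting"] <= xmax and ymin <= r["northing"] <= ymax
    ]


# Each row is independent, so a spatial query is just a filter on two columns.
selected = points_in_window(points, 450.0, 1000.0, 700.0, 1300.0)
print([r["name"] for r in selected])
```

Because no row refers to any other row, points can be added or deleted without touching the rest of the table.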

C. LINE DATA

• infrastructure networks
• transportation networks - highways and railroads
• utility networks - gas, electric, telephone, water
• airline networks - hubs and routes

• natural networks
• river channels

Network characteristics

• a network is composed of:
• nodes - junctions, ends of dangling lines
• links - chains in the database model

diagram

• valency of a node is the number of links at the node
• ends of dangling lines are "1-valent"
• 4-valent nodes are most common in street networks
• 3-valent nodes are most common in hydrology

• a tree network has only one path between any pair of nodes, no loops or circuits are possible
• most river networks are trees
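
The ideas of valency and tree structure can be sketched with a simple link table. The link list below is a made-up example of a small river-like network, and the function names are illustrative assumptions rather than part of any particular GIS.

```python
from collections import defaultdict

# Hypothetical link table: (link_id, from_node, to_node)
links = [
    (1, "a", "c"),
    (2, "b", "c"),
    (3, "c", "e"),
    (4, "d", "e"),
    (5, "e", "f"),
]


def valency(link_table):
    """Count the number of link ends meeting at each node."""
    counts = defaultdict(int)
    for _, u, v in link_table:
        counts[u] += 1
        counts[v] += 1
    return dict(counts)


def is_tree(link_table):
    """A network is a tree if it is connected and has (nodes - 1) links."""
    adjacent = defaultdict(set)
    for _, u, v in link_table:
        adjacent[u].add(v)
        adjacent[v].add(u)
    if not adjacent or len(link_table) != len(adjacent) - 1:
        return False
    # Walk the network from any node; a tree must reach every node.
    start = next(iter(adjacent))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for other in adjacent[node]:
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return len(seen) == len(adjacent)


node_valency = valency(links)
print(node_valency)
print(is_tree(links))
```

In this example the two junctions are 3-valent and the four ends are 1-valent, which is the pattern expected for a river network.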

Attributes

• examples of link attributes:
• direction of traffic, volume of traffic, length, number of lanes, time to travel along link
• diameter of pipe, direction of gas flow
• voltage of electrical transmission line, height of towers
• number of tracks, number of trains, gradient, width of most narrow tunnel, load bearing capacity of weakest bridge

• examples of node attributes:
• presence of traffic lights, presence of overpass, names of intersecting streets
• presence of shutoff valves, transformers

• note that some attributes (e.g. names of intersecting streets) link one type of entity to another (nodes to links)

• some attributes are associated with parts of network links
• e.g. part of a railroad link between two junctions may be inside a tunnel
• e.g. part of a highway link between two junctions may need pavement maintenance

• many GIS packages require such attributes to be attached to the network by splitting existing links and creating new nodes
• e.g. split a street link at the house and attach the attributes of the house to the new (2-valent) node
• e.g. create a new link for the stretch of railroad which lies inside the tunnel, plus 2 new nodes

• this requirement can lead to impossibly large numbers of links and 2-valent nodes
• e.g. at a scale of 1:100,000, the US rail network has about 300,000 links
• the number of links would increase by orders of magnitude if new nodes had to be defined in order to locate bridges on links

• often need to use the network as an addressing system, e.g. street network

• address matching is the process of locating a house on a street network from its street address
• e.g. if it is known that the block contains house numbers from 100 to 198, house #124 would probably be about 1/4 of the way along that link
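
The address matching estimate can be written as a simple linear interpolation over the address range of the block. This is a sketch that assumes house numbers are spread evenly along the link; the function name `address_fraction` is hypothetical.

```python
def address_fraction(house_number, low, high):
    """Estimate how far along a link a house lies, as a fraction from 0 to 1.

    Assumes house numbers increase evenly from `low` at the start of the
    link to `high` at the end, which is only an approximation in practice.
    """
    if not low <= house_number <= high:
        raise ValueError("house number is outside the block's address range")
    if high == low:
        return 0.0  # single-address block: place the house at the start
    return (house_number - low) / (high - low)


fraction = address_fraction(124, 100, 198)
print(round(fraction, 3))
```

For the block in the example, (124 - 100) / (198 - 100) = 24/98, which is close to 1/4.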

• points can be located on the network by link number and distance from beginning of link
• this can be more useful than the (x,y) coordinates of points since it links the points to a location on the network

• this approach provides an answer to the problem of assigning attributes to parts of links
• keep such entities (houses, tunnels) in separate tables, link them to the network by link number and distance from beginning of link
• need one distance for point entities, two for extended entities like tunnels (start and end locations)
• the GIS can then compute the (x,y) coordinates of the entities if needed

• links need not be permanently split in this scheme
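
The scheme of locating entities by link number and distance can be sketched as follows. The table layouts, the link number 17, the coordinates and the function `locate` are all hypothetical; the sketch also assumes each link is stored as a list of (x,y) vertices with distance measured from the first vertex.

```python
import math

# Hypothetical link geometry: link number -> ordered (x, y) vertices
link_geometry = {
    17: [(0.0, 0.0), (30.0, 40.0), (30.0, 100.0)],
}

# Entities kept in separate tables and tied to the network by
# link number and distance from the beginning of the link.
houses = [{"house_id": "H1", "link": 17, "dist": 25.0}]
tunnels = [{"tunnel_id": "T1", "link": 17, "start": 40.0, "end": 80.0}]


def locate(link, dist):
    """Compute the (x, y) position at a given distance along a link."""
    if dist < 0:
        raise ValueError("distance must not be negative")
    coords = link_geometry[link]
    remaining = dist
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg > 0 and remaining <= seg:
            t = remaining / seg
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        remaining -= seg
    raise ValueError("distance exceeds the length of the link")


# One distance for a point entity, two for an extended entity.
house_xy = locate(houses[0]["link"], houses[0]["dist"])
tunnel_xy = (
    locate(tunnels[0]["link"], tunnels[0]["start"]),
    locate(tunnels[0]["link"], tunnels[0]["end"]),
)
print(house_xy, tunnel_xy)
```

The link table itself is never modified: adding a house or a tunnel only adds a row to a separate table.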

D. AREA DATA

• area data is represented on area class maps and choropleth maps

• boundaries may be defined by natural phenomena, e.g. lake, or by man, e.g. forest stands, census zones

• there are several types of areas that can be represented

1. Environmental/natural resource zones

• examples include
• land cover data - forests, wetlands, urban
• geological data - rock types
• forestry data - forest "stands", "compartments"
• soil data - soil types

• boundaries are defined by the phenomenon itself
• e.g. changes of soil type

• almost all junctions are 3-valent

2. Socio-economic zones

• includes census tracts, ZIP codes, etc.

• boundaries defined independently of the phenomenon, then attribute values are enumerated

• boundaries may be culturally defined, e.g. neighborhoods

3. Land records
• land parcel boundaries, land use, land ownership, tax information

Areal coverage

1. entities are isolated areas, possibly overlapping

• any place can be within any number of entities, or none
• e.g. areas burned by forest fires
• areas do not exhaust the space

2. any place is within exactly one entity
• areas exhaust the space
• every boundary line separates exactly two areas, except for the outer boundary of the mapped area
• areas may not overlap

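• any layer of the first type can be converted to one of the second type by dividing the space into areas that are each covered by the same set of entities
• e.g. for forest fires, each area may then have any number of fire attributes, depending on how many times it has been burned - unburned areas will have none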
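
The conversion from possibly overlapping areas to a coverage that exhausts the space can be illustrated with a simplified sketch. As a loud assumption, each burned area is represented here as a set of grid cells rather than as a polygon, so that no polygon overlay software is needed; the fire names and cell coordinates are invented.

```python
from collections import defaultdict

# Hypothetical fire perimeters, each simplified to a set of (col, row) cells
fires = {
    "fire_1975": {(0, 0), (1, 0), (1, 1)},
    "fire_1982": {(1, 1), (2, 1)},
}

# The mapped space: a 3 x 2 block of cells
study_area = {(c, r) for c in range(3) for r in range(2)}


def to_exhaustive_coverage(areas, space):
    """Group every cell of the space by the exact set of areas covering it."""
    zones = defaultdict(set)
    for cell in space:
        key = tuple(sorted(name for name, cells in areas.items() if cell in cells))
        zones[key].add(cell)
    return dict(zones)


zones = to_exhaustive_coverage(fires, study_area)
for fire_attributes, cells in sorted(zones.items()):
    print(fire_attributes, sorted(cells))
```

Every cell now belongs to exactly one zone. The zone with an empty attribute list is the unburned area, and the zone listing both fires is the area burned twice.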

Holes and islands

• areas often have "holes" or areas of different attributes wholly enclosed within them

diagram

• the database must be able to deal with these correctly
• this has not always been true of GIS products

• cases can be complex, for example:
• Lake Huron is a "hole" in the North American landmass
• Manitoulin Island is a "hole" in Lake Huron
• Manitoulin Island has several large lakes, including one which is the largest lake on an island in a lake anywhere
• some of these lakes have islands in them

• some systems allow area entities to have islands
• more than one primitive single-boundary area can be grouped into an area object
• e.g. the area served by a school or shopping center may have more than one island, but only one set of attributes

diagram
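
One way to handle nested holes and islands is to count how many boundary rings enclose a location. The sketch below uses a standard ray-casting test; the square rings are invented stand-ins for a landmass, a lake, an island and a lake on the island, and the approach assumes rings are properly nested and never cross.

```python
def in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside a closed ring of (x, y) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def square(lo, hi):
    """Helper for the example: a square ring from (lo, lo) to (hi, hi)."""
    return [(lo, lo), (hi, lo), (hi, hi), (lo, hi)]


# Hypothetical nested rings: landmass, lake, island, lake on the island
rings = [square(0, 100), square(20, 80), square(30, 70), square(40, 60)]


def nesting_depth(x, y):
    """Number of rings that enclose the location."""
    return sum(in_ring(x, y, ring) for ring in rings)


def classify(x, y):
    """Odd depth is land, even depth (above zero) is water."""
    depth = nesting_depth(x, y)
    if depth == 0:
        return "outside"
    return "land" if depth % 2 == 1 else "water"


print(classify(10, 10), classify(25, 25), classify(35, 35), classify(50, 50))
```

A system that stores only a single outer boundary per area cannot answer this kind of query correctly.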

E. REPRESENTATION OF CONTINUOUS SURFACES

• examples of continuous surfaces are:
• elevation (as part of topographic data)
• rainfall, pressure, temperature
• population density

• potential must exist for sampling observations everywhere on an interval/ratio level

General nature of surfaces

• critical points
• peaks and pits - highest and lowest points
• ridge lines, valley bottoms - lines across which slope reverses suddenly
• passes - convergence of 2 ridges and 2 valleys

• faults - sharp discontinuities of elevation - cliffs

• fronts - sharp discontinuities of slope

• slopes and aspects can be derived from elevations
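
As an illustration of deriving slope and aspect from elevations, the sketch below applies central differences to a small grid. Several conventions are assumptions made for this example: rows increase northward, columns increase eastward, aspect is the downslope direction as a compass bearing, and only interior cells are handled. Other differencing schemes are also in common use.

```python
import math


def slope_aspect(dem, row, col, cell):
    """Slope (degrees) and aspect (compass bearing of the downslope direction).

    Assumes dem[row][col] with rows increasing northward and columns
    increasing eastward; (row, col) must be an interior cell.
    """
    dzdx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell)
    dzdy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell)
    slope = math.degrees(math.atan(math.hypot(dzdx, dzdy)))
    if dzdx == 0 and dzdy == 0:
        return slope, None  # aspect is undefined on flat ground
    # Downslope vector is (-dzdx, -dzdy); bearing is clockwise from north.
    aspect = math.degrees(math.atan2(-dzdx, -dzdy)) % 360.0
    return slope, aspect


# Hypothetical 3 x 3 grid, 10 m cells, surface rising steadily to the east
dem = [
    [0.0, 10.0, 20.0],
    [0.0, 10.0, 20.0],
    [0.0, 10.0, 20.0],
]
slope, aspect = slope_aspect(dem, 1, 1, 10.0)
print(slope, aspect)
```

The example surface rises 10 m for every 10 m eastward, so it slopes at 45 degrees and faces west.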

Data structures for representing surfaces

• traditional data models do not have a method for representing surfaces
• therefore, surfaces are represented by the use of points, lines or areas

• note: the following series of three overheads on Tiefort Mountains all represent the same area

1. points - grid of elevations

overhead - Elevation represented as points

• DEM or Digital Elevation Model
• based on sampling the elevation surface at regular intervals
• result is a matrix of points
• much digital elevation data available in this form

2. lines - digitized contours

overhead - Elevation represented as lines

• from DLG hypsography layer, identical to those on the printed map, plotted directly from stereo photography
• based on string object type
• a line connecting sampled points of equal elevation
• elevation is attribute
• could be done for rainfall, barometric pressure etc.

3. areas - TIN (Triangulated Irregular Network)

overhead - Triangulation of a terrain surface

overhead - Elevation represented as areas

• note: perspective diagram is developed from the triangulated surface (TIN created by M.P. Kumler, USGS)
• sample points often located at peaks, pits, along ridges and valleys
• sampling can be varied depending on ruggedness of the surface
• a very efficient way of representing topography
• result is TIN composed of nodes, lines and triangular faces
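
Within a TIN, the elevation at any location inside a triangular face can be estimated from the three nodes of that face. The sketch below assumes each face is treated as a flat plane and uses barycentric weights; the node coordinates and the function name `tin_elevation` are hypothetical.

```python
def tin_elevation(p, a, b, c):
    """Elevation at p = (x, y) on the planar face with nodes a, b, c.

    Each node is (x, y, z). Returns None if p lies outside the triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: nodes are collinear")
    # Barycentric weights: each is 1 at its own node and 0 on the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical face: three nodes with elevations 100, 200 and 300
node_a = (0.0, 0.0, 100.0)
node_b = (10.0, 0.0, 200.0)
node_c = (0.0, 10.0, 300.0)

z_inside = tin_elevation((2.0, 2.0), node_a, node_b, node_c)
z_outside = tin_elevation((20.0, 20.0), node_a, node_b, node_c)
print(z_inside, z_outside)
```

A complete TIN would also need a way to find which face contains the location, which is omitted here.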

Spatial interpolation

• frequently when using continuous data we wish to estimate values at specific locations which are not part of the point, line or area dataset
• these values must be determined from the surrounding values using techniques of spatial interpolation (see Units 40 and 41)
• e.g. to interpolate contours, a regular grid is often interpolated from an irregular scatter of points or densified from a sparse grid
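
Interpolating a regular grid from an irregular scatter of points can be sketched with inverse distance weighting. This is only one of many interpolation techniques and is chosen here for brevity, not because this unit prescribes it; the sample values, the power of 2 and the function names are assumptions.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (x, y, z) samples."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return sz  # exactly on a sample point: use its value
        weight = 1.0 / d ** power
        numerator += weight * sz
        denominator += weight
    return numerator / denominator


def interpolate_grid(samples, x0, y0, cell, ncols, nrows, power=2.0):
    """Estimate a regular grid of values; grid[row][col], origin at (x0, y0)."""
    return [
        [idw(x0 + c * cell, y0 + r * cell, samples, power) for c in range(ncols)]
        for r in range(nrows)
    ]


# Hypothetical sample points (x, y, elevation)
samples = [
    (0.0, 0.0, 10.0),
    (10.0, 0.0, 20.0),
    (0.0, 10.0, 30.0),
    (10.0, 10.0, 40.0),
]
grid = interpolate_grid(samples, 0.0, 0.0, 5.0, 3, 3)
for row in grid:
    print([round(v, 2) for v in row])
```

Estimates always lie between the smallest and largest sample values, so this method cannot reproduce peaks or pits that were not sampled.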

REFERENCES

Burrough, P. A., 1986. Geographical Information Systems for Land Resources Assessment, Clarendon Press, Oxford. See chapter 2 for a review of database models.

Dueker, K. J., 1987. "Geographic Information Systems and Computer-Aided Mapping," American Planning Association Journal, Summer 1987:383-390. Compares database models in GIS and computer mapping.

Mark, D.M., 1978. "Concepts of Data Structure for Digital Terrain Models," Proceedings of the Digital Terrain Models (DTM) Symposium, ASP and ACSM, pp. 24-31. A comprehensive discussion of DEM database models.

Marx, R. W., 1986. "The TIGER System: Automating the Geographic Structure of the United States Census," Government Publications Review 13:181-201. Issues in the selection of a database model for TIGER.

Nyerges, T. L. and K. J. Dueker, 1988. Geographic Information Systems in Transportation, Federal Highway Administration, Division of Planning, Washington, D. C. Database model concerns in transportation applications of GIS.

Peuquet, D.J., 1984. "A conceptual framework and comparison of spatial data models," Cartographica 21(4):66-113. An excellent review of the various spatial data models used in GIS.

EXAM AND DISCUSSION QUESTIONS

1. How does a natural zone coverage differ from an enumeration zone coverage? Describe the differences in terms of (a) application areas, (b) visual appearance, (c) compilation of data.

2. Compare the various data models for elevation data. Which would you expect to be best for (a) a landscape dominated by fluvial erosion and dendritic drainage patterns, (b) a glaciated landscape, (c) a barometric weather map with fronts, (d) a map of population densities for North America?

3. What data models might be needed in a system to monitor oil spills and potential environmental damage to coastlines? Give examples of appropriate spatial objects and associated attributes.

4. Describe the differences between the data models commonly used in remote sensing, computer assisted design, automated cartography and GIS.