Help:GIS tool

From Opasnet

Note: the GIS protocol and data sets in ESRI file formats are now available.

Some degree of GIS or spatial analysis functionality is desirable within the toolbox. Here we outline what a GIS essentially is, before discussing the issues of most concern to the Intarese project with regard to spatial functionality. We finish by offering two alternative implementation strategies.

What is GIS?

A geographic information system (GIS) is a system for capturing, storing, analysing and managing data and associated attributes which are spatially referenced to the earth. In the strictest sense, it is a computer system capable of integrating, storing, editing, analysing, sharing, and displaying geographically referenced information. In a more generic sense, GIS is a tool that allows users to create interactive queries (user-created searches), analyse spatial information, edit data and maps, and present the results of all these operations. GIS Science (GISc) is the science underlying the geographic concepts, applications and systems.

Spatial issues relevant to Intarese

Spatial data and descriptive statistics

There are several potential difficulties associated with the analysis of spatial data. Among these are boundary delineation, modifiable areal units, and the level of spatial aggregation or scale. In each case, the absolute descriptive statistics of an area (the mean, median, mode, standard deviation, and variance) change depending on how the spatial problem is handled.

Boundary delineation

The location of a study area boundary and the positioning of internal boundaries affect various descriptive statistics. With respect to measures such as the mean or standard deviation, the study area size alone may have large implications. Consider a study of per capita income within a city: if it is confined to the inner city, income levels are likely to be lower because of a less affluent population; if it is expanded to include the suburbs or surrounding communities, income levels will rise with the influence of homeowner populations. Because of this problem, absolute descriptive statistics such as the mean, standard deviation, and variance should be compared only in relation to a particular study area. The same is true of internal boundaries, as these statistics may only have valid interpretations for the area and subarea configuration over which they are calculated.
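
The per capita income example can be illustrated with a minimal Python sketch. The income figures and the names `inner_city`, `suburbs` and `study_area_mean` are hypothetical, chosen only to show how the mean shifts when the study area boundary is moved.

```python
from statistics import mean

# Hypothetical per capita incomes (thousands), for illustration only.
inner_city = [18, 20, 22, 19, 21]
suburbs = [35, 40, 38, 42, 45]

def study_area_mean(*zones):
    """Mean income over every observation in the zones inside the study area."""
    values = [value for zone in zones for value in zone]
    return mean(values)

# Study area confined to the inner city.
inner_mean = study_area_mean(inner_city)

# Study area expanded to include the suburbs.
expanded_mean = study_area_mean(inner_city, suburbs)

print("Inner city only:", inner_mean)
print("Inner city plus suburbs:", expanded_mean)
```

With these made-up figures the mean rises from 20 to 30 purely because the boundary was redrawn, which is why the two values should not be compared as if they described the same population.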

Modifiable areal units

In many cases the subdivision of spatial data has already been determined. This is evident in demographic datasets, where the available information is grouped by county or municipality. For this type of data, analysts must use the same county or municipal boundaries delineated in the collected data for their subsequent analysis. When alternative boundaries are possible, an analyst must take into account that any new subdivision may produce different results.
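
The effect of alternative subdivisions can be sketched in Python as follows. The cell values, the two zoning schemes and the helper `zone_means` are all hypothetical; the sketch assumes a simple transect of equally sized cells rather than real administrative units.

```python
from statistics import mean

# Hypothetical values for eight equally sized cells along a transect.
cells = [2, 4, 6, 8, 10, 12, 14, 16]

def zone_means(values, boundaries):
    """Average the cells in each zone; boundaries are (start, end) index pairs."""
    return [mean(values[start:end]) for start, end in boundaries]

# Two ways of dividing the same eight cells into two zones.
zoning_a = [(0, 4), (4, 8)]  # two zones of four cells each
zoning_b = [(0, 2), (2, 8)]  # one small zone and one large zone

means_a = zone_means(cells, zoning_a)
means_b = zone_means(cells, zoning_b)

print("Zoning A:", means_a)
print("Zoning B:", means_b)
```

The underlying cell values are identical in both cases; only the zone boundaries differ, yet the zone averages change from 5 and 13 to 3 and 11.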

Spatial aggregation/scale problem

Socio-economic data may be available at a variety of scales, for example municipalities, regional districts, census tracts, enumeration districts, or the provincial/state level. When these data are aggregated at different scales, the resulting descriptive statistics may vary, either in a systematic, predictable way or in a more uncertain fashion. For example, economic data may show a distinct reduction in manufacturing productivity for a country (the USA) over a certain period; since this is a general, national picture, individual states may experience the effect differently. As a result, the standard deviation of the data is larger at state level than the national figure suggests, because of the variability among states.
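
A minimal Python sketch of the scale effect is given below. The district values and the helper `aggregate` are hypothetical, and the sketch assumes that neighbouring units are merged in equal-sized groups with equal weight, which real census geographies rarely satisfy.

```python
from statistics import mean, pstdev

# Hypothetical indicator values for eight districts, listed in spatial order.
districts = [10, 30, 20, 40, 15, 35, 25, 45]

def aggregate(values, group_size):
    """Merge consecutive units into larger units by averaging each group."""
    return [
        mean(values[i:i + group_size])
        for i in range(0, len(values), group_size)
    ]

# Aggregate to progressively coarser scales and report mean and spread.
spreads = []
for group_size in (1, 2, 4, 8):
    units = aggregate(districts, group_size)
    spreads.append(pstdev(units))
    print(len(units), "units: mean", mean(units), "std dev", round(pstdev(units), 2))
```

In this sketch the overall mean stays at 27.5 at every scale, while the standard deviation of the unit values falls as the units become coarser, reaching zero when everything is merged into one unit. Variability that is visible at district level is hidden in the aggregate figure.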

Why does the toolbox need spatial functionality?

Some form of spatial functionality, whilst not fundamental to the toolbox, is extremely desirable. There are two distinct types of analysis that a user might wish to carry out. Firstly, a user might simply want to represent spatial information, that is, to explore and understand the output from other models and components within the toolbox. Visualising results in map form adds a level of understanding beyond what results in tabular form can offer. Secondly, users might wish to carry out analysis of geographically referenced data, for which a more fully functioning GIS might be required.

Functionality requirements

Ideally, we see the spatial component of the toolbox including the following functionality.

  • Scalability - user ability to select and move between different scales of representation
  • Interactivity - ability to reshape outputs on the fly/dynamic links between data and output
  • Flexibility in choice of output (maps, graphs, tables, reports etc.)
  • Full-chain linkage - it must be possible to visualise and link every step in the process (from issue-framing to exposure assessment to impact assessment)
  • Dynamic linkage - not just back to data but between maps, reports, tables, graphs - and between the issue-framing diagram and everything else
  • Help or guidance - on map design etc.
  • MAUP and areal unit issues
  • Uncertainty mapping - ability to map uncertainty and show error distributions
  • Dynamic uncertainty analysis - ability to remap data according to user input thresholds for uncertainty etc. (see Aquila)
  • Error tracking - so that uncertainties can be linked back to the issue-framing diagram
  • Remodelling tools - e.g. to seamlessly produce different types of output (e.g. scattergrams into box plots; choropleth maps into cartograms)
  • Availability of ready-made templates (for all main types of output)
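
As an illustration of the dynamic uncertainty analysis item in the list above, the following Python sketch remaps area estimates against a user-supplied uncertainty threshold. It is only one possible reading of the requirement: the area names, estimates, standard errors and the `remap` helper are hypothetical, and the use of relative standard error as the uncertainty measure is an assumption, not a description of how Aquila or the toolbox works.

```python
# Hypothetical area estimates as (value, standard error) pairs.
areas = {"A": (12.0, 1.0), "B": (8.0, 4.0), "C": (20.0, 3.0)}

def remap(estimates, max_relative_error):
    """Keep an area's value if its relative error is within the threshold.

    Areas that exceed the threshold are mapped to None, which a map
    renderer could draw as 'too uncertain to display'.
    """
    remapped = {}
    for name, (value, std_error) in estimates.items():
        relative_error = std_error / value
        remapped[name] = value if relative_error <= max_relative_error else None
    return remapped

# A lenient and a strict threshold chosen by the user.
lenient = remap(areas, 0.2)
strict = remap(areas, 0.1)

print("Threshold 0.2:", lenient)
print("Threshold 0.1:", strict)
```

Tightening the threshold from 0.2 to 0.1 removes area C from the displayed map in addition to area B, so the user can see directly how the chosen tolerance changes what is shown.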

Implementation options within Intarese toolbox

There is a range of possible options for the provision of spatial visualisation and analysis within the Intarese toolbox. These include a fully functioning GIS, encompassing both the ability to visualise and explore spatial data and, crucially, to analyse the data. Whilst desirable from a functionality perspective, implementing this within the toolbox could be difficult, both for software licensing reasons and because proprietary GIS demand an advanced skill set which it would be unrealistic to expect toolbox users to have. A compromise solution might be to provide basic GIS visualisation and exploration within the toolbox, possibly using free internet map server software. This wouldn’t allow all the functionality listed above, but might be a useful stepping stone towards it. Users are becoming more spatially aware with the emergence of (online) software such as Google Maps, Google Earth and Microsoft’s Virtual Earth, and are thus now familiar with the basic spatial functions that such software provides, such as pan/zoom, overlaying spatial layers and spatial queries.

Recommended software solution(s)

There are two proposed solutions, depending on the agreed level of functionality required.

Solution 1: Web-based spatial data explorer

Implement a customised internet map server application, such as the University of Minnesota’s UMN MapServer. This is an open source development environment for building spatially enabled internet applications. In addition to being open source, and hence free to license, a major advantage of UMN MapServer is its cross-platform support, including Linux, Windows, Mac OS X and Solaris. MapServer is not a full-featured GIS, but rather renders spatial data (maps, images, and vector data) for the web. It would thus be relatively simple to implement, but would only partially cover the list of functionality requirements above.

Figure 1: Example Map Server screenshot

Solution 2: GeoDa

A more fully functioning system which we should consider is Luc Anselin’s GeoDa software from the University of Illinois, USA. This piece of software covers our functionality wish-list more comprehensively than most commercially available GIS, with particular strengths in spatial statistical analysis. The design of GeoDa consists of an interactive environment that combines maps with statistical graphics, using the technology of dynamically linked windows. GeoDa is freestanding and does not require a specific GIS. It runs under any of the Microsoft Windows operating systems. It also runs under the Virtual PC Windows emulator on Mac operating systems (MacOS 9 and MacOS X), as well as under Windows in Boot Camp or Parallels on Intel Macs. Its installation routine contains all required files and libraries. A cross-platform, open source version of the program (OpenGeoDa) is under development that will run on Windows, Mac OS, and Linux, with release expected in the near future. All versions are written in C++. The software is free to install locally on a PC for academic purposes.

Figure 2: Example GeoDa screenshot, illustrating dynamically linked windows, maps, boxplots and scattergrams.
Figure 3: Example GeoDa screenshot, illustrating dynamically linked windows, Thiessen polygon maps, 3D plots and a cartogram.