Department of Mathematical Statistics & Operational Research

University of Exeter, Exeter, EX4 4QE, UK.

Introduction

It is widely accepted that Geographical Information Systems (GIS) and related mapping technologies provide important potential benefits in the visualisation, exploration and modelling of the kind of spatial data typical in many public and environmental health applications. At the same time it is acknowledged that there are problems in using GIS in conjunction with the level of analytical sophistication appropriate to this field. Here, we discuss in general terms the modern statistical analysis needs of those engaged in environmental and spatial epidemiology, commenting on useful analytical tools. We then briefly review what progress has been made in fusing GIS with such techniques and consider the range of other possible software environments which offer similar benefits.

Spatial statistical methods in health research

In general, 'spatial analysis' deals with the quantitative study of phenomena that are spatially-referenced. However, it is not simply that the objects analysed are located in space which characterises spatial analysis; rather it is that their spatial configuration, or relative location, call it what one will, plays a potentially prominent, if not crucial, role in the analysis. Spatial analysis in the health field is therefore concerned with the quantitative study of disease distributions and patterns of health care and service delivery, where the objects of study (diseases, access to, and provision of, health facilities) are geographically defined.

One important subset of spatial analysis which might be termed 'spatial data analysis' (Bailey and Gatrell, 1995), concerns the analysis of observational data that represent the outcome of a process operating in space. Statistical methods are sought to describe and perhaps explain such data, often by seeking relationships to other spatially-defined data. In the remainder of this paper we focus on this because in geographical health research it is almost exclusively spatial data analysis which is relevant to addressing the questions of interest, rather than other forms of quantitative spatial analysis; although there are some obvious exceptions such as work concerned with access to health care facilities, or the optimal location of such facilities.

Thinking of spatial analysis in this restricted sense as being primarily a statistical exercise, can we then 'unpack' it to focus on broad generic types of method useful in epidemiology and environmental health? One useful way may be to draw a distinction between three 'tasks' in applied statistical spatial analysis:

Visualisation

Exploration

Modelling

First, we will want to see, perhaps in rather unusual ways, the data being analysed. Clearly, the primary visualisation tool will be the map, but it is possible to use various forms of mathematical transformations to create imaginative and novel maps that shed new light on the geography being investigated (e.g. Dorling, 1993). For example, in an epidemiological context we might want methods which 'stretch' a geographical area of interest in direct relation to risk population in sub-regions. The resulting 'cartogram', or 'density equalised map' will then help to visually highlight any possible 'clustering' of mapped health events over and above that to be expected by variations in the population at risk (e.g. Selvin, 1988).

Second, we shall require methods for exploratory data analysis, techniques that involve seeking good descriptions of the data and that encourage us to formulate hypotheses about the processes that gave rise to the data. Under this heading we require techniques that help us detect pattern, investigate relationships, highlight unusual values or 'outliers', and so on. Clearly, the distinction between certain methods for visualisation and those for exploration is a fine one and we do not want to over-emphasise the differences; however, it is useful to discriminate broadly between primarily cartographic methods and those which involve more statistical data processing. In a health context we might, for example, wish to use 'empirical Bayes' methods to 'smooth' possible random distortions in standardised morbidity or mortality rates caused by small numbers of observed cases in areas of low population, so as to discern more clearly underlying geographical patterns or trends in the true rates. Different forms of kernel density estimation also provide useful and powerful exploratory tools in a variety of contexts relevant to geographical epidemiology.
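
As a rough illustration of the empirical Bayes idea, the following Python sketch shrinks raw area rates towards the overall rate using a simple method-of-moments estimator. The function name eb_smooth, the case counts and the populations are hypothetical, and this is only one of several possible estimators, not a prescribed method.

```python
def eb_smooth(cases, population):
    """Shrink raw area rates towards the pooled rate (method of moments)."""
    n_areas = len(cases)
    total_pop = sum(population)
    overall = sum(cases) / total_pop  # pooled rate across all areas
    raw = [c / p for c, p in zip(cases, population)]
    # Population-weighted variance of the raw rates about the pooled rate.
    s2 = sum(p * (r - overall) ** 2 for p, r in zip(population, raw)) / total_pop
    mean_pop = total_pop / n_areas
    # Estimated variance of the true rates between areas, truncated at zero.
    between = max(s2 - overall / mean_pop, 0.0)
    smoothed = []
    for p, r in zip(population, raw):
        denom = between + overall / p
        # Areas with small populations receive a small weight on their own rate.
        weight = between / denom if denom > 0 else 0.0
        smoothed.append(overall + weight * (r - overall))
    return smoothed


# Hypothetical data: two small areas and two large ones.
cases = [0, 2, 30, 55]
population = [100, 150, 10000, 20000]
smoothed = eb_smooth(cases, population)
```

In this sketch the rates for the two small areas are pulled almost entirely towards the pooled rate, while the rates for the large areas move much less, which is the behaviour sought when small numbers distort a map of raw rates.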

Third, we might want to fit a model to data in order to test an explicit hypothesis concerning suspected associations in the data, or to allow us to make spatial or spatio-temporal predictions about some aspect of concern. Here, the aim is to go beyond the inspection of data and explicitly formulate the nature of relationships between a phenomenon of interest and relevant covariates into a statistical model whose parameters we can then estimate from the data and use to assess the impact of various factors on that phenomenon. In spatial analysis relative location or spatial configuration may be expected to figure prominently in such models. For example, we might wish to spatially interpolate the prevalence of some environmental factor in an area of interest on the basis of observed measurements at a number of sample sites; various 'geostatistical' kriging methods are then obvious candidates in the analysis. Alternatively, the interest might be in testing the hypothesis that the incidence of disease is elevated around possible point sources of environmental pollution; in which case various forms of point pattern analysis become relevant. In other cases the issue may be modelling the relationship between disease rates and other spatially-defined data; spatial regression or spatial forms of generalised linear models will then be required.

Given these three generic types of methods in spatial data analysis, in what ways do GIS currently assist the spatial data analyst? Clearly, they provide obvious and valuable enhancements to our ability to visualise spatial data. However, in contrast to the needs identified above, 'spatial analysis' was historically interpreted in a GIS context to refer largely to standard GIS spatial database operations such as spatial query, buffering, and overlay. When it began to be interpreted more widely, the initial emphasis was on deterministic digital elevation modelling in geology and soil science, on methods for handling remotely sensed images, or on mathematical optimisation methods useful in network analysis and transportation. Researchers entering the GIS arena from the standpoint of spatial epidemiology therefore originally found few exploratory data analysis and statistical modelling tools relevant to their needs. In the past Bailey (1994), and many others (e.g. Anselin and Getis, 1993; Openshaw, 1991; Goodchild, 1987), have repeatedly pointed out these deficiencies. However, the situation has steadily improved and it is no longer the case that all such systems are so analytically sterile in respect of spatial epidemiology. Before coming to any conclusions on the current usefulness of GIS in spatial epidemiology, we therefore need to review such criticisms in the light of various recently proposed software environments which do indeed go some way along the path we wish to follow. Some of these involve GIS directly; others incorporate computer mapping and some useful GIS functionality, although they would not qualify as GIS per se.

Spatial data analysis and GIS or related technologies

We consider here some examples of both GIS and other software environments currently available to perform the broad classes of spatial data analysis we have identified as desirable in spatial epidemiology.

Starting with proprietary GIS software packages, there is no space here to catalogue statistical functionality in the myriad of commercial packages available. It will suffice to illustrate the current situation by reference to one or two widely used GIS systems which are fairly typical of the state of the art in terms of spatial data analysis. IDRISI (Eastman, 1990) is a low-cost, essentially raster-based GIS that is now widely used throughout the world. It certainly offers modules for what we would consider as spatial data analysis, including: quadrat-based methods for point pattern analysis; autocorrelation tests; spatial filtering; and the fitting of trend surface models. However, many analysis modules are oriented towards the processing of remotely sensed images and few of these methods are of direct relevance in a health context. The more sophisticated US raster-based system GRASS also has a modular structure in which it is possible to add one's own routines. Using the GRASS libraries, programmers can develop their own interactive utilities or link GRASS to statistical analysis software. Some researchers have, for example, reported results from performing spatial point process analysis (Maclennan, 1990), a feature that renders the software of potential relevance in a health context. MapInfo is another example of a low-cost GIS which contains useful, although fairly basic, statistical functionality and the facility to add 'macros' for more specialised requirements.

However, the industry standard in GIS systems is probably best represented by ARC/INFO. Early versions (pre 1991) had limited functionality for performing what we regard as spatial data analysis. Very simple descriptive statistics could be obtained, but these were aspatial and even to compute a correlation coefficient between two variables in a coverage was not simple. Version 6.0 offered some more sophisticated interpolation strategies, based on methods common in 'geostatistics'. It now became possible to estimate a variogram from a spatially continuous variable, to fit a variety of theoretical models to such empirical variograms, and to use a chosen model in the interpolation procedure. Within TIN the interpolated surface could be contoured and the estimation errors displayed; this formed a most welcome addition to the spatial analyst's armoury.
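
To make the first of these steps concrete, the following Python sketch estimates an empirical semivariogram by binning pairs of sample sites by separation distance. It is not ARC/INFO code; the function name, the bin settings and the four sample values are hypothetical, and the fitting of a theoretical model and the interpolation itself are not shown.

```python
import math


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Return (bin midpoint, semivariance, pair count) for each distance bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            k = int(d // bin_width)  # index of the distance bin for this pair
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    result = []
    for k in range(n_bins):
        # Semivariance is half the mean squared difference; None if the bin is empty.
        gamma = sums[k] / (2 * counts[k]) if counts[k] else None
        result.append(((k + 0.5) * bin_width, gamma, counts[k]))
    return result


# Hypothetical measurements at four sites along a transect.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
variogram = empirical_semivariogram(coords, values, bin_width=1.5, n_bins=2)
```

Here the semivariance rises with distance, as would be expected for a spatially continuous variable whose nearby values are more alike than distant ones.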

Release of version 6.0 of ARC/INFO also saw the appearance of GRID, the raster-processing subsystem of ARC/INFO. As with GRASS and IDRISI, GRID offers a set of tools for spatial analysis; such tools include: overlay analysis; interpolation; distance analysis; predictive mapping; terrain analysis; watershed analysis. Each grid in the spatial database represents a single variable, which may be categorical or continuous. Operations available within GRID are classified according to the set of elements on which they operate: for example, operations may be on individual cells ('local') or small neighbourhoods ('focal'), where the neighbourhoods can take a range of shapes and sizes. Other operations are 'zonal', relating to a set of cells sharing the same value. Lastly, the entire coverage may be analysed, using 'global' operators. These are, of course, valuable tools. However, they require coverages to be in raster format and a worry is that users will seek to rasterise existing vector coverages in order to use GRID functions. While such rasterisation may, in some cases, be appropriate, for other types of data it will not. For example, very little data used in public health will be raster data.
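
The four classes of operation can be illustrated on a toy raster. The Python sketch below is not the GRID command language; the 3 x 3 grid, the zone grid and all function names are hypothetical, and the focal neighbourhood is assumed to be a square 3 x 3 window truncated at the edges.

```python
def local_op(grid, func):
    """Local: apply a function to each cell independently."""
    return [[func(v) for v in row] for row in grid]


def focal_mean(grid):
    """Focal: mean over a 3 x 3 window around each cell, truncated at edges."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(r - 1, 0), min(r + 2, rows))
                      for j in range(max(c - 1, 0), min(c + 2, cols))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_mean(grid, zones):
    """Zonal: mean of the cells sharing the same value in a zone grid."""
    totals = {}
    for grid_row, zone_row in zip(grid, zones):
        for value, zone in zip(grid_row, zone_row):
            s, n = totals.get(zone, (0, 0))
            totals[zone] = (s + value, n + 1)
    return {zone: s / n for zone, (s, n) in totals.items()}


def global_mean(grid):
    """Global: a single summary over every cell in the grid."""
    cells = [v for row in grid for v in row]
    return sum(cells) / len(cells)


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
zones = [[0, 0, 1],
         [0, 1, 1],
         [2, 2, 2]]

doubled = local_op(grid, lambda v: 2 * v)
focal = focal_mean(grid)
zonal = zonal_mean(grid, zones)
overall = global_mean(grid)
```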

Maguire (1993) reports the effort devoted to adding to the spatial data analysis functionality in version 7.0 of ARC/INFO. Developments within GRID include the addition of routines for multivariate classification using principal components and cluster analysis and the provision of routines for shape analysis. Also added are methods for dispersion modelling (predicting the pathways of water or air pollutants through space) and, within the NETWORK modules, further methods for location-allocation modelling. All of this is to be welcomed, though again it remains unclear why much of this analysis is restricted to GRID. For example, many of the applications of multivariate clustering might be to Census data, the areal basis of which is hardly a raster.

Given the recent and current state of spatial analysis in ARC/INFO, what progress have researchers been able to make in extending these by 'add ons' to the basic ARC/INFO environment? Much of this progress seems, perhaps not surprisingly, to have centred on efforts to devise tests for spatial autocorrelation among area (zonal) data, with several research groups offering, independently, their own solutions (Bivand and Charlton, 1992; Ding and Fotheringham, 1992; Can, 1993). All approaches extract data from the Arc Attribute Table (AAT file), which gives topological information concerning polygon adjacency. It is easy then to construct a weights matrix for calculating an autocorrelation statistic. Ding and Fotheringham (1992) generalise this to other forms of weights matrix and also to the calculation of so-called G_i statistics; these are measures of spatial association devised by Getis and Ord (1992). As with other approaches, Ding and Fotheringham organise this so that C programs to perform the analysis are run from within ARC/INFO.
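
The step from adjacency information to an autocorrelation statistic can be sketched as follows. This Python fragment does not reproduce any of the cited programs; it assumes, purely for illustration, that each arc record has been reduced to a (left polygon, right polygon) pair with 0 standing for the area outside the map, and it computes Moran's I with binary contiguity weights. The function names and the values for the four zones are hypothetical.

```python
def contiguity_weights(arcs):
    """Build binary contiguity neighbour sets from (left, right) polygon pairs."""
    neighbours = {}
    for left, right in arcs:
        # Skip arcs on the map boundary (id 0) and arcs internal to one polygon.
        if left == right or 0 in (left, right):
            continue
        neighbours.setdefault(left, set()).add(right)
        neighbours.setdefault(right, set()).add(left)
    return neighbours


def morans_i(values, neighbours):
    """Moran's I for zone values, with weight 1 for each pair of adjacent zones."""
    n = len(values)
    mean = sum(values.values()) / n
    dev = {zone: v - mean for zone, v in values.items()}
    s0 = sum(len(nbrs) for nbrs in neighbours.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbours.items() for j in nbrs)
    ss = sum(d * d for d in dev.values())
    return (n / s0) * (cross / ss)


# Hypothetical arcs for four zones arranged in a row: 1 | 2 | 3 | 4.
arcs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
values = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}

weights = contiguity_weights(arcs)
moran = morans_i(values, weights)
```

For this smoothly increasing pattern the statistic is positive, indicating that adjacent zones tend to have similar values. Assessing its significance would require a reference distribution, which is not shown here.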

Majure and Cressie (1993) and Jefferis (1993) report on separate projects designed to add other spatial analysis capabilities to ARC/INFO. Majure and Cressie follow the path broken by Haslett and his colleagues discussed later in this section; their project is currently restricted to spatially continuous data sampled at sites; for example, groundwater or geological measurements. Map views, scatter diagrams and variograms (measuring mean square difference between values at sites separated by a given distance) are all dynamically linked; as with Haslett's system, selecting a spatial query region in one window highlights the observations in other views. Jefferis' research programme attempts to add a range of exploratory univariate statistics (e.g. histograms and box plots), variograms, simple point pattern statistics, and more, to ARC/INFO.

The discussion above relates to developments in the basic functionality of proprietary GIS packages, or to exploiting the 'macro language' facilities available in such packages to construct analysis extensions. However, it is ultimately unrealistic to expect the vendors of systems to incorporate each and every specialised method for exploratory spatial data analysis, or spatial modelling, into the basic functionality of such systems. At the same time, the development of GIS macro extensions requires detailed knowledge of the GIS in question, has limited flexibility and requires considerable programming skills. We therefore also need to include in our discussion the alternative of 'stand-alone' spatial data analysis packages, some of which offer GIS features.

Griffith et al (1990) have taken the widely-used package MINITAB and developed a series of 'macros' that implement various spatial regression models. We also draw attention to Bivand's work (reported in Bivand and Charlton, 1992) on embedding spatial autoregressive modelling into SYSTAT, via routines available within the GAL library (Dixon et al, 1988) for extracting contiguity information. Various spatial and environmental epidemiologists have developed their own 'one off' special-purpose packages, though these do not always have good mapping or graphical interfaces. Perhaps a better example is the 'Geographical Analysis Machine' developed as an aid to detecting the existence of leukaemia 'clusters' (see, for instance, Openshaw et al, 1987; Openshaw and Craft, 1991). Another major 'stand alone' package is 'SpaceStat' (Anselin, 1991) which offers options for fitting a wide range of spatial regression models.

Some 'stand alone' spatial analysis systems are more overtly graphical and interactive than those considered above. For example, EpiMap is a PC mapping program designed to link to EpiInfo, a series of computer programs produced in conjunction with the World Health Organization to provide public-domain software for statistical analysis and data management in public health. GS+ combines various geostatistical methods useful in the environmental sciences with mapping facilities. INFO-MAP (Bailey, 1990; Bailey and Gatrell, 1995) is another example of such a system. It was developed for teaching spatial data analysis and can perform many of the simpler variants of aspatial analyses to be found in standard statistics packages, with the added bonus of being able to visualise the results immediately in map form. The INFO-MAP language also has several spatial functions (such as calculation of the areas of polygons or zones, the distances between locations, the nearest neighbours of locations, adjacencies of polygons, and so on) which may be used in conjunction with traditional statistical analyses. This offers a potentially powerful framework in which to perform a variety of spatial analyses on point data (e.g. kernel estimation, K-functions), spatially continuous data (variogram estimation, kriging, trend surface modelling) and area data (autocorrelation statistics, spatial regression modelling).
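
As an indication of what kernel estimation for point data involves, the following Python sketch evaluates a quartic-kernel estimate of event intensity at a chosen location. It is not INFO-MAP code; the function name, the two event locations and the bandwidth are hypothetical, and no correction is made for edge effects near the boundary of the study region.

```python
import math


def quartic_intensity(point, events, bandwidth):
    """Kernel estimate of events per unit area at `point` using a quartic kernel."""
    x, y = point
    total = 0.0
    for ex, ey in events:
        # Squared distance to the event, scaled by the bandwidth.
        u2 = ((x - ex) ** 2 + (y - ey) ** 2) / bandwidth ** 2
        if u2 <= 1.0:  # events beyond one bandwidth contribute nothing
            total += 3.0 / (math.pi * bandwidth ** 2) * (1.0 - u2) ** 2
    return total


# Hypothetical event locations, e.g. residential addresses of cases.
events = [(0.0, 0.0), (0.5, 0.0)]
at_origin = quartic_intensity((0.0, 0.0), events, bandwidth=1.0)
far_away = quartic_intensity((5.0, 5.0), events, bandwidth=1.0)
```

Evaluating such a function over a fine grid of locations gives a smooth surface that can be mapped. The choice of bandwidth controls the degree of smoothing and is usually explored interactively.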

Other specialised spatial analysis packages build more upon the considerable developments in dynamic statistical graphics that have taken place in recent years and which permit the user to simultaneously examine multiple 'views' of data, such as maps, histograms, scatter diagrams, numeric lists and so on. Such views are active and linked, in that selecting and highlighting a subset of observations in one view causes those observations to be highlighted in other views or windows. Haslett and his colleagues have developed a system called REGARD (Haslett et al, 1991) which is of considerable use in exploratory spatial data analysis. For example, one could display a map of zones showing spatial variation in a variable and, in a second window of the screen, a scatter diagram showing the relationship between that variable and another. If a small region of the map is highlighted the system then highlights, on the scatter diagram, the zones that have been defined spatially in the map window. There are many more advanced facilities in the system, such as looking at correlation structures in the data using linked windows displaying variograms. Similar kinds of spatial analysis systems have also been written using XLISP-STAT (Tierney, 1990), a statistical programming environment which also implements dynamic linking of graphical data views.

We conclude our general discussion of software environments which bring together GIS concepts and spatial analysis by considering some initiatives which use a combination of these different approaches. That is, they utilise a 'coupling' between the GIS and standard statistical packages or stand alone spatial analysis software (Goodchild et al, 1992; Anselin and Getis, 1993; Gatrell and Rowlingson, 1994). This amounts to having the GIS and the separate analysis software communicate through dynamic data exchange, or through files that each can read. Depending on the GIS or the analysis software in question, this may offer only limited flexibility in the kinds of data structures which may be interchanged and may be computationally very inefficient. However, a particularly promising example uses the statistical programming environment S-Plus (Becker et al, 1988) in conjunction with ARC/INFO.

S-Plus is a modern, flexible statistical programming language with excellent graphical features, some of which are dynamic. The basic language can be used to build libraries of more complex functions which can incorporate, if necessary, special purpose routines written in lower level languages such as C or FORTRAN. The language also allows use of general operating system facilities to pass or receive data structures from other applications software. The S-Plus software environment thus allows the development of powerful and highly graphical spatial analysis tools. For example, SPLANCS (Rowlingson and Diggle, 1993) is a spatial analysis system for handling point pattern analysis based on S-Plus. This currently offers a range of functions, including: plots of point maps; spatially smooth estimates of local intensity (density) of points; options for summarising the spatial structure of the point pattern; tests for space-time clustering; and the fitting of certain models to such point data. Other collections of S-Plus functions have been developed which also incorporate various areas of spatial analysis, such as that provided with the recent text by Venables and Ripley (1994). More recently the developers of S-Plus themselves have released an add-on module known as S+SpatialStats which offers numerous functions relating to the analysis of point patterns and geostatistical and lattice data.
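
One common way of summarising the spatial structure of a point pattern is the K-function. The Python sketch below is not SPLANCS code and is written in Python rather than S-Plus purely for illustration; it uses a naive estimator with no edge correction, whereas practical analyses would normally apply one. The function name, the three event locations and the unit study area are hypothetical.

```python
import math


def k_function(events, area, distance):
    """Naive K-function estimate at one distance, without edge correction."""
    n = len(events)
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(events[i][0] - events[j][0], events[i][1] - events[j][1])
            if d <= distance:
                close_pairs += 1  # ordered pairs, so each pair counts twice
    return area * close_pairs / n ** 2


# Hypothetical events in a unit square: two close together, one far away.
events = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.9)]
k_near = k_function(events, area=1.0, distance=0.15)

# Value expected at the same distance under complete spatial randomness.
k_csr = math.pi * 0.15 ** 2
```

An estimate well above the value expected under complete spatial randomness suggests clustering at that distance, although with so few events the comparison here is only illustrative.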

The potential of this modern statistical programming environment for spatial analysis is enhanced by the recent development of a 'close coupling' between S-Plus and ARC/INFO which offers a seamless bi-directional link between the GIS and the statistics and data analysis environment. It is possible now to transfer an ARC/INFO database to S-Plus for subsequent statistical analysis and results from the analysis may then be returned to ARC/INFO to form additions to the database. For those with access to both ARC/INFO and S-Plus this development offers a very powerful environment for spatial data analysis. Although this obviously requires expertise in two often demanding systems, such developments offer considerable potential for spatial analysis and perhaps represent the most realistic future tools in that regard for researchers in spatial and environmental epidemiology.

Conclusions

We have sought here to give an up-to-date review of a variety of GIS and computer mapping software environments that are of potential use in geographical epidemiology. We have demonstrated that some of the required analysis tools in spatial epidemiology are already available either within a suitably extended GIS, or via other statistical software that has added analytical and visualisation functionality and which may be 'close-coupled' to GIS. We see the future need for statistical spatial analysis software as involving environments that are flexible and interactive in nature, where the end-point is not necessarily a single 'correct' map or hypothesis test. Rather, the aim should be to allow the user to create a variety of representations: to explore the data in a flexible way, perhaps fit several models, and most importantly to be able to visualise the outputs from such analyses and modelling easily in a geographical context. In this way there is a genuine interaction between the different modes of understanding: visualisation, exploration, and modelling. We are doubtful that GIS systems alone will ever be able to fulfil such a role within fields with specialised analysis demands, such as spatial and environmental epidemiology. GIS will be an important element of such software environments, acting as database management and display tools, but the future lies in interfacing GIS to modern powerful statistical languages in a dynamic graphical windowing environment and not in unrealistically attempting to develop fully integrated statistical spatial analysis functions from within GIS.

References

Anselin, L. (1991) SpaceStat: a program for the analysis of spatial data, NCGIA, Department of Geography, University of California at Santa Barbara.

Anselin, L. and Getis, A. (1993) Spatial statistical analysis and geographic information systems, pp. 35-50 in Fischer, M.M. and Nijkamp, P. (eds) Geographic Information Systems, Spatial Modelling and Policy Evaluation, Springer-Verlag, Berlin.

Bailey, T.C. (1990) GIS and simple systems for visual, interactive, spatial analysis, The Cartographic Journal, 27, 79-84.

Bailey, T.C. (1994) A review of statistical spatial analysis in GIS, in Fotheringham, S. and Rogerson, P. (eds) Spatial Analysis and GIS, Taylor and Francis, London, 13-45.

Bailey, T.C. and Gatrell, A.C. (1995) Interactive Spatial Data Analysis, Longman, Harlow.

Becker, R.A., Chambers, J.M. and Wilks, A.R. (1988) The New S Language, Wadsworth, Pacific Grove.

Bivand, R. and Charlton, M. (1992) Integrating spatial statistics and GIS: some software issues and solutions, paper presented at 32nd European Congress of the Regional Science Association, Louvain, Belgium.

Can, A. (1993) Computation of a spatial autocorrelation statistics within a topological vector data model, in Fotheringham, A.S. and Rogerson, P. (eds.) Spatial Analysis and GIS, Taylor and Francis.

Ding, Y. and Fotheringham, A.S. (1992) The integration of spatial analysis and GIS, Computers, Environment and Urban Systems, 16, 3-19.

Dixon, J.S., Openshaw, S. and Wymer, C. (1988) The geographical analysis subroutine library - a user guide, Northern Regional Research Laboratory, Research Report No. 7-89, Newcastle.

Dorling, D.A. (1993) Cartograms for visualising human geography, in Unwin, D.J. and Hearnshaw, H.J. (eds.) Visualization and GIS, Belhaven Press, London.

Eastman, J.R. (1990) IDRISI: User's Guide, Department of Geography, Clark University, Worcester, Mass.

Gatrell, A.C. and Rowlingson, B.S. (1994) Spatial point process modelling in a Geographical Information Systems environment, in Fotheringham, A.S. and Rogerson, P. (eds) Spatial Analysis and GIS, Taylor and Francis, London.

Getis, A. and Ord, J.K. (1992) The analysis of spatial association by the use of distance statistics, Geographical Analysis, 24, 189-206.

Goodchild, M.F. (1987) A spatial analytical perspective on geographical information systems, International Journal of Geographical Information Systems, 1, 327-34.

Goodchild, M.F., Haining, R., Wise, S. and others (1992) Integrating GIS and spatial analysis: problems and possibilities, International Journal of Geographical Information Systems, 6, 407-23.

Griffith, D.A. and others (1990) Developing MINITAB software for spatial statistical analysis: a tool for education and research, The Operational Geographer, 8, 28-33.

Haslett, J., Bradley, R., Craig, P., Unwin, A. and Wills, G. (1991) Dynamic graphics for exploring spatial data with application to locating global and local anomalies, The American Statistician, 45, 3, 234-42.

Jefferis, D.R. (1993) SPAAM: a spatial analysis and modelling system, Proceedings, Thirteenth Annual ESRI User Conference, Volume 3, ESRI, Redlands, California, 79-87.

Maclennan, M. (1990) Second-order analysis of point patterns in GRASS, NCGIA Research Report, University of California at Santa Barbara.

Maguire, D.J. (1993) Spatial analysis and ARC/INFO, presented at Spatial Analysis and ARC/INFO conference, Lancaster University, Lancaster, June 1993, ESRI, Watford, UK.

Majure, J.J. and Cressie, N. (1993) EXPLORE: exploratory spatial analysis in ARC/INFO, Proceedings, Thirteenth Annual ESRI User Conference, Volume 3, ESRI, Redlands, California, 65-77.

Openshaw, S., Charlton, M., Wymer, C. and Craft, A. (1987) A Mark 1 Geographical Analysis Machine for the automated analysis of point data sets, International Journal of Geographical Information Systems, 1, 335-58.

Openshaw, S. (1991) A spatial analysis research agenda, in Masser, I. and Blakemore, M. (eds.) Handling Geographical Information: Methodology and Potential Applications, 18-37, Longman.

Openshaw, S. and Craft, A. (1991) Using geographical analysis machines to search for evidence of clusters and clustering in childhood leukaemia and non-Hodgkin lymphomas in Great Britain, 1966-83, in Draper, G. (ed.) The Geographical Epidemiology of Childhood Leukaemia and Non-Hodgkin Lymphomas in Great Britain, 1966-83, Studies in Medical and Population Subjects, No. 53, OPCS, HMSO, London.

Rowlingson, B.S. and Diggle, P.J. (1993) SPLANCS: spatial point pattern analysis code in S-Plus, Computers and Geosciences, 19, 627-55.

Selvin, S. (1988) Transformations of maps to investigate clusters of disease, Social Science and Medicine, 26, 215-21.

Tierney, L. (1990) LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics, John Wiley, Chichester.

Venables, W.N. and Ripley, B.D. (1994) Modern Applied Statistics with S-Plus, Springer-Verlag, London.

Abstract

Geographical Information Systems (GIS) and related technologies provide software environments that permit storage, visualisation, exploration, and modelling of geographically-referenced data. As such, these systems are of obvious potential value in spatial epidemiology and other health research of a geographical nature. However, to fully realise this value GIS have to incorporate, or be able to be used in close conjunction with, statistical methods which are sufficiently sophisticated to address the complex questions and relationships often involved in public and environmental health issues. Many such questions will involve explicit recognition of the spatial dimension of relevant data. Here we consider the general types of spatial analysis methods relevant to health research which might be valuable in conjunction with GIS. We then briefly review the current extent to which such statistical methods can easily be used in conjunction with GIS, or computer mapping facilities.

Curriculum Vitae

Dr T C Bailey, MSc PhD.

Department of Mathematical Statistics & Operational Research

University of Exeter, Exeter, EX4 4QE, UK.

Dr Trevor Bailey is a Senior Lecturer in the Department of Mathematical Statistics & Operational Research, at the University of Exeter where he was appointed in 1986. Formerly he lectured at the Australian Graduate School of Management, University of New South Wales, Sydney, Australia. He holds degrees from London and Exeter University and is a Chartered Statistician, a Fellow of the Royal Statistical Society and a Member of the Operational Research Society. His research interests include: Spatial statistics, computer mapping, locational analysis, medical geography, high dimensional classification & discrimination, applied computer modelling. He has authored or co-authored numerous academic papers on various aspects of applied statistics and Operational Research, including a recent book on Interactive Spatial Data Analysis.

Recent publications include:

Bailey T.C., Review of Anselin L. and Florax R. J.G.M. eds. New Directions in Spatial Econometrics, New York: Springer, 1995, Journal of Regional Science, 36, 2, 311-315.

Bailey T.C., Sapatinas T., Powell K., Krzanowski W. J., 'Use of wavelets in the detection of underwater sound signals', Proceedings Interface96, Sydney, Australia.

Gatrell A.C., Bailey T.C. (1996), 'Interactive spatial data analysis in medical geography', Social Science and Medicine, 42, 6, 843-855.

Gatrell A.C., Bailey T.C., Diggle P. J., Rowlingson B. (1996), 'Spatial point pattern analysis and its application in geographical epidemiology', Transactions, Institute of British Geographers, 21, 256-274.

Bailey T. C., Gatrell A.C. (1995), Interactive Spatial Data Analysis, Longman Scientific & Technical, Harlow, ISBN: 0-582-24493-5, 436 pp.

Powell K., Sapatinas T., Bailey T.C., Krzanowski W. J. (1995), 'Application of wavelets to the pre-processing of underwater sounds', Statistics and Computing, 5, 265-273.

Gatrell A.C., Collin J.R.O., Downes R., Jones B., Bailey T.C. (1995) 'The geographical epidemiology of ocular diseases: some principles and methods', Eye, 5, 358-364.

Bailey T. C., Munford A. G. (1994), 'Modelling a large, sparse spatial interaction matrix, using data relating to a subset of possible flows', European Journal of Operational Research, 79, 3, 489-500.

Bailey T.C. (1994), 'A review of statistical spatial analysis in GIS', in Spatial Analysis and GIS, Fotheringham S. & Rogerson P. (eds), Taylor & Francis, London, ISBN: 0-7484-0103-2, 13-45.

Smith D. S., Bailey T. C., Munford A. G. (1993), 'Robust Classification of High Dimensional Data using Artificial Neural Networks', Statistics and Computing, 3, 71-81.

Bailey T. C., Weal S. (1993), 'Introducing undergraduates to the spirit of OR, whilst imparting substantive skills', Journal of Operational Research Society, 44, 9, 897-908.

Bailey T. C., Hinde J.P. (1993), 'Applications of canonical correlation and Procrustes analysis in exploratory multivariate spatial data analysis', in Proceedings 4th European Conference on Geographical Information Systems, Vol 1, EGIS Foundation, Faculty of Geographical Sciences, Utrecht, ISBN: 90-73414-11-3, 606-616.