Brimicombe, A.J. (2003) "A variable resolution approach to cluster discovery in spatial data mining" in Computational Science and Its Applications (eds. Kumar et al.), Springer-Verlag, Berlin, Vol. 3: 1-11

Spatial data mining seeks to discover meaningful patterns from data where a prime dimension of interest is geographical location. Consideration of a spatial dimension becomes important when data refer to specific locations and/or have significant spatial dependence which needs to be considered if meaningful patterns are to emerge. For point data there are two main groups of approaches. One stems from traditional statistical techniques such as k-means clustering, in which every point is assigned to a spatial grouping, resulting in a spatial segmentation. The other broad approach searches for 'hotspots', which can be loosely defined as a localised excess of some incidence rate; in this approach not all points are necessarily assigned to clusters. This paper presents a novel variable resolution approach to cluster discovery which acts in the first instance to define spatial concentrations within the data, thus allowing the nature of clustering to be defined. The cluster centroids are then used to establish initial cluster centres in a k-means clustering and arrive at a segmentation on the basis of point attributes. The variable resolution technique can thus be viewed as a bridge between the two broad approaches towards knowledge discovery in mining point data sets. Applications of the technique to date include the mining of business, crime, health and environmental data.

Brimicombe, A.J. (2002) "Cluster discovery in spatial data mining: a variable resolution approach" in Data Mining III (eds. Zanasi et al.), WIT Press, Southampton: 625-634

Spatial data mining seeks to discover meaningful patterns from data where a key dimension of the data is geographical location. This spatial dimension becomes important when data refer to specific locations and/or have significant spatial dependence which needs to be taken into consideration if meaningful patterns are to emerge. For point data there are two main groups of approaches. One stems from traditional statistical techniques such as k-means clustering, in which every point is assigned to a spatial grouping, resulting in a spatial segmentation. The segmentation has k sub-regions and is usually space filling and non-overlapping (i.e. a tessellation), so that all points fall within a spatial segment. The difficulty with this approach is in defining k centroid locations at the outset of any data mining. The other broad approach searches for 'hotspots', which can be loosely defined as a localised excess of some incidence rate. In this approach not all points are necessarily assigned to clusters. It is the mainstay of those approaches which seek to identify any significantly elevated risk above what might be expected from an at-risk background population. Definition of the population at risk is clearly critical and in some data mining applications is not possible at the outset. This paper presents a novel variable resolution approach to cluster discovery which acts in the first instance to define spatial concentrations in the absence of a population at risk. The cluster centroids are then used to establish initial centroids for techniques such as k-means clustering and arrive at a segmentation on the basis of point attributes. The variable resolution technique can thus be viewed as a bridge between the two broad approaches towards knowledge discovery in mining point data sets. The technique is equally applicable to the mining of business, crime, health and environmental data. A business-oriented case study is presented here.
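
The bridging idea in this abstract, using centroids of discovered concentrations as initial centres for k-means, can be illustrated with a minimal sketch. It is not the paper's variable resolution algorithm: the concentration step here is a simple fixed-grid density threshold used purely as a stand-in, the function names and the parameters cell and min_count are hypothetical, and the segmentation is run on point coordinates rather than point attributes to keep the example short.

```python
from collections import defaultdict
from math import dist


def dense_cell_centroids(points, cell, min_count):
    """Stand-in for concentration discovery (an assumption, not the paper's method):
    return centroids of fixed grid cells that hold at least min_count points."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell), int(y // cell))].append((x, y))
    seeds = []
    for members in cells.values():
        if len(members) >= min_count:
            n = len(members)
            seeds.append((sum(p[0] for p in members) / n,
                          sum(p[1] for p in members) / n))
    return sorted(seeds)


def kmeans(points, seeds, iterations=20):
    """Plain Lloyd-style k-means started from the supplied seed centres,
    so k does not have to be guessed at the outset."""
    centres = list(seeds)
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: dist(p, centres[i]))
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep empty groups in place.
        new_centres = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else c
            for g, c in zip(groups, centres)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    labels = [min(range(len(centres)), key=lambda i: dist(p, centres[i]))
              for p in points]
    return centres, labels
```

In this sketch an isolated point never forms a seed of its own, but k-means still assigns it to the nearest segment, which mirrors the contrast the abstract draws between hotspot detection and full segmentation.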

Brimicombe, A.J. and Tsui, P. (2000) "A variable resolution, geocomputational approach to the analysis of point patterns" Hydrological Processes 14: 2143-2155

A geocomputational approach to the solution of applied spatial problems is being ushered in to take advantage of ever increasing computer power. The move is seen widely as a paradigm shift allowing better solutions to be found for old problems, solutions to be found for previously unsolvable problems and the development of new quantitative approaches to geography. This paper uses geocomputation to revisit point pattern analysis as an objective, exploratory means of evaluating mapped distributions of landforms and/or events. A new variable resolution approach is introduced and tested alongside more traditional approaches of nearest neighbour distance and quadrat analysis and against another geocomputational approach, the K function. The results demonstrate that firstly, the geocomputational paradigm allows new and more useful solutions to be found for old problems. Secondly, a variable resolution approach to geographical data analysis goes some way towards overcoming the problem of scale inherent in such analyses. Finally, the technique facilitates spatio-temporal analyses of event data, such as landslides, thus offering new lines of enquiry in areas such as hazard mitigation.
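
For readers unfamiliar with the two traditional measures named in this abstract, the following minimal sketch computes a nearest neighbour index (mean observed nearest neighbour distance divided by the value expected under complete spatial randomness, 0.5 / sqrt(n / area)) and a quadrat variance-to-mean ratio. The function names and parameters are illustrative assumptions, no edge correction is applied, and nothing here reproduces the paper's variable resolution technique.

```python
from math import dist, sqrt


def nearest_neighbour_index(points, area):
    """Ratio of observed to expected mean nearest neighbour distance.
    Values below 1 suggest clustering, values above 1 suggest regularity."""
    n = len(points)
    observed = sum(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / sqrt(n / area)
    return observed / expected


def quadrat_vmr(points, width, height, nx, ny):
    """Variance-to-mean ratio of point counts over an nx-by-ny grid of quadrats
    covering a width-by-height study area with its origin at (0, 0)."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean
```

The quadrat result depends on the chosen nx and ny, which is one face of the scale problem the abstract refers to.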

Tsui, P. and Brimicombe, A.J. (1997) "Hierarchical tessellations model and its use in spatial analysis" Transactions in GIS 2: 267-279

Hierarchical tessellation models are a class of spatial data models based on recursive decomposition of space. The quadtree is one such tessellation and is characterised by square cells and a 1:4 decomposition ratio. To relax these constraints, a generalised hierarchical tessellation data model, called Adaptive Recursive Tessellations (ART), has been proposed. ART increases flexibility in tessellation by the use of rectangular cells and variable decomposition ratios. In ART, users can specify cell sizes which are intuitively meaningful to their applications, or which reflect the scales of data. ART is implemented in a data structure called Adaptive Recursive Run-Encoding (ARRE), a variant of two-dimensional run-encoding whose running path can vary with the tessellation structure of the ART model. Given the recognised benefits of implementing statistical spatial analysis in GIS, the use of hierarchical tessellation models such as ART in spatial analysis is discussed. Firstly, ART can be applied to solve the quadrat size problem in quadrat analysis of point patterns by using variable size quadrats. Secondly, ART can act as the data model in a variable resolution block kriging technique for geostatistical data to reduce variation in kriging error. Finally, the ART model can facilitate the evaluation of spatial autocorrelation for area data at multiple map resolutions; the construction of a connectivity matrix for calculating spatial autocorrelation indices based on ARRE is also illustrated.

Tsui, P. and Brimicombe, A.J. (1997) "Adaptive recursive tessellations (ART) for Geographical Information Systems" International Journal of Geographical Information Science 11: 247-263 (US$1,000 CPGIS prize for Best Paper)

Adaptive Recursive Tessellations (ART) is a conceptual and generalised framework for a series of hierarchical tessellation models characterised by a variable decomposition ratio and rectangular cells. ART offers more flexibility in cell size and shape than the quadtree, which is constrained by its fixed 1:4 decomposition ratio and square cells. Thus the variable resolution storage characteristic of hierarchical tessellations can be fully utilised. A data structure for the implementation of ART, called Adaptive Recursive Run-Encoding (ARRE), is proposed. A spatial database management system specifically for ART, the Tessellation Manager, is then constructed based on the ARRE. Space efficiency analysis of three ART models is conducted using the Tessellation Manager. The results show that ART models have similar space efficiency to the quadtree model. ART also has many potential applications in GIS and is suitable as a spatial data model for raster GIS.
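
The difference between a fixed 1:4 decomposition and a variable decomposition ratio with rectangular cells can be made concrete with a small sketch. It only computes the cell size at each level from a list of per-level splits; it does not implement ART, ARRE or the Tessellation Manager. The function name level_cell_sizes, the (columns, rows) representation of a split and the example dimensions are all assumptions made for illustration.

```python
def level_cell_sizes(width, height, splits):
    """Return the cell size (width, height) at each level of a hierarchical
    tessellation. Each entry of splits is a hypothetical (columns, rows) pair
    saying how one parent cell is divided at the next level."""
    sizes = [(width, height)]
    for cols, rows in splits:
        w, h = sizes[-1]
        sizes.append((w / cols, h / rows))
    return sizes


# Quadtree: every level splits 2 x 2 (1:4), so cell sizes are fixed
# as soon as the extent of the whole area is known.
quadtree = level_cell_sizes(1024, 1024, [(2, 2)] * 3)

# Variable ratios: a 40 000 x 30 000 unit extent split 1:12, then 1:100,
# then 1:4, chosen so that the cells land on round sizes.
variable = level_cell_sizes(40000, 30000, [(4, 3), (10, 10), (2, 2)])
```

With the variable splits, a rectangular extent yields cells of 10 000, 1 000 and 500 units, whereas halving the same extent repeatedly would give 20 000 x 15 000, 10 000 x 7 500 and so on.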

Tsui, P. and Brimicombe, A.J. (1996) "Hierarchical tessellations model and its use in spatial analysis" First International Conference on Geocomputation, Leeds, Vol. 2: 815-825

The quadtree, a classical hierarchical tessellation model, has been widely adopted as a GIS spatial data model for its ability to compress raster data. Nevertheless, because of its rigid tessellation structure, a generalised hierarchical tessellation model, called Adaptive Recursive Tessellations (ART), is proposed. It has a more flexible tessellation structure than the quadtree as a result of the use of rectangular cells and variable decomposition ratios. Thus users can assign to different levels of an ART model specific cell sizes which are intuitively useful or meaningful to their applications, or which reflect the scales of data. ART is represented by a special type of two-dimensional run-encoding, Adaptive Recursive Run-Encoding (ARRE), whose running path can vary with the tessellation structure of an individual ART model. The benefits of implementing statistical spatial analysis in GIS have been recognised, and the use of hierarchical tessellation models as spatial data models in spatial analysis is discussed in this paper. The formulae for mean and variance statistics in terms of addresses of ARRE are stated. A connectivity matrix, which is an essential element in calculating spatial autocorrelation, can also be constructed based on the ARRE. Finally, the application of the ART model in solving the modifiable areal unit problem is discussed. Flexibility of the tessellation structure and the inherent ability of the ART model to aggregate spatially have been found useful in studying scale and aggregation effects of areal units on spatial analysis.
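
The role of a connectivity matrix in calculating spatial autocorrelation can be sketched with Moran's I on a regular grid. This is a generic illustration under stated assumptions: the matrix below is built from row and column positions with rook (edge-sharing) adjacency, not from ARRE addresses, and the function names are hypothetical.

```python
def rook_connectivity(rows, cols):
    """Binary connectivity matrix for a rows-by-cols grid in row-major order:
    w[i][j] is 1 when cells i and j share an edge."""
    n = rows * cols
    w = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((0, 1), (1, 0)):      # right and lower neighbours
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    w[i][j] = w[j][i] = 1
    return w


def morans_i(values, w):
    """Moran's I = (n / S0) * sum_ij w_ij d_i d_j / sum_i d_i^2,
    where d_i is the deviation of cell i from the mean and S0 is the sum of weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den
```

Aggregating cells to a coarser level and recomputing the index is one way to observe the scale and aggregation effects mentioned in the abstract.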

Brimicombe, A.J. and Yeung, D. (1995) "An object oriented approach to spatially inexact socio-cultural data" Proceedings 4th International Conference on Computers in Urban Planning & Urban Management, Melbourne, Vol. 2: 519-530

Culture, as a system of shared beliefs within a society, has an important influence on the use of land and attitudes towards the environment. Socio-cultural data describing the spatial manifestations and influences of culture should therefore be fundamental to the processes of planning and conservation. Socio-cultural data, however, are inherently inexact and topology may not be strictly determined by geometric adjacency. Thus geographical information systems currently used by planners are largely unable to represent socio-cultural phenomena, leading to their exclusion from many analyses and hence from much of the decision-making process. This paper argues for an object-oriented approach in which socio-cultural objects and classes of phenomena can be modelled initially as a-spatial. This is illustrated using a Hong Kong example of the influence of traditional landscape beliefs (feng shui) on village layout. Location and extent, as attributes of objects, can then be defined using raster and/or vector in absolute or relative space. The outcome of this approach provides a new GIS perspective of space, place and landscape.

Brimicombe, A.J. and Tsui, P. (1993) "Adaptive recursive tessellations: a versatile approach to data modeling GIS applications in planning and engineering" Third International Conference on Computers in Urban Planning and Urban Management, Atlanta, Vol. 1: 435-450

Physical planning and engineering feasibility studies progress through stages of refinement from initial site search and selection to detailed zoning or layout proposals. Whilst GIS is an appropriate tool, current data models do not easily support the refinement process. Adaptive recursive tessellations (ART) are raster-based models developed to intuitively match the usual approach to the task. Specific features include the ability to: define appropriate levels of resolution by varying the decomposition ratio; conform to existing map sheet series; locally increase the resolution of a layer as more refined data becomes available; segment the database over a distributed network between Planning Offices. A tessellation management system (TMS) has been developed to create and manage ARTs according to planners' requirements. Data input, through the TMS, can be either in vector or raster. The data structure should allow for efficient analytical operations such as overlay, Boolean mapping and multicriteria evaluation.

Tsui, P. and Brimicombe, A.J. (1993) "Adaptive recursive tessellations: an approach to data modeling in civil engineering feasibility studies" 3rd International Workshop on GIS, Beijing, Vol. 1: 41-55

Practical approaches to civil engineering pre-feasibility and feasibility studies progress through stages of refinement from initial site search and selection to outline design. Whilst GIS is an appropriate tool, current data models do not easily support the refinement process. Adaptive recursive tessellations (ART) are raster-based models developed to intuitively match the usual approach to the task. Specific features include the ability to: define appropriate levels of resolution by varying the decomposition ratio; conform to existing map sheet series; locally increase the resolution of a layer as more refined data becomes available; segment the database, if necessary, over a distributed network between site or regional offices. A tessellation manager (TM) has been developed to create and manage ARTs according to project requirements. The data structure should allow for efficient analytical operations usually used in opportunities and constraints mapping and site selection.

Tsui, P. and Brimicombe, A.J. (1992) "Adaptive Recursive Tessellations" Proceedings GIS/LIS'92, San Jose, Vol. 2: 777-786

The quadtree, a recursive tessellation data model, can compress the volume of raster data by representing a large area of the same characteristic with one larger cell instead of a vast number of small cells. However, the deficiencies of the quadtree are its exclusive use of square cells and its fixed decomposition ratio (1:4). These make the quadtree a very inflexible data model, since the sizes of cells at different levels are fixed once the dimension of the whole area is known. A new Adaptive Recursive Tessellation (ART) data model, which allows the use of rectangular cells and a variable decomposition ratio, is presented. ART offers users much greater flexibility in data modelling to cope with the needs of different applications. A modified two-dimensional run-encoding technique is also implemented on ART to further reduce the storage volume. In order to construct and update an ART, a Tessellation Management System (TMS) has been developed.