Modelling and integrating spatial data across geographic scales
This was the main theme of my Ph.D. thesis, which investigated new methods for the multi-scale analysis of spatial data. There are three research areas:
The research shows that most phenomena should be analysed at specific scales. Sounds obvious, but it is rarely done in practice, often resulting in poor maps, spurious results and inappropriate conclusions. The research also shows that maps based on a combination of scales will often contain more useful information than single scale maps.
Here is a cross-scale analysis of the Geographically Weighted (local) Correlation between NDVI and elevation from a small watershed in Honduras. The first map shows the median correlation coefficient from 8 scales of analysis (0.2 hectares to 100 hectares). The local correlation varies from almost perfect negative to perfect positive across the watershed, even though the correlation for the whole watershed is close to zero. Clearly there are local patterns in the data that cannot be observed by non-spatial analysis.
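The idea of a median local correlation across scales can be sketched in a few lines. This is only an illustration, not the thesis code: it assumes a square moving window with uniform weights (a simplification of a geographically weighted kernel), and the function names, window radii and toy grids are all made up for the example.

```python
import math
from statistics import median


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None  # no variation inside the window
    return sxy / math.sqrt(sxx * syy)


def local_correlation(a, b, radius):
    """Correlation of grids a and b inside a square window around every cell."""
    rows, cols = len(a), len(a[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            xs, ys = [], []
            for i in range(max(0, r - radius), min(rows, r + radius + 1)):
                for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                    xs.append(a[i][j])
                    ys.append(b[i][j])
            out[r][c] = pearson(xs, ys)
    return out


def median_across_scales(a, b, radii):
    """Per-cell median of the local correlation over several window radii."""
    layers = [local_correlation(a, b, radius) for radius in radii]
    rows, cols = len(a), len(a[0])
    result = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [layer[r][c] for layer in layers if layer[r][c] is not None]
            result[r][c] = median(values) if values else None
    return result


def flatten(grid):
    return [value for row in grid for value in row]


# Toy grids (hypothetical data): "elevation" rises with the row index, while
# "ndvi" follows it on the left half of the grid and opposes it on the right.
elevation = [[float(i)] * 6 for i in range(5)]
ndvi = [[float(i) if j < 3 else -float(i) for j in range(6)] for i in range(5)]

global_r = pearson(flatten(elevation), flatten(ndvi))
local_median = median_across_scales(elevation, ndvi, radii=[1, 2])
```

In this toy case the whole-grid correlation is zero, while the local values on the left and right edges are strongly positive and strongly negative respectively, which is the same kind of contrast described for the watershed.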
The second map shows the areas that were significant (based on Monte Carlo Simulation) at one or more of these scales. The yellow/red areas indicate that these local patterns in the correlation results were unlikely to have occurred by chance.
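One common way to run such a Monte Carlo test is a permutation test: shuffle one variable many times and see how often the shuffled correlation is at least as strong as the observed one. The sketch below assumes that scheme; the actual simulation design behind the map may differ, and the function name, the number of simulations and the seed are illustrative choices.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def monte_carlo_p_value(xs, ys, n_sim=199, seed=0):
    """Pseudo p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # The observed arrangement counts as one of the n_sim + 1 outcomes.
    return (hits + 1) / (n_sim + 1)


# Hypothetical window contents with a perfect linear relationship.
xs = [float(i) for i in range(20)]
p_strong = monte_carlo_p_value(xs, [2.0 * x + 1.0 for x in xs])
```

With 199 simulations the smallest attainable pseudo p-value is 1/200, so a window can be flagged at the 5% level but not much finer than 0.5%. Repeating the test for every window and every scale gives a significance layer per scale.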
Neural Networks and their application to geographical problems
Another part of my research involves the application of Unsupervised Learning techniques to hunt for patterns and interesting trends in massive (spatial) databases. Self Organising Maps (SOM) are a great way to achieve this, and I've been applying them to my cross scale results.
I'm also investigating ways to automatically colour the classification produced by the SOM so that each class receives a colour that represents its relationship to every other class, i.e. similar classes get similar colours and dissimilar classes get contrasting ones. Automatic colouring avoids a prejudicial choice of colours and class intervals, which has an enormous effect on how the map is interpreted. Since most colour systems are based on 3 values (red-green-blue or hue-saturation-value), this colourisation is simple to do if you have 3 or fewer variables. In my case there are often up to 50 variables in each class, and I've had a lot of success in applying the Sammon Mapping technique to reduce the data dimensionality to 2 and then using these pseudo x,y coordinates to define the colour of each class.
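As a rough sketch of the last step, the snippet below turns 2D pseudo coordinates into colours and measures how well a 2D layout preserves the original distances using Sammon's stress. The colour rule (hue from the angle around the centroid, saturation from the distance to it) is an assumption made for this example, not necessarily the rule used for the maps, and the function names are hypothetical.

```python
import colorsys
import math


def sammon_stress(high_dim, low_dim):
    """Sammon's stress between original points and their low-dimensional layout."""
    total = 0.0
    error = 0.0
    n = len(high_dim)
    for i in range(n):
        for j in range(i + 1, n):
            d_high = math.dist(high_dim[i], high_dim[j])
            if d_high == 0:
                continue  # identical classes carry no distance information
            d_low = math.dist(low_dim[i], low_dim[j])
            total += d_high
            error += (d_high - d_low) ** 2 / d_high
    return error / total if total else 0.0


def coords_to_colours(points):
    """Map 2D pseudo coordinates to RGB triples with components in 0..1."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(radii)
    colours = []
    for (x, y), r in zip(points, radii):
        # Assumed rule: direction from the centroid sets the hue,
        # distance from the centroid sets the saturation.
        hue = (math.atan2(y - cy, x - cx) % (2 * math.pi)) / (2 * math.pi)
        saturation = r / r_max if r_max else 0.0
        colours.append(colorsys.hsv_to_rgb(hue, saturation, 1.0))
    return colours


# Hypothetical pseudo coordinates for three classes: two close, one far away.
class_xy = [(1.0, 0.0), (1.0, 0.1), (-1.0, 0.0)]
class_colours = coords_to_colours(class_xy)
```

Classes that sit close together in the 2D layout end up with similar colours, and a low stress value indicates that closeness in the layout reflects closeness in the original variables.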
A map of these automatically generated and coloured classes is shown in the first map below. The method has identified 10 classes in the data set, i.e. 10 different patterns in the variation of the correlation coefficient across scale. The colouring of the classes indicates the degree of similarity between the classes. Classes 1 and 2 are similar and have similar colours, as do 9 and 10, yet classes 1 and 10 are totally different. There is a gradual change in colouration throughout the classes that helps to interpret the map since the method not only identifies classes but orders them in a sensible way.
The second map shows the error in this classification as a percentage. The average error for the map is below 10% (a 90%+ classification accuracy) with few pixels over 20%.
Spatial Analysis of biodiversity
I'm also involved with generating new methods for mapping accession data collected for rare, wild species of crops that are important for biodiversity conservation as well as for developing new strains of resistant staple crops. This research aims to develop mapping tools that accurately portray the location and appropriate scale of mapping of these species.
A point-based map of the Shannon-Weaver index of biodiversity for Arachis (peanuts, groundnuts etc.) in South America is shown in the first map below. Each point represents a collection and each species is coloured individually. The second map shows the resulting cross-scale map of the species diversity, based on a local richness calculation to highlight hotspots of species diversity.
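For reference, the Shannon index is H = -sum(p_i * ln p_i), where p_i is the share of records belonging to species i, and richness is simply the number of distinct species. The sketch below computes both per grid cell; the record layout, the cell-based grouping and the species labels are assumptions for illustration rather than the mapping tool itself.

```python
import math
from collections import Counter


def shannon_index(species):
    """Shannon diversity H = -sum(p * ln p) over the species proportions."""
    counts = Counter(species)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def richness(species):
    """Number of distinct species."""
    return len(set(species))


def diversity_by_cell(records, cell_size):
    """Group (x, y, species) records into square cells and summarise each cell."""
    cells = {}
    for x, y, name in records:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells.setdefault(key, []).append(name)
    return {key: (richness(names), shannon_index(names)) for key, names in cells.items()}


# Hypothetical accession records: x, y and a placeholder species label.
records = [
    (0.2, 0.3, "sp1"),
    (0.8, 0.1, "sp2"),
    (1.5, 0.5, "sp1"),
    (1.6, 0.4, "sp1"),
]
by_cell = diversity_by_cell(records, cell_size=1.0)
```

Running the same summary with several values of `cell_size` gives one diversity layer per scale, which is the raw material for a cross-scale map.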
Global population mapping and modelling
I work in collaboration with UNEP, CIAT and CIESIN in developing models for redistributing polygon based population data into higher resolution raster format. Population data is acquired from two data sources, i) census data in fairly detailed administrative units and ii) urban areas. Resolving the differences between these two data sets to produce estimates of population that comply with historical UN rural and urban population estimates is achieved by a mass-conserving algorithm, GRUMPe (Global Rural Urban Mapping Programme).
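The mass-conserving property can be illustrated with a generic proportional allocation: each administrative unit's total is spread over its raster cells in proportion to a weight (for example, higher weights inside urban areas), so the cells of a unit always sum back to the census figure. This is not the GRUMPe algorithm itself; the function name, the weights and the toy numbers are assumptions for the example.

```python
def redistribute(zones, totals, weights):
    """Spread each zone's population over its cells in proportion to the weights.

    zones   -- grid of zone ids (None for cells outside every zone)
    totals  -- dict mapping zone id to the zone's population
    weights -- grid of non-negative cell weights, same shape as zones
    """
    weight_sum = {}
    cell_count = {}
    for zone_row, weight_row in zip(zones, weights):
        for zone, w in zip(zone_row, weight_row):
            if zone is None:
                continue
            weight_sum[zone] = weight_sum.get(zone, 0.0) + w
            cell_count[zone] = cell_count.get(zone, 0) + 1

    out = []
    for zone_row, weight_row in zip(zones, weights):
        row = []
        for zone, w in zip(zone_row, weight_row):
            if zone is None:
                row.append(0.0)
            elif weight_sum[zone] > 0:
                row.append(totals[zone] * w / weight_sum[zone])
            else:
                # No weight information: fall back to an even spread.
                row.append(totals[zone] / cell_count[zone])
        out.append(row)
    return out


# Hypothetical example: zone "a" has one heavily weighted (urban) cell,
# zone "b" has no weight information at all.
zones = [["a", "a", "b"], ["a", "a", "b"]]
totals = {"a": 1000.0, "b": 300.0}
weights = [[1.0, 1.0, 0.0], [1.0, 7.0, 0.0]]
population = redistribute(zones, totals, weights)
```

Whatever weights are used, summing the output cells of a zone returns that zone's input total, which is what allows the raster to stay consistent with the census and UN figures.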
Here are links to various population mapping efforts, some of which I have been involved in.
A 3D map of the population density for Kenya (circa 2000 from GPW v3) is shown below.
I've been developing spatial models of accessibility in developing countries. The models are based on a cost-distance approach where we calculate the cost (in dollars or time) to travel from point A to point B, based on various modes of transport and travel conditions. The model is raster based and can incorporate a wide range of information (road quality, slope, land cover, urban areas, climate) to create accurate models of travel cost, which can then be applied to further models of population potential and 'poverty', amongst other things.
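A raster cost-distance surface of this kind can be computed with Dijkstra's algorithm over the grid. The sketch below is a generic version, not the CIAT model: it assumes a friction grid holding the time needed to cross each cell, 8-neighbour moves, and a step cost equal to the step length times the average friction of the two cells. In a fuller model the friction values would be derived from road quality, slope, land cover and so on.

```python
import heapq
import math


def cost_distance(friction, sources):
    """Least accumulated travel cost from any source cell to every cell."""
    rows, cols = len(friction), len(friction[0])
    cost = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        cost[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))

    diagonal = math.sqrt(2.0)
    steps = [
        (-1, 0, 1.0), (1, 0, 1.0), (0, -1, 1.0), (0, 1, 1.0),
        (-1, -1, diagonal), (-1, 1, diagonal), (1, -1, diagonal), (1, 1, diagonal),
    ]
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > cost[r][c]:
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc, length in steps:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Half the step is spent in each of the two cells.
                nd = d + length * 0.5 * (friction[r][c] + friction[nr][nc])
                if nd < cost[nr][nc]:
                    cost[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return cost


# Hypothetical friction grid (time units per cell): an easy surface with one
# very slow cell in the middle, and a single source in the top-left corner.
friction = [
    [1.0, 1.0, 1.0],
    [1.0, 100.0, 1.0],
    [1.0, 1.0, 1.0],
]
travel_time = cost_distance(friction, sources=[(0, 0)])
```

The cheapest route to the far corner goes around the slow cell rather than through it, which is exactly the difference between an accessibility surface and a simple straight-line distance map.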
These two maps compare a simple distance map, in km (top), with an accessibility map, in hours (bottom), for two major cities in Honduras. The second map was created with the CIAT Accessibility Analyst.
Hydrology and watershed modelling
I'm interested in DEM generation from contour lines, spot heights and stereo orthophotos, and the subsequent generation of terrain indices, watershed boundaries and watershed characteristics. I've spent some time with colleagues at CIAT developing improved versions of the Shuttle Radar Topography Mission DEM data. This improved data is online here.
The original research grade data from NASA/USGS/NGA are available here.
Sometime later this year (2004) the final processed data from should be available too.