GAM: Cluster hunting software (v4.0)

Centre for Computational Geography
University of Leeds

The Cluster Hunter program gives you several ways to analyse clusters in your data. This webpage will take you through one: analysing data with GAM. Note that Cluster Hunter is an incomplete, experimental program and behaves as such.

The first thing to do is get the data file you want to use. To follow the information on this page you need a comma-separated text file with data in the following columns...

ID number, Eastings, Northings, Incidences, Total population in ID area.

The program assumes the ID number refers to an area with a total population, in which there are a number of incidences of something. The Eastings and Northings refer to the centre of the area. Note that the ID cannot include letters.
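
Before loading a file, it can help to check that every row matches the column layout above. The following is only a sketch in Python and is not part of Cluster Hunter: the function name parse_rows and the sample values are made up for illustration, and it assumes the file has no header row (this page does not say whether one is allowed).

```python
import csv
import io


def parse_rows(text):
    """Check rows of 'ID, Easting, Northing, Incidences, Population'.

    Returns a list of tuples, or raises ValueError naming the bad line.
    Assumes no header row (an assumption, not a documented rule).
    """
    rows = []
    for line_no, fields in enumerate(csv.reader(io.StringIO(text)), start=1):
        # Skip completely blank lines.
        if not fields or all(not f.strip() for f in fields):
            continue
        if len(fields) != 5:
            raise ValueError(f"line {line_no}: expected 5 columns, got {len(fields)}")
        id_text, east, north, cases, pop = (f.strip() for f in fields)
        # The ID cannot include letters, so accept plain digits only.
        if not (id_text.isascii() and id_text.isdigit()):
            raise ValueError(f"line {line_no}: ID {id_text!r} is not a whole number")
        rows.append((int(id_text), float(east), float(north), float(cases), float(pop)))
    return rows


# Invented sample data, for illustration only.
sample = "1,430000,433000,3,1200\n2,431500,434250,0,950\n"
print(len(parse_rows(sample)), "rows look fine")
```

A row with an ID such as "A1" would raise a ValueError here, which is easier to diagnose than a failed load in the application.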

Once you have the file, open up Cluster.jar

Either double-click on the file in Windows Explorer, or if this doesn't work, open a DOS command prompt, find the file and type...

> d:\jre1.5\bin\java -jar cluster.jar

You'll have to replace the "d:\jre1.5\bin\" bit with a reference to wherever the JRE is on your machine.

(For those used to Java, the file is a self-executing jar archive)

Screenshot of Cluster

The application is composed of the following...

  • A File menu, for opening files and quitting.
  • A Select menu, for selecting the cluster hunting method.
  • A Parameters menu, for selecting parameters for the various methods.
  • A Run menu, for running once you've opened the file and selected a method.
  • A Display menu, for displaying your data, the results, and overlaying boundaries on both from a shapefile.
  • A message space below the menus.

The first thing to do is to open up the file.

Pick Load Data File from the File menu, and open your file (note that the file must have the .dat extension). This may take a few minutes. Cluster should tell you that the file has been opened and the number of points that have been loaded into the application from the file when it's finished.
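
If your data is currently saved with another extension (for example .csv), one option is to make a copy with the .dat extension rather than renaming the original. This is just a sketch in Python; the helper name copy_with_dat_extension is made up for illustration, and it assumes the contents are already in the five-column format described earlier.

```python
import shutil
from pathlib import Path


def copy_with_dat_extension(source):
    """Copy 'source' to a sibling file ending in .dat and return the new path."""
    source = Path(source)
    target = source.with_suffix(".dat")
    # Nothing to do if the file already has the .dat extension.
    if target != source:
        shutil.copyfile(source, target)
    return target
```

For example, copy_with_dat_extension("cases.csv") would leave cases.csv untouched and create cases.dat next to it.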

Screenshot of Cluster when loaded

Once you have the file in, you can look at the data points using the Display Database item on the Display menu. You may have to adjust the size of the window that pops up in order to see the data properly. To close the window (or indeed any of the windows), go to the File menu and select Quit. Note that you can't use the "X" in the corner of windows to close them.

Next you have to pick a clustering method.

Choose GAM from the Select menu. If you go back to the menu you'll see it's now got a little star next to it.

The other options are Knn, a type of K-means detection, and Random, which finds clusters like GAM, but by randomly throwing circles at the problem (it's used to test the effectiveness of algorithms against a benchmark of random guessing).
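
To give a feel for what "throwing circles at the problem" means, here is a rough sketch in Python of testing a single circle. It is not Cluster Hunter's code: it assumes a circle is flagged when it contains more incidences than the overall rate would predict for the population inside it, using a Poisson test with a made-up threshold (alpha). The function names and sample points are hypothetical.

```python
import math


def poisson_tail(observed, expected):
    """Probability of seeing 'observed' or more events when 'expected' are predicted."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = 0.0
    for k in range(observed):
        cdf += term                    # add P(X = k)
        term *= expected / (k + 1)     # step to P(X = k + 1)
    return max(0.0, 1.0 - cdf)


def circle_is_unusual(points, cx, cy, radius, alpha=0.01):
    """points: list of (easting, northing, incidences, population) tuples."""
    total_cases = sum(p[2] for p in points)
    total_pop = sum(p[3] for p in points)
    inside = [p for p in points if math.hypot(p[0] - cx, p[1] - cy) <= radius]
    cases = int(sum(p[2] for p in inside))
    pop = sum(p[3] for p in inside)
    if pop == 0 or total_pop == 0:
        return False
    # Incidences expected in the circle if the overall rate applied everywhere.
    expected = pop * total_cases / total_pop
    return cases > expected and poisson_tail(cases, expected) < alpha


# Invented points: all ten incidences sit in the first area.
points = [(0, 0, 10, 100), (100, 100, 0, 100), (200, 200, 0, 100), (300, 300, 0, 100)]
print(circle_is_unusual(points, 0, 0, 10))      # circle around the first area
print(circle_is_unusual(points, 100, 100, 10))  # circle around an empty area
```

A systematic search would repeat this test for circles placed across the whole study area at a range of radii, whereas a random search would pick the centres and radii at random.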

You'll see that the Set Parameters item on the Parameters menu can now be selected. This would let you set the parameters. We're just going to run GAM with the defaults.

Next, run GAM

Pick the Go option from the Run menu. GAM will run and various messages will appear in the messages area. It may take several minutes for GAM to run. When it's finished, the message area should say "finished".

Screenshot of Cluster when finished.

Finally we want to display and save the results.

Pick Display Results from the Display menu. This will bring up a window showing the clusters. You can use the Load Shapefile item on the Geography menu of this window to load a boundary dataset over the results to see where the clusters are (this is a little flaky with large boundary datasets).

Once you're happy with the results you can use the Save as Jpeg item on the File menu to save the clusters as an image. However, the saved images are pretty poor quality and it won't presently save your boundaries, so it may be better to save an image of your clusters by taking a screenshot (Prt Sc button on your keyboard, then open up Microsoft Photo Editor and select Paste as New Image from the Edit menu).

Alternatively you can export the cluster file as an Arc GRID compatible file. This is an ASCII file that is suitable for conversion to a raster/GRID format using ArcToolbox.
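
If you want to inspect the exported file outside ArcToolbox, it can be read with a few lines of Python. This sketch assumes the export follows the usual Esri ASCII raster layout (header lines such as ncols, nrows, xllcorner, yllcorner, cellsize and an optional NODATA_value, followed by the cell values row by row from the top). This page does not confirm the exact layout Cluster Hunter writes, so check a real export first. The function name read_ascii_grid and the sample grid are made up for illustration.

```python
HEADER_KEYS = {
    "ncols", "nrows", "xllcorner", "yllcorner",
    "xllcenter", "yllcenter", "cellsize", "nodata_value",
}


def read_ascii_grid(text):
    """Return (header, grid) where grid is a list of rows, top row first."""
    tokens = text.split()
    header = {}
    pos = 0
    # Header entries are keyword/value pairs at the start of the file.
    while pos + 1 < len(tokens) and tokens[pos].lower() in HEADER_KEYS:
        header[tokens[pos].lower()] = float(tokens[pos + 1])
        pos += 2
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    values = [float(t) for t in tokens[pos:]]
    if len(values) != ncols * nrows:
        raise ValueError(f"expected {ncols * nrows} cell values, found {len(values)}")
    grid = [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return header, grid


# Invented sample grid, for illustration only.
sample = """ncols 3
nrows 2
xllcorner 430000
yllcorner 433000
cellsize 500
NODATA_value -9999
0 1.5 0
-9999 2 0.25
"""
header, grid = read_ascii_grid(sample)
print(header["cellsize"], grid[0])
```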

To shut down Cluster, go to the main window, and select Quit from the File menu.

