The CCG Synthetic data set

This page gives details of a pair of synthetic data sets developed for the CCG for testing cluster detection tools. A full description of how these data sets were built is given in "Testing space-time and more complex hyperspace geographical analysis tools" by Stan Openshaw, Andy Turner, Ian Turton, James Macgill and Chris Brunsdon, which was presented at GISRUK99 at Southampton.

To use the data you need the coordinates for the population and cases; every set uses the same coordinates file. You then need a cases file, a time file or an attribute file. The region being "studied" is Yorkshire and Humberside, in case you want to add background maps.

The coordinates file contains an ID (so you can link the other data to the locations) followed by a pair of six-figure National Grid coordinates.

1 436370 405770
2 437050 405710
3 437660 405790
4 438400 405860
5 438110 405860
6 436760 405790
7 435910 405410
8 436210 405450
9 436350 405350
10 436460 405400
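
The layout above can be read with a few lines of Python. This is only a minimal sketch: the function name `parse_coordinates` is hypothetical, and it assumes the two coordinates are given as easting then northing (the usual National Grid order), which this page does not state explicitly.

```python
from typing import Dict, Tuple


def parse_coordinates(text: str) -> Dict[int, Tuple[int, int]]:
    """Map each ID to its pair of National Grid coordinates.

    Assumption: the pair is (easting, northing); the page only says
    "a pair of National Grid coordinates".
    """
    coords: Dict[int, Tuple[int, int]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got: {line!r}")
        ident, easting, northing = (int(f) for f in fields)
        coords[ident] = (easting, northing)
    return coords


# First three records from the sample listing above.
sample = """\
1 436370 405770
2 437050 405710
3 437660 405790
"""
coords = parse_coordinates(sample)
print(coords[1])
```

Keying the result by ID makes it straightforward to look up a location when reading any of the other files.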

The cases file contains an ID (which matches the coordinates file) followed by the number of cases (what might be clustered) and the population at risk (which you can ignore if you want).

   1  0  370
   2  0  393
   3  0  438
   4  0  388
   5  0  596
   6  0  474
   7  0  524
   8  0  411
   9  0  415
   10  0  487
   11  0  615
   12  0  519
   13  0  599
   14  0  759
   15  0  502
   16  0  575
   17  0  585
   18  1  640
   19  0  491
   20  0  658
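
A cases file in this layout could be read as sketched below. The names `CaseRecord` and `parse_cases` are hypothetical, and the whitespace-separated three-column layout is inferred from the sample rather than from a formal specification.

```python
from typing import Dict, NamedTuple


class CaseRecord(NamedTuple):
    cases: int       # number of cases at this location
    population: int  # population at risk at this location


def parse_cases(text: str) -> Dict[int, CaseRecord]:
    """Map each ID to its case count and population at risk."""
    records: Dict[int, CaseRecord] = {}
    for line in text.splitlines():
        fields = line.split()  # tolerates the leading spaces in the file
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got: {line!r}")
        ident, cases, population = (int(f) for f in fields)
        records[ident] = CaseRecord(cases, population)
    return records


# Three records from the sample listing above.
sample = """\
   17  0  585
   18  1  640
   19  0  491
"""
records = parse_cases(sample)
total_cases = sum(r.cases for r in records.values())
total_population = sum(r.population for r in records.values())
print(total_cases, total_population)
```

The totals are the sort of quantity a cluster detection tool might use to derive an overall rate; if you ignore the population at risk, only the `cases` field matters.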

For each ID that has at least one case, the time file gives the ID, a count of the number of items to follow, and then pairs of time period and number of cases.

   18  2  33  1
   27  2  72  1
   29  2  118  1
   39  2  324  1
   51  2  230  1
   54  4  154  1 230 1
   55  2  170  1
   60  2  365  1
   61  2  128  1
   73  2  250  1
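
Because each record has a variable length, a reader has to use the count field to decide how many values to consume. The sketch below shows one way to do that; `parse_time_line` is a hypothetical name, and the check that the count matches the number of remaining values is an assumption about well-formed input, not something this page guarantees.

```python
from typing import List, Tuple


def parse_time_line(line: str) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (ID, [(time_period, number_of_cases), ...]) for one record."""
    fields = [int(f) for f in line.split()]
    if len(fields) < 2:
        raise ValueError(f"record too short: {line!r}")
    ident, count, rest = fields[0], fields[1], fields[2:]
    # The count gives the number of items that follow, so it should be
    # even (items come in pairs) and match what is actually on the line.
    if count % 2 != 0 or len(rest) != count:
        raise ValueError(f"count does not match items: {line!r}")
    pairs = list(zip(rest[0::2], rest[1::2]))
    return ident, pairs


# The record for ID 54 from the sample above has two pairs.
ident, pairs = parse_time_line("   54  4  154  1 230 1")
print(ident, pairs)
```

For ID 54 this yields two pairs, one case in period 154 and one in period 230, matching the count of 4 items.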

For each ID that has at least one case, the attribute file gives the ID, a count of the number of items to follow, and then pairs of time period and the attribute of the case.

   18  2  33  1
   27  2  72  1
   29  2  118  1
   39  2  324  1
   51  2  230  1
   54  4  154  2 230 2
   55  2  170  2
   60  2  365  2
   61  2  128  1
   73  2  250  2
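
As a quick check on an attribute file, you might tally how often each attribute value occurs. The sketch below does this for a few sample records; `attribute_counts` is a hypothetical helper, and since this page does not say what the attribute values mean, they are treated as opaque integer codes.

```python
from collections import Counter


def attribute_counts(text: str) -> Counter:
    """Count occurrences of each attribute value across all records."""
    counts: Counter = Counter()
    for line in text.splitlines():
        fields = [int(f) for f in line.split()]
        if not fields:
            continue
        count = fields[1]
        items = fields[2:2 + count]
        # Items alternate (time period, attribute), so the attributes
        # are at the odd positions.
        counts.update(items[1::2])
    return counts


# Three records from the sample listing above.
sample = """\
   18  2  33  1
   54  4  154  2 230 2
   55  2  170  2
"""
counts = attribute_counts(sample)
print(counts)
```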

The coordinates for the data
Set 1 Cases, Time, Attributes
Set 2 Cases, Time, Attributes
Set 3 Cases, Time, Attributes
Set 4 Cases, Time, Attributes
Set 5 Cases, Time, Attributes
Set 6 Cases, Time, Attributes
Set 7 Cases, Time, Attributes
Set 8 Cases, Time, Attributes
Set 9 Cases, Time, Attributes
Set 10 Cases, Time, Attributes
Set 11 Cases, Time, Attributes
Set 12 Cases, Time, Attributes
Set 13 Cases, Time, Attributes
Set 14 Cases, Time, Attributes
Set 15 Cases, Time, Attributes
Set 16 Cases, Time, Attributes

Finally, you'll need the answers to tell whether you are right.

The coordinates for the data
Set 1 Time, Attributes
Set 2 Time, Attributes
Set 3 Time, Attributes
Set 4 Time, Attributes
Set 5 Time, Attributes
Set 6 Time, Attributes
Set 7 Time, Attributes
Set 8 Time, Attributes
Set 9 Time, Attributes
Set 10 Time, Attributes
