Some General Concerns You Might Have With...
1. Census Classifications
2. Sources of Error
3. Data Protection
4. Ethical Issues
1. Some problems with census classifications
- Census classification is only a descriptive tool of unknown precision. The real result
(i.e. a correct classification of the UK's residential areas) is unknown, so there is no "ground truth"
against which to measure the quality of any specific classification.
- Note also that it is a classification of areas (e.g. census EDs) and not a classification of people.
The people living in the EDs belonging to a particular type or cluster may have nothing in common with
the features of that type or cluster. There is an ever-present risk of ecological fallacy. Some
(occasionally many) of the key features of a cluster may not be shared by any of the people living there.
The converse also applies. Some people who match the features of cluster 27 may in fact live in areas
assigned to other clusters with very different characteristics.
- Classification is a highly subjective process, but this does not mean that it cannot be useful.
- There are various methodological problems involved in large data set classification e.g. effects of
non-normality, unequal spatial representativeness of census EDs, natural versus statistical clustering,
choice of method, suboptimality of the results, etc.
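The ecological fallacy point above can be made concrete with a toy example. The sketch below is purely illustrative: the ED names, resident types, and the "label by most common type" rule are all hypothetical stand-ins and are not how any real census classification is built. It shows how an area label can fail to describe a sizeable share of the area's residents, and how people matching a label can live in areas labelled otherwise.

```python
from collections import Counter

# Hypothetical toy data: ED identifier -> type of each resident.
eds = {
    "ED_A": ["student"] * 6 + ["retired"] * 4,
    "ED_B": ["retired"] * 7 + ["student"] * 3,
}

def label_area(residents):
    """Label an area by its most common resident type (a stand-in for a cluster label)."""
    return Counter(residents).most_common(1)[0][0]

def mismatch_share(residents, label):
    """Share of residents who do not match the label given to their area."""
    return sum(1 for r in residents if r != label) / len(residents)

labels = {ed: label_area(res) for ed, res in eds.items()}
mismatches = {ed: mismatch_share(res, labels[ed]) for ed, res in eds.items()}

# The converse: students who live in areas that are NOT labelled "student".
students_elsewhere = sum(
    res.count("student") for ed, res in eds.items() if labels[ed] != "student"
)
total_students = sum(res.count("student") for res in eds.values())

for ed in eds:
    print(ed, labels[ed], f"{mismatches[ed]:.0%} of residents do not match the label")
print(f"{students_elsewhere} of {total_students} students live outside 'student' areas")
```

In this made-up case a third or more of each area's residents are not described by the area label, even though each label is the "best" single description of its area.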
2. Sources of error in a census classification
A few sources of error are potentially important:
- Clusters are too big to provide a 'good' description - generalisation error
- An ED is assigned to the 'wrong' cluster - classification error
- An ED could be assigned to two or more different clusters with only a minuscule error - all or
nothing nature of the classification
- The ED to postcode linkage is incorrect - data error
- The area covered by either the ED or the postcode has changed since 1991, resulting in a totally wrong
assignment - data change error
- The census data was wrong, or so uncertain as to result in allocation to an incorrect cluster - data error
- A mistake in processing due to a software bug, a rounding error, or a pre-processing error at
any of the several different steps in the processing
- Cluster labelling inaccuracies
- Errors in the file being matched
- Ecological Fallacy errors: the cluster labels may not provide a good description of the people and
households living in a particular area.
- Anything else you can think of!
Classification is really a form of exploratory analysis. You may hold unrealistic expectations about
what it can do!
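The "all or nothing" problem listed above can be sketched in a few lines. The example assumes, purely for illustration, that EDs are assigned to the nearest cluster centroid by Euclidean distance; the centroids, the two ED points, and the function names are hypothetical, and the actual classification method may differ. Reporting the gap between the best and second-best cluster is one simple way to flag borderline assignments.

```python
import math

# Hypothetical cluster centroids in a two-variable space
# (e.g. two standardised census indicators).
centroids = {
    "cluster_1": (0.0, 0.0),
    "cluster_2": (2.0, 0.0),
    "cluster_3": (0.0, 5.0),
}

def ranked_distances(point, centroids):
    """Return (cluster, distance) pairs sorted from nearest to farthest."""
    dists = [(name, math.dist(point, c)) for name, c in centroids.items()]
    return sorted(dists, key=lambda pair: pair[1])

def assign_with_margin(point, centroids):
    """Assign to the nearest centroid and report how close the runner-up was."""
    (best, d1), (second, d2) = ranked_distances(point, centroids)[:2]
    return best, second, d2 - d1

ed_clear = (0.1, 0.2)        # sits close to cluster_1 only
ed_borderline = (0.99, 0.0)  # almost equidistant from cluster_1 and cluster_2

for name, ed in [("clear", ed_clear), ("borderline", ed_borderline)]:
    best, second, margin = assign_with_margin(ed, centroids)
    print(f"{name}: assigned to {best}, runner-up {second}, margin {margin:.2f}")
```

Both EDs receive the same cluster code, yet the borderline one would switch clusters after a tiny change in its data. A hard assignment hides that difference unless the margin is kept alongside the code.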
3. What data protection concerns are there?
These are still unclear. The simplest view is that there are none because the objects of the classifications
(1991 census EDs) are collections of 50 plus people. Census small area statistics are not personal data and
hence not subject to Data Protection legislation.
An alternative view would be to argue that once a census (or geodemographic) classification code is applied
to an individual and decisions are made that impact on that person then Data Protection principles could be
applied to that cluster code as an item of personal data. This is probably very naïve. The same argument
would also apply to any code relating to geography added to personal data, e.g. town name or county. The
point to note here is that the cluster code is not personal data but geographical data: it is a measurement
relating to an areal object, and its use as a description of an individual is a gross ecological fallacy.
4. What ethical concerns may there be?
Goss (1995) in his deconstruction of geodemographic trade literature lists a number of complaints including:
- A cluster code reduces complex data about a person living in an area to a single number
- End-users and vendors widely ignore ecological fallacies and assume cluster labels really do
accurately describe reality and the features of the persons living there
- The use of geodemographic systems to target areas, whether for direct mail or planning, may result
in the areas slowly coming to match their profiles; indeed, people may move to areas that match where they want to be
- It is militaristic, heterosexual, male, and part of a surveillance mentality!