Be honest: did you glance at the title and read it as “meteorology”? Or did you see “metrology” and assume (quite plausibly) that I had misspelled meteorology? Given the recent weather, you can be forgiven.
Metrology is the science of measurement, a task that a great many of us at TNC perform with surprising frequency. (Witness the effort to measure resilience in the November 2012 issue of Chronicles alone.) Think of some of the things that we might measure in a conservation planning effort: disturbance, viability, condition, connectivity, intactness, risk, cost, biodiversity, threat, opportunity, service, etc. But although assigning numbers to things is an everyday Conservancy activity, we violate basic rules of metrology almost as frequently. Before you skip a few pages on the reasonable premise that this is just Eddie banging on about planning again, consider that at the very least I’m hoping to license you to add another expertise to your resume.
Natural vs. Constructed Scales
In conservation, our main purpose for measuring things is to compare them, generally to make decisions about which activities we should prioritize and where. Sometimes the things we want to measure have natural scales; these are the easy ones. Natural scales are obvious, pre-existing ways to measure something: stream flow as volume per unit time (m³/second), populations by number of individuals, cost in dollars. Natural scales are great because they are relatively objective; two people should be able to measure the same thing and get the same number.
Frequently, however, we want to measure things — such as resilience or disturbance — that do not have natural scales. In these cases, we need to use constructed scales.
We can construct a scale to measure anything. This is where many conservation scientists demonstrate their skill as metrologists. For instance, we might assess the disturbance to different areas or habitats in a region on a scale of 1-7, or alignment of a strategy or geography with TNC’s expertise on a scale of 1-4. Constructed scales can even be simple linguistic interpretations (e.g., threat classified as “high,” “medium,” or “low”) that are subsequently related to numerical values (e.g., high = 3, medium = 2, low = 1). The basic premise of constructed scales is that the measurement reflects underlying empirical relationships in the thing we are measuring.
Constructed scales allow us to measure things for which there are neither natural scales nor established data. They also allow us to integrate data on a number of variables and from a variety of sources, including, in many cases, a good degree of expert judgement. These strengths make constructed scales really useful in conservation.
The Potential Issue with Constructed Scales
But the scores assigned to things on constructed scales are essentially arbitrary — there is no objective reason why a relatively undisturbed habitat should be given a score of 4 rather than 5, for example. What these constructed scales typically represent is a set of ordinal numbers. They tell us that a score of 2 is better than a score of 1 and worse than a score of 3.
If we restrict our interpretation of such scales to simple ordinal comparisons between alternatives (e.g., alternative X is better than alternative Y for thing Z), then the arbitrary nature of the numbers is not problematic. However, because ordinal numbers do not tell us how much better 2 is than 1, constructed ordinal scales become an issue when we try to perform any arithmetic on them, such as adding scores together or taking the mean across a number of scores. Performing this sort of math on an ordinal scale assumes a strict relationship between the numbers (that the step from 1 to 2 is worth the same as the step from 3 to 4, or even that 4 is twice as good as 2) that the constructed scale might never have possessed.
Yet we perform math on our constructed scales all the time. Take the Conservation Action Planning (CAP) workbook or the software Miradi. To help compare target viability (amongst other things), both tools combine measurements of size, condition and landscape context using the following scale: Very Good = 4, Good = 3.5, Fair = 2.5 and Poor = 1. The overall rank is given by the arithmetic mean of the scores for these three categories.
To illustrate the problem with doing this, consider two habitats, A and B. Habitat A receives three scores of Fair, whereas Habitat B receives two scores of Good and one of Poor. Taking the arithmetic mean, Habitat B (total of 8, mean of 2.67) would be ranked above Habitat A (total of 7.5, mean of 2.5). But if we adjusted our choice of scale such that Good was worth 3 rather than 3.5, Habitat A (still a total of 7.5, mean of 2.5) would now be ranked above Habitat B (total of 7, mean of 2.33). As Wolman (2006) eloquently puts it in an article on measurement theory: the “truth or falsity of results derived from measurements should not depend on a fortuitous choice of scale.”
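For anyone who wants to check the arithmetic, the reversal above can be reproduced in a few lines of Python. This is only a sketch: the function name rank_habitats and the dictionary layout are my own illustration, not how the CAP workbook or Miradi actually compute viability. Only the scale values and the two habitats come from the example.

```python
from statistics import mean

def rank_habitats(scale, ratings):
    """Return (habitat names sorted best-first, mean score per habitat).

    `scale` maps a rating label to a number; `ratings` maps a habitat name
    to its labels for size, condition and landscape context.
    """
    scores = {
        name: mean(scale[label] for label in labels)
        for name, labels in ratings.items()
    }
    order = sorted(scores, key=scores.get, reverse=True)
    return order, scores

# The two habitats from the example in the text.
ratings = {
    "A": ["Fair", "Fair", "Fair"],
    "B": ["Good", "Good", "Poor"],
}

# Scale values quoted in the text for the CAP workbook and Miradi.
cap_scale = {"Very Good": 4, "Good": 3.5, "Fair": 2.5, "Poor": 1}

# Same ordering of labels, but with Good worth 3 instead of 3.5.
alt_scale = {**cap_scale, "Good": 3}

order_cap, scores_cap = rank_habitats(cap_scale, ratings)
order_alt, scores_alt = rank_habitats(alt_scale, ratings)

print("Good = 3.5:", order_cap, scores_cap)
print("Good = 3.0:", order_alt, scores_alt)
```

Both scales agree that Very Good > Good > Fair > Poor, so the ordinal information is identical. Only the arbitrary spacing between the labels differs, and that alone is enough to swap the two habitats.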
The above example shows how easily basic rules of metrology can be violated and the results rendered somewhat arbitrary. We should improve our science related to measurement, especially as measurement is so often the place where our great science meets actual management decisions. Here are some very simple ways to improve your measurement practices:
• Recognize that you are effectively a metrologist and take pride in your expertise.
• Be aware of the type of scale something is being measured on, what the numbers mean, and what sort of math you can admissibly perform on them.
• To check whether the math you are doing is reasonable for that scale, go back to the underlying data and ask if “4” is unambiguously (in other words, everyone would agree) twice as good as “2.”
• Where possible, use natural scales. Even if data in the logical natural scale don’t exist (say, for population numbers), ask experts to give you estimates in the natural scale rather than a constructed scale.
• If you need to construct a scale and measure things on it, do so in a way that preserves interval relationships. This might require using a more finely resolved scale, say 0–100 rather than 1–4.
• If things need to be combined, normalize rather than convert to constructed scales. Converting to a constructed scale usually just loses information.
• Consider multiplication rather than addition. Multiplying has the interpretation of weighting one thing by another thing and can avoid some of the issues of meaningfulness that come with adding or averaging.
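The last two suggestions can be sketched in Python as follows. Everything here is an assumption made for illustration: the sites, the numbers, the helper names normalize_by_max and combined_ranking, and the choice to normalize by dividing by the largest value (one of several possible normalizations, chosen because it preserves the ratios between sites).

```python
def normalize_by_max(values):
    """Rescale natural-scale values to 0..1 by dividing by the largest value.

    Ratios between sites are unchanged, so no information is thrown away.
    """
    top = max(values.values())
    return {site: v / top for site, v in values.items()}

def combined_ranking(a, b):
    """Rank sites best-first by the product of two normalized measures."""
    na, nb = normalize_by_max(a), normalize_by_max(b)
    product = {site: na[site] * nb[site] for site in a}
    return sorted(product, key=product.get, reverse=True)

# Hypothetical expert estimates on natural scales (made-up numbers).
population = {"X": 1200, "Y": 300, "Z": 900}        # individuals
habitat_ha = {"X": 50.0, "Y": 400.0, "Z": 150.0}    # hectares

ranking_ha = combined_ranking(population, habitat_ha)

# Re-express habitat in square kilometres: a different, equally valid unit.
habitat_km2 = {site: ha / 100 for site, ha in habitat_ha.items()}
ranking_km2 = combined_ranking(population, habitat_km2)

print(ranking_ha)
print(ranking_km2)
```

The point of the second call is that switching from hectares to square kilometres leaves the ranking untouched. With natural scales, normalization and multiplication, the result does not hinge on a fortuitous choice of scale.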
So update your CVs. And keep measuring.
Wolman, A. G. 2006. Measurement and meaningfulness in conservation science. Conservation Biology 20:1626-1634.