Geography 481 Intro to GIS
Project Two: Mapping Continuous Data


This project introduces the concept of thematic mapping using continuous data read from the attribute database. Unlike the geology map, where you set patterns for discrete categories of data, a map of continuous data requires the use of data ranges. These ranges provide a generalized view of the underlying variable in which similar values are collapsed into a small number of categories. When you work with classified data you exchange the detail of the original values for a "big picture" view of the data. How you distill the data into classes is itself a form of analysis that can either illuminate or hide important spatial patterns.

In GIS, the relationship between cartography and analysis is a close one. In many cases, you can analyze data by simplifying or classifying your map into a small number of categories so that spatial patterns become more apparent.  If care is not taken by the GIS analyst/cartographer (that's you!), the resulting maps may not communicate effectively, or worse, may offer a misleading interpretation of the data. This project introduces alternative techniques for classifying continuous data, each of which conveys different information to the map user. In the exercise, you will produce different views of population density data from the U.S. You will see that there is no “correct” view; different approaches are useful for conveying different aspects of the data distribution.

 


Setup


Creating a Map of a Continuous Variable

There are three basic steps to mapping a continuous variable: 1) identify the variable to map, 2) set up the data classification ranges, and 3) assign symbology to the individual classes.

The default classification is based on 'natural breaks' with 5 classes. We'll stay with these for now.

As you assign colors and shading to continuous data, be certain to maintain the intensity gradient from low values to high values.

There are many default settings in ArcMap and most are adequate. However, you may wish to modify these settings. Throughout the exercises in this course you will be instructed to change various settings that might have seemed just fine to you. The purpose is to introduce you to the mind-numbing array of options available. The following is one such instance.

The symbols used to display your data have certain properties, two of which are the outline and the fill. These properties can be modified for each category individually or globally, that is, all at once. We are going to try a global change. The default outline color for each category is gray. Let's make them all black.

With the Layer Properties dialog still open:

BUG ALERT: There was a bug in version 10.4 that seems to have made its way to the current version.

So, instead of this:

Do this:

 

The Symbol Selector dialog will look the same if you were only changing one symbol, but in this case, the change will be applied to all symbols in the layer.

When you are finished, close the Properties dialog and view the map. The results are difficult to interpret because you didn't control for the different areas of the individual states. You will correct that shortcoming by creating a map of population density.


Adding a New Field to the Database

When polygons have different sizes, it is usually better to map "density" of a variable rather than "absolute values". But does your database contain population density data? Click on any state with the "Identify" tool (the “i” icon) and note the information in the database. You will see data for area (Sqmiles) and population (Pop2020), but none for population density. However, since density is defined as population divided by area, you can add a new density field to the database, calculate density values, and map the newly created data.

 

The first step is to add a new Popden field to the database.

Now define and add the new field:

In the table, note the new field added to the right of the existing fields. The next operation is to calculate data into the new field.

Note: part of the expression is already provided for you ("Popden ="). You will provide the rest of the expression by clicking on the appropriate fields and operators:

You have now created a formula for populating the population density field.

If the new data are missing or incomplete, try to figure out what you did wrong and rerun the operation. Since you can't "undo" this operation, you will have to delete the field.
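Note: if you are comfortable with scripting, the same two steps can also be run from the Python window in ArcMap. The following is only a sketch; it assumes your layer is named "States" in the table of contents and that the fields are named Pop2020 and Sqmiles as described above, so adjust the names to match your own data.

    import arcpy

    # Add a double-precision field to hold population density
    arcpy.AddField_management("States", "Popden", "DOUBLE")

    # Populate it: density = population / area
    # (the expression uses the Python parser's !field! syntax)
    arcpy.CalculateField_management("States", "Popden",
                                    "!Pop2020! / !Sqmiles!", "PYTHON_9.3")

Either way, the result is the same Popden field you will map in the next steps.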

Note: Some of you with previous GIS experience know that you can also map density by "normalizing" your data. While your map will look the same, creating a new field in the database allows you to search, sort, and perform various analyses based on this new field. Sometimes you need this added capability, sometimes you don't.


Creating an Equal Interval Map of Population Density

For your first population density map, you will use an equal interval classification method. The equal interval method provides a view of the data which attempts to preserve the original data ratios. Consider the following nine data values:

4, 5, 6, 14, 15, 16, 44, 45, 46

The difference between a high value like 45 and a low one like 5 (40 units) is four times the difference between a moderately low value like 15 and a low one like 5 (10 units). Now group them into five classes: 0 - 9, 10 - 19, 20 - 29, 30 - 39, 40 - 49. The original values 4, 5, and 6 are now combined into Class One; 14, 15, and 16 are combined into Class Two; 44, 45, and 46 are combined into Class Five. In effect, the three lowest original values are now all treated as if they have the value 1, the three moderately low original values are all treated as if they have the value 2, and the three highest original values are all treated as if they have the value 5. The interval of 4 classes between the highest (class 5) and the lowest (class 1) is still four times the interval of 1 class between the moderately low (class 2) and the lowest (class 1). The moderately low values from the original distribution are still closer to the low end of the new scale than they are to the high end. For this reason, equal interval classifications are viewed as the least "biased" approach to grouping values.

What the reclassification eliminates is our ability to distinguish minor differences among nearby similar values. The original values 4, 5 and 6 are now all "ones"; 14, 15 and 16 are now all "twos", etc. We are willing to give up those minor distinctions in exchange for a clearer view of the "big picture". The original distribution had an equal number of low values (3), moderately low values (3), and high values (3). It had no middle values and no moderately high values. The reclassification clarifies that relationship, permitting us to see patterns in the data distribution that might otherwise have been hidden.
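To make the arithmetic concrete, here is a small Python sketch (not part of the exercise) that assigns the nine sample values to the five classes used above. Keep in mind that ArcMap computes its breaks from the actual minimum and maximum of your data rather than from the round numbers used here.

    # Equal interval classification of the nine sample values into the
    # five classes 0-9, 10-19, 20-29, 30-39, 40-49 (each 10 units wide)
    values = [4, 5, 6, 14, 15, 16, 44, 45, 46]

    def equal_interval_class(v, width=10):
        """Return the 1-based class number for value v."""
        return v // width + 1

    print([(v, equal_interval_class(v)) for v in values])
    # [(4, 1), (5, 1), (6, 1), (14, 2), (15, 2), (16, 2), (44, 5), (45, 5), (46, 5)]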

You can create an equal interval map as follows:

Close the Layer Properties dialog and note the resulting map: it appears to be solid yellow. The original distribution of data values must be highly skewed to produce this view. Let's take a look.


Viewing the Distribution of Population Density Values

Just as a "Data View" provides a means of displaying geographic information in ArcMap, a "Table" view provides the mechanism for displaying tabular attribute information. You can see the actual distribution of data values by opening a layer's table and sorting the values from lowest to highest.

Scroll down while noting the density values. The lowest value is approximately 6 while the highest (District of Columbia) is over 10,400. Are the remaining values evenly distributed between those values? Do they fall more toward the low end or the high end? Are they clustered around the middle?

It should be apparent that the data are skewed toward the low end of the data range. There are a large number of small values and one very large value. Because the polygon with that one very large value has such a small area, it barely shows up, if at all, at the scale of the US map. Thus the map appeared to be monochrome yellow. The map produced by the equal interval approach does accurately reflect the underlying data distribution. Unfortunately, it also hides important variation within the vast group of values (48 of the 49) at the low end of the equal interval scale.
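Using the approximate figures above, you can see why: five equal-interval classes spanning roughly 6 to 10,400 are each about (10,400 - 6) / 5, or roughly 2,080 units wide, so the lowest class runs from about 6 to about 2,085 and contains every observation except the District of Columbia.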


Creating a Quantile View of the Data

The concept of "Quantiles" provides an alternative method for classifying data. Instead of dividing the data into classes based on equal data intervals, the quantiles approach divides the data into classes based on equal numbers of observations. If you wanted to divide 100 observations into four quantiles, the first class would contain the 25 lowest values, the second class the next 25 lowest values, and so on. Quantiles provide a useful view of skewed data; class value ranges tend to be small where data values are clustered and large where data values are spread out. This preserves some of the local variation between adjacent classes, though it can give a misleading picture of variation across the entire data range.
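Here is a companion Python sketch (again, not part of the exercise) showing how quantile classes fall for the same nine sample values used earlier; for simplicity it uses three classes so that each class holds exactly three observations.

    # Quantile classification: equal numbers of observations per class
    values = sorted([4, 5, 6, 14, 15, 16, 44, 45, 46])
    num_classes = 3
    per_class = len(values) // num_classes           # 3 observations per class

    classes = [values[i * per_class:(i + 1) * per_class] for i in range(num_classes)]
    print(classes)                                   # [[4, 5, 6], [14, 15, 16], [44, 45, 46]]

    # The upper break of each class is its largest member
    print([c[-1] for c in classes])                  # [6, 16, 46]

Notice that every class holds the same number of observations no matter how wide its value range has to be, which is exactly what makes quantiles useful for skewed data.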

To produce a five-class quantile map:

Let's take a look at some of the other items in the Classification dialog window. In the Classification Statistics box in the upper right you can see the number of observations (count), the range (min, max), and other information. The histogram is particularly useful. As you can see, with the exception of Washington, D.C., all of the states are in the lowest category.

Quantile classifications are often used as a first approximation for systems based on increasing (or decreasing) intervals. To produce a more readable map, you can edit the range limits so that class breaks occur at round numbers. This "Manual" classification will change the categories of some states, but may make the map and data ranges easier to interpret.

 


Modifying the Quantile classification using the Manual method

In this section you will manually change the range limits to create easier-to-read categories. Note: once you do this, your classification scheme will no longer be quantile or equal interval. It will be, as they say, user-defined.

Highlight the appropriate class cell (Range) in the Layer Properties and enter the following range limits (note: you cannot change the lowest value in the data set; it is easiest to set the upper limit of the lowest range and proceed, in order, to the highest, setting the upper limit each time; the software will automatically set the lower range limit based on the upper limit of the previous category):

While we are at it, we might as well get rid of all those decimal places:

Note that only the labels reported in the legend have been rounded. The actual values dividing the categories are unchanged.

Notice how different the map looks. In the equal interval version, nearly all the states appeared to have relatively low values compared to the extremely high value for the District of Columbia. In the quantile version, the uniqueness of the District of Columbia is sacrificed so that you can better see differences among the remaining 48 states. In this third, manual classification, we have created arbitrary class breaks at neat round intervals.


Using a Logical Filter to Redefine a Layer

One reason the data range is so skewed is that the District of Columbia is not actually a "state," whereas the other 48 polygons are. Since it is not a state, it is appropriate to exclude it from further consideration by applying a logical filter to your data. The filter acts like a true/false test: only the observations that evaluate as true are included in subsequent mapping operations. Once the District of Columbia is removed from the analysis, you should get a clearer view of the population densities of the actual "states".
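In ArcMap this kind of filter is applied as a Definition Query on the layer (Layer Properties > Definition Query tab), written as a SQL where clause. The sketch below shows the equivalent from the Python window; it assumes the layer is named "States" and that the field holding state names is called STATE_NAME, so check your own attribute table and table of contents for the actual names.

    import arcpy

    # Grab the current map document and the States layer
    mxd = arcpy.mapping.MapDocument("CURRENT")
    layer = arcpy.mapping.ListLayers(mxd, "States")[0]

    # Keep everything that is NOT the District of Columbia
    layer.definitionQuery = "STATE_NAME <> 'District of Columbia'"
    arcpy.RefreshActiveView()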

The revised view (minus the District of Columbia) provides a fourth valid picture of “state” densities, one that shows the cluster of higher densities in the Northeast while preserving the density ratios among the 48 states.

There is no "right" or "wrong" classification method. Your goal is to convey accurate, unbiased information about the distribution of data values. The Equal Interval approach maintains the numerical relationships among all the values, while a Quantile approach works well in situations where a small number of unusual values would otherwise hide important variation among the vast majority of observations. With either method, you must be certain that you have selected the correct target group of observations. It is up to you to understand your data and to choose an appropriate classification method based on the purpose of your map.


Final Product

On one (1) "printed" page, please display, in a cartographically pleasing manner, both an equal interval map and a quantile map of the 48 conterminous states. This process can be a little confusing at first, so follow along carefully. When you create a map layout, you are adding map elements to a page. One element is the data frame. It turns out you can add as many data frames as you like. Let's try it:

Notice that your map appears blank and a new data frame, called "New Data Frame," has been added to the TOC. Your map appears blank because there aren't any data layers in the data frame... yet.

You can toggle between data frames by making them "active".

The name of the active data frame appears in bold in the TOC. Right-clicking on the data frame name also allows you to change the data frame name to something more descriptive.

Note: You can also add data frames in the Layout View. In many instances I find this easier. You can also copy/paste data frames. I find this technique especially helpful in many circumstances.

 Configure your maps to meet the following criteria:

Switch to layout view by clicking the layout view button, if you haven't already. Notice that you have two data frames on your layout this time. Selecting a data frame in the layout view also activates that data frame in the TOC. When you add legends, you need to have the appropriate data frame active.

Finally, "print" your layout to Adobe PDF and then submit it using the appropriate link on Canvas.

Get "excited!"

If you would like "extra" excitement, try mapping the Census data for Fullerton included in the "Extra" directory. (optional)


Last modified 09/07/2021