01 - Creating Basics Maps (Covid-19)
This tutorial will guide you through the necessary steps to create an incident-rate Covid-19 map of US counties.

Note: If you haven’t downloaded and installed the program, you can find the instructions here. You can also find QGIS’s documentation here.

Datasets

To create these maps we will be using the following three datasets:

A packaged file of the above data can be found here. Note that this package may contain partial datasets meant for this tutorial. To get the full datasets please refer to the links above.

Adding Layers

The first step in creating a basic map is to open QGIS and add the counties and state layers you downloaded:

  • To add shapefiles click on the Add Vector Layer button. Other types of data, including rasters, csv (comma separated values) and postGIS, will be added using the other buttons.

    thumb

  • Start by adding the US Counties (20m) file.

  • Under Soruce, click on the three dots and make sure you select the file with the extension .shp. Remember that a shapefile is actually made up of 5 or 6 individual files with different extensions. Normally, these individual files are the following:

    • .shp - The main file that stores the feature geometry (required).

    • .shx - The index file that stores the index of the feature geometry (required).

    • .dbf - The dBASE table that stores the attribute information of features (required).

    • .sbn and .sbx - The files that store the spatial index of features (these might get corrupted, see note at the end of this tutorial).

    • .prj - The file that stores the coordinate system information.

    • For more information on these extensions and others see this explanation by ESRI.

  • Depending on the settings of your QGIS, the program will either load the file automatically and change your project’s CRS (coordinate reference system) based on the CRS of the file you just added, or ask you to Select a Transformation for your file. If you do get the prompt to select the transformation, we suggest you choose the default transformation. Unless you know exactly what transformation you want to use, your best bet is probably to go with the one QGIS is suggesting. In any case, at any point you can change the project’s CRS to fit your needs.

  • Once you’ve chosen the transformation (if prompted) close the window where you selected the vector file to add.

  • Next, let’s change the project’s CRS to the standard US projection:

    • Click on the button that says EPSG at the bottom-right corner of the map. This will bring up the Project Properties panel with the CRS tab open.

    • In the Filter prompt, type USA Albers Equal Area. This will bring up a couple of options in the Predefined Coordinate Reference Systems section. Choose the USA_Contiguous_Albers_Equal_Area_Conic_USGS_version and click OK. This will change the way the map looks, but it won’t alter the data itself or what you can do with it.

  • Before:

    Default CRS

  • After:

    USA Contiguous Albers Equal Area Conic

  • Now add the US states layer.

  • After you’ve added the US states layer you can re-arrange your layers. In the layer panel drag the US states layer above the US counties layer.

  • Finally, so that you can actually see both layers at the same time let’s quickly change the symbology of the US states layer:

    • Double-click on the US states layer and go to the Symbology tab.

    • There, highlight the Simple Fill symbol, under the Fill symbol.

    • As Fill style choose No brush.

    • For Stroke color choose white (#ffffff in `HTML notation).

    • And for Stroke width choose 0.5 (with points as the unit).

    States style

  • You should now see the states as a white outline and the counties underneath.

Adding a CSV File

Next, let’s add the Covid-19 dataset. This is a csv (comma separated values) file, with no geographic information: it doesn’t contain coordinates or geometry. Instead, every row represents a county and is identified with a FIPS (Federal Information Processing Standards) code. We will join the Covid-19 data to the US counties, by matching the FIPS code to the GEOID code that’s already in the counties file.

  • First, we need to create a “metadata” file that will tell QGIS exactly how to read the csv file. In QGIS this file is called a csvt and it contains the type of data for each column in the csv file. The different types of data your fields can take are:

    • String - Represents text

    • Integer - Represents whole numbers

    • Real - Represents both negative and positive numbers, with decimal points

    • Date - Date in the format YYYY-MM-DD

    • Time - Time in the format HH:MM:SS+nn

    • DateTime - Date and time in the format YYYY-MM-DD HH:MM:SS+nn

  • For every column in our csv file we need to specify what type the data is in:

    • In your text editor (Sublime, Atom, or VS Code), open a new file.

    • On the new file write (all in one line):

      "Date","String","String","String","Integer","Integer","Integer","Integer","Integer","Integer"
      
    • Note that every item is in double quotes and separated by a comma. Also, even though some columns look like numbers in the original csv file (like the FIPS code), they should actually be treated as text (and maintain their leading zeros). Since we are using this field to join the Covid-19 data to the US counties, it is very important that it comes in in the right format. If we have one file with text and another with integers or real numbers, the QGIS won’t be able to match them.

  • If you are working on Mac’s TextEdit you need to format your file as ‘Plain Text’. To do this click on Format and then Make Plain Text. This will change your file from an .rtf to a simple .txt.

  • Save your file with the same name as the csv file it is “annotating” but with a different extension. It is important to do this so that QGIS understands that this csvt file corresponds to the other csv file. In both Windows Notepad and in Mac TextEdit you need to manually type the extension (.csvt) and in TextEdit you need to un-check the option that says ‘If no extension is provided, use .txt’.

  • This file should be saved as us-counties.csvt.

  • Once you have created the csvt file you are ready to import the csv file:

    • Click on the Add Delimited Text Layer button (the comma with the + sign) and select the csv file. Remember, the csvt file should be located in the same folder where he csv file is and should have the same name (the only difference being the extension).

    • Make sure the file format is CSV and that under Geometry Definition you select No geometry (attribute only table). If we were importing a csv file with coordinates or geometry we would select one of the other two options.

    Import csv options

    • Hit Add and then Close. You should now see the csv file in the layers menu.
  • To verify that the data was correctly read, right-click on the Covid-19 data table and choose Properties. Then navigate to the Fields tab. There you should see the different fields with the right Type associated with them: QString for FIPS and double for Incident_Rate.

Joining the Two Datasets

The most important thing to do before we can join the two datasets is to verify that the two columns we will use to create the join (GEOID and FIPS) match each other. However, if you open the attribute table of the US counties dataset (by right-clicking on the layer and selecting Open Attribute Table) you will see that the GEOID column always have 5 digits (sometimes the first digit is a leading zero), whereas the FIPS column sometimes has 4 and sometimes 5 digits.

Therefore, the first thing we need to do is to add a leading zero to the FIPS column in the Covid-19 dataset:

  • First, open the Covid-19 dataset Attribute Table.

  • At the top, click on the Open field calculator button (the small abacus).

  • Next, in the Output field name option, type GEOID.

  • Change the Output field type to Text (string).

  • And then in the Expression box type lpad("FIPS",5,'0'). This invokes the left pad function, taking the values from the FIPS column, and making sure they always have 5 digits, adding a 0 at the left if necessary.

Left pad function

  • Click OK. If you now move to the left hand side of the Covid-19 Attribute Table you should see the new column (called GEOID) and it should always have 5 digits (sometimes with a leading zero). Note that many rows won’t have either GEOID or a FIPS code. This is because in addition to data for US counties that dataset also contains data for the rest of the world.

Now that the two fields are in the same format, we can actually create the join:

  • Right-click on the US counties layer and choose Properties. Then navigate to the Joins tab.

  • There, click on the green plus (+) sign at the bottom of the window.

  • Select the Covid-19 data as the Join layer and GEOID as the Join field. And select GEOID as the Target field.

  • To make field names shorter check the Custom field name prefix and type Covid_ in the field below.

  • Your join menu should look something like this:

    Join menu

  • Click OK and OK again to close the Properties panel.

  • To verify that the join happened correctly, open the attribute table of the US county layer and scroll to the left. There you should see the new fields (labeled Covid_...) with data in them.

Basic Symbology

Symbology is one of the most important concepts in mapping. At its most basic level, symbology stands for changing the color, line weight, size or outline of a layer. However, and more importantly, it also means changing the appearance of a layer based on one or multiple of its attributes. In this tutorial we will classify and symbolize the data based on the values found in the Covid_Incidence_Rate field.

  • Double-click on the US counties layer and go to the Symbology tab.

  • Since the value we are using for the symbology is a continuous one, instead of a categorical one, click on the dropdown menu at the top and select Graduated.

  • Next, click on the dropdown menu for the Value field and select the one we wish to symbolize: Covid_Incident_Rate.

  • Select the color ramp you will use by clicking on the dropdown menu next to the Color ramp field. We have chosen the YlOrRd (yellow - orange - red), found in the All Color Ramps sub-menu inside the Color Ramp dropdown menu.

  • Now click on the Classify button at the bottom of the page.

  • You will notice that QGIS creates 5 classes. This default classification method is called Equal Interval. Before you hit OK click on the Symbol button, highlight the Simple Fill symbol (underneath the Fill symbol) and set the Stroke style to No Pen. This way you’ll be able to see the results of the classification method much better.

    Equal interval classification

  • Once you hit OK you will notice that with this classification method the counties with the highest Covid-19 incidence rates appear red, but that the majority of other counties are all grouped into the two lowest groups.

  • Open the Symbology panel again and click on the Histogram tab. Click on Load Values. Here, you can clearly see how this classification method is creating the buckets for your data. As you can see, most of the values are somewhere below 14,000, and they are all grouped into just two buckets, whereas the top three buckets contain very few counties.

    Equal interval histogram

  • Go back to the Classes tab and change the classification method to Natural Breaks (Jenks). You might get a warning saying that this method could take a long time. If you have a large dataset you should be careful. However, the dataset that we are using is small enough. Once the classification is done, go to the Histogram tab. You can see how QGIS has now changed how it separated the data and most of the counties with lower incident rates are spread between the first three buckets. The Natural Breaks (Jenks) methods attempts to minimize the difference within groups while maximizing the difference between groups. Many datasets will be best classified using this method.

    Natural breaks histogram

  • Click OK so you see how the map is now much more nuanced and rich. You still see the counties with the highest rates but there is differentiation among the ones with lower rates too.

  • Other methods of classification include Quantiles and Standard Deviations. Go ahead and explore them and see how they change our perception of the data on the map.

  • Finally, choose the Natural Breaks (Jenks) method. To add a bit more detail to the map, increase the number of classes to 7. This will add a few more buckets. However, note that increasing the number of classes by more than 8 doesn’t yield that much benefit as our eyes can’t really differentiate that well between too many shades of similar colors.

  • The last step is to adjust a bit the values to make them easier to understand. Double-click on each row and change the value to the following:

    • 0 - 5000

    • 5000 - 7500

    • 7500 - 10000

    • 10000 - 12500

    • 12500 - 15000

    • 15000 - 20000

    • 20000 - 35390 (or whatever the maximum value is)

    Classification adjusted

  • Click OK to close the properties panel.

Adding Labels

We will use the US states layer to add some labels to our map.

  • First, double-click on it to open its properties.

  • Under the Labels tab choose Single labels. And under Value choose the NAME column.

  • Under Text, select the appropriate font. In our case we are using Avenir

Label menu

  • Click OK to close the properties panel.

Print Layout

The Print Layout (formerly known as Print Composer) is where you will format your map for its final output. Here you will specify the output size, you will add a legend, a scale bar (if needed), a north arrow (if needed) and any additional text (titles, sources, explanations and credits). Although the Print Layout exists as its own window it will still be linked to the map Project we have been working on.

  • First, create a new Print Layout in Project, New Print Layout. Give it a custom name if you want, although this is not necessary.

  • Once you are in the Print Layout you need to add a new map. Think of it as if you had a blank piece of paper and you were adding a window onto the map you’ve been working on. That window is a link to your Project and if you change things in the Project those changes will still be reflected in the Print Layout.

  • To add a new map, click on the button Add new map on the left-hand panel and draw a rectangle on the blank page.

    Add new map button

  • Once you add the map you can adjust its size and position by dragging it from its corners.

  • You might notice that if you change the size of the map it doesn’t necessarily update. To avoid this, on the right-hand panel, where it says Main properties, click on Update preview.

  • To move the content inside the Print Layout (as opposed to the whole page) use the Move item content tool on the left-hand panel.

  • Next, you need to center and zoom in the map on the area you want to focus on. After you’ve centered the map adjust the scale by typing the desired value (18000000) on the right-hand panel, under Main properties / Scale.

  • If any of the colors or line weights seem too big or two small or not correct, you can always go back to the Project and change them there. When you return to your Print Layout you can update your preview and the changes will be reflected.

    • Currently, the label font seems too large. Go back to the Project and set it to 8 points.

    • Once you return to the Print Layout click on the Update Map Preview button on the right-hand panel (the circle arrows) to refresh the map and see the changes.

  • Since this is a national map not made for measurements we don’t really need a scale bar or a north arrow. If you were to add them you would do so by clicking on their buttons on the left-hand panel and placing them on the map.

  • To add a legend click on Add Item Add legend and then click on the map. You will notice that QGIS automatically generates a line in the legend for every layer in the map. We only need the Covid-19 one, so we need to customize the legend:

    • On the right-hand panel, under Legend items uncheck Auto update and then select the layers that you don’t want in the legend and remove them with the ‘minus’ button.

    • To hide the title of the layer right-click on it on the right-hand panel and select Hidden.

    • Next, let’s customize the actual values in the layer. Double-click on each value (on the right hand panel) and type the correctly formatted value. For most entries we suggest just adding the comma separators. For the last entry you can add a greater than (>) sign.

    • Under Fonts change the Item font to Avenir with a size of 9 points.

    • Under Symbol change the Symbol width to 4.0 mm.

    • And under Spacing change the Legend title / Space below to 0.00 mm, the Space between symbols to 2.00 mm and the Symbol label space to 1.5 mm.

    • Finally, to make the background of the legend transparent go further down and uncheck the Background option.

  • Since we did not rotate the map we don’t need to add a north arrow. If you rotate your map you should add a north arrow. If you wanted to, you could add a north arrow by clicking on Add Item Add arrow.

  • Finally, to add a title and a ‘source’ text, click on the Add new label button on the left-hand panel and click on the map. Customize these labels by changing their color, size and location.

  • The last step is to export the map as a .pdf file. Use the Export as PDF button on the top toolbar and save your map.

  • Your final map should look something like this:

    thumb

Made with Jekyll & Tachyons | Source at github.com/pointsunknown | CC BY-SA 2.0