GIS AND DATA VISUALIZATION
By Philip Elder (MS, Senior Software Engineer, Esri)
Part One: Introduction
Hello everyone, and welcome to this module on Geographic Information Systems (GIS) and data visualization. Before we get started, I'd like to introduce myself. My name is Philip Elder, and I am a senior software engineer at Environmental Systems Research Institute, better known as Esri. I received my master's degree in GIS from San Diego State University with a focus on data dissemination, visualization, and user-centered design specifically related to web development. At Esri, I specialize in data engineering, data science, and user-centered design.
There are two main components of this module. The first is GIS: the state of the industry today, the types of problems that it solves, and the ways that it is used both by GIS professionals and researchers in other industries to gain a more robust understanding of the phenomena they're studying. The second component centers on data visualization - specifically, how, when, and why to visualize your data geographically. I will go through a series of examples representing 'the good, the bad, and the ugly' in terms of map design and data dissemination through maps. My goal at the end of that section is twofold. First, I want you to have a solid understanding of the elements that make either a good or a bad map, and how to differentiate between the two in the 'real world.' The second purpose is, simply put, to evangelize GIS and data visualization through maps as a holistic, useful tool that can be easily incorporated while conducting your research as well as disseminating and displaying your results.
Part Two: What is GIS?
At its core, a Geographic Information System (GIS) is a spatial system used to create, manage, analyze, and map all types of data. Beyond that, however, there are several approaches to conceptualizing GIS, three of which are outlined below.
GIS as a System
A GIS, as stated above, is a robust spatial-based system used to extract, transform, interpret, and disseminate data. GIS connects data to a map, integrating location data (where things are) with all types of descriptive information (what things are like there). This provides a foundation for mapping and analysis that is used in science and almost every industry. GIS helps users understand patterns, relationships, and geographic context. The benefits include improved communication and efficiency as well as better management and decision making. This is the most common way of thinking of a GIS.
GIS as a Science
GIS helps researchers and scientists answer questions by examining spatial relationships using geographic statistical methods. While its main purpose is to digest and analyze data collected across scientific disciplines, there is a science to the geographic study of data that allows for a wide variety of statistical methods (Spatial Autocorrelation, Cluster/Outlier, and, especially relevant to public health, Hot Spot Analysis), theories (Spatial Kriging, Central Place), and analyses (Spatial Regression) that can only be accessed through GIS.
GIS as a Service
GIS helps to communicate complex concepts and share massive datasets through easily digestible maps, graphics, and digital services. This subset of GIS tends to be the most commonly known and used. For example, using Google Maps to find directions to the nearest Thai restaurant is consuming a GIS service. Monitoring a dashboard containing information on the status of various power stations is consuming a GIS service. Even predicting election outcomes given voter turnout and population counts across counties involves a GIS service. While most heavy analytics occur well before a service is published, it remains one of the best ways to disseminate data and communicate results to the general public, to publication reviewers, or to fellow team members. Thus, GIS can prove to be essential to effective public health and social change communication.
Part Three: How is GIS used?
Now that we've covered a few ways in which one can contextualize GIS, let's discuss how it is used. Geography as a field touches most industries and fields of study, but there are six distinct ways we can use GIS in public health.
GIS is an excellent tool for identifying problems that might otherwise be difficult to pinpoint. A famous public health example of this, and perhaps the first published use of GIS in the modern western world, was Dr. John Snow's cholera outbreak maps. By overlaying cholera cases across London with locations of water pumps in the city, he was able to correctly recognize the source of the spread of the disease and influence relevant local policy, effectively curbing the spread of cholera. GIS is central to illuminating issues that are driven by geography.
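The overlay idea behind Snow's map can be sketched in a few lines of Python. This is only an illustration: the pump names and coordinates are made up (they are not Snow's data), and straight-line distance stands in for walking distance, which is an assumption.

```python
import math
from collections import Counter

# Hypothetical pump locations (x, y) on an arbitrary local grid, in meters.
pumps = {
    "Pump A": (0.0, 0.0),
    "Pump B": (400.0, 100.0),
    "Pump C": (150.0, 500.0),
}

# Hypothetical case locations on the same grid.
cases = [(20.0, 30.0), (-10.0, 45.0), (60.0, -20.0), (380.0, 120.0), (35.0, 80.0)]


def nearest_pump(case, pumps):
    """Return the name of the pump closest to a case (straight-line distance)."""
    return min(pumps, key=lambda name: math.dist(case, pumps[name]))


# Tally cases by their nearest pump to see whether one pump stands out.
cases_per_pump = Counter(nearest_pump(case, pumps) for case in cases)
for name, count in cases_per_pump.most_common():
    print(f"{name}: {count} cases")
```

A tally dominated by a single pump is the kind of pattern that is hard to see in a list of addresses and easy to see once cases and pumps share a map.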
Analysis Activity 1
Review this story map on the correlation between various manifestations of environmental inequality and redlining policies in U.S. cities. Think of a social- or health-related environmental phenomenon in your hometown that you might like to analyze. How would you do it? Does this phenomenon have a spatial element, and if it does, how could you leverage that spatial element to help you understand it better?
GIS can also be used to monitor changes in a system over time. Income distribution and inequality across southern California, algae blooms in reservoirs along the Colorado River, and the spread of the popularity of hip-hop across neighborhoods in New York City in the 1970s are all examples of spatial elements that can be tracked and better understood through the use of GIS as a visualization and analysis tool. There are three main criteria used to assess change using GIS:
Does this phenomenon happen over an area?
Does it change over time?
Do those changes have consequences?
Analysis Activity 2
Perhaps the most compelling example of a present-day phenomenon that meets all three of these criteria is global warming. To get a better idea of the effectiveness of GIS in monitoring change, review this report on climate change. How might the different variables outlined in this story map (e.g., carbon dioxide levels, global temperatures) be incorporated into two maps, one that showcases the causes of climate change, and another that summarizes the consequences? Which variables would you choose, and how would you display them?
Manage and Respond to Events
GIS can deliver real-time or near-real-time situational awareness, which is crucial for public health and other community services, from fire and emergency response, to disease prevention, to maritime and land transport logistics. For any professional capacity in which rapid response is a key component of continued success, GIS plays an integral role in the identification and analysis of, and response to, a potential problem. In early 2020, geography and developed geographic systems were indispensable tools in helping epidemiologists and policy makers understand the initial spread of the COVID-19 virus, as illustrated by these maps.
Because an ounce of prevention is worth a pound of treatment, forecasting the effects of a phenomenon has become increasingly important over the last few decades in various fields. The ability to respond to a potentially catastrophic event swiftly and effectively remains crucial, but using GIS as a tool to predict the likelihood of an event and plan accordingly is essential to the preparation for that response and how helpful it will be in mitigating the worst effects of the event. For example, being able to model coastal flooding given projected sea level rises allows policy makers and response organizations to establish contingency plans and allocate resources to areas that will be most affected, potentially saving many more lives than would otherwise be the case.
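As a rough illustration of the coastal flooding example, the sketch below flags every cell of an elevation grid that sits at or below a projected sea level rise. The elevation values are hypothetical, and this simple threshold approach ignores whether a low cell is actually connected to the sea, so treat it as a first approximation rather than a real flood model.

```python
# Hypothetical elevation raster: each cell is meters above current sea level.
elevation = [
    [0.2, 0.5, 1.1, 2.4],
    [0.4, 0.9, 1.6, 2.9],
    [0.8, 1.3, 2.2, 3.5],
]


def flooded_cells(elevation, sea_level_rise_m):
    """Return a grid of booleans: True where a cell is at or below the projected rise."""
    return [[cell <= sea_level_rise_m for cell in row] for row in elevation]


def flooded_share(elevation, sea_level_rise_m):
    """Return the fraction of cells flagged as flooded for a given rise."""
    mask = flooded_cells(elevation, sea_level_rise_m)
    total = sum(len(row) for row in mask)
    flooded = sum(cell for row in mask for cell in row)
    return flooded / total


# Compare two hypothetical scenarios.
for rise in (0.5, 1.0):
    print(f"Rise of {rise} m: {flooded_share(elevation, rise):.0%} of cells affected")
```

Comparing scenarios side by side like this is what lets planners see which areas become exposed first as the projected rise increases.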
As alluded to in the previous section, GIS can facilitate priority setting for finite resources available to an organization. For example, a fire department in a city has to send two trucks from a station to an incident in the area. How long will that incident take, given historical data? What are the chances that a second incident will occur while those trucks are occupied? Should the department send a truck from another station to cover that area in the event of a hypothetical second incident? Through the use of event management and forecasting through a geographic information system, these issues can be confidently addressed. For another example of GIS as a tool to optimize prioritization and aid in decision making, take a look at this story map on the use of geospatial data to inform agricultural strategies toward sustainability and yield maximization.
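To make the second-incident question concrete, here is a minimal sketch. It assumes incidents in the station's area arrive independently at a constant average rate (a Poisson process), which is a simplifying assumption of mine and not something a real department should take for granted. The rate and duration are hypothetical.

```python
import math


def prob_second_incident(incidents_per_hour, busy_hours):
    """Probability of at least one new incident while the trucks are occupied,
    assuming incidents follow a Poisson process with a constant rate."""
    return 1.0 - math.exp(-incidents_per_hour * busy_hours)


# Hypothetical figures: 0.5 incidents per hour on average in this area,
# and historical data suggesting the trucks will be busy for 1.5 hours.
p = prob_second_incident(0.5, 1.5)
print(f"Chance of a second incident while trucks are occupied: {p:.1%}")
```

In practice the rate would come from historical incident data for that specific area and time of day, which is exactly the kind of location-aware input a GIS supplies.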
Finally, GIS can help its users understand trends in their data that might otherwise be overlooked in table format. Having the ability to visualize datasets over space can give the observer immediate insight into correlative relationships between phenomena. What happens to a population of a neighborhood when public transportation is expanded or contracted? How is climate change affecting biodiversity in different ecosystems? How, exactly, does avian migration look when mapped over time? While some trend analysis can be, and is, accomplished through traditional tabular data analysis, adding and analyzing the geographic element can provide an essential window into the root driver, or drivers, of the question being studied.
Analysis Activity 3
Think of the phenomenon you came up with in Analysis Activity 1 and expand on it. How would you like to glean insight into this phenomenon? Are you monitoring it as it changes over time, or potentially using your analysis to help respond to future events related to it? Would you attempt to use the data available to forecast future trends and set priorities accordingly, or are you simply trying to understand trends as they unfold? Think about how you would use the information available to you to enhance your understanding of this particular circumstance.
Part Four: How does GIS work?
Maps are the base component of any geographic information system, and the core upon which all other analysis, visualization, and dissemination is based. They serve as the geographic container for the data layers, analytical tools, and interactive components to be applied in your research. There are two main benefits to the basemap:
Accessibility: Maps are, comparatively speaking, a simple and easily understandable method of displaying data.
Shareability: Maps simplify data dissemination by being shareable across all platforms - from classic paper maps, to mobile devices, to immersive virtual environments, and every technological medium in between.
Data is the driving force behind the utility of GIS and, of course, the primary component in any visualization or analysis done within the geographic system. A GIS can incorporate and geo-locate (where applicable) many different data structures and types, but these structures tend to fall within three main categories:
Raster data: Raster data, simply put, is any dataset with pixels containing values. Satellite imagery, drone fly-overs, hot-and-cold spot analysis results, and elevation values detected using SONAR, Radar, or LiDAR are all examples of raster data.
Vector data: The vector data category contains anything 'drawn' on a map. While technological advancements in the industry have led to increasingly complex and fascinating data types, vector information tends to fall into three buckets: points (e.g., vehicle locations, maple trees, grocery stores), lines (e.g., roads, migration paths, jurisdictional borders), and polygons (e.g., neighborhoods, animal habitats, wildfire damage areas).
Table data: While tabular data may not necessarily have a spatial component, tables can be incorporated easily into a GIS through fields that may be joined to a geographic element. This incorporation is essential for a wide variety of applications, from field work analysis, to census data visualization, to sharing population survey results.
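The table join described above can be sketched as follows. The neighborhood polygons loosely follow the GeoJSON way of describing geometry, and every identifier, coordinate, and survey value here is hypothetical.

```python
# Vector layer: hypothetical neighborhood polygons, each with an id and a geometry.
neighborhoods = [
    {"id": "N1", "geometry": {"type": "Polygon",
                              "coordinates": [[(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]}},
    {"id": "N2", "geometry": {"type": "Polygon",
                              "coordinates": [[(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]]}},
    {"id": "N3", "geometry": {"type": "Polygon",
                              "coordinates": [[(2, 0), (3, 0), (3, 1), (2, 1), (2, 0)]]}},
]

# Table: hypothetical survey results with no geometry, only a shared key.
survey_table = [
    {"neighborhood_id": "N1", "respondents": 120, "pct_insured": 81.5},
    {"neighborhood_id": "N2", "respondents": 95, "pct_insured": 64.0},
]


def join_table(features, table, feature_key, table_key):
    """Attach each table row to the feature with the matching key.
    Features without a matching row are kept with empty properties."""
    lookup = {row[table_key]: row for row in table}
    joined = []
    for feature in features:
        row = lookup.get(feature[feature_key], {})
        attributes = {k: v for k, v in row.items() if k != table_key}
        joined.append({**feature, "properties": attributes})
    return joined


joined = join_table(neighborhoods, survey_table, "id", "neighborhood_id")
for feature in joined:
    print(feature["id"], feature["properties"])
```

Keeping unmatched features (here, N3) is a deliberate choice in this sketch: a neighborhood with no survey data still belongs on the map, where it can be shown as "no data" rather than silently disappearing.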
Spatial analysis lets you evaluate suitability and capability, estimate and predict, interpret and understand, and much more, lending new perspectives to your insight and decision making. The depth and complexity of this analysis depend heavily on the project at hand, and can range from simple applications (visualizing your data on a map and identifying problem areas), to more moderate (feeding a cloud of field-collected points into a Hot Spot Analysis tool to identify clusters of similar values), to complex (running Global Moran's I analysis on a vector dataset to establish a spatial autocorrelation value).
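For readers curious about what a Global Moran's I value actually measures, here is a minimal sketch of the statistic itself. It assumes binary weights (1 if two regions are neighbors, 0 otherwise), uses made-up regions and values, and leaves out the significance testing (z-score and p-value) that GIS tools normally report alongside the index.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary neighbor weights.

    values:    dict mapping region name -> observed value
    neighbors: dict mapping region name -> list of neighboring region names
               (list each pair in both directions)
    """
    names = list(values)
    n = len(names)
    mean = sum(values.values()) / n
    deviation = {name: values[name] - mean for name in names}

    cross_products = 0.0  # sum of w_ij * (x_i - mean) * (x_j - mean)
    weight_sum = 0.0      # sum of all weights w_ij
    for i in names:
        for j in neighbors.get(i, []):
            cross_products += deviation[i] * deviation[j]
            weight_sum += 1.0

    squared_deviations = sum(d * d for d in deviation.values())
    return (n / weight_sum) * (cross_products / squared_deviations)


# Four hypothetical regions in a row: A borders B, B borders C, C borders D.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

clustered = morans_i({"A": 10, "B": 12, "C": 30, "D": 32}, neighbors)
alternating = morans_i({"A": 10, "B": 30, "C": 10, "D": 30}, neighbors)
print(f"Similar values side by side: {clustered:.3f}")
print(f"High and low values alternating: {alternating:.3f}")
```

Positive values indicate that similar values tend to sit next to each other (clustering), while negative values indicate that neighbors tend to differ.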
Apps provide focused user experiences for getting work done and bringing GIS to life for everyone. GIS apps work virtually everywhere: on your mobile phones, tablets, in web browsers, and on desktops. These are the engines behind your data dissemination and, while modern focus tends to be on interactivity and web-based mapping, can also include more traditional methods, such as embedding static maps in research papers or printing them for field work.
Analysis Activity 4
Back to your hometown. What kind of maps would you use as your base - road networks, land use/classification, artistic grayscale? Would you incorporate any satellite imagery, or rely on vector data such as census parcels or roads to tell your story? Do you think this question could be best answered using complex geospatial analysis, or is the message you're trying to convey sufficiently communicated through data visualization? How would you get your message out to the public - interactive web applications, mobile apps, or static visuals? What kind of benefits and drawbacks exist for each of these approaches, given the phenomenon at hand?
Part Five: Data Visualization
Visualizing data is the most important element in communicating a spatial phenomenon to your audience. When done properly, a map can be a powerful tool in conveying a message; when done improperly, a map can be a misleading, deceptive, and even dangerous tool. Because the visualization component of geospatial analysis is designed to communicate a conclusion instantly and without detailed explanation to the end user, even simple decisions like feature size and color can have a massive effect on the conclusion said user reaches after examining the map. To alarm or frighten an audience, a cartographer might include hot tones, like red and orange, as frequently as possible throughout their finished product. To relax or invite an audience to interact with their map, they might not only include cool tones, but slow-moving animations and a simple layout, so as not to overwhelm the observer. To inflate the importance or pre-eminence of a particular phenomenon, they might increase the size of each point that represents it (election result mapping is a frequent victim of this last approach - see the first linked example in the "How to Visualize Data Geographically" section for an example). Because it is so easy to mislead or deceive an audience through data visualization, it is essential for the map creator to understand the nature of their data, establish a message they wish to send with their analysis, and carefully choose their colors, symbols, and overall design of their end product, so as to properly represent the phenomenon with which they're tasked. As we dive further into data visualization as a field, let's review three questions partially addressed in the above section: why, when, and how would we visualize geographic data?
Why to Visualize Data Geographically
We have largely covered the answer to this question. If a policy influencer wishes to make a point, data visualization is the best approach, and GIS is an excellent conduit through which to visualize data. If a researcher wishes to gain insight on their subject that doesn't seem readily apparent when looking through the tables in which data are stored, mapping and visualizing said data is an excellent approach to further understanding the subject. This is especially true if events need to be monitored and addressed in real time. Two examples that emphasize why data visualization is so important can be found in this Johns Hopkins COVID-19 tracker and this climate change effects forecaster.
When to Visualize Data Geographically
Simply put, whenever your data has a spatial element (e.g., neighborhoods, zip codes, counties, x/y coordinates), any spatial variation (e.g., temperature change, population count per square mile, economic inequality, access to essential resources or services), or any capability to move (e.g., human immigration/emigration, animal migration, tides, transportation and infrastructure), it should be, when possible, visualized geographically.
How to Visualize Data Geographically
Unfortunately, this question enters a field of study that is far too broad and complex to be summarized succinctly. That said, there are three steps to consider when visualizing your data:
Understanding your data: Is your information related to people, or land, and should you visualize it by population centers or pure land mass? Are you showing the current location of a species of bird or its migration, and should you use a point or a line accordingly? Make sure you understand the phenomenon you're communicating, and choose your symbols to meet your communication goals.
Coloring your data: Do you want to motivate behavior change through loss- or gain-framed messages, or simply demonstrate an interesting pattern? Does your subject already have a color generally associated with it (e.g., red for danger such as fire, green for safety), and would you like to employ that color as a main theme in your map? What about aesthetics - do you want your map to look pretty, or to make the viewer feel uncomfortable? As emphasized in the messages module, choosing the correct color scheme for your data is perhaps the most important step in data visualization, and not one to be taken lightly.
Interacting with your data: Do you want to create a static map, or would you prefer your audience be able to interact with your information? What audience segment(s) are you trying to reach with your message, and what is the level of technology they are comfortable with? Is there a genuine benefit to communicating your message interactively, or is that overkill? Knowing your audience, as well as your data, will help you in answering these questions and optimizing your final product.
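One symbol decision that is easy to get wrong is point size. A common cartographic convention is to make a circle's area, not its radius, proportional to the value it represents, so that a place with four times the population gets a circle with four times the area rather than sixteen times. The sketch below illustrates that convention; the function name, the 40-unit maximum radius, and the population figures are all hypothetical.

```python
import math


def circle_radius(value, max_value, max_radius=40.0):
    """Radius for a proportional symbol whose AREA scales with the value.

    Because area grows with the square of the radius, the radius must grow
    with the square root of the value.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical county populations.
populations = {"County A": 400_000, "County B": 100_000, "County C": 25_000}
largest = max(populations.values())

for county, population in populations.items():
    radius = circle_radius(population, largest)
    print(f"{county}: radius {radius:.1f}")
```

Scaling the radius directly by the value would visually exaggerate the larger counties, which is one of the ways a map can mislead without any of its underlying numbers being wrong.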
Analysis Activity 5
Go to this website and toggle between "Show Land Circles" and "Show Population Circles." Which method of data visualization do you think is most appropriate given the data studied? Which is more effective in convincing the audience?
Analysis Activity 6
Look at the map below. Is this map effective at accurately conveying the underlying data? Is it effective at eliciting an immediate response from the viewer? What do you think was the end goal of the cartographer who made this map?
Research Activity 1
Let's finally visualize your hometown phenomenon! Go to the ArcGIS Story Map, click Sign In to ArcGIS StoryMaps and create a free account by clicking on Create an ArcGIS public account, then create a basic story map to communicate your subject. You can add maps dynamically with points, lines, and polygons drawn on, or if you want to go the extra mile, you can find data on the Living Atlas data repository and try to add it to your maps. Your story can be as simple or as complex as you'd like; the goal is to give you an introduction to incorporating maps and visuals into your work.