The recent news that a deadly PG&E gas line explosion was being blamed in part on the lack of reliable information in a mapping system has raised some issues about risk and liability. If you haven't yet read the exposé by the San Francisco Chronicle, which asserts that the database and mapping system should have caught the defective seam that caused the explosion, it's critical reading. The piece mentions poor record keeping and process problems in converting paper records to digital form, but it also describes an apparent overreliance on technology to cut through issues of process and policy and provide trusted answers despite shoddy data gathering.
One of the more disturbing aspects of this story is that the mere existence of the mapping system seems to have been enough to spare the utility from investing in sensors to travel its pipelines and reliably measure the safety of those assets. The assertion was that the mapping system was enough, but clearly the existence of the system alone did not ensure that risk was being assessed accurately. The quality of data is of far greater importance than its mere existence, particularly in organizations where mapping and monitoring are tied closely to public safety.
In infrastructure, that tie between mapping and public safety is especially direct. A buried gas pipeline is certainly of concern, as the example above illustrates, but so too are electric cables, water lines, environmental hazards tied to industrial processes, and a whole host of other things we capture and monitor in our geographic information systems. The accuracy and precision of our maps of these potential hazards must be maintained with the utmost care, because there are clear consequences when assets fail or are disturbed by crews who were wrongly confident of their location.
While record-keeping standards are in place in many of these high-risk applications, the standards often haven't kept pace with technology, and the complexity of the systems makes them tough to regulate and monitor. Given the importance of the integrity of these data, a whole new level of quality assurance may be needed.
Among the key characteristics of data quality are accuracy, validity, reliability, timeliness, relevance, and completeness. Standards for each of these areas need to be considered, but there are always cases of compromise where data are deemed fit for purpose despite some limitation. By understanding that limitation, and making it clear to the users of the data, the data and the processes behind them can be improved over time to ensure greater validity.
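To make a few of those dimensions concrete, here is a minimal sketch in Python of record-level fitness-for-purpose checks covering completeness, validity, and timeliness. The field names, value domains, and thresholds are hypothetical, not drawn from any utility's actual schema.

    from datetime import date

    # Hypothetical pipeline-segment record schema; field names are illustrative only.
    REQUIRED_FIELDS = {"segment_id", "material", "install_year", "last_inspected"}
    VALID_MATERIALS = {"steel", "cast iron", "plastic"}

    def assess_record(record, today=date(2011, 1, 1)):
        """Return a list of quality flags for one asset record."""
        flags = []
        # Completeness: every required attribute must be present and non-empty.
        missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
        if missing:
            flags.append(f"incomplete: missing {missing}")
        # Validity: attribute values must fall within an expected domain.
        if record.get("material") and record["material"] not in VALID_MATERIALS:
            flags.append(f"invalid material: {record['material']!r}")
        if record.get("install_year") and not (1900 <= record["install_year"] <= today.year):
            flags.append(f"implausible install_year: {record['install_year']}")
        # Timeliness: flag records whose inspection data has gone stale.
        inspected = record.get("last_inspected")
        if inspected and (today - inspected).days > 5 * 365:
            flags.append("stale: last inspection more than 5 years ago")
        return flags

    record = {"segment_id": "132-A", "material": "steel",
              "install_year": 1956, "last_inspected": date(2002, 6, 1)}
    print(assess_record(record))  # -> ['stale: last inspection more than 5 years ago']

The point of returning flags rather than rejecting records outright is that a limitation, once surfaced, can be communicated to users and weighed against the intended use.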
Spatial data quality problems, like any data quality problems, can be widespread. In the case of spatial data, the problems can be geometric, topological, or attribute-based. Metadata, or data about data, is the means of profiling a dataset and understanding how it fits business requirements. In the case of utilities, industry groups may find both an interest and a need in helping one another with assessments and audits, to ensure that their data meet shared business rules and that critical infrastructure and public safety are assured.
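As a sketch of what basic geometric and topological checks might look like in code, the following uses the open-source shapely library; the geometries and the containment rule are made up for illustration.

    from shapely.geometry import LineString, Polygon
    from shapely.validation import explain_validity

    # Hypothetical geometries standing in for mapped assets; coordinates are made up.
    pipeline = LineString([(0, 0), (2, 2), (2, 0), (0, 2)])   # crosses itself
    parcel = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
    service_line = LineString([(1, 1), (3, 3), (6, 1)])       # strays past the parcel

    # Geometric check: a self-intersecting line usually indicates a digitizing error.
    if not pipeline.is_simple:
        print("pipeline self-intersects: suspect geometry")

    # Topological check: an asset recorded as lying inside a parcel should
    # actually fall within that parcel's boundary.
    if not parcel.contains(service_line):
        print("service line extends outside its recorded parcel")

    # For invalid polygons, explain_validity reports the reason in readable form.
    bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
    print(explain_validity(bowtie))  # e.g. "Self-intersection at or near point ..."

Attribute-based checks, like those sketched earlier, would run alongside these geometric and topological tests in any fuller audit.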
Maturity Hides Merit
A great deal of time and money has been spent collecting both engineering and spatial data over the past 20-25 years, but collection technologies and standards have changed throughout. That long span of collection may in fact be the greatest contributor to poor quality, particularly if details about how the data were collected are missing. With time, the pedigree of data becomes its most important element, because pedigree is what enables trusted integration with other data.
Quality assurance/quality control (QA/QC) is a painful process with spatial data because of spatial, temporal, and even semantic differences between datasets. Assessing data quality is a difficult task that gets harder as the amount of data, and the differences between datasets, escalate. In this era of big data, with sensors feeding in volumes of information, quality assurance needs to be performed frequently, so that the sheer volume of automatically processed data doesn't override the importance of quality.
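One way to keep quality from being overridden in automated, high-volume pipelines is to put a validation gate in front of ingestion, so that suspect readings are quarantined for human review rather than silently loaded. A rough sketch, again with hypothetical field names, units, and tolerances:

    # A rough sketch of a validation gate for a high-volume sensor feed.
    # Field names, units, and tolerances are hypothetical.
    ACCEPTED, QUARANTINED = [], []

    def gate(reading):
        """Route a sensor reading to the accepted or quarantined queue."""
        problems = []
        if reading.get("pressure_psi") is None:
            problems.append("missing pressure")
        elif not (0 <= reading["pressure_psi"] <= 1500):
            problems.append("pressure outside plausible range")
        if reading.get("sensor_id") is None:
            problems.append("unattributed reading")
        if problems:
            QUARANTINED.append((reading, problems))  # held for human review
        else:
            ACCEPTED.append(reading)                 # safe to load automatically

    for r in [{"sensor_id": "S7", "pressure_psi": 412.0},
              {"sensor_id": "S8", "pressure_psi": -40.0},
              {"pressure_psi": 390.0}]:
        gate(r)

    print(len(ACCEPTED), "accepted;", len(QUARANTINED), "quarantined")
    # -> 1 accepted; 2 quarantined

The design choice that matters here is that nothing bad reaches the system of record unreviewed, no matter how fast the data arrive.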
Standards are in place and are being fine-tuned so that systems themselves might self-assess their data quality. As data volumes increase and our built environments become more complex, audits of data quality are of increasing importance, with implications for our economy and public safety.