GEOG 484

FINAL PROJECT

By Brent McMilleon, Don Krysakowski & Geoff Price

 

1.0 Introduction and Objectives
The Albemarle Charlottesville Historical Society (ACHS) has formally announced a need to convert 20th Century Sanborn maps of the City of Charlottesville into a GIS format. ACHS expects the winning consultant to provide GIS applications that will facilitate land use change studies, genealogical research, and city visualization for each decade of Charlottesville from 1900 to 1990. Data submissions are to be evaluated on the efficiency of the database design, digitizing accuracy, and supplied metadata. All data submitted will, at minimum, include the following:

-Spatial and attribute data necessary for studying land use distribution
-Specific street address information
-Reference data for geocoding service
-Building height


2.0 Database Design
Two feature classes were created to represent street centerlines and building footprints. The name of each feature class designates its time period (e.g., Buildings1920, Streets1920). Preceding and subsequent decades will be named the same way, using the appropriate feature name and decade. Once complete, each time period's feature classes can be controlled independently, allowing a temporal analysis of Charlottesville from 1900 to 1990.
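The naming convention described above can be sketched in a few lines of Python. This is only an illustration: the function name and the ten-year step are assumptions based on the 1900 to 1990 decade range stated in the text, not part of the delivered database.

```python
def feature_class_names(start=1900, end=1990, step=10):
    """Return (buildings, streets) feature class names for each decade.

    Assumes one pair of feature classes per decade, named as in the
    text (e.g., Buildings1920, Streets1920).
    """
    return [
        (f"Buildings{year}", f"Streets{year}")
        for year in range(start, end + 1, step)
    ]


names = feature_class_names()
for buildings, streets in names:
    print(buildings, streets)
```

Generating the names from a single rule keeps every decade consistent and makes it easy to loop over the time periods in a temporal analysis.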

 


2.1 Unforeseen Challenges
The design proved efficient, but it could have been improved to handle the following unforeseen challenges:

1. One-half (1/2) floors were not considered in the database design. (We did, however, add a z-height variable to account for buildings of differing heights that have the same number of floors. Where no building height z-value was provided on the Sanborn Map, we assumed a height of 12 feet per floor.)
2. Where the Sanborn Map showed no address, or showed multiple addresses, a set of default rules needed to be created and documented in the digitizing procedures. These include:
-a. For the Streets Feature Class - defined guidance on how to implement the Left and Right Directional Address values for those instances without any addresses on the Sanborn Maps to reference.
-b. For the Buildings Feature Class:
--i. What to do where no address is provided, or
--ii. What to do when multiple addresses are provided for the same feature.
3. Instances of single buildings with multiple floors (and multiple uses on those floors) were not addressed by the database, e.g. apartments over a store. A database redesign would be needed to capture multi-floor attribute data over the same geographic footprint.
4. Stairways that carried separate street addresses were coded separately.
5. Overhangs were coded as separate polygons but classified as the same Building ID value as the primary building or store.
6. Multi-height buildings were treated like overhangs; each polygon was captured separately depending upon the roof height, but each polygon was then coded with the same Building ID number.
7. Different members of the team interpreted and implemented some variables slightly differently. This should have been addressed in a reference document, or each variable's use and description should have been made more explicit in the digitizing procedures.
8. Some buildings had multiple uses in a single space.
9. The land use category could not be determined for all buildings.
10. Public buildings tended not to have address numbers.
11. It was unclear how to specify address ranges for geocoding: actual or potential.
12. The geocoding service used could not decode the provided coded domains. A lookup table cross-referencing each code to an actual street name would be a more effective design. An alternative to a lookup table would be to enter the street name as both the "code" and the "description" in the coded domain table, as Main Street would be entered in Table 2.7 below.

Table 2.7

By doing this, the actual street name would be stored in the table as an attribute value, and we would still get the benefit of seeing the list of available streets in the drop-down menu when entering the data (see the geocoding results in Figure 2.0 below).

Figure 2.0 Geocoding Results
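The two design options discussed in challenge 12 can be sketched as follows. This is a minimal illustration in Python, not the geodatabase implementation: the codes "1" and "2" and both function names are hypothetical, and only the street names come from this report.

```python
# Hypothetical coded domain: code -> street name. The codes are
# illustrative assumptions; they are not the values used in the project.
STREET_DOMAIN = {"1": "Main Street", "2": "1st Street"}


def decode_street(code, domain=STREET_DOMAIN):
    """Option 1: a lookup table that turns a code into a street name."""
    try:
        return domain[code]
    except KeyError:
        raise ValueError(f"Unknown street code: {code!r}") from None


def self_describing_domain(street_names):
    """Option 2: use the street name as both the code and the description."""
    return {name: name for name in street_names}


print(decode_street("1"))
print(self_describing_domain(["Main Street", "1st Street"]))
```

With the second option the stored attribute is already the street name, so a geocoding service can read it directly without any decoding step.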

3.0 Georeferencing and Digitizing Procedures

The following table gives a step-by-step procedure for georeferencing the Sanborn maps.

Table 3.0 Georeferencing Steps & Special Digitizing Procedures


4.0 Data Entry Error Prevention and Correction

After georeferencing was complete, the following considerations were carefully followed to prevent digitizing error.

4.1 Error Prevention
1. Set snapping environment appropriately.
2. Use coded domains where available.
3. Digitize Street Centerlines in the direction from the lowest to the highest addresses as identified on the Sanborn maps.
4. Set scale and magnification to suitable levels.
5. Visually inspect 'dirty' areas and make corrections as needed.
6. Set transparency level of feature class to allow for:
-a. Visual inspection of the building layout relative to the Sanborn map.
-b. Data entry from the underlying Sanborn map.
7. Use editing tasks:
-a. Complete Polygon to digitize adjacent buildings with shared boundaries to prevent overlap.
-b. Cut Polygon to speed digitizing process and help with accuracy. (The use of this capability significantly improved the speed with which buildings could be digitized.)
-c. Split to break street centerline data into individual blocks.
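Step 3 above (digitizing centerlines from the lowest to the highest address) lends itself to an automated check after digitizing. The sketch below is an assumption about how such a check might look: the four parameters stand for hypothetical left and right from/to address attributes, and `None` represents a side of the street with no address data on the Sanborn map.

```python
def direction_ok(left_from, left_to, right_from, right_to):
    """Return True when each populated address range runs from low to high.

    A side with a missing value (None) is skipped, since the map
    provided no addresses to compare.
    """
    for start, end in ((left_from, left_to), (right_from, right_to)):
        if start is None or end is None:
            continue  # no address data for this side of the street
        if start > end:
            return False
    return True


print(direction_ok(100, 198, 101, 199))   # segment digitized low to high
print(direction_ok(198, 100, 101, 199))   # left range runs high to low
```

Segments that fail the check are candidates for having been digitized in the wrong direction and can be reviewed against the Sanborn map.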

4.2 Error Correction
1. A hard copy print of the data tables was reviewed for consistency and accuracy. Missing or suspicious values in the data tables were researched and corrections noted on the paper. These hard copy changes were then 'edited' into the feature class data table in ArcView.

2. Test maps were created to confirm data consistency. When inconsistencies were discovered, values were researched and corrections made by directly editing the feature class tables. These test maps were created with the Symbology property and used to check for spelling and data entry errors. An example appears in Section 6.2 below.

Attribute data tables were also sorted to check for data consistency and data entry errors, e.g. missing directional prefixes and suffixes in our Streets_1920_merged feature class.

5.0 Time Estimation

 

6.0 Analysis of Error

The Albemarle Charlottesville (Virginia) Historical Society (ACHS) needs a digital representation of the city of Charlottesville to:

-Study land use changes over time,
-Identify specific building addresses and relate that information to historical city directories for genealogical research, and
-Provide animated or three dimensional views of the city as it grows over time.

Given these needs, error can be introduced into this project in two ways. First, there can be spatial error, which is introduced by placing geographic entities in incorrect places. Second, there can be database attribute error, whereby geographic entities are incorrectly identified, or described in an inconsistent or inappropriate way.

6.1 Spatial Error

Spatial errors can be attributed to incorrectly georeferencing the various Sanborn Maps to the underlying reference map, in this case an orthorectified photo of the City of Charlottesville. As with any manual process, some error will always be introduced; however, by following the detailed procedures outlined above, this induced error can be minimized. Given that the process itself will likely induce some level of error, the question is how much error is tolerable.


Figure 6.0 Comparison of Digitized Streets


 

Figure 6.0 above shows a close-up view of the area around Main and 1st Streets. Elements of this map were created by different people (Brent, Geoff, and Don) and then combined into a single (Merged) view. As can be seen, each person introduced some error when digitizing their respective street areas; however, as shown by the scale on the map, the intersection at Main and 1st Streets varies by only about 2 meters. Further, when the individual digitizing efforts are combined into a single feature (Merged Streets), the difference among all three separate digitizing efforts is about 1 meter.


For the purposes defined for this project, this level of error is more than acceptable. The changes in land use can readily be seen, the locations of buildings will be accurate to a degree that they can be easily identified and addressed, and this nominal error will be virtually invisible when viewing city-wide animations or three dimensional images.

An additional level of error that cannot be addressed by this effort involves the Sanborn Maps themselves. This study assumes that the data presented in the Sanborn maps is accurate. If the Sanborn map data is inaccurate for some reason, such as a poor scan or a badly distorted or damaged original (e.g., stained, torn, wrinkled, or creased), that same inaccuracy will be reflected in the digitized images.

Finally, some may cite a computed value called RMS Error, or Root Mean Square Error, as a measure of how well the Sanborn Maps were georeferenced to the digital orthophoto. Unfortunately, this is not an appropriate use of this statistic, since we do not know the coordinate system of the Sanborn map. No inference regarding the relative quality of the georeferencing can be made from the RMS values calculated at the time the Sanborn maps are georeferenced.
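For readers unfamiliar with the statistic, the sketch below shows how an RMS value is commonly computed from control-point residuals. It is a general illustration, not the exact calculation performed by the georeferencing software; the function name and the (dx, dy) residual format are assumptions.

```python
import math


def rms_error(residuals):
    """Root mean square of control-point residuals given as (dx, dy) pairs.

    Each residual is the offset between where a control point landed
    after transformation and where it was expected to land.
    """
    if not residuals:
        raise ValueError("at least one residual is required")
    total = sum(dx * dx + dy * dy for dx, dy in residuals)
    return math.sqrt(total / len(residuals))


print(rms_error([(3.0, 4.0)]))
```

The value only summarizes how closely the chosen control points fit the transformation, which is why it says little about the accuracy of the rest of the map.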

6.2 Attribute Error

During this project, coding the attributes of the spatial features has a much higher potential for introducing inaccuracies. Misspellings, typographical mistakes, and even some application-induced errors can occur. These errors are much harder to catch, but fortunately, are much easier to fix.

By using a pre-defined set of "test" maps, many of the attribute errors can be found and then corrected by directly editing the database. For example, Figure 6.1, Comparison of Attribute Errors, shows the layout of buildings and streets for the test coding area of Charlottesville. In this map legend you can see three different references to "Dwellings". These should be a single label; however, in one instance the word was misspelled as "dweling", and in another instance there is a trailing space " " following the word (unseen by the eye but picked up by the computer). In each of these cases, this information needs to be researched and corrected. For this reason, in our estimate of time to complete this project, time was allocated for each decade's worth of data to perform this kind of Quality Assurance analysis.

 

Figure 6.1 Comparison of Attribute Errors
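The kind of attribute check described in this section could also be scripted. The sketch below is an illustration only, using Python's standard `difflib` module; the function names, the 0.8 similarity cutoff, and the list of valid labels are assumptions, and the sample values are the three "Dwellings" variants described in the text.

```python
from collections import defaultdict
import difflib


def whitespace_variants(values):
    """Group raw values that differ only by surrounding whitespace or case."""
    groups = defaultdict(set)
    for value in values:
        groups[value.strip().lower()].add(value)
    return {key: raw for key, raw in groups.items() if len(raw) > 1}


def likely_misspellings(values, valid_labels, cutoff=0.8):
    """Map each unrecognized value to its closest valid label, if any."""
    canonical = {label.lower(): label for label in valid_labels}
    suggestions = {}
    for value in sorted({v.strip() for v in values}):
        key = value.lower()
        if key in canonical:
            continue  # already a valid label once trimmed
        match = difflib.get_close_matches(key, list(canonical), n=1, cutoff=cutoff)
        if match:
            suggestions[value] = canonical[match[0]]
    return suggestions


legend = ["Dwellings", "Dwellings ", "dweling"]
print(whitespace_variants(legend))
print(likely_misspellings(legend, ["Dwellings"]))
```

A script like this only flags candidates; each flagged value still needs to be researched against the Sanborn map before it is corrected.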

6.3 Conclusions

The level of spatial error identified in this test analysis is low enough to meet the needs of the ACHS. Further, as attribute errors are identified, they can and will be corrected directly in the supporting attribute data table(s).


7.0 FGDC - Metadata Links


Shapefiles
Streets_1920_merged.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/2281F95FEE96436C8EB69534B565B1F2/streets_1920_merged_metadata.htm

Building_1920_merged.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/4C4492B673694F9C868634471FBCC0D6/buildings_1920_merged_metadata.htm

Feature Classes
Streets_1920.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/9BF1371ED53448A886F9967FEF6C8EE8/streets1920.htm
Buildings_1920.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/9BF1371ED53448A886F9967FEF6C8EE8/buildings_1920.htm

 

 

8.0 Figures

Figure 8.0 Building Use Charlottesville, VA, Circa 1920

 

Figure 8.0 above displays the location and use of the various building structures coded from Sanborn Fire Maps from 1920. The building use information was taken directly from the maps and divided into Commercial, Residential, Public, and Unknown categories. Since this is the downtown area of Charlottesville, VA, it is not surprising to see most of the structures coded for commercial use.

 


Figure 8.1 Building Heights, Charlottesville, VA, Circa 1920

Figure 8.1 above displays the heights of various building structures coded from Sanborn Fire Maps from 1920. The building height information was taken directly from the maps in terms of the number of floors represented by each structure, or structural element. The number of floors for each structure is also printed with the structure on each map. The tallest building in Charlottesville in 1920 is the National Bank of Charlottesville building, rising to a height of 7 stories and depicted in red in the center of the map.

 

Figure 8.2 Actual Building Height


Figure 8.2 depicts a thematic map of the buildings digitized by the Flying Projectors, represented by actual height rather than by number of floors. Given the requirement to produce 3D landscape representations in the future, the team felt that a z value would more accurately represent the 1920 cityscape. The map was created using a graduated colors symbology with 5 classes of heights. The height data required to produce this map was incomplete, and estimates were used where we had no height data. The estimation method standardized the height per floor to 12 feet; height was then calculated by multiplying the number of floors by 12.
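The height estimation rule can be written out as a short Python sketch. The function and constant names are hypothetical; the 12 feet per floor figure comes from the text. Passing a fractional floor count (e.g., 1.5) is an assumption about how half floors might be handled, since the database design did not account for them.

```python
FEET_PER_FLOOR = 12  # standardized height per floor, as stated in the text


def building_height(floors, recorded_height=None):
    """Return the building height in feet.

    Uses the height recorded on the Sanborn map when one exists;
    otherwise estimates it as floors * 12 feet.
    """
    if recorded_height is not None:
        return recorded_height
    return floors * FEET_PER_FLOOR


print(building_height(7))        # estimated from floor count
print(building_height(2, 30))    # recorded height takes priority
```

Keeping the recorded value whenever it exists means the estimate is used only to fill gaps, which matches how the map in Figure 8.2 was produced.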

 

Figure 8.3 3D View and Building Use

Figure 8.3 above is a three-dimensional representation of Charlottesville, Virginia in 1920. All building features were digitized from 1920s Sanborn Maps, and building heights were recorded to ensure a more accurate portrayal of the city in 1920. In addition, 1920 building use types were recorded to support the study of land use change over time.