1.0 Introduction and Objectives
The Albemarle Charlottesville Historical Society (ACHS) has formally
announced a need to convert 20th Century Sanborn maps of the City of
Charlottesville into a GIS format. ACHS expects the winning consultant
to provide GIS applications that will facilitate land use change studies,
genealogical research, and city visualization for each decade
from 1900 to 1990. Data submissions will be evaluated on the
efficiency of the database design, digitizing accuracy, and supplied metadata.
All data submitted will, at minimum, include the following:
-Spatial and attribute data necessary for studying land use distribution
-Specific street address information
-Reference data for geocoding service
-Building height
2.0 Database Design
Two feature classes were created to represent street centerlines and
building footprints. The name of each feature class designates
its time period (e.g., Buildings1920, Streets1920). Preceding and subsequent
decades will be identified similarly using the appropriate name and decade.
Upon completion, each time period's feature classes can be controlled independently,
allowing for a temporal analysis of Charlottesville from 1900 to 1990.
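The naming convention can be sketched as follows. This is a minimal illustration, assuming one Buildings and one Streets feature class per decade; the function name and its parameters are hypothetical and not part of the delivered database.

```python
def feature_class_names(start=1900, end=1990, step=10):
    """Return (buildings, streets) feature class names for each decade.

    Assumes the naming pattern Buildings<year> / Streets<year>.
    """
    return [
        (f"Buildings{year}", f"Streets{year}")
        for year in range(start, end + step, step)
    ]


names = feature_class_names()
for buildings, streets in names:
    print(buildings, streets)
```

Generating the names from a single rule keeps every decade consistent, which matters when layers are switched on and off by time period.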






2.1 Unforeseen Challenges
The design proved efficient but could have been improved to handle the following
unforeseen challenges:
1. One-half (1/2) floors were not considered in the database design. (We
did, however, add a z-height variable to account for differing building heights
among buildings with the same number of floors. Where no building height z-value
was provided on the Sanborn Map, we assumed a height of 12 feet per
floor.)
2. Where the Sanborn Map showed no address data, or multiple addresses,
a set of default rules needed to be created
and documented in the digitizing procedures. These include:
-a. For the Streets Feature Class: guidance on how to assign
the Left and Right Directional Address values where the Sanborn Maps
show no addresses to reference.
-b. For the Buildings Feature Class:
--i. What to do where no address is provided, and
--ii. What to do when multiple addresses are provided for the same feature.
3. Instances of single buildings with multiple floors (and multiple
uses on those floors) were not addressed by the database, e.g. apartments
over a store. A database redesign would be needed to capture multi-floor
attribute data over the same geographic footprint.
4. Stairways that carried separate street addresses were coded separately.
5. Overhangs were coded as separate polygons but classified as the same
Building ID value as the primary building or store.
6. Multi-height buildings were treated like overhangs; each polygon
was captured separately depending upon the roof height, but each polygon
was then coded with the same Building ID number.
7. Different members of the team interpreted and implemented some variables
slightly differently. This should have been addressed in a reference
document, or each variable's use and description should have been made more explicit
in the digitizing procedures.
8. Some buildings had multiple uses in a single space.
9. The land use category could not be determined for every building.
10. Public buildings tended not to have address numbers.
11. It was unclear how to specify address ranges for geocoding: actual or potential.
12. The geocoding service used could not decode the coded domain values.
A lookup table cross-referencing each code to an actual street name would
prove a more effective design. An alternative to a lookup table would
be to enter the street name as both the "code" and
the "description" in the coded domain table, as illustrated for Main
Street in Table 2.7 below.
Table 2.7

By doing this, the actual street name would be stored in the table
as an attribute value, and we would still get the benefit of seeing
the list of available streets in the drop-down menu when
entering the data (see the geocoding results in Figure 2.0 below).
Figure 2.0 Geocoding Results
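The two coded-domain workarounds described in item 12 can be sketched in a few lines. This is an illustration only: the numeric codes, the STREET field name, and both helper functions are hypothetical, and the dictionaries stand in for the geodatabase domain table.

```python
# Hypothetical coded domain: numeric code -> street name.
STREET_DOMAIN = {1: "Main Street", 2: "1st Street"}


def decode_street(record, domain, field="STREET"):
    """Option 1: replace a coded street value with its name via a lookup table."""
    code = record[field]
    if code not in domain:
        raise KeyError(f"Street code {code!r} is not in the lookup table")
    decoded = dict(record)  # copy so the original record is left untouched
    decoded[field] = domain[code]
    return decoded


def name_as_code_domain(street_names):
    """Option 2: build a domain whose code and description are both the name."""
    return {name: name for name in street_names}


decoded = decode_street({"STREET": 1}, STREET_DOMAIN)
self_describing = name_as_code_domain(["Main Street", "1st Street"])
```

With the second option the stored attribute is already the street name, so a geocoding service can read it without any decoding step.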

3.0 Georeferencing and Digitizing Procedures
The following table is a step-by-step procedure for georeferencing
the Sanborn maps.
Table 3.0 Georeferencing Steps & Special Digitizing
Procedures

4.0 Data Entry Error Prevention and Correction
After georeferencing was complete, the following practices were
followed to prevent digitizing error.
4.1 Error Prevention
1. Set snapping environment appropriately.
2. Use coded domains where available.
3. Digitize Street Centerlines in the direction from the lowest to the
highest addresses as identified on the Sanborn maps.
4. Set scale and magnification to suitable levels.
5. Visually inspect 'dirty' areas and make corrections as needed.
6. Set transparency level of feature class to allow for:
-a. Visual inspection of building layout relative to Sanborn map.
-b. Data entry from underlying Sanborn map.
7. Use editing tasks:
-a. Complete Polygon to digitize adjacent buildings with shared boundaries
to prevent overlap.
-b. Cut Polygon to speed digitizing process and help with accuracy.
(The use of this capability significantly improved the speed with which
buildings could be digitized.)
-c. Split to break street centerline data into individual blocks.
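Step 3 above (digitizing centerlines from the lowest to the highest address) can also be checked after the fact. The sketch below assumes each street segment is a list of (x, y) vertices with an address at its first and last vertex; the function name and that data layout are hypothetical.

```python
def orient_low_to_high(vertices, from_address, to_address):
    """Return the segment ordered so it runs from the lowest to the highest address.

    If the segment was digitized in the opposite direction, the vertex order
    and the two address values are swapped together.
    """
    if from_address > to_address:
        return list(reversed(vertices)), to_address, from_address
    return list(vertices), from_address, to_address


segment = [(0.0, 0.0), (50.0, 0.0), (100.0, 0.0)]
fixed_vertices, low, high = orient_low_to_high(segment, 199, 101)
```

Keeping direction consistent matters because left and right address ranges are defined relative to the direction in which the line was drawn.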
4.2 Error Correction
1. A hard copy print of the data tables was reviewed for consistency
and accuracy. Missing or suspicious values in the data tables were researched
and corrections noted on the paper. These hard copy changes were then
edited into the feature class data table in ArcView.
2. Test maps were created to confirm data consistency. When inconsistencies
were discovered, values were researched and corrections made by directly
editing the feature class tables. These test maps were created with
the Symbology property and used to check for spelling and data entry
errors. An example is given in Section 6.2 below.
Attribute data tables were also sorted to check for data consistency
and for data entry errors, e.g. missing directional prefixes and suffixes
in our Streets_1920_merged feature class.
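The search for missing values could be partly automated on an exported attribute table. The following is a rough sketch, assuming the table is available as a list of dictionaries; the field names PREFIX, NAME, and SUFFIX are hypothetical and would need to match the actual schema. Which fields are truly required for a given street remains a judgment call for the reviewer.

```python
def find_missing(rows, required_fields):
    """Return (row_index, field) pairs where a required field is empty or absent."""
    problems = []
    for index, row in enumerate(rows):
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                problems.append((index, field))
    return problems


# Hypothetical rows exported from the streets attribute table.
street_rows = [
    {"PREFIX": "E", "NAME": "Main", "SUFFIX": "St"},
    {"PREFIX": "", "NAME": "Main", "SUFFIX": "St"},
    {"PREFIX": "W", "NAME": "Main", "SUFFIX": None},
]
missing = find_missing(street_rows, ["PREFIX", "NAME", "SUFFIX"])
```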
5.0 Time Estimation

6.0 Analysis of Error
The Albemarle Charlottesville (Virginia) Historical
Society (ACHS) needs a digital representation of the city of Charlottesville
to:
-Study land use changes over time,
-Identify specific building addresses and relate that information to
historical city directories for genealogical research, and
-Provide animated or three dimensional views of the city as it grows
over time.
Given these needs, error can be introduced in this project in two ways.
First, there can be spatial error, the error introduced by placing
geographic entities in incorrect places. Second, there can be database attribute
error, whereby geographic entities are incorrectly identified or described
in an inconsistent or inappropriate way.
6.1 Spatial Error
Spatial errors can be attributed to incorrectly georeferencing the
various Sanborn Maps to the underlying reference map, in this case an
orthorectified photo of the City of Charlottesville. As with any manual
process, some element of error will always be introduced. However,
by using the detailed procedures outlined above, this induced error can
be minimized. Given that the process itself will likely induce some
level of error, the question is how much error is tolerable.
Figure 6.0 Comparison of Digitized Streets

Figure 6.0 above represents a close-up view of the area
around Main and 1st Streets. Elements of this map were created by different
people (Brent, Geoff, and Don) and then combined into a single (Merged)
view. As can be seen, some error is introduced by each
person digitizing their respective street areas. However, as shown by
the scale on the map, the intersection at Main and 1st Streets
varies by only about 2 meters. Further, when the individual digitizing
efforts are combined into a single feature (Merged Streets), the
difference among all three separate digitizing efforts is about 1 meter.
For the purposes defined for this project, this level of error is
acceptable. The changes in land use can readily be seen, the locations
of buildings will be accurate enough that they can be easily identified
and addressed, and this nominal error will be virtually invisible when
viewing city-wide animations or three dimensional images.
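The comparison in Figure 6.0 was read from the map scale, but the same check could be made numerically. The sketch below is an assumption-laden illustration: the coordinates are invented placeholder values in a projected system measured in meters, the 2 meter tolerance is taken from the discussion above, and the helper name is hypothetical.

```python
from itertools import combinations
from math import dist


def max_spread(points):
    """Largest pairwise distance between points, in the units of the coordinates."""
    return max(dist(a, b) for a, b in combinations(points, 2))


# Placeholder coordinates (meters) for one intersection as placed by
# three digitizers; these are not measured values from the project.
intersection = {"A": (0.0, 0.0), "B": (1.2, 0.9), "C": (0.5, -0.8)}
spread = max_spread(list(intersection.values()))
within_tolerance = spread <= 2.0
```

Reporting the largest pairwise distance is a conservative choice, since it reflects the worst disagreement between any two digitizers rather than an average.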
An additional source of error that cannot be addressed by this effort
involves the Sanborn Maps themselves. This study assumes
that the data presented in the Sanborn maps is accurate. If the
Sanborn map data is inaccurate for some reason, such as a poor scan
or a badly distorted or damaged original (e.g., stained, torn, wrinkled,
or creased), that same inaccuracy will be reflected in the digitized
images.
Finally, some may reference a computed value called RMS Error, or Root
Mean Square Error, as a measure of how well the Sanborn
Maps were georeferenced to the digital orthophoto. Unfortunately, this is not an appropriate
use of this statistic since we do not know the coordinate system of
the Sanborn map. No inference regarding the relative goodness of georeferencing
can be made from the RMS values calculated at the time the Sanborn maps
are georeferenced.
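For readers unfamiliar with the statistic, the sketch below shows how an RMS error is commonly computed from control point residuals. It is a generic illustration, not the exact calculation performed by the GIS software; the function name and the (dx, dy) residual layout are assumptions.

```python
from math import hypot, sqrt


def rms_error(residuals):
    """Root mean square of control point residuals given as (dx, dy) pairs.

    Each residual is the offset between where a control point lands after
    the transformation and where it was expected to land.
    """
    squared_lengths = [hypot(dx, dy) ** 2 for dx, dy in residuals]
    return sqrt(sum(squared_lengths) / len(squared_lengths))


# Hypothetical residuals in map units for two control points.
example_rms = rms_error([(3.0, 4.0), (0.0, 0.0)])
```

Because the value is computed only from the chosen control points, it describes how well those points fit the transformation and nothing more.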
6.2 Attribute Error
During this project, coding the attributes of the spatial features
carried a much higher potential for error than digitizing their locations.
Misspellings, typographical mistakes, and even some application-induced
errors can occur. These errors are much harder to catch but, fortunately,
much easier to fix.
By using a pre-defined set of "test" maps, many of the attribute
errors can be found and then corrected by directly editing the database. For example,
Figure 6.1, Comparison of Attribute Errors, shows the layout of buildings
and streets for the test coding area of Charlottesville. In this map
legend you can see three different references to "Dwellings".
These should be a single label. However, in one instance the word was
misspelled as "dweling", and in another instance there is
a space " " following the word (unseen by the eye but picked
up by the computer). In each of these cases, the value needs to
be researched and corrected. For this reason, our estimate of time
to complete this project allocates time for each decade's worth
of data to perform this kind of Quality Assurance analysis.
Figure 6.1 Comparison of Attribute Errors
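The two problems seen in the legend (a trailing space and a misspelling) can also be flagged programmatically. The following is a minimal sketch using Python's standard difflib module; the list of allowed values is a hypothetical stand-in for the real building use domain, and the 0.8 similarity cutoff is an arbitrary choice.

```python
from difflib import get_close_matches

# Hypothetical domain of valid building use labels.
ALLOWED_USES = ["Dwellings", "Store", "Bank"]


def check_value(value, allowed, cutoff=0.8):
    """Classify a value and suggest a correction.

    Returns (status, suggestion) where status is 'ok', 'whitespace',
    'misspelled', or 'unknown'.
    """
    if value in allowed:
        return "ok", value
    stripped = value.strip()
    if stripped in allowed:
        return "whitespace", stripped
    # Compare case-insensitively so 'dweling' can match 'Dwellings'.
    lowered = {name.lower(): name for name in allowed}
    matches = get_close_matches(stripped.lower(), list(lowered), n=1, cutoff=cutoff)
    if matches:
        return "misspelled", lowered[matches[0]]
    return "unknown", None


results = [check_value(v, ALLOWED_USES) for v in ["Dwellings", "Dwellings ", "dweling"]]
```

A suggested correction should still be reviewed against the Sanborn map before the table is edited, since a close match is not proof of the intended value.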
6.3 Conclusions
The level of spatial error identified as a part of this test analysis
is small enough to meet the needs of the ACHS. Further, as attribute
errors are identified, they can and will be corrected directly in the
supporting attribute data table(s).
7.0 FGDC - Metadata Links
Shapefiles
Streets_1920_merged.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/2281F95FEE96436C8EB69534B565B1F2/streets_1920_merged_metadata.htm
Building_1920_merged.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/4C4492B673694F9C868634471FBCC0D6/buildings_1920_merged_metadata.htm
Feature Classes
Streets_1920.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/9BF1371ED53448A886F9967FEF6C8EE8/streets1920.htm
Buildings_1920.shp
https://cms.psu.edu/AngelUploads/Content/200506FAWD___IGEOG_484_001/_assoc/9BF1371ED53448A886F9967FEF6C8EE8/buildings_1920.htm
8.0 Figures
Figure 8.0 Building Use, Charlottesville, VA, Circa 1920

Figure 8.0 above displays the location and use of the
building structures coded from the 1920 Sanborn Fire Maps. The building
use information was taken directly from the maps and divided into Commercial,
Residential, Public, and Unknown categories. Since this is the downtown
area of Charlottesville, VA, it is not surprising to see most of the
structures coded for commercial use.
Figure 8.1 Building Heights, Charlottesville, VA, Circa 1920

Figure 8.1 above displays the heights of various building structures
coded from Sanborn Fire Maps from 1920. The building height information
was taken directly from the maps in terms of the number of floors represented
by each structure, or structural element. The number of floors for each
structure is also printed with the structure on each map. The tallest
building in Charlottesville in 1920 is the National Bank of Charlottesville
building, rising to a height of 7 stories and depicted in red in the
center of the map.
Figure 8.2 Actual Building Height

Figure 8.2 depicts a thematic map of the buildings digitized by the
Flying Projectors, represented by actual height rather than by number
of floors. Given the requirement to produce 3D landscape
representations in the future, the team felt that a z value would more accurately represent
the 1920 cityscape. The map was created by using a graduated colors
symbology with 5 classes of heights. The data required to produce this
map was incomplete, and estimates were used where we had no height
data. The estimation method standardized the height
per floor to 12 feet. Height was then calculated by multiplying the
number of floors by 12.
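The estimation rule can be written out as a short function. This is a sketch of the rule as described above, assuming a recorded height is used whenever one exists; the function and parameter names are hypothetical.

```python
FEET_PER_FLOOR = 12  # standardized height per floor used for estimates


def building_height(floors, recorded_height=None):
    """Return building height in feet.

    Uses the height recorded from the Sanborn map when available,
    otherwise estimates it as number of floors times 12 feet.
    """
    if recorded_height is not None:
        return float(recorded_height)
    return float(floors * FEET_PER_FLOOR)


estimated = building_height(3)
recorded = building_height(3, recorded_height=40)
```

Keeping the per-floor constant in one place makes it easy to revise the estimate later without touching recorded heights.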
Figure 8.3 3D View and Building Use

Figure 8.3 above is a three-dimensional representation
of Charlottesville, Virginia in 1920. All building features were digitized
from 1920s Sanborn Maps, and building heights were recorded to ensure
a more accurate portrayal of the city in 1920. In addition, 1920 building
use types were recorded so that land use change can be studied over time.