
Jim Kompanek
Standard Deviational Ellipses
Figure 1 below depicts standard deviational ellipses for both firearm-related murders and attempted street robberies in St. Louis, MO, based on the points where each crime was committed. The ellipses were calculated at both one and three standard deviations. The most obvious observation is that at three standard deviations the ellipses are nearly meaningless, as each covers almost the entire city for both types of events (attempted robbery and murder).
At one standard deviation, the two ellipses overlap somewhat. The murder ellipse is located in the north-central portion of the city and is angled slightly to the northwest. The robbery ellipse is located in the east-central portion of the city and appears to be oriented slightly toward the northeast. It appears that the murders committed in St. Louis are more clustered in the northern third of the city, while attempted robberies are spread somewhat consistently throughout the central region.

Figure 1. Standard deviational ellipses of crime patterns in St. Louis, MO
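As a rough illustration of how an ellipse like those in Figure 1 can be derived from point locations, the Python sketch below computes a covariance-based ellipse and the share of points that fall inside it. It is not the GIS tool used for the figure: the function names (deviational_ellipse, fraction_inside), the use of population covariance, and the randomly generated sample coordinates are all assumptions made for illustration, and GIS packages may apply additional correction factors to the axis lengths.

```python
import math
import random


def deviational_ellipse(points, n_std=1.0):
    """Covariance-based ellipse for a list of (x, y) tuples.

    Assumptions: population covariance (divide by n), no software-specific
    correction factors, and points that are not all collinear.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Eigenvalues of the 2x2 covariance matrix: variance along each ellipse axis.
    half_trace = (var_x + var_y) / 2.0
    root = math.sqrt(((var_x - var_y) / 2.0) ** 2 + cov_xy ** 2)
    major_var = half_trace + root
    minor_var = max(half_trace - root, 0.0)

    # Direction of the major axis, counterclockwise from the x (east) axis.
    angle = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    return {
        "center": (mean_x, mean_y),
        "semi_major": n_std * math.sqrt(major_var),
        "semi_minor": n_std * math.sqrt(minor_var),
        "angle_deg": math.degrees(angle),
    }


def fraction_inside(points, ellipse):
    """Share of points lying inside (or on) the given ellipse."""
    cx, cy = ellipse["center"]
    a, b = ellipse["semi_major"], ellipse["semi_minor"]
    theta = math.radians(ellipse["angle_deg"])
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    inside = 0
    for x, y in points:
        dx, dy = x - cx, y - cy
        # Rotate the offset into the ellipse's own axis frame.
        u = dx * cos_t + dy * sin_t
        v = -dx * sin_t + dy * cos_t
        if (u / a) ** 2 + (v / b) ** 2 <= 1.0:
            inside += 1
    return inside / len(points)


# Hypothetical coordinates in meters (random, not the St. Louis crime data).
random.seed(0)
sample = [(random.gauss(0, 3000), random.gauss(0, 1500)) for _ in range(200)]
for k in (1, 3):
    ellipse = deviational_ellipse(sample, n_std=k)
    print(k, round(ellipse["semi_major"]), round(fraction_inside(sample, ellipse), 2))
```

Comparing the two printed shares shows how much more of a point set the three standard deviation ellipse takes in than the one standard deviation ellipse, which is the effect described above.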
Nearest Neighbor Distance
Figures 2 and 3 depict the Nearest Neighbor Distance results for murders (Figure 2) and attempted street robberies (Figure 3) in St. Louis. This analysis measures the distance from each point to its nearest neighbor and compares the observed mean distance against the value expected for a random distribution of points. According to the analysis of gun-based homicide, there is an observed mean distance index of 0.77 and a Z score of -5.2. An index below 1 together with a negative Z score indicates that events lie closer together than would be expected under a random distribution. As a result, it can be presumed that clustering of events is evident with a "less than 1% chance likelihood that this dispersed pattern could be the result of random chance." According to the attempted robbery statistics, there is an observed mean distance index of 0.78 and a Z score of -2.9. As with the murder Nearest Neighbor Distance analysis, it can be presumed that a clustering of events is evident with a "less than 1% chance likelihood that this dispersed pattern could be the result of random chance." According to this analysis, there is significant clustering of events in both crime patterns, and the events are not simply independent phenomena.

Figure 2. Nearest Neighbor Distance for Gun-based Homicide in St. Louis, MO

Figure 3. Nearest Neighbor Distance for attempted street robberies in St. Louis, MO
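The index and Z score reported above can be reproduced in principle with the standard average nearest neighbor formulas, where the expected mean distance for a random pattern is 0.5 / sqrt(n / A) and its standard error is 0.26136 / sqrt(n^2 / A). The Python sketch below is a minimal version under stated assumptions: the function name average_nearest_neighbor is hypothetical, the study area A is supplied by the caller, no edge correction is applied, and the example points are a made-up grid rather than the St. Louis data.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (index, z_score) for a list of (x, y) tuples within a study area.

    index < 1 suggests clustering, index > 1 suggests dispersion.
    Assumption: brute-force search and no edge correction.
    """
    n = len(points)
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error


# Hypothetical example: 100 points on a 10 x 10 grid with unit spacing.
grid_points = [(x, y) for x in range(10) for y in range(10)]

# The same points treated as filling their area versus a much larger area.
print(average_nearest_neighbor(grid_points, area=100.0))
print(average_nearest_neighbor(grid_points, area=10000.0))
```

The second call shows how strongly the result depends on the study area: the same points look clustered once the area around them is enlarged, which is one reason the choice of boundary matters for this statistic.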
Quadrant Analysis
The quadrant analysis divides the study area into a grid of equal cells referred to as quadrants (more commonly called quadrats in the spatial statistics literature). For each quadrant, the number of events is counted; the counts depict general trends within the grid and provide statistics on the spatial distribution of events. There are two methods of quadrant analysis: census and sample-based. The census analysis is appropriate when a complete dataset is available, and the sample analysis is used when only a sample of points is available. Because the datasets presumably depict complete crime information for St. Louis, I chose the census-based analysis. It may be argued, however, that a great deal of crime goes unreported or underreported and that the sample-based analysis may therefore be more appropriate. In terms of quadrant size, I experimented with different sizes and locations. Small quadrants were generally not useful, as most of them contained only zero or one events. Excessively large quadrants gave rather imprecise results, with most, if not all, events located in one or two quadrants.
Figure 4 depicts a quadrant analysis of gun-based homicides based on a 10x15 grid. I chose this grid size because it encompassed virtually all of the city and each cell was large enough to minimize the number of quadrants with zero or only one event. Of primary interest is the variance-to-mean ratio (VMR): a VMR greater than 1 indicates a clustering of events, a VMR close to 1 indicates a random distribution, and a VMR less than 1 indicates a more evenly dispersed distribution. The 10x15 grid produced a VMR of 2.8401, which implies a clustering of events. This appears to corroborate the visual distribution of points in Figure 1.
Figure 5 depicts a quadrant analysis of attempted robberies, also based upon a 10x15 grid, chosen for the same reasons as Figure 4. The VMR for attempted robberies was substantially lower than that of murder: 1.5550. Although this is greater than 1 and therefore indicates a clustering of events, it is unclear just how strong this clustering is. This lower strength of clustering also seems to corroborate the visual distribution of attempted robberies in Figure 1.

Figure 4. Quadrant analysis of gun-based homicides in St. Louis, MO

Figure 5. Quadrant analysis of attempted robberies in St. Louis, MO
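To make the VMR calculation concrete, the Python sketch below counts events per grid cell and divides the variance of the counts by their mean. It is only an illustration under assumptions: the function name quadrat_vmr and its parameters are hypothetical, the variance uses the sample form (dividing by the number of cells minus one), and the coordinates are invented rather than taken from the crime datasets.

```python
def quadrat_vmr(points, bounds, n_cols, n_rows):
    """Return (vmr, counts) for (x, y) points on an n_cols x n_rows grid.

    bounds is (xmin, ymin, xmax, ymax); points outside the bounds are ignored.
    Assumption: sample variance (divide by m - 1, where m is the cell count).
    """
    xmin, ymin, xmax, ymax = bounds
    counts = [0] * (n_cols * n_rows)
    for x, y in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue
        # Clamp so points on the upper edges fall into the last cell.
        col = min(int((x - xmin) / (xmax - xmin) * n_cols), n_cols - 1)
        row = min(int((y - ymin) / (ymax - ymin) * n_rows), n_rows - 1)
        counts[row * n_cols + col] += 1

    m = len(counts)
    mean = sum(counts) / m
    if mean == 0:
        raise ValueError("no points fall inside the grid")
    variance = sum((c - mean) ** 2 for c in counts) / (m - 1)
    return variance / mean, counts


# Hypothetical coordinates: one event per cell versus all events in one cell.
even = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
clustered = [(0.4, 0.4), (0.5, 0.5), (0.6, 0.6), (0.5, 0.4)]

print(quadrat_vmr(even, (0, 0, 2, 2), 2, 2)[0])
print(quadrat_vmr(clustered, (0, 0, 2, 2), 2, 2)[0])
```

Because the grid bounds and the numbers of columns and rows are parameters, the same function can be rerun with a different grid size or a smaller extent to see how sensitive the VMR is to those choices.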
To experiment further with quadrant analysis, I increased the size of each quadrant and focused the grid over what appeared to be the high-crime parts of the city. Because this removed the isolated events from the picture, I expected it to increase the apparent clustering and produce a higher VMR. This was the case in Figure 6a, where I used a 6x7 grid placed over the high-crime area of the city. The VMR increased to 3.5201, indicating a stronger clustering of events.
This was not the case for attempted robberies (Figure 6b), where I placed a 6x6 grid over a slightly smaller region. The VMR actually decreased to 1.24. This appears to be the result of the relatively even distribution of attempted robberies: when I decreased the overall size of the grid, I removed most of the quadrants with zero events and was left with a relatively even distribution of events.

Figure 6a. Quadrant analysis of gun-based homicides in St. Louis, MO

Figure 6b. Quadrant analysis of attempted robberies in St. Louis, MO
Kernel Density
Figure 7 depicts a kernel density map of gun-based murders in St. Louis, MO. I experimented with a variety of parameters before settling on a search radius of 1,000 m and a cell size of 10 m. A larger search radius (I tried 2,000 m) resulted in most of the map showing a high density of gun-based murders. A smaller search radius (200 m) simply produced a small high-density area around each individual event. A radius of 1,000 m provided a good balance between the two. I also experimented with different cell sizes. A cell size of 100 m resulted in a pixelated density map, and a cell size of 1 m was finer than necessary at the scale of approximately 1:150,000. A cell size of 10 m appeared to provide smooth, non-pixelated lines.
Figure 8 depicts a kernel density map of attempted robberies in St. Louis. Because the individual events are relatively evenly distributed, a search radius of 1,000 m appeared inadequate and only highlighted individual events. A search radius of 1,500 m appeared more adequate. A cell size of 10 m was also utilized for the same reasons as previously mentioned.

Figure 7. Density of gun-based murders in St. Louis, MO

Figure 8. Density of attempted robberies in St. Louis, MO
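The effect of the search radius and cell size discussed above can be explored with a small kernel density sketch in Python. The code below is an assumption-laden illustration, not the tool that produced Figures 7 and 8: it uses a quartic kernel chosen for simplicity, the function name kernel_density and its parameters are hypothetical, and the event locations are invented.

```python
import math


def kernel_density(points, bounds, cell_size, radius):
    """Return a grid (list of rows) of density values per unit area.

    Assumption: quartic kernel, K(d) = 3 / (pi * r^2) * (1 - (d / r)^2)^2
    for d < r and 0 otherwise, evaluated at each cell center.
    """
    xmin, ymin, xmax, ymax = bounds
    n_cols = math.ceil((xmax - xmin) / cell_size)
    n_rows = math.ceil((ymax - ymin) / cell_size)
    scale = 3.0 / (math.pi * radius ** 2)
    r2 = radius ** 2
    grid = []
    for row in range(n_rows):
        cy = ymin + (row + 0.5) * cell_size
        values = []
        for col in range(n_cols):
            cx = xmin + (col + 0.5) * cell_size
            total = 0.0
            for x, y in points:
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                if d2 < r2:
                    # Each event contributes more the closer it is to the cell.
                    total += scale * (1.0 - d2 / r2) ** 2
            values.append(total)
        grid.append(values)
    return grid


# Hypothetical events (meters) in a 4 km x 4 km window, 100 m cells.
events = [(2000, 2000), (2100, 1900), (1200, 3000)]
window = (0, 0, 4000, 4000)
for search_radius in (200, 1000, 2000):
    surface = kernel_density(events, window, cell_size=100, radius=search_radius)
    peak = max(max(row) for row in surface)
    print(search_radius, peak)
```

A small radius concentrates the density around individual events, while a large radius spreads it over much of the window and lowers the peak, which mirrors the trade-off described for the 200 m, 1,000 m, and 2,000 m trials.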
Conclusions
The analyses conducted during this lesson were ultimately limited by the datasets provided. The greatest limitation of the data is the artificial boundary of the city and the resulting lack of data from outside the city limits. It is possible that there are clusters of events outside the city limits and that the isolated events we observe within the dataset are actually part of high-crime areas beyond the provided data. Similarly, Figure 7 depicts a high density of gun-based murders near the northwest city limit. It is unclear whether this is a core area of crime or the edge of a high-crime area centered outside the city.
In terms of attempted robberies, the events are relatively evenly distributed, with a few clusters in the central and northeastern parts of the city, especially near the bridge over the Mississippi River. Without more specific data, it is unclear whether this indicates crime spilling over from the other side of the river or whether other factors come into play, such as the riverfront being a tourist area subject to muggings. Another variable that is unaccounted for is the role of physical barriers within the study area, such as railroad tracks or highways with no crossings, or bodies of water. Such barriers may affect the nearest neighbor statistics by implying a relationship between points that does not exist: events that appear clustered may in fact be unrelated because an unseen barrier separates them.
This document is published in fulfillment of an assignment by a student enrolled in an educational offering of The Pennsylvania State University. The student, named above, retains all rights to the document and responsibility for its accuracy and originality.