Wednesday, September 14, 2011

Final Project: What determines the Library Location in Los Angeles County?



Introduction
Public libraries give all residents access to books and other educational materials, and they provide a place for reading and internet access. They are as important as schools because they are invaluable institutions for the education and literacy levels of a country. Because library access is so important, we examine how accessible libraries are from school locations in Los Angeles County. If public libraries are located near schools, children, especially those aged 5-17, are more likely to visit them, which promotes their literacy and educational development. In this final project, we also examine which factors are associated with library locations by analyzing race, age, population density and median household income. For example, are public libraries mainly located in areas with high median household income? A GIS-based analysis can help describe the relationship between schools and libraries, and between demographic factors and library locations.

Method
First, to show the number and locations of public libraries within each city in Los Angeles County, more than 200 public libraries were geocoded with an ArcGIS address locator, and the resulting shapefile of library locations was joined to the city boundary shapefile.
Second, to determine the accessibility of libraries from schools, two sets of buffers were created. The first is a two-mile buffer representing walking distance from a library. The second is a ten-mile buffer representing driving distance from a library. Some libraries are so near the coast or the edge of the county that their buffers extended over the boundary, so both sets of buffers were clipped to the LA County city boundary layer.
Third, the selection tools show how many schools fall within walking and driving distance of a library. In the Select By Location tool, the target layer is the school layer and the source layer is the geocoded library locations. Selecting the target layer features that are within 10 miles of the source layer picks out the schools within driving distance; the same selection with a 2-mile distance picks out the schools within walking distance. The percentage of schools within walking and driving distance can then be calculated. The schools that are left out of the buffers can be viewed on the map as well.
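The selection step can be sketched outside ArcGIS as a plain distance test. This is only an illustration under stated assumptions: it swaps the ArcGIS buffer and Select By Location tools for a great-circle (haversine) distance check, and the function names and coordinates below are hypothetical, not the project data.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def schools_within(schools, libraries, max_miles):
    """Names of schools within max_miles of at least one library."""
    selected = []
    for name, (slat, slon) in schools.items():
        if any(haversine_miles(slat, slon, llat, llon) <= max_miles
               for llat, llon in libraries):
            selected.append(name)
    return selected


# Hypothetical coordinates for illustration only.
libraries = [(34.05, -118.25)]
schools = {
    "School A": (34.06, -118.25),  # under 1 mile north of the library
    "School B": (34.12, -118.25),  # a little under 5 miles north
    "School C": (34.70, -118.25),  # about 45 miles north
}

walking = schools_within(schools, libraries, 2)    # 2-mile "walking" test
driving = schools_within(schools, libraries, 10)   # 10-mile "driving" test
print(walking, driving)
```

A school that passes neither test, like School C here, corresponds to a school left outside both buffers on the map.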
Fourth, to identify demographic factors that may affect the distribution of public libraries in Los Angeles County, we applied cluster analysis to the race and age census tract data for the county. Cluster analysis is a spatial statistics technique that uses spatial autocorrelation to determine whether the data are clustered. The population aged 5-17 was chosen because it represents children attending school before college. The Black, White and Asian populations were chosen because they are the most common racial groups in Los Angeles County. The cluster maps are overlaid with the geocoded library locations to see which factors are associated with library locations.
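To make the idea of clustering by spatial autocorrelation concrete, here is a minimal sketch of a local Moran's I calculation. It assumes that a statistic of this kind sits behind the cluster maps, which the post does not state. The tract ids, counts and neighbor lists are hypothetical, and the sketch skips the significance test that a GIS tool would apply before labeling a tract.

```python
def local_moran(values, neighbors):
    """Return {area: (label, local_I)} using row-standardized neighbor weights.

    values:    dict of area id -> count (for example, population aged 5-17)
    neighbors: dict of area id -> non-empty list of adjacent area ids
    Assumes the values are not all identical (variance must be non-zero).
    """
    n = len(values)
    mean = sum(values.values()) / n
    dev = {area: v - mean for area, v in values.items()}
    variance = sum(d * d for d in dev.values()) / n
    result = {}
    for area, d in dev.items():
        nbrs = neighbors[area]
        # Spatial lag: mean deviation of the neighboring areas.
        lag = sum(dev[j] for j in nbrs) / len(nbrs)
        # First letter: the area itself; second letter: its neighbors.
        label = ("H" if d > 0 else "L") + ("H" if lag > 0 else "L")
        result[area] = (label, d * lag / variance)
    return result


# Hypothetical tracts arranged in a row: A - B - C - D - E - F.
values = {"A": 90, "B": 80, "C": 85, "D": 10, "E": 15, "F": 5}
neighbors = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}

clusters = local_moran(values, neighbors)
for area, (label, local_i) in clusters.items():
    print(area, label, round(local_i, 3))
```

A positive local I means a tract resembles its neighbors (HH or LL), and a negative value marks a tract that differs from its neighbors (HL or LH).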
In addition, two other maps were created to better understand the relationship between library locations and demographic factors:
One map of population density by census tract was made by joining an Excel spreadsheet containing the population density of every census tract to the census tracts shapefile.
One map of median household income in dollars was made by joining an Excel spreadsheet containing the income data for each city to the city boundary shapefile.
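Both joins match rows on a shared key, either a tract id or a city name. The sketch below shows that logic on a made-up table. The column names, city names and values are hypothetical, and it reads CSV text rather than an Excel file to stay within the standard library. It also reports boundary features with no matching row, which is the kind of gap a join can leave when names differ between sources.

```python
import csv
import io

# Hypothetical stand-in for the income spreadsheet, saved as CSV.
income_csv = io.StringIO(
    "city,median_income\n"
    "City A,50000\n"
    "City B,72000\n"
    "City D,61000\n"
)

# Hypothetical stand-in for the boundary layer's attribute table.
boundary_rows = [{"city": "City A"}, {"city": "City B"}, {"city": "City C"}]


def join_by_key(layer_rows, table_rows, key):
    """Attach table attributes to layer rows that share the same key value.

    Every layer row is kept; rows with no match are also listed separately.
    """
    lookup = {row[key].strip().lower(): row for row in table_rows}
    joined, unmatched = [], []
    for feature in layer_rows:
        match = lookup.get(feature[key].strip().lower())
        if match is None:
            unmatched.append(feature[key])
            joined.append(dict(feature))
        else:
            extra = {k: v for k, v in match.items() if k != key}
            joined.append({**feature, **extra})
    return joined, unmatched


table_rows = list(csv.DictReader(income_csv))
joined, unmatched = join_by_key(boundary_rows, table_rows, "city")
print(joined)
print("No income row for:", unmatched)
```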

Results
The results are divided into two parts: school/library accessibility and demographic factors/library distribution.
1.     School and Library Accessibility
To determine how well the Los Angeles County libraries reach school locations, a buffer analysis was performed on the library locations. Both “walking distance” and “driving distance” buffers were created. A “walking distance” buffer extends 2 miles from the library. This may represent roughly a 30-40 minute walk or a 15 minute bike ride. A “driving distance” buffer extends 10 miles from the library. This may represent approximately a 20-30 minute drive. Figure 1 shows the accessibility of libraries from the surrounding school locations. Using the Select By Location technique in ArcGIS, the school locations within each buffer can be examined. Of the 2400 schools, 2387 fall within the 10-mile driving distance buffer and 1773 fall within the 2-mile walking distance buffer. A simple calculation gives 99.46% of schools with library access within a 10-mile drive and 73.875% within a 2-mile walk.
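The percentages can be checked with a few lines of arithmetic. The counts come from the selection results above; the variable and function names are just for illustration.

```python
TOTAL_SCHOOLS = 2400
within_driving = 2387   # schools inside the 10-mile buffer
within_walking = 1773   # schools inside the 2-mile buffer


def share(count, total):
    """Percentage of total represented by count."""
    return 100 * count / total


driving_pct = share(within_driving, TOTAL_SCHOOLS)
walking_pct = share(within_walking, TOTAL_SCHOOLS)

# Schools outside the 10-mile buffer are also outside the 2-mile buffer.
outside_both = TOTAL_SCHOOLS - within_driving

print(f"Driving: {driving_pct:.2f}%")
print(f"Walking: {walking_pct:.3f}%")
print(f"Outside both buffers: {outside_both}")
```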
2.     Demographic factors and library distribution
Several maps were created to examine how demographic factors relate to the distribution of libraries.
First, the cluster analysis shows where each population is clustered. The HH areas represent tracts with high numbers of individuals surrounded by tracts with high numbers of individuals; these are the hot spots. The LL areas represent tracts with low numbers of individuals surrounded by tracts with low numbers; these are the cold spots. The remaining two categories are the outliers. The White population is generally clustered in the west, the Asian population in the east and the Black population in the south. As shown in the map, most library locations are in the Black clustered area, with some libraries in both the Asian and White clustered areas.
On the other hand, the population aged 5-17 is mostly clustered in the south-east, but most library locations are in the LL areas, which are the cold spots for this age group.
The remaining two maps show population density by census tract and median household income by city. The population density map shows that libraries are located in high and very high population density areas such as downtown, while no libraries are located in very low population density areas such as the central valley area. The median household income map shows that areas with high median household income, such as the west or Santa Monica area, have fewer libraries. It also shows that a poorer area may have more libraries, for example the center of the city of Los Angeles, the downtown area.

Conclusion/Discussion
In terms of library accessibility from schools, the results suggest that schools in Los Angeles County have high access to libraries. However, 13 schools (2400 minus the 2387 inside the driving distance buffer) have low accessibility because they are within neither the walking nor the driving distance buffer. These schools are mostly located in the northern part of Los Angeles County. I would also suggest that the share of schools with walking distance access should be increased to 80% because most children do not drive. If we want to support their educational development, more libraries should be built within 2 miles of the schools that currently fall outside the walking distance buffers.
On the distribution of libraries, it is hard to tell exactly which factor determines library location, but this project produced several findings. Libraries are mostly located in Black clustered areas and in poor areas (low median household income). The reason may be that poorer residents have a greater need for library access because they cannot afford to buy books. A library is a good place for them because it offers free internet access and free entertainment.
However, there is a lack of libraries for the population aged 5-17. In my opinion, this group needs library access the most. Libraries offer books and educational materials, and it is very important for children to read outside school in addition to their formal reading. More libraries need to be built in the areas where the population aged 5-17 is clustered to meet its needs. These children are the future of the country.
There are several limitations to this study. City boundaries in Los Angeles County are complicated. I found several city boundary layers, but most of them do not include all the city names and boundaries. Because of this, most of the analysis uses census tracts, which give a more accurate and precise view of the results. In addition, the study only includes city and county libraries in Los Angeles County. The total number of libraries may be higher because different places have their own library systems.

Sources:
California county and census tract shapefiles were downloaded from the UCLA GIS data resource website.
Demographic and census tract data were downloaded from the US Census Bureau.
Median household income data were downloaded from www.laalmanac.com/LA/la09.htm
Los Angeles County/ City of Los Angeles Library address from

Monday, September 12, 2011

Lab 6: Interpolation

In this final lab, we use interpolation to estimate the rainfall between the stations that recorded precipitation. Spatial interpolation is a useful tool for estimating unknown values from known values. I picked around 55 precipitation stations that have both season normal and season total data. These locations with known values are called sample points or control points. We use these 55 known values to estimate the rainfall in areas where the value is unknown, so there is no need to collect data from every single point in the county.
Two methods were used to analyze the data in this lab: IDW and Spline. IDW (inverse distance weighting) uses the 40 surrounding points that I picked to determine the value at each in-between point, giving nearer points more weight. This method works well when the point density is high, as in this case, because the calculations for in-between points are more accurate when the points are close together. In contrast, spline estimates values using a mathematical function that minimizes overall surface curvature.
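The IDW idea can be written in a few lines. The sketch below is a generic inverse distance weighting formula, not the ArcGIS implementation. The power of 2, the optional neighbor limit k and the sample values are assumptions for illustration, not the lab data.

```python
import math


def idw(x, y, samples, power=2, k=None):
    """Estimate a value at (x, y) from samples given as (sx, sy, value).

    Each sample is weighted by 1 / distance**power. If k is given, only
    the k nearest samples are used.
    """
    dist_val = []
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # exactly on a station: use its recorded value
        dist_val.append((d, value))
    dist_val.sort(key=lambda pair: pair[0])
    if k is not None:
        dist_val = dist_val[:k]
    weights = [(1 / d ** power, value) for d, value in dist_val]
    total = sum(w for w, _ in weights)
    return sum(w * value for w, value in weights) / total


# Hypothetical stations: (x, y, rainfall in inches).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

midpoint = idw(1.0, 0.0, stations)       # equally far from both stations
near_first = idw(0.5, 0.0, stations)     # closer to the 10-inch station
at_station = idw(0.0, 0.0, stations)     # exactly on a station
print(midpoint, near_first, at_station)
```

Because the result is a weighted average, an IDW estimate always stays between the lowest and highest sampled values. A spline surface has no such limit and can rise above or dip below the recorded values between stations.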
The IDW and Spline results are slightly different, and I think IDW gives the better result. In the IDW maps of both normal and total rainfall, the north-east of LA County has the least rainfall. This is close to reality and to the precipitation data I entered. The heaviest rainfall is in the mid-east of LA County, where most of the dams are located. The dams may be associated with local climate factors that bring more rainfall. Overall, the IDW maps show a more "normal" result. In the Spline maps, the heaviest rainfall is also in the dam area, but rainfall also appears in the west. This does not match common sense or the precipitation data I entered. Both the normal and the total rainfall maps influence the difference map. The difference between the results may be due to the difference between the interpolation methods: IDW calculates in-between values from the surrounding points, while spline estimates values using a mathematical function that minimizes overall surface curvature. IDW would be the better choice for representing this precipitation data.