Educators Online |
WebED Education, Outreach and TrainingUsing Real Data for Science and Math: Urban Sprawl and its Impacts on Rural AmericaIn many regions, people are moving away from the central cities in major metropolitan areas into suburban and rural communities. This migration consumes a tremendous amount of land resources, particularly prime agricultural land. It also increases commuting times, energy use, and the potential for spreading air and water pollution problems. References at the end of this mini-project will give you background on the problem. We have assembled data on one of the fastest growing such communities in Ohio – Medina County. A bedroom community for Cleveland and Akron-Canton, this county has grown in population by 120% from 1960 (65,315) to 2000 (143,230). We obtained data from the auditors land records, showing the dates that different buildings were constructed and the land they consumed. MINI-PROJECT This mini-project includes: 1. Doing some background reading on the nature of the problem of sprawl and its possible solutions. 2. Using the maps we have created to visualize the rate of land use change in Medina County. 3.Using the auditor's dataset and data from the U.S. Census to quantify the rate of change in population and land use in the county by creating tables and graphs in Excel. 4. Making some simple forecasts using Microsoft Excel to create a Line and Curve of Best Fit. 5. Applying the same methodology in Excel to examine population growth, acreage growth, or another data set. 6. Writing a short report describing the nature of the overall problems and the specific findings for Medina County. Represent your analysis in the form of Excel graphs, charts, and tables and describe the implications of sprawl. GETTING STARTED
Reading:
DATASETS There are two underlying datasets that you can use in your analysis. The first is the dataset showing the auditor's records of land holdings for all of Medina County. The dataset appears on a spreadsheet called meparcels1.xls (Click here to download the spreadsheet). The second dataset – census data – is available online. To analyze the auditor's dataset, you will need to do the following: First, open the Excel spreadsheet that contains the raw data: the meparcels1.xls spreadsheet. Make certain you have access to the Data Analysis Toolpak that comes with Excel. If it is already installed, you will find a menu item under Tools > Data Analysis. If you do not see this menu item under the tools menu in Excel, do the following to turn on this option:
meparcels1.xls spreadsheet fields are as follows:
The coding sheets for these variables appear in Appendices A and B.
Now you are ready to create a histogram. “A histogram is an analysis tool in Excel that calculates individual and cumulative frequencies for a cell range of data and data bins. This tool generates data for the number of occurrences of a value in a data set. For example, in a class of 20 students, you could determine the distribution of scores in letter-grade categories. A histogram table presents the letter-grade boundaries and the number of scores between the lowest bound and the current bound.” Excerpted from Excel Help. To create the histogram:
Once you create a histogram, determine the percentage of parcels built over the last 100 years.
Next, determine the Line of Best Fit:
REGRESSION “The purpose of linear regression is to find the line that comes closest to your data. More precisely, the linear regression finds values for the slope and intercept that define the line that minimizes the sum of the square of the vertical distances between the points and the line.” Linear regression allows you to “analyze how a single dependent variable is affected by the values of one or more independent variables — for example, how an athlete's performance is affected by such factors as age, height, and weight. You can apportion shares in the performance measure to each of these three factors, based on a set of performance data, and then use the results to predict the performance of a new, untested athlete.” Regression gives you two things: 1) the predicted equation and 2) an indication of how well the equation fits the data. R2 shows whether you can use the equation to make predictions. Use the equation to find predicted Y values for known X values. Regression gives you the equation. R2 only shows you if the equation fits the data. (Excerpted from Excel Help) R Squared (R2) R2 shows us whether we can use the linear or exponential equation to make predictions. Use the equation to find predicted Y values for known X values. Regression gives you the equation; R2 only shows you if the equation fits the data. The value R2 is a fraction between 0.0 and 1.0, and has no units. An R2 value of 0.0 means that knowing X does not help you predict Y. There is no linear relationship between X and Y, and the best-fit line is a horizontal line going through the mean of all Y values. When R2 equals 1.0, all points lie exactly on a straight line with no scatter. Knowing X lets you predict Y perfectly. It all depends on the data!
Excerpted from www.curvefit.com QUESTION Based on the R2 values, which of your lines best fits the data? EQUATION EXPLANATION To predict parcels for the next time period, put in the number for the next time period. To forecast the total number for the next time period (in ten more years), calculate the equation. Linear Fit The equation for determining the Linear Fit is Y = a ± b*X. X represents the time period (i.e., 1920 or 1940 or 1990) numbered from the first period (1900) through the last in your data (2000) or the 11th period. Y represents the total number of structures. The letter “a” signifies the intercept number, and the letter “b” signifies the slope constant. The * represents the multiplication operation. The final equation looks like this: Y = (intercept number or constant) + slope constant * X variable Exponential Fit The equation of finding the Exponential Fit is Y=aebx. Again, “x” represents the known time period, and Y represents the total number of structures. The letter “e” signifies the exponential function. The final equation looks like this: Y = (intercept number or constant)e(exponent constant)*x To predict the number of developed parcels in ten years, you need to put a formula underneath your graph in the spreadsheet and substitute the appropriate values for your linear or exponential equation. For the linear equation, the formula in the spreadsheet will take the form: = intercept number + slopeconstant*12. In this example there are 11 periods. The next period of 10 years will be the 12th period – hence substituting the “12” for X in the formula. For the exponential equation, the form will be: = intercept number*EXP(exponentconstant*12) The EXP( ) expression is the Excel function for the exponential with the items in parentheses representing the power function. Again, the 12 reflects the next 10 year period. QUESTIONS – Use your results to discuss the following questions 1. Is the best fit for the data linear or exponential? 2. If this trend were to continue for the next 10 and 20 years, how many more parcels would be developed? 3. Try to apply the histogram and best-fit methodology to describe county trends in at least one other way using either the acreage variable on this spreadsheet or population data over time for the county. What trends can you identify by looking at the number of acres of land as they develop over time? Which Best-Fit model matches the number of parcels? What conclusions can you draw by looking at census of population data? - To obtain population census data, view these sites: http://www.census.gov/population/cencounts/oh190090.txt http://factfinder.census.gov/servlet/BasicFactsServlet?_lang=en - For Tract 2000 information, go to: http://factfinder.census.gov/servlet/GCTGeoSearchByListServlet?_lang=en 4. Now create a report on your findings. Your report should contain an introduction that reviews the nature of the problems with urban sprawl based on your reading. Properly cite your sources! The second section should present your results of the analysis of the conditions in Medina County. You can use the maps from the website, graphs and charts you created, and/or tables. Present these items as illustrations and discuss what you found. 5. Now draw some conclusions about whether this is a big problem in Ohio and what might be done to solve it. What are the pros and cons of alternatives? 6. Put your report in the form of an electronic document using Microsoft Word. Instructions for submitting it.Your instructor will tell you when, where, and how to submit the final report.
|



