MODEL 1

STEP 3: DERIVING THE RELATION BETWEEN WEATHER CONDITIONS AND STREAMFLOW

We derived the relation between streamflow and gage height. Now, it is essential to derive the relation between weather conditions and gage height so that we can use a weather event as input and predict whether or not there will be a flood.  To this end, we will derive the relationship between weather conditions and streamflow, and use the relation between streamflow and gage height we derived earlier to complete the flood prediction process.

Unfortunately, not all of the stations have the necessary data to complete the model.  We suggest you use the sites 03451000(SWANNANOA RIVER AT BILTMORE) and 02089000 (NEUSE RIVER NEAR GOLDSBORO) for this exercise. There are other stations where this will work if the complete datasets are available. For the rest of this exercise, we will provide instructions for station 03451000. You can check the datasets and apply the process to other stations if you wish.

Weather data and streamflow data will be obtained from different sources. The table below links these two different data sources.

SITE_NUMBER
(Streaflow Data)

SITE_NAME

COOPID
(Weather Station)

03451000

SWANNANOA RIVER AT BILTMORE, NC

310301

03451500

FRENCH BROAD RIVER AT ASHEVILLE, NC

310301

02096960

HAW RIVER NEAR BYNUM, NC

311677

02087500

NEUSE RIVER NEAR CLAYTON, NC

311820

02102500

CAPE FEAR RIVER AT LILLINGTON, NC

312500

02085070

ENO RIVER NEAR DURHAM, NC

312515

02112250

YADKIN RIVER AT ELKIN, NC

312740

02083000

FISHING CREEK NEAR ENFIELD, NC

312827

02089000

NEUSE RIVER NEAR GOLDSBORO, NC

313510

02096500

HAW RIVER AT HAW RIVER, NC

313555

02084000

TAR RIVER AT GREENVILLE, NC

313643

02089500

NEUSE RIVER AT KINSTON, NC

314684

02081747

TAR R AT US 401 AT LOUISBURG, NC

315123

02102000

DEEP RIVER AT MONCURE, NC

315763

02087183

NEUSE RIVER NEAR FALLS, NC

316091

02112000

YADKIN RIVER AT WILKESBORO, NC

316256

02126000

ROCKY RIVER NEAR NORWOOD, NC

316275

02111000

YADKIN RIVER AT PATTERSON, NC

316602

03443000

FRENCH BROAD RIVER AT BLANTYRE, NC

316805

0208726005

CRABTREE CR AT EBENEZER CHURCH RD NR RALEIGH, NC

317079

02087275

CRABTREE CREEK AT HWY 70 AT RALEIGH, NC

317079

02080500

ROANOKE RIVER AT ROANOKE RAPIDS, NC

317319

02082585

TAR RIVER AT NC 97 AT ROCKY MOUNT, NC

317398

03439000

FRENCH BROAD RIVER AT ROSMAN, NC

317486

02108000

NORTHEAST CAPE FEAR RIVER NEAR CHINQUAPIN, NC

317974

02083500

TAR RIVER AT TARBORO, NC

318500

02105769

CAPE FEAR R AT LOCK #1 NR KELLY, NC

319423

02105500

CAPE FEAR R AT WILM O HUSKE LOCK NR TARHEEL, NC

319427

02116500

YADKIN RIVER AT YADKIN COLLEGE, NC

319667

02115360

YADKIN RIVER AT ENON, NC

319675

 

Temperature, Rainfall Data, and Streamflow Data

Using the table from above, we can match the weather data for our gage stations.  A number of steps are involved to download the data, remove missing values, and delete unwanted columns.  To save you that trouble, we have compiled the data for you to download.  If you want to replicate the process for other stations, we have included the instructions in the Appendix section of the exercise.

You can download the spreadsheet climgage2089000.xls, to get these data.  The climate data and streamflow data have been combined on this worksheet.  Before you are able to undertake a statistical analysis, you will need to clean up the data by removing the days where information from one or another source is missing and is blank.  This is because the Excel statistics routines do not know how to deal with missing data.  To do the clean up, follow these steps:

  1. Copy everything on the current worksheet to another worksheet.  One of the problems in working with a very large dataset is finding the end of the data.  On a very large spreadsheet, you could be scrolling down for a long time.  One shortcut for this task is the Shift - End and Down key combinations.  Start with your cursor on cell A1.  Hold down the Shift key throughout this move.  With the Shift key on, touch the End key once then touch the Down key. You should be taken to the end of the data in the first column. Because you held down the Shift key, you also selected all of these cells. Now hold the Shift key again and use the Right Arrow key to move across the columns until you have selected through Column J.  Notice that Columns I and J are blank at the bottom of the spreadsheet since the streamflow dataset has a shorter published record than the weather data.  Now Copy the selected data, click on the other available worksheet (Sheet 1) and then paste the data into cell A1.  We will work with this copy rather than the original.  That way, if you make a mistake, you still have the original data.
  2. Following the same steps you used to copy the data, select all the data on your new worksheet. The sort function will allow us to sort the data.  The first sort will be by the Prec (precipitation) column.  When the sort is done, the blank cells will be left at the bottom of the sheet.  Click on Data Sort and you should see a query box like the one below. If you click on the top Sort pull-down you will see that you have choices of which variable to sort by. Notice also that since the top line is titles, the program has selected only the numbers below the titles and has checked the Header Row box. Select Prec and then click on OK.  The data will be sorted with the blank values for precipitation at the bottom of the sheet.  Find the first blank value for precipitation and select those rows by clicking in the row number. You can then use the Shift - End - Down combination to select all of the rows below this one and then click on the Delete key to delete those rows. 

                                         Sort selection box
  3. Now go back to the top of the spreadsheet.  Select the entire remaining dataset again.  This time, do Data Sort but scroll down and select AvgTemp as the target sort variable.  Now blanks in discharge are sorted to the bottom.  Go to the bottom and delete all of those rows.

  4. Now go back to the top of the spreadsheet.  Select the entire remaining dataset again.  This time, do Data Sort but scroll down and select Discharge as the target sort variable.  Now blanks in discharge are sorted to the bottom.  Go to the bottom and delete all of those rows.

  5. Finally, we need to sort the spreadsheet back into the original date order.  Select all of the data again and then Data Sort but this time select three variables as illustrated below, the Year, Month, and Day.  You will now have a sheet with no blanks in the correct order.

Sorting in date order

Adding Season Data

The impacts of rainfall on runoff vary with temperature and season.  This is related to a number of factors.  First, the status of vegetation and sunshine varies markedly by season.  In the summer months, vegetation is most mature and will intercept the maximum amount of rainfall.  At this time, temperatures are also higher so there is more evaporation and transpiration.  In the winter, precipitation is more likely to occur, as snow or sleet and will not runoff until temperatures warm.  At the same time, deciduous vegetation has lost its leaves and allows more precipitation to reach the ground.  The autumn and spring seasons will have intermediate types of effects on the rainfall/runoff relationship.  For this reason, we will add a set of indicator variables for the seasons with precipitation.

We will add variables for three of the seasons (we don't need the fourth because it is determined by the other three) where 1 means it is that season and 0 means it is not that season.  There are several ways to do this. Using the keyboard would be very tedious. 

Alternatively, you can insert a formula that will check on the month and fill in the correct value.  To do this, we use the IF function in Excel.  At the top of the first blank column, insert the word Winter.  To fill in the correct values for Winter, we will use this formula in the cell below the title:

=IF(OR(C2=1,C2=2,C2=12),1,0)

(You can copy the formula from this webpage into your spreadsheet.)  The formula can be interpreted as follows:  If the value in C2 (the month column) is 1 or 2 or 12, make the value in the current cell 1.  Otherwise, make the value of the current cell 0.  Put in the formula and hit enter.  Since the first month is March, the value 0 should appear.  Now copy this formula to all the rows below the first one.  Did the correct 0 and 1 values for Winter get filled in?

For the next column, put Spring at the top.  This time the formula should be 1 if the values are 3, 4, or 5.  Finally, fill in the Summer label and use the values 6, 7, and 8.  When you have copied these formulas all the way down the columns, save the spreadsheet.

Deriving the Streamflow Model

To create a model of the relationship between streamflow and gage height, we will use the regression technique. This technique allows us to fit data from multiple casual variables (also called independent variables) to a single effect variable (dependent variable).  In this case, the independent variables are temperature, rainfall, and the three season variables. The dependent variable is gage height.  To learn more about regression using Excel and how to interpret results, click here.

Before we start the regression, we will need to do a couple more data manipulations.  First, we need the data in the precipitation column for our analysis.  We will not actually use the precipitation from the current day to predict discharge.  This is because there is a time delay between the rainfall and runoff reaching the gaging station. Instead, we will use a 10-day total for precipitation as our indicator.  Start by placing the title 10dayPrecip at the top of the last blank column.  In the column's 10th row, enter the formula =SUM(E2:E11) to sum the 10 days of precipitation. Next, copy this formula into all the cells below this one to complete the table.  Fill in zeros in the top nine cells so that they are not blank.

Now we need to copy the average temperature data to the next blank column.  Finally, we will need to do a statistical transformation of the discharge data.  As we saw in our previous curve fitting, streamflow is non-linear. There are a few instances of very high flow but most instances are lower flows.  Since regression is a linear technique, this will cause problems.  To adjust for this, we will apply the natural logarithm function to the discharge.  This will give the resulting distribution a linear appearance.  Put a title in the blank column to the right of the copy of the precipation data, LnDischarge.  Now insert this formula into the cell below:

=ln(j2)

The number 7.709 should appear in the cell. This is the results of the formula.  Now copy that formula into all the cells below.

The focus of the regression is our spreadsheet with all of the variables.  To start the process, click on Tools, Data Analysis and select Regression. For this menu item to be active, your version of Excel must have activated the AnalysisToolpak.  To do this, click on Tools, Add Ins and then select AnalysisToolpak.  Now you should be able to find Tools, Data Analysis.  

Choosing the regression option

When you click OK, you will be given a selection box to fill in the input and output information. It will look like this:

regression selection box    

The Input Y Range is the dependent variable, or in the example here, P1 until the bottom of the data set.  To select this, click inside the white box.  Then click at the top of the column P and notice that P1 gets filled in the box.  If you hold down the Shift key and then click End and then the down arrow, the cursor should jump to the table's last entry and fill in the range as illustrated below.  You can also hold down the shift key and page or scroll down to the bottom.

Click in the Input X Range box and then select the WINTER, SPRING, SUMMER, PRECIP, AvgTEMP to complete that box in the same way.  You should also click on the Labels box, check the New Worksheet Ply box, and fill in the name of the destination worksheet.  Let's call this worksheet regress1.

You can then click OK and the regression model will be calculated.  As output, you will get the information on the amount of the variance explained, a set of coefficients for each of the independent variables, and the significance of each of those variables.  Refer to the regression background information above if you need help interpreting the results.

By using the coefficients estimated by the model, you can write your streamflow function in this form:

LN (STREAMFLOW) = Intercept + (B1*WINTER) + (B2*SPRING) + (B3*SUMMER)+(B4*10dayPRECIP)+(B5*AvrTemp)

In this model, Intercept and each of the B coefficients are constants.  Filling in the values for the actual temperature, precipitation, and season for any weather event in the area near this gage will give you an streamflow estimate for that day. 

 


PREVIOUS PAGE                                                    NEXT PAGE