Methods of Data Compilation and Analysis
A complete description of the methods used by the USGS to analyze water-quality data collected in the Chesapeake Bay watershed is provided in Hirsch and others (2010), Moyer and others (2012), Hirsch and De Cicco (2014), Hirsch and others (2015), and Chanat and others (2015). The following summary describes how scientists construct the dataset and use the water-quality model to determine nutrient and suspended-sediment loads and trends.
Updated streamflow and water-quality data are compiled each spring to prepare for the annual computation of loads and trends. Daily mean streamflow data for all sites to be analyzed are retrieved directly from the USGS National Water Information System (NWIS). For water-quality analysis, the USGS has compiled a database of historical (1972 through 2014) observed water-quality data collected at each of the nontidal network stations. Since 2011, all observed water-quality results collected by the multiple monitoring agencies that operate the nontidal network have been reported to the U.S. Environmental Protection Agency (EPA) Chesapeake Bay Program and stored in the Chesapeake Environmental Data Repository (CEDR).
Acquiring and Incorporating Data From Other Agencies
Annually, the USGS receives water-quality records from EPA and the CEDR database for the previous water year, defined as the 12-month period from October 1 through September 30. These new water-quality observations are combined with the historical observations to create a complete record of water-quality observations for each nontidal network station.
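The water-year definition and the merging step above can be sketched in a few lines of Python. This is a minimal illustration, not the USGS workflow: the function names and the (date, value) tuple layout are hypothetical, and the sketch assumes the usual USGS convention that a water year is labeled by the calendar year in which it ends.

```python
from datetime import date


def water_year(d: date) -> int:
    """Return the water year containing d (October 1 through September 30).

    Assumption: the water year is named for the calendar year in which it ends.
    """
    return d.year + 1 if d.month >= 10 else d.year


def combine_records(historical, new_records):
    """Merge two lists of (date, value) observations into one sorted record.

    Exact duplicates are dropped; the tuple layout is hypothetical.
    """
    return sorted(set(historical) | set(new_records))


historical = [(date(2013, 9, 15), 1.2), (date(2013, 3, 2), 0.9)]
new_records = [(date(2013, 10, 7), 1.4), (date(2014, 6, 21), 1.1)]
complete = combine_records(historical, new_records)
print([water_year(d) for d, _ in complete])
```

Under this convention, the sample collected on October 7, 2013 belongs to water year 2014.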
The primary water-quality constituents considered are total nitrogen, dissolved inorganic nitrogen, total phosphorus, orthophosphate, and suspended sediment. All records of water-quality observations are reviewed by the collecting agency to ensure data completeness and accuracy.
Accounting for Changes in Laboratory Procedures
Over the long-term history of the program, laboratory analyses and methodologies have changed slightly, and these changes must be accounted for prior to the data analysis. They include:
- Historically, most programs analyzed samples for total suspended solids (TSS), but many programs now analyze samples for suspended-sediment concentration (SSC) at all stations. Glysson and others (2001) provide details about the differences between these analyses. Because of this shift in procedure, TSS and SSC were sometimes combined in water-quality input files. If TSS and SSC were collected at the same time, priority was given to the SSC sample.
- Where supported by available information, missing observations of total nutrient concentrations were calculated as the sum of constituent species; for example, total nitrogen (TN) may be calculated as the sum of total dissolved and total particulate nitrogen.
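The two adjustments above can be illustrated with a short Python sketch. The record layout (dictionaries with "time", "method", and "value" keys) and the function names are assumptions made for this example only; they do not describe the actual input-file format.

```python
def select_sediment(samples):
    """Keep one sediment result per sampling time, preferring SSC over TSS.

    samples: list of dicts with hypothetical keys 'time', 'method'
    ('SSC' or 'TSS'), and 'value'.
    """
    chosen = {}
    for s in samples:
        current = chosen.get(s["time"])
        # Take the first result seen, but let an SSC result replace a TSS one.
        if current is None or (s["method"] == "SSC" and current["method"] == "TSS"):
            chosen[s["time"]] = s
    return [chosen[t] for t in sorted(chosen)]


def fill_total_nitrogen(tn, dissolved_n, particulate_n):
    """Return the observed TN, or the sum of its species when TN is missing.

    Returns None if TN is missing and either species is unavailable.
    """
    if tn is not None:
        return tn
    if dissolved_n is None or particulate_n is None:
        return None
    return dissolved_n + particulate_n


samples = [
    {"time": "2010-04-01T10:00", "method": "TSS", "value": 42.0},
    {"time": "2010-04-01T10:00", "method": "SSC", "value": 55.0},
    {"time": "2010-05-03T09:30", "method": "TSS", "value": 12.0},
]
print(select_sediment(samples))
print(fill_total_nitrogen(None, 1.5, 0.5))
```

In this sketch a TSS result is retained only when no concurrent SSC result exists, which mirrors the priority rule stated in the list.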
Water-Quality Model Used for Load and Trend Determination
Concentration data retrieved from the nontidal database and daily streamflow data from NWIS are used for load and trend analyses. Recent advances in the statistical tools available to compute loads and trends have led to the use of revised data-analysis methodologies. For water year 2014, all load and trend estimates were made using a weighted regression method known as Weighted Regressions on Time, Discharge, and Season (WRTDS; Hirsch and others, 2010).
How Does the WRTDS Model Work?
The WRTDS model uses a sparse set of discrete water-quality observations combined with a continuous daily discharge record to estimate concentration on days for which no water-quality data are available. Daily concentration and load estimates are then aggregated to monthly and annual time scales. An algorithm is then applied to estimate the trend in “flow-normalized load,” namely a trend that minimizes the confounding effect of any concurrent trend in discharge. Detailed comparative studies by Chesapeake Bay River Input Monitoring (RIM) team staff (Moyer and others, 2012; Chanat and others, 2015) have documented that WRTDS performs better than regression-based approaches used historically.
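The aggregation step can be sketched as follows. The sketch does not implement WRTDS itself; it assumes daily concentration estimates are already available, and it assumes concentration in milligrams per liter and discharge in cubic meters per second, for which the conversion to kilograms per day is a factor of 86.4. The function names and input layout are hypothetical.

```python
from collections import defaultdict
from datetime import date

# (mg/L) * (m^3/s) = g/s; 86,400 s/day and 1,000 g/kg give 86.4 kg/day.
KG_PER_DAY_FACTOR = 86.4


def daily_load_kg(conc_mg_per_l, discharge_m3_per_s):
    """Daily load in kilograms from a daily concentration and mean discharge."""
    return conc_mg_per_l * discharge_m3_per_s * KG_PER_DAY_FACTOR


def aggregate_loads(daily):
    """Sum daily loads to calendar months and to water years.

    daily: iterable of (date, concentration, discharge) tuples.
    Returns (monthly, annual), keyed by (year, month) and by water year.
    """
    monthly = defaultdict(float)
    annual = defaultdict(float)
    for d, conc, q in daily:
        load = daily_load_kg(conc, q)
        monthly[(d.year, d.month)] += load
        wy = d.year + 1 if d.month >= 10 else d.year  # water year ends Sept. 30
        annual[wy] += load
    return dict(monthly), dict(annual)


daily = [(date(2014, 9, 30), 1.0, 1.0), (date(2014, 10, 1), 1.0, 1.0)]
monthly, annual = aggregate_loads(daily)
print(monthly)
print(annual)
```

The two example days fall in different water years, so each annual total contains a single day of load.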
Why Are Trends Flow-Normalized?
Observed water-quality loads are highly influenced by streamflow and season. Trends are adjusted for flow and season to minimize the influence of these potentially confounding factors. This process is referred to as “flow normalization” and is described further in Hirsch and others (2010). Flow-normalized trends help scientists evaluate changes in load resulting from changing sources, delays associated with storage or transport of historical inputs, and (or) implemented management actions.
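The idea behind flow normalization can be sketched in Python under simplifying assumptions. Here the estimated load for a given day is averaged over the discharges recorded on that calendar day in every year of record, so the result does not depend on the discharge that happened to occur in that particular year. The concentration function toy_conc is a hypothetical stand-in for a fitted model, and the units (milligrams per liter, cubic meters per second) are assumed; this is an illustration of the concept, not the published algorithm.

```python
import math


def flow_normalized_load(estimate_conc, decimal_year, historical_q):
    """Average the estimated daily load (kg/day) over historical discharges.

    estimate_conc: function (decimal_year, discharge) -> concentration, mg/L.
    historical_q: discharges (m^3/s) observed on this calendar day in each
    year of record.
    """
    loads = [estimate_conc(decimal_year, q) * q * 86.4 for q in historical_q]
    return sum(loads) / len(loads)


def toy_conc(decimal_year, q):
    """Hypothetical concentration model: declines over time, rises with flow."""
    return 2.0 - 0.01 * (decimal_year - 2000.0) + 0.1 * math.log(q)


same_day_q = [10.0, 20.0, 30.0]  # hypothetical discharges for one calendar day
print(flow_normalized_load(toy_conc, 2005.5, same_day_q))
print(flow_normalized_load(toy_conc, 2014.5, same_day_q))
```

Because both calls use the same set of discharges, the difference between the two values reflects only the change in the modeled concentration over time.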
How Are Trends in Loads Identified?
Identified trends are based on the results of likelihood analyses using bootstrapped replicates (Hirsch and others, 2015). As an example, for a given site and constituent, reported positive (or negative) trends having likelihood estimates of at least 0.66 mean that positive (or negative) trends were evident in about two-thirds of the bootstrapped replicates for that site and constituent.
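The likelihood interpretation can be illustrated with a simplified Python sketch. It assumes the likelihood is taken as the plain fraction of bootstrapped replicates showing a trend in a given direction; the published method may apply a different estimator. The function names, the replicate values, and the "no trend reported" label are hypothetical, and the 0.66 threshold is taken from the example in the text.

```python
def trend_likelihood(replicate_trends):
    """Return (direction, likelihood) from bootstrapped trend estimates.

    Likelihood here is the fraction of replicates with the dominant sign.
    """
    n = len(replicate_trends)
    if n == 0:
        raise ValueError("at least one replicate is required")
    up = sum(1 for t in replicate_trends if t > 0) / n
    down = sum(1 for t in replicate_trends if t < 0) / n
    return ("positive", up) if up >= down else ("negative", down)


def reported_trend(replicate_trends, threshold=0.66):
    """Report the direction only if its likelihood meets the threshold."""
    direction, likelihood = trend_likelihood(replicate_trends)
    return direction if likelihood >= threshold else "no trend reported"


# Hypothetical replicate trends in load (any consistent unit).
replicates = [-3.1, -2.4, -0.8, 0.5, -1.9, -2.2, 1.1, -0.4, -1.6, 0.2]
print(trend_likelihood(replicates))
print(reported_trend(replicates))
```

In this example, 7 of the 10 hypothetical replicates show a decrease, so a negative trend would be reported.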
What Trends Are Computed?
For stations having water-quality records beginning prior to 1990, trends in load are computed for both the period of record and for the most recent 10 years (2005-14). For stations having records beginning after 1990, only 10-year trends (2005-14) are computed. All data available, including data collected prior to 2005, are used to estimate 10-year trends.
Are There Any Exceptions?
Historically, no trends were computed for stations having water-quality records of less than 10 years. However, in 2014 a large number of newer stations had records that reached 9 years. Because the most spatially comprehensive analyses available were needed in advance of the “Mid-Point Assessment,” set for 2017, for the bay total maximum daily load, RIM staff elected to compute and include trends for stations having only 9 years of data (2006-14) in the 2014 water year results.
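The rules for which trends are computed, including the 2014 exception, can be summarized in a short Python sketch. The function and parameter names are hypothetical, and two points are assumptions: record length is counted as the number of water years from the first year of record through the last, and a record beginning in 1990 itself, which the text does not address, is treated like one beginning after 1990.

```python
def trend_periods(first_year, last_year=2014, min_years=10):
    """Return the (start, end) water-year periods for which trends are computed.

    first_year: first water year of the station's water-quality record.
    min_years: minimum record length; 10 historically, 9 for the 2014 results.
    """
    n_years = last_year - first_year + 1
    if n_years < min_years:
        return []
    # Short-term trend: the most recent 10 years, or the full record if shorter.
    periods = [(max(first_year, last_year - 9), last_year)]
    if first_year < 1990:
        # Long-term trend over the period of record.
        periods.insert(0, (first_year, last_year))
    return periods


print(trend_periods(1985))               # period of record and 2005-14
print(trend_periods(1995))               # 2005-14 only
print(trend_periods(2006))               # none under the 10-year rule
print(trend_periods(2006, min_years=9))  # 2006-14 under the 2014 exception
```

Setting min_years to 9 reproduces the exception described above, and the default of 10 reflects the historical practice.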