Update of Scoring Functions for Cornell Soil Health Test

Aubrey K. Fine, Aaron Ristow, Robert Schindelbeck and Harold van Es
Soil and Crop Sciences Section, School of Integrative Plant Science, Cornell University

Comprehensive Assessment of Soil Health
Soil health refers to the ability of a soil to function and provide ecosystem services. The Cornell Comprehensive Assessment of Soil Health (CASH), initially referred to as the Cornell Soil Health Test, is a tool designed to aid landowners and managers in the evaluation of their soil health status. When a soil is not functioning to its full capacity, sustainable productivity, environmental quality, and net farmer profits are jeopardized over the long term.

Soil health cannot be determined directly, so it is assessed by measuring indicators that relate to soil quality. The CASH approach broadens the scope of conventional soil testing by evaluating the integration of biological, physical, and chemical properties, including soil texture, available water capacity, soil penetration resistance (i.e., compaction), wet aggregate stability, organic matter content, soil proteins, respiration, active carbon, and macro- and micro-nutrient content (see soilhealth.cals.cornell.edu/ for more details). Results from testing are interpreted using scoring functions, which are equations quantifying the relationship between measured indicator values and soil health status. Scores from all indicators are synthesized into a comprehensive report, which identifies specific soil constraints and provides management suggestions for clients.

A challenge of assessing soil health in this way is the interpretation of measured values for each soil property. For example, what does a 30% value for wet aggregate stability mean? Is it an indication of a problem, or does it signify a good soil? Should the interpretation be different depending on soil texture? The CASH scoring functions and interpretative color code for each indicator were developed to address this issue. The scoring functions for the original Cornell Soil Health Test, first made publicly available in 2006, were based on soil data collected from the Northeastern United States. In the decade since, the Cornell Soil Health Laboratory (CSHL) database has expanded to include data for a much greater number and more geographically diverse set of samples representing over 60% of the United States and areas overseas. This project reports on the most recent analysis of the CSHL database, performed in 2016. The results of this work allowed us to refine the scoring functions and incorporate regional differences that broaden the scope of the CASH outside of the Northeast.

Scoring Functions
The CASH scoring functions are based on the distribution of measured values for each indicator from all samples in the CSHL database. This approach allows us to assess whether a particular soil sample shows low, medium, or high values relative to other soils in our database, and thereby make some judgment on the health of that soil and possible problems. This is similar to many medical tests where an individual’s health measure (e.g., blood potassium level) is scored based on the values measured from a large population to assess whether or not it is within normal range.

For most CASH scoring functions, we use the mean and standard deviation of our data set to calculate the cumulative normal distribution (CND). The CND function is essentially the scoring function, as it translates measured values to a unit-less score ranging from 0-100. Scoring functions for all indicators and textural groups (i.e., coarse, medium, fine) were calculated this way. This approach can be adapted to other regions with different soils and climate, as scoring functions can be attuned to fit different conditions.

As an illustration on how scoring functions are developed, the histogram in Figure 1 shows the observed distribution of measured values of active carbon (Active C) for medium textured soils. The height of the bars depicts the frequency of measured values that fall within a range (bin) of 100 ppm along the horizontal axis. For instance, approximately 24% of the soil samples in this set had measured Active C concentration between 500 and 600 parts per million (ppm). The normal distribution, or bell curve, superimposed over the bars was calculated using the mean (531 ppm) and standard deviation (182 ppm) of all medium textured soils. Using these two parameters, we can develop a scoring curve (Fig. 2e) representing the CND having the same parameters (i.e., the mean and standard deviation).

Figure 1. Example of the distribution of active carbon indicator data in medium textured soils used to determine the scoring curve.
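The conversion from a measured value to a score can be sketched in a few lines of Python. This is a minimal illustration, not the CSHL implementation: it assumes the score is simply 100 times the cumulative normal probability, and the function name cnd_score is hypothetical. The mean and standard deviation are the Active C values for medium textured soils given above.

```python
import math


def cnd_score(value, mean, sd):
    """Return 100 x the cumulative normal probability of a measured value.

    Assumption: the score is the plain cumulative normal distribution (CND)
    scaled to 0-100, with no further adjustment.
    """
    z = (value - mean) / sd
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Mean and standard deviation reported for Active C in medium textured soils.
ACTIVE_C_MEAN_PPM = 531.0
ACTIVE_C_SD_PPM = 182.0

for ppm in (349.0, 531.0, 713.0):
    score = cnd_score(ppm, ACTIVE_C_MEAN_PPM, ACTIVE_C_SD_PPM)
    print(f"{ppm:.0f} ppm -> score {score:.1f}")
# 349 ppm -> score 15.9
# 531 ppm -> score 50.0
# 713 ppm -> score 84.1
```

Under this assumption, a sample at the database mean scores 50, and samples one standard deviation below or above the mean score about 16 and 84.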

Three general types of scoring are used, whether the curve shape is normal, linear, or otherwise:

More is better, where a higher measured value of the indicator implies a higher score. We use this type of scoring curve for most soil health indicators.
Less is better, where higher measured values are assigned a lower score and are associated with poorer soil functioning. This is the case for Surface and Subsurface Hardness and the Root Health Bioassay Rating. Manganese and Iron are also scored as ‘less is better’ because these micronutrients are associated with a risk of toxicity from excess levels.
Optimum curve, where the scoring curve has an optimum range and the scores are lower when measured values fall either below or above this range. Extractable Phosphorus and pH are both scored using an optimum curve.
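The three scoring types above can be sketched as follows. This is an illustration under stated assumptions, not the published CASH equations: the function names are hypothetical, "less is better" is taken to be the mirror image of the cumulative normal curve, and the optimum curve is drawn as a trapezoid with linear ramps on either side of the optimum range. The actual curve shapes are those shown in the figures.

```python
import math


def normal_cdf(value, mean, sd):
    """Cumulative normal probability of a measured value."""
    return 0.5 * (1.0 + math.erf((value - mean) / (sd * math.sqrt(2.0))))


def more_is_better(value, mean, sd):
    """Higher measured values receive higher scores."""
    return 100.0 * normal_cdf(value, mean, sd)


def less_is_better(value, mean, sd):
    """Higher measured values receive lower scores (mirrored curve)."""
    return 100.0 * (1.0 - normal_cdf(value, mean, sd))


def optimum(value, low_zero, low_opt, high_opt, high_zero):
    """Score 100 inside [low_opt, high_opt] and less on either side.

    Assumption: the score falls linearly to 0 at low_zero and high_zero.
    All four breakpoints are hypothetical parameters supplied by the caller.
    """
    if low_opt <= value <= high_opt:
        return 100.0
    if value <= low_zero or value >= high_zero:
        return 0.0
    if value < low_opt:
        return 100.0 * (value - low_zero) / (low_opt - low_zero)
    return 100.0 * (high_zero - value) / (high_zero - high_opt)
```

With this construction, the "more is better" and "less is better" scores for the same measurement, mean, and standard deviation always sum to 100.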

In general, scoring functions are texture group-dependent for physical and biological indicators, with higher scores associated with better soil health. For example, an Active Carbon measurement of 600 ppm may be quite good for a sand, but low for a clay.

Database Analysis
In 2016, we examined samples analyzed using CASH from the continental US states. We identified three regions having suitable sample sizes (n=5,767 total) for further analysis: the Mid-Atlantic, Midwest, and Northeast. These regions align with the United States Department of Agriculture (USDA) Natural Resources Conservation Service (NRCS) Major Land Resource Areas (MLRA) delineations. For each region, samples were identified by textural grouping to create a number of sub-datasets. Descriptive statistics and ANOVAs were performed to evaluate the mean and standard deviation of each region and texture. Based on these findings, we adjusted scoring functions for physical and biological indicators to account for observed statistically significant regional differences in mean indicator values. Chemical indicators are scored using experimentally-established thresholds, rather than the CND (see below), so they were left largely unchanged.

New Scoring Functions
Figures 2 and 3 show the updated scoring functions for each soil health indicator. Most of these are universally applied to all soils analyzed with the CASH, but in some cases, special considerations are required. For example, a separate scoring function for pH is now used for acid-loving crops (e.g., blueberries or potatoes; Fig. 3a), set one pH unit lower (5.2-to-6.3 are optimum, etc.). Modified-Morgan-P also uses an optimum scoring function (Fig. 3b), where concentrations ranging from 3.5-21.5 ppm are scored at 100. Negative impacts are expected when P is deficient ([P] ≤ 0.45 ppm) or excessive ([P] ≥ 100 ppm). Secondary (Mg) and trace (Fe, Mn, Zn) nutrients are scored using a sub-scoring system (Fig. 3d). Each nutrient is assigned a sub-score of either 0 (suboptimum) or 100 (optimum) depending on measured values. The average of the four nutrient sub-scores is used to determine the secondary nutrient score.
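The averaging step of the sub-scoring system can be sketched as below. This is a minimal illustration: the function name secondary_nutrient_score is hypothetical, and the sub-scores are taken as inputs because the thresholds that separate suboptimum from optimum values are not given here.

```python
def secondary_nutrient_score(sub_scores):
    """Average the 0/100 sub-scores for Mg, Fe, Mn, and Zn.

    sub_scores maps each nutrient symbol to 0 (suboptimum) or 100 (optimum).
    How each sub-score is derived from a measured value is outside this sketch.
    """
    expected = {"Mg", "Fe", "Mn", "Zn"}
    if set(sub_scores) != expected:
        raise ValueError(f"expected sub-scores for {sorted(expected)}")
    if any(score not in (0, 100) for score in sub_scores.values()):
        raise ValueError("each sub-score must be 0 or 100")
    return sum(sub_scores.values()) / len(sub_scores)


# Example: three nutrients optimum, one suboptimum.
print(secondary_nutrient_score({"Mg": 100, "Fe": 100, "Mn": 0, "Zn": 100}))  # 75.0
```

Because each sub-score is either 0 or 100, the combined score can only take the values 0, 25, 50, 75, or 100.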

Figure 2. Comprehensive Assessment of Soil Health scoring functions for physical (a.-c.) and biological (d.-h.) soil health indicators. Functions are shown overlying a five-color scheme (red-orange-yellow-light green-dark green), used to classify scores as very low (0-20), low (20-40), medium (40-60), high (60-80), and very high (80-100), respectively.

Figure 3. Comprehensive Assessment of Soil Health scoring functions for chemical indicators: pH (a) and Modified Morgan Extractable Phosphorus (b), Potassium (c), and secondary/trace nutrients (Mg, Fe, Mn, Zn) (d). Scores are coded using a five-color scheme (red-orange-yellow-light green-dark green), which classifies scores as very low (0-20), low (20-40), medium (40-60), high (60-80), and very high (80-100), respectively.

The CASH Report Summary has traditionally used a three-color system (red-yellow-green; or low-medium-high) for interpreting measured indicator values. This system provided limited resolution for detecting changes in soil health over time, as scores ranging from 30-to-70 would be interpreted as ‘medium’ in the report. To address this, we adjusted to a five-color scale (red-orange-yellow-light green-dark green) to classify values as very low (0-20), low (20-40), medium (40-60), high (60-80), and very high (80-100), respectively. This visual change makes subtle soil health improvements easier to see.
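The difference in resolution between the two color systems can be sketched as follows. The function names are hypothetical, and the handling of boundary scores is an assumption: the published ranges share their endpoints (for example, 20 appears in both 0-20 and 20-40), so this sketch assigns a boundary score to the higher class in the five-color scale and treats 30 and 70 as 'medium' in the three-color system.

```python
def three_color_class(score):
    """Earlier system: scores from 30 to 70 are reported as medium."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 30:
        return "low"
    if score <= 70:
        return "medium"
    return "high"


def five_color_class(score):
    """Updated scale: return (rating, color) for a 0-100 score.

    Assumption: a score exactly on a boundary falls in the higher class.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    bands = [
        (20, "very low", "red"),
        (40, "low", "orange"),
        (60, "medium", "yellow"),
        (80, "high", "light green"),
    ]
    for upper, rating, color in bands:
        if score < upper:
            return rating, color
    return "very high", "dark green"


# A soil that improves from a score of 45 to 65 stays 'medium' in the
# three-color system but moves from 'medium' to 'high' in the five-color scale.
for score in (45, 65):
    print(score, three_color_class(score), five_color_class(score))
```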

The lower the CASH score, the greater the constraint in the proper functioning of processes as represented by the indicator. Land management decisions should, therefore, place priority on correcting very low scores (red). Low and medium scores (orange and yellow) do not necessarily represent a major constraint to proper soil functions, but rather suggest that improvements can be made in management planning. High or very high scores (light green and dark green) indicate that the soil processes represented by these indicators are likely functioning well. As such, management goals should aim to maintain those conditions.

Conclusion
The initial CASH soil health scoring functions were developed using data collected from Northeastern soil samples analyzed in the early 2000s. Ten years of soil health testing allowed us to build a robust database including measured data for multiple soil health indicators. In 2016, we revisited the scoring functions used to score physical and biological indicators to increase the scope of the CASH to soils outside of the Northeast US. The resulting changes have been incorporated into the CASH, and most of them effectively increase the score associated with a given measured indicator value. These adjustments, in addition to the expanded five-color scheme, have helped address some of the concerns expressed by clients who found the CASH interpretations to be slightly off in some cases.

A full manuscript of this article titled “Statistics, Scoring Functions, and Regional Analysis of a Comprehensive Soil Health Database” is currently under review by the Soil Science Society of America Journal. For more details about the CASH framework, visit bit.ly/SoilHealthTrainingManual for a free download of the third edition of the training manual.
