What’s Cropping Up? Volume 29, Number 1 – January/February 2019

Increase Yield Monitor Data Accuracy and Reduce Time Involved in Data Cleaning

Sheryl Swink1, Tulsi Kharel1, Dilip Kharel1, Angel Maresma1, Erick Haas2, Ron Porter2, Karl Czymmek1,3, and Quirine Ketterings1
1Cornell Nutrient Management Spear Program, 2Cazenovia Equipment Company, 3PRO-DAIRY


Reliable yield maps allow farmers and farm consultants to analyze yields per field, within fields, across fields, and across years. Yield maps can be used to develop yield stability zones, or to identify the reasons for low- or high-yielding areas by overlaying them with other geospatially tagged data such as elevation maps and soil series maps. For reliable data, pre-harvest calibration of yield monitors and sensors should be followed by careful operation in the field and proper post-harvest data cleaning in the office (Figure 1). This article presents best practices (pre-harvest, in-field, and post-harvest) that minimize yield monitor data errors and noise, reduce loss of data, and speed up data cleaning.


Pre-Harvest

  1. Field naming. Develop a simple and consistent set of field IDs or names for each farm. Make sure all operators know and use the correct field identification. Using numbers eliminates spelling errors. Inconsistency in a field’s name from year to year results in extra, time-consuming post-harvest data clean-up.
  2. Field boundaries. Establish geospatially fixed (frozen) field boundary files and load them into the yield monitor before harvest begins, following the procedures in your yield monitor manual. Preloaded boundaries help keep field IDs accurate and make it easier to assign harvest data to the correct field as the harvester moves from field to field.
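The field-naming advice in item 1 can be checked before harvest with a short script. The following is a minimal sketch, not part of the published protocol: it uses Python's standard `difflib` module to flag field names that look like spelling variants of each other. The function names, the sample field names, and the 0.8 similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase the name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similar_field_names(names, threshold=0.8):
    """Return pairs of distinct names that look like variants of one field.

    The threshold is a hypothetical starting point; tune it for your farm.
    """
    unique = sorted(set(names))
    pairs = []
    for a, b in combinations(unique, 2):
        ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical field names collected from several seasons of monitor files.
names = ["Home East", "home-east", "Hom East", "Back 40", "12"]
flagged = similar_field_names(names)
for a, b in flagged:
    print(f"Possible duplicate field names: {a!r} and {b!r}")
```

Any pair that is flagged is worth resolving to a single ID before the next season's data are logged.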
Figure 1: Valuable data can be obtained when yield monitors are calibrated and yield data are properly cleaned. For instructions on corn silage and grain yield monitor data cleaning, see: http://nmsp.cals.cornell.edu/publications/extension/ProtocolYieldMonitorDataProcessing2_8_2018.pdf.


In-Field

  1. Calibrate. Calibrating against accurate scale weights or a grain cart with load sensors increases accuracy. When calibrating, harvest as you normally would, in areas of the field with average crop (include the variability in the field, not just the best part). Recalibrate the yield monitor often: for each crop, or even each variety, that is being harvested, and whenever crop conditions change significantly (very dry to very wet). Check and zero the mass flow sensor every morning so that the sensor identifies crop flow accurately. Clean the lens of the moisture sensor and inspect it for damage daily.
  2. Field name/ID. Check that the correct field name/ID is entered or displayed before the harvester enters a new field. Avoid inventing field names “on the fly.” Carefully check spelling if manually entering a field ID while harvesting. Misspellings or variations in field names from season to season make it difficult to match field data files across years for yield comparisons and within-field variability analysis. Proper field naming will ensure that yield data are assigned to the correct field files.
  3. Harvest speed. Maintain a steady harvest speed within the calibration range for your system. Yield data recorded outside of the calibration range will be less accurate (irregular, very slow, or very high velocities over parts of the field result in yield calculation errors).
  4. Header height. Be sure the monitor logs a start and stop for each directional pass across the field to ensure data and yield area are logged properly. In most cases, the operator must lift the header beyond a set height to trigger the “stop logging” signal when exiting a pass or turning in the field. For some equipment, material flow can also be used to log the end of passes when the header is not raised for turning or for driving in the field without harvesting. Correctly logged field passes expedite trimming of unrepresentative start and end pass data points (ramping effect) during the cleaning process and proper shifting of data when correcting for flow and/or moisture delays relative to GPS location.
  5. Swath width. Be sure the recorded swath width is the actual width harvested. If swath width is not recorded properly, the harvested area calculated is wrong and so is the yield value. If the GPS system of the yield monitor has a large positional error (e.g. WAAS), turn off the auto swath adjustment and manually enter the default swath/chopper width. When harvesting less than the default chopper width without auto-swath, manually adjust swath width of the pass in the yield monitor to avoid erroneous yield calculations.
  6. Short rows. For long, narrow fields, plant and harvest rows the length of the field rather than the width if practical and consistent with soil conservation and other farm objectives. Short harvest passes distort yield data due to ramping velocity and flow impacts at the beginning and end of a pass, leaving few or no accurate data points in very short passes.
  7. Multiple combines/choppers in the field. If using more than one combine or chopper on a field, harvest a discrete section of the field with each one rather than mixing their passes across the whole field. Differences between operators, equipment and sensors result in different flow and moisture delays. These factors, if interlaced across the field, make it difficult to properly clean data.
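The swath width point in item 5 follows from simple arithmetic: the harvested mass comes from the width actually cut, while the monitor divides by the area computed from the recorded width. The sketch below illustrates that relationship under this simplifying assumption; the function name and the numbers are hypothetical.

```python
def reported_yield(true_yield, recorded_width_ft, actual_width_ft):
    """Yield the monitor would report when area uses the recorded width.

    Mass is collected over the actual width, but area is computed from the
    recorded width, so the reported value scales by actual / recorded.
    """
    if recorded_width_ft <= 0 or actual_width_ft <= 0:
        raise ValueError("swath widths must be positive")
    return true_yield * actual_width_ft / recorded_width_ft


# Hypothetical pass: 20 tons/acre crop, 20 ft recorded, only 15 ft harvested.
shown = reported_yield(20.0, recorded_width_ft=20.0, actual_width_ft=15.0)
print(f"Monitor reports {shown:.1f} tons/acre instead of 20.0")
```

In this example a pass harvested at three quarters of the recorded width is logged at 15 tons/acre, a 25% underestimate, which is why the swath width needs to be adjusted for partial passes.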


Post-Harvest

Do not risk losing the season’s data by just leaving it on your monitor or relying on the cloud to save it. Download the raw yield monitor data files periodically during the season. The data cleaning protocol requires raw data to be transferred into Ag Leader format. Save the original files, backing them up on thumb drives and on your computer.
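One way to make the backup step routine is a small script that copies the downloaded raw files to a second location and confirms that each copy matches the original. This is a minimal sketch using only the Python standard library; the function names and the folder paths in the usage comment are hypothetical, and the backup folder is assumed to sit outside the source folder.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_raw_files(source_dir, backup_dir):
    """Copy every file under source_dir to backup_dir and verify each copy."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        dest = backup_dir / src.relative_to(source_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves file timestamps
        if sha256_of(src) != sha256_of(dest):
            raise OSError(f"Backup of {src} does not match the original")
        copied.append(dest)
    return copied


# Example call with hypothetical paths:
# back_up_raw_files("D:/monitor_download_2018", "E:/yield_backup/2018")
```

Running it after each download during the season leaves a verified second copy of the untouched raw files.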

In Summary

Reliable data are essential for making the right decisions in field management. Mitigating errors at the source reduces the amount of data loss when filtering out noise during the post-harvest data cleaning process. The accuracy of yield data depends not only on proper calibration of yield monitoring equipment prior to and during harvest, but also on operation in the field and post-harvest data cleaning. Data become more reliable and the data cleaning process can be accelerated with implementation of the pre-harvest, in-field, and post-harvest practices described in this article.


This work was co-sponsored by the United States Department of Agriculture, National Institute of Food and Agriculture, Agriculture and Food Research Initiative Bioenergy, Natural Resources and Environment program, grants from the Northern New York Agricultural Development Program (NNYADP), New York Farm Viability Institute, New York Corn Growers Association, and Federal Formula Funds. For questions about these results, contact Quirine M. Ketterings at 607-255-3061 or qmk2@cornell.edu, and/or visit the Cornell Nutrient Management Spear Program website at: http://nmsp.cals.cornell.edu/.



What’s Cropping Up? Volume 28, Number 1 – January/February 2018



Corn Silage and Grain Yield Monitor Data Cleaning

Tulsi Kharel, Sheryl Swink, Connor Youngerman, Angel Maresma, Karl Czymmek, and Quirine Ketterings
Cornell University Nutrient Management Spear Program

Calibration of yield monitors during the harvest season is essential for obtaining accurate yield data, but even when monitors are calibrated properly, the data they record still need to be “cleaned”. Recorded yield values are estimated from:

  1. Distance (inches or feet) travelled by the harvester during data logging time period.
  2. Width (inches or feet) harvested during each logging time period.
  3. Silage or grain flow (mass) measured by the equipment’s flow sensor per logging time period (lbs/second).
  4. Moisture content (MC in %) of the harvested mass as measured by a moisture sensor per time period.
  5. Logging interval of the yield monitoring system (seconds).
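The five inputs above combine into a yield value for each logged point. The following is a minimal sketch of that arithmetic, not the formula used by any particular monitor: area is distance times width (43,560 square feet per acre), wet mass is flow times the logging interval, and an optional second step adjusts the wet yield to a reference moisture content. The function names and the example numbers are assumptions, as is the 65% reference moisture used for silage in the example.

```python
SQFT_PER_ACRE = 43560.0


def point_yield_lb_per_acre(distance_ft, width_ft, flow_lb_per_s, interval_s):
    """Wet yield (lb/acre) for one logged point."""
    area_acres = distance_ft * width_ft / SQFT_PER_ACRE
    if area_acres <= 0:
        raise ValueError("distance and width must give a positive area")
    wet_mass_lb = flow_lb_per_s * interval_s
    return wet_mass_lb / area_acres


def adjust_to_moisture(wet_yield, measured_mc_pct, reference_mc_pct):
    """Re-express a wet yield at a reference moisture content (both in %)."""
    dry_matter = wet_yield * (1.0 - measured_mc_pct / 100.0)
    return dry_matter / (1.0 - reference_mc_pct / 100.0)


# Hypothetical point: 6 ft travelled, 20 ft swath, 50 lb/s flow, 1 s interval.
wet = point_yield_lb_per_acre(6.0, 20.0, 50.0, 1.0)
# Silage measured at 70% moisture, expressed at an assumed 65% reference.
adjusted = adjust_to_moisture(wet, measured_mc_pct=70.0, reference_mc_pct=65.0)
print(f"{wet / 2000:.2f} wet tons/acre, {adjusted / 2000:.2f} tons/acre at 65% moisture")
```

Because distance and width sit in the denominator, any error in either one passes directly into the yield value, which is why the sources of error described next matter so much.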

Errors that impact the accuracy of the yield data occur in multiple ways. The distance the combine/chopper travels during a time period and the width give the area required for yield calculation. If a combine is not equipped with a harvest swath width sensor, the default will be the chopper/combine width and that can cause errors when fewer rows are harvested than the equipment width. Another source of error is the delay time of grain or silage moving from the chopper/combine head to the flow rate sensor. Flow rate sensors, moisture sensors, and Global Positioning System (GPS) units are located in different places on harvest equipment and since it takes some time for harvested silage or grain to travel to the sensors, adjustments need to be made (this is called delay time correction). Each harvest pass will be affected by this delay correction, independent of whether a new pass starts from one end of the field or from somewhere within the field (in situations where the harvester is paused during harvest). The delay time itself is related to the speed of the combine/chopper as well, which may introduce another source of error.
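Delay time correction can be pictured as shifting the sensor readings back along the pass so that each reading lines up with the position where the crop was actually cut. The sketch below is a simplified illustration under stated assumptions (a constant delay and a constant logging interval within one pass); it is not the algorithm used by Yield Editor, and the function name and sample values are hypothetical.

```python
def apply_flow_delay(positions, flows, delay_s, interval_s):
    """Pair each logged position with the flow reading taken delay_s later.

    Crop cut at position i reaches the flow sensor `shift` records later, so
    position i is matched with flow i + shift. The last `shift` positions in
    the pass have no matching reading and are dropped.
    """
    if len(positions) != len(flows):
        raise ValueError("positions and flows must have the same length")
    shift = round(delay_s / interval_s)
    if shift < 0:
        raise ValueError("delay must not be negative")
    if shift == 0:
        return list(zip(positions, flows))
    return list(zip(positions[:-shift], flows[shift:]))


# One hypothetical pass logged every second with a 2-second flow delay.
positions = [0, 1, 2, 3, 4]       # record index along the pass
flows = [0.0, 0.0, 10.0, 12.0, 11.0]  # lb/s seen at the sensor
aligned = apply_flow_delay(positions, flows, delay_s=2.0, interval_s=1.0)
```

The shift has to be applied to each pass separately, which is one reason correctly logged pass starts and stops matter for cleaning.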

Combines and forage choppers are calibrated for a certain velocity range. If the velocities that are recorded fall outside the calibrated range, flow rate and yield values associated with those points are no longer trustworthy and should be removed from the data. Similarly, abrupt changes in velocity affect the flow rate, resulting in erroneous yield calculations for logged data points. Other easily trackable errors are logged data points with zero grain or silage moisture; this may occur as the chopper or combine enters the field or pauses mid-field while the silage or grain flow has not yet reached the moisture sensor.
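The three checks described in this paragraph (velocity outside the calibrated range, abrupt velocity changes, and zero moisture readings) can be sketched as a simple point filter. This is an illustration only: the dictionary keys, the function name, and all threshold values are hypothetical and would need to match the calibration range of the actual equipment.

```python
def filter_points(points, min_speed, max_speed, max_speed_change):
    """Keep points with in-range, steady speed and a non-zero moisture reading.

    max_speed_change is the largest allowed difference in speed between
    consecutive logged points (same units as speed).
    """
    kept = []
    prev_speed = None
    for point in points:
        speed = point["speed_mph"]
        in_range = min_speed <= speed <= max_speed
        steady = prev_speed is None or abs(speed - prev_speed) <= max_speed_change
        has_moisture = point["moisture_pct"] > 0
        if in_range and steady and has_moisture:
            kept.append(point)
        prev_speed = speed  # compare against the previous logged point
    return kept


# Hypothetical logged points from the start of a silage pass.
points = [
    {"speed_mph": 1.0, "moisture_pct": 0.0},   # too slow, no moisture yet
    {"speed_mph": 3.0, "moisture_pct": 68.0},  # abrupt jump from 1.0 mph
    {"speed_mph": 3.2, "moisture_pct": 67.0},
    {"speed_mph": 3.1, "moisture_pct": 0.0},   # zero moisture reading
    {"speed_mph": 3.3, "moisture_pct": 66.0},
    {"speed_mph": 7.5, "moisture_pct": 66.0},  # above calibrated range
]
clean = filter_points(points, min_speed=2.0, max_speed=6.0, max_speed_change=1.0)
```

With these example thresholds only the two steady, in-range points with valid moisture readings remain.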

Last but not least, if the operator does not raise the combine/chopper head after completing a pass, the pass number will not be updated in the logged dataset. Cleaning data obtained this way takes additional effort, so lifting the combine/chopper head while turning in the field is recommended.

The use of raw data without proper cleaning can lead to substantial over- and under-prediction of actual yield depending on the field and harvest conditions, especially for corn silage yield data. Figure 1 shows this in more detail for a number of fields. Look at a 20 ton/acre corn silage yield (cleaned yield) for the fields in this figure, and you will see that the raw data corresponding to this cleaned yield can range from 15 to 37 tons/acre! The raw data for many of the fields in this figure overpredicted yield, while for a number of other fields it actually underpredicted. Thus, data cleaning is absolutely necessary.

Figure 1: Not cleaning yield monitor data can result in large over- or under-predictions of actual corn silage yield.

In the past months, the Cornell Nutrient Management Spear Program, in collaboration with colleagues at the University of Missouri, the United States Department of Agriculture Agricultural Research Service (USDA-ARS) Cropping Systems and Water Quality Unit, Columbia MO, and the Iowa Soybean Association, evaluated cleaning protocols to develop a standardized and semi-automated procedure for cleaning whole-farm yield datasets. The protocol calls for unfiltered or “raw” harvest data files that are downloaded from the yield monitor with the corresponding field boundary files. These files are read into the Ag Leader Technology Spatial Management System (SMS) software to preview the yield map and reassign any harvest data that might show up in the wrong field. Next, the individual field harvest data are exported in Ag Leader Advanced file format. The yield map files are then imported into Yield Editor (https://www.ars.usda.gov/research/software/download/?softwareid=370) for cleaning. Yield Editor is freely available software developed by the USDA-ARS. The software allows for the use of different ‘filters’ to remove the errors mentioned above. The final step in the cleaning protocol is deletion of data points with a moisture content <1% for corn grain and <46% for corn silage, which can be done in Yield Editor, in MS Excel, or in another sortable spreadsheet program. This final step is particularly important for obtaining accurate corn silage yield data. A step-by-step protocol for cleaning individual field datasets and batch processing of harvest data from growers with large numbers of corn silage or grain fields is described in a manual that is available for downloading from the YieldDatabase page (http://nmsp.cals.cornell.edu/NYOnFarmResearchPartnership/YieldDatabase.html) of the Cornell Nutrient Management Spear Program website.
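The final moisture step of the protocol can also be done outside a spreadsheet. The following is a minimal sketch, assuming the cleaned points are already available as Python dictionaries; the cutoffs are the ones stated above, while the function name, the `moisture_pct` key, and the sample values are hypothetical.

```python
# Minimum moisture content (%) below which a data point is deleted.
MIN_MOISTURE_PCT = {"grain": 1.0, "silage": 46.0}


def drop_low_moisture(points, crop):
    """Remove points whose moisture is below the cutoff for the given crop."""
    if crop not in MIN_MOISTURE_PCT:
        raise ValueError(f"unknown crop {crop!r}; expected 'grain' or 'silage'")
    cutoff = MIN_MOISTURE_PCT[crop]
    return [p for p in points if p["moisture_pct"] >= cutoff]


# Hypothetical corn silage points after filtering in Yield Editor.
silage_points = [
    {"yield_tons_acre": 31.0, "moisture_pct": 12.0},
    {"yield_tons_acre": 21.5, "moisture_pct": 66.0},
    {"yield_tons_acre": 19.8, "moisture_pct": 45.9},
    {"yield_tons_acre": 20.4, "moisture_pct": 68.0},
]
kept = drop_low_moisture(silage_points, "silage")
```

Keeping the cutoffs in one small table makes it easy to apply the same rule to every field in a batch.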

Farmers with an interest in sharing corn silage and/or grain yield data with the Nutrient Management Spear Program for updating of the Cornell University yield potential database are invited to get in touch with us. The protocols for data sharing are available at the same weblink listed above. If interested in training sessions on the cleaning protocol this winter, contact Quirine M. Ketterings at qmk2@cornell.edu.


We thank the farmers and farm consultants who supplied data for this project, and our NMSP team members and colleagues in Missouri and Iowa for working with us on the protocol. For questions about the project, contact Quirine M. Ketterings at 607-255-3061 or qmk2@cornell.edu, and/or visit the Cornell Nutrient Management Spear Program website at: http://nmsp.cals.cornell.edu/.


What’s Cropping Up? Volume 27, No. 4 – July/August 2017