Forecasting early-lactation diseases in Holstein dairy cows using milk spectra and machine learning

D. Lin1,2, J. Li1,2,3*, J. A. Seminara4, D. M. Barbano5, J. A. A. McArt4*

1 City University of Hong Kong, Shenzhen Research Institute, Shenzhen, China
2 Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong, China
3 School of Data Science, City University of Hong Kong, Hong Kong, China
4 Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853
5 Department of Food Science, College of Agriculture and Life Sciences, Cornell University, Ithaca, NY 14853

Fourier-transform infrared (FTIR) spectroscopy offers a non-invasive and cost-effective method for analysing milk composition, but its potential for forecasting early-lactation diseases has yet to be fully explored. We aimed to evaluate the ability of milk FTIR spectra to forecast postpartum diseases in 1,114 Holstein cows from a dairy farm in New York. We collected proportional milk samples once daily from all early-lactation cows and stored the milk at 4°C until FTIR analysis. Cows were followed through 30 days in milk (DIM) and classified as healthy (n = 825; no adverse health events) or diseased (n = 289; diagnosis of clinical ketosis, metritis, displaced abomasum, and/or mastitis). We constructed predictive models for 8 distinct time periods prior to disease diagnosis (>14 d, 14 to 11 d, 10 to 8 d, 7 to 6 d, 5 to 4 d, 3 d, 2 d, and 1 d) using machine learning and deep learning techniques that incorporated milk spectra and cow-level variables, including milk yield, somatic cell count, and parity. Model performance was evaluated based on accuracy (Ac), sensitivity (Se), and specificity (Sp) under a combined scheme of multiple downsampling and 10-fold cross-validation. As diagnosis approached, critical spectral regions related to the absorbance of fat, protein, and lactose exhibited progressive changes. Average model performance improved accordingly, from an Ac of 0.52, Se of 0.49, and Sp of 0.55 at >14 d prior to disease diagnosis to an Ac of 0.73, Se of 0.70, and Sp of 0.77 at 1 d prior to disease diagnosis. Adding cow-level variables to the spectra-based models resulted in an average increase of 7.4%, 9.0%, and 5.9% in Ac, Se, and Sp, respectively. Deep learning models performed best, with an average Ac of 0.76, Se of 0.75, and Sp of 0.77 across the 8 time periods, outperforming the baseline partial least squares discriminant analysis models, which averaged an Ac of 0.60, Se of 0.52, and Sp of 0.68.
These results highlight the opportunity to use milk FTIR spectra and cow-level variables to forecast health conditions, thereby enabling timely management interventions and improving the overall efficiency of modern dairy farms.
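As an illustration of the evaluation scheme described above (multiple downsampling combined with 10-fold cross-validation, scored by Ac, Se, and Sp), the following is a minimal sketch and not the authors' implementation. The single synthetic feature, the midpoint-threshold classifier, the number of downsampling repeats, and all function names are assumptions made for this example; only the class sizes (825 healthy, 289 diseased) and the three metrics come from the abstract.

```python
import random
from statistics import mean


def confusion_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels (1 = diseased)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    ac = (tp + tn) / len(y_true)
    se = tp / (tp + fn) if tp + fn else float("nan")
    sp = tn / (tn + fp) if tn + fp else float("nan")
    return ac, se, sp


def downsample(y, rng):
    """Return indices in which the majority class is randomly cut to the minority size."""
    pos = [i for i, v in enumerate(y) if v == 1]
    neg = [i for i, v in enumerate(y) if v == 0]
    n = min(len(pos), len(neg))
    return rng.sample(pos, n) + rng.sample(neg, n)


def stratified_folds(indices, y, k, rng):
    """Split indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i in indices if y[i] == label]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


def fit_threshold(x, y, train):
    """Placeholder classifier (assumption): midpoint between the two class means."""
    m1 = mean(x[i] for i in train if y[i] == 1)
    m0 = mean(x[i] for i in train if y[i] == 0)
    return (m0 + m1) / 2, m1 >= m0


def evaluate(x, y, n_repeats=5, k=10, seed=0):
    """Average Ac, Se, Sp over n_repeats downsampled sets, each with k-fold CV."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        balanced = downsample(y, rng)
        folds = stratified_folds(balanced, y, k, rng)
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            cut, pos_high = fit_threshold(x, y, train)
            pred = [int((x[i] >= cut) == pos_high) for i in test]
            scores.append(confusion_metrics([y[i] for i in test], pred))
    return tuple(mean(s[j] for s in scores) for j in range(3))


# Synthetic stand-in data: one hypothetical feature per cow, not real spectra.
data_rng = random.Random(42)
y = [0] * 825 + [1] * 289
x = [data_rng.gauss(2.0 if label else 0.0, 1.0) for label in y]

ac, se, sp = evaluate(x, y)
print(f"Ac={ac:.2f}  Se={se:.2f}  Sp={sp:.2f}")
```

Downsampling the healthy class before splitting gives each fold balanced classes, so Se and Sp are weighted equally in Ac. Repeating the downsampling with different random draws reduces the dependence of the averages on any single subset of healthy cows.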

Key words: Fourier-transform infrared spectroscopy, machine learning, deep learning