
Assist. Ahmed Abdel Hakim Ahmed Al Huity :: Publications:

Title:
Comparative study of single imputation techniques for the prediction of missing dairy data
Authors: Ahmed A. Ahmed, Eman A. Manaa, Basant M. Shafik, Saqr A. Mustafa, Ahmed M. Gad
Year: 2025
Keywords: Imputation methods, Missing data, Record keeping, Expectation maximization, Power regression
Journal: Benha Veterinary Medical Journal
Volume: 48
Issue: 2
Pages: Not Available
Publisher: Benha Veterinary Medical Journal
Local/International: Local
Paper Link: Not Available
Full paper: Ahmed Abdel Hakim Ahmed Al Huity_first.pdf
Supplementary materials: Not Available
Abstract:

Dairy farm records are a crucial component of effective livestock business management, and analyzing them allows a farm’s owner to make informed decisions. Incomplete records are less useful for data analysis, so it is important to handle missing values correctly. This study compares imputation methods for handling missing values in a dataset of 997 dairy records collected from 234 cows between 2012 and 2022. The dataset was screened for records with missing values, which were then deleted, leaving 858 complete observations from 200 animals. Missing values occurred in two variables, days in milk (DIM) and total milk yield (TOTM), with a missing percentage of 13.9%. Known DIM and TOTM values were then randomly excluded from the complete cases at the same percentages of missing data as in the original dataset. Five imputation techniques were compared to identify the best one: mean imputation, median imputation, power regression imputation, multiple regression imputation, and the expectation-maximization (EM) method. The results showed that EM was the best imputation method for the data under study. It had the lowest mean absolute deviation (MAD, 37.54), the lowest mean square error (MSE, 15425.07), the highest Spearman’s correlation coefficient (0.967), and the second-lowest mean absolute percentage error (MAPE, 5.27) for predicting the missing values in the two affected variables. Power regression imputation ranked second: it performed better than the remaining imputation methods but worse than EM.
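
The evaluation procedure described in the abstract (mask known values at random, impute them, then score the imputations against the withheld truth) can be illustrated with a minimal Python sketch. It is not the authors' code. The data are synthetic stand-ins, the helper names (`mask_at_random`, `fit_power`, `metrics`) are hypothetical, MAD is assumed to mean the mean absolute difference between imputed and true values, and the power regression is assumed to predict TOTM from DIM as TOTM = a * DIM^b. Only mean, median, and power regression imputation are shown; multiple regression and EM are omitted for brevity.

```python
import math
import random
import statistics

def mask_at_random(n, fraction, rng):
    """Pick round(n * fraction) distinct row indices to treat as missing."""
    return sorted(rng.sample(range(n), round(n * fraction)))

def fit_power(x, y):
    """Least-squares fit of y = a * x**b on the log-log scale; returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = statistics.fmean(lx), statistics.fmean(ly)
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)
    return math.exp(my - b * mx), b

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(a, b):
    """Pearson correlation; NaN when either input has zero variance."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return cov / den if den else float("nan")

def metrics(true, pred):
    """MAD (assumed: mean absolute error), MSE, MAPE in percent, Spearman's rho."""
    n = len(true)
    abs_err = [abs(p - t) for t, p in zip(true, pred)]
    return {
        "MAD": sum(abs_err) / n,
        "MSE": sum(e * e for e in abs_err) / n,
        "MAPE": 100 * sum(e / abs(t) for e, t in zip(abs_err, true)) / n,
        "Spearman": pearson(ranks(true), ranks(pred)),
    }

# Synthetic stand-in for the 858 complete records (NOT the study's data).
rng = random.Random(0)
dim = [rng.uniform(150, 400) for _ in range(858)]
totm = [20 * d ** 0.95 * math.exp(rng.gauss(0, 0.05)) for d in dim]

# Withhold 13.9% of the known TOTM values, mirroring the original missing rate.
missing = mask_at_random(len(totm), 0.139, rng)
missing_set = set(missing)
kept = [i for i in range(len(totm)) if i not in missing_set]
true = [totm[i] for i in missing]
observed = [totm[i] for i in kept]

# Impute the withheld values with each method and score against the truth.
a, b = fit_power([dim[i] for i in kept], observed)
results = {
    "mean": metrics(true, [statistics.fmean(observed)] * len(true)),
    "median": metrics(true, [statistics.median(observed)] * len(true)),
    "power": metrics(true, [a * dim[i] ** b for i in missing]),
}

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this sketch, Spearman's coefficient is computed only over the withheld values, so it is undefined (NaN) for mean and median imputation, which fill every gap with the same constant. How the study computed the coefficient for those methods is not stated in the abstract.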
