Improve machine learning models by curating vision data. Find and remove redundancy and bias introduced by the data collection process to reduce overfitting and improve generalization.