Data Versioning and Its Impact on Machine Learning Models

Authors

  • Thirupurasundari Chandrasekaran, Sr. Project Manager, Phoenix, AZ, USA
  • Sreenivasulu Ramisetty, Data Architect, Conduent Services Inc, Georgia, USA
  • Vamsi Krishna Eruvaram, Sr. Data Engineer, Lowe's, USA
  • Mohan Raja Pulicharla, Data Engineer, Maryland, USA

DOI:

https://doi.org/10.55662/JST.2024.5101

Keywords:

Machine Learning Models, Data Versioning, ML pipeline

Abstract

Data versioning is central to machine learning because it supports the reproducibility, transparency, and reliability of ML models. ML models depend heavily on diverse datasets that change over time, so data versioning helps keep the ML pipeline consistent. By tracking changes to datasets and tying each model to the specific version of the data it was trained on, researchers can reproduce experiments, verify results, and address challenges in data quality, collaboration, and model training. Effective data versioning practices make ML workflows more robust, foster trust in model outcomes, and support advancements in the field.
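
As an illustration of tying a model to a specific version of its data, the following is a minimal Python sketch and not the authors' method. It assumes a small in-memory dataset of dictionaries and derives a version identifier from a SHA-256 hash of the content. The names `dataset_version`, `train_with_provenance`, and `registry`, and the toy "model" (a mean of labels), are hypothetical and chosen only for this example.

```python
import copy
import hashlib
import json


def dataset_version(records):
    """Return a short content hash; it changes whenever the data changes."""
    # Canonical JSON (sorted keys, fixed separators) so equal data hashes equally.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def train_with_provenance(records, registry):
    """Train a toy model and record which data version produced it.

    `registry` is a hypothetical dict mapping version id -> data snapshot.
    """
    version = dataset_version(records)
    # Store a deep copy so later edits to `records` cannot alter the snapshot.
    registry.setdefault(version, copy.deepcopy(records))
    # Stand-in for real training: the "model" is just the mean label.
    mean_label = sum(r["label"] for r in records) / len(records)
    return {"model": {"mean_label": mean_label}, "data_version": version}


if __name__ == "__main__":
    registry = {}
    data = [{"x": 1.0, "label": 0}, {"x": 2.0, "label": 1}]
    run_a = train_with_provenance(data, registry)

    data.append({"x": 3.0, "label": 1})  # the dataset changes over time
    run_b = train_with_provenance(data, registry)

    # Each run points at the exact snapshot it was trained on.
    print(run_a["data_version"] != run_b["data_version"])
    print(len(registry[run_a["data_version"]]), len(registry[run_b["data_version"]]))
```

Because each run stores the version identifier alongside the model, an earlier experiment can in principle be rerun against the exact snapshot it used, even after the working dataset has changed. Production tools typically hash files rather than in-memory records, but the idea is the same.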

Published

29-01-2024

How to Cite

“Data Versioning and Its Impact on Machine Learning Models”. Journal of Science & Technology, vol. 5, no. 1, Jan. 2024, pp. 22-37, https://doi.org/10.55662/JST.2024.5101.