Monitoring the Performance of Machine Learning Models in Production
© 2022 by IJCTT Journal
Volume-70 Issue-9
Year of Publication : 2022
Authors : Satyanarayan Raju Vadapalli
DOI : 10.14445/22312803/IJCTT-V70I9P105
How to Cite?
Satyanarayan Raju Vadapalli, "Monitoring the Performance of Machine Learning Models in Production," International Journal of Computer Trends and Technology, vol. 70, no. 9, pp. 38-42, 2022. Crossref, https://doi.org/10.14445/22312803/IJCTT-V70I9P105
Abstract
Machine learning (ML) models have become vital decision-making components for many businesses in the last decade. However, the performance of ML models degrades over time as economic or environmental conditions change, which can lead to suboptimal decisions. With organizations having tens or even hundreds of ML models deployed in production, it is important to ensure that the models continue to perform the way they were trained to perform. Additionally, models need to be retrained every few weeks or months to adapt to the evolving environment that affects their performance. In this article, we discuss an approach that can be used to proactively identify issues with model output and inform developers and data scientists when it is time to retrain a model. Given the importance of input data quality to model performance, a significant part of our approach focuses on identifying data quality issues and taking proactive measures to mitigate the associated risks. This monitoring approach is currently deployed on multiple production models, where it generates automated alerts on model or data drift, enabling data scientists to take corrective action.
Keywords
Drift detection, Machine learning, MLOps, Monitoring, Observability.