AI-Driven Adaptive Data Cleansing: Automating Error Detection and Correction for Dynamic Datasets

© 2024 by IJCTT Journal
Volume-72 Issue-11
Year of Publication : 2024
Authors : Sandip J. Gami, Rajesh Remala, Krishnamurty Raju Mudunuru
DOI : 10.14445/22312803/IJCTT-V72I11P117

How to Cite?

Sandip J. Gami, Rajesh Remala, Krishnamurty Raju Mudunuru, "AI-Driven Adaptive Data Cleansing: Automating Error Detection and Correction for Dynamic Datasets," International Journal of Computer Trends and Technology, vol. 72, no. 11, pp. 159-164, 2024. Crossref, https://doi.org/10.14445/22312803/IJCTT-V72I11P117

Abstract
This study presents an adaptive data cleansing framework powered by artificial intelligence (AI) to address the challenges of maintaining data quality in dynamic, large-scale datasets. Traditional data cleansing methods handle real-time inconsistencies and evolving data patterns poorly, which makes them ill-suited to modern applications. The proposed AI-driven algorithms detect and correct errors, including missing values, anomalies, and inconsistencies, with minimal human intervention. By combining machine learning, pattern recognition, and statistical techniques, the framework continuously adapts to changes in the data to maintain high accuracy and integrity. The novelty of the work lies in integrating AI to automate data quality management: the framework dynamically refines its cleansing strategies based on incoming data and, according to the study, outperforms static rule-based systems. The study reports that these adaptive algorithms reduce operational costs, enhance scalability, and improve decision-making across various industries, positioning them as a significant contribution to AI, data quality, and automation.
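The abstract does not spell out the algorithms, so the following is only a minimal illustrative sketch of the general idea of adaptive detection and correction, not the authors' method. It assumes a single numeric field arriving as a stream, uses a running mean and standard deviation (Welford's online update) as the "model" that adapts to incoming data, flags values by z-score, and substitutes the running mean for missing or anomalous values. The class name `AdaptiveCleanser`, the helper `clean_stream`, and the parameters `z_threshold` and `warmup` are all hypothetical.

```python
import math
from typing import Iterable, List, Optional, Tuple


class AdaptiveCleanser:
    """Hypothetical streaming cleanser for one numeric field (illustration only)."""

    def __init__(self, z_threshold: float = 3.0, warmup: int = 5) -> None:
        self.z_threshold = z_threshold  # assumed cut-off for flagging anomalies
        self.warmup = warmup            # observations required before flagging
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0                  # running sum of squared deviations

    def _update(self, value: float) -> None:
        # Welford's online update of the running mean and variance.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def _std(self) -> float:
        # Sample standard deviation of the values accepted so far.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

    def clean(self, value: Optional[float]) -> Tuple[Optional[float], str]:
        """Return (cleaned_value, status) for one incoming observation."""
        if value is None or math.isnan(value):
            if self.count == 0:
                return None, "missing"      # nothing learned yet to impute from
            return self.mean, "imputed"     # fill the gap with the running mean
        std = self._std()
        if self.count >= self.warmup and std > 0:
            if abs(value - self.mean) / std > self.z_threshold:
                # Anomalous value: replace it and keep it out of the statistics.
                return self.mean, "corrected"
        self._update(value)                 # accepted values refine the model
        return value, "ok"


def clean_stream(values: Iterable[Optional[float]]) -> List[Tuple[Optional[float], str]]:
    """Run a fresh cleanser over a sequence and collect (value, status) pairs."""
    cleanser = AdaptiveCleanser()
    return [cleanser.clean(v) for v in values]
```

Calling `clean_stream([10.0, 10.2, 9.8, 10.1, 9.9, None, 55.0, 10.0])` would pass the first five readings through, impute the missing one, and replace the outlier with the running mean. Excluding flagged values from the statistics keeps outliers from distorting the model, but it also means a genuine shift in the data would be rejected indefinitely; a framework that must follow evolving patterns would need some form of windowing or decay, which this sketch omits.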

Keywords
Adaptive data cleansing, AI-driven algorithms, Automated error detection, Data correction techniques, Dynamic data sets, Machine learning for data quality, Real-time data processing.
