International Journal of Scientific Research and Engineering Development (International Peer Reviewed Open Access Journal), ISSN [Online]: 2581-7175

📑 Paper Information
| 📑 Paper Title | Automated Data Cleaning and Preprocessing System |
| 👤 Authors | Aman Rawat, Mr. Ritesh Kumar |
| 📘 Published Issue | Volume 8 Issue 6 |
| 📅 Year of Publication | 2025 |
| 🆔 Unique Identification Number | IJSRED-V8I6P105 |
📝 Abstract
Data preprocessing is a fundamental step in any machine learning pipeline, as the quality of input data directly influences the reliability and accuracy of predictive models. Real-world datasets often contain missing values, duplicate entries, inconsistent formats, outliers, and non-standardized features, making manual preprocessing time-consuming, error-prone, and difficult to reproduce. To address these challenges, this research presents an Automated Data Cleaning and Preprocessing System designed to streamline the transformation of raw data into structured, analysis-ready form with minimal user intervention. The system integrates automated detection of missingness, statistical and algorithmic imputation, outlier identification, categorical encoding, normalization, and comprehensive summary reporting through a modular pipeline architecture. Experiments conducted on diverse datasets from multiple domains demonstrate improvements in data consistency, distributional stability, and downstream machine learning performance. The results highlight that automated preprocessing not only reduces human effort but also ensures reproducibility, scalability, and enhanced model accuracy, making it suitable for academic research, industrial applications, and production-level data workflows.
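
The abstract names the pipeline stages but gives no implementation details. The following is a minimal sketch of how such stages could be chained, not the authors' system. It assumes small in-memory data held as a list of dictionaries, and that every column has at least one non-missing value. The function `clean`, its parameters, the sample columns (`age`, `income`, `city`), and the specific techniques (median and mode imputation, the 1.5 × IQR rule, one-hot encoding, min-max scaling) are all illustrative choices that do not come from the paper.

```python
from statistics import median, mode, quantiles


def clean(rows, numeric, categorical):
    """Hypothetical pipeline: dedupe, impute, flag outliers, encode, scale."""
    report = {"missing": {}, "outliers": {}}

    # 1. Drop exact duplicate rows, keeping the first occurrence.
    seen, data = set(), []
    for row in rows:
        key = tuple(row.get(c) for c in numeric + categorical)
        if key not in seen:
            seen.add(key)
            data.append(dict(row))
    report["duplicates_removed"] = len(rows) - len(data)

    # 2. Count missing values, then impute
    #    (assumed strategy: median for numeric, mode for categorical).
    for col in numeric + categorical:
        present = [r[col] for r in data if r.get(col) is not None]
        report["missing"][col] = len(data) - len(present)
        fill = median(present) if col in numeric else mode(present)
        for r in data:
            if r.get(col) is None:
                r[col] = fill

    # 3. Flag (but do not remove) outliers with the 1.5 * IQR rule.
    for col in numeric:
        q1, _, q3 = quantiles([r[col] for r in data], n=4, method="inclusive")
        lo, hi = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
        report["outliers"][col] = sum(not lo <= r[col] <= hi for r in data)

    # 4. One-hot encode each categorical column as "<col>=<category>" flags.
    for col in categorical:
        categories = sorted({r[col] for r in data})
        for r in data:
            value = r.pop(col)
            for cat in categories:
                r[f"{col}={cat}"] = int(value == cat)

    # 5. Min-max scale numeric columns to [0, 1]; constant columns become 0.0.
    for col in numeric:
        values = [r[col] for r in data]
        lo, span = min(values), max(values) - min(values)
        for r in data:
            r[col] = (r[col] - lo) / span if span else 0.0

    return data, report


# Made-up sample data, for illustration only.
rows = [
    {"age": 27, "income": 40000, "city": "Delhi"},
    {"age": 27, "income": 40000, "city": "Delhi"},   # exact duplicate
    {"age": None, "income": 52000, "city": "Pune"},  # missing numeric value
    {"age": 31, "income": 48000, "city": None},      # missing category
    {"age": 29, "income": 45000, "city": "Delhi"},
    {"age": 32, "income": 900000, "city": "Delhi"},  # extreme income
]
cleaned, report = clean(rows, numeric=["age", "income"], categorical=["city"])
print(report)
```

Because this sketch only flags outliers, the extreme income still sets the upper end of the min-max range and compresses the other values toward zero. A real pipeline would need to decide whether to clip, transform, or drop flagged values before scaling.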
