Scope
This document provides a data quality model, data quality measures, and guidance on reporting data quality in the context of analytics and machine learning (ML). It builds on the ISO 8000 series, ISO/IEC 25012, and ISO/IEC 25024.
The aim of this document is to enable organizations to achieve their data quality objectives. It is applicable to all types of organizations.
Purpose
Data quality is necessary for ML and big data systems to be safe, reliable, and interoperable. ISO/IEC 20547-3 states that “Data quality management is essential to big data systems, as poor data quality such as incomplete, false or outdated data can disable effective data mining processes, prevent useful findings or lead to wrong output”. Moreover, ISO/IEC TR 24028 addresses challenges of data quality in AI systems based on machine learning, including bias in the data used to train the AI system, data poisoning, and adversarial attacks. Organizations can use this document to select and implement the data quality measures that meet their requirements for data analytics and ML.
This document defines data quality characteristics for data used in analytics and ML, especially big data; data quality measures related to those characteristics; and guidelines for evaluating and reporting data quality in the data workflow of analytics and ML.