According to a 2021 Gartner report, poor data quality costs organizations an average of $13 million a year. In addition to this negative revenue impact, data quality issues increase the complexity of data ecosystems and lead to poor decision-making.
That’s why companies must overcome data quality issues if they hope to deliver data-driven customer experiences and use their data for competitive advantage. With that in mind, the following are some of the primary elements that must be addressed to ensure good-quality data.
To engender greater trust in their data, companies must start with the collection process. It’s important to determine who provided the data, where it came from, and why and how it was collected. Diversifying data sources is also critical, as it helps ensure that data isn’t biased by any one individual’s preference or opinion. Tracking the chain of custody after the data is collected is another key step to ensure that quality isn’t negatively impacted after the fact. Finally, because data often changes over time, it’s important to maintain ongoing quality control processes to spot any problems and update the algorithms as needed.
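As a rough illustration of the collection and quality control points above, the following Python sketch records basic provenance for a data set and runs a simple recurring check for missing values and duplicate rows. The names used here (Provenance, quality_report, the field names, and the sample rows) are hypothetical and chosen only for this example; a real pipeline would likely check far more than these two conditions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Provenance:
    """Who provided a data set, where it came from, and why and how it was collected."""
    provider: str
    source: str
    purpose: str
    method: str
    collected_on: date


def quality_report(rows, required_fields):
    """Count rows with missing required fields and rows that exactly repeat an earlier row."""
    missing = 0
    duplicates = 0
    seen = set()
    for row in rows:
        # A required field counts as missing if it is absent, None, or an empty string.
        if any(row.get(field) in (None, "") for field in required_fields):
            missing += 1
        # Build a hashable fingerprint of the row to detect exact repeats.
        fingerprint = tuple(sorted((key, repr(value)) for key, value in row.items()))
        if fingerprint in seen:
            duplicates += 1
        else:
            seen.add(fingerprint)
    return {"rows": len(rows), "missing_required": missing, "duplicates": duplicates}


# Hypothetical sample data, for illustration only.
provenance = Provenance(
    provider="Example survey team",
    source="customer feedback form",
    purpose="satisfaction tracking",
    method="web form",
    collected_on=date(2022, 1, 15),
)
rows = [
    {"id": "1", "email": "a@example.com"},
    {"id": "2", "email": ""},
    {"id": "1", "email": "a@example.com"},
]
report = quality_report(rows, ["id", "email"])
print(provenance.provider, report)
```

Running a check like this on a schedule, and keeping the provenance record alongside the data, is one simple way to notice when quality drifts after collection.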
Biased data is a fact of life in any large data set. The lion’s share of attention is typically focused on bias that occurs during analytics, but bias can also be introduced at earlier stages in the data pipeline. The best approach to mitigating this bias is to identify it and work to correct it, for example by making small changes to sampling and representation. Another technique involves separating the people building the models from the fairness committee, as well as ensuring that developers cannot see any sensitive attributes so that they don’t accidentally use that data in their models. Just as organizations need continuous quality checks, they also must monitor for bias on an ongoing basis to ensure that data is not adversely impacted.
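Two of the ideas above, checking representation in a sample and keeping sensitive attributes away from model developers, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete fairness process: the function names, the "region" and "gender" fields, and the reference shares are all hypothetical.

```python
from collections import Counter


def representation_gaps(records, attribute, reference_shares):
    """Return observed share minus expected share for each group in reference_shares."""
    if not records:
        raise ValueError("records must not be empty")
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


def without_sensitive(records, sensitive_fields):
    """Return copies of the records with the sensitive fields removed."""
    return [
        {key: value for key, value in record.items() if key not in sensitive_fields}
        for record in records
    ]


# Hypothetical sample data, for illustration only.
records = [
    {"region": "north", "gender": "f", "spend": 120},
    {"region": "north", "gender": "m", "spend": 80},
    {"region": "north", "gender": "f", "spend": 95},
    {"region": "south", "gender": "m", "spend": 60},
]

# Positive values mean a group is over-represented relative to the reference.
gaps = representation_gaps(records, "region", {"north": 0.5, "south": 0.5})

# The view handed to model developers omits the sensitive attribute.
developer_view = without_sensitive(records, {"gender"})
print(gaps, developer_view[0])
```

Removing a column does not remove other fields that correlate with it, so a check like this complements ongoing bias monitoring rather than replacing it.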
The Move to Data Fabrics
Given the critical importance of data quality, many organizations are implementing data fabrics to manage data collection, governance, and integration, and optimize their data’s potential. According to one report, data fabrics can reduce data management efforts by up to 70%. Additional benefits include enhanced operational efficiency, higher customer retention rates, and faster time to market.
For more on what you should do to get a handle on data quality concerns, check out this recent TechTarget article.