Dirty Data: Costing Companies $600 Billion Each Year!

by Jackie Biallas

Every company has dirty data, at least to some degree.  But how serious is it?  Consider this: the Data Warehousing Institute (TDWI) estimates that dirty data costs companies over $600 billion each year!


Dirty data is any data in a database record that contains errors.  These errors include spelling discrepancies, duplicate records, incomplete or inaccurate data, and data that is inconsistent across records.


The information contained in the database is used to analyze a business.  If the data is not clean and contains inaccuracies or duplications, the analysis will be flawed.  If you are not getting correct insights from your data, the decisions you make for your business will be flawed.  This will greatly affect your bottom line!  In fact, studies have shown that dirty data costs companies an average of 12% of their revenue each year!


Businesses can take several steps to keep their data clean:

  • Ensure that all employees understand the importance of clean data and its uses. Data integrity may be compromised when new products are set up or when a sale occurs.  When data entry is complete and accurate, the business can glean insights from the data, such as which product features sell best and where customers are located.
  • Dedicate resources to maintaining data integrity. Data cleansing may take some time, depending on the current condition of the database.  Businesses may also want to consider having an outside firm help them clean their data and advise them on practices that could help them get the most benefit from their data.
  • Set up alerts to notify users when data entry looks suspect. For example, a user could be notified if a field is left blank, or if an entry is spelled differently from entries already in the system, like “SM” instead of “Small” (the SM entry would cause that item to be excluded from analysis of the “Small” group).
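
As a rough illustration of the alert idea above, the following Python sketch flags blank fields and entries that do not match values already in the system.  The field names, the list of accepted sizes, and the function name check_record are all hypothetical, chosen only for this example; a real system would draw them from its own product catalog.

```python
# Hypothetical accepted values for a "size" field. A real system would
# load these from its own product catalog rather than hard-code them.
CANONICAL_VALUES = {"size": {"Small", "Medium", "Large"}}

# Hypothetical list of fields that must not be left blank.
REQUIRED_FIELDS = ("sku", "size", "color")


def check_record(record):
    """Return a list of warning messages for one data-entry record."""
    warnings = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Flag fields that are missing or contain only whitespace.
        if value is None or not str(value).strip():
            warnings.append(f"{field}: left blank")
    for field, allowed in CANONICAL_VALUES.items():
        value = str(record.get(field) or "").strip()
        # Flag non-blank entries that differ from the values already in use.
        if value and value not in allowed:
            warnings.append(
                f"{field}: '{value}' is not one of {sorted(allowed)}"
            )
    return warnings


if __name__ == "__main__":
    entry = {"sku": "TS-1001", "size": "SM", "color": ""}
    for message in check_record(entry):
        print("Suspect entry -", message)
```

Here the “SM” entry and the empty color field would each produce a warning, so the user can correct them before the record is saved.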


Clean data is the foundation for good sales analysis and forecasting – and the benefits are far-reaching.  A few of the benefits of clean data include:

  • The ability to analyze sales based on different selling features of products. With complete and accurate data, businesses can determine the best-selling colors, sizes, or materials and determine what new products to sell.
  • The ability to segment marketing campaigns based on customer and product attributes. For example, one product feature may sell well in the Southwest while another feature only sells in the Northeast.
  • The ability to maintain accurate sales and inventory levels. With no duplicate SKUs, businesses can forecast sales properly and avoid carrying excess inventory, which contributes to profitability.
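
To make the duplicate-SKU point concrete, here is a minimal Python sketch that groups product rows whose SKUs match once differences in case and surrounding spaces are ignored.  The sample rows, the "sku" and "on_hand" field names, and the function name find_duplicate_skus are assumptions made for illustration, not part of any particular inventory system.

```python
from collections import defaultdict


def find_duplicate_skus(products):
    """Group product rows whose SKUs match after normalization."""
    groups = defaultdict(list)
    for row in products:
        # Treat "ts-1001 " and "TS-1001" as the same SKU.
        key = row["sku"].strip().upper()
        groups[key].append(row)
    # Keep only the SKUs that appear on more than one row.
    return {sku: rows for sku, rows in groups.items() if len(rows) > 1}


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    products = [
        {"sku": "TS-1001", "on_hand": 40},
        {"sku": "ts-1001 ", "on_hand": 15},
        {"sku": "TS-2002", "on_hand": 8},
    ]
    for sku, rows in find_duplicate_skus(products).items():
        print(sku, "appears on", len(rows), "rows")
```

In this sample, inventory for one product is split across two rows, so a report that reads either row alone would understate what is on hand.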

Quality analysis can only be completed if the data that is being used is complete, accurate and clean.  Otherwise, it’s “garbage in, garbage out.”  While maintaining clean data does take diligence and effort, the benefits realized can add significantly to the bottom line.

For more information about the importance of clean data, contact SAFIO Solutions.  We’d be happy to answer your questions about clean data or help you clean and organize your data so you can reap the benefits of great analysis and forecasting!