Is Your Data Deceiving You?
It takes a lot for a “tech news” article to grab my attention these days, but I was astounded to read that an estimated 83 million Facebook accounts are either fakes or duplicates. That’s one in every twelve accounts—a sobering thought for those who use Facebook to connect with new contacts, prospects, and acquaintances.
It’s easy to point the finger at companies like Facebook. You can imagine media commentators lambasting the social media group: “How on earth could they have so much incorrect data?” It’s a valid question, but these figures do need to be put in context.
Let me ask you this question: If you work for a corporate organization, how “clean,” accurate, and up to date is your data? How much would you estimate is outdated, incomplete, or just plain wrong? Research by IBM shows the average is close to 23 percent. This is mind-boggling! Are you better or worse than average and, more significantly, could your data be so wrong it is deceiving you?
Bandying about statistics is a fool’s game. The smart question is “How on earth have so many organizations allowed their data to become so unclean?” A smarter question is “How can they prevent this from happening in the future?” Of course, the specific answer will vary by organization, but the key is to undertake good-quality analysis whenever processes or IT systems are implemented or changed.
If you want to increase your data quality, consider these four areas:
- Understand processes: Generally, data only exists because a business process creates it. As the old saying goes, “garbage in, garbage out.” Often, the best sustainable way to ensure that new data is correct is to understand and fix the processes that create it.
- Define the right rules: Generally, there is a set of business rules—either implicit or explicit—that determines what the value(s) of specific data items can be or whether a particular data item can be recorded at all. For example, “An account can only be saved if an email address is provided.” Creating the right rules—those that reflect the policies of the organization—will ensure cleaner data.
- Understand data: Defining and understanding the data itself is important. What does the data actually mean in business terms? Does “sales price” mean before or after sales tax? Subtle differences can be significant, and it’s important that users and staff have a common understanding.
- Watch out for interfaces: Whenever data is extracted, transformed, and loaded into a different system, there is the risk of errors. Ensure that interfaces are fully defined and tested.
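To make the second point concrete, here is a minimal sketch of how the example rule, “An account can only be saved if an email address is provided,” might be written down explicitly and enforced at the point where data is created. Everything in it is hypothetical and chosen purely for illustration: the `Account` and `Rule` types, the `has_email`, `save`, and `share_failing` functions, and the plain list standing in for a real data store. A real system would take its rules from the organization's own policies.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Account:
    # Hypothetical record; a real account would carry many more fields.
    name: str
    email: Optional[str] = None


@dataclass
class Rule:
    # A business rule: a plain-language description plus a check.
    description: str
    check: Callable[[Account], bool]


def has_email(account: Account) -> bool:
    # Treat a missing, empty, or whitespace-only email as "not provided".
    return bool(account.email and account.email.strip())


# The explicit rule set. Only the example rule from the text is listed.
RULES: List[Rule] = [
    Rule("An account can only be saved if an email address is provided",
         has_email),
]


def violations(account: Account) -> List[str]:
    # Return the description of every rule the account breaks.
    return [rule.description for rule in RULES if not rule.check(account)]


def save(account: Account, store: List[Account]) -> bool:
    # Refuse to store an account that breaks any rule.
    if violations(account):
        return False
    store.append(account)
    return True


def share_failing(accounts: List[Account]) -> float:
    # Fraction of existing records that break at least one rule.
    if not accounts:
        return 0.0
    failing = sum(1 for account in accounts if violations(account))
    return failing / len(accounts)
```

The same rule set serves two purposes here: `save` keeps new bad records out, while `share_failing` gives a rough estimate of how much existing data already breaks the rules.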
Data is a main artery of organizations, and it’s important that it’s both accurate and actionable. If your organization doesn’t know the quality of its data, it’s time to find out!