
Sunday, 19 May 2013

Damn those definitions!

Communication is the key to establishing a common lexicon.

I recently responded to an enquiry on LinkedIn relating to the problem of achieving consistent data definitions.


While the online discussion was very worthwhile and raised several important points, I thought I’d take the opportunity to elaborate a bit further on the topic of establishing a common organisational lexicon. The issue of common data definitions (or more accurately, the lack of them) is one that’s currently attracting a lot of attention here at UNSW, and it is by no means the first time that I’ve been through this process.

In all the situations where I’ve encountered this “common definitions” problem, there are three characteristics that are always evident:
  1. Each party thinks their definition is the “correct” one, and it’s everyone else who’s got the problem.
  2. There is never “no definition” for a data item. There are always too many definitions (which are incomplete, contradictory, poorly documented, uncommunicated, etc.).
  3. The definition(s) in use don’t actually correlate with the data set that is available, leading to interpolation, extrapolation, assumption and caveat.

In the simplest terms, this all boils down to a communication gap. People just don’t talk to each other. And when they do, it usually ends up being driven by an initial anomaly being identified, which quickly leads to outright disagreement between factions, which invariably then results in battle lines being drawn up.

The role of the Data Governance agent in all this is to mediate, facilitate and communicate – proactively and before any trouble arises if possible. (In this case, the “DG agent” is whoever has picked up the task of resolving the definitional problem. It doesn’t need to be someone who is in a DG “job”; in many cases it will be a project analyst, ETL programmer, Business Intelligence developer or other solutions person who has inherited the issue by default because they need to resolve it in order to deliver their part of the solution.)

Whatever the scenario, the agent will need to get the interested parties together and find out whether they're actually talking about the same thing or not. You'll get to one of two states:
  1. They may be using different contextual language to refer to the same root content - in which case you've only got one definition (It’s then ok if they want to use their own terminologies within their own context; that's just maintaining a lexicon of synonyms);
  2. They're using the same language to talk about two different things that don't agree - in which case you need to give each a specific and separate definition that also encapsulates the context.
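For the more technically minded, here is a minimal Python sketch of how those two states might be recorded in a simple lexicon. Everything in it is my own illustration rather than any real tool: the `Lexicon` class, the terms ("customer", "client", "student"), the contexts ("reporting", "timetabling") and the definition wording are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Lexicon:
    """Toy lexicon (hypothetical): definitions are keyed by (term, context)."""
    definitions: dict = field(default_factory=dict)  # (term, context) -> definition text
    synonyms: dict = field(default_factory=dict)     # local term -> canonical term

    def define(self, term, text, context=None):
        # context=None means the definition applies organisation-wide
        self.definitions[(term, context)] = text

    def add_synonym(self, local_term, canonical_term):
        self.synonyms[local_term] = canonical_term

    def lookup(self, term, context=None):
        # Resolve any synonym first, then prefer a context-specific
        # definition and fall back to the organisation-wide one.
        canonical = self.synonyms.get(term, term)
        for key in ((canonical, context), (canonical, None)):
            if key in self.definitions:
                return self.definitions[key]
        raise KeyError(f"no definition for {term!r} in context {context!r}")


lexicon = Lexicon()

# State 1: different words, same root content -> one definition plus a synonym.
lexicon.define("customer", "A party holding at least one active account.")
lexicon.add_synonym("client", "customer")

# State 2: same word, different things -> one definition per context.
lexicon.define("student", "Anyone enrolled in a degree-award program.",
               context="reporting")
lexicon.define("student", "Anyone enrolled in any course, award or non-award.",
               context="timetabling")

print(lexicon.lookup("client"))                  # same text as "customer"
print(lexicon.lookup("student", "reporting"))
print(lexicon.lookup("student", "timetabling"))
```

The point of keying on the pair of term and context is that the second state never forces anyone to give up their word; it only forces them to say which meaning they are using.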

In the forum discussion noted above, I discussed the example of "Average Revenue Per User" (ARPU) from when I worked for a major UK-based retailer of mobile phones. For the purposes of this blog, I’ll offer a couple of other examples:

  1. Within an Australian Federal Agency that deals with bio-sciences, two researchers were having a significant disagreement about whether a particular organism was to be classified as a “pest” or a “parasite”. Because they both had their particular perspectives (one being from Biosecurity, the other being from Veterinary Science), they could not accept the other person’s definition. By bringing them together and highlighting that the bug in question could be classified in either manner, dependent on the context, the definitional contention was solved.
  2. Here at UNSW (and in common with every other university on the planet, I should imagine), the number of students we have within the institution is a pretty important measure! However, not all “students” are created equal, and the trick is to work out the wider context of why the question is being asked:

  • Are they full-time or part-time?
  • Are they enrolled in degree-award programs or non-award courses of study?
  • Are they studying for a single degree or dual award?
  • Are they a continuing student or a new enrolment?
  • Are they undergraduate or higher-degree students?
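To make that concrete, here is a rough Python sketch of how the same "how many students?" question gives different answers depending on the context applied. It is purely illustrative: the `Enrolment` record, its field names, the `count_students` helper and the four sample records are all made up for this post and do not reflect any real UNSW system or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enrolment:
    """Hypothetical enrolment record; one field per question in the list above."""
    student_id: str
    full_time: bool
    award_program: bool
    dual_award: bool
    new_enrolment: bool
    level: str  # "undergraduate" or "higher-degree"


def count_students(enrolments, **criteria):
    """Count distinct students whose enrolment matches every criterion given."""
    matching = {
        e.student_id
        for e in enrolments
        if all(getattr(e, name) == wanted for name, wanted in criteria.items())
    }
    return len(matching)


# Invented sample data, for illustration only.
records = [
    Enrolment("s1", True, True, False, True, "undergraduate"),
    Enrolment("s2", False, True, True, False, "undergraduate"),
    Enrolment("s3", True, False, False, True, "undergraduate"),
    Enrolment("s4", False, True, False, False, "higher-degree"),
]

print(count_students(records))                                         # 4
print(count_students(records, award_program=True))                     # 3
print(count_students(records, full_time=True, level="undergraduate"))  # 2
```

Three perfectly defensible "student numbers" from the same four records; none of them is wrong, they just answer different questions.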

These sorts of definitional conflicts occur all the time, and it can take quite a bit of analytical skill, facilitation and communication (as well as patience!) to reach clarity. Good luck!