...and the last confession of a Turncoat.
Well, today I signed the paper that put me back into the ranks of the gainfully employed. There's a little bit of irony involved, as having taken Gartner to task a couple of weeks ago, I'm now going to go off and join them!
What? ADD as Turncoat Analyst? Gamekeeper turned poacher? Or is it the other way round?! I’d like to think of it more in terms of "If you can't beat them, join them." Or else, it's a unique opportunity to be a mole on the inside… Whatever you call it, I'm not about to temper my outspoken approach, that's for sure!
It’ll be interesting, however it pans out!
I'll admit that I'm really excited by the prospect of becoming a Research Director at Gartner. I'm hoping that this new role gives me a unique opportunity to further explore many of the issues that we experience within the Information Management and Analytics sector, and influence the way we think and act with data. It should also mean that I will be able to delve into a lot more detail than my blogging enables me to. Even my "discussion paper" series doesn't really provide the channel to go in-depth into issues and compile the supporting evidence that I would ideally wish.
The down-side is that I will probably have to curtail my self-published content as I start putting most of my material out through the Gartner channels. (Though there's always the chance that a particular "too hot for TV" moment will need to come out under my own auspices!) I'll also continue to Tweet on a regular basis.
Now, the first challenge is that there is still much disagreement about what constitutes “Big Data”. The original suggestion that it is “any data that can’t be processed by traditional methods” is hugely unhelpful, as would be any attempt to define anything as being “not another thing”. (Would we be comfortable in defining a “dog” as “not a cat”?)
In the past few years, the technology sector has generally settled upon defining “Big Data” based on identifying certain characteristics of the data set, with those characteristics all beginning with the letter ‘V’. Gartner analyst Doug Laney originally proposed three ‘V’ characteristics – Volume, Velocity, and Variety.
These three ‘V’s help to establish characteristics and bound the problem of what “Big Data” might look like from a technical perspective. The new breed of data tools certainly enables the engineering of new and innovative methods of processing data that were previously out of reach to all but the most well-funded of organisations.
However, there is no “so what?” factor that jumps out at us to make the problems of “Big Data” meaningful in a business context.
I therefore suggest that a shift
in thinking is necessary, to examine the “Big Data” challenge not
from a technical perspective, but from a business one. To maintain consistency
with the original model, these additional business considerations for “Big
Data” can also be expressed as ‘V’s – Variability,
Veracity and Value:
- Variability: Within any given data set, is the structure of that data regular and dependable, or is it subject to unpredictable change? If so, how can we understand the nature of the “unstructured” text data content (or sound, or video) and interpret it in a way that becomes meaningful for the required business analytic-ready output?
- Veracity: How do we know that the data is actually correct and fit for purpose? Can we test the data against a set of defined criteria that establish the degree of confidence and trustworthiness? What are the business rules that enable the data to be tested and profiled? If there are issues with the data, what actions can be taken to clean and correct the data before any analysis is carried out?
- Value: What
is the business purpose or outcome that we are trying to meet? What questions
are we seeking to answer, and what actions do we expect to take as a result?
What benefits do we expect to achieve from collecting and analysing the data?
Has the data been aligned with the desired outcome?
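To make the "Veracity" questions a little more concrete, here is a minimal sketch of what testing data against defined business rules might look like. Everything in it is an assumption for illustration: the sample records, the field names, the three rules and the `profile` and `failing_rows` helpers are all hypothetical, and a real set of rules would need to come from the business context.

```python
# Hypothetical sample records; the field names are illustrative only.
records = [
    {"customer_id": "C001", "age": 34, "email": "a@example.com"},
    {"customer_id": "C002", "age": -5, "email": "b@example.com"},
    {"customer_id": "", "age": 51, "email": "not-an-email"},
]

# Each (assumed) business rule is a name plus a pass/fail test on one record.
rules = {
    "customer_id present": lambda r: bool(r.get("customer_id")),
    "age between 0 and 120": lambda r: isinstance(r.get("age"), int)
    and 0 <= r["age"] <= 120,
    "email contains @": lambda r: "@" in str(r.get("email", "")),
}


def profile(records, rules):
    """Return the fraction of records that pass each rule."""
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in rules}
    return {
        name: sum(1 for r in records if check(r)) / total
        for name, check in rules.items()
    }


def failing_rows(records, rules):
    """Return the positions of records that fail at least one rule."""
    return [
        i
        for i, r in enumerate(records)
        if not all(check(r) for check in rules.values())
    ]


if __name__ == "__main__":
    for name, rate in profile(records, rules).items():
        print(f"{name}: {rate:.0%} of records pass")
    print("Records needing attention:", failing_rows(records, rules))
```

The pass rate per rule gives a rough measure of confidence in the data set, and the list of failing rows shows where cleaning would have to happen before any analysis is carried out.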
All three of these additional characteristics require a clear understanding of the business context, which is then used to frame the meaning and purpose of the data content. “Variability”, “Veracity” and “Value” all express different aspects of the fitness-for-purpose of the data sets in question, all of which need to be addressed in order to solve a business problem in business terms.
If expanding the "Big Data" lexicon to a "Six 'V's Model" becomes my first contribution as a Gartner Analyst, then it's probably not a bad place to start.