I was delighted to chair Ark Group’s inaugural “Revolutionising the Data Warehouse and Business Analytics” conference this week.
The event featured participants from a diverse range of organisations, including Telstra, Australia Post, Asciano, Deloitte, PBT Group, Macquarie University, Uniting Care Health, University of Technology Sydney and the Australian Sports Commission. Their thought-provoking content provided the stimulus for a highly interactive forum with some great debate.
A wide range of topics and themes was touched on during the event; however, some concepts cropped up consistently throughout the various sessions. I’ve captured my own personal “Top 10 Takeaways” from the conference (not in any particular order):
1. Think about “Data Warehouse” as a set of services, not a single monolithic IT system. There needs to be a mind-shift away from the technology-centric view of a highly controlled Enterprise Data Warehouse.
Instead, think of the “warehouse” more as a set of organisational capabilities or services that facilitate information delivery and context-based analytics at point-of-need. The important aspect is not the technology, it’s the usage (and the action that is taken as a result).
A further enabler of this way of thinking is to adopt Agile methods within the data warehouse and analytic environment. Deliver to the requirements that achieve visible business value and accept that ongoing iteration and rework are simply a business-as-usual cost of doing effective analytic business.
2. Apply the principle of “Data First”. The organisation needs to understand the inventory of what data is being captured (or could be captured) in order to start exploring what that information could then be used for. Where possible and within reason, source all of the data that you can, even if you don’t have a specific business requirement in mind.
This will also help to facilitate the “second sight” required to be ready to meet future emerging business requirements. For example, Google have adopted a philosophy of “throw nothing away” (right down to each mouse-click event and change of font colour on a web page), on the basis that they may be able to derive additional insight in the future that has not even been envisioned yet.
As a starting point, conduct a comprehensive audit of the organisation’s data holdings, so that you understand the information content that is available and can start to align it with the business context(s) that it can be applied to.
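As a minimal sketch of what such an audit might look like in code, the snippet below inventories the tables and columns of a single SQLite database. This is purely illustrative: the table names are hypothetical, and a real audit would span many systems, not one database.

```python
import sqlite3

# Illustrative stand-in for one of the organisation's data holdings.
# The schema below is hypothetical, invented purely for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, name TEXT, postcode TEXT);
CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
""")

def inventory(conn):
    """Return {table: [column, ...]} for every user table —
    the raw material for aligning data with business contexts."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    return {t: [col[1] for col in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables}

print(inventory(conn))
```

The output is a simple catalogue of what data exists, which can then be annotated with the business context(s) each data set supports.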
3. Ground everything you do in business requirements. Don’t start with the technology and then go looking for a problem to apply it to. Find the business requirement and then apply technology as a tool for enabling the solutions.
This also helps as a principle to guide your relationships with the technology vendors, who’ll happily try to sell you products as the solution to your requirements, even when you haven’t actually defined the problem yet!
(Note that this learning point is complementary to the “Data First” principle, not in conflict with it.)
4. Big Data: there are five “Vs”, not three. In addition to the now ubiquitous Big Data dimensions of “Volume”, “Variety” and “Velocity”, we should also be thinking about “Variability” and “Value”, and the most important considerations are these last two.
Complexity of the data landscape means that we’re not dealing with “structured” or “unstructured” information, we’re dealing with a diverse range of multi-structured data sets.
Meanwhile, if we don’t understand the context of the information we’re dealing with, or lack mechanisms to manage its impact on successful outcomes, then we need to question whether the activity is worthwhile.
Look for tangible value and ensure that you can explicitly identify and communicate sufficient value to justify the investment (accepting that there will be areas of benefit that may be less quantifiable).
5. Information Governance is crucial. There needs to be an explicit information value chain that identifies what the decision rights and business rules are, and who is accountable.
There needs to be formal and visible measurement, with control points that then hold people to account. Successful organisations are investing in data quality as a human issue with process and cultural implications, not just technical ones.
6. The success of any data warehouse initiative is directly related to the level of Change Management capability. Transformational impact is achieved when data warehouse and analytics are rolled out pervasively, paying full attention to the human aspects of the change that arises.
This has to be done as a conscious discipline by members of the team who are practised in Change Management techniques. Organisations will derive more benefit from a relatively small-scale solution that is well socialised and accepted as part of day-to-day business operations than from a technically excellent analytic warehouse solution with no engagement.
7. A measure of information’s potential value is its likelihood of having unintended consequences. Think about the impact that knowing a particular piece of information could have. What unintended consequences could it lead to if that information is made available to the wrong person or used in the wrong way? If there are far-reaching consequences, then the information probably has significant intrinsic value.
For example, knowing someone’s favourite colour probably has little impact on them, regardless of who that information is shared with; however, knowing their political allegiances could be potentially damaging if made known out of context.
8. There is still a lot of “biological ETL” going on. Human beings are still spending a lot of their time mashing up data in spreadsheets, re-keying data from one system to another, and collating multiple reports to derive subsequent calculations.
Every function that can be rationalised or automated frees up more time for people to think and act.
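To make the point concrete, here is a small sketch of replacing one piece of “biological ETL”: collating per-region figures from two reports that an analyst might otherwise re-key into a summary spreadsheet. The report contents and figures are entirely hypothetical.

```python
import csv
import io

# Two hypothetical monthly reports, standing in for the spreadsheets
# a person would otherwise mash up by hand.
REPORT_JAN = """region,sales
North,120
South,80
"""
REPORT_FEB = """region,sales
North,150
South,95
"""

def collate(*report_texts):
    """Sum sales per region across any number of CSV-formatted reports,
    automating the manual re-keying and collation step."""
    totals = {}
    for text in report_texts:
        for row in csv.DictReader(io.StringIO(text)):
            totals[row["region"]] = totals.get(row["region"], 0) + int(row["sales"])
    return totals

print(collate(REPORT_JAN, REPORT_FEB))  # {'North': 270, 'South': 175}
```

Once a collation like this is scripted, it runs identically every period, and the analyst’s time shifts from assembling the numbers to interpreting them.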
9. Engage. Visible, active and regular communication is crucial. Share ideas, collaborate, ask awkward questions and expect them to be answered.
Use intellectual curiosity and sceptical scrutiny as tools to drive better thinking.
10. The future is the Cloud and Open-source, and it’s already here. ICT infrastructure is moving to the Cloud at an accelerating rate and organisations should expect (indeed demand) their data warehouse and analytics solutions to operate on Cloud-based platforms.
Open-source solutions are changing the game to the point where it becomes questionable whether costly, proprietary technologies are required any more. NoSQL will become an increasingly pervasive element of the analytic environment.
We all need to adapt our thinking to maximise the benefits of transitioning to the Cloud and to manage the legislative and regulatory implications (e.g. with respect to Australian Privacy legislation), but the Cloud is not going to be negotiable.