"An extraordinary thinker and strategist" "Great knowledge and a wealth of experience" "Informative and entertaining as always" "Captivating!" "Very relevant information" "10 out of 7 actually!" "In my over 20 years in the Analytics and Information Management space I believe Alan is the best and most complete practitioner I have worked with" "Surprisingly entertaining..." "Extremely eloquent, knowledgeable and great at joining the topics and themes between presentations" "Informative, dynamic and engaging" "I'd work with Alan even if I didn't enjoy it so much." "The quintessential information and data management practitioner – passionate, evangelistic, experienced, intelligent, and knowledgeable" "The best knowledgeable, enthusiastic and committed problem solver I have ever worked with" "His passion and depth of knowledge in Information Management Strategy and Governance is infectious" "Feed him your most critical strategic challenges. They are his breakfast." "A rare gem - a pleasure to work with."

Sunday, 13 January 2013

What’s what, where’s where and who’s who?

Developing an Information Asset Register for effective Data Governance.

(Note: this post was also published by Image & Data Manager Magazine.)

A significant aspect of effective Data Governance is about orchestrating the exchange of information between multiple parties; to facilitate (and arbitrate) a robust, repeatable approach to delivering content in context, in support of more effective and efficient business outcomes.

But how do you do this if you don’t know what data you’ve got, what state it’s in, or who is responsible for it?

For over 10 years now, I’ve been advocating the idea of maintaining an Information Asset Register, as part of an enterprise-wide approach to managing Information as an Asset. (This factsheet from the UK National Archives provides a useful working definition of the term.)

This approach goes beyond the systems and applications auditing process that takes place within IT departments. Rather, the Information Asset Register is about building up and maintaining a complete, reliable inventory of data holdings within the organisation, the different contexts within which the information is (or could be) used, and identification of the various interested parties  – if you will, it’s an index of “what’s what, where’s where and who’s who”.
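As a minimal sketch of what such an index might look like in practice, the Python below models a register entry with fields for what the asset is, where it lives and who is involved. Every class, field name and sample value here is a hypothetical illustration of my own, not something taken from the National Archives guidance or any standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationAsset:
    """One entry in a (hypothetical) Information Asset Register."""
    name: str                # what's what
    description: str
    location: str            # where's where (system, repository, file store)
    owner: str               # who's who: the accountable party
    steward: str             # who's who: day-to-day custodian
    consumers: List[str] = field(default_factory=list)
    usage_contexts: List[str] = field(default_factory=list)


class InformationAssetRegister:
    """A simple in-memory register keyed by asset name."""

    def __init__(self) -> None:
        self._assets: Dict[str, InformationAsset] = {}

    def add(self, asset: InformationAsset) -> None:
        self._assets[asset.name] = asset

    def assets_for(self, party: str) -> List[str]:
        """Names of assets where the party is owner, steward or consumer."""
        return sorted(
            a.name for a in self._assets.values()
            if party in (a.owner, a.steward) or party in a.consumers
        )

    def unowned(self) -> List[str]:
        """Names of assets with no accountable owner recorded."""
        return sorted(a.name for a in self._assets.values() if not a.owner)
```

Even a structure this small supports the conversations that matter: asking which assets a given party touches, and which assets nobody has yet claimed accountability for.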

The Register is then used as a tool for enabling more explicit and productive discussions about data between respective creator, collector and consumer parties. Crucially, by acting as a catalyst for discussion between information stakeholders, this approach encourages more collaboration across functional boundaries, establishes points of contact for more proactive information sharing and breaks down any existing protectionism within information silos (an approach that a Public Sector colleague of mine refers to as “POIM” – as in “P*ss Off, It’s Mine…”).

This is therefore not a one-off project: maintaining and publishing the Information Asset Register quickly becomes a key ongoing service provided by the Data Governance function. This requires some incremental level of investment in your Data Governance capability, if only to provide the resources and skills needed to broker and facilitate data-oriented discussion proactively. (Some organisations will require higher levels of investment if basic Information Management practices and capabilities are not yet in place.)

The approach isn’t widespread yet, but some progressive government organisations have been taking the lead (notably the UK National Archives, Australian Commonwealth National Archives and Queensland Government CIO Office). One challenge as I see it is that these organisations are taking an approach that is driven out of compliance and records-keeping requirements, rather than seeing Information Asset Management as a value-adding opportunity. I’d argue that if the drivers were based on business improvement and outcome benefits (rather than “meet the basic minimum”), organisations adopting an Information Asset Management approach would start to see real transformational change, almost by stealth. (See also my prior post on the concept of Information as a Service.)

Anyway, the Information Asset Register is certainly an approach that I’m adopting within my Data Governance role at UNSW – time will tell whether it proves to be successful!

Some specific online resources that you may find useful to help get you started:

Note also that, further to my blog post of earlier this month, I will be running a workshop on this topic at the Ark Group's Data Quality Asia Pacific Congress in March.

Wednesday, 9 January 2013

"What's your point, caller?!"

Think about Information Use Cases, not technology.

I really enjoyed Leena Rao's recent article on TechCrunch, "Why We Need to Kill "Big Data"", not least because it reflected pretty much how I've been feeling for a long time about the lazy, misleading use of jargon and never-ending hype cycle as the tech vendors jump upon the latest bandwagon. (Pick one, any one, because they're all at it. Yes, I'm looking at you, IBM, Oracle, Microsoft, Teradata, SAP...) 

Leena's perspective also provides a healthy counterpoint to the views currently being positioned by IDC (as reported in the Data Informed article "Now is the time to buy into Big Data"), which seem to offer a very tech-centric point of view and play right into the hype cycle without actually saying anything.

We've been here before - Decision Support, OLAP, Management Information Systems, Performance Measurement, Business Intelligence, Master Data Management - all are catch-all terms that actually had little if any real meaning in and of themselves. "Big Data" is just the latest entry in the Buzzword Bingo Lexicon.

In the twenty-odd years I've been involved in business solutions delivery and management consulting, it seems that *insert tool of choice* is promoted as the "next big thing to solve all your problems", without giving any thought whatsoever to what the problem, scenario or challenge might actually be. Technology is almost always the wrong entry point for the conversation, because it only becomes relevant when it is applied to solving a particular problem - and depending on the problem at hand, some technologies are more equal than others.

In the Information Management space, almost all problems relate to capturing, exchanging, disseminating, sharing, interpreting and acting upon one or more data sets. (Data Governance then sets out to ensure that these tasks can happen in a repeatable, consistent, efficient and effective manner.) Which got me thinking. Are there generic categories of "Information Use Case" that we can use to describe various business problem scenarios, so that we can then start to make more informed choices about what sorts of technologies might be appropriate?

Or to put it another way: "what's your point, caller?!"

Anyway, here are some information-related problem situations that I could identify (you might think of others, or take issue with some of mine. I'd be delighted to hear from you if you do, because it means you've been thinking about the business problem, and not about the technology!):

Note that there is likely to be interaction (or even significant overlap) between these classes of use case. Each individual class of Use Case should be considered a necessary, but not sufficient, element of the organisation's Data Governance and Information Management capability.

Data is of little value if all it does is sit in the data warehouse. As a result, the presentation layer is of very high importance.
Most On-Line Analytic Processing (OLAP) vendors have a front-end presentation layer that allows users to call up pre-defined reports or create ad hoc reports. The aim is to synthesise large quantities of raw data into meaningful views that can be acted upon in context.
As such, reporting against structured data can be viewed as a specific type of authoring process; any reporting output is likely to be produced and submitted to the more general publishing process.
A number of key considerations need to be taken into account as part of the reporting capability:

  • Number of reports: The higher the number of reports, the more likely it is that purchasing a pre-built vendor solution is the right approach. Reporting tools typically make creating new reports easier (by offering re-usable components) and also provide report management systems to make maintenance and support functions easier.
  • Desired Report Distribution Mode(s): will reports only be distributed in a single mode (for example, email only, or over the browser only), or will users access the reports through a variety of different channels?
  • Ad Hoc Report Creation: in most environments, it is expected that end users will be able to create their own ad hoc reports. Ad hoc report creation necessarily relies on a strong metadata layer and shared understanding of what the information presented in the report is communicating.
  • Data source connection capabilities: in most modern environments, users will need to access data sources using both relational database and OLAP multidimensional data technologies.
  • Scheduling and distribution capabilities: in a realistic data usage scenario, senior executives will only have time to come in on Monday morning and look at the most important information from the previous week. To meet this need, the reporting tool must have scheduling and distribution capabilities. Weekly reports are scheduled to run on Monday morning, and the resulting reports are distributed to the senior executives either by email or web publishing.
  • Security Features: reporting tools are geared towards a number of users in different Business Units and teams, with different priorities and responsibilities. Therefore, ensuring that people see only what they are supposed to see is important. Most reporting tools have capabilities to manage security at different levels, including at the report level, folder level, column level, row level, or even individual cell level. Furthermore, they have a security layer that can interact with the common corporate login protocols and "single sign-on" policies.
  • Export capabilities: data export is commonly required for Excel, flat file, and PDF formats. It may also be desirable and time-saving to export the reporting format as well as the data itself.
  • Integration with the Microsoft Office environment: it is likely that reporting information will need to be incorporated into documents created with Microsoft Office products, especially Excel, both for manipulating data and for publishing. Some reporting tools now offer a Microsoft Office-like editing environment for users, so all formatting can be done within the reporting tool itself, with no need to export the report into Excel.
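To make the security consideration above a little more concrete, here is a minimal sketch of row-level and column-level filtering applied to report data before it reaches a user. It assumes each row is a plain dictionary with a `business_unit` key; that key, the function name and the sample column names are all hypothetical, and real reporting tools implement this through their own security layers rather than hand-written code like this.

```python
from typing import Dict, List, Set

Row = Dict[str, object]


def apply_report_security(
    rows: List[Row],
    allowed_units: Set[str],
    allowed_columns: Set[str],
) -> List[Row]:
    """Return only the rows and columns a user is entitled to see."""
    visible: List[Row] = []
    for row in rows:
        # Row-level security: skip rows belonging to other business units.
        if row.get("business_unit") not in allowed_units:
            continue
        # Column-level security: drop any column the user may not see.
        visible.append({k: v for k, v in row.items() if k in allowed_columns})
    return visible
```

The point of the sketch is simply that "who sees what" is a property of the data and the audience, so it needs to be agreed as part of governance before it can be configured in any tool.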

Strategic Intelligence and Data Mining

Data mining is the process of discovering new patterns and inferences in large data sets, involving a range of methods and techniques such as artificial intelligence, machine learning, statistics and database systems.
The goal of data mining is to extract knowledge from a data set in a human-understandable structure and may involve a complex process of database and data management, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of found structure, visualisation and online updating.
It is likely that a risk-based approach will need to incorporate information processing and data analytic features including:

  • Anomaly detection: identification of outliers, changes and deviations in the data records that might be either interesting findings or data errors, and that require further investigation.
  • Association rule learning: searching for relationships and dependencies between variables.
  • Clustering: discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.
  • Classification: identifying and applying a known, generalised structure or categorisation to new data. (For example, an email program might attempt to classify an email as legitimate or spam.)
  • Regression: discovery of an approximation function that models the data with the least error.
  • Summarisation: providing a more compact representation of a data set, including visualisation and report generation.
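Two of the techniques listed above can be illustrated with a few lines of standard-library Python. This is only a sketch under simple assumptions: anomaly detection is reduced to a z-score test (flagging values more than a chosen number of standard deviations from the mean), and regression to an ordinary least-squares straight line. The function names and the default threshold are my own choices for illustration; real data mining work would normally use a dedicated statistics or machine learning library.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def find_outliers(values: Sequence[float], threshold: float = 2.0) -> List[float]:
    """Flag values whose z-score exceeds the threshold (simple anomaly detection)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept (simple linear regression)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept
```

Note that a flagged value is only a candidate: as the list above says, it might be an interesting finding or a data error, and it takes someone who understands the business context to tell which.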

At a minimum, content has to be written and it has to be posted. Between those two steps, it is usually checked for its writing quality and correctness. Legal and compliance may need to review it. Ideally, any publishable material will be reviewed by a high-level editor or editorial board to make sure that it is consistent in style and fact with other information already in the published domain.
As with the searching process, the authoring process will need to be context-aware to ensure that information is defined and used appropriately, within both the context within which it was created, and the context of any intended (or unintended) usage.

The capacity to provide timely, compelling and concise advice to inform senior decision makers and executives is a vital capability for any organisation.
The Executive Briefing process therefore requires departments and business units to be able to locate, collate, and interpret the available information, such that the context and rationale for any decision can be supported and substantiated.

In simple terms, education provides a knowledge base that underpins any other activities the individual may engage in at a later stage. Training is not as general and tends to concentrate on skills development for the purposes of a specific skill or task. Learning tends to be associated with the self-development of the individual.
Capability for education, training and learning of staff is a key aspect of service improvement in the University. In support of this, organisations need to provide an information sharing capability that enables all staff to access the process, policy and knowledge resources pertinent to their role.

Many information users will be interested in finding material that has been authored by someone else in the organisation.
Assuming this content has been made available for others to access, the capability for finding, retrieving and accessing the required material may be many and varied, depending on a number of factors including: the nature of the content medium; the physical locations of the originator and consumer; the mechanisms available; other content that the consumer may wish to combine.
The nature of information content will also be dependent upon both the context within which it was created, and the context of the intended usage. Any search and retrieval process will need to be context-aware to ensure that information is used appropriately.
A technology-enabled approach to content search and retrieval will become increasingly important. However, it is also important to give due consideration to the governance authorities and control processes that define what content is to be made available.
It should also be noted that any content search capability does not stand alone and needs to be fully integrated with content authoring and publication processes and systems. As such, the search process is likely to be implemented as part of integrating Records Management, Document Management and Knowledge Management solutions.

Records Management is the practice of maintaining the records of an organisation from the time they are created up to their eventual disposal. This may include classifying, storing, securing, and destruction (or in some cases, archival preservation) of records. A record can be either a tangible object or digital information, such as office documents, databases, application data, and e-mail.
The ISO 15489-1:2001 Standard defines Records Management as "[the] field of management responsible for the efficient and systematic control of the creation, receipt, maintenance, use and disposition of records, including the processes for capturing and maintaining evidence of and information about business activities and transactions in the form of records". The standard defines “records” as "information created, received, and maintained as evidence and information by an organisation or person, in pursuance of legal obligations or in the transaction of business".
Records Management is primarily concerned with the evidence of an organisation's activities, and is usually applied according to the value of the records rather than their physical format. While there are many purposes of and benefits to records management, as both these definitions highlight, a key feature of records is their ability to serve as evidence of an event. Proper records management can help preserve this feature of records.
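The lifecycle described above, from creation through to destruction or archival preservation, can be sketched as a simple disposition check. This is a hedged illustration only: the record classes, the retention periods in `RETENTION_YEARS` and the `archival_value` flag are all invented for the example, and real retention schedules are set by legislation and the organisation's own disposal authorities.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention schedule: record class -> years to retain.
RETENTION_YEARS = {"financial": 7, "correspondence": 2}


@dataclass
class Record:
    title: str
    record_class: str
    created: date
    archival_value: bool = False  # hypothetical flag for permanent preservation


def add_years(d: date, years: int) -> date:
    """Add whole years to a date, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def disposition(record: Record, today: date) -> str:
    """Return 'retain', 'archive' or 'destroy' for a record as of today."""
    due = add_years(record.created, RETENTION_YEARS[record.record_class])
    if today < due:
        return "retain"
    return "archive" if record.archival_value else "destroy"
```

The decision depends on the class and value of the record, not on whether it is a paper file or an email, which is the point made above about value rather than physical format.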

Many jurisdictions now make legislative provision based on the principle that government information is a public resource to be managed in the public interest. Such instruments give citizens the right to make requests to access Government documents. Similarly, where personal information is retained by an Agency, the individual has the right to request access to those records.

Tuesday, 8 January 2013

Data Quality 2013: Asia Pacific Congress

I'm delighted to be participating in the Ark Group's forthcoming Data Quality Asia Pacific Congress, to be held from 5-7 March at Citigate Central in Sydney.

The DQ Congress has been running annually since 2005 and features international keynote presenters and practitioner case studies, bringing attendees up-to-date information on topics such as managing unstructured and complex data, master data management, effective data governance, new innovations and technologies, and communicating about data quality.

As well as contributing to an "Ask the Experts" trouble-shooting panel, I will also be hosting a half-day interactive workshop, "Maximising value of your information assets: Creating and maintaining the Information Asset Register".

This workshop will equip participants with an understanding of the key factors associated with a successful organisational approach to Information Asset Management, and focuses on practical methods and processes for building and maintaining the Information Asset Register as a building-block for maximising business value. Key topics include:
  • What do we mean by “Information as an Asset”?
  • What do I need to include in the Information Asset Register?
  • Who needs to be accountable for information?
  • How does the Information Asset Management process work?
  • How do I measure the value of an Information Asset?
  • What other Information Management best-practices do I need to be aware of? 
I hope to see you there!

Thursday, 3 January 2013

Who is the Highlander?

Whenever we discuss the need for Data Governance, one area that rightly receives significant attention is the explicit identification of those stakeholders who need to be more formally accountable for the organisation’s data. The roles and responsibilities of “Data Owner” and “Data Steward” are debated, and Steering Committees, Information Boards and other collective groups are eventually formed which bring the participants together on an ongoing basis.

With a fair following wind, people will start talking to each other for a change and data will be the topic of conversation. If you’re really lucky, everyone might start playing nicely in the sandpit and you might actually get some positive action.

So far, so egalitarian. However, this collective, co-operative approach is not without its challenges. I’ve worked with several organisations that have started out on a path towards more formal Data Governance but have failed to get beyond the “talking shop” stage. They’ve established working parties to focus on data quality issues, they’ve agreed formal Data Principles that define the expectations, and some groups have even put in place monitoring, measurement and cleansing services to correct “bad” data. Yet somehow the overall result is that nothing much actually changes and that “Data Governance” is ultimately seen as delivering little real value. (If things go really badly, then behaviours of distrust, protectionism, and entrenchment across territorial boundaries can even become reinforced).

Not so when I was working on a major data and analytics project with my colleagues in Germany a while back.

The project had solid foundations. The overall benefits case was strong, the investment had been approved, there was clear focus on the data rather than the technology, we had trialled the solution within our UK business operations and there was already active and willing participation from the leaders of each business line. To this point, the initiative had progressed to investment approval stage through collaboration and consensus and I was expecting this to continue as the decision-making approach for the project proper. However, when I brought everyone together to confirm their roles within the collective Governance model, the project sponsor (Projektsführer) asked the room “Who is the Highlander?” (Imagine this spoken with a thick German accent for best effect…)

At first, I thought this was a Teutonic joke at my expense – as the token “Ausländer” on the project and a proud Scotsman, we’d already exchanged plenty of jokes about kilts, bagpipes, haggis and whisky. But my amusement (and slight bemusement) wasn’t shared by the rest of the group, who were clearly treating this as a serious question. What the heck was going on?!

The next part of the conversation went something like this:

Projektsführer: “Ach so. You have seen ze film, ‘Ze Highlander’ ja? Mit Christopher Lambert und Sean Connery?”
Me: “Ummm, yes….”
PF: “Zere can be only vun.”
Me: “Errrr. Pardon?”
PF: “There can be only one. In ze film, ze immortals all die until zere is only one left, und ze Highlander is ze one who gets all their power. So for our data project, who is the Highlander? Who is ze person that we all trust to make it work?”

And then the penny dropped. Somewhat bizarrely, my German colleagues had established a common metaphor for their Governance model, based on a poor-quality but highly entertaining science fiction film from the 1980s in which a French actor plays the Scottish hero and a world-famous Scottish actor plays an Egyptian swordsman working for the King of Spain. (http://en.wikipedia.org/wiki/Highlander_(film) and http://www.imdb.com/title/tt0091203/ )

When the sponsor asked “Who is the Highlander?”, everyone in the room (except me) knew implicitly what was expected – that whoever was assigned the role of leading the whole initiative would be granted the authority to act on everyone’s behalf, but would be ultimately answerable to the whole group. They would need to drive the whole process, facilitate and arbitrate, ensure that everything came together, and be held totally accountable for achieving a successful outcome. The German approach to Governance requires strong leadership from a committed individual, as well as active and willing participation from the wider stakeholder community.

And then they all turned to me and started laughing. Because for this project, the Highlander was, quite literally, going to be a Highlander…