Data is accumulating in firms far faster than firms are learning to analyse it. That mismatch, between the growth of data and firms' ability to make decisions based on it, is the biggest disjunction in global commerce.
Narrowing the gap between data generation and analytical capability is a strategic imperative for firms. Unfortunately, in making their long-term plans for data analytics capability, many firms have the wrong commercial instincts. They are planning to keep their data closely guarded and develop in-house analytics expertise, supplemented by a single external analytics partner. That is the opposite of what many firms should do. Instead, they should be planning to outsource the analysis of their data in a framework in which analytics firms compete for compensation based on how much value they create.
There is a firm named Kaggle (now owned by Google) that organises competitions in which a prize is offered for the best solution to a big data problem. The founder and CEO, Anthony Goldbloom, is an alumnus of the University of Melbourne.
In one Kaggle competition the Spanish bank Santander asked contestants (data analysts) to solve the problem of early identification of dissatisfied customers.
Kaggle's most famous competition, thus far, asked contestants to analyse MRI images of a large number of heart disease patients. The task was to develop a way of automatically calculating the pumping efficiency of hearts from MRI scans. It is worthwhile listing the profound results of that competition:
- The winning algorithm is as accurate as cardiologists in using MRI images to measure the pumping efficiency of patients' hearts. But what takes cardiologists 20 minutes can be completed in seconds with the winning algorithm running on very powerful computers.
- Nearly 1000 contestants participated, but the winning team members, Tencia Lee and Qi Liu, came from a completely unexpected field - they are both former hedge fund employees.
- Lee and Liu had not met before the competition began. They were placed 8th and 9th in the first round of the competition and then teamed up to form the winning team in the second round.
- They used a technique (convolutional neural networks) that they had no previous experience with. Further, they used open source software to implement the technique.
I love that story - everything about it is so 2017 (except that it actually happened in 2016). But before drawing any conclusions from it, we should at least acknowledge the reporting bias. If that competition had been won by radiologists, instead of hedge fund quants, then we would never have heard about it. It is only news because it is unexpected. "Man bites dog" is news, but "dog bites man" does not get reported. We should not then infer that "man bites dog" is the norm just because it was reported.
I recounted the story because it illustrates guiding principles for firms that are creating their data analytics strategies. Three principles in particular come out - openness, outsourcing and competition.
1. Openness of data: There are big benefits to firms opening up their data to analysis by outsiders. Many firms need to flip their mindset about allowing outsiders to access their privately collected data.
2. Outsourcing (make or buy): For most firms it is hopelessly naïve to think that they can develop world-class data analytics in-house.
3. Competition: The best analysts of a corporate data problem are not necessarily experts in that industry. Firms can't know who will be best at analysing their data, but competition is helpful in bringing forward the best techniques.
Openness of data
Please note that when I refer to openness of data I don't mean the 'open data movement' or 'data democracy' or any other utopian vision of data sharing. I am talking about firms allowing a well-defined set of expert data analysts to access their data from a private data room.
A lot of CEOs and boards have exactly the wrong mindset about the openness of their data to analysis by outsiders. They should be thinking 'why wouldn't we allow outsiders to create value by analysing our data, in exchange for a share of the value created?'. But boards often can't think in terms of openness because they are so attuned to scanning their environment for governance issues and risk minimisation. Instead, they default to 'why would we let anyone see our data?'.
That mindset severely narrows the possible strategies for closing the gap between data generation and analysis. And it is unnecessary. Yes, there are factors that point in the direction of keeping data in-house: especially customer privacy, competition issues and regulation. But it is easy to exaggerate those concerns.
Customer privacy is more of an issue when a firm sells its data. In any case, it is usually straightforward to de-personalise data before it is shared with outside analysts.
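To make the de-personalisation step concrete, here is a minimal sketch in Python. It drops directly identifying fields and replaces the customer key with a keyed hash, so outside analysts can still link a customer's records without learning who the customer is. The field names (`name`, `email`, `phone`, `customer_id`) and the function `depersonalise` are hypothetical, chosen only for illustration. Which fields count as identifying in a particular firm is an assumption that the firm's insiders would have to settle.

```python
import hashlib
import hmac

# Hypothetical field names, for illustration only.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}
KEY_FIELD = "customer_id"


def depersonalise(records, secret_key):
    """Return copies of the records with direct identifiers dropped and
    the customer key replaced by a keyed hash (a pseudonym)."""
    cleaned = []
    for record in records:
        # Keep every field except the directly identifying ones.
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        # The same customer always maps to the same pseudonym, but the
        # mapping cannot be recomputed without the firm's secret key.
        digest = hmac.new(
            secret_key, str(record[KEY_FIELD]).encode("utf-8"), hashlib.sha256
        )
        row[KEY_FIELD] = digest.hexdigest()[:16]
        cleaned.append(row)
    return cleaned


customers = [
    {"customer_id": 101, "name": "A. Smith", "email": "a@example.com",
     "phone": "555-0101", "monthly_spend": 420.0, "complaints": 3},
    {"customer_id": 101, "name": "A. Smith", "email": "a@example.com",
     "phone": "555-0101", "monthly_spend": 390.0, "complaints": 4},
    {"customer_id": 202, "name": "B. Jones", "email": "b@example.com",
     "phone": "555-0202", "monthly_spend": 75.0, "complaints": 0},
]

shared = depersonalise(customers, secret_key=b"kept-inside-the-firm")
```

The secret key stays inside the firm, so only insiders can map a pseudonym back to a customer. The fields that remain, such as spending and complaints, are the ones the analysts actually need.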
Competition concerns are sometimes real, but often over-stated. Firms that are engaged in innovation races, or exploration, will want to hide their data. Firms for which their data is a direct input into the creation of their product, such as banks undertaking credit analysis, will worry about competitors accessing their data. Firms that must make choices, such as pricing, which can be 'gamed' by competitors have reason to fear leakage of those decisions.
But for those fears to be realised, external analysts would have to breach their non-disclosure agreements and the competitors would have to engage in industrial espionage. Competitors mostly need to focus on analysing their own data if they are to keep up.
Finally, regulators' concerns about data openness sometimes need to be managed, but in most industries there are no rules that govern the sharing of data with external analysts.
CEOs and boards should be clear on where their firm sits on the openness spectrum. A few firms will need to keep their data close to their chest. They are on the right hand side of the spectrum. Others can be very open with their data and they are on the left.
Outsourcing (make or buy)
If data analytics is so central to value creation then why wouldn't firms develop their own analytics capability in-house? If data analytics is actually the product, as it is for firms like Palantir, Google or Facebook, then of course the capability will be in-house. The same applies if data analytics enters directly into the creation of the product, as it does for Uber in route planning or ANZ in credit scoring - then in-house data analytics capability is needed.
But for most firms data analytics enters into a function, not a product. It goes into better marketing, better design, operations, procurement, recruitment, etc. In that case data analytics is an input to production. Like any input to production the firm faces a 'make or buy' decision. Should a manufacturer buy parts that go into production, or alternatively, make them in the firm? Should a mining firm conduct mining with its own employees and equipment, or should it use a mining contractor? Should we have a computing department or a contract for provision of computing services?
Data analytics is no different. Each firm must decide how much capability will be built inside the firm and how much will be brought in through contracts with expert outsiders. And, where data analytics is contracted for, the firm must choose the types of relationships that will be developed with the data analytics firms.
If a firm does outsource its data analytics capability then the role for the firm's insiders is to collect the right data, improve the quality of the data and frame the right questions. That is more than enough for the insiders of the firm to focus on. Let the outsiders analyse the data and combine it with other data.
Relationships with suppliers
If a firm decides to outsource its data analytics capability, then should the firm have an exclusive arrangement with a single data analytics provider? Or should it take the opposite approach of making the data available to all comers in a form of competition in which analytics firms are rewarded for the value that they create from analysing the data? Naturally, most firms will do something between these extremes.
Exclusive relationships are important if data security is a key issue or if the analysis of the data requires a deep understanding of the firm. It is easy to exaggerate the importance of either of these conditions - data security or firm specificity - in data analytics. Moreover, many parties have an incentive to overstate these conditions. CEOs and boards need to push back on that.
For many firms the best relationship with outside suppliers of data analytics capability won't be a deep, exclusive relationship with a single analytics supplier, but instead a competition between a set of suppliers who all have access to the firm's data. More suppliers means more competition. Even more importantly, competition will reveal which outsiders are most capable of analysing your firm's data. Otherwise, how would you know? That is a key point in the Kaggle example. Who could have known, without a competition, that the hedge fund pair had the best technique?
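One way such a competition could be run is sketched below in Python. It is an illustration under stated assumptions, not a description of how Kaggle or any particular firm pays its contestants. The firm keeps back a set of actual outcomes that the suppliers never see, scores each supplier's predictions against it, and splits a prize pool in proportion to each supplier's improvement over the firm's existing in-house accuracy. The supplier names, the baseline figure, the prize pool and the function `share_prize` are all hypothetical.

```python
def accuracy(predictions, actuals):
    """Fraction of held-back outcomes that a supplier predicted correctly."""
    matches = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return matches / len(actuals)


def share_prize(submissions, actuals, baseline_accuracy, prize_pool):
    """Split prize_pool among suppliers in proportion to how far each one
    beats the firm's baseline. Suppliers at or below the baseline get nothing."""
    lifts = {}
    for supplier, predictions in submissions.items():
        lift = accuracy(predictions, actuals) - baseline_accuracy
        lifts[supplier] = max(lift, 0.0)
    total_lift = sum(lifts.values())
    if total_lift == 0:
        # Nobody beat the baseline, so nothing is paid out.
        return {supplier: 0.0 for supplier in lifts}
    return {
        supplier: prize_pool * lift / total_lift
        for supplier, lift in lifts.items()
    }


# Outcomes held back by the firm (1 = customer left, 0 = customer stayed).
held_back = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical suppliers and their predictions for the same customers.
submissions = {
    "supplier_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "supplier_b": [1, 0, 1, 1, 0, 0, 0, 1],
    "supplier_c": [0, 1, 1, 1, 0, 0, 0, 1],
}

payouts = share_prize(
    submissions, held_back, baseline_accuracy=0.5, prize_pool=90_000
)
```

Accuracy is used here only because it is simple. In practice the firm would want to score suppliers on a measure closer to the money value created, and it could just as easily pay only the winner, as a prize competition does.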
Open, outsourced, competition
A few people have told me that I am naïve to think that firms can open up their data and introduce competition into its analysis by outsiders. My response is that it is naïve to think that firms with an ever-widening gap between the volume of data they collect and their capacity to analyse it can survive and thrive into the future.
CEOs and boards need to be brave about data analytics. Many firms will never close the gap between data generation and analytics capability unless they adopt a more open, outsourced, competitive mindset for their data strategy.
Copyright 2017 Sam Wylie