He only heard the term “big data” about three months ago, but Judea Pearl could wind up being the man who has the biggest impact on the way organizations of every kind derive meaning from vast amounts of information.
Pearl, a professor at UCLA, was in Toronto this week to deliver the 2012 Turing Award Lecture at the AAAI Conference on Artificial Intelligence. The A.M. Turing Award is sometimes described as the Nobel Prize for computer science, and Pearl’s work was honoured earlier this year for developing a calculus for probabilistic and causal reasoning that might get us closer to machines that can reason the way human beings do.
For example, while it’s reasonably straightforward to teach a computer to see things in black and white, recognizing shades of gray is more difficult. Pearl’s work, which he calls metasynthesis, helps IT systems grapple with uncertainty and cause-and-effect relationships. For all the complaints you hear about Apple’s virtual assistant on the iPhone 4S, Siri, the research Pearl has conducted makes such innovations possible.
Now Pearl believes his theory could address some of the issues around big data, which he defines as the task of taking data from a variety of disparate environments, each collected under different conditions, and trying to get an answer to a question.
“I see a lot of expectations from big data built on the assumption that there will be an increase of understanding and an increase of interpretability, the ability to predict,” he told expertIP. “Those things do not normally come together. You can have infinite samples of something and not be able to interpret anything.”
Pearl used the example of taking data from medical studies of thousands of hospitals and millions of patients and trying to figure out whether a specific group of patients should get an operation. “How can you squeeze out of (medical studies) that which is common, and that which is common with the question?” he asked. “What people do in medicine is take a weighted average of all the studies. That’s like taking the weighted average of apples and oranges to discover the properties of bananas.”
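To make the apples-and-oranges point concrete, here is a minimal sketch of the kind of sample-size-weighted average Pearl is criticizing. The three studies, their populations and their numbers are invented purely for illustration; they do not come from Pearl or from any real trial.

```python
# Hypothetical studies, made up for illustration only:
# (population studied, patients enrolled, estimated benefit of the operation)
studies = [
    ("adults under 40", 1000, 0.20),
    ("adults 40 to 65", 3000, 0.05),
    ("adults over 65", 6000, -0.10),
]


def weighted_average(studies):
    """Pool the estimated benefits, weighting each study by its enrolment."""
    total_patients = sum(n for _, n, _ in studies)
    return sum(n * benefit for _, n, benefit in studies) / total_patients


pooled = weighted_average(studies)
print(f"Pooled benefit: {pooled:+.3f}")

# The pooled figure is dominated by the largest study, so it says little
# about a patient who belongs to one of the smaller groups.
for population, _, benefit in studies:
    print(f"{population}: study says {benefit:+.2f}, pooled says {pooled:+.3f}")
```

In this made-up case the pooled number comes out slightly negative, even though the study of younger adults found a clear benefit. The average answers a question about a blended population that no one actually asked.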
The key, according to Pearl, is identifying areas of “invariance”: the relationships that hold steady from one study to the next, and so pinpoint what causes things to happen.
This is not an easy task, and Pearl’s theory still needs to be translated into analytic applications that will probably require major network capabilities, but his work is a good reminder that there are no shortcuts to doing big data right. As this kind of research reaches the commercial market, we need to start talking more about probability before we try to guarantee predictability.