The biggest of the big data challenges

At this week’s World Congress on Information Technology in Montreal, a Microsoft researcher made some interesting points about how complex the tools we’re using to sift through information have become


Some time ago, Danah Boyd was sitting on a panel next to an executive from Coca-Cola, talking about social media in general, and in particular about being “liked” on Facebook.

“He talked about all the people who were saying they like Coke, and that they were so happy to have all these followers,” she said. “I burst out laughing, because that was not the ‘coke’ many of them were talking about.”

Boyd, a senior researcher with Microsoft, made her comments on yet another panel discussion at the 18th World Congress on Information Technology (WCIT), which took place this week in downtown Montreal. The session was ostensibly about how social media is creating a new kind of “public square,” but many of Boyd’s comments went a long way toward explaining why big data is becoming such a controversial subject in organizations that aren’t sure how they are going to handle all the unstructured information found on platforms like Facebook and Twitter, and in their own messy databases.

While most experts talk about big data in the context of three “Vs” – volume, variety and velocity – there’s an underlying assumption that, with the right tools and the necessary sorting of data, accurate insights will emerge to help organizations improve the way they serve customers. But Boyd warned that we may be too trusting of tools that are themselves becoming more complex than ever before.

“In order to navigate these data issues that we face, we have all this work being done by algorithms, so that even the engineers don’t know what’s being done,” she said. “If you think about PageRank at Google, no one even knows anymore what the complex nature of PageRank looks like.”
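For a sense of what Boyd is pointing at, it helps to remember how simple the published starting point was. The sketch below is a toy version of the original PageRank idea from Brin and Page’s 1998 paper, a damped random walk over a link graph solved by power iteration. It is purely illustrative and bears no resemblance to Google’s production ranking system, which is exactly her point.

```python
import numpy as np

# Toy sketch of the *original* PageRank idea (Brin & Page, 1998): a damped
# random walk over a link graph, solved by power iteration. Illustrative only;
# Google's production ranking long ago outgrew this formulation.

def pagerank(links, damping=0.85, tol=1e-8):
    """links[i] lists the pages that page i links to."""
    n = len(links)
    ranks = np.full(n, 1.0 / n)
    while True:
        # Every page gets a baseline share from the random-jump term.
        new_ranks = np.full(n, (1.0 - damping) / n)
        for i, outgoing in enumerate(links):
            if outgoing:
                # Each page shares its damped rank equally among its links.
                for j in outgoing:
                    new_ranks[j] += damping * ranks[i] / len(outgoing)
            else:
                # A page with no outgoing links spreads its rank evenly.
                new_ranks += damping * ranks[i] / n
        if np.abs(new_ranks - ranks).sum() < tol:
            return new_ranks
        ranks = new_ranks

# Page 2 is linked to by both other pages, so it ends up ranked highest.
print(pagerank([[1, 2], [2], [0]]))
```

Twenty-odd lines of textbook math. The distance between that and whatever actually ranks a search result today is the opacity Boyd is describing.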

As the algorithmic nature of these systems goes beyond anyone’s knowledge, Boyd said she believes the biggest problem may be the biases that get baked into them.

“We want them to be neutral, but they’re not,” she said. “As that evolves you’ll see more assumptions being made about people based on misperceptions about them. This is actually a much messier space, and we’re seeing a lot more of that as these things scale.”

So many organizations I talk to are just trying to figure out what represents “big data” in their organization or industry sector. They are still evaluating the tools, often ignoring the recommendation engines and other systems that already exist and are processing big data right in front of them. Given the pressure on CIOs to be customer-focused and innovative, there may be a rush to get insights as soon as possible, but maybe IT departments first need a more sustained conversation about “false positives” in big data, and how to recognize them.
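To make the false-positive worry concrete, here is a back-of-the-envelope sketch. The numbers are hypothetical, not from Boyd’s talk, but the arithmetic is the standard base-rate effect: when the behaviour you are hunting for is rare, even an impressively “accurate” model produces mostly wrong flags.

```python
# Hypothetical numbers, purely illustrative: why rare events make even
# "accurate" big-data models generate mostly false positives.

population = 1_000_000
base_rate = 0.001            # 0.1% of records genuinely show the behaviour
sensitivity = 0.99           # the model catches 99% of true cases
false_positive_rate = 0.01   # it also wrongly flags 1% of normal records

true_cases = population * base_rate                                 # 1,000
true_positives = true_cases * sensitivity                           # 990
false_positives = (population - true_cases) * false_positive_rate   # 9,990

precision = true_positives / (true_positives + false_positives)
print(f"Records flagged: {true_positives + false_positives:,.0f}")  # 10,980
print(f"Flags that are actually right: {precision:.1%}")            # ~9.0%
```

Nine correct flags out of every hundred is a very different “insight” than the dashboard will suggest, which is why that conversation is worth having before the rush.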

Many WCIT speakers talked about the challenges of complexity in IT, but there’s no getting away from the fact that much of what IT departments will be running in the future will be a lot like the sophisticated circuitry in the average car today. Most of us don’t even bother to look under the hood. Boyd’s comments are a good reminder that while CIOs may not be able to master what’s under the hood either, they need to make sure the car they’re buying isn’t going to take them – and their organization – in the wrong direction.

