<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Speculist &#187; Big Data</title>
	<atom:link href="https://blog.speculist.com/tag/big-data/feed" rel="self" type="application/rss+xml" />
	<link>https://blog.speculist.com</link>
	<description>Live to see it.</description>
	<lastBuildDate>Thu, 25 Jul 2019 23:07:25 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.6.1</generator>
		<item>
		<title>Where the Possibilities Are</title>
		<link>https://blog.speculist.com/big-data/where-the-possibilities-are.html</link>
		<comments>https://blog.speculist.com/big-data/where-the-possibilities-are.html#comments</comments>
		<pubDate>Wed, 28 Jan 2015 22:36:37 +0000</pubDate>
		<dc:creator>Phil Bowermaster</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Book of the Possible]]></category>
		<category><![CDATA[Datafication]]></category>

		<guid isPermaLink="false">https://blog.speculist.com/?p=4955</guid>
		<description><![CDATA[Where does the value of big data truly present itself, in the data itself or in the algorithms we use to make sense of it? Bill Franks of Teradata comes down sharply on the side of the data: &#8230;I’m convinced that new information will beat new algorithms and new metrics based on existing information almost [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><img class=" wp-image-4956 alignright" alt="Possibilities" src="https://blog.speculist.com/wp-content/uploads/2015/01/Possibilities.jpg" width="390" height="260" />Where does the value of big data truly present itself, in the data itself or in the algorithms we use to make sense of it? <a href="http://www.forbes.com/sites/teradata/2015/01/06/a-man-his-dog-and-their-microchips/">Bill Franks</a> of Teradata comes down sharply on the side of the data:</p>
<p style="padding-left: 30px;">&#8230;I’m convinced that <em>new information will beat new algorithms and new metrics based on existing information almost every time.</em> Indeed, new information can be so powerful that, once it is found, analytics professionals should stop worrying about improving existing models with existing data and focus instead on incorporating and testing that new information.</p>
<p>By &#8220;new information,&#8221; he means information that didn&#8217;t exist before or that we now have to a level of depth never before possible. Sensor data in Internet of Things environments can represent either of these kinds of data. For example, we may have always used temperature data in performing some calculation, but back in the day we used a daily average. Now we have sensors providing temperature data every few minutes (or seconds). That&#8217;s data to a greater depth. For data that we didn&#8217;t have before, Bill cites sensors on cars that track wear and tear as the vehicle is driven. Previously, vehicle repair occurred in a primarily reactive way. Now we can begin to anticipate repairs before they are needed.</p>
<p>Somehow this reminds me of a talk that Eliezer Yudkowsky gave at the <a href="http://archive.today/I6akm">Singularity Summit</a> back in 2007. He said:</p>
<p style="padding-left: 30px;">In the intelligence explosion the key threshold is criticality of recursive self-improvement. It&#8217;s not enough to have an AI that improves itself a little. It has to be able to improve itself enough to significantly increase its ability to make further self-improvements, which sounds to me like a software issue, not a hardware issue. So there is a question of, Can you predict that threshold using Moore&#8217;s Law at all?</p>
<p style="padding-left: 30px;">Geordie Rose of D-Wave Systems recently was kind enough to provide us with a startling illustration of software progress versus hardware progress. Suppose you want to factor a 75-digit number. Would you rather have a 2007 supercomputer, IBM&#8217;s Blue Gene/L, running an algorithm from 1977, or a 1977 computer, an Apple II, running a 2007 algorithm? And Geordie Rose calculated that Blue Gene/L with 1977&#8217;s algorithm would take ten years, and an Apple II with 2007&#8217;s algorithm would take three years.</p>
<p>There is a progression here, albeit a counter-intuitive one. We might be inclined to think that hardware adds more value than &#8220;mere&#8221; software and that software is inherently more valuable than &#8220;mere&#8221; (or the term we like to throw around a lot is &#8220;raw&#8221;) data. The opposite turns out to be the truth. The data itself is where the value is. Hardware and software only help us to focus on the potentialities, the possibilities, that it already contains.</p>
]]></content:encoded>
			<wfw:commentRss>https://blog.speculist.com/big-data/where-the-possibilities-are.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Datafication in Three Easy Steps</title>
		<link>https://blog.speculist.com/scenarios/datafication-in-three-easy-steps.html</link>
		<comments>https://blog.speculist.com/scenarios/datafication-in-three-easy-steps.html#comments</comments>
		<pubDate>Sat, 20 Sep 2014 04:48:25 +0000</pubDate>
		<dc:creator>Phil Bowermaster</dc:creator>
				<category><![CDATA[Acceleration]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Datafication]]></category>
		<category><![CDATA[Ephemeralization]]></category>
		<category><![CDATA[Scenarios]]></category>
		<category><![CDATA[Movies]]></category>

		<guid isPermaLink="false">https://blog.speculist.com/?p=4899</guid>
		<description><![CDATA[The relentless wave of change that is transforming our world from being one made primarily out of stuff to one made primarily out of data has a name. It&#8217;s called datafication. Over the past few decades, we have witnessed the datafication of business, of society, and of everyday life. There appear to be three major [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>The relentless wave of change that is transforming our world from being one made primarily out of stuff to <a href="https://blog.speculist.com/big-data/bigger-than-we-realize.html">one made primarily out of data</a> has a name. It&#8217;s called <a href="http://en.wikipedia.org/wiki/Datafication">datafication</a>.</p>
<p>Over the past few decades, we have witnessed the datafication of business, of society, and of everyday life. There appear to be three major phases of datafication. In the first phase, an activity or process becomes increasingly reliant on data. In the second, data begins to transform the activity or process by taking a central role in its execution. In the third phase, the activity is moved entirely into the data substrate.</p>
<p><img class="wp-image-4904 alignright" alt="film" src="https://blog.speculist.com/wp-content/uploads/2014/11/film.jpg" width="199" height="331" /></p>
<p>Take the movie business. Putting artistic considerations aside, the success of any film has always been a measurement of how much revenue it generates. Originally, this was a pretty straightforward matter of counting box office receipts. (Today, what with many and varied distribution channels and considerations such as licensing and merchandising that often come into play, the math for calculating success is considerably more complex.) The film industry entered the first phase of datafication relatively early on, as studios began trying to develop formulas for repeat box office success.</p>
<p>The data points were, at first, relatively few and far between: geographic differences in box office; one star&#8217;s draw vs. another; westerns vs. romances vs. war movies vs. musicals; summer releases vs. Christmas releases. Over time, the analysis evolved in terms of sophistication until the industry reached the second phase of datafication. This is how we came to live in an age of scripts written for a target adolescent male audience and re-edits and even rewrites following test screenings. The data began to drive the process.</p>
<p>But data wasn&#8217;t done with the movies yet. The film industry is moving rapidly into the third phase of datafication. Once upon a time, filmmakers made <em>films.</em> Long strips of celluloid with images on them. We&#8217;ve all heard of efforts to preserve decaying movies from the early part of the last century. Film was a chemical and mechanical process resulting in a physical artifact. But not today. The product of the film-making process is now essentially a data artifact. Movies are consumed over digital networks on TVs, laptops, and smartphones. And, in fact, they can now be <em>made</em> <a href="http://www.thewrap.com/movies/article/director-movie-made-smartphones-puts-all-his-chips-oscar-34080/">entirely on smartphones</a>. Short messages, tweets, motion pictures&#8230;it&#8217;s all the same. It&#8217;s all data.</p>
<p>The big data revolution is ultimately about this kind of transformation in all sectors of all industries. The movie and music businesses are obvious examples of industries that have made it at least part of the way to phase three. But then so is the telecommunications industry. Shipping and logistics have become as much about data as they are about moving stuff around. Even manufacturing is moving in that direction &#8212; and will continue to do so as digital fabrication and 3D printing become increasingly mainstream.</p>
<p>Right now the world as a whole is really just beginning to move from phase 1 to phase 2. Data is beginning to influence and direct the world in ways never before considered. And we are still in the very early days.</p>
]]></content:encoded>
			<wfw:commentRss>https://blog.speculist.com/scenarios/datafication-in-three-easy-steps.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
