The (Shrinking) Growing Data Footprint

November 22, 2014

At the recent SAP TechEd && d-code in Berlin (and, no — for those unfamiliar, there are no typos in my presentation of the event’s name), Bernd Leukert, a member of the executive board of SAP SE for Products and Innovation, led a keynote session touching on several of the themes I have been writing about here recently. Using as a guide Nicholas Negroponte’s vision as outlined in his book Being Digital (1996), Leukert makes the case that we are, indeed, in transition “from a world made out of stuff to a world made out of data…and stuff” — to quote my recent re-articulation of Negroponte’s basic idea.

And he adds an interesting wrinkle.

Where I have been arguing that, part and parcel with the big data phenomenon, the data footprint of real-world stuff is growing exponentially (even as “stuff” as we know it becomes smaller and less substantial in almost every other regard), Leukert makes the case that new in-memory database technologies are actually going to shrink the data footprints of businesses by eliminating data indices and aggregates.

On the one hand, this is hardly an unfamiliar argument. Before in-memory was a thing, the columnar database vendors made very similar claims for data warehouses running on, say, Vertica or Sybase IQ. With a columnar database, the argument went, you could run any query against the data and get an answer back fast without having to create all those copies of the data, which, ultimately, is what summaries, indexes, aggregates, cubes and even data marts are. So your data warehouse could become a lot smaller and a lot faster. Win-win!
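To make that concrete, here is a minimal sketch in Python. This is emphatically not how Vertica or Sybase IQ work internally — the toy sales table, its columns, and the row counts are all invented — but it illustrates why storing data by column makes an ad hoc aggregation cheap enough that you can live without precomputed summary tables.

```python
# Toy illustration (not real column-store internals): the same data laid out
# by row and by column, and what an ad hoc SUM(amount) has to touch in each.
import random
import time

N = 1_000_000  # invented table size, just big enough to show a difference

# Row layout: one dict per sale. An ad hoc SUM(amount) has to walk every row
# object just to pull out one field, which is roughly why row-oriented
# warehouses traditionally lean on precomputed aggregates.
row_store = [
    {"order_id": i, "region": random.choice("NSEW"), "product": i % 500,
     "amount": random.random() * 100.0}
    for i in range(N)
]

# Column layout: one plain list per attribute. SUM(amount) scans a single
# contiguous list and ignores the other columns entirely.
col_store = {
    "order_id": [r["order_id"] for r in row_store],
    "region":   [r["region"] for r in row_store],
    "product":  [r["product"] for r in row_store],
    "amount":   [r["amount"] for r in row_store],
}

t0 = time.perf_counter()
total_rows = sum(r["amount"] for r in row_store)   # walks every row object
t1 = time.perf_counter()
total_cols = sum(col_store["amount"])              # scans one column only
t2 = time.perf_counter()

print(f"row-store scan:    {t1 - t0:.3f}s  total={total_rows:,.0f}")
print(f"column-store scan: {t2 - t1:.3f}s  total={total_cols:,.0f}")
```

Run it and the column scan comes back several times faster, with no summary table, index, or cube anywhere in sight — which is the whole pitch.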

Now Leukert extends that argument to SAP HANA environments, showing how eliminating indices and aggregates from both the operational and analytical data within an organization can significantly reduce the overall enterprise data footprint. He gives the example of a typical financial booking, which updates 15 separate database records; in a simplified in-memory enterprise environment, he shows, that number can be reduced to four. He goes on to claim that SAP itself has managed a 14X reduction in data footprint, with a 30X reduction expected overall.
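The 15-to-4 arithmetic is easy to picture. The breakdown below is hypothetical — the keynote gives only the totals, and this is not SAP’s actual table layout — but it shows the mechanism: in a classic design each booking fans out into balance tables, period aggregates, and secondary index entries, while a simplified in-memory design records little more than the line items and computes the rest at query time.

```python
# Hypothetical breakdown (only the 15 and 4 totals come from the keynote):
# records written per financial booking, with and without redundant structures.

# "Classic" design: every booking also maintains aggregates and indexes.
classic_writes_per_booking = {
    "journal_line_items": 2,    # debit + credit lines
    "gl_account_totals": 2,     # running balance per account
    "period_aggregates": 2,     # month-to-date totals
    "customer_open_items": 1,
    "secondary_indexes": 8,     # index entries across the tables above
}

# Simplified in-memory design: keep the line items (plus a header);
# balances and period totals are derived on the fly at query time.
inmemory_writes_per_booking = {
    "journal_header": 1,
    "journal_line_items": 2,
    "customer_open_items": 1,
}

print(sum(classic_writes_per_booking.values()))    # 15 records touched
print(sum(inmemory_writes_per_booking.values()))   # 4 records touched
```

Scale that kind of write amplification across every transaction type in an ERP system, and a double-digit reduction in total footprint of the sort Leukert claims starts to look plausible.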

Those are massive reductions, and they should map to massive savings. And so an interesting race begins: between the explosive growth of data that the Internet of Things and other big data drivers are fueling, and the substantial reductions that columnar databases, in-memory processing, and other technological developments can deliver. Which side will win? Maybe we really can do more with less. Or maybe these technologies simply help to curb the otherwise uncontrollable growth of big data.

Stay tuned.

Oh, here is Leukert’s (and associates’) entire talk. Most of the material about Negroponte is at the beginning, but he does come back to it at least once in the middle and then again at the end. At nearly an hour and 45 minutes, it is not as trying on the patience as many keynotes I have endured. Plus, if you watch the whole thing, you’ll learn about the Internet of Toilets.

And, no, I’m not making that up!