I couldn’t help but think of these two together, because I happened to read them within hours of each other.

The bright future of applied statistics:

I think that the data revolution is just getting started. Datasets are currently being, or have already been, collected that contain, hidden in their complexity, important truths waiting to be discovered. These discoveries will increase the scientific understanding of our world. Statisticians should be excited and ready to play an important role in the new scientific renaissance driven by the measurement revolution.

And Stephen Few ranting about the term “Big Data” in A More Thoughtful but No More Convincing View of Big Data:

I have a problem with Big Data. As someone who makes his living working with data and helping others do the same as effectively as possible, my objection doesn’t stem from a problem with data itself, but instead from the misleading claims that people often make about data when they refer to it as Big Data. I have frequently described Big Data as nothing more than a marketing campaign cooked up by companies that sell information technologies either directly (software and hardware vendors) or indirectly (analyst groups such as Gartner and Forrester).

Isn’t Big Data just a (marketing) term for the category of data sets that are difficult to store and analyze with traditional tools? Obviously, which sizes and tools we’re talking about changes over time…

Banks Using Big Data to Discover ‘New Silk Roads’:

JPMorgan Chase & Co., the largest commercial bank in the U.S., generates a vast amount of credit card information and other transactional data about U.S. consumers. Several months ago, it began to combine that database, which includes 1.5 billion pieces of information, with publicly available economic statistics from the U.S. government. Then it used new analytic capabilities to develop proprietary insights into consumer trends, and offer those reports to the bank’s clients. The technology allows the bank to break down the consumer market into smaller and more narrowly identified groups of people, perhaps even single individuals.

Now even banks are starting to sell their customers, just like all the Web 2.0 startups. Read on for some additional ideas about what banks can do with their customer data that are less about selling it and more about knowing customers better in order to offer them better services.

Splunk rides machine data wave, expands enterprise footprint:

Splunk is turning departmental deals into enterprise wide license agreements and becoming a machine data staple for many companies. Splunk has managed to ride security and a bevy of other machine-data use cases to emerge as an enterprise-wide platform.

There’s your high-level plan on how to get into the enterprise. Well done, Splunk!

Big Data brings intelligence-based security, RSA chief says:

Big data will transform the way enterprises architect and manage security and will finally help get the good guys out in front of the bad guys, said Art Coviello, executive vice president of EMC and executive chairman of RSA.

He said an “intelligence-driven model can be made future proof. It evolves and learns from change”. He added that such a system can detect anomalies and respond to them.

Another industry where the mere availability of Big Data changes everything… not. But used well, Big Data can certainly help find additional threats that otherwise would have gone unnoticed for longer.

Are these the world’s most innovative big data companies?:

Fast Company, the American business-minded glossy magazine, has published the latest version of its “Most Innovative Companies” franchise, and there’s a breakout section for what it calls “big data companies.” (Though, if the hype is to be believed, we are all big data companies.)

  1. Operations-improver Splunk
  2. Tech-trend tracker Quid
  3. Data scientist tournament host Kaggle
  4. Credit rating revolutionary ZestFinance
  5. Electronic medical record streamliner Apixio
  6. Business intelligence visualizer Datameer
  7. Marketing modeler BlueKai
  8. Enterprise social media simplifier Gnip
  9. Brick-and-mortar customer analyzer RetailNext
  10. Compliance catalyst Recommind

Haven’t even heard of half of them, got some reading to do :-)

Venture capital pouring into Big Data is not really slowing down… DataStax Raises $25 Million in Third Round of Funding:

DataStax today announced the completion of a $25 million C round of funding led by Meritech Capital Partners, with participation from existing investors Lightspeed Venture Partners and Crosslink Capital. This latest round of capital comes in the wake of explosive demand for DataStax, with company bookings increasing by over 400 percent in 2012. DataStax will use the funds to further enhance its Big Data platform and increase the value for current customers while driving global customer acquisition.

via $25 Million in C Round for DataStax.

5 ideas to help everyone make the most of big data:

1. Hadoop isn’t for everything

2. Big data makes data science easier

3. “Sometimes it’s more important to know what to kill.”

4. Context adds value

5. Transaction data trumps search data.

Distilled wisdom of two days of IE Group’s Big Data Innovation event.

Data Collective

August 10th, 2012

Data Collective, a single-minded VC firm:

We invest in entrepreneurs building big data companies.

With a new VC model, as Sarah Lacy explains. They’ve got a point: you do need deep know-how to find the right early-stage Big Data startup that needs seed funding!

Dilbert Digs Big Data

July 30th, 2012

Dilbert comic strip for 07/29/2012… It hears you!

GigaOm kicked off some good discussion in Why the days are numbered for Hadoop as we know it:

Hadoop is everywhere. For better or worse, it has become synonymous with big data. In just a few years it has gone from a fringe technology to the de facto standard. Want to be big data or enterprise analytics or BI-compliant? You better play well with Hadoop.

It’s therefore far from controversial to say that Hadoop is firmly planted in the enterprise as the big data standard and will likely remain firmly entrenched for at least another decade. But, building on some previous discussion, I’m going to go out on a limb and ask, “Is the enterprise buying into a technology whose best day has already passed?”

Realtime is king. Low latency his queen. Or the other way around ;-) I don’t think anybody would refuse to get their results faster; they’re just not complaining about Hadoop and MapReduce today because they don’t know better. Or rather, because better solutions aren’t available at the right price point yet.