Oracle doing Hadoop and NoSQL

September 30th, 2011

“Oracle docs show plans for Hadoop, NoSQL” covers the Oracle Loader for Hadoop, and “Added Session: Big Data Appliance” covers Oracle’s Big Data Appliance.

Hadoop Update

September 24th, 2011

There are a few articles collecting dust in my feed reader from a period a couple of months ago when I didn’t have much time to stay up to date on what’s happening… here’s the rundown of the most important stuff about Hadoop.

Hadoop Appliances?

Yahoo!’s Hortonworks Hadoop distribution

And another one leaving the relational world for their DBaaS offering. It’s probably easier to manage as a service than Oracle…

Today SAP announced that they are using MongoDB as a core component of SAP’s platform-as-a-service (PaaS) offering. MongoDB was selected for the enterprise content management (ECM) section of the platform, as its flexibility and scalability will enable SAP to scale its content management service on its PaaS offering to meet customers’ requirements while managing data from different applications.

Via MongoDB Selected as the Core Content Management Component of SAP’s PaaS Offering.

NoSQL Updates

September 21st, 2011

There’s a lot going on in the NoSQL world, or maybe Derrick Harris was just exceptionally busy last night…


Focus on Big Data and Exadata Mini

September 19th, 2011

Oracle OpenWorld must be close, because the rumour mill is heating up… Piper Jaffray is predicting that Oracle will release an Exadata Mini machine that will fit under one’s desk (via DBMS2). And Jean-Pierre Dijcks compiled a list of Big Data related sessions at OpenWorld. I hear Big Data may very well be the keynote topic, so it’s worth spending some time at these sessions.

David Menninger writes a nice intro to Splunk in Splunk Makes Machine-Generated Big Data Serve Analytics:

Splunk focuses on a specific segment of the big-data market: machine-generated data. This type of data originates constantly from many sources throughout an organization and in large quantities. The other common characteristic of machine-generated data is that generally it is less structured than data in typical relational databases. Often the information is captured as logs consisting of text files containing various record lengths and record structures. To effectively utilize this loosely structured information in real time, two challenges must be overcome: loading the data quickly and easily navigating through and analyzing the information once it is loaded.
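To make the two challenges in the quote a bit more concrete, here is a minimal Python sketch of loading loosely structured log lines and then running a simple analysis over them. This is not how Splunk works internally; the sample log lines, the `key=value` convention, and the names `KV_PATTERN`, `parse_line` and `SAMPLE_LOG` are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical convention: fields appear as key=value, values may be quoted.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_line(line):
    """Extract whatever key=value pairs a log line happens to contain."""
    record = {}
    for key, value in KV_PATTERN.findall(line):
        record[key] = value.strip('"')
    return record


# Made-up sample data: each line carries a different set of fields.
SAMPLE_LOG = [
    '2011-09-19 10:02:11 level=INFO service=web status=200 path=/index.html',
    '2011-09-19 10:02:12 level=ERROR service=db msg="connection timed out"',
    '2011-09-19 10:02:13 level=INFO service=web status=404',
]

# "Loading": turn every line into a dictionary, tolerating missing fields.
records = [parse_line(line) for line in SAMPLE_LOG]

# "Analyzing": count records per log level, defaulting when the field is absent.
levels = Counter(r.get("level", "UNKNOWN") for r in records)
print(levels)
```

The point of the sketch is only that no schema is fixed up front: each record keeps whichever fields its line contained, and the analysis step has to cope with fields that are missing.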

Big Data Application Platform

September 6th, 2011

Nati Shalom throws one in for Big Data Application Platforms:

Big Data Application platforms are unique in the sense that they need to be able handle massive amounts of data and therefore need to come with built-in support for things like Map/Reduce, Integration with external NoSQL databases, parallel processing, and data distribution services and on top of that, they should make the use of those new patterns simple from a development perspective. Below is a more concrete list of the specific characteristics and features that define what Big Data Application Platform ought to be. I’ve tried to point to the specific Java EE equivalent API and how it would need be extended to support Big Data application.

via the High Scalability blog.

A good 28-page whitepaper on NoSQL for SQL Server developers, first familiarizing the reader with NoSQL, then showing what NoSQL options there are in the Microsoft and Azure stack. It also includes a fair bit of positioning and discusses appropriate use cases for NoSQL.

As reported and analysed by Tony Baer in OnStrategies Perspectives, and reported by Derrick Harris at GigaOm in “EMC, NetApp Make It a Big Day for Big Data Star Hadoop”, EMC is using the ongoing EMC World conference to its full potential and is announcing that it is growing its database division by selling its own Hadoop distribution with value-add management tools and integration. I expect to see more soon.

Yahoo is considering turning Hadoop into a business, as reported by the Wall Street Journal. Ovum’s Tony Baer has a more detailed analysis on his blog in “Yahoo to Hadoop: Show me the Money”.

In the long run, we also expect IBM to make a stab at Hadoop and related technologies by extending its InfoSphere offerings – it can see Cloudera-Informatica and Cloudera-MicroStrategy raise it one with its own InfoSphere DataStage and Cognos offerings, before it even talks about partnerships. Today we saw a shot from left field – Yahoo which invented the technology – is now saying it might spin off its Hadoop business to go up against Cloudera, and potentially IBM. In a way, it’s closing the doors after the horses left the barn as the creator of Hadoop is now part of Cloudera.



For Yahoo, this would clearly be a shot out of its comfort zone, as it is not a tools company. But it is hungry for monetizing its intellectual property, even if that property has already been open sourced. It’s redolent of Sun striving to monetize Java and we all know how that went. Obviously this will be an uphill battle for Yahoo, but at least this would be a spinoff so hopefully there won’t be distractions from the mother ship. Given Yahoo’s fortunes, we shouldn’t be surprised that they are now looking to maximize what they can get out of the family jewels.

More commercial offerings in NoSQL can only be a good thing.