September 30th, 2011
September 24th, 2011
There are a few articles collecting dust in my feed reader from a period a couple of months ago when I didn’t have much time to stay up to date on what’s happening… here’s the rundown of the most important stuff about Hadoop.
- Hardware for Hadoop
- Why you would want an appliance — and when you wouldn’t
- Some notes on Hadoop (mainly) and appliances
- EMC Puts Database & Hadoop in Same Big Data Analytics Box
Yahoo!’s Hortonworks Hadoop distribution
September 21st, 2011
And another one leaving the relational world for their DBaaS offering. It’s probably easier to manage as a service than Oracle…
Today SAP announced that they are using MongoDB as a core component of SAP’s platform-as-a-service (PaaS) offering. MongoDB was selected for the enterprise content management (ECM) section of the platform, as its flexibility and scalability will enable SAP to scale its content management service on its PaaS offering to meet customers’ requirements while managing data from different applications.
September 21st, 2011
There’s a lot going on in the NoSQL world, or maybe Derrick Harris was just exceptionally busy last night…
- EMC throws lots of hardware at Hadoop
- DataStax gets $11M, fuses NoSQL and Hadoop
- Neo raises $10.6M for Neo4j as graph DBs take off
- NoSQL Database Company Neo Technology Raises $10.6 Million
September 19th, 2011
Oracle OpenWorld must be close, because the rumour mill is heating up… Piper Jaffray is predicting that Oracle will release an Exadata Mini machine that will fit under one’s desk (via DBMS2). And Jean-Pierre Dijcks compiled a list of Big Data related sessions at OpenWorld. I hear Big Data may very well be the keynote topic, so it’s worth spending some time at these sessions.
September 12th, 2011
David Menninger writes a nice intro to Splunk in Splunk Makes Machine-Generated Big Data Serve Analytics:
Splunk focuses on a specific segment of the big-data market: machine-generated data. This type of data originates constantly from many sources throughout an organization and in large quantities. The other common characteristic of machine-generated data is that generally it is less structured than data in typical relational databases. Often the information is captured as logs consisting of text files containing various record lengths and record structures. To effectively utilize this loosely structured information in real time, two challenges must be overcome: loading the data quickly and easily navigating through and analyzing the information once it is loaded.
September 6th, 2011
Nati Shalom throws one in for Big Data Application Platforms:
Big Data Application platforms are unique in the sense that they need to be able handle massive amounts of data and therefore need to come with built-in support for things like Map/Reduce, Integration with external NoSQL databases, parallel processing, and data distribution services and on top of that, they should make the use of those new patterns simple from a development perspective. Below is a more concrete list of the specific characteristics and features that define what Big Data Application Platform ought to be. I’ve tried to point to the specific Java EE equivalent API and how it would need be extended to support Big Data application.
via the High Scalability blog.
May 17th, 2011
A good 28-page whitepaper on NoSQL for SQL Server developers, first familiarizing the reader with NoSQL, then showing what NoSQL options there are in the Microsoft and Azure stack. It also includes a fair bit of positioning and a discussion of appropriate use cases for NoSQL.
May 9th, 2011
As analysed by Tony Baer in OnStrategies Perspectives and reported by Derrick Harris at GigaOm in EMC, NetApp Make It a Big Day for Big Data Star Hadoop, EMC is using the ongoing EMC World conference to its full potential: it is announcing that it is growing its database division by selling its own Hadoop distribution with value-added management tools and integration. I expect to see more soon.
April 28th, 2011
In the long run, we also expect IBM to make a stab at Hadoop and related technologies by extending its InfoSphere offerings – it can see Cloudera-Informatica and Cloudera-MicroStrategy raise it one with its own InfoSphere DataStage and Cognos offerings, before it even talks about partnerships. Today we saw a shot from left field – Yahoo which invented the technology – is now saying it might spin off its Hadoop business to go up against Cloudera, and potentially IBM. In a way, it’s closing the doors after the horses left the barn as the creator of Hadoop is now part of Cloudera.
For Yahoo, this would clearly be a shot out of its comfort zone, as it is not a tools company. But it is hungry for monetizing its intellectual property, even if that property has already been open sourced. It’s redolent of Sun striving to monetize Java and we all know how that went. Obviously this will be an uphill battle for Yahoo, but at least this would be a spinoff so hopefully there won’t be distractions from the mother ship. Given Yahoo’s fortunes, we shouldn’t be surprised that they are now looking to maximize what they can get out of the family jewels.
More commercial offerings in NoSQL can only be a good thing.