musings and one liners

Tag: MapReduce

For fast, interactive Hadoop queries, Drill may be the answer

Drill is Apache’s Dremel.

In the era of big data, there is increasing demand for ever-faster ways to analyze — preferably in an interactive way — information sitting in Hadoop. Now the Apache Foundation is backing an open-source version of Dremel, the tool Google uses for these jobs, as a way to bring that speedy analysis to the masses.… Continued

August 21, 2012
Now it’s VMware’s turn: Meet Spring Hadoop

Now it’s VMware’s turn: Meet Spring Hadoop. MapReduce in Spring, MapReduce in Excel… exciting times!

March 2, 2012
Forthcoming Oracle Appliances

Curt Monash writes about Forthcoming Oracle appliances, based on information from Oracle’s earnings call (full transcript) last week. There will be an in-memory database (IMDB) appliance based on TimesTen for high-speed analytics, and a Hadoop appliance for MapReduce jobs, targeted at data preprocessing and feeding into Oracle.… Continued

July 5, 2011
Realtime Scalable NoSQL with Yahoo! S4

There’s movement in the realtime NoSQL world. As GigaOm reports in Yahoo Open-Sources Real-Time MapReduce, Yahoo! is the first to release a large-scale implementation of a more realtime-oriented NoSQL system (I don’t think it’s Hadoop- or even MapReduce-based), which will allow data to be queried pretty much as it’s added to the system.… Continued

November 5, 2010
What’s Essential – And What’s Not – In Big Data Analytics

A very good article: What’s Essential – And What’s Not – In Big Data Analytics. It starts with a Big Data Analytics overview, then dives into the columnar vs. row-based DB debate (only to find that this is ultimately not that important, as all these systems are built to scale, and which DB engine handles it best depends on your data and requirements).… Continued

October 19, 2010
MapReduce and Hadoop Future

Following up on Google dumping MapReduce, there are now a couple of articles available that shed more light on that decision and what it means for MapReduce. Go read MapReduce and Hadoop Future and then Google’s Dremel – or, Can MapReduce Itself Handle Fast, Interactive Querying?… Continued

October 12, 2010
eBay replaces Greenplum with Teradata

A quickie: eBay followup — Greenplum out, Teradata > 10 petabytes, Hadoop has some value, and more. Interesting to see that the impression is that Greenplum got thrown out more for reliability reasons than for performance. eBay was also repeatedly mentioned in the past as a key customer using the MapReduce integration piece; there’s an update on that as well.

October 7, 2010
Why There Won’t Be a LAMP For Big Data

Stephen O’Grady puts some thoughts into words in Why There Won’t Be a LAMP For Big Data that I also had when reading Edd Dumbill’s The SMAQ stack for big data (Storage, MapReduce and Query).

It is not clear to me that we will have, at any point in the future, a LAMP equivalent for big data.… Continued

October 4, 2010
Teradata, Cloudera team up on Hadoop data warehousing

Does anybody remember as far back as two months ago? That’s when I wrote:

All these connectors being announced makes me think there’s somebody out there with a matrix of RDBMS and NoSQL systems, looking at which combinations don’t have a marketable connector yet so he can be first to market.… Continued

September 16, 2010