I’ve been wanting to follow up on this for a while, so without further ado, here are some more Sybase IQ 15.2 New Features.

Chen Shapira takes a look at Mike Stonebraker’s 2007 paper The End of an Architectural Era about HStore, his modern implementation of an OLTP database system. Her article It’s the End of the World As We Know It (NoSQL Edition) is a good write-up and analysis for those (such as me) who don’t have time to read and digest the full paper. I would only disagree on one point – HStore is still very much a SQL database, and doesn’t fall into the NoSQL category.

Curiously enough, she doesn’t mention VoltDB, the commercial implementation of HStore that Stonebraker is now bringing to market. I therefore didn’t notice that her post was related to what I was writing about VoltDB last week when I flagged it for later reading.

Now, depending on your viewpoint, that title could just as well read Keep a Hadoop Cluster in Your Back Pocket. So we’re talking about when it makes sense to combine an old-fashioned SQL RDBMS with a fancy and modern NoSQL system, looking at it from both angles.

In defence of SQL (found via myNoSQL) argues for keeping an RDBMS around for the use cases where SQL excels.

Awkward it may be, but SQL is a lot more succinct and readable than multiple lines of API calls or crazy, math-like relational algebra languages. And there’s nothing intrinsically slow about the language itself. If you could run “SELECT * FROM table WHERE …” on Cassandra, it would be no slower than specifying the same conditions via API calls.

Netezza blogger Phil Francisco, on the other hand, explains how it makes sense for some of their customers to use Hadoop as a large online archive for their colder data.

We have seen customers deploy [patterns] in which the Hadoop Cluster is used for long-term data retention, or as a “queryable archive”. Here one could think of Hadoop as a complementary analytic extension of the Netezza TwinFin when there is far less premium placed on low-latency or high-performance. [...] the queryable archive could also retain long-term copies of structured data that had previously been loaded into the high-performance TwinFin appliance.

There you go. Let me know if you have any other thoughts about how to combine SQL and NoSQL for useful use cases.

A quick post on a Linux for Oracle Tuning Checklist I found in Ronny Egner’s blog. If you’re doing Oracle on Linux, this is a must-read (best while sitting at a Linux terminal); otherwise, just ignore this post.

Greg Rahn’s post about The Core Performance Fundamentals Of Oracle Data Warehousing – Set Processing vs Row Processing is so good, everybody considering a migration from a standard RDBMS to a VLDB platform (such as Exadata, as in Greg’s example) should be forced to read it.

To paraphrase Greg: the performance improvements of the new system will not only allow you to run today’s jobs faster, but will also allow you to do jobs that were entirely impossible with the old system – if you’re willing to do a little re-engineering, and throw away old assumptions and ‘optimizations’ that make your code slow instead of fast on the new platform.
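To make the set-versus-row distinction concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module purely as a stand-in (it is not Exadata and is not taken from Greg's post), and the sales table, its columns and the 10% discount rule are all made up for illustration. Both functions produce the same data; the difference is that the first issues one statement per row from the client, while the second hands the whole job to the database in a single statement.

```python
import sqlite3


def make_db(n_rows=1000):
    """Build a small in-memory table (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "id INTEGER PRIMARY KEY, amount_cents INTEGER, discounted_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sales (id, amount_cents) VALUES (?, ?)",
        [(i, i * 100) for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn


def row_by_row(conn):
    """Row processing: pull every row to the client, then send one UPDATE per row."""
    rows = conn.execute("SELECT id, amount_cents FROM sales").fetchall()
    for row_id, amount in rows:
        conn.execute(
            "UPDATE sales SET discounted_cents = ? WHERE id = ?",
            (amount * 90 // 100, row_id),
        )
    conn.commit()


def set_based(conn):
    """Set processing: describe the result once and let the engine do the work."""
    conn.execute("UPDATE sales SET discounted_cents = amount_cents * 90 / 100")
    conn.commit()


def total_discounted(conn):
    """Sum of the discounted column, used to compare both approaches."""
    return conn.execute("SELECT SUM(discounted_cents) FROM sales").fetchone()[0]


if __name__ == "__main__":
    a, b = make_db(), make_db()
    row_by_row(a)
    set_based(b)
    print(total_discounted(a) == total_discounted(b))
```

On a tiny in-memory table the two are hard to tell apart, so treat this only as an illustration of the shape of the code. The argument in Greg's post is that the second form is the one a parallel platform can actually speed up.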

What is Analytics?

July 22nd, 2010

Sybase’s Phil Bowermaster is trying to shed some light on the question of What is Analytics?

The vital distinction is this: advanced analytics involves more than just slicing and dicing of the data. [...] Ultimately, it’s this reliance on models that sets advanced analytics apart from other types of BI analysis. When a business takes a look at data to try to improve decisions and performance, that’s business intelligence. When a business compares incoming data with a model in order to achieve deeper understanding, deal with human behavior in real time, or predict what’s going to happen next, that’s advanced analytics.

Oracle Licensing

July 21st, 2010

Go read the Licensing Consulting blog! I know this is a non-technical post for once, but it’s a very good read for anybody remotely interested in the financials and business methods of Oracle and, to some extent, other database vendors. Good posts to start with are Oracle ULA contract agreement risk factors and The Oracle Support Recalculation issue.

Cloudera and Netezza Team Up to Bring Hadoop to Customers, so we read. All these connectors being announced makes me think there’s somebody out there with a matrix of RDBMS and NoSQL systems, looking at which combinations don’t have a marketable connector yet so he can be first to market.

Via 451 CAOS Theory, and GigaOM comments as well.

CouchDB 1.0

July 19th, 2010

The folks behind the document database CouchDB have released version 1.0, marking an important milestone for the still-young DB.

  • Speed — writes are 300% faster for large documents, compared to the previous release;
  • Microsoft Windows support;
  • Authentication system — write CouchApps without having to create a user model;
  • Replicator options — flexibility to use replication to build custom systems.

I like that Couchio, the commercial arm behind CouchDB, is offering free hosting, so everybody can try it out, no questions asked.

MyNoSQL also has some coverage.

Tech that is not tech

July 17th, 2010

Great post by Couchio guy Mikeal Rogers about tech that is not tech. I can relate so much to that – I guess it’s an age or maturity thing.

I don’t know if it’s just because I got a little older, or because I started working so much with JavaScript and writing web stuff, but I can’t stand anything that is hard to use or requires me to maintain it in any way. I have plenty of work to do. On my laptop I’ve got TextMate and iTerm and a browser but anything else that is happening on the machine I don’t want to worry about. I actually find myself annoyed by System Updates.

For me it’s not the Web stuff nor Apple products, but I clearly spend a lot less time tinkering with system internals, and I like it when something Just Works™ that I know is complex to do with computers. Unfortunately these events are still rare today, at least outside the Apple universe…