Now that’s a thing: Kevin Closson Joins EMC Data Computing Division To Focus On Greenplum Performance Engineering! Kevin has been the public voice of Exadata in the blogosphere for the better part of four years, so that’s quite a loss for the folks at Oracle. And a big win for EMC, I would say. Good luck, Kevin!

Intel’s McAfee Acquires Sentrigo To Boost Database Security Offerings. That’s not surprising, given that Sentrigo has the best product in that space. Oracle already acquired Secerno last year, so other vendors now have to build out their own portfolios.

Dave DeWalt, president of McAfee, said of the acquisition: McAfee is continuing to broaden its security portfolio to now secure databases, as well as endpoints, networks, email and web. He added that the company is also announcing a “complete database security platform” that includes products across the McAfee portfolio.

McAfee’s Vulnerability Manager for Databases will automatically discover all databases on the network, collect a full inventory of configuration details, determine if the latest patches have been applied, and scan for vulnerabilities. McAfee’s Database Activity Monitoring (DAM) not only tracks database changes, but also protects data from external threats and malicious insiders with real-time alerts and session termination.

Let’s see what the combined companies are going to bring.

A good one if you want to make your application future-proof: MS is working hard to bring stand-alone SQL Server in line with SQL Azure in future releases, and is therefore giving up on some of the concepts available in SQL Server today:

Comparing SQL Server with SQL Azure.

SQL Azure Database is a cloud based relational database service from Microsoft. SQL Azure provides relational database functionality as a utility service. Cloud-based database solutions such as SQL Azure can provide many benefits, including rapid provisioning, cost-effective scalability, high availability, and reduced management overhead. This paper provides an architectural overview of SQL Azure Database, and describes how you can use SQL Azure to augment your existing on-premises data infrastructure or as your complete database solution.

Get ready for the cloud!

EDW Without A Database?

December 22nd, 2010

Forrester’s James Kobielus asks: An Enterprise Data Warehouse Without A Database—Is That Even Conceivable? Turns out the discussion is more about RDBMS vs. non-relational systems such as Hadoop, and he’s suggesting that we’ll see a rise of non-relational systems because of the rise of less structured content. Forrester’s school of thought also considers a distributed design perfectly reasonable:

This points to another key trend in EDW evolution: the continued transformation of these infrastructures away from traditional centralized and hub-and-spoke topologies toward the new worlds of cloud-oriented and federated architectures. The EDW itself is evolving away from a single master “schema” and more toward a semantic abstraction layer and use of distributed in-memory information as a service (IaaS).

Good read on the EDW basics from today’s standpoint, thx James!

Oracle took the popular DBA Views poster and now provides a Flash-based digital version for download at Oracle 11g Interactive Quick Reference. I’m not sure how helpful that’s ultimately going to be, as it doesn’t allow for the same kind of visual browsing, but I wanted to post this anyway.

Via the Oracle DB Insider blog.

The Windows Azure blog has a long and good interview with Jonathan Ellis about Cassandra and a lot of other relevant topics. Take a break to read Thought Leaders in the Cloud: Talking with Jonathan Ellis, Co-Founder of Riptano. When asked about Cassandra vs. RDBMS, he said the following, which is interesting:

I think relational databases are going to stay important. They solve some important problems, and there’s a very rich ecosystem of tools around them, which keeps time to market low. I see Cassandra as particularly appealing to companies that started on something like SQL Server and then reached the point where favorable price/performance to buy larger machines isn’t there anymore. The pain they’re feeling from the pressure to scale is greater than the pain of learning a new technology like Cassandra.

[...]

So people using relational databases are looking to move to Cassandra, mostly because of the scaling aspect, also sometimes for the reliability aspect. Cassandra deals very well with multiple data centers, in terms of preparing for one or more of them failing and clients having to access a different one.

NoSQL Primer for RDBMS folks

September 17th, 2010

Chen Shapira apparently went through a similar information-gathering and fact-finding exercise as I did with NoSQL, but she’s much better at writing it all up, as in this concise and complete article: NoSQL Deep Dive – The Missing White Paper.

Highly recommended for all folks who understand SQL RDBMS, and need a quick way to understand NoSQL’s theoretical underpinnings well enough to make sense at the next cocktail party…

The problems with ACID

September 2nd, 2010

Daniel Abadi throws a lengthy post at the DB world: The problems with ACID, and how to fix them without going NoSQL, introducing an even longer paper that they’re going to present at this month’s VLDB 2010 in Singapore: The Case for Determinism in Database Systems.

This is good stuff, basically arguing that people are going NoSQL to scale because traditional ACID-compliant RDBMSs can’t scale as well, so he’s now looking at ways to get around the scalability implications of ACID.

Just say No to NoSQL

August 10th, 2010

If today’s title sounds familiar to you, then that’s because yesterday’s title was Just say NoSQL. Here’s a little report on Why NoSQL is bad for startups from a person who has actually tried it, instead of the usual SQL fanboys who tout traditional RDBMSs because that’s all they know.

Now, depending on your viewpoint, that title could just as well read Keep a Hadoop Cluster in Your Back Pocket. So we’re looking, from both angles, at when it makes sense to combine an old-fashioned SQL RDBMS with a fancy and modern NoSQL system.

In defence of SQL (found via myNoSQL) votes to keep an RDBMS around for the use cases where SQL excels.

Awkward it may be, but SQL is a lot more succint and readable than multiple lines of API calls or crazy, math-like relational algebra languages. And there’s nothing intrinsically slow about the language itself. If you could run “SELECT * FROM table WHERE …” on Cassandra, it would be no slower than specifying the same conditions via API calls.
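To make the readability point concrete, here is a minimal sketch in Python. It is not Cassandra code: the sample rows, the `users` table and both function names are made up for illustration, SQLite stands in for the SQL side, and a plain dict stands in for a key-value store that you walk through by hand. Both functions apply the same condition and return the same rows; the difference is only in how much you have to spell out.

```python
import sqlite3

# Hypothetical sample data: (id, name, city)
ROWS = [
    (1, "alice", "Berlin"),
    (2, "bob", "Zurich"),
    (3, "carol", "Berlin"),
]


def names_via_sql(rows, city):
    """Declarative version: the condition lives in one SQL statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY id", (city,)
    )
    result = [row[0] for row in cursor]
    conn.close()
    return result


def names_via_api(rows, city):
    """Imperative version: the same condition written as explicit lookups."""
    # A dict keyed by id plays the role of a simple key-value store.
    store = {row_id: {"name": name, "city": c} for row_id, name, c in rows}
    result = []
    for row_id in sorted(store):          # scan the keys in id order
        record = store[row_id]            # fetch one record per call
        if record["city"] == city:        # apply the filter by hand
            result.append(record["name"])
    return result


if __name__ == "__main__":
    print(names_via_sql(ROWS, "Berlin"))
    print(names_via_api(ROWS, "Berlin"))
```

Neither version is inherently faster because of its syntax; what matters is how the store underneath executes the lookup.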

Netezza blogger Phil Francisco, on the other hand, explains how it makes sense for some of their customers to use Hadoop as a large online archive for their colder data.

We have seen customers deploy [patterns] in which the Hadoop Cluster is used for long-term data retention, or as a “queryable archive”. Here one could think of Hadoop as a complementary analytic extension of the Netezza TwinFin when there is far less premium placed on low-latency or high-performance. [...] the queryable archive could also retain long-term copies of structured data that had previously been loaded into the high-performance TwinFin appliance.
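The "queryable archive" idea boils down to a placement rule: recent data stays on the fast system, older data moves to the cheaper one. Here is a rough sketch of such a rule in Python. The 90-day retention window, the tier names and the function name are my own assumptions for illustration; nothing here comes from Netezza or Hadoop.

```python
from datetime import date, timedelta

# Assumption: data newer than 90 days counts as "hot". Pick your own window.
HOT_RETENTION = timedelta(days=90)


def place_partitions(partition_dates, today, hot_retention=HOT_RETENTION):
    """Assign each daily partition to the warehouse or to the archive.

    Partitions inside the retention window stay on the low-latency
    warehouse; everything older goes to the queryable archive.
    """
    cutoff = today - hot_retention
    placement = {"warehouse": [], "archive": []}
    for day in sorted(partition_dates):
        tier = "warehouse" if day >= cutoff else "archive"
        placement[tier].append(day)
    return placement


if __name__ == "__main__":
    partitions = [date(2010, 1, 1), date(2010, 6, 15), date(2010, 8, 1)]
    for tier, days in place_partitions(partitions, date(2010, 8, 10)).items():
        print(tier, [d.isoformat() for d in days])
```

A query router could use the same cutoff to decide which system to ask, accepting higher latency for anything that falls on the archive side.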

There you go. Let me know if you have any other thoughts about how to combine SQL and NoSQL in practical use cases.