Getting Into The Cloud With CDH In Minutes

A few weeks ago we wrote an article on the pros and cons of running your Hadoop capabilities in the cloud compared to on-premise.  The conclusion was that there isn’t a right or wrong answer.  If you’re investing in data centers, it probably makes sense to run Hadoop on-premise.  If you’re not investing in data centers, then lean towards the cloud.

With Cloudera’s release of Cloudera Director 1.1, you can see enterprise customers’ continued interest in running Hadoop in the cloud.  One of the advantages we see in using Cloudera Director is that while you may be locking in on a Hadoop distribution, you’re not locking in on where you run it.  If you start in the cloud, you can come back to on-premise if it makes sense, and vice versa.  Moreover, you could start in AWS and presumably switch to Azure or Google in the future.

This differs from something like Amazon’s Elastic MapReduce (EMR).  With EMR, you’re locking in on both a Hadoop distribution and the infrastructure it runs on.  EMR is a very nice offering, so in no way are we suggesting enterprises not look at it.  We’re just pointing out that there are two lock-in properties with EMR.

phData expects Hadoop distributors (Cloudera, Hortonworks, MapR, etc.) to continue investing in tools that get their distributions running both on-premise and in the cloud.  The cloud is inevitable for many use cases, so it only makes sense to look at distributions that embrace both options.