AWS EMR

Cloudera Altus – First Look

I was lucky enough to attend StrataEU 2017, where one of the sessions was “Deploying and managing Hive, Spark, and Impala in the public cloud”, led by Philip Langdale, Eugene Fratkin, and Jennifer Wu. I assumed this was a Cloudera Director session, which we have lots of experience with, but I decided to pop my […]

Getting Into The Cloud With CDH In Minutes

A few weeks ago we wrote an article on the pros and cons of running your Hadoop capabilities in the cloud compared with on-premises. The conclusion was that there isn’t a right or wrong answer. If you’re investing in data centers, it probably makes sense to run Hadoop on-premises. If you’re not investing in data […]

Hadoop Versions In Vendor Distributions

Hadoop distributions typically come with between 20 and 30 open source projects (e.g. Hadoop core, HBase, Hive, Spark), all bundled together to make a “big data platform” that enterprises can deploy and maintain in a sustained manner. The foundation is Hadoop core, with the others sitting alongside or on top. Two common questions come up with enterprises […]
