Disaster Recovery

How Many Hadoop Clusters Should A Company Have?

One of the questions we get asked is “How many Hadoop clusters should we have?” As with all good technology answers, our usual response is “It depends.” That said, here are a few general rules we’ve seen applied across enterprise organizations. PROD vs non-PROD – Many organizations physically separate PROD and non-PROD infrastructure with […]

Replication Patterns for Hadoop

Replication patterns in Hadoop are a topic we get asked about more and more. phData breaks the conversation into two parts. Hadoop Distributed File System (HDFS) – Hadoop has a cross-cluster copy utility called distcp that can be used to copy data between clusters. That’s good for one-off copies, but enterprises […]
