September 21, 2021

What To Do With Unsupported CDH 6

By Melissa Barrett

Let’s say you have thought through your long-term CDH 6 migration plans and found that you will not be able to complete a full migration to another platform before the EOSL date in March 2022. As a result, you will need to keep using your CDH 6 platform for some time after that date.

If this is the case, you will need to be ready to run an unsupported platform. If you are unsure whether going unsupported is the best option for you, you may want to read more about the other possible choices in our blog, Cloudera CDH 6 Support is Ending, Now What?

In this blog, we’ll unpack what happens after the EOSL date with an unsupported platform and go over some common support issues you could run into. 

What Happens After the EOSL Date?

CDH 6.3.3 is the first version of CDH that is behind a paywall. If you are on this version of CDH 6, you will not be able to keep using Cloudera Manager in any way after the EOSL date. Internal testing at phData has shown that once the license expires on a CDH 6.3.3 cluster, Cloudera Manager becomes unusable.

The only page Cloudera Manager will show is a page where you have to upload a new license, which will not be possible following the EOSL date. The image below shows the page we encountered after the license expired on a CM/CDH6.3.3 cluster. Any attempt to navigate away from this page will redirect back to it.

A screenshot from the Cloudera Manager license page

You will not be able to administer the cluster, stop or start it, view cluster health, or change configurations. You will, however, still be able to downgrade the Cloudera Manager server and agent packages on your cluster nodes via YUM. If you downgrade the YUM packages from Cloudera Manager 6.3.3 to Cloudera Manager 6.3.1, the UI will offer an option to enable Cloudera Express when it comes up, and Cloudera Express does not require a license. Once Cloudera Express is enabled, you will be able to use the UI to downgrade CDH from CDH 6.3.3 to CDH 6.3.2 via the Parcels page.

Cloudera Express lacks many features that you may depend on, such as Cloudera Navigator, configuration history/versioning, rolling restarts, LDAP authentication for CM, the HDFS File Browser in CM, and Impala usage reports. You should gauge if you can go without these features prior to deciding to move to Cloudera Express.

Note: Cloudera Express will ONLY work if there are fewer than 100 servers managed by Cloudera Manager. If there are more than 100 servers in Cloudera Manager when Cloudera Express is in use, the only command available to you is stopping the cluster. You will not be able to start or restart a cluster. You will have to remove servers from Cloudera Manager until the count is below 100 before you can use Cloudera Express normally.
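To make the server limit concrete, here is a minimal sketch of the check you might run against your own host inventory before committing to Cloudera Express. The function names are hypothetical, and the handling of exactly 100 hosts is an assumption: the note above only describes fewer than and more than 100, so this sketch conservatively treats 100 as over the limit.

```python
# Threshold taken from the note above. Behaviour at exactly 100 hosts is not
# spelled out there, so this sketch conservatively treats 100 as over the limit.
EXPRESS_HOST_LIMIT = 100


def express_usable(host_count: int) -> bool:
    """Return True if the managed host count is below the Express limit."""
    return host_count < EXPRESS_HOST_LIMIT


def hosts_to_remove(host_count: int) -> int:
    """Return how many hosts must be removed to get below the limit."""
    return max(0, host_count - (EXPRESS_HOST_LIMIT - 1))


if __name__ == "__main__":
    for count in (80, 100, 120):
        print(count, express_usable(count), hosts_to_remove(count))
```

If the second function returns anything other than zero, plan the host removals before you downgrade rather than after.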

Cloudera provides a datasheet that can be used to fully gauge the differences between Cloudera Enterprise and Cloudera Express. The Managing Licenses section of the Cloudera Enterprise 6.3.x documentation has additional information as well.

When downgrading to CM 6.3.1 from CM 6.3.3, you will lose the features and fixes described in the CM 6.3.3 release notes. When downgrading to CDH 6.3.2 from CDH 6.3.3, you will lose the features and fixes described in the CDH 6.3.3 release notes. Most seriously, you will lose the fix for HDFS Snapshot Corruption, and may need to stop using snapshots altogether if you want to avoid a major issue.

You should ensure that your Cloudera contract does not require proof of uninstallation of CDH software when support ends or your contract or license lapses. This is rare, but we have seen it before, and it would preclude your company from being able to use CDH 6 unsupported.

4 Real Examples of Support Issues You Could Run Into

Going unsupported with CDH 6 exposes your platform to real risks and usability issues that may be difficult to remediate. Below, we’ll go over some of the issues you may encounter when maintaining a downgraded version of CDH 6 past the EOSL date.

1. HDFS Snapshot Corruption

This issue is fixed in CDH 6.3.3, so maintaining a CDH 6 platform past the EOSL date on CDH 6.3.2 or earlier puts you at risk of running into it. It can be triggered when an HDFS snapshot is deleted, which may happen automatically on a regular basis depending on your HDFS Snapshot Policy.

The gist of the issue is that when a snapshot gets deleted, the HDFS fsimage on the Standby NameNode can become corrupted and then cause the checkpoint operation to fail. This in turn causes the Standby NameNode to go down when the corrupted fsimage is detected. To fix the issue, the fsimage must be repaired and placed into the fsimage directory on both NameNodes.

We recommend deleting all HDFS snapshot policies and snapshots in Cloudera Manager to avoid the issue entirely. Consider increasing HDFS Trash retention settings significantly in lieu of using HDFS snapshots. If it’s not possible to go without HDFS snapshots, the ideal way to fix the issue when you encounter it is to shut down HDFS normally after the Standby NameNode goes down. Make backups of the fsimages on both NameNodes before proceeding.

During the HDFS outage, while both NameNodes are stopped, replace the fsimage on the Standby NameNode with a copy of the fsimage from the previously Active NameNode, and bring HDFS back up once the two match. Recovering from the corrupt fsimage can result in the loss of snapshots.

Use extreme caution when working with the fsimage files. Losing the files will result in catastrophic data loss from HDFS that may be impossible to recover from.
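The backup and "once they match" steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes fsimage files follow the usual `fsimage_*` naming inside the NameNode metadata directory, that you run the digest function on each NameNode host (or against local copies of both directories) while HDFS is stopped, and all function names and paths are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict


def backup_dir(src: Path, backup_root: Path) -> Path:
    """Copy a NameNode metadata directory to a backup location."""
    dest = backup_root / src.name
    # copytree raises if dest already exists, so an earlier backup is never overwritten.
    shutil.copytree(src, dest)
    return dest


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large fsimage files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fsimage_digests(meta_dir: Path) -> Dict[str, str]:
    """Map each fsimage_* file name in meta_dir to its SHA-256 digest."""
    return {
        path.name: file_sha256(path)
        for path in sorted(meta_dir.glob("fsimage_*"))
        if path.is_file()
    }


def fsimages_match(active_dir: Path, standby_dir: Path) -> bool:
    """Return True when both directories hold identical, non-empty fsimage sets."""
    active = fsimage_digests(active_dir)
    return bool(active) and active == fsimage_digests(standby_dir)
```

Taking the backup first and comparing digests afterwards gives you a way back if the copy goes wrong, and a concrete check before HDFS is restarted.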

2. HDFS Data S3 Backup & Migration

If you have to move forward with an unsupported CDH platform, you will need a plan for getting your HDFS data out of the cluster when you are ready to move from CDH 6 to a new platform technology. Once you are unsupported and on Cloudera Express, you will lose the ability to replicate data into Amazon S3 with Cloudera’s Backup and Disaster Recovery (BDR).

With S3Guard set up, you should be able to DistCp HDFS data into an Amazon S3 bucket even when on Cloudera Express. If possible, set up and test S3Guard before your CDH 6 platform falls out of support, in case you need Cloudera Support to help with it. Once your HDFS data is in Amazon S3, you will be able to feed it into the majority of cloud platform technology options, such as Snowflake.
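As a rough illustration, a DistCp copy to S3 comes down to one `hadoop distcp` invocation with an `s3a://` target. The sketch below only builds and prints that command so you can review it before running anything. The helper name, HDFS path, bucket, and prefix are all hypothetical, and it assumes S3A credentials and S3Guard are already configured on the cluster.

```python
import shlex
from typing import List


def build_distcp_command(hdfs_path: str, bucket: str, prefix: str) -> List[str]:
    """Build a `hadoop distcp` argv that copies an HDFS path to an s3a:// URI."""
    target = "s3a://{}/{}".format(bucket, prefix.strip("/"))
    return ["hadoop", "distcp", hdfs_path, target]


if __name__ == "__main__":
    # Hypothetical source path and bucket; substitute your own.
    cmd = build_distcp_command("/data/warehouse/sales", "example-bucket", "cdh6-export/sales")
    # Print the command for review instead of executing it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Testing a small directory this way while you still have support gives you a known-good command to reuse when the full migration starts.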

3. Hue Horizontal Scroll Bar Exhibits Erratic Behavior

Another issue that is fixed in CDH 6.3.3 is HUE-9027, which causes problems when viewing tables in Hue, especially wide ones. Essentially, when you try to use the horizontal scrollbar to scroll right or left, it will frequently jump all the way to the far right or far left. This can make it especially hard to view the middle columns of wide tables.

The workaround to this issue is to download the results for wide tables into an Excel spreadsheet and view your results there when you’re experiencing this issue.

4. Queries That Select from a Complicated View Can Fail

An issue in Hive that was fixed in CDH 6.3.3 may impact you if you downgrade so you can go unsupported. HIVE-22236 describes an issue that occurs when you SELECT from a complicated view whose definition uses a NOT IN subquery. Queries that SELECT from views meeting that condition will fail with an ‘IllegalArgumentException: replace: range invalid’ error.

You may need to refactor Hive SQL code that meets these conditions to work around this issue in CDH 6.3.2.
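One common refactoring for a NOT IN subquery is a LEFT JOIN anti-join. The sketch below uses Python’s built-in SQLite purely to show that the two view shapes return the same rows. It does not run against Hive, the table and view names are hypothetical, and whether this rewrite avoids HIVE-22236 for your particular views is an assumption you would need to verify on your own cluster. Note also that the two forms are only equivalent when the subquery column contains no NULLs, which the NOT NULL constraints here guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER NOT NULL);
    CREATE TABLE blocked (customer_id INTEGER NOT NULL);
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO blocked VALUES (20);

    -- Original shape: the view filters with a NOT IN subquery.
    CREATE VIEW orders_not_in AS
        SELECT o.order_id, o.customer_id
        FROM orders o
        WHERE o.customer_id NOT IN (SELECT b.customer_id FROM blocked b);

    -- Refactored shape: the same filter expressed as a LEFT JOIN anti-join.
    CREATE VIEW orders_anti_join AS
        SELECT o.order_id, o.customer_id
        FROM orders o
        LEFT JOIN blocked b ON o.customer_id = b.customer_id
        WHERE b.customer_id IS NULL;
""")

not_in_rows = conn.execute("SELECT * FROM orders_not_in ORDER BY order_id").fetchall()
anti_join_rows = conn.execute("SELECT * FROM orders_anti_join ORDER BY order_id").fetchall()

# Both views should exclude the order placed by the blocked customer.
print(not_in_rows == anti_join_rows)
```

Comparing the old and new view output on a sample of real data, as done here on toy data, is a cheap way to confirm a refactor did not change results.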

In Closing

Going unsupported with your current CDH 6 platform is a serious decision, and the pros and cons of doing so need to be considered carefully. Consider whether your platform administration team has the skills to maintain the CDH 6 platform without Cloudera support. Consider the inherent value your company derives from your CDH 6 platform.

If you are unable to migrate prior to the CDH 6 EOSL date, we highly recommend working with Cloudera to see if you can come to an agreement on extended support before deciding to go unsupported. Roll the estimated cost of that extended support into the total cost of the migration, and use it in your ROI calculations.

Regardless of which path you decide to take, phData can help along the way. With years of experience managing multiple CDH 6 platforms, we can help you maintain your current CDH 6 platform past the EOSL date and work together on a migration that best suits your business.
