Showing posts from November, 2014

Upgrading our cluster from CDH4 to CDH5

A little over a year ago we wrote about upgrading from CDH3 to CDH4, and now the time has come to upgrade from CDH4 to CDH5. The short version: upgrading the cluster itself was easy, but getting our applications to work with the new classpaths, especially MapReduce v2 (YARN), was painful.

The Cluster

Our cluster has grown since the last upgrade (now 12 slaves and 3 masters), and we no longer had the luxury of splitting the machines to build a new cluster from scratch. So this was an in-place upgrade, using CDH Manager.

Upgrade CDH Manager

The first step was upgrading to CDH Manager 5.2 (from our existing 4.8). The Cloudera documentation is excellent, so I don't need to repeat it here. What we did find was that the management service now requests significantly more RAM for its monitoring services (minimum "happy" config of 14GB), to the point where our existing masters were overwhelmed. As a stopgap we've added a 4th old machine to the "masters" group...