Posts

Showing posts from January, 2011

Setting up a Hadoop cluster - Part 1: Manual Installation

Introduction In the last few months I was tasked several times with setting up Hadoop clusters. Those weren't huge - two to thirteen machines - but from what I read and hear this is a common use case especially for companies just starting with Hadoop or setting up a first small test cluster. While there is a huge amount of documentation in form of official documentation, blog posts, articles and books most of it stops just where it gets interesting: Dealing with all the stuff you really have to do to set up a cluster, cleaning logs, maintaining the system, knowing what and how to tune etc. I'll try to describe all the hoops we had to jump through and all the steps involved to get our Hadoop cluster up and running. Probably trivial stuff for experienced Sysadmins but if you're a Developer and finding yourself in the "Devops" role all of a sudden I hope it is useful to you. While working at  GBIF I was asked to set up a Hadoop cluster on 15 existing and 3 new mach...