Hadoop on Windows

A lot of people are interested in Hadoop these days, and I am currently taking part in a project built on it. The first task was to install Hadoop - easy, right? But I was told that we were going to use Windows, with no option to use Linux instead. I did some quick research and found the Hortonworks distribution, which can run on Windows. BTW, Hortonworks is the only Hadoop distribution that works on Windows :). As you can see, I didn't have much choice, so I downloaded the software and the installation guide.

The first surprise was that only Windows Server (2008 and 2012) is supported, but I managed to find a Windows Server 2008 machine. After that you have to install the required software (JDK, Python, .NET and the Visual C++ Redistributable Package). No problem, everything went fine. The next step is to run the msi file. I ran it and specified all the parameters I needed. The installation started, but suddenly a McAfee window appeared with a message that one of the Hadoop files contained a virus, and McAfee simply deleted it. Nice. After that very weird things started to happen - I started to get errors like "There is a problem with Windows Installer package. A program run as part of setup didn't finish as expected. Contact the vendor". Hm... I got this error even after I stopped all McAfee services. I am not a Windows expert, so I didn't even know what to do :)

After one week of struggling with Windows Server 2008 and our firm's security policies, I finally managed to install Hadoop on Windows. The next post will be about loading data into Hadoop and Hive.

PS. I would like to thank David Goodhand from the Hortonworks support team for his help and very fast response (I didn't expect this, to be honest :) ).


