Quickly set up a Hadoop 2.3 project on Mac OS X

Install Hadoop on Mac OS X using Homebrew

brew install hadoop

Create a quick Maven-based Java project

mvn archetype:generate -DgroupId=org.xmao.hadoop -DartifactId=wordcount -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

Configure the Java project to support Hadoop

Add the following dependency inside the <dependencies> element of pom.xml:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.3.0</version>
</dependency>

Develop and compile Hadoop project using Maven

Now you have everything you need for a Hadoop project. Start with the WordCount example from the Hadoop web site, put the class in the org.xmao.hadoop package so that it matches the run command below, and then package the compiled classes into a jar:

mvn package

Then you can run your first Hadoop example like this:

hadoop jar target/wordcount-1.0-SNAPSHOT.jar org.xmao.hadoop.WordCount INPUT_FILE OUTPUT_DIR

Pretty easy, right? Enjoy!


Load flat text file into a Berkeley DB database

cat INPUT_FILE | sed 's/\\/\\\\/g' | db_load -T -t hash DB_FILE

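"-T" is required if the input file is a plain flat text file rather than output from db_dump. The input file consists of two-line pairs: the first line of each pair is the key and the second is the value. The sed step doubles every backslash, because db_load treats a backslash in text input as an escape character.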
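As a rough illustration of the input format, here is a minimal sketch that builds a two-record input file, escapes its backslashes, and loads it only if db_load is installed. The records and the temporary file names are made up for the example; they are not part of the original command.

```shell
# Hypothetical sample data: each record is a key line followed by a value line.
input=$(mktemp)
printf '%s\n' 'apple' 'C:\fruit\apple' 'pear' 'green' > "$input"

# Double every backslash, the same sed expression as in the command above.
escaped=$(sed 's/\\/\\\\/g' "$input")
printf '%s\n' "$escaped"

# Load into a hash database only when db_load is available on this machine.
if command -v db_load >/dev/null 2>&1; then
    dbdir=$(mktemp -d)
    printf '%s\n' "$escaped" | db_load -T -t hash "$dbdir/sample.db" \
        || echo "db_load failed" >&2
fi
```

After the sed step the second line reads C:\\fruit\\apple, while lines without backslashes pass through unchanged.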

Debug in R

http://www.vcasmo.com/video/drewconway/8556

An excellent quick tour for Objective-C

http://www.otierney.net/objective-c.html