Quickly set up a Hadoop 2.3 project on Mac OS X

Install Hadoop on Mac OS X using Homebrew

brew install hadoop

Create a quick Maven-based Java project

mvn archetype:generate -DgroupId=org.xmao.hadoop -DartifactId=wordcount -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

Configure the Java project to support Hadoop

Add the following dependencies to pom.xml


Develop and compile the Hadoop project using Maven

Now you have everything you need for a Hadoop project. You can start with the WordCount example from the Hadoop web site, then package the compiled Java classes into a jar:

mvn package

Then you can run your first Hadoop example like this:

hadoop jar target/wordcount-1.0-SNAPSHOT.jar org.xmao.hadoop.WordCount INPUT_FILE OUTPUT_DIR
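
To sanity-check the job on a small input, you can compute the same counts locally and compare them with the job's output. This is only a sketch: it assumes the WordCount example splits on whitespace and writes one word<TAB>count line per word, and the function name local_wordcount is made up for illustration.

```shell
#!/bin/sh
# local_wordcount FILE
# Print "word<TAB>count" for every whitespace-separated word in FILE,
# sorted by word in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" |   # put one word on each line
        grep -v '^$' |                # drop the empty line left by leading whitespace
        LC_ALL=C sort |               # bring identical words together
        uniq -c |                     # prefix each word with its count
        awk '{ print $2 "\t" $1 }'    # reorder to word<TAB>count
}
```

Run local_wordcount INPUT_FILE and compare the result with the part files the job wrote under OUTPUT_DIR (for example via hadoop fs -cat).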

Pretty easy, right? Enjoy!

Golang vs. Python

It’s time to migrate to Golang?

Migrate from Python to Golang?

Recycling memory buffers in Go

Google Publishes C++, Go, Java and Scala Performance Benchmarks

Upgrade RHEL 5 to 6 in place

Fixed slow MATLAB UI response on Mountain Lion

The slow UI is caused by a recent Apple Java update (build 1.6.0_51-b11-456-10M4508), which affects MATLAB R2012b and earlier and made me feel awful. To check, run this command in the MATLAB console:

version -java

If you get 1.6.0_51-b11-456-10M4508, you may encounter this problem.

Apple has fixed this in a later patch, and installing that patch manually resolves the issue. Here is the link:

After installing, check again with version -java; you should get a message like:

Java 1.6.0_51-b11-457-11M4509

Manage iptables logs

Enable iptables logging

-A INPUT -m state --state INVALID -j LOG --log-prefix "IPTABLES INPUT INVALID" --log-level 7 --log-tcp-options --log-ip-options
-A INPUT -i ! lo -j LOG --log-prefix "IPTABLES INPUT " --log-level 7 --log-tcp-options --log-ip-options
-A FORWARD -m state --state INVALID -j LOG --log-prefix "IPTABLES FORWARD INVALID" --log-level 7 --log-tcp-options --log-ip-options
-A FORWARD -p tcp -m tcp --dport 25 -j LOG
-A FORWARD -i ! lo -j LOG --log-prefix "IPTABLES FORWARD " --log-level 7 --log-tcp-options --log-ip-options
-A OUTPUT -m state --state INVALID -j LOG --log-prefix "IPTABLES OUTPUT INVALID" --log-level 7 --log-tcp-options --log-ip-options
-A OUTPUT -o ! lo -j LOG --log-prefix "IPTABLES OUTPUT " --log-level 7 --log-tcp-options --log-ip-options

Save the log to a separate file

Add a line to /etc/syslog.conf:

kern.=debug /var/log/kern.debug.log
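
Once the log lands in its own file, a quick tally per prefix shows which rule is the noisy one. A minimal sketch, assuming the log prefixes from the rules above appear verbatim in each logged line; the function name count_by_prefix is made up for illustration. The port 25 FORWARD rule sets no prefix, so its lines are not counted here.

```shell
#!/bin/sh
# count_by_prefix
# Read log text on stdin and print "<rule><TAB><count>" for each
# iptables log prefix that occurs at least once.
count_by_prefix() {
    awk '
        # The INVALID prefixes are tested first because, for example,
        # "IPTABLES INPUT INVALID" also contains "IPTABLES INPUT ".
        index($0, "IPTABLES INPUT INVALID")   { n["INPUT INVALID"]++;   next }
        index($0, "IPTABLES FORWARD INVALID") { n["FORWARD INVALID"]++; next }
        index($0, "IPTABLES OUTPUT INVALID")  { n["OUTPUT INVALID"]++;  next }
        index($0, "IPTABLES INPUT ")          { n["INPUT"]++;           next }
        index($0, "IPTABLES FORWARD ")        { n["FORWARD"]++;         next }
        index($0, "IPTABLES OUTPUT ")         { n["OUTPUT"]++;          next }
        END { for (k in n) printf "%s\t%d\n", k, n[k] }
    ' | LC_ALL=C sort
}
```

For example: count_by_prefix < /var/log/kern.debug.log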

Create a logrotate configuration for kern.debug.log:

vim /etc/logrotate.d/kern.debug

/var/log/kern.debug.log {
    rotate 7
    size 100M
    postrotate
        /sbin/killall -HUP syslogd
    endscript
}
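
With size 100M the log is rotated only once it has grown past that size, so a small log is left alone. A rough sketch of that check, assuming M means 1024 * 1024 bytes; the function name exceeds_size is made up for illustration and is not part of logrotate.

```shell
#!/bin/sh
# exceeds_size FILE LIMIT_BYTES
# Succeed (exit status 0) when FILE is strictly larger than LIMIT_BYTES.
# Return 2 when FILE does not exist or is not a regular file.
exceeds_size() {
    [ -f "$1" ] || return 2
    # Arithmetic expansion strips the padding some wc versions print.
    actual=$(( $(wc -c < "$1") ))
    [ "$actual" -gt "$2" ]
}
```

For example: exceeds_size /var/log/kern.debug.log $((100 * 1024 * 1024)) && echo "would rotate"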

All about ggplot2 in R

Ggplot2 Guide

Manage Linux logs with logrotate

Understand logrotate

Logrotate and move to backup directory