Quickly set up a Hadoop 2.3 project on Mac OS X

Install Hadoop on Mac OS X using Homebrew

brew install hadoop

Create a quick Maven-based Java project

mvn archetype:generate -DgroupId=org.xmao.hadoop -DartifactId=wordcount -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

Configure Java project to support Hadoop

Add the following dependency inside the <dependencies> element of pom.xml:

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.3.0</version>
</dependency>

Develop and compile Hadoop project using Maven

Now you have everything you need for a Hadoop project. Start with the WordCount example from the Hadoop web site, then package the compiled Java classes into a jar:

mvn package

Then you can run your first Hadoop example like this:

hadoop jar target/wordcount-1.0-SNAPSHOT.jar org.xmao.hadoop.WordCount INPUT_FILE OUTPUT_DIR
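
To sanity-check the job on a small input, the same counts can be computed locally and compared with what the job wrote. The function below is only a sketch: the name local_wordcount is made up for illustration, and it assumes the job splits words on whitespace and writes "word<TAB>count" lines, as the tutorial WordCount does with the default text output.

```shell
# Count whitespace-separated words read from stdin and print
# "word<TAB>count" lines sorted by word (byte order).
# local_wordcount is a hypothetical helper name, not part of Hadoop.
local_wordcount() {
  tr -s '[:space:]' '\n' |
    awk 'NF { count[$0]++ }
         END { for (w in count) printf "%s\t%d\n", w, count[w] }' |
    LC_ALL=C sort
}
```

If Hadoop runs in standalone mode, so that OUTPUT_DIR is a local directory, the output of `local_wordcount < INPUT_FILE` should match the sorted contents of the part files in OUTPUT_DIR.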

Pretty easy, right? Enjoy!

Golang vs. Python

Is it time to migrate to Golang?

Migrate from Python to Golang?
http://blog.repustate.com/migrating-code-from-python-to-golang-what-you-need-to-know/
http://tech.t9i.in/2013/01/why-program-in-go/

Recycling memory buffers in Go
http://blog.cloudflare.com/recycling-memory-buffers-in-go
https://github.com/songgao/bufferManager.go

Google Publishes C++, Go, Java and Scala Performance Benchmarks
http://readwrite.com/2011/06/06/cpp-go-java-scala-performance-benchmark#awesm=~onRHyHp6x8VnKK

Upgrade RHEL 5 to 6 in place

http://bitc.bme.emory.edu/~lzhou/blogs/?p=203

Fixed slow MATLAB UI response on Mountain Lion

The slow UI is caused by a recent Apple Java update (build 1.6.0_51-b11-456-10M4508), which affects MATLAB R2012b and earlier and made me feel awful. To check which Java build you have, type this command in the MATLAB console:

version -java

If you get 1.6.0_51-b11-456-10M4508, you may encounter this problem.

Apple has fixed this in a later patch, and installing that patch manually finally resolves the issue. Here is the link:

http://support.apple.com/kb/DL1572

After installing, check again with version -java; you should see something like:

Java 1.6.0_51-b11-457-11M4509

Manage iptables logs

Enable iptables logging

-A INPUT -m state --state INVALID -j LOG --log-prefix "IPTABLES INPUT INVALID " --log-level 7 --log-tcp-options --log-ip-options
-A INPUT -i ! lo -j LOG --log-prefix "IPTABLES INPUT " --log-level 7 --log-tcp-options --log-ip-options
-A FORWARD -m state --state INVALID -j LOG --log-prefix "IPTABLES FORWARD INVALID " --log-level 7 --log-tcp-options --log-ip-options
-A FORWARD -p tcp -m tcp --dport 25 -j LOG
-A FORWARD -i ! lo -j LOG --log-prefix "IPTABLES FORWARD " --log-level 7 --log-tcp-options --log-ip-options
-A OUTPUT -m state --state INVALID -j LOG --log-prefix "IPTABLES OUTPUT INVALID " --log-level 7 --log-tcp-options --log-ip-options
-A OUTPUT -o ! lo -j LOG --log-prefix "IPTABLES OUTPUT " --log-level 7 --log-tcp-options --log-ip-options

Save the log to a separate file

The rules above log at level 7 (debug), so select kernel messages at exactly that level by adding a line to /etc/syslog.conf:

kern.=debug /var/log/kern.debug.log

Create a logrotate configuration for kern.debug.log:

vim /etc/logrotate.d/kern.debug

/var/log/kern.debug.log {
    rotate 7
    daily
    size 100M
    compress
    missingok
    notifempty
    postrotate
        /sbin/killall -HUP syslogd
    endscript
}
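
Once entries start landing in /var/log/kern.debug.log, a quick summary helps to see who is generating them. The function below is a rough sketch, not a complete log parser: the name top_sources is invented for illustration, and it assumes the usual kernel LOG line layout in which each entry carries the rule's log prefix followed by fields such as SRC=address.

```shell
# Read log lines on stdin, keep those containing the given log prefix,
# and print "count source-address" pairs, most frequent first.
# top_sources is a hypothetical helper name used only in this sketch.
top_sources() {
  grep -F -- "$1" |
    sed -n 's/.* SRC=\([0-9a-fA-F.:]*\).*/\1/p' |
    sort | uniq -c | sort -k1,1nr -k2,2 |
    awk '{ print $1, $2 }'
}
```

For example, `top_sources 'IPTABLES INPUT ' < /var/log/kern.debug.log` lists the busiest senders on the INPUT chain. Since "IPTABLES INPUT " is also the start of the INVALID prefix, that call counts both kinds of entries; pass 'IPTABLES INPUT INVALID' to narrow it down.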

All about ggplot2 in R

Ggplot2 Guide

http://sharpstatistics.co.uk/r/ggplot2-guide/?utm_source=rss&utm_medium=rss&utm_campaign=ggplot2-guide

Manage Linux logs with logrotate

Understand logrotate

http://www.rackspace.com/knowledge_center/article/understanding-logrotate-part-1

Logrotate and move to backup directory

http://www.ashishnepal.com/logrotate-and-move-to-backup-directory/
