Friday, August 21, 2009

Clustering RabbitMQ servers for High Availability

The clustering guide on the RabbitMQ website is a good start, but it does not seem to work out of the box.

The default rabbitmq-server script passes the -sname (short node name) option on the Erlang command line. In my setup this made it impossible to set up the cluster: I could not put a node into the cluster when it was started with a short name. So the first step is to remove that line from the rabbitmq-server script.

By default the Erlang runtime generates a random cookie and saves it in the .erlang.cookie file in the home directory. Since the generated cookie is random, the cluster cannot be set up unless the same cookie file is copied onto the other nodes. A better idea is to set the cookie on the command line itself. This can be done via the environment variable RABBITMQ_SERVER_START_ARGS.

Also, by default rabbitmq-server does not set the full node name. One needs to specify it as a command-line argument. The best place for it is the environment variable RABBITMQ_SERVER_START_ARGS (the same one used to set the cookie).

Erlang's DNS lookup is done via the inet module. By default it consults /etc/hosts first to resolve the host name. Therefore in some environments (at least in mine), the short name does not resolve to the long name when I call inet:gethostbyname(Shortname); I get the short name back in the hostent structure. The workaround is to force the use of the native library for DNS lookup. This can be done via an inetrc file. By default Erlang uses the $HOME/.inetrc file (if it exists). A non-standard location can be specified via the ERL_INETRC environment variable. Don't forget to give the full path name.

The content of the inetrc file looks like this:

{lookup, [native]}.

Other possible lookup methods are file, yp and dns.
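
A minimal sketch of creating such a file from the shell and pointing Erlang at it. The temporary directory and the variable names INETRC_DIR and INETRC_FILE are assumptions made for illustration; any absolute path should work.

```shell
# Sketch: write a minimal inetrc and point Erlang at it.
# INETRC_DIR is a hypothetical location; by default a throwaway temp directory.
INETRC_DIR="${INETRC_DIR:-$(mktemp -d)}"
INETRC_FILE="$INETRC_DIR/inetrc"

# Each inetrc entry is an Erlang term terminated by a full stop.
printf '%s\n' '{lookup, [native]}.' > "$INETRC_FILE"

# ERL_INETRC has to be a full path, so keep INETRC_DIR absolute.
export ERL_INETRC="$INETRC_FILE"
echo "ERL_INETRC=$ERL_INETRC"
```

The variable has to be exported in the same environment that later starts rabbitmq-server, otherwise the Erlang runtime will not see it.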

Some other environment variables are good to know:

RABBITMQ_LOG_BASE -> log location (default /var/log/rabbitmq)
RABBITMQ_NODENAME -> node name to be used (default rabbit)
RABBITMQ_MNESIA_BASE -> location for mnesia database (default /var/lib/rabbitmq/mnesia)

If your machine has multiple network interfaces, RabbitMQ binds to all of them by default. In some cases that is not desirable. One can force it to bind to a single network interface by setting the environment variable RABBITMQ_NODE_IP_ADDRESS to the IP address of that interface.

Here is a summary of all the environment variables required:

RABBITMQ_SERVER_START_ARGS="-name rabbit@`hostname` -setcookie rabbit"
RABBITMQ_LOG_BASE=/home/baliga/rabbit
RABBITMQ_NODENAME=rabbit
RABBITMQ_MNESIA_BASE=/home/baliga/rabbit
ERL_INETRC=/home/baliga/inetrc
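
One thing worth checking before relying on the -name value above: long node names expect a fully qualified host name, and plain `hostname` prints only the short name on some systems. Below is a small sketch of a guard for that. The helper build_node_name and the host mq1.example.com are hypothetical, made up for illustration, and are not part of RabbitMQ.

```shell
# Sketch: refuse to build a long node name from a host name with no domain part.
# build_node_name is a hypothetical helper, not something RabbitMQ ships.
build_node_name() {
    node="$1"   # node part, e.g. rabbit
    host="$2"   # host part, e.g. the output of `hostname`
    case "$host" in
        *.*) printf '%s@%s\n' "$node" "$host" ;;
        *)   echo "host name '$host' is not fully qualified" >&2
             return 1 ;;
    esac
}

# Example with a made-up host name:
build_node_name rabbit mq1.example.com
```

If the check fails for the output of `hostname` on your machine, `hostname -f` may give the fully qualified name instead, depending on the platform.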

Now it is time to start the RabbitMQ server on all the nodes:

rabbitmq-server -detached

The rabbitmqctl script is used to set up the cluster. By default this script also has the -sname option on its command line. Delete this line from the rabbitmqctl file. For the script to communicate with the node, it needs a long name and the cookie. Set these via the RABBITMQ_CTL_ERL_ARGS environment variable:

export RABBITMQ_CTL_ERL_ARGS="-name rabbitmqctl@`hostname` -setcookie rabbit"

Note that the cookie must be the same as that of the rabbitmq-server instance.

Now you can run rabbitmqctl to set up the cluster.

On each of the nodes, run the following commands:

1. rabbitmqctl stop_app
2. rabbitmqctl reset
3. rabbitmqctl cluster <other-node>
4. Repeat step 3 for every other node in the cluster (all nodes except the node itself).
5. rabbitmqctl start_app
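
The sequence above can be sketched as a dry run that only prints the commands for one node. It assumes that the cluster step takes the node name of another cluster member as its argument. The helper print_cluster_steps and the node names are hypothetical, chosen for illustration.

```shell
# Sketch: print the rabbitmqctl sequence for one node, given the other members.
# print_cluster_steps is a hypothetical helper; it only echoes, it runs nothing.
print_cluster_steps() {
    echo "rabbitmqctl stop_app"
    echo "rabbitmqctl reset"
    # Assumption: one cluster call per other node, each with that node's name.
    for peer in "$@"; do
        echo "rabbitmqctl cluster $peer"
    done
    echo "rabbitmqctl start_app"
}

# Example for a three-node cluster, seen from the first node (made-up names):
print_cluster_steps rabbit@mq2.example.com rabbit@mq3.example.com
```

Printing first makes it easy to review the exact commands before running them by hand on each node.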

Now the cluster is ready to be used.

Some questions are not answered here, such as how clustering works when nodes are behind separate firewalls, and which ports need to be opened in the firewall for clustering to work. I am still researching this and will post the findings soon.

Thursday, August 13, 2009

Is parallel computing/concurrency hard to get?

On this subject, the first question I would like to ask is: why does one need concurrency? My one-line answer would be "As we go into the future, the number of cores available even on desktops will increase dramatically". That is why one needs to think in terms of parallel operations instead of serial operations. Many libraries already exist to achieve this, like MPM.

In my opinion, these libraries are hard to use, and one still needs to deal with the underlying network architecture. But that is not necessary. Parallelism can be achieved without getting into the low-level functions. The only thing required is "changing the way we think of designing the system".

Functional programming is my answer to this problem. Just thinking functionally is not enough, though; one should get a real feel for it. One needs to forget everything one knows and start thinking like a child, learning things from the beginning.

When I started thinking functionally, my functional code still always turned out to be serial. Then I was introduced to a wonderful language called "Erlang". It really forces one to think and work not only functionally but also concurrently.

Tuesday, August 4, 2009

New and improved delicious released

Today Yahoo! Delicious released some cool new features.

* Integration with Twitter. See the related tweets.
* Share interesting links not just with delicious users, but also with your friends via email.
* Search is improved even further. You can see the search results on a timeline and find the activity for your search terms over time (even though it is not done with Yahoo! Search).
* You can now watch YouTube and other videos inline. No need to leave the Delicious search page.
* Flickr images are also displayed inline.

Congratulations to the Delicious team.

Monday, August 3, 2009

Module inheritance in Erlang

Erlang supports module inheritance via the -extends module attribute.

Here is the example code:

parent.erl
-------------

-module (parent).
-export( [fun1/0, fun2/0] ).

fun1() ->
    io:format( "In parent:fun1/0~n" ).

fun2() ->
    io:format( "In parent:fun2/0~n" ).


child.erl
----------

-module (child).
-extends(parent).
-export( [fun1/0] ).

fun1() ->
    io:format( "In child:fun1/0~n" ).


Testing this:

erl

> c(parent), c(child).
{ok,child}
> parent:fun1().
In parent:fun1/0
> parent:fun2().
In parent:fun2/0
> child:fun1().
In child:fun1/0
> child:fun2().
In parent:fun2/0

Even though fun2 is not defined or exported in the child module, child:fun2/0 is a valid call because parent exports fun2/0.

If you call module_info() on child, you won't see fun2 in the list of exported functions. The Erlang VM finds the extended module via the attributes property of the module.
