Concurrent future...: March 2009

Tuesday, March 31, 2009

RPC from Erlang Linked-in driver port

While developing a SAX-based XML parser (using libexpat) as a linked-in driver for Erlang, I needed to make callbacks from within the driver. I had done this earlier in a C node via the ei_rpc C API, but that requires the linked-in driver to run as a C node, which I wanted to avoid.

Then I came across the function driver_send_term, which can send a term to any PID within the local VM. Sending the term to the process that issued the port command is easy: its PID can be obtained with the driver_caller C API.

My requirement, however, is to send the term to a PID other than the calling process. There is no standard API for that, but by going through the Erlang source code I figured out how to convert the erlang_pid sent by the caller into one usable in a driver_send_term call.

ErlDrvTermData pid = ((ErlDrvTermData) ( ((callbackPid.serial << 15 | callbackPid.num)) << 4 | (0x0 << 2 | 0x3)) );

This works with Erlang R12 and R13. It is not guaranteed to work in future releases, but until then I am going to use it.

I wonder why Erlang does not provide a standard interface for converting an erlang_pid to an ErlDrvTermData that can be used with driver_send_term.

Monday, March 30, 2009

Sharing binary data and reference counting

This email thread concludes that constants in Erlang code are stored in a constant pool rather than in process memory, so they are not copied into the memory of the process that uses them. This is more efficient when the data is known not to change.

Configuration is an example of this kind of data. If the configuration is known at compile time, well and good. But what if the configuration is read from a file at run time? Then the data gets copied into process memory.

It was suggested that I generate code at run time and load it. That way the Erlang VM keeps the configuration elements in the constant pool and the data is not copied into process memory.

I decided to give it a try. Generated the code and stored it in a file.

ConfigFetcher = list_to_atom("fetch_" ++ atom_to_list(Module) ++ "_config"),
FileName = code:priv_dir(Module) ++ "/" ++ atom_to_list(ConfigFetcher) ++ ".erl",
file:write_file(FileName, GeneratedCode).

Compiled it using

compile:file(FileName, [{outdir, code:lib_dir(Module, ebin)}]).

Load the compiled file,

code:load_file(ConfigFetcher)

Let's say that fetch is a function exported by the generated module. Using this function to fetch the configuration for a given key, I ran a small test to measure the performance improvement.

The numbers I saw were mind-boggling. With the old method I could get 11k fetches per second. With the new method (code generation and loading), I got 500k fetches per second.
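The same idea can also be tried without touching the file system. The following is only a minimal sketch, assuming the configuration is a list of {Key, Value} pairs whose terms print as valid Erlang (atoms, numbers, strings, tuples, lists, binaries). The module name config_codegen, the helper names and the generated fetch/1 shape are my own choices for illustration. It builds the source as a string, compiles it with compile:forms/1 and loads it with code:load_binary/3 instead of writing a .erl file.

```erlang
-module(config_codegen).
-export([load/2]).

%% Generate, compile and load a module ModName exporting fetch/1,
%% with one clause per {Key, Value} pair and 'undefined' as fallback.
load(ModName, KVs) when is_atom(ModName), is_list(KVs) ->
    Clauses = [io_lib:format("fetch(~p) -> ~p;~n", [K, V]) || {K, V} <- KVs],
    Source = lists:flatten(
               [io_lib:format("-module(~p).~n-export([fetch/1]).~n", [ModName]),
                Clauses,
                "fetch(_) -> undefined.\n"]),
    Forms = scan_forms(Source),
    {ok, ModName, Binary} = compile:forms(Forms),
    %% Drop any old version so repeated loads succeed. Note that this
    %% kills processes that are still running the old code.
    _ = code:purge(ModName),
    {module, ModName} =
        code:load_binary(ModName, atom_to_list(ModName) ++ ".erl", Binary),
    ok.

%% Tokenize the source and parse it one form at a time;
%% every form ends with a dot token.
scan_forms(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    parse_forms(Tokens, [], []).

parse_forms([], [], Acc) ->
    lists:reverse(Acc);
parse_forms([{dot, _} = Dot | Rest], Current, Acc) ->
    {ok, Form} = erl_parse:parse_form(lists:reverse([Dot | Current])),
    parse_forms(Rest, [], [Form | Acc]);
parse_forms([Token | Rest], Current, Acc) ->
    parse_forms(Rest, [Token | Current], Acc).
```

After config_codegen:load(demo_cfg, [{port, 8080}]), the value is read with demo_cfg:fetch(port). Terms such as pids, references and funs cannot be embedded this way because they have no source representation.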

Sunday, March 29, 2009

AMQP Client gotchas!

There is a small change required in this article on Introducing The Erlang AMQP Client.

In this article, the code for subscribing to a queue is

#'basic.consume_ok'{consumer_tag = ConsumerTag}
= amqp_channel:call(Channel, BasicConsume, self()),

This throws the following exception:

Channel 1 is shutting down due to: {{badmatch,false},
[{rabbit_writer,assemble_frames,4},
{rabbit_writer,
internal_send_command_async,5},
{rabbit_writer,handle_message,2},
{rabbit_writer,mainloop,1}]}

Change it to amqp_channel:subscribe(Channel, BasicConsume, self())

NOTE: Download rabbitmq erlang client from here. The version from this page does not work.

Thursday, March 26, 2009

TCP server in Erlang

I came across this good article on writing a TCP server in Erlang. The author left out one important point.

connect(Listen) ->
    {ok, Socket} = gen_tcp:accept(Listen),
    inet:setopts(Socket, ?TCP_OPTS),
    % kick off another process to handle connections concurrently
    spawn(fun() -> connect(Listen) end),
    recv_loop(Socket),
    gen_tcp:close(Socket).

In the above code snippet (from the blog mentioned above), note that after a connection is accepted via gen_tcp:accept/1, a new process is spawned to continue accepting on the listening socket, and the client is served from the process that accepted it. This is unlike most other languages, where a thread is spawned to serve the incoming client and the main thread continues to listen. The distinction matters because the accepted socket is owned by the process that called gen_tcp:accept/1. If you simply spawn a process to serve the client, it won't work: incoming messages are delivered to the owning process, not to the new one, unless ownership is first transferred with gen_tcp:controlling_process/2.
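For the case where you do want a separate worker process per client, here is a minimal sketch of the ownership handoff, assuming a loopback connection on an ephemeral port. The module name tcp_handoff, the {socket, Socket} handoff message and the {worker_got, Data} reply are made up for this illustration. The accepting process gives the socket away with gen_tcp:controlling_process/2, and only then does the worker switch it to active mode.

```erlang
-module(tcp_handoff).
-export([demo/0]).

%% Accept one loopback connection, hand the socket to a worker
%% process, and return whatever data the worker received.
demo() ->
    {ok, Listen} = gen_tcp:listen(0, [binary, {active, false}]),
    {ok, Port} = inet:port(Listen),
    %% The connect completes against the listen backlog before accept.
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, false}]),
    {ok, Socket} = gen_tcp:accept(Listen),
    Parent = self(),
    Worker = spawn(fun() -> worker(Parent) end),
    %% Transfer ownership: from now on {tcp, ...} messages go to Worker.
    ok = gen_tcp:controlling_process(Socket, Worker),
    Worker ! {socket, Socket},
    ok = gen_tcp:send(Client, <<"hello">>),
    Result = receive
                 {worker_got, Data} -> Data
             after 2000 ->
                 timeout
             end,
    gen_tcp:close(Client),
    gen_tcp:close(Listen),
    Result.

worker(Parent) ->
    receive
        {socket, Socket} ->
            %% Ask for exactly one message now that we own the socket.
            ok = inet:setopts(Socket, [{active, once}]),
            receive
                {tcp, Socket, Data} -> Parent ! {worker_got, Data}
            after 2000 ->
                Parent ! {worker_got, timeout}
            end,
            gen_tcp:close(Socket)
    end.
```

Keeping the socket passive until the handoff is done avoids data arriving in the accepting process's mailbox before the worker owns the socket.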

Tuesday, March 24, 2009

string:tokens/2 does not handle empty tokens

string:tokens("A,,,,,", ",") returns ["A"], whereas I expect one element per field, empty ones included: ["A", "", "", "", "", ""] (five commas delimit six fields). This can be solved using the regexp:split function.

regexp:split("A,,,,,", ",") returns {ok, ["A", [], [], [], [], []]}
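The regexp module was later removed from OTP, so on newer releases the same effect needs a different call. A minimal sketch using re:split/3 follows. The module and function names are only for illustration, and note that the separator is interpreted as a regular expression, so characters such as "." or "|" would need escaping.

```erlang
-module(split_demo).
-export([fields/2]).

%% Split String on Separator (a regular expression), keeping empty
%% fields. {return, list} yields strings instead of binaries.
fields(String, Separator) ->
    re:split(String, Separator, [{return, list}]).
```

For example, split_demo:fields("A,,,,,", ",") keeps all six fields, five of them empty.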

Saturday, March 21, 2009

Imperative language constructs in Erlang

"How can I do for loop in Erlang?", "How can I do while loop in Erlang?"....

These are common questions people ask when they come to functional programming from imperative languages. It is easy to ease into functional programming in Python, Ruby etc., as they mix functional and imperative constructs. Languages like Erlang (and probably Haskell) do not provide explicit for and while constructs. Here are equivalent ways to implement some of the imperative language constructs.

For Loop

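% Using lists:foreach/2: calls Fun(1), Fun(2), ..., Fun(N).
for_loop(N, Fun) ->
    lists:foreach(Fun, lists:seq(1, N)).

% Alternative using plain recursion (use one definition or the other):
% calls Fun(N), Fun(N-1), ..., Fun(1).
for_loop(0, _Fun) -> ok;
for_loop(N, Fun) ->
    Fun(N),
    for_loop(N-1, Fun).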

While loop

while_loop(Predicate, Fun, State) ->
    while_loop(Predicate(State), Predicate, Fun, State).

% Stop as soon as the predicate evaluates to false.
while_loop(false, _Predicate, _Fun, _State) -> ok;
while_loop(_PredResult, Predicate, Fun, State) ->
    NewState = Fun(State),
    % Re-evaluate the predicate against the updated state.
    while_loop(Predicate(NewState), Predicate, Fun, NewState).
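To see the constructs in use, here is a small self-contained sketch. It is a variation rather than the exact code above: this while_loop/3 returns the final state instead of ok, which is usually what the caller wants. The module name loops_demo and the sum_to/1 example are only for illustration.

```erlang
-module(loops_demo).
-export([for_loop/2, while_loop/3, sum_to/1]).

%% Call Fun(1), Fun(2), ..., Fun(N) for its side effects.
for_loop(N, Fun) when is_integer(N), N >= 0 ->
    lists:foreach(Fun, lists:seq(1, N)).

%% Apply Fun to the state for as long as Predicate(State) is true,
%% then return the final state.
while_loop(Predicate, Fun, State) ->
    case Predicate(State) of
        true  -> while_loop(Predicate, Fun, Fun(State));
        false -> State
    end.

%% Imperative equivalent: I = 1; Sum = 0;
%% while (I =< N) { Sum += I; I += 1 }
sum_to(N) ->
    {_I, Sum} = while_loop(fun({I, _Acc}) -> I =< N end,
                           fun({I, Acc}) -> {I + 1, Acc + I} end,
                           {1, 0}),
    Sum.
```

Since variables cannot be reassigned, everything the loop body would mutate travels in the state tuple.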

Wednesday, March 18, 2009

Auto increment in Mnesia database

Mnesia provides a somewhat obscure API called dirty_update_counter/3, which can be used to generate unique ids. Here is an example.

-module (uniqueid).

-export( [test/0] ).

-record( unique_ids, {type, id} ).

test() ->
    mnesia:start(),
    mnesia:create_table( unique_ids, [{attributes, record_info(fields, unique_ids)}] ),
    Id = mnesia:dirty_update_counter(unique_ids, record_type, 1),
    io:format( "Id => ~p~n", [Id] ),
    Id1 = mnesia:dirty_update_counter(unique_ids, record_type, 1),
    io:format( "Id => ~p~n", [Id1] ),
    Id2 = mnesia:dirty_update_counter(unique_ids, another_type, 1),
    io:format( "Id => ~p~n", [Id2] ).

The output you will get is

Id => 1
Id => 2
Id => 1

A single table can be used to generate the unique ids for other tables. In this example, unique ids are generated for record_type and another_type.

Thursday, March 12, 2009

Tuning TCP stack on Linux

The TCP stack on most Linux distributions is tuned for desktop use. Running Linux on heavily loaded servers requires some tuning of the TCP stack. Here is what I found.

Add the following lines into /etc/sysctl.conf

# tcp tuning
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_wmem=4096 65536 16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_no_metrics_save=1

Run

sudo sysctl -p

Restart your application(s).

This is especially effective when clients are on slow connections. These settings raise the maximum TCP send and receive buffer sizes, which allows a larger window, so more data can be in flight to the client before an acknowledgement is needed.

Wednesday, March 4, 2009

Gracefully terminating erlang VM

Once started, an Erlang VM usually does not need to be stopped. When it does, the init:stop/0 function shuts down the running services and the VM gracefully. The trick is to call this function on the target VM via a remote shell.

For example, to stop the VM running as foo@example.foo.com with cookie foo:

erl -name bar@example.foo.com -remsh foo@example.foo.com -setcookie foo -s init stop

The above command will not work, as init:stop() is executed within the bar@example.foo.com VM, not in foo@example.foo.com. Another way is to spawn the call on the target.

erl -name bar@example.foo.com -remsh foo@example.foo.com -setcookie foo -s proc_lib spawn 'foo@example.foo.com' init stop "" -s init stop

The above command is meant to execute proc_lib:spawn( 'foo@example.foo.com', init, stop, []). There is a small caveat here: the -s option passes all the parameters to the function as a single list, not as individual parameters, so the call actually attempted looks like this:

proc_lib:spawn( ['foo@example.foo.com', init, stop, [] ] ).

Since proc_lib:spawn/1 expects a fun rather than a list of arguments, this will not work.

The only other solution is to create a module with one exported function that spawns init:stop/0 on the remote node.

-module (ctrl).

-export([stop/1]).

stop(Node) ->
    proc_lib:spawn( hd(Node), init, stop, [] ).

Note that ctrl:stop/1 receives a list as its argument and uses only the head element.

erl -name bar@example.foo.com -remsh foo@example.foo.com -setcookie foo -s ctrl stop 'foo@example.foo.com' -s init stop

The above command line executes ctrl:stop(['foo@example.foo.com']) followed by init:stop().

UPDATE: This can be done from the command line without any extra module. All you have to do is use the -eval option on the erl command line.

erl -name bar@example.foo.com -setcookie foo -eval "rpc:call('foo@example.foo.com', init, stop, []), init:stop()."
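If the shutdown is triggered from Erlang code rather than from the command line, the same rpc:call/4 can be wrapped so that an unreachable node is reported instead of silently ignored. This is only a sketch. The module name ctrl_rpc and the {error, Reason} return shape are my own choices, and it assumes the calling node shares the target's cookie.

```erlang
-module(ctrl_rpc).
-export([stop/1]).

%% Ask Node to shut down gracefully by running init:stop/0 there.
%% Returns ok once the request is accepted, or {error, Reason} if the
%% node cannot be reached (for example {error, nodedown}).
stop(Node) when is_atom(Node) ->
    case rpc:call(Node, init, stop, []) of
        ok               -> ok;
        {badrpc, Reason} -> {error, Reason}
    end.
```

Note that init:stop/0 returns immediately, so ok means the shutdown has started on the target, not that it has finished.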