Friday, January 22, 2010

http request pipeline in Erlang

I tried to use Erlang's http module for highly concurrent requests. It did not perform well because of pipelining and persistent connection issues. This seems to be solved in the R13 release. I figured out how to use http profiles to enable pipelining/persistent connections selectively, for one server but not for others (useful when the application sends requests to multiple hosts).

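The first step is to create a new http profile. This can be done in 2 ways. The first is to start an http connection manager (httpc_manager) dynamically at runtime. Passing stand_alone as a third argument to inets:start would instead run it outside the inets application.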


{ok, Pid} = inets:start( httpc, [{profile, other}] ).


As per the documentation, this is less desirable because some benefits of the OTP framework are lost (all of them if the service is started as stand_alone):

Dynamically started services will not be handled by application takeover and failover behavior when inets is run as a distributed application. Nor will they be automatically restarted when the inets application is restarted, but as long as the inets application is up and running they will be supervised and may be soft code upgraded. Services started as stand_alone, e.i. the service is not started as part of the inets application, will lose all OTP application benefits such as soft upgrade. The "stand_alone-service" will be linked to the process that started it. In most cases some of the supervision functionality will still be in place and in some sense the calling process has now become the top supervisor
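To make the dynamic start concrete, here is a minimal sketch that starts a profile at runtime, checks that inets lists it as a running service, and stops it again. The module name profile_demo and the function start_and_stop/1 are hypothetical names chosen for illustration. The calls themselves (inets:start/0, inets:start/2, inets:services/0, inets:stop/2) are part of the inets API.

```erlang
-module(profile_demo).
-export([start_and_stop/1]).

%% Start an httpc profile dynamically, verify it is listed by inets,
%% then stop it. Returns true if the new manager pid was listed.
start_and_stop(Profile) ->
    %% Make sure the inets application itself is running.
    case inets:start() of
        ok -> ok;
        {error, {already_started, inets}} -> ok
    end,
    %% Dynamically started service: supervised by inets, but not part
    %% of the configured services.
    {ok, Pid} = inets:start(httpc, [{profile, Profile}]),
    %% inets:services/0 returns a list of {Service, Pid} tuples.
    Listed = lists:keymember(Pid, 2, inets:services()),
    ok = inets:stop(httpc, Pid),
    Listed.
```

Calling profile_demo:start_and_stop(other) from the shell exercises the same inets:start call shown above.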


The second method is to run it as part of the inets application via a configuration file.

Create a config file with the following content (say, inets.config):

[{inets,
  [{services, [{httpc, [{profile, server1}]},
               {httpc, [{profile, server2}]}]}]
 }].


Run the erlang shell as


erl -config inets.config


Once the inets application is started (for example with inets:start()), this will start 3 http profiles [server1, server2 and default].
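The same services entry can also be set from code before inets is started, which is handy for experimenting without a config file. The sketch below is an assumption-laden illustration: the module name inets_env_demo and the function start_with_profiles/1 are hypothetical, and it assumes inets is not already running (otherwise the environment change has no effect and the ok = inets:start() match fails).

```erlang
-module(inets_env_demo).
-export([start_with_profiles/1]).

%% Set the inets 'services' environment key programmatically, start
%% inets, and return the profile names of the running httpc services.
start_with_profiles(Profiles) ->
    %% Load first so the application's default environment exists;
    %% ignore the result in case it is already loaded.
    _ = application:load(inets),
    Services = [{httpc, [{profile, P}]} || P <- Profiles],
    ok = application:set_env(inets, services, Services),
    %% Assumes inets is not yet running.
    ok = inets:start(),
    [proplists:get_value(profile, Info)
     || {httpc, _Pid, Info} <- inets:services_info(), is_list(Info)].
```

For example, inets_env_demo:start_with_profiles([server1, server2]) mirrors the inets.config file above.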

Now the question is how to use the newly created profiles. Let's say the application uses 2 web services hosted at foo1.example.com and foo2.example.com. The web service at foo1.example.com runs on a web server that can support a lot of persistent (keep-alive) connections. The web service at foo2.example.com runs on a normal web server that is not optimized for a large number of persistent connections.

In the application, configure the server1 profile for the connections to foo1.example.com. This can be done by changing the http options described in the http module documentation.


http:set_options([{max_sessions, 20}, {pipeline_timeout, 20000}], server1).


NOTE: It is required to set the pipeline timeout in order to enable http pipelining.

The profile can be specified at request time.


http:request( "http://foo1.example.com/v1/get_info/dudefrommangalore", server1).
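Putting the two steps together, the following sketch configures a profile and then issues a request through it. It uses the httpc module, which is the name the http client module carries in later OTP releases. The module name pipeline_demo and the functions configure/1 and fetch/2 are hypothetical, and the profile is assumed to have been started already. httpc:get_options/2 is used only to read the values back and may not exist in releases as old as R13.

```erlang
-module(pipeline_demo).
-export([configure/1, fetch/2]).

%% Enable up to 20 persistent sessions per host and pipelining with a
%% 20 second idle timeout for the given (already started) profile,
%% then read the two options back.
configure(Profile) ->
    ok = httpc:set_options([{max_sessions, 20},
                            {pipeline_timeout, 20000}], Profile),
    httpc:get_options([max_sessions, pipeline_timeout], Profile).

%% Issue a GET request through the given profile and reduce the
%% result to the status code and body.
fetch(Url, Profile) ->
    case httpc:request(Url, Profile) of
        {ok, {{_Version, Status, _Reason}, _Headers, Body}} ->
            {ok, Status, Body};
        {error, Reason} ->
            {error, Reason}
    end.
```

Requests to foo2.example.com would go through server2 (or the default profile), whose options stay untouched, so pipelining is only used where the server can handle it.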


Neither httpc_manager nor inets provides an interface to get the number of sessions open to a server. The good news is that the session information is kept in an ets table, so one can query that table to get the list of persistent connections.


ets:tab2list(httpc_manager_server1_session_db).


Output is something like

{tcp_session,{{"foo1.example.com",80},
              <0.103.0>},
             false,http,#Port<0.1032>,...}
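The table name follows the pattern seen above, httpc_manager_<profile>_session_db. A small helper can build that name from the profile and return the sessions, falling back to an empty list when the table does not exist. This is only a sketch: the module session_demo and its functions are hypothetical, and the naming pattern is an internal detail of httpc_manager taken from the output above, so it may differ in other inets versions.

```erlang
-module(session_demo).
-export([session_table/1, sessions/1]).

%% Build the session table name for a profile, following the naming
%% observed in this post (internal detail, may change between versions).
session_table(Profile) when is_atom(Profile) ->
    list_to_atom("httpc_manager_" ++ atom_to_list(Profile) ++ "_session_db").

%% Return all session records for the profile, or [] if no such
%% table exists.
sessions(Profile) ->
    Tab = session_table(Profile),
    case ets:info(Tab) of
        undefined -> [];
        _Info -> ets:tab2list(Tab)
    end.
```

With this, length(session_demo:sessions(server1)) gives a rough count of the open persistent connections for the profile.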


<0.103.0> is the pid of an httpc_handler gen_server process. It is possible to get the status of this process via the standard OTP sys module.


sys:get_status(erlang:list_to_pid("<0.103.0>")).


It is also possible to get all the pipelined requests on each persistent connection. For that, it is necessary to get the pid of the httpc_manager via inets:services_info(). This call returns the pid of the httpc_manager for each profile:


[{httpc,<0.52.0>,[{profile,server1}]},
{httpc,<0.53.0>,[{profile,server2}]},
{httpc,<0.41.0>,[{profile,default}]}]
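Rather than copying the pid by hand, the manager pid for a profile can be picked out of that list. The sketch below assumes the {httpc, Pid, [{profile, Name}]} shape shown above. The module name manager_lookup and the function manager_pid/2 are hypothetical.

```erlang
-module(manager_lookup).
-export([manager_pid/2]).

%% Find the httpc_manager pid for a profile in a list shaped like the
%% result of inets:services_info().
manager_pid(Profile, ServicesInfo) ->
    Matches = [Pid || {httpc, Pid, Info} <- ServicesInfo,
                      is_list(Info),
                      proplists:get_value(profile, Info) =:= Profile],
    case Matches of
        [Pid | _] -> {ok, Pid};
        [] -> not_found
    end.
```

Usage would look like {ok, Pid} = manager_lookup:manager_pid(server1, inets:services_info()), after which Pid can be passed to sys:get_status/1.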


From the pid, get the status of the httpc_manager gen_server process.


sys:get_status(erlang:list_to_pid( "<0.52.0>")).


The ets table of interest is identified by the integer in the state tuple (24596 in the output below).


15> sys:get_status(erlang:list_to_pid("<0.52.0>")).
{status,<0.52.0>,
        {module,gen_server},
        [[{'$ancestors',[httpc_profile_sup,httpc_sup,inets_sup,
                         <0.36.0>]},
          {'$initial_call',{httpc_manager,init,1}}],
         running,<0.40.0>,[],
         [httpc_manager_server1,
          {state,[],24596,
                 {undefined,28693},
                 httpc_manager_server1_session_db,httpc_manager_server1,
                 {options,{undefined,[]},
                          0,2,5,120000,2,disabled,false,inet,default,...}},
          httpc_manager,infinity]]}


Get the content of that ets table to see the pipelined requests.


ets:tab2list(24596).


The application can tune these http options per profile to make better use of the network bandwidth and get the most out of the machine and the network.
