Monday, March 30, 2009

Sharing binary data and reference counting

This email thread discussion concludes that constants in Erlang code are stored in a constant pool rather than in process memory, so this data is not copied into process memory. This is more efficient when you know that the data is not going to change.


Configuration is an example of this kind of data. If the configuration is known at compile time, well and good. But what if the configuration is read from a file at run time? That data gets copied into process memory.

I got a suggestion to generate code at run time and load it. The Erlang VM then keeps these configuration elements in the constant pool, and the data is not copied into process memory.
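As a rough illustration of the idea, a generated module might look like the sketch below. The module name fetch_myapp_config, the keys, and the values are all hypothetical and not taken from my actual code. Each clause returns a literal term, which is the kind of data the VM keeps in the constant pool.

```erlang
%% Hypothetical generated module: the name, keys, and values are
%% illustrative assumptions only.
-module(fetch_myapp_config).
-export([fetch/1]).

%% Each clause returns a literal term known at compile time.
fetch(port) -> 8080;
fetch(hosts) -> ["a.example.com", "b.example.com"];
%% Unknown keys fall through to a default.
fetch(_Other) -> undefined.
```

A caller would then use fetch_myapp_config:fetch(port) to read a value.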

I decided to give it a try. I generated the code and stored it in a file.

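%% Name of the generated module, e.g. fetch_myapp_config for Module = myapp.
ConfigFetcher = list_to_atom("fetch_" ++ atom_to_list(Module) ++ "_config"),
%% Write the generated source into the application's priv directory.
FileName = filename:join(code:priv_dir(Module), atom_to_list(ConfigFetcher) ++ ".erl"),
file:write_file(FileName, GeneratedCode).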
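The snippet above does not show how GeneratedCode is built. One way to produce it, as a minimal sketch, is to format a list of {Key, Value} pairs into function clauses. The module config_gen and the function source/2 are hypothetical names, and the sketch assumes the values are plain data (atoms, numbers, strings, lists, tuples, binaries) that print as valid Erlang terms with ~p; pids, references, and funs would not work.

```erlang
%% Hypothetical helper: builds the source text for a config module.
-module(config_gen).
-export([source/2]).

%% ModName is the atom naming the generated module.
%% Pairs is a list of {Key, Value} tuples read at run time.
source(ModName, Pairs) when is_atom(ModName), is_list(Pairs) ->
    Header = io_lib:format("-module(~p).~n-export([fetch/1]).~n~n", [ModName]),
    %% One fetch/1 clause per configuration entry.
    Clauses = [io_lib:format("fetch(~p) -> ~p;~n", [K, V]) || {K, V} <- Pairs],
    %% Final clause for keys that are not configured.
    Fallback = "fetch(_) -> undefined.\n",
    iolist_to_binary([Header, Clauses, Fallback]).
```

With this helper, GeneratedCode could be obtained as config_gen:source(ConfigFetcher, Pairs), where Pairs is whatever was read from the configuration file.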

I compiled it using

compile:file(FileName, [{outdir, code:lib_dir(Module, ebin)}]).

Then I loaded the compiled module:

code:load_file(ConfigFetcher).

Let's say that fetch is a function exported by the generated module. Using this fetch function to look up the configuration value for a given key, I ran a small test to measure the performance improvement.

The numbers I saw were mind-boggling. With the old method I got 11k fetches per second. With the new method (code generation and loading), I got 500k fetches per second.
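A measurement of this kind can be sketched with timer:tc/1. This is not the exact test I ran; the module fetch_bench and the function per_second/2 are hypothetical names, and the figure it returns will vary with the machine and the VM version.

```erlang
%% Hypothetical micro-benchmark helper, not the original test harness.
-module(fetch_bench).
-export([per_second/2]).

%% Call Fun() N times and return the observed number of calls per second.
per_second(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> loop(Fun, N) end),
    %% Guard against a zero reading on very short runs.
    N * 1000000 / max(Micros, 1).

%% Simple tail-recursive loop that discards each result.
loop(_Fun, 0) -> ok;
loop(Fun, N) -> _ = Fun(), loop(Fun, N - 1).
```

For example, fetch_bench:per_second(fun() -> ConfigFetcher:fetch(port) end, 1000000) would time one million fetches of a hypothetical port key from the generated module.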
