5 How do I...

This section is intended for the impatient. Most of these questions would be resolved by working through the (on-line) Erlang manuals, but sometimes we just want a quick answer...

Keep in mind that the program fragments are intended to illustrate an idea, not to serve as reusable, robust modules!

5.1 ...compare numbers?

The operators for comparing numbers are >, >=, <, =<, == and =/= . Some of these are a little different to C for historical reasons. Here are some examples:

	Eshell V4.9.1  (abort with ^G)
	1> 13 > 2.
	true
	2> 18.2 >= 19.
	false
	3> 3 == 3.
	true
	4> 4 =/= 4.
	false
	5> 3 = 4.
	** exited: {{badmatch,4},[{erl_eval,expr,3}]} **

The last example is a (failed) pattern match rather than a comparison.
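One subtlety the shell session above does not show: == and /= compare an integer and a float by value, while the exact operators =:= and =/= also require the types to agree. A small sketch of the difference (the module name cmp_demo is made up for illustration):

```erlang
-module(cmp_demo).
-export([examples/0]).

%% Returns {Expression, Result} pairs for the four equality operators.
examples() ->
    [{"1 == 1.0",  1 == 1.0},    % true:  compared by value
     {"1 /= 1.0",  1 /= 1.0},    % false: the counterpart of ==
     {"1 =:= 1.0", 1 =:= 1.0},   % false: an integer is never exactly equal to a float
     {"1 =/= 1.0", 1 =/= 1.0}].  % true:  the counterpart of =:=
```

When only integers are involved, the two families of operators give the same answers.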

5.2 ...represent a text-string?

As a list of characters. You can write

	A = "hello world".

which is exactly the same as writing

	A = [104,101,108,108,111,32,119,111,114,108,100].

and also the same as writing

	A = [$h,$e,$l,$l,$o,$ ,$w,$o,$r,$l,$d].

Each character consumes 8 bytes of memory on a 32-bit machine (a 32-bit integer and a 32-bit pointer) and twice as much on 64-bit machines. Access to the nth element takes O(n) time. Access to the first element takes O(1) time, as does prepending a character.

The bit-syntax, which is described in the manual, provides an alternative, but somewhat limited, way to represent strings compactly, for instance:

	1> A = <<"this string consumes one byte per character">>.
	<<116,104,105,115,32,115,116,114,...>>
	2> size(A).
	43
	3> <<_:8/binary,Q:3/binary,R/binary>> = A.
	<<116,104,105,115,32,115,116,114,...>>
	4> Q.
	<<105,110,103>>
	5> binary_to_list(Q).
	"ing"

This style of pattern-matching manipulation of strings provides O(1) equivalents of some string operations which are O(N) when using the list representation. Some other operations which are O(1) in the list representation become O(N) when using the bit-syntax.

There are general ways to improve string performance.

5.3 ...convert a string to lower case?

	1> string:to_lower("Five Boxing WIZARDS").
	"five boxing wizards"
	2> string:to_upper("jump QuickLY").
	"JUMP QUICKLY"

If you have an older version of Erlang (prior to R11B-5), you can find these two functions in the httpd_util module.

5.4 ...convert a text string representation of a term to a term?

	1> {ok,Tokens,_} = erl_scan:string("[{foo,bar},x,3].").
	{ok,[{'[',1}, {'{',1}, {atom,1,foo}, {',',1}, {atom,1,bar},...
	2> {ok,Term} = erl_parse:parse_term(Tokens).
	{ok,[{foo,bar},x,3]}
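The two shell steps above can be wrapped in one function that also reports failures instead of crashing on a bad match. This is only a sketch; the module and function names are invented for the example:

```erlang
-module(term_demo).
-export([string_to_term/1]).

%% Parse a string such as "[{foo,bar},x,3]." into a term.
%% The trailing full stop is required by erl_parse:parse_term/1.
string_to_term(String) ->
    case erl_scan:string(String) of
        {ok, Tokens, _EndLocation} ->
            erl_parse:parse_term(Tokens);   % {ok, Term} | {error, ErrorInfo}
        {error, ErrorInfo, _ErrorLocation} ->
            {error, ErrorInfo}
    end.
```

Note that scanning creates any atoms named in the string, so this is probably not something to feed with untrusted input.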

5.5 ...use unicode/UTF-8?

Erlang represents strings as lists of integers, so it is directly capable of handling unicode. The unicode library module helps with converting between various representations of unicode.
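As a rough illustration, the sketch below converts a list of code points to a UTF-8 binary and back with the unicode module. The module name utf8_demo and the sample characters are arbitrary choices for the example:

```erlang
-module(utf8_demo).
-export([roundtrip/0]).

roundtrip() ->
    %% "h", e-acute (U+00E9) and the euro sign (U+20AC) as code points
    Chars = [104, 233, 8364],
    %% By default the input is taken as unicode and the output is UTF-8
    Utf8 = unicode:characters_to_binary(Chars),
    %% Converting back must give the original list (this match asserts it)
    Chars = unicode:characters_to_list(Utf8),
    {length(Chars), byte_size(Utf8)}.
```

The result is {3,6}: three characters which occupy one, two and three bytes respectively once encoded.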

5.6 ...destructively update data, like in C?

Not being able to do this is considered a feature of Erlang. The Erlang book explains this in chapter 1.

Having said that, there are common methods to achieve effects similar to destructive update: store the data in a process (use messages to manipulate it), store the data in ETS or use a data structure designed for relatively cheap updates (the dict library module is one common solution).
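The first of those methods, keeping the data in a process, might look something like the following counter. It is a bare-bones sketch with no supervision or error handling; the module name and the message formats are made up for the example:

```erlang
-module(counter).
-export([start/0, increment/1, value/1]).

%% Start a process whose loop argument holds the current count.
start() ->
    spawn(fun() -> loop(0) end).

%% Asynchronously add one to the counter.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Synchronously read the current value.
value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, Count} -> Count
    after 1000 ->
        timeout
    end.

%% The "variable" is updated by recursing with a new argument.
loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {value, From, Ref} ->
            From ! {Ref, Count},
            loop(Count)
    end.
```

From the caller's point of view the counter behaves like mutable state, yet no data is ever modified in place.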

5.7 ...run an Erlang program directly from the unix shell?

As of Erlang/OTP R11B-4, you can run Erlang code without compiling by using Escript. The manual page has a complete example showing how to do it.
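For a rough idea of what such a script looks like, here is a minimal sketch. It assumes that escript can be found via /usr/bin/env; the file name hello and the greeting are arbitrary:

```erlang
#!/usr/bin/env escript
%% hello: a minimal escript. escript calls main/1 with the
%% command line arguments as a list of strings.
main([]) ->
    io:format("hello, world~n");
main(Args) ->
    io:format("hello, ~s~n", [string:join(Args, " ")]).
```

Save it as hello, make it executable with chmod +x hello and run it as ./hello or ./hello Joe.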

Escript is intended for short, "script-like" programs. A more general way to do it, which also works for older Erlang releases, is to start the Erlang VM without a shell and pass more switches to the Erlang virtual machine. Here's hello world again:

	-module(hello).
	-export([hello_world/0]).

	hello_world() ->
		io:fwrite("hello, world\n").

Save this as hello.erl, compile it and run it directly from the unix (or msdos) command line:

	matthias >erl -compile hello
	matthias >erl -noshell -s hello hello_world -s init stop
	hello, world

5.8 ...communicate with non-Erlang programs?

Erlang has several mechanisms for communicating with programs written in other languages, each with different tradeoffs. The Erlang documentation includes a guide to interfacing with other languages which describes the most common mechanisms:

Distributed Erlang
Ports and linked-in drivers
EI (C interface to Erlang, replaces the mostly deprecated erl_interface)
Jinterface (Java interface to Erlang)
IC (IDL compiler, used with C or Java)
General TCP or UDP protocols, including ASN.1
NIFs (Functions implemented in C)

Some of the above methods are easier to use than others. Loosely coupled approaches, such as using a custom protocol over a TCP socket, are easier to debug than the more tightly coupled options such as linked-in drivers. The Erlang-questions mailing list archives contain many desperate cries for help with memory corruption problems caused by using Erl Interface and linked-in drivers incorrectly...
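To give a flavour of the port mechanism, the sketch below starts the unix program cat as an external process and uses it as an echo server. It assumes a unix-like system with cat on the path; the module name port_demo is invented, and for simplicity it assumes the reply arrives in one chunk, which is only reasonable for short inputs:

```erlang
-module(port_demo).
-export([echo/1]).

%% Send Data (a binary or iolist) to an external "cat" process
%% and return whatever it writes back on its standard output.
echo(Data) ->
    Port = open_port({spawn, "cat"}, [binary, stream, use_stdio]),
    port_command(Port, Data),
    receive
        {Port, {data, Reply}} ->
            port_close(Port),
            Reply
    after 5000 ->
        %% Give up if the external program stays silent
        port_close(Port),
        timeout
    end.
```

If the external program crashes, the Erlang side only sees the port close, which is one reason this kind of loose coupling is easy to debug.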

There are also a number of user-supported methods and tools.

5.9 ...write a unix pipe program in Erlang?

Lots of useful utilities in unix can be chained together by piping one program's output to another program's input. Here's an example which rot-13 encodes standard input and sends it to standard output:

	-module(rot13).
	-export([rot13/0]).

	rot13() ->
	  case io:get_chars('', 8192) of
	    eof ->
	      init:stop();
	    Text ->
	      Rotated = [rot13(C) || C <- Text],
	      io:put_chars(Rotated),
	      rot13()
	  end.

	rot13(C) when C >= $a, C =< $z -> $a + (C - $a + 13) rem 26;
	rot13(C) when C >= $A, C =< $Z -> $A + (C - $A + 13) rem 26;
	rot13(C) -> C.

After compiling this, you can run it from the command line:

	matthias> cat ~/.bashrc | erl -noshell -s rot13 rot13 | wc

5.10 ...communicate with a DBMS?

There are two completely different ways to communicate with a database. One way is for Erlang to act as a database for another system. The other way is for an Erlang program to access a database. The former is a "research" topic. The latter is easily accomplished by going via ODBC, which allows you to access almost any commercial DBMS. The OTP ODBC manual explains how to do this.

5.11 ...decode binary Erlang terms?

Erlang terms can be converted to and from binary representations using BIFs (built-in functions):

	1> term_to_binary({3, abc, "def"}).
	<<131,104,3,97,3,100,0,3,97,98,99,107,0,3,100,101,102>>
	2> binary_to_term(v(1)).
	{3,abc,"def"}

If you want to decode them using C programs, take a look at EI.

5.12 ...decode Erlang crash dumps?

Erlang crash dumps provide information about the state of the system when the emulator crashed. The manual explains how to interpret them.

5.13 ...estimate performance of an Erlang system?

Mike Williams, one of Erlang's original developers, is fond of saying

"If you don't run experiments before you start designing a new system, your entire system will be an experiment!"

This philosophy is widespread around Erlang projects, in part because the Erlang development environment encourages development by prototyping. Such prototyping will also allow sensible performance estimates to be made.

For those of you who want to leverage experience with C and C++, some rough rules of thumb are:

Code which involves mainly number crunching and data processing will run about 10 times slower than an equivalent C program. This includes almost all "micro benchmarks".
Large systems which spend most of their time communicating with other systems, recovering from faults and making complex decisions run at least as fast as equivalent C programs.

Like in any other language or system, experienced developers develop a sense of which operations are expensive and which are cheap. Erlang newcomers accustomed to the relatively slow interprocess communication facilities in other languages tend to over-estimate the cost of creating Erlang processes and passing messages between them.

5.14 ...measure performance of an Erlang system?

The timer module measures the wall clock time elapsed during execution of a function. The result is a tuple whose first element is the elapsed time in microseconds:

	7> timer:tc(lists, reverse, ["hello world"]).
	{27,"dlrow olleh"}
	8> timer:tc(lists, reverse, ["hello world this is a longer string"]).
	{34,"gnirts regnol a si siht dlrow olleh"}
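A single measurement of a few microseconds is very noisy, so it usually makes sense to repeat the call many times and average. A possible sketch (the module and function names are invented for the example):

```erlang
-module(bench).
-export([average_us/4]).

%% Run apply(M, F, Args) N times and return the mean wall clock
%% time per call in microseconds, the unit timer:tc/3 reports.
average_us(M, F, Args, N) when is_integer(N), N > 0 ->
    Times = [element(1, timer:tc(M, F, Args)) || _ <- lists:seq(1, N)],
    lists:sum(Times) / N.
```

For example, bench:average_us(lists, reverse, ["hello world"], 1000) returns a float. The exact figure depends on the machine and the load, so treat it as a rough guide only.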

The eprof module (part of the tools application) provides a way to profile a system.

5.15 ...measure memory consumption in an Erlang system?

Memory consumption is a bit of a tricky issue in Erlang. Usually, you don't need to worry about it because the garbage collector looks after memory management for you. But, when things go wrong, there are several sources of information. Starting from the most general:

Some operating systems provide detailed information about process memory use with tools like top, ps or the Linux /proc filesystem:

	cat /proc/5898/status

	VmSize:     7660 kB
	VmLck:         0 kB
	VmRSS:      5408 kB
	VmData:     4204 kB
	VmStk:        20 kB
	VmExe:       576 kB
	VmLib:      2032 kB

This gives you a rock-solid upper-bound on the amount of memory the entire Erlang system is using.

erlang:system_info reports interesting things about some globally allocated structures in bytes:

	3> erlang:system_info(allocated_areas).
	[{static,390265},
	 {atom_space,65544,49097},
	 {binary,13866},
	 {atom_table,30885},
	 {module_table,944},
	 {export_table,16064},
	 {register_table,240},
	 {loaded_code,1456353},
	 {process_desc,16560,15732},
	 {table_desc,1120,1008},
	 {link_desc,6480,5688},
	 {atom_desc,107520,107064},
	 {export_desc,95200,95080},
	 {module_desc,4800,4520},
	 {preg_desc,640,608},
	 {mesg_desc,960,0},
	 {plist_desc,0,0},
	 {fixed_deletion_desc,0,0}]

Information about individual processes can be obtained from erlang:process_info/1 or erlang:process_info/2:

	2> erlang:process_info(self(), memory).
	{memory,1244}

The shell's i() and the pman tool (superseded by observer in later releases) also give useful overview information.

Don't expect the sum of the results from process_info and system_info to add up to the total memory use reported by the operating system. The Erlang runtime also uses memory for other things.

A typical approach when you suspect you have memory problems is:

1. Confirm that there really is a memory problem by checking that memory use as reported by the operating system is unexpectedly high.

2. Use pman or the shell's i() command to make sure there isn't an out-of-control Erlang process on the system. Out-of-control processes often have enormous message queues. A common reason for Erlang processes to get unexpectedly large is an endlessly looping function which isn't tail recursive.

3. Check the amount of memory used for binaries (reported by system_info). Normal data in Erlang is put on the process heap, which is garbage collected. Large binaries, on the other hand, are reference counted. This has two interesting consequences. Firstly, binaries don't count towards a process' memory use. Secondly, a lot of memory can be allocated in binaries without causing a process' heap to grow much. If the heap doesn't grow, it's likely that there won't be a garbage collection, which may cause binaries to hang around longer than expected. A strategically-placed call to erlang:garbage_collect() will help.

4. If all of the above have failed to find the problem, start the Erlang runtime system with the -instr switch.
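Steps 2 and 3 above can also be done by hand from the shell. The sketch below lists the processes with the most memory or the longest message queues. For the binary figure it uses erlang:memory(binary) rather than system_info, simply because it returns a single number. The module and function names are invented for the example:

```erlang
-module(mem_demo).
-export([top/2, binary_bytes/0]).

%% Return the N processes with the largest value of Item, where Item
%% is memory (in bytes) or message_queue_len, as {Value, Pid} pairs.
top(Item, N) when Item =:= memory; Item =:= message_queue_len ->
    %% process_info/2 returns undefined for a process that has just
    %% died; the {_, Value} pattern silently skips those.
    Pairs = [{Value, Pid} || Pid <- erlang:processes(),
                             {_, Value} <- [erlang:process_info(Pid, Item)]],
    lists:sublist(lists:reverse(lists:sort(Pairs)), N).

%% Bytes currently allocated for binaries in the whole system.
binary_bytes() ->
    erlang:memory(binary).
```

Calling mem_demo:top(message_queue_len, 5) a few times in a row is a quick way to spot a queue that only ever grows.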

5.16 ...estimate productivity in an Erlang project?

A rough rule of thumb is that about the same number of lines of code are produced per developer as in a C project. A reasonably complex problem involving distribution and fault tolerance will be roughly five times shorter in Erlang than in C.

The traditional ways of slowing down projects, like adding armies of consultants halfway through, spending a year writing detailed design specifications before any code is written, rigidly following a waterfall model, spreading development across several countries and holding team meetings to decide on the colour of the serviettes used at lunch, work just as well for Erlang as for other languages.

5.17 ...use multiple CPUs (or cores) in one server?

Erlang has SMP support on all major platforms and it's enabled by default.
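The runtime only spreads work across cores when there are several runnable processes, so a program has to be written with some concurrency in it to benefit. As an illustration, here is a naive parallel map which spawns one process per list element. It is a sketch only (no error handling or limit on the number of processes) and the module name smp_demo is made up:

```erlang
-module(smp_demo).
-export([schedulers/0, pmap/2]).

%% Number of scheduler threads the VM is running (normally one per core).
schedulers() ->
    erlang:system_info(schedulers).

%% Apply F to every element of List, one process per element,
%% and collect the results in the original order.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Receiving on each reference in turn restores the list order
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

For small functions the cost of spawning and messaging outweighs the gain, so this is only worthwhile when F does a fair amount of work.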

5.18 ...run distributed Erlang through a firewall?

The simplest approach is to restrict, in advance, the range of TCP ports that distributed Erlang uses by setting the (undocumented) kernel variables 'inet_dist_listen_min' and 'inet_dist_listen_max'. Example:

	application:set_env(kernel, inet_dist_listen_min, 9100).
	application:set_env(kernel, inet_dist_listen_max, 9105).

This forces Erlang to use only ports 9100 to 9105 for distributed Erlang traffic. In the above example, you would then need to configure your firewall to pass ports 9100 to 9105 as well as port 4369 (for the Erlang port mapper daemon, epmd).
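One thing to watch out for: the set_env calls only help if they run before the node starts distribution. If the node is started with -sname or -name, the listening port has already been chosen by the time you reach the shell. A way around this is to put the same settings in a configuration file. The file name dist.config and the node name mynode below are just examples:

```erlang
%% dist.config: start the node with
%%   erl -sname mynode -config dist
[{kernel,
  [{inet_dist_listen_min, 9100},
   {inet_dist_listen_max, 9105}]}].
```

The same values can also be given directly on the command line, as in erl -sname mynode -kernel inet_dist_listen_min 9100 inet_dist_listen_max 9105.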

There are other approaches, such as tunnelling the information through SSH or writing your own distribution handler.

5.19 ...distribute the Erlang programs I write to my friends/colleagues/users?

Erlang programs only run on the Erlang VM, so every machine which is going to run an Erlang program needs to have a copy of the Erlang runtime installed.

Installing the entire Erlang system from erlang.org (or, perhaps, indirectly via a packaging system such as Debian's or BSD's) is the simplest option in many cases.

A historical footnote from 2007 is SAE: stand-alone Erlang. SAE allowed an Erlang program to be distributed as just two files, totalling about 500k. SAE fell into disuse and is no longer maintained.

5.20 ...write to standard error (stderr)?

In R13B and later, use the atom standard_error:

	io:put_chars(standard_error, "this text goes to stderr\n").

That also works with io:format/3 and all other functions that take an IoDevice parameter.
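For instance, a small helper built on io:format/3 might look like this (the module name, function name and the "warning: " prefix are invented for the example):

```erlang
-module(stderr_demo).
-export([warn/2]).

%% Write a formatted warning to stderr; stdout is left untouched.
warn(Format, Args) ->
    io:format(standard_error, "warning: " ++ Format ++ "~n", Args).
```

Calling stderr_demo:warn("disk ~p% full", [93]) writes a line to stderr and returns ok, which is handy in pipe programs where stdout carries the real output.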