
Advanced Topics



Scope of Variables

Variables in a clause exist between the point where the variable is first bound and the last textual reference to the variable.

Consider the following code:

1...	f(X) ->
2...		Y = g(X),
3...		h(Y, X),
4...		p(Y),
5...		f(12).
  • line 1 - the variable X is defined (i.e. it becomes bound when the function is entered).
  • line 2 - X is used, Y is defined (first occurrence).
  • line 3 - X and Y are used.
  • line 4 - Y is used. The space used by the system for storing X can be reclaimed.
  • line 5 - the space used for Y can be reclaimed.

Scope of variables in if/case/receive

The set of variables introduced in the different branches of an if/case/receive form must be the same for all branches in the form, unless the missing variables are not referred to after the form.

f(X) ->
    case g(X) of
	true -> A = h(X), B = 7;
	false -> B = 6
    end,
    ...,
    h(A),
    ...
If the true branch of the form is evaluated, the variables A and B become defined, whereas in the false branch only B is defined.

Whether or not this is an error depends upon what happens after the case form. In this example it is an error: a later reference is made to A in the call h(A), and if the false branch of the case form had been evaluated then A would be undefined.
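
One way to make such a form safe is to bind the same variables in every branch. The following is a minimal sketch, not the only fix: the module name scope_demo and the bodies of g/1 and h/1 are hypothetical stand-ins for the functions the example leaves undefined.

```erlang
-module(scope_demo).
-export([f/1]).

%% g/1 and h/1 are hypothetical placeholders; the text leaves them undefined.
g(X) -> X > 0.
h(X) -> X * 2.

f(X) ->
    %% Both branches bind A and B, so both are safe to use after the case.
    case g(X) of
        true  -> A = h(X), B = 7;
        false -> A = 0,    B = 6
    end,
    {h(A), B}.
```

An alternative with the same effect is to let the case form return a tuple and bind it once, as in {A, B} = case ... end.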



Catch and Throw

Suppose we have defined the following:


-module('try').
-export([foo/1]).

foo(1) -> hello;
foo(2) -> throw({myerror, abc});
foo(3) -> tuple_to_list(a);
foo(4) -> exit({myExit, 222}).
The module name is quoted because try is a reserved word in current Erlang.

'try':foo(1) evaluates to hello.

'try':foo(2) tries to evaluate throw({myerror, abc}) but no catch exists. The process evaluating foo(2) exits and the signal {'EXIT',Pid,nocatch} is broadcast to the link set of the process.

'try':foo(3) broadcasts {'EXIT',Pid,badarg} signals to all linked processes.

'try':foo(4) exits; since no catch is set, the signal {'EXIT',Pid,{myExit,222}} is broadcast to all linked processes.

'try':foo(5) broadcasts the signal {'EXIT',Pid,function_clause} to all linked processes.

catch 'try':foo(1) evaluates to hello.
catch 'try':foo(2) evaluates to {myerror,abc}.
catch 'try':foo(3) evaluates to {'EXIT',badarg}.
catch 'try':foo(4) evaluates to {'EXIT',{myExit,222}}.
catch 'try':foo(5) evaluates to {'EXIT',function_clause}.

The reasons are shown in their basic form. In current Erlang/OTP releases the reason of a run-time error such as badarg or function_clause is paired with a stack trace, for example {'EXIT',{badarg,StackTrace}}.
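
The catch results can be checked with a small sketch like the one below. It assumes a current Erlang/OTP release, where run-time errors carry a stack trace, so it matches on {badarg, _} rather than the bare atom. The module name catch_demo and the helper classify/1 are illustrative assumptions, not part of the course code.

```erlang
-module(catch_demo).
-export([foo/1, classify/1]).

foo(1) -> hello;
foo(2) -> throw({myerror, abc});
foo(3) -> tuple_to_list(a);          % always fails with badarg
foo(4) -> exit({myExit, 222}).

%% classify/1 is a hypothetical helper that summarises `catch foo(N)`.
classify(N) ->
    case catch foo(N) of
        {'EXIT', {badarg, _Stack}}          -> {error, badarg};
        {'EXIT', {function_clause, _Stack}} -> {error, function_clause};
        {'EXIT', Reason}                    -> {exit, Reason};
        %% A normal return value or a thrown term; catch cannot tell them apart.
        Other                               -> {result, Other}
    end.
```

Note that the thrown term from foo(2) and the normal value from foo(1) both arrive as plain results, which is why planned exceptions are usually tagged tuples.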



Use of Catch and Throw

Catch and throw can be used to:
  • Protect from bad code
  • Cause non-local return from a function
Example:
f(X) ->
    case catch func(X) of
	{'EXIT', Why} ->
            ... error in BIF ....
            ........ BUG............
	{exception1, Args} ->
            ... planned exception ....
	Normal ->
            .... normal case ....
    end.

func(X) ->
    ...

func(X) ->
   bar(X),
   ...
...

bar(X) ->
   throw({exception1, ...}).
...
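
As a concrete instance of the skeleton above, the following sketch uses throw to leave lists:foreach/2 as soon as a match is found. The module name throw_demo and the functions find_first/2 and search/2 are hypothetical names chosen for illustration.

```erlang
-module(throw_demo).
-export([find_first/2]).

%% Return {found, X} for the first X satisfying Pred, or not_found.
find_first(Pred, List) ->
    case catch search(Pred, List) of
        {'EXIT', Why} -> {error, Why};   % unplanned run-time error in Pred
        {found, X}    -> {found, X};     % planned exception
        _Normal       -> not_found       % foreach ran to the end
    end.

search(Pred, List) ->
    lists:foreach(fun(X) ->
                      case Pred(X) of
                          true  -> throw({found, X});  % non-local return
                          false -> ok
                      end
                  end, List).
```

For example, find_first(fun(X) -> X rem 2 =:= 0 end, [1,3,4,6]) stops at 4 without visiting 6.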



The module error_handler

The module error_handler is called when an undefined function is called.

If a call is made to Mod:Func(Arg0,...,ArgN) and no code exists for this function then
undefined_function(Mod, Func, [Arg0,...,ArgN]) in the module error_handler will be called. The code in error_handler is almost like this:


-module(error_handler).
-export([undefined_function/3]).

undefined_function(Module, Func, Args) ->
    case code:is_loaded(Module) of
        {file, _} ->
            %% Module is loaded but not the function
            ...
            exit({undefined_function, {Module, Func, Args}});
        false ->
            case code:load_file(Module) of
                {module, _} ->
                    apply(Module, Func, Args);
                {error, _} ->
                    ....
            end
    end.
By evaluating process_flag(error_handler, MyMod) the user can define a private error handler. In this case the function MyMod:undefined_function will be called instead of error_handler:undefined_function.

Note: This is extremely dangerous.



The Code loading mechanism

Consider the following:

-module(m).
-export([start/0,server/0]).

start() ->
    spawn(m,server,[]).

server() ->
    receive
	Message ->
            do_something(Message),
            m:server()
    end.
When the function m:server() is called then a call is made to the latest version of code for this module.

If the call had been written as follows:

server() ->
    receive
	Message ->
            do_something(Message),
            server()
    end.   
Then a call would have been made to the current version of the code for this module.

Prefixing the module name (i.e. using the : form of call) allows the user to change the executing code on the fly.

The rules for evaluation are as follows:

  • The recursive call must have the module prefix ( m:server() ) if we want to change the executing code on the fly.
  • Without the prefix, the executing code will not be exchanged for the new version.
  • We can't have more than two versions of the same module in the system at the same time.
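
A minimal sketch of a server loop that follows these rules is shown below. The module name reload_demo, the count/1 helper and the message shapes are assumptions made for the example; ?MODULE:loop(...) is the fully qualified form, equivalent to writing reload_demo:loop(...).

```erlang
-module(reload_demo).
-export([start/0, loop/1, count/1]).

start() ->
    spawn(?MODULE, loop, [0]).

%% Ask the server how many requests it has handled so far.
count(Pid) ->
    Pid ! {self(), count},
    receive
        {Pid, N} -> N
    after 1000 -> timeout
    end.

loop(N) ->
    receive
        {From, count} ->
            From ! {self(), N},
            %% Fully qualified call: the next iteration runs in the
            %% latest loaded version of this module.
            ?MODULE:loop(N + 1);
        stop ->
            ok
    end.
```

If a new version of the module is compiled and loaded while the server is waiting in receive, the next request is answered by the old code and the following iteration continues in the new code. loop/1 has to be exported for the qualified call to work.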



Ports

Ports:
  • Provide byte stream interfaces to external UNIX processes.
  • Look like normal Erlang processes, that are not trapping exits, with a specific protocol. That is, they can be linked to, and send out/react to exit signals.
  • Communicate with a single Erlang process; this process is said to be connected.
The command:
    Port = open_port({spawn, Process}, [{packet, 2}])
starts an external UNIX process. This process reads commands from Erlang on file descriptor 0 and sends commands to Erlang by writing to file descriptor 1.
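
As a rough illustration, the sketch below opens a port to the standard UNIX program cat, which copies its input to its output unchanged, so whatever Erlang sends comes straight back. It assumes a UNIX-like system with cat on the PATH; the module name port_demo, the function echo/1 and the 2-second timeout are choices made for the example.

```erlang
-module(port_demo).
-export([echo/1]).

%% Send Data to an external `cat` process and return what comes back.
echo(Data) when is_binary(Data) ->
    %% binary: deliver replies as binaries instead of lists of bytes.
    Port = open_port({spawn, "cat"}, [{packet, 2}, binary]),
    Port ! {self(), {command, Data}},
    Reply = receive
                {Port, {data, Bytes}} -> Bytes
            after 2000 ->
                timeout
            end,
    port_close(Port),
    Reply.
```

Because cat also echoes the length header that Erlang prepends, the reply is a well-formed packet and arrives as a single {Port, {data, Bytes}} message.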



Port Protocols

Data is passed as a sequence of bytes between the Erlang processes and the external UNIX processes. The number of bytes passed is given in a 2-byte length field.
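
The framing used with {packet, 2} can be sketched in Erlang itself with the bit syntax: a 16-bit big-endian length followed by that many bytes of data. The port driver does this automatically; the module packet_demo and the functions frame/1 and unframe/1 below are hypothetical and only show what travels over the pipe.

```erlang
-module(packet_demo).
-export([frame/1, unframe/1]).

%% Prefix the payload with its length as a 2-byte big-endian integer.
frame(Payload) when is_binary(Payload), byte_size(Payload) < 65536 ->
    <<(byte_size(Payload)):16/big, Payload/binary>>.

%% Take one complete packet off the front of a byte stream.
unframe(<<Len:16/big, Payload:Len/binary, Rest/binary>>) ->
    {ok, Payload, Rest};
unframe(Incomplete) when is_binary(Incomplete) ->
    %% Header or body not fully received yet.
    {more, Incomplete}.
```

An external program has to do the same on its side: read two bytes, compute the length, then keep reading until that many bytes have arrived.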

[Figure: Erlang process passing data to a UNIX process]

The C program should check the return value from read. See p. 259 in the book for more information.



Binaries

  • A binary is a reference to a chunk of untyped memory.
  • Binaries are primarily used for code loading over the network.
  • Also useful when applications want to shuffle around large amounts of raw data.
  • Several BIFs exist for manipulating binaries, such as binary_to_term/1, term_to_binary/1, binary_to_list/1, split_binary/2, concat_binary/1, etc.
  • open_port/2 can produce and send binaries.
  • There is also a guard test, is_binary(B) (written binary(B) in older releases), which succeeds if its argument is a binary.
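
A few of these BIFs in use, as a small sketch. The module name binary_demo and the functions roundtrip/1 and head/2 are made up for the example.

```erlang
-module(binary_demo).
-export([roundtrip/1, head/2]).

%% Encode any term as a binary and decode it again.
roundtrip(Term) ->
    Bin = term_to_binary(Term),
    true = is_binary(Bin),
    binary_to_term(Bin).

%% Return the first N bytes of a binary as a list of integers.
head(Bin, N) when is_binary(Bin) ->
    {First, _Rest} = split_binary(Bin, N),
    binary_to_list(First).
```

For instance, head(<<"hello">>, 2) gives the list [104,101], which the shell prints as "he".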



References

References are Erlang objects with exactly two properties:
  • They can be created by a program (using make_ref/0), and,
  • They can be compared for equality.

Erlang references are unique: the system guarantees that no two references created by different calls to make_ref will ever match. The guarantee is not 100% - but differs from 100% by an insignificantly small amount :-).

References can be used for writing a safe remote procedure call interface, for example:


ask(Server, Question) ->
    Ref = make_ref(),
    Server ! {self(), Ref, Question},
    receive
        {Ref, Answer} ->
	    Answer
    end.

server(Data) ->
    receive
	{From, Ref, Question} ->
            Reply = func(Question, Data),
            From ! {Ref, Reply},
            server(Data);
	...
    end.
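
A runnable version of this interface might look as follows. It is a sketch under several assumptions: the module name ref_demo, the start/1 helper, the stop message, the 1-second timeout in ask/2 (the code above waits forever) and the body of func/2, which here simply looks the question up in a key-value list.

```erlang
-module(ref_demo).
-export([start/1, ask/2, server/1]).

start(Data) ->
    spawn(?MODULE, server, [Data]).

ask(Server, Question) ->
    Ref = make_ref(),
    Server ! {self(), Ref, Question},
    receive
        %% Only a reply tagged with our own Ref is accepted.
        {Ref, Answer} -> Answer
    after 1000 ->
        timeout
    end.

server(Data) ->
    receive
        {From, Ref, Question} ->
            From ! {Ref, func(Question, Data)},
            server(Data);
        stop ->
            ok
    end.

%% Hypothetical func/2: Data is a list of {Key, Value} pairs.
func(Question, Data) ->
    proplists:get_value(Question, Data, unknown).
```

Because Ref is fresh for every call, a stale or unrelated message in the caller's mailbox cannot be mistaken for the answer.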



Space Saving Optimisations

Here are two ways of computing the sum of a set of numbers contained in a list. The first is a recursive routine:
sum([H|T]) ->
    H + sum(T);
sum([]) ->
    0.
Note that we cannot evaluate '+' until both its arguments are known. This formulation of sum(X) evaluates in space O(length(X)).

The second is a tail-recursive routine which makes use of an accumulator Acc:

sum(X) ->
    sum(X, 0).

sum([H|T], Acc) ->
   sum(T, H + Acc);
sum([], Acc) ->
    Acc.
The tail-recursive formulation of sum(X) evaluates in constant space.

Tail recursive = the last thing the function does is to call itself.
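
Both formulations side by side in one module, so they can be compared. The names sum_demo, sum_rec/1 and sum_tail/1 are chosen for this sketch; the bodies are the two routines from the text.

```erlang
-module(sum_demo).
-export([sum_rec/1, sum_tail/1]).

%% Body recursive: the '+' waits for the recursive call to return,
%% so every element leaves a pending addition on the stack.
sum_rec([H|T]) ->
    H + sum_rec(T);
sum_rec([]) ->
    0.

%% Tail recursive: the running total travels in Acc and the
%% recursive call is the last thing the clause does.
sum_tail(X) ->
    sum_tail(X, 0).

sum_tail([H|T], Acc) ->
    sum_tail(T, H + Acc);
sum_tail([], Acc) ->
    Acc.
```

Both return the same result; they differ only in how much stack they need while running.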



Last Call Optimisation

The last call optimisation must be used in persistent servers.

For example:

server(Data) ->
    receive
	{From, Info} ->
            Data1 = process_info(From, Info, Data),
            server(Data1);
	{From, Ref, Query} ->
             {Reply, Data1} = process_query(From, Query,Data),
             From ! {Ref, Reply},
             server(Data1)
    end.
Note that the last thing to be done in any thread of computation must be to call the server.



Process Dictionary

Each process has a local store called the "Process Dictionary". The following BIFs are used to manipulate the process dictionary:
  • get() returns the entire process dictionary.
  • get(Key) returns the item associated with Key (Key is any Erlang data structure), or, returns the special atom undefined if no value is associated with Key.
  • put(Key, Value) associates Value with Key. Returns the old value associated with Key, or, undefined if no such association exists.
  • erase() erases the entire process dictionary. Returns the entire process dictionary before it was erased.
  • erase(Key) erases the value associated with Key. Returns the old value associated with Key, or, undefined if no such association exists.
  • get_keys(Value) returns a list of all keys whose associated value is Value.
Note that using the Process Dictionary:
  • Destroys referential transparency
  • Makes debugging difficult
  • Survives Catch/Throw
So:
  • Use with care
  • Do not over use - try the clean version first
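
The return values of these BIFs can be seen in a short sketch. It runs in a freshly spawned process so that the dictionary starts out empty and the caller's own dictionary is left alone; the module name pd_demo and the keys used are arbitrary choices for the example.

```erlang
-module(pd_demo).
-export([run/0]).

%% Run the demo in a new process and return its result.
run() ->
    Parent = self(),
    Pid = spawn(fun() -> Parent ! {self(), demo()} end),
    receive
        {Pid, Result} -> Result
    after 1000 ->
        timeout
    end.

demo() ->
    undefined = put(colour, red),        % no previous value
    red = put(colour, blue),             % put/2 returns the old value
    blue = get(colour),
    undefined = get(shape),              % nothing stored under shape
    undefined = put(sky, blue),
    Keys = lists:sort(get_keys(blue)),   % every key whose value is blue
    blue = erase(colour),                % erase/1 returns the old value
    Rest = erase(),                      % what was left before erasing
    {Keys, Rest}.
```

Every call here depends on what earlier calls stored, which is exactly the loss of referential transparency noted above.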



Obtaining System Information

The following calls exist to access system information:
  • processes() returns a list of all processes currently known to the system.
  • process_info(Pid) returns a dictionary containing information about Pid.
  • Module:module_info() returns a dictionary containing information about the code in module Module.
If you use these BIFs remember:
  • Use with extreme care
  • Don't assume fixed positions for items in the dictionaries.
But you can do some fun things like:
  • Writing real filthy programs, e.g. message sending by remote polling of dictionaries. (Why should anybody want to do this?)
  • Killing random processes
  • Writing metasystem programs
  • Polling the system regularly for zombie processes
  • Polling the system to detect or break deadlock
  • Analysing system performance

