Advanced Topics

Scope of variables
Catch/throw
Use of Catch and Throw
The module error_handler
The Code loading mechanism
Ports
Port Protocols
Binaries
References
Space saving optimisations
Last Call Optimisation
Process Dictionary
Obtaining System Information

Scope of Variables

Variables in a clause exist between the point where the variable is first bound and the last textual reference to the variable.

Consider the following code:

1...	f(X) ->
2...		Y = g(X),
3...		h(Y, X),
4...		p(Y),
5...		f(12).

line 1 - the variable X is defined (i.e. it becomes bound when the function is entered).
line 2 - X is used, Y is defined (first occurrence).
line 3 - X and Y are used.
line 4 - Y is used. The space used by the system for storing X can be reclaimed.
line 5 - the space used for Y can be reclaimed.

Scope of variables in if/case/receive

The set of variables introduced in the different branches of an if/case/receive form must be the same for all branches in the form except if the missing variables are not referred to after the form.

f(X) ->
    case g(X) of
	true -> A = h(X), B = 7;
	false -> B = 6
    end,
    ...,
    h(A),
    ...

If the true branch of the form is evaluated, the variables A and B become defined, whereas in the false branch only B is defined.

Whether or not this is an error depends upon what happens after the case form. In this example it is an error: a later reference is made to A in the call h(A), and if the false branch of the case form had been evaluated then A would have been undefined.
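
One way to make the example legal is to let the case form return every value that is needed afterwards, so that A and B are bound on both paths. The following is only a sketch: g/1 and h/1 are hypothetical stand-ins (the original leaves them undefined), and the value 0 chosen for A in the false branch is an arbitrary assumption.

```erlang
-module(scope_demo).
-export([f/1]).

%% Hypothetical helpers, invented for this illustration only.
g(X) -> X > 0.
h(X) -> X * 2.

f(X) ->
    %% Both branches produce a pair, so A and B are always bound
    %% after the case form, whichever branch was taken.
    {A, B} = case g(X) of
                 true  -> {h(X), 7};
                 false -> {0, 6}
             end,
    h(A) + B.
```

With these stand-ins, scope_demo:f(1) takes the true branch and scope_demo:f(-1) takes the false branch; both compile without an "unsafe variable" complaint.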

Catch and Throw

Suppose we have defined the following:

-module(try).
-export([foo/1]).

foo(1) -> hello;
foo(2) -> throw({myerror, abc});
foo(3) -> tuple_to_list(a);
foo(4) -> exit({myExit, 222}).

try:foo(1) evaluates to hello.

try:foo(2) tries to evaluate throw({myerror, abc}) but no catch exists. The process evaluating foo(2) exits and the signal {'EXIT',Pid,nocatch} is broadcast to the link set of the process.

try:foo(3) broadcasts {'EXIT', Pid, badarg} signals to all linked processes.

try:foo(4) evaluates exit({myExit, 222}); since no catch is set, the signal {'EXIT',Pid,{myExit, 222}} is broadcast to all linked processes.

try:foo(5) broadcasts the signal {'EXIT',Pid,function_clause} to all linked processes.

catch try:foo(1) evaluates to hello.
catch try:foo(2) evaluates to {myerror,abc}.
catch try:foo(3) evaluates to {'EXIT',badarg}.
catch try:foo(4) evaluates to {'EXIT',{myExit,222}}.
catch try:foo(5) evaluates to {'EXIT',function_clause}.
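
A minimal sketch for trying these cases on a current Erlang/OTP release. Two details differ there from the listing above: try is now a reserved word, so the module is given the hypothetical name try_demo, and a run-time error caught with catch arrives together with a stack trace, in the shape {'EXIT',{Reason,Stack}}. The helper probe/1 and the tags crashed, exited and value are inventions of this sketch.

```erlang
-module(try_demo).
-export([foo/1, probe/1]).

%% Same clauses as the module in the text.
foo(1) -> hello;
foo(2) -> throw({myerror, abc});
foo(3) -> tuple_to_list(a);
foo(4) -> exit({myExit, 222}).

%% Classify what catch returns for foo(N).
probe(N) ->
    case catch foo(N) of
        %% Run-time errors (badarg, function_clause, ...) carry a stack trace.
        {'EXIT', {Reason, Stack}} when is_list(Stack) -> {crashed, Reason};
        %% Explicit exit(Reason) calls carry only the reason.
        {'EXIT', Reason} -> {exited, Reason};
        %% Normal results and thrown terms come back unchanged.
        Other -> {value, Other}
    end.
```

Note that catch cannot tell a thrown term from a normal return value; both end up in the last clause.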

Use of Catch and Throw

Catch and throw can be used to:

Protect from bad code
Cause non-local return from a function

Example:

f(X) ->
    case catch func(X) of
	{'EXIT', Why} ->
            ... error in BIF ....
            ........ BUG............
	{exception1, Args} ->
            ... planned exception ....
	Normal ->
            .... normal case ....
    end.

func(X) ->
    ...

func(X) ->
   bar(X),
   ...
...

bar(X) ->
   throw({exception1, ...}).
...
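
The non-local return can be sketched with a small search function. This is an illustration only: the module name throw_demo, the function find_first/2 and the {found, Item} term are assumptions made here, not part of the original example.

```erlang
-module(throw_demo).
-export([find_first/2]).

%% Return {found, Item} for the first Item satisfying Pred, else not_found.
find_first(Pred, List) ->
    Search = fun(Item) ->
                     case Pred(Item) of
                         %% Leave lists:foreach/2 immediately.
                         true  -> throw({found, Item});
                         false -> ok
                     end
             end,
    case catch lists:foreach(Search, List) of
        {found, Hit}  -> {found, Hit};   % planned exception
        ok            -> not_found;      % foreach ran to completion
        {'EXIT', Why} -> exit(Why)       % a real bug: pass it on
    end.
```

For example, throw_demo:find_first(fun(N) -> N > 3 end, [1,2,4,5]) stops at 4 without visiting 5.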

The module error_handler

The module error_handler is called when an undefined function is called.

If a call is made to Mod:Func(Arg0,...,ArgN) and no code exists for this function, then undefined_function(Mod, Func, [Arg0,...,ArgN]) in the module error_handler will be called. The code in error_handler is almost like this:

-module(error_handler).
-export([undefined_function/3]).

undefined_function(Module, Func, Args) ->
    case code:is_loaded(Module) of
        {file, _} ->
            %% Module is loaded but not the function
            ...
            exit({undefined_function, {Module, Func, Args}});
        false ->
            %% Module is not loaded yet: try to load it and retry the call
            case code:load_file(Module) of
                {module, _} ->
                    apply(Module, Func, Args);
                {error, _} ->
                    ....
            end
    end.

By evaluating process_flag(error_handler, MyMod) the user can define a private error handler. In this case the function MyMod:undefined_function will be called instead of error_handler:undefined_function.

Note: This is extremely dangerous.

The Code loading mechanism

Consider the following:

-module(m).
-export([start/0,server/0]).

start() ->
    spawn(m,server,[]).

server() ->
    receive
	Message ->
            do_something(Message),
            m:server()
    end.

When the function m:server() is called then a call is made to the latest version of code for this module.

If the call had been written as follows:

server() ->
    receive
	Message ->
            do_something(Message),
            server()
    end.

Then a call would have been made to the current version of the code for this module.

Prefixing the module name (i.e. using the : form of call) allows the user to change the executing code on the fly.

The rules for evaluation are as follows:

The recursive call must have the module prefix ( m:server() ) if we want to change the executing code on the fly.
Without the prefix, the executing code will not be exchanged with the new one.
We can't have more than two versions of the same module in the system at the same time.
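
A small, self-contained variant of the server can be used to experiment with these rules. It is a sketch under assumptions: the module name reload_demo, the message {From, count} and the helper count/1 are invented here. Because the loop calls itself through ?MODULE:loop/1, a process running it should pick up a newly loaded version of the module at its next message.

```erlang
-module(reload_demo).
-export([start/0, loop/1, count/1]).

start() ->
    spawn(?MODULE, loop, [0]).

%% N counts the ordinary messages seen so far.
loop(N) ->
    receive
        {From, count} ->
            From ! {self(), N},
            ?MODULE:loop(N);        % prefixed call: latest version of the code
        _Message ->
            ?MODULE:loop(N + 1)     % prefixed call here as well
    end.

%% Ask the server how many ordinary messages it has received.
count(Pid) ->
    Pid ! {self(), count},
    receive
        {Pid, N} -> N
    end.
```

Note that loop/1 has to be exported for the prefixed call to work.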

Ports

Ports:

Provide byte stream interfaces to external UNIX processes.
Look like normal Erlang processes, that are not trapping exits, with a specific protocol. That is, they can be linked to, and send out/react to exit signals.
Communicate with a single Erlang process; this process is said to be connected.

The command:

Port = open_port({spawn, Process}, [{packet, 2}])

Starts an external UNIX process - this process reads commands from Erlang on file descriptor 0 and sends commands to Erlang by writing to file descriptor 1.

Port Protocols

Data is passed as a sequence of bytes between the Erlang processes and the external UNIX processes. The number of bytes passed is given in a 2-byte length field.
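
The framing can be sketched in Erlang itself. The module name port_demo and the functions frame/1, unframe/1 and echo/1 are assumptions made for this illustration. frame/1 and unframe/1 spell out the 2-byte big-endian length field that the {packet, 2} option adds and strips automatically. echo/1 assumes a UNIX system with cat on the path: cat copies the framed bytes back unchanged, so the port delivers the original payload.

```erlang
-module(port_demo).
-export([frame/1, unframe/1, echo/1]).

%% What goes over the wire for one message: length field, then payload.
frame(Payload) when is_binary(Payload), byte_size(Payload) < 65536 ->
    <<(byte_size(Payload)):16/big, Payload/binary>>.

%% Split one framed message off the front of a byte stream.
unframe(<<Len:16/big, Payload:Len/binary, Rest/binary>>) ->
    {Payload, Rest}.

%% Round trip through an external process; the port does the framing.
echo(Payload) when is_binary(Payload) ->
    Port = open_port({spawn, "cat"}, [{packet, 2}, binary]),
    Port ! {self(), {command, Payload}},
    Result = receive
                 {Port, {data, Data}} -> {ok, Data}
             after 5000 ->
                 timeout
             end,
    port_close(Port),
    Result.
```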

Erlang process passing data to UNIX process

The "C" program should check the return value from read. See p. 259 in the book for more information.

Binaries

A binary is a reference to a chunk of untyped memory.
Binaries are primarily used for code loading over the network.
They are also useful when applications want to shuffle around large amounts of raw data.
Several BIFs exist for manipulating binaries, such as binary_to_term/1, term_to_binary/1, binary_to_list/1, split_binary/2, concat_binary/1, etc.
open_port/2 can produce and send binaries.
There is also a guard called binary(B) which succeeds if its argument is a binary.
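
A few of these BIFs in use, as a sketch. The module name binary_demo and the functions roundtrip/1 and halves/1 are invented for the illustration. Current releases spell the guard test is_binary/1, which is the form used here.

```erlang
-module(binary_demo).
-export([roundtrip/1, halves/1]).

%% Encode any term as a binary and decode it again.
roundtrip(Term) ->
    Bin = term_to_binary(Term),
    true = is_binary(Bin),
    binary_to_term(Bin).

%% Split a binary into two parts at the middle byte position.
halves(Bin) when is_binary(Bin) ->
    split_binary(Bin, byte_size(Bin) div 2).
```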

References

References are Erlang objects with exactly two properties:

They can be created by a program (using make_ref/0), and,
They can be compared for equality.

Erlang references are unique: the system guarantees that no two references created by different calls to make_ref will ever match. The guarantee is not 100%, but differs from 100% by an insignificantly small amount :-).

References can be used for writing a safe remote procedure call interface, for example:

ask(Server, Question) ->
    Ref = make_ref(),
    Server ! {self(), Ref, Question},
    receive
        {Ref, Answer} ->
	    Answer
    end.

server(Data) ->
    receive
	{From, Ref, Question} ->
            Reply = func(Question, Data),
            From ! {Ref, Reply},
            server(Data);
	...
    end.
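
A runnable variant of this interface shows why the reference matters: a stray message that is already in the caller's mailbox cannot be mistaken for the answer. This is a sketch under assumptions: the module name ref_demo, the question {double, N}, the stop message and the one-second timeout are all invented here, and the server keeps no state.

```erlang
-module(ref_demo).
-export([start/0, ask/2, server/0]).

start() ->
    spawn(?MODULE, server, []).

ask(Server, Question) ->
    Ref = make_ref(),
    Server ! {self(), Ref, Question},
    receive
        %% Only a reply tagged with our own Ref is accepted.
        {Ref, Answer} ->
            Answer
    after 1000 ->
        timeout
    end.

server() ->
    receive
        {From, Ref, {double, N}} ->
            From ! {Ref, 2 * N},
            server();
        stop ->
            ok
    end.
```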

Space Saving Optimisations

Here are two ways of computing the sum of a set of numbers contained in a list. The first is a recursive routine:

sum([H|T]) ->
    H + sum(T);
sum([]) ->
    0.

Note that we cannot evaluate '+' until both its arguments are known. This formulation of sum(X) evaluates in space O(length(X)).

The second is a tail recursive routine which makes use of an accumulator Acc:

sum(X) ->
    sum(X, 0).

sum([H|T], Acc) ->
   sum(T, H + Acc);
sum([], Acc) ->
    Acc.

The tail recursive formulation of sum(X) evaluates in constant space.

Tail recursive = the last thing the function does is to call itself.
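
The accumulator pattern is also what the library function lists:foldl/3 packages up. As a sketch (the module name fold_demo is an assumption), the same sum can be written as:

```erlang
-module(fold_demo).
-export([sum/1]).

%% foldl walks the list from the left, threading Acc through each step,
%% just as sum/2 does by hand in the tail recursive version.
sum(X) ->
    lists:foldl(fun(H, Acc) -> H + Acc end, 0, X).
```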

Last Call Optimisation

The last call optimisation must be used in persistent servers.

For example:

server(Data) ->
    receive
	{From, Info} ->
            Data1 = process_info(From, Info, Data),
            server(Data1);
	{From, Ref, Query} ->
            {Reply, Data1} = process_query(From, Query, Data),
            From ! {Ref, Reply},
            server(Data1)
    end.

Note that the last thing to be done in any thread of computation must be to call the server.

Process Dictionary

Each process has a local store called the "Process Dictionary". The following BIFs are used to manipulate the process dictionary:

get() returns the entire process dictionary.
get(Key) returns the item associated with Key (Key is any Erlang data structure), or, returns the special atom undefined if no value is associated with Key.
put(Key, Value) associates Value with Key. Returns the old value associated with Key, or, undefined if no such association exists.
erase() erases the entire process dictionary. Returns the entire process dictionary before it was erased.
erase(Key) erases the value associated with Key. Returns the old value associated with Key, or, undefined if no such association exists.
get_keys(Value) returns a list of all keys whose associated value is Value.
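
The return values described above can be checked with a short session. This is a sketch: the module name pd_demo and the keys colour and sky are invented for the illustration, and the calling process is assumed not to use those keys, or the value blue, for anything else.

```erlang
-module(pd_demo).
-export([run/0]).

run() ->
    undefined = put(colour, red),    % no previous value for colour
    red = put(colour, blue),         % put returns the old value
    blue = get(colour),
    put(sky, blue),
    %% Two keys now map to blue; the order of get_keys/1 is not specified.
    Keys = lists:sort(get_keys(blue)),
    blue = erase(colour),            % erase returns the old value too
    undefined = get(colour),
    Keys.
```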

Note that using the Process Dictionary:

Destroys referential transparency
Makes debugging difficult
Survives Catch/Throw

So:

Use with care
Do not overuse - try the clean version first

Obtaining System Information

The following calls exist to access system information:

processes() returns a list of all processes currently known to the system.
process_info(Pid) returns a dictionary containing information about Pid.
Module:module_info() returns a dictionary containing information about the code in module Module.

If you use these BIFs remember:

Use with extreme care
Don't assume fixed positions for items in the dictionaries.
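
Looking items up by key, rather than by position, might be sketched as follows. The module name sysinfo_demo and its three functions are assumptions made for this illustration; proplists:get_value/2 is used for the lookup.

```erlang
-module(sysinfo_demo).
-export([queue_len/1, exports/1, alive_count/0]).

%% Number of messages waiting in Pid's mailbox, or undefined if Pid is dead.
queue_len(Pid) ->
    case process_info(Pid) of
        undefined -> undefined;
        Info      -> proplists:get_value(message_queue_len, Info)
    end.

%% The exported functions of Module, as a list of {Name, Arity} pairs.
exports(Module) ->
    proplists:get_value(exports, Module:module_info()).

%% How many processes the system currently knows about.
alive_count() ->
    length(processes()).
```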

But you can do some fun things like:

Writing real filthy programs, e.g. message sending by remote polling of dictionaries. Why should anybody want to do this?
Killing random processes
Write Metasystem programs
Poll system regularly for zombie processes
Poll system to detect or break deadlock
Analyse system performance