
11 Release Handling

11.1 Release Handling Principles

A new release is assembled into a release package. Such a package is installed in a running system by giving commands to the release handler, which is an SASL process. A system has a unique system version, which is updated whenever a new release is installed. The system version is the version of the entire system, not just the OTP version.

If the system consists of several nodes, each node has its own system version. Release handling can be synchronized between nodes, or be done at one node at a time.

Changes may require a node to be brought down. If that is the case and the system consists of several nodes, the release upgrade can be done as follows:

  1. move all applications from the node to be changed to other nodes,

  2. take down the node,

  3. do the change,

  4. restart the node and move the applications back.

There are several different types of releases:

Operating system change.
Can only be done by taking down the node. This kind of change is not supported by the release handler and therefore has to be performed manually. It is not possible to roll back automatically to a previous release, if there is an error.
Application code or data change.
The release is installed without bringing down the running node. Some changes, for example change of C-programs, may be done by shutting down and restarting the affected processes.
Erlang emulator change.
Can only be made by taking down the node. However, the release handler supports this type of change.

11.2 Administering Releases

This section describes how to build and install releases. Also refer to the SASL Reference Manual, release_handler, for more details.

The following steps are involved in administering releases:

  1. A release package is built by using release building commands in the systools module. The package is assembled from application specification files, code files, data files, and a file, which describes how the release is installed in the system.

  2. The release package is transferred to the target machine, e.g. by using ftp.

  3. The release package is unpacked, which makes the system version in the release package available for installation. The installation is performed by the release_handler, which interprets the release upgrade script containing the instructions for updating to the new version. If an installation fails in some way, the entire system is restarted from the old system version.

  4. When the installation is complete, the system version must be made permanent. When permanent, the new version is used if the system restarts.

It is also possible to reinstall an old version, or reboot the system from an old version. There are functions to remove old releases from disk as well.
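The steps above can be sketched as a small helper module. This is only an illustration: the module name rel_admin and the error tuples are made up for this sketch, while the systools and release_handler functions are the ones named in this chapter and in the SASL Reference Manual. It assumes that RelName.rel is in the current directory when build/1 is called and that the package has been copied to the releases directory before install/1 is called.

```erlang
-module(rel_admin).
-export([build/1, install/1]).

%% Development machine: create the boot script and the release package
%% (RelName.tar.gz) from RelName.rel.
build(RelName) ->
    case systools:make_script(RelName) of
        ok ->
            case systools:make_tar(RelName) of
                ok -> ok;
                TarResult -> {error, {make_tar_failed, TarResult}}
            end;
        ScriptResult ->
            {error, {make_script_failed, ScriptResult}}
    end.

%% Target machine: unpack, check, install, and make permanent.
install(RelName) ->
    case release_handler:unpack_release(RelName) of
        {ok, Vsn} ->
            case release_handler:check_install_release(Vsn) of
                {ok, _FromVsn, _Descr} -> install_and_commit(Vsn);
                {error, Reason} -> {error, {check_failed, Reason}}
            end;
        {error, Reason} ->
            {error, {unpack_failed, Reason}}
    end.

install_and_commit(Vsn) ->
    case release_handler:install_release(Vsn) of
        {ok, _FromVsn, _Descr} ->
            %% Without this call a restart falls back to the old version.
            release_handler:make_permanent(Vsn);
        {continue_after_restart, _FromVsn, _Descr} = Restarting ->
            %% The emulator is restarted; make_permanent must be called later.
            Restarting;
        {error, Reason} ->
            {error, {install_failed, Reason}}
    end.
```

In a real system the call to make_permanent would normally be delayed until the new version has been verified, since that is the point after which a restart no longer returns to the old version.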

11.3 File Structure

The file structure used in an OTP system is described in Release Directories. There are two ways of using this file structure together with the release handler.

The simplest way is to store all user-defined applications under $OTP_ROOT/lib in the same way as other OTP applications. The release handler takes care of everything, from unpacking a release to the removal of it. The release packages should be stored in the releases directory (default $OTP_ROOT/releases). This is where release_handler:unpack_release/1 searches for the packages, and where the release handler stores its files. Each package is a compressed tar file. The files in the tar file are named relative to the $OTP_ROOT directory. For example, if a new version (say 1.3) of the application snmp is contained in the release package, the files in the tar file should be named lib/snmp-1.3/*.

The second way is to store all user-defined applications in some other place in the file system. In this case, some more work has to be done outside the release handler. Specifically, the release packages must be unpacked in some way and the release handler must be notified of where the new release is located. The following three functions are available in the module release_handler to handle this case:

11.4 Release Installation Files

The following files must be present when a release is installed. All file names are relative to the releases directory.

The location of the releases directory is specified with the configuration parameter releases_dir (default $OTP_ROOT/releases). In a target system, the default location is preferred, but during testing it may be more convenient to let the release handler write its files in a user specified directory, than in the $OTP_ROOT directory.
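As an illustration, a test setup could point the release handler at a scratch directory through the SASL configuration parameter in sys.config. The path below is a made-up example, not a recommended location.

```erlang
%% sys.config for a test node only; the directory is hypothetical.
[{sasl, [{releases_dir, "/tmp/foo_test/releases"}]}].
```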

The files listed above are either present in the release package, or generated at the target machine and copied to their correct places using release_handler:install_file/2.

Vsn is the system version string.

11.4.1 ReleaseFileName.rel

The ReleaseFileName.rel file contains the name of the system, version of the release, the version of erts (the Erlang runtime system) and the applications, which are parts of the release. The file must contain the following Erlang term:

    {release, {Name, Vsn}, {erts, EVsn}, 
     [{App, AVsn} | {App, AVsn, AType} | {App, AVsn, [App]} |
        {App, AVsn, AType, [App]}]}.
      

Name, Vsn, EVsn and AVsn are strings, App and AType are atoms. ReleaseFileName is a string given in the call to release_handler:unpack_release(ReleaseFileName). Name is the name of the system (the same as found in the boot file). This file is further described in Release Structure.
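A minimal sketch of such a file follows. Every name and version string in it is invented for illustration (foo_sys, the version numbers, and the application bar_tools); only the shape of the term follows the definition above. The last entry shows the optional AType element.

```erlang
%% foo_sys.rel -- hypothetical system name and version strings.
{release, {"foo_sys", "B"}, {erts, "5.0"},
 [{kernel, "2.0"},
  {stdlib, "1.0"},
  {sasl, "1.0"},
  {foo, "1.2"},
  {bar_tools, "0.3", load}]}.
```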

11.4.2 relup

The relup file contains instructions on how to install the new version in the system. It must contain one Erlang term:

    {Vsn, [{FromVsn, Descr, RuScript}], [{ToVsn, Descr, RuScript}]}.
      

Vsn, FromVsn and ToVsn are strings, RuScript is a release upgrade script. Descr is a user defined parameter, which is not processed by any release handling functions. It can be used to describe the release to an operator. Finally, it will be returned by release_handler:install_release/1 and release_handler:check_install_release/1.

There is one tuple {FromVsn, Descr, RuScript} for each old system version which can be upgraded to the new version, and one tuple {ToVsn, Descr, RuScript} for each old version to which the new version can be downgraded.
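For illustration, a relup file for a hypothetical system version "B" that can be upgraded from, and downgraded to, system version "A" might look roughly as follows. The system versions and description strings are assumptions; the low-level instructions are borrowed from the lists2 example in the examples section of this chapter.

```erlang
{"B",
 %% One entry per old version that can be upgraded to "B".
 [{"A", "Upgrade: lists2 gains multi_map/2",
   [{load_object_code, {foo, "1.2", [lists2]}},
    point_of_no_return,
    {load, {lists2, soft_purge, soft_purge}}]}],
 %% One entry per old version that "B" can be downgraded to.
 [{"A", "Downgrade: back to the original lists2",
   [{load_object_code, {foo, "1.1", [lists2]}},
    point_of_no_return,
    {load, {lists2, soft_purge, soft_purge}}]}]}.
```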

11.4.3 start.boot

The start.boot file is the compiled start.script file. It is used to boot the Erlang machine.

11.4.4 sys.config

The sys.config is the system configuration file.

11.5 Release Handling Principles

The following sections describe the principles for updating parts of an OTP system.

11.5.1 Erlang Code

The code change feature in Erlang is made possible because Erlang allows two versions of a module to be present in the system: the current version and the old version. There is always a current version of a loaded module, but an old version of a module only exists if the module has been replaced in run-time by loading a new version. When a new version is loaded, the previously current version becomes the old version, and the new version becomes the current version. However, if there are both a current and old version of a module, a new version cannot be loaded, unless the old version is first explicitly purged.

A global function call is a call where a qualified module name is used, i.e. the call is of the form M:F(A) (or apply(M, F, A)). A global call causes M:F to be dynamically linked into the run-time code, which means that M:F(A) will be evaluated using the latest available version of the module, i.e. the current version.

A local function call is a call without a qualified module name, i.e. the call is of the form F(A). The reference to F is resolved at compile time (irrespective of whether F is exported or not). By the very nature of F(A) being a local function call, F can only be called by a function that is defined in the very same module as that where F is defined. Hence a local function call is always evaluated in the same version of a module as that of the caller.
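The difference between the two kinds of call is easiest to see in a process loop. The following sketch uses a made-up module counter_loop: the ordinary recursive calls are local and therefore stay in the version of the module the process is already running, while the switch_code message triggers a fully qualified call that continues in the current version.

```erlang
-module(counter_loop).
-export([start/0, loop/1]).

start() ->
    spawn(?MODULE, loop, [0]).

loop(N) ->
    receive
        {add, K} ->
            loop(N + K);          % local call: same version as the caller
        {get, From} ->
            From ! {count, N},
            loop(N);              % local call
        switch_code ->
            ?MODULE:loop(N);      % global call: latest (current) version
        stop ->
            ok
    end.
```

If a new version of counter_loop is loaded while the process is waiting in receive, the process keeps executing the old version until it receives switch_code.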

A fun is a function without a name. Like ordinary functions (i.e. functions which have names) its implementation is always bound to some module, and therefore funs are affected by code change as well. A reference to a fun is always indirect, as is the case for a global function call, where the reference is M:F (through an export table entry for the module), but the reference is not necessarily global. In fact, if a fun is called in the same module where it is defined, its reference will be resolved in the same way as a local function call is resolved. If a fun is called from a different module, its reference will be resolved as if the call was a global call, but with the additional requirement that the reference also match the particular implementation of the module where the fun was defined.

For each process there is a current function, i.e. the function that the process is currently evaluating. That function resides in some module. Hence a process has always a reference to at least one module. It may of course have references to other modules as well, because of nested, not yet finished calls.

Before a new version of a module can be loaded, the current version must be made old. If there is no old version, the new version is simply loaded: the previously current version becomes the old version, and the new version becomes current. All processes that execute the version that became old continue to do so until they have no unfinished calls within the old version.

If there is an old version, it must first be purged to make room for the current version to become old. However, an old version should not be purged if there are processes that have references to it. Such processes must either be terminated, or the loading of the new version must be postponed until they have terminated by themselves or no longer have references to the old version. There are options for controlling this in release upgrade scripts.
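Outside the release handler, the same rule can be followed by hand with functions from the code module. The sketch below is an assumption-laden illustration (the module name soft_reload and the error atom old_code_in_use are made up): it purges the old version only if no process still refers to it, and postpones loading otherwise.

```erlang
-module(soft_reload).
-export([reload/1]).

%% Load a new version of Mod, but refuse to do so while some process
%% still runs the old version.
reload(Mod) ->
    case code:soft_purge(Mod) of
        true ->
            %% No old version left; the current version can now become old.
            code:load_file(Mod);
        false ->
            %% Some process still references the old version.
            {error, old_code_in_use}
    end.
```

Calling code:purge/1 instead of code:soft_purge/1 would correspond to the other choice described above, since it terminates the processes that still run the old version.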

To prevent processes from making calls to other processes during the release installation, they may be suspended. All processes implemented with the standard behaviors, or with sys, can be suspended. When suspended a process enters a special suspend loop instead of its usual main process loop. In the suspend loop, the process can only receive system messages and shut-down messages from its supervisor. The code change message is a special system message, and this message causes the process to change code to the new version, and possibly to transform its internal state. After the code change a process is resumed, i.e. it returns to its main loop.

We highlight here three different types of modules.

Functional module.
A module, which does not contain a process loop, i.e. no process has constant references to this kind of module. lists is an example of a functional module.
Process module.
A module, which contains a process loop, i.e. some process has constant reference to the module. init is an example of a process module.
Call-back module.
A special case of a functional module which serves as a call-back module for a generic behavior such as gen_server. file is an example of a call-back module. A call to a call-back module is always a global call (i.e. it refers to the latest version of the module). This affects how updates must be handled.

Modules of the above types are handled differently when changing code.

11.5.1.1 Functional Module

If the API of a new version of a functional module is backward compatible, as may be the case of a bug fix or new functionality, we simply load the new version. After a short while, when no processes have references to the old version, the old module is purged.

A more complicated situation arises if the API of a functional module is changed so that it is no longer backward compatible. We must then make sure that no processes, directly or indirectly, try to call functions that have changed. We do this by writing new versions of all modules that use the API. Then, when performing the code change, all potential caller processes are suspended, new versions of the modules that use the API are loaded, the new version of the functional module is loaded, and finally all suspended processes are resumed.

There are two alternatives available to manage this type of change:

  1. Find all calls to the module, change them, and write dependencies in your release upgrade script. This may be manageable, if a function that has been incompatibly changed is called from only a few other functions.

  2. Avoid this type of change. This is the only reasonable solution, if an incompatible function is called from many other modules. Instead a completely new function should be introduced, and the original function should be kept for backward compatibility. In the next release, when all other modules are changed as well, the original function can be deleted.

11.5.1.2 Process Module

A process module should never contain global calls to itself (except for code that makes explicit code change). Therefore, a new version of a process module is merely loaded and all processes which are executing the module are told to change their code and, if required, to transform their internal state.

In practice, few modules are pure in the sense that they never contain global calls to themselves. If you use higher-order functions such as lists:map/2 in a process module, there will be global calls to the module. Therefore, we cannot merely load the module, because a process that is still running the old version of the module might make a call to the new version, which might be incompatible.

The only safe way to change code for a process module is to have its implementation understand system messages, and to change code by first suspending all processes that run the module, then ordering them to change code, and finally resuming them.

11.5.1.3 Call-back Module

As long as the type of the internal state of a call-back module has not changed, we can just simply load the new version of the module without suspending and resuming the processes involved in the code change. This case is similar to the case of a functional module.

If the type of the internal state has changed, we must first suspend the processes, tell them to change code and at the same time give them the possibility to transform their states, and finally resume them. This is similar to the case of a process module.

11.5.1.4 Dependencies Between Processes

It is possible that a group of processes, which communicate, must perform code changes while they are suspended. Some of the processes may otherwise use the old protocol while others use the new protocol. On the other hand, there may be time-out dependencies which restrict the number of processes that can perform a synchronized code change as one set. The more processes that are included in the set, the longer the processes are suspended.

There may also be problems with circular dependencies. The following scenario illustrates this situation.

The following sequence of events may occur:

  1. a is suspended.

  2. The release handler tries to suspend b, but a few microseconds before this happens, b tries to communicate with a, which is now suspended.

  3. If b hangs in its call to a, the suspension of b fails and only a is updated.

  4. If b notices that a does not answer and is able to deal with it, then b receives the suspend message and is suspended. Then both modules are updated and the processes are resumed.

  5. When a resumes, there is a message waiting from b. This message may be of an old format which a does not recognize.

Situations of the type described, and many others, are highly application dependent. The author of the release upgrade script has to predict and avoid them. If the consequences are too difficult to manage, it may be better to entirely shut down and restart all affected processes. This reduces the problem of introducing new code and removes the need to do a synchronized change.
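One way for a process such as b to "notice that a does not answer" is to call a with a finite time-out and treat the time-out as an ordinary result. The following is a minimal sketch under the assumption that a is a gen_server; the module name dep_call and the returned tuples are made up for illustration.

```erlang
-module(dep_call).
-export([ask/3]).

%% Call Server, but give up after TimeoutMs milliseconds instead of
%% hanging, so that the caller can go on to handle system messages.
ask(Server, Request, TimeoutMs) ->
    try gen_server:call(Server, Request, TimeoutMs) of
        Reply -> {ok, Reply}
    catch
        exit:{timeout, _} -> {error, timeout};
        exit:{noproc, _} -> {error, noproc}
    end.
```

The caller still has to decide what to do with {error, timeout}, for example retry after the upgrade has finished, which is exactly the application-dependent part discussed above.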

11.5.1.5 Finding Processes

For each application the .appup file specifies how the application is upgraded. The file contains specifications of which modules to change, and how to change them. The relup file is an assembly of all the .appup files.

For each application the release handler searches for all processes that have to perform a code change. It traverses the application supervision tree to find all child specifications of every supervisor in the tree. Each child specification lists all modules of the application that the child uses.

Hence it is by combining the list of modules to change with all children of supervisors that the release handler finds all processes that are subject to code change.
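The module list that the release handler relies on is the last element of each child specification. As a sketch, a supervisor for the gs1 server used in the examples of this chapter could declare it as follows. The supervisor name foo_sup and the function gs1:start_link/1 are assumptions made for this illustration; they are not defined elsewhere in this chapter.

```erlang
-module(foo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, foo_sup}, foo_sup, []).

init([]) ->
    %% The final element, [gs1], tells the release handler that this
    %% child must be considered when the module gs1 is changed.
    Child = {gs1, {gs1, start_link, [[]]}, permanent, 2000, worker, [gs1]},
    {ok, {{one_for_one, 1, 60}, [Child]}}.
```

For a gen_event manager, whose set of handler modules varies at run time, the module list is given as the atom dynamic instead.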

11.5.2 Port Programs

A port program runs as an external program in the operating system. The simplest way to do code change for a port program is to terminate it, and then start a new version of it.

If that is not adequate, code change may be performed by sending the port program a message telling it to return any data that must survive the termination. Then the program is terminated, the new version is started, and the saved data is passed to the new version of the port program.

Changing code for port programs is very application dependent. There is no special support for it in SASL.

11.5.3 Application Specification and Configuration Parameters

In each release, each application specification (i.e. the contents of the .app file of the application) is known to the release handler. Before any code change is performed for an application, the new environment variables are made available for the application, i.e. those parameters specified by the env tag in the application specification. When the new version of an application is running, it is informed of any changed, new, or removed environment variables (see the application module in the KERNEL Reference Manual). This means that old processes may read new variables before they are informed of the new release. We advise against the immediate removal of the old variables. Neither do we recommend that they be syntactically changed, although they may of course change their values. They can be safely removed in the next release, by which time it is known that no processes will read the old variables.
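An application is informed through the optional config_change/3 function of its application call-back module. The sketch below only logs the three lists it receives; the module name foo_app and the supervisor foo_sup are hypothetical, and a real application would typically forward the new values to the processes that cache them.

```erlang
-module(foo_app).
-behaviour(application).
-export([start/2, stop/1, config_change/3]).

start(_Type, _Args) ->
    foo_sup:start_link().          % hypothetical top supervisor

stop(_State) ->
    ok.

%% Changed and New are lists of {Parameter, Value}; Removed is a list
%% of parameter names.
config_change(Changed, New, Removed) ->
    error_logger:info_msg("foo env: changed=~p new=~p removed=~p~n",
                          [Changed, New, Removed]),
    ok.
```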

11.5.4 Mnesia Data or Schema Changes

Changing data or schemas in Mnesia is similar to changing code for functional modules. Many processes may read or write in the same table at the same time. If we change a table definition, we must make sure that all code which uses the table is changed at the same time.

One way of doing it is to let one process be responsible for one or several tables. This process creates the tables and changes the table definitions or table data. In this way a set of tables is connected with a module (process module or call-back module). When the process performs a code change, the tables are changed as well.

11.5.5 Upgrade vs. Downgrade

When a new release is installed, the system is upgraded to the new release. The release handler reads the relup file of the new release, and finds the upgrade script that corresponds to an upgrade from the current version to the new version of the system.

When an old release is reinstalled, the release handler reads the relup file in the current release, and finds the downgrade script that corresponds to a downgrade from the current version to the old version of the system.

Usually a relup file for a new release contains one upgrade script and one downgrade script for each old version. If a soft downgrade is not wanted (an alternative is to reboot the system from the old release) the downgrade script is left out.

For each modified module in the new release, there are instructions that specify how to install that module in a system. When performing an upgrade, the following steps are typically involved:

  1. Suspend the processes running the module.

  2. Load the new code.

  3. Tell the processes to switch to new code.

  4. Tell the processes to change the internal state. This usually involves calling, in the new module, a code_change function that is responsible for state updates, e.g. transforming the state from the old format to the new.

  5. Resume the processes.

The code change step is always performed when new code has been loaded and all processes are running the new code. The reason for this is that it is always the new version of the module that knows how to change the state from the old version.
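The upgrade steps above can be tried by hand with the sys module. The following self-contained sketch is not what the release handler executes; it is a made-up module (cc_demo) with a single version, so the loading step is only indicated in a comment. It shows the suspend, code change, and resume order, with code_change/3 transforming the state from a bare integer to a tagged tuple.

```erlang
-module(cc_demo).
-behaviour(gen_server).

-export([start/0, state/0, upgrade/0]).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

start() ->
    gen_server:start({local, ?MODULE}, ?MODULE, 0, []).

state() ->
    gen_server:call(?MODULE, state).

upgrade() ->
    ok = sys:suspend(?MODULE),                      % 1. suspend
    %% 2. Load the new code here, e.g. with code:load_file/1. Skipped,
    %%    because this sketch has only one version of the module.
    ok = sys:change_code(?MODULE, ?MODULE, 1, []),  % 3-4. switch code, transform state
    ok = sys:resume(?MODULE).                       % 5. resume

init(N) ->
    {ok, N}.

handle_call(state, _From, State) ->
    {reply, State, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Old format: an integer. New format: {counter, Integer}.
code_change(1, N, _Extra) when is_integer(N) ->
    {ok, {counter, N}};
code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

After cc_demo:start() and cc_demo:upgrade(), a call to cc_demo:state() returns the state in the new format.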

When performing a downgrade the situation is different. The old module does not know how to transform the new state to the old version: the new format is unknown to the old code. Therefore, it is the responsibility of new code to revert the state back to the old version during downgrade. The following steps are involved:

  1. Suspend the processes running the module.

  2. Tell the processes to change the internal state. This usually involves calling, in the current module, a code_change function that is responsible for state reversals, i.e. transforming the state from the current format to the old.

  3. Load the code to be switched to, i.e. the code of the old release.

  4. Tell the processes to switch code.

  5. Resume the processes.

We note that for a process module, it is possible to load the code before a process changes its internal state (since a process module never contains global calls to itself), thus making the steps needed for downgrade almost the same as for upgrade. The difference between the two cases is still in the order of switching code and changing state.

For a call-back module it is not actually necessary to tell the processes to switch code, since all calls to the call-back module are global calls. The difference between upgrade and downgrade is still in the order of loading code and performing state change.

The difference between how process modules and call-back modules are handled in the downgrade case comes from the fact that a process module never contains global calls to itself. The code is thus static in the sense that a process executing a process module does not spontaneously switch to newly loaded code. The opposite situation is a dynamic module, where a process executing the module spontaneously switches to the new code when it is loaded. A call-back module is always dynamic, and a process module static. A functional module is always dynamic.

11.6 Release Handling Instructions

This section describes the release upgrade and downgrade scripts. A script is a list of instructions which are interpreted by the release handler when an upgrade or downgrade is made.

There are two levels of instructions: high-level instructions and low-level instructions. High- and low-level instructions may be mixed in one script. However, the high-level instructions are translated to low-level instructions by the systools:make_relup/3 command, because the release handler understands only low-level instructions.

Scripts have to be placed in the .appup file for each application. systools:make_relup/3 assembles the scripts in all .appup files to form a relup file containing low-level instructions.
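As an illustration, an .appup file for version "1.2" of the application foo, with one upgrade script from "1.1" and one downgrade script to "1.1", could look like the sketch below. It reuses the high-level load_module instruction for lists2 from the examples section of this chapter; treating the same instruction as sufficient for the downgrade is an assumption of this sketch.

```erlang
%% foo.appup
{"1.2",
 %% Upgrade scripts, one per version that can be upgraded from.
 [{"1.1", [{load_module, lists2, soft_purge, soft_purge, []}]}],
 %% Downgrade scripts, one per version that can be downgraded to.
 [{"1.1", [{load_module, lists2, soft_purge, soft_purge, []}]}]}.
```

Given release files for both system versions, a call such as systools:make_relup("foo_sys-B", ["foo_sys-A"], ["foo_sys-A"]) would then produce the relup file; the release file names used here are hypothetical.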

11.6.1 High-level Instructions

The high-level instructions are:

11.6.2 Low-level instructions

The low-level instructions are:

11.7 Release Handling Examples

This section includes several examples that show how different types of upgrades are handled. In call-back modules having the gen_server behavior, all call-back functions have been provided for reasons of clarity.

11.7.1 Update of Erlang Code

Several update examples are shown. Unless otherwise stated, it is assumed that all original modules are in the application foo, version "1.1", and the updated version is "1.2".

11.7.1.1 Simple Functional Module

This example is about a pure functional module, i.e. a module the functions of which have no side effects. The original version of the module lists2 has the following contents:

-module(lists2).
-vsn(1).

-export([assoc/2]).

assoc(Key, [{Key, Val} | _]) -> {ok, Val};
assoc(Key, [H | T]) -> assoc(Key, T);
assoc(Key, []) -> false.

The new version of the module adds a new function:

-module(lists2).
-vsn(2).

-export([assoc/2, multi_map/2]).

assoc(Key, [{Key, Val} | _]) -> {ok, Val};
assoc(Key, [H | T]) -> assoc(Key, T);
assoc(Key, []) -> false.

%% Apply Func to the heads of all lists, then recurse on the tails.
multi_map(_Func, [[] | _ListOfLists]) -> [];
multi_map(Func, ListOfLists) ->
    [apply(Func, lists:map(fun erlang:hd/1, ListOfLists)) |
     multi_map(Func, lists:map(fun erlang:tl/1, ListOfLists))].

The release upgrade instructions are:

[{load_module, lists2, soft_purge, soft_purge, []}]
        

Alternatively, the low-level instructions are:

[{load_object_code, {foo, "1.2", [lists2]}},
 point_of_no_return,
 {load, {lists2, soft_purge, soft_purge}}]
        
11.7.1.2 A More Complicated Functional Module

Here we have a functional module bar that uses the module lists2 of the previous example. The original version is only dependent on the original version of lists2.

-module(bar).
-vsn(1).

-export([simple/1, complicated_sum/1]).

simple(X) ->
    case lists2:assoc(simple, X) of
        {ok, Val} -> Val;
        false -> false
    end.

complicated_sum([X, Y, Z]) -> cs(X, Y, Z).

cs([HX | TX], [HY | TY], [HZ | TZ]) ->
    NewRes = cs(TX, TY, TZ),
    [HX + HY + HZ | NewRes];
cs([], [], []) -> [].

The new version of bar uses the new functionality of lists2 in order to simplify the implementation of the useful function complicated_sum/1. It does not change its API in any way.

-module(bar).
-vsn(2).

-export([simple/1, complicated_sum/1]).

simple(X) ->
    case lists2:assoc(simple, X) of
        {ok, Val} -> Val;
        false -> false
    end.

complicated_sum(X) ->
    lists2:multi_map(fun(A,B,C) -> A+B+C end, X).

The release upgrade instructions, including instructions for lists2, are as follows:

[{load_module, lists2, soft_purge, soft_purge, []},
 {load_module, bar, soft_purge, soft_purge, [lists2]}]
        

Note!

We must state that bar is dependent on lists2 so that the release handler loads lists2 before it loads bar.

The low-level instructions are:

[{load_object_code, {foo, "1.2", [lists2, bar]}},
 point_of_no_return,
 {load, {lists2, soft_purge, soft_purge}},
 {load, {bar, soft_purge, soft_purge}}]
        
11.7.1.3 Advanced Functional Module

Suppose now that we modify the return value of lists2:assoc/2 from {ok, Val} to {Key, Val}. In order to do an upgrade, we would have to find all modules that call lists2:assoc/2 directly or indirectly, and specify that these modules are dependent on lists2. In practice this might be an unwieldy task if many other modules use the lists2 module, and the only reasonable way to perform the upgrade might then be to restart the whole system.

If we insist on doing a soft upgrade, the modification should be made backward compatible by introducing a new function (assoc2/2, say) that has the new return value, and not making any changes to the original function at all.
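A backward compatible version of lists2 along these lines might look as follows. This is a sketch: the name assoc2/2 is the one suggested above, and multi_map/2 is left out to keep the listing short.

```erlang
-module(lists2).
-vsn(2).

-export([assoc/2, assoc2/2]).

%% Unchanged: existing callers keep getting {ok, Val} | false.
assoc(Key, [{Key, Val} | _]) -> {ok, Val};
assoc(Key, [_ | T]) -> assoc(Key, T);
assoc(_Key, []) -> false.

%% New function with the new return value {Key, Val} | false.
assoc2(Key, [{Key, Val} | _]) -> {Key, Val};
assoc2(Key, [_ | T]) -> assoc2(Key, T);
assoc2(_Key, []) -> false.
```

Callers can then be moved to assoc2/2 one module at a time, and assoc/2 can be removed in a later release.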

11.7.1.4 Advanced gen_server

This example assumes that we have a gen_server process that must be updated because we have introduced a new function, and added a new data field in our internal state. The contents of the original module are as follows:

-module(gs1).
-vsn(1).
-behaviour(gen_server).

-export([get_data/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

-record(state, {data}).

get_data() -> 
    gen_server:call(gs1, get_data).

init([Data]) ->
    {ok, #state{data = Data}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

The new module must translate the old state into the new state. Recall that a record is just syntactic sugar for a tuple:

-module(gs1).
-vsn(2).
-behaviour(gen_server).

-export([get_data/0, get_time/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

-record(state, {data, time}).

get_data() -> 
    gen_server:call(gs1, get_data).

get_time() -> 
    gen_server:call(gs1, get_time).

init([Data]) ->
    {ok, #state{data = Data, time = erlang:time()}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State};
handle_call(get_time, _From, State) ->
    {reply, {ok, State#state.time}, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(1, {state, Data}, _Extra) ->
    {ok, #state{data = Data, time = erlang:time()}}.

The release upgrade instructions are as follows:

[{update, gs1, {advanced, []}, soft_purge, soft_purge, []}]
        

The alternative low-level instructions are:

[{load_object_code, {foo, "1.2", [gs1]}},
 point_of_no_return,
 {suspend, [gs1]},
 {load, {gs1, soft_purge, soft_purge}},
 {code_change, [{gs1, []}]},
 {resume, [gs1]}]
        

If we want to handle soft downgrade as well, the code would be as follows:

-module(gs1).
-vsn(2).
-behaviour(gen_server).

-export([get_data/0, get_time/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

-record(state, {data, time}).

get_data() -> 
    gen_server:call(gs1, get_data).
get_time() -> 
    gen_server:call(gs1, get_time).

init([Data]) ->
    {ok, #state{data = Data, time = erlang:time()}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State};
handle_call(get_time, _From, State) ->
    {reply, {ok, State#state.time}, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(1, {state, Data}, _Extra) ->
    {ok, #state{data = Data, time = erlang:time()}};
code_change({down, 1}, #state{data = Data}, _Extra) ->
    {ok, {state, Data}}.

Note that we take care of translating the new state to the old format as well. The low-level instructions are:

[{load_object_code, {foo, "1.2", [gs1]}},
 point_of_no_return,
 {suspend, [gs1]},
 {code_change, [{gs1, []}]},
 {load, {gs1, soft_purge, soft_purge}},
 {resume, [gs1]}]
        
11.7.1.5 Advanced gen_server with Dependencies

This example assumes that we have a gen_server process that uses gs1 as defined in the previous example.

The contents of the original module are as follows:

-module(gs2).
-vsn(1).
-behaviour(gen_server).

-export([is_operation_ok/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

is_operation_ok(Op) -> 
    gen_server:call(gs2, {is_operation_ok, Op}).

init([Data]) ->
    {ok, []}.

handle_call({is_operation_ok, Op}, _From, State) ->
    Data = gs1:get_data(),
    Reply = lists2:assoc(Op, Data),
    {reply, Reply, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

The new version does not have to transform the internal state, hence the code_change/3 function is not really needed (it will not be called since the upgrade of gs2 is soft).

-module(gs2).
-vsn(2).
-behaviour(gen_server).

-export([is_operation_ok/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

is_operation_ok(Op) -> 
    gen_server:call(gs2, {is_operation_ok, Op}).

init([Data]) ->
    {ok, []}.

handle_call({is_operation_ok, Op}, _From, State) ->
    Data = gs1:get_data(),
    Time = gs1:get_time(),
    Reply = do_things(lists2:assoc(Op, Data), Time),
    {reply, Reply, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

do_things({ok, Val}, Time) ->
    Val;
do_things(false, Time) ->
    {false, Time}.

    

The release upgrade instructions are:

[{update, gs1, {advanced, []}, soft_purge, soft_purge, []},
 {update, gs2, soft, soft_purge, soft_purge, [gs1]}]
        

The corresponding low-level instructions are:

[{load_object_code, {foo, "1.2", [gs1, gs2]}},
 point_of_no_return,
 {suspend, [gs1, gs2]},
 {load, {gs1, soft_purge, soft_purge}},
 {load, {gs2, soft_purge, soft_purge}},
 {code_change, [{gs1, []}]},    % No gs2 here!
 {resume, [gs1, gs2]}]
        
11.7.1.6 Other Worker Processes

All other worker processes in a supervision tree, such as processes of the types gen_event, gen_fsm, and processes implemented by using proc_lib and sys, are handled in exactly the same way as processes of type gen_server are handled. Examples follow.

11.7.1.7 Simple gen_event

This example shows how an event handler may be updated. We do not make any assumptions about which event manager processes the handler is installed in; it is the responsibility of the release handler to find them. The contents of the original module are as follows:

-module(ge_h).
-vsn(1).
-behaviour(gen_event).

-export([get_events/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2, 
         terminate/2, code_change/3]).

get_events(Mgr) -> 
    gen_event:call(Mgr, ge_h, get_events).

init(_) -> {ok, undefined}.

handle_event(Event, _LastEvent) -> 
    {ok, Event}.

handle_call(get_events, LastEvent) -> 
    {ok, [LastEvent], LastEvent}.

handle_info(Info, LastEvent) ->
    {ok, LastEvent}.

terminate(Arg, LastEvent) ->
    ok.

code_change(_OldVsn, LastEvent, _Extra) ->
    {ok, LastEvent}.

The new module decides to keep the two latest events in a list and must translate the old state into the new state.

-module(ge_h).
-vsn(2).
-behaviour(gen_event).

-export([get_events/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2, 
         terminate/2, code_change/3]).

get_events(Mgr) -> 
    gen_event:call(Mgr, ge_h, get_events).

init(_) -> {ok, []}.

handle_event(Event, []) -> 
    {ok, [Event]};
handle_event(Event, [Event1 | _]) -> 
    {ok, [Event, Event1]}.

handle_call(get_events, Events) ->
    {ok, Events, Events}.

handle_info(Info, Events) ->
    {ok, Events}.

terminate(Arg, Events) ->
    ok.

code_change(1, undefined, _Extra) -> 
    {ok, []};
code_change(1, LastEvent, _Extra) -> 
    {ok, [LastEvent]}.

The release upgrade instructions are:

[{update, ge_h, {advanced, []}, soft_purge, soft_purge, []}]
        

The low-level instructions are:

[{load_object_code, {foo, "1.2", [ge_h]}},
 point_of_no_return,
 {suspend, [ge_h]},
 {load, {ge_h, soft_purge, soft_purge}},
 {code_change, [{ge_h, []}]},
 {resume, [ge_h]}]
        

Note!

These instructions are identical to those used for the gen_server.
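
The new ge_h translates the state on upgrade only. If a downgrade must also be supported, the state has to be translated back. The following is a sketch only, assuming that OldVsn is {down, 1} when downgrading to version 1 and that keeping just the latest event is acceptable. The module name ge_h_state is made up so that the clauses can be tried on their own.

```erlang
%% Hypothetical module holding only the state translation of ge_h.
-module(ge_h_state).
-export([code_change/3]).

%% Upgrade from version 1: a single event (or undefined) becomes a list.
code_change(1, undefined, _Extra) ->
    {ok, []};
code_change(1, LastEvent, _Extra) ->
    {ok, [LastEvent]};
%% Downgrade to version 1: keep only the most recent event.
code_change({down, 1}, [], _Extra) ->
    {ok, undefined};
code_change({down, 1}, [Latest | _], _Extra) ->
    {ok, Latest}.
```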

11.7.1.8 Process Implemented with sys and proc_lib

Processes implemented with sys and proc_lib are changed in the same way as processes that are implemented according to the gen_server behavior (which should not come as a surprise, since gen_server et al. are implemented on top of sys and proc_lib). However, the code change function is defined differently. The original is as follows:

-module(sp).
-vsn(1).

-export([start/0, get_data/0]).
-export([init/1, system_continue/3, system_terminate/4]).

-record(state, {data}).

start() ->
    Pid = proc_lib:spawn_link(?MODULE, init, [self()]),
    {ok, Pid}.

get_data() ->
    sp_server ! {self(), get_data},
    receive
        {sp_server, Data} -> Data
    end.

init(Parent) ->
    register(sp_server, self()),
    process_flag(trap_exit, true),
    loop(#state{}, Parent).

loop(State, Parent) ->
    receive
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, ?MODULE, [], State);
        {'EXIT', Parent, Reason} ->
            cleanup(State),
            exit(Reason);
        {From, get_data} ->
            From ! {sp_server, State#state.data},
            loop(State, Parent);
        _Any ->
            loop(State, Parent)
    end.

cleanup(State) -> ok.

%% Here are the sys call back functions
system_continue(Parent, _, State) ->
    loop(State, Parent).

system_terminate(Reason, Parent, _, State) ->
    cleanup(State),
    exit(Reason).

The new code, which takes care of both upgrade and downgrade, is as follows:

-module(sp).
-vsn(2).

-export([start/0, get_data/0, set_data/1]).
-export([init/1, system_continue/3, system_terminate/4, 
        system_code_change/4]).

-record(state, {data, last_pid}).

start() ->
    Pid = proc_lib:spawn_link(?MODULE, init, [self()]),
    {ok, Pid}.

get_data() ->
    sp_server ! {self(), get_data},
    receive
        {sp_server, Data} -> Data
    end.

set_data(Data) ->
    sp_server ! {self(), set_data, Data}.

init(Parent) ->
    register(sp_server, self()),
    process_flag(trap_exit, true),
    loop(#state{last_pid = no_one}, Parent).

loop(State, Parent) ->
    receive
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, 
                                  ?MODULE, [], State);
        {'EXIT', Parent, Reason} ->
            cleanup(State),
            exit(Reason);
        {From, get_data} ->
            From ! {sp_server, State#state.data},
            loop(State, Parent);
        {From, set_data, Data} ->
            loop(State#state{data = Data, last_pid = From}, Parent);
        _Any ->
            loop(State, Parent)
    end.

cleanup(State) -> ok.

%% Here are the sys call back functions
system_continue(Parent, _, State) ->
    loop(State, Parent).

system_terminate(Reason, Parent, _, State) ->
    cleanup(State),
    exit(Reason).

system_code_change({state, Data}, _Mod, 1, _Extra) ->
    {ok, #state{data = Data, last_pid = no_one}};
system_code_change(#state{data = Data}, _Mod, {down, 1}, _Extra) ->
    {ok, {state, Data}}.

The release upgrade instructions are:

[{update, sp, static, default, {advanced, []}, soft_purge, soft_purge, []}]
        

The low-level instructions are the same for upgrade and downgrade:

[{load_object_code, {foo, "1.2", [sp]}},
 point_of_no_return,
 {suspend, [sp]},
 {load, {sp, soft_purge, soft_purge}},
 {code_change, [{sp, []}]},
 {resume, [sp]}]
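
The suspend, code_change, and resume steps can also be driven by hand through the sys module, which may help when testing a system_code_change/4 function. The following sketch uses sys:suspend/1, sys:change_code/4, and sys:resume/1 on a small process of its own and loads no new code, so it only shows the state translation. The module sys_demo and its state tuples are made up for illustration.

```erlang
%% Hypothetical demo: a tiny sys-compliant process and a manual code change.
-module(sys_demo).
-export([run/0]).
-export([init/1, system_continue/3, system_terminate/4,
         system_code_change/4]).

run() ->
    Pid = proc_lib:spawn_link(?MODULE, init, [self()]),
    ok = sys:suspend(Pid),
    %% A real upgrade would load the new object code at this point.
    ok = sys:change_code(Pid, ?MODULE, 1, []),
    ok = sys:resume(Pid),
    Pid ! {self(), get_state},
    State = receive {Pid, S} -> S end,
    Pid ! stop,
    State.

init(Parent) ->
    loop({state, old_data}, Parent).

loop(State, Parent) ->
    receive
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent,
                                  ?MODULE, [], State);
        {From, get_state} ->
            From ! {self(), State},
            loop(State, Parent);
        stop ->
            ok
    end.

%% sys callback functions
system_continue(Parent, _Debug, State) ->
    loop(State, Parent).

system_terminate(Reason, _Parent, _Debug, _State) ->
    exit(Reason).

%% Translate the old two-element state into a three-element state.
system_code_change({state, Data}, _Mod, 1, _Extra) ->
    {ok, {state, Data, no_one}}.
```

Calling sys_demo:run() returns the state as seen after the code change.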
        
11.7.1.9 Supervisor

This example assumes that a new version of an application adds a new process, and deletes one process from a supervisor. The original code is as follows:

-module(sup).
-vsn(1).
-behaviour(supervisor).
-export([init/1]).

init([]) ->
    SupFlags = {one_for_one, 4, 3600},
    Server = {my_server, {my_server, start_link, []},
              permanent, 2000, worker, [my_server]},
    GS1 = {gs1, {gs1, start_link, []}, permanent, 2000, worker, [gs1]},  
    {ok, {SupFlags, [Server, GS1]}}.

The new code is as follows:

-module(sup).
-vsn(2).
-behaviour(supervisor).
-export([init/1]).

init([]) ->
    SupFlags = {one_for_one, 4, 3600},
    GS1 = {gs1, {gs1, start_link, []}, permanent, 2000, worker, [gs1]},  
    GS2 = {gs2, {gs2, start_link, []}, permanent, 2000, worker, [gs2]},  
    {ok, {SupFlags, [GS1, GS2]}}.

The release upgrade instructions are:

[{update, sup, {advanced, []}, soft_purge, soft_purge, []},
 {apply, {supervisor, terminate_child, [sup, my_server]}},
 {apply, {supervisor, delete_child, [sup, my_server]}},
 {apply, {supervisor, restart_child, [sup, gs2]}}]
        

The low-level instructions are:

[{load_object_code, {foo, "1.2", [sup]}},
 point_of_no_return,
 {suspend, [sup]},
 {load, {sup, soft_purge, soft_purge}},
 {code_change, [{sup, []}]},
 {resume, [sup]},
 {apply, {supervisor, terminate_child, [sup, my_server]}},
 {apply, {supervisor, delete_child, [sup, my_server]}},
 {apply, {supervisor, restart_child, [sup, gs2]}}]
        

The high-level update instruction for a supervisor is mapped to a low-level advanced code change instruction. In the code_change function of the supervisor, the new child specification is installed, but no children are explicitly terminated or started. Therefore, children must be terminated, deleted, and started by using the apply instruction.
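
The effect of the apply instructions can be tried on a running supervisor. The following sketch is self-contained and involves no code change, so it adds the new child with supervisor:start_child/2 and an explicit child specification instead of supervisor:restart_child/2. The module sup_demo and its dummy worker are made up for illustration; only the child names are taken from the example above.

```erlang
%% Hypothetical demo of removing one child and adding another at run time.
-module(sup_demo).
-behaviour(supervisor).
-export([run/0, start_worker/0]).
-export([init/1]).

run() ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    %% Stop the old child and remove its specification.
    ok = supervisor:terminate_child(Sup, my_server),
    ok = supervisor:delete_child(Sup, my_server),
    %% Add and start the new child.
    {ok, _Pid} = supervisor:start_child(Sup, child(gs2)),
    Ids = lists:sort([Id || {Id, _, _, _} <- supervisor:which_children(Sup)]),
    %% Stop the demo supervisor again.
    unlink(Sup),
    exit(Sup, shutdown),
    Ids.

%% Dummy worker that only waits for a stop message.
start_worker() ->
    {ok, spawn_link(fun() -> receive stop -> ok end end)}.

init([]) ->
    SupFlags = {one_for_one, 4, 3600},
    {ok, {SupFlags, [child(my_server), child(gs1)]}}.

child(Id) ->
    {Id, {?MODULE, start_worker, []}, permanent, 2000, worker, [?MODULE]}.
```

Calling sup_demo:run() returns the sorted list of child names that remain under the supervisor.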

11.7.1.10 Complex Dependencies

As already mentioned, sometimes the simplest and safest way to introduce a new release is to terminate parts of the system, load the new code, and restart those parts. However, individual processes cannot simply be killed, since their supervisors will restart them again. Instead, supervisors must first be ordered to stop their children before new code can be loaded. Then the supervisors are ordered to restart their children. All this is done by issuing the stop and start instructions.

The following example assumes that we have a supervisor a with two children b and c, where b is a worker and c is a supervisor for d. We want to restart all processes except for a. The upgrade instructions are as follows:

[{load_object_code, {foo, "1.2", [b,c,d]}},
 point_of_no_return,
 {stop, [b, c]},
 {load, {b, soft_purge, soft_purge}},
 {load, {c, soft_purge, soft_purge}},
 {load, {d, soft_purge, soft_purge}},
 {start, [b, c]}]
        

Note!

We do not need to stop d explicitly; this is done by the supervisor c.

A whole application cannot be stopped and started with the stop and start instructions. The instruction restart_application has to be used instead.

11.7.1.11 New Application

The examples shown so far have dealt with changing an existing application. To introduce a completely new application, we only need an add_application instruction, but we also have to make sure that the boot file of the new release contains enough to start it. The following example shows how to introduce the application new_appl, which has just one module: new_mod.

The release upgrade instructions are:

[{add_application, new_appl}]
        

The corresponding low-level instructions are as follows (note that the application specification is passed as an argument to application:start/2):

[{load_object_code, {new_appl, "1.0", [new_mod]}},
 point_of_no_return,
 {load, {new_mod, soft_purge, soft_purge}},
 {apply, {application, start,
           [{application, new_appl,
             [{description, "NEW APPL"},
              {vsn, "1.0"},
              {modules, [new_mod]},
              {registered, []},
              {applications, [kernel, foo]},
              {env, []},
              {mod, {new_mod, []}}]},
            permanent]}}].
        
11.7.1.12 Remove an Application

An application is removed in the same way as new applications are introduced. This example assumes that we want to remove the new_appl application:

[{remove_application, new_appl}]
        

The corresponding low-level instructions are:

[point_of_no_return,
 {apply, {application, stop, [new_appl]}},
 {remove, {new_mod, soft_purge, soft_purge}}].
        

11.7.2 Update of Port Programs

Each port program is controlled by an Erlang process called the port controller. A port program is updated by the port controller process. This is always done by terminating the old port program and starting the new one.

11.7.2.1 Port Controller

In this example we have a port controller process, where we must take care of the termination and restart of the port program ourselves. We may also prepare for the possibility of changing only the Erlang code of the port controller. The gen_server behavior is used to implement the port controller. The contents of the original module are as follows:

-module(portc).
-vsn(1).
-behaviour(gen_server).

-export([get_data/0]).
-export([init/1, handle_call/3, handle_info/2, code_change/3]).

-record(state, {port, data}).

get_data() -> gen_server:call(portc, get_data).

init([]) ->
    PortProg = code:priv_dir(foo) ++ "/bin/portc",
    Port = open_port({spawn, PortProg}, [binary, {packet, 2}]),
    {ok, #state{port = Port}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State}.

handle_info({Port, Cmd}, State) ->
    NewState = do_cmd(Cmd, State),
    {noreply, NewState}.

code_change(_, State, change_port_only) ->
    Port = State#state.port,
    Port ! {self(), close},    % ask the port to close
    receive
        {Port, closed} -> true
    end,
    NPortProg = code:priv_dir(foo) ++ "/bin/portc",   % get new version
    NPort = open_port({spawn, NPortProg}, [binary, {packet, 2}]),
    {ok, State#state{port = NPort}}.

To change the port program without changing the Erlang code, we can use the following code:

[point_of_no_return,
 {suspend, [portc]},
 {code_change, [{portc, change_port_only}]},
 {resume, [portc]}]
        

Here we used low-level instructions only. In this example we also make use of the Extra argument of the code_change/3 function.
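
The close handshake performed in code_change/3 can be tried in isolation. The following is a minimal sketch, assuming a Unix-like system where the external program cat is available. Here cat merely stands in for the port program, and port_close_demo is a made-up module name. The timeout is added only so that the demo cannot hang.

```erlang
%% Hypothetical demo of closing a port and waiting for the confirmation.
-module(port_close_demo).
-export([run/0]).

run() ->
    %% Assumption: "cat" exists on this system and acts as the port program.
    Port = open_port({spawn, "cat"}, [binary]),
    %% The calling process owns the port, so it may ask the port to close.
    Port ! {self(), close},
    receive
        {Port, closed} -> closed
    after 5000 ->
        timeout
    end.
```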

Suppose now that we wish to change only the Erlang code. The new version of portc is as follows:

-module(portc).
-vsn(2).
-behaviour(gen_server).

-export([get_data/0]).
-export([init/1, handle_call/3, handle_info/2, code_change/3]).

-record(state, {port, data}).

get_data() -> gen_server:call(portc, get_data).

init([]) ->
    PortProg = code:priv_dir(foo) ++ "/bin/portc",
    Port = open_port({spawn, PortProg}, [binary, {packet, 2}]),
    {ok, #state{port = Port}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State}.

handle_info({Port, Cmd}, State) ->
    NewState = do_cmd(Cmd, State),
    {noreply, NewState}.

code_change(_, State, change_port_only) ->
    Port = State#state.port,
    Port ! {self(), close},    % ask the port to close
    receive
        {Port, closed} -> true
    end,
    NPortProg = code:priv_dir(foo) ++ "/bin/portc",   % get new version
    NPort = open_port({spawn, NPortProg}, [binary, {packet, 2}]),
    {ok, State#state{port = NPort}};
code_change(1, State, change_erl_only) ->
    NState = transform_state(State),
    {ok, NState}.

The high-level instruction is:

[{update, portc, {advanced, change_erl_only}, soft_purge, soft_purge, []}]
        

The corresponding low-level instructions are:

[{load_object_code, {foo, "1.2", [portc]}},
 point_of_no_return,
 {suspend, [portc]},
 {load, {portc, soft_purge, soft_purge}},
 {code_change, [{portc, change_erl_only}]},
 {resume, [portc]}]
        

Copyright © 1991-2003 Ericsson Utvecklings AB