heart

MODULE

heart

MODULE SUMMARY

Heartbeat Monitoring of an Erlang Runtime System.

DESCRIPTION

The heart module sends periodic heartbeats to an external port program, which is also named heart. The purpose of the heart port program is to check that the Erlang runtime system it is supervising is still running. If the port program has not received a heartbeat within HEART_BEAT_TIMEOUT seconds (default 60) of the last one, the system can be rebooted. In addition, if the system is equipped with a hardware watchdog timer and is running Solaris, the watchdog can be used to supervise the entire system.

This module is started by the init module during system start-up. The -heart command line flag determines whether the heart module is started.

For the system to be rebooted because of missing heartbeats or a terminated Erlang runtime system, the environment variable HEART_COMMAND must be set before the system is started. If this variable is not set, a warning is printed but the system does not reboot. However, if the hardware watchdog is used, it triggers a reboot HEART_BEAT_BOOT_DELAY seconds later (default 60) regardless.
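
Because a missing HEART_COMMAND only produces a warning, it can be useful to check for the variable from inside the running system. The following is a minimal sketch, not part of the heart module: the module name heart_env_check and the function reboot_command/0 are hypothetical, and the only library call used is os:getenv/1.

```erlang
-module(heart_env_check).
-export([reboot_command/0]).

%% Hypothetical helper, not part of the heart module.
%% Reports whether HEART_COMMAND is visible in the environment of the
%% running Erlang runtime system.
%% os:getenv/1 returns false when the variable is not set.
reboot_command() ->
    case os:getenv("HEART_COMMAND") of
        false -> not_set;      % heart would only print a warning
        Cmd   -> {ok, Cmd}     % command heart would use to reboot
    end.
```

The check only reflects the environment seen by the Erlang runtime system. It does not confirm that the command itself works.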

To reboot on the Windows platform, HEART_COMMAND can be set to heart -shutdown (included in the Erlang delivery) or to any other suitable program that can trigger a reboot.

The hardware watchdog will not be started under Solaris if the environment variable HW_WD_DISABLE is set.

The HEART_BEAT_TIMEOUT and HEART_BEAT_BOOT_DELAY environment variables configure the heart timeouts. They can be set in the operating system shell before erl -heart is started, or passed on the command line like this: erl -heart -env HEART_BEAT_TIMEOUT 30.

The value (in seconds) must be in the range 10 < X <= 65535.
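
The range rule can be checked before a value is handed to erl. The sketch below is illustrative only: the module heart_timeout_check and both functions are hypothetical names, and the code simply encodes the range 10 < X <= 65535 stated above and builds the -env fragment shown in the command line example.

```erlang
-module(heart_timeout_check).
-export([valid_timeout/1, env_flag/2]).

%% Hypothetical helper, not part of the heart module.
%% True only for integers in the documented range 10 < X =< 65535.
valid_timeout(X) when is_integer(X), X > 10, X =< 65535 -> true;
valid_timeout(_) -> false.

%% Builds the "-env NAME VALUE" command line fragment for a timeout
%% variable such as "HEART_BEAT_TIMEOUT" or "HEART_BEAT_BOOT_DELAY".
env_flag(Name, Seconds) ->
    case valid_timeout(Seconds) of
        true  -> {ok, "-env " ++ Name ++ " " ++ integer_to_list(Seconds)};
        false -> {error, {out_of_range, Seconds}}
    end.
```

Note that the lower bound is exclusive, so a value of 10 is rejected while 11 is accepted.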

Note that if the system clock is adjusted by more than HEART_BEAT_TIMEOUT seconds, heart will time out and try to reboot the system. This can happen, for example, if the system clock is adjusted automatically by NTP (Network Time Protocol).

EXPORTS

start() -> {ok, Pid} | ignore | {error, What}

Types:

Pid = pid()
What = void()

Starts the heart program. This function returns ignore if the -heart command line flag is not supplied.
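
The three return values of start() can be told apart by pattern matching. This is a small sketch under stated assumptions: the module heart_start_example, the function describe/1, and the result atoms running, not_requested and failed are all hypothetical and chosen for illustration.

```erlang
-module(heart_start_example).
-export([describe/1]).

%% Hypothetical helper, not part of the heart module.
%% Maps a result of heart:start() to a descriptive term.
describe({ok, Pid}) when is_pid(Pid) -> {running, Pid};
describe(ignore)                     -> not_requested;  % no -heart flag
describe({error, What})              -> {failed, What}.
```

A call such as heart_start_example:describe(heart:start()) would then yield not_requested on a system started without the -heart flag.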

set_cmd(Cmd) -> ok | {error, {bad_cmd, Cmd}}

Types:

Cmd = string()

Sets a temporary reboot command. This command is used instead of the one specified by the HEART_COMMAND environment variable when the system is rebooted. The temporary command does not carry over to the new Erlang runtime system, which will (if it misbehaves) use the environment variable HEART_COMMAND to reboot.

The length of the Cmd string must be less than 2047 characters.
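
The length limit can be checked on the caller's side before set_cmd/1 is invoked. The following sketch assumes a system started with -heart. The module heart_cmd_example and its functions are hypothetical, and the local check mirrors the limit stated above rather than the module's internal validation.

```erlang
-module(heart_cmd_example).
-export([cmd_length_ok/1, set_temp_cmd/1]).

%% Hypothetical helper, not part of the heart module.
%% True if Cmd is a list shorter than the documented 2047 characters.
cmd_length_ok(Cmd) when is_list(Cmd) -> length(Cmd) < 2047;
cmd_length_ok(_) -> false.

%% Calls heart:set_cmd/1 only when the length check passes.
%% Assumption: heart is running (erl -heart). Commands that are too
%% long are rejected locally, in the same shape set_cmd/1 documents.
set_temp_cmd(Cmd) ->
    case cmd_length_ok(Cmd) of
        true  -> heart:set_cmd(Cmd);
        false -> {error, {bad_cmd, Cmd}}
    end.
```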

clear_cmd() -> ok

Clears the temporary reboot command. If the system terminates, the normal HEART_COMMAND is used to reboot.

get_cmd() -> {ok, Cmd}

Types:

Cmd = string()

Gets the temporary reboot command. If the command has been cleared, the empty string is returned.
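
Taken together, set_cmd/1 and clear_cmd/0 allow a temporary reboot command to be scoped to one critical operation. The sketch below is one possible pattern, assuming a system started with -heart. The module heart_temp_cmd and the function with_temp_cmd/2 are hypothetical names.

```erlang
-module(heart_temp_cmd).
-export([with_temp_cmd/2]).

%% Hypothetical helper, not part of the heart module.
%% Sets Cmd as the temporary reboot command, runs Fun, and always
%% clears the temporary command afterwards, even if Fun raises.
%% Assumption: heart is running (erl -heart).
with_temp_cmd(Cmd, Fun) when is_function(Fun, 0) ->
    case heart:set_cmd(Cmd) of
        ok ->
            try
                Fun()
            after
                heart:clear_cmd()
            end;
        {error, _} = Error ->
            Error
    end.
```

After with_temp_cmd/2 returns, a call to heart:get_cmd() is expected to return the empty string, since the temporary command has been cleared.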

AUTHORS

Magnus Fröberg - support@erlang.ericsson.se
Kenneth Lundin - support@erlang.ericsson.se

kernel 2.10.3
Copyright © 1991-2004 Ericsson AB