<!--
%CopyrightBegin%

SPDX-License-Identifier: Apache-2.0

Copyright Ericsson AB 2023-2025. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

%CopyrightEnd%
-->
# How to Interpret the Erlang Crash Dumps

This section describes the `erl_crash.dump` file generated upon abnormal exit of
the Erlang runtime system.

> #### Note {: .info }
>
> The Erlang crash dump had a major facelift in Erlang/OTP R9C. The information
> in this section is therefore not directly applicable for older dumps. However,
> if you use `m:crashdump_viewer` on older dumps, the crash dumps are translated
> into a format similar to this.

The system writes the crash dump to the file named by the environment variable
`ERL_CRASH_DUMP` if it is set; otherwise the dump is written in the current
directory of the emulator. For a crash dump to be written, a writable file
system must be mounted.
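
The location rule above can be sketched on a live node as follows. This is a minimal illustration, not the emulator's actual logic: the module name `crash_dump_path` and its functions are hypothetical, and it assumes that the current directory reported by `file:get_cwd/0` is the same as the emulator's.

```erlang
-module(crash_dump_path).
-export([resolve/0, resolve/2]).

%% Returns the path where a crash dump from this node would be expected.
resolve() ->
    {ok, Cwd} = file:get_cwd(),
    resolve(os:getenv("ERL_CRASH_DUMP"), Cwd).

%% EnvValue is the result of os:getenv/1: 'false' when the variable is
%% unset, otherwise its value as a string.
resolve(false, Cwd) ->
    %% Default file name, written in the current directory.
    filename:join(Cwd, "erl_crash.dump");
resolve(Path, _Cwd) when is_list(Path) ->
    Path.
```

Calling `crash_dump_path:resolve()` before a planned test crash shows where to look for the file afterwards.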

Crash dumps are written mainly for one of two reasons: either the built-in
function `erlang:halt/1` is called explicitly with a string argument from
running Erlang code, or the runtime system has detected an error that it cannot
handle. Most commonly, the error cannot be handled because of an external
limitation, such as running out of memory. A crash dump can also result from the
system reaching a limit in the emulator itself (such as the number of atoms in
the system, or too many simultaneous ETS tables). Usually the emulator or the
operating system can be reconfigured to avoid the crash, which is why
interpreting the crash dump correctly is important.

On systems that support OS signals, it is also possible to stop the runtime
system and generate a crash dump by sending the `SIGUSR1` signal.
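
Both ways of deliberately producing a crash dump can be sketched as below. The module `force_dump` and its function names are hypothetical. Note that `halt_with_slogan/1` terminates the node, and that `sigusr1_command/0` only builds a command string (assuming a Unix-like `kill` utility) without running it.

```erlang
-module(force_dump).
-export([halt_with_slogan/1, sigusr1_command/0]).

%% Halts the node and writes a crash dump. The string is used as the
%% slogan of the dump. This call does not return.
halt_with_slogan(Reason) when is_list(Reason) ->
    erlang:halt(Reason).

%% Builds, but does not run, a shell command that would send SIGUSR1
%% to the OS process of this runtime system.
sigusr1_command() ->
    "kill -USR1 " ++ os:getpid().
```

Only use either approach on a node that is safe to stop, since the runtime system exits once the dump is written.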

The Erlang crash dump is a readable text file, but it can be difficult to read.
Using the Crashdump Viewer tool in the `Observer` application simplifies the
task. This is a wxWidgets-based tool for browsing Erlang crash dumps.

## General Information

The first part of the crash dump shows the following:

- The creation time for the dump
- A slogan indicating the reason for the dump
- The system version of the node from which the dump originates
- The number of atoms in the atom table
- The runtime system thread that caused the crash dump

### Reasons for Crash Dumps (Slogan)

The reason for the dump is shown in the beginning of the file as:

```text
Slogan: <reason>
```
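
When many dumps need to be triaged, the slogan can be extracted without opening each file in a viewer. The following is a minimal sketch; the module `crash_dump_slogan` is hypothetical, and it assumes that the first line starting with `Slogan: ` is the one of interest.

```erlang
-module(crash_dump_slogan).
-export([from_file/1, from_binary/1]).

%% Reads the file line by line and stops at the first slogan line, so
%% a large dump does not have to be loaded into memory.
from_file(Path) ->
    case file:open(Path, [read, binary, read_ahead]) of
        {ok, Fd} ->
            try scan(Fd) after file:close(Fd) end;
        {error, _} = Error ->
            Error
    end.

scan(Fd) ->
    case file:read_line(Fd) of
        {ok, Line} ->
            case from_line(Line) of
                {ok, _} = Found -> Found;
                not_found -> scan(Fd)
            end;
        eof ->
            not_found;
        {error, _} = Error ->
            Error
    end.

%% Same search on dump content that is already in memory.
from_binary(Dump) ->
    first_slogan(binary:split(Dump, <<"\n">>, [global])).

first_slogan([Line | Lines]) ->
    case from_line(Line) of
        {ok, _} = Found -> Found;
        not_found -> first_slogan(Lines)
    end;
first_slogan([]) ->
    not_found.

from_line(<<"Slogan: ", Rest/binary>>) ->
    %% Drop a trailing line terminator, if any.
    [Slogan | _] = binary:split(Rest, [<<"\r\n">>, <<"\n">>]),
    {ok, Slogan};
from_line(_) ->
    not_found.
```

For example, `crash_dump_slogan:from_file("erl_crash.dump")` returns `{ok, Slogan}` with the slogan as a binary, or `not_found` if no such line exists.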

If the system is halted by the BIF `erlang:halt/1`, the slogan is the string
argument passed to the BIF; otherwise it is a description generated by the
emulator or the (Erlang) kernel. Normally the message is enough to understand
the problem, but some messages are described here. Notice that the suggested
reasons for the crash are _only suggestions_. The exact reasons for the errors
can vary depending on the local applications and the underlying operating
system.

- **_<A>: Cannot allocate <N> bytes of memory (of type "<T>")_** - The system
  has run out of memory. <A> is the allocator that failed to allocate memory,
  <N> is the number of bytes that <A> tried to allocate, and <T> is the memory
  block type that the memory was needed for. The most common case is that a
  process stores huge amounts of data. In this case <T> is most often `heap`,
  `old_heap`, `heap_frag`, or `binary`. For more information on allocators, see
  [`erts_alloc(3)`](erts_alloc.md).

- **_<A>: Cannot reallocate <N> bytes of memory (of type "<T>")_** - Same as
  above except that memory was reallocated instead of allocated when the system
  ran out of memory.

- **_Unexpected op code <N>_** - Error in compiled code, `beam` file damaged, or
  error in the compiler.

- **_Module <Name> undefined `|` Function <Name> undefined `|` No function
  <Name>:<Name>/1 `|` No function <Name>:start/2_** - The Kernel/STDLIB
  applications are damaged or the start script is damaged.

- **_Driver_select called with too large file descriptor `N`_** - The number of
  file descriptors for sockets exceeds 1024 (Unix only). The limit on file
  descriptors in some Unix flavors can be set to over 1024, but only 1024
  sockets/pipes can be used simultaneously by Erlang (because of limitations in
  the Unix `select` call). The number of open regular files is not affected by
  this.

- **_Received SIGUSR1_** - Sending the `SIGUSR1` signal to an Erlang machine
  (Unix only) forces a crash dump. This slogan reflects that the Erlang machine
  crash-dumped because of receiving that signal.

- **_Kernel pid terminated (<Who>) (<Exit reason>)_** - The kernel supervisor
  has detected a failure, usually that the `application_controller` has shut
  down (`Who` = `application_controller`, `Exit reason` = `shutdown`). The
  application controller can shut down for many reasons; the most common is
  that the node name of the distributed Erlang node is already in use. A complete
  supervisor tree "crash" (that is, the top supervisors have exited) gives about
  the same result. This message comes from the Erlang code and not from the
  virtual machine itself. It is always because of some failure in an
  application, either within OTP or a "user-written" one. Looking at the error
  log for your application is probably the first step to take.

- **_Init terminating in do_boot ()_** - The primitive Erlang boot sequence was
  terminated, most probably because the boot script has errors or cannot be
  read. This is usually a configuration error; the system can have been started
  with a faulty `-boot` parameter or with a boot script from the wrong OTP
  version.

- **_Could not start kernel pid (<Who>) ()_** - One of the kernel processes
  could not start. This is probably because of faulty arguments (like errors in
  a `-config` argument) or faulty configuration files. Check that all files are
  in their correct location and that the configuration files (if any) are not
  damaged. Usually messages are also written to the controlling terminal and/or
  the error log explaining what is wrong.

Errors other than these can occur, as the `erlang:halt/1` BIF can generate any
message. If the message is not generated by the BIF and does not occur in the
list above, it can be because of an error in the emulator. There can however be
unusual messages, not mentioned here, which are still connected to an
application failure. There is much more information available, so a thorough
reading of the crash dump can reveal the crash reason. The size of processes,
the number of ETS tables, and the Erlang data on each process stack can be
useful to find the problem.

### Number of Atoms

The number of atoms in the system at the time of the crash is shown as `Atoms: <number>`.
Some tens of thousands of atoms is perfectly normal, but more can indicate
that the BIF `erlang:list_to_atom/1` is used to generate many _different_ atoms
dynamically, which is never a good idea because atoms are not garbage collected.
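
On a live node, atom usage can be watched before it becomes a problem. The sketch below uses `erlang:system_info(atom_count)` and `erlang:system_info(atom_limit)`. The module `atom_usage` and its functions are hypothetical, and `safe_to_atom/1` shows just one possible way to avoid creating new atoms from external input.

```erlang
-module(atom_usage).
-export([report/0, safe_to_atom/1]).

%% Current number of atoms compared with the configured limit.
report() ->
    Count = erlang:system_info(atom_count),
    Limit = erlang:system_info(atom_limit),
    #{count => Count, limit => Limit, used_percent => 100 * Count / Limit}.

%% Converts a string to an atom only if that atom already exists, so
%% that arbitrary input cannot grow the atom table.
safe_to_atom(String) when is_list(String) ->
    try
        {ok, list_to_existing_atom(String)}
    catch
        error:badarg -> {error, unknown_atom}
    end.
```

A steadily growing `count` between two calls to `atom_usage:report()` is a hint that some code creates atoms dynamically.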

## Scheduler Information

Under the tag _=scheduler_ is shown information about the current state and
statistics of the schedulers in the runtime system. On operating systems that
allow suspension of other threads, the data within this section reflects what
the runtime system looks like when a crash occurs.

The following fields can exist for a scheduler:

- **_=scheduler:id_** - Heading. States the scheduler identifier.

- **_Scheduler Sleep Info Flags_** - If empty, the scheduler was doing some
  work. If not empty, the scheduler is either in some state of sleep, or
  suspended.

- **_Scheduler Sleep Info Aux Work_** - If not empty, a scheduler internal
  auxiliary work is scheduled to be done.

- **_Current Port_** - The port identifier of the port that is currently
  executed by the scheduler.

- **_Current Process_** - The process identifier of the process that is
  currently executed by the scheduler. If there is such a process, this entry is
  followed by the _State_, _Internal State_, _Program Counter_, and _CP_ of that
  same process. The entries are described in section
  [Process Information](crash_dump.md#process-information).

  Notice that this is a snapshot of what the entries are exactly when the crash
  dump is starting to be generated. Therefore they are most likely different
  (and more telling) than the entries for the same processes found in the
  _=proc_ section. If there is no currently running process, only the _Current
  Process_ entry is shown.

- **_Current Process Limited Stack Trace_** - This entry is shown only if there
  is a current process. It is similar to
  [_=proc_stack_](crash_dump.md#process-data), except that only the function frames
  are shown (that is, the stack variables are omitted). Also, only the top and
  bottom part of the stack are shown. If the stack is small (< 512 slots), the
  entire stack is shown. Otherwise the entry _skipping ## slots_ is shown, where
  `##` is replaced by the number of slots that have been skipped.

- **_Run Queue_** - Shows statistics about how many processes and ports of
  different priorities are scheduled on this scheduler.

- **\*\*\* crashed \*\*\*** - This entry is normally not shown. It signifies
  that getting the rest of the information about this scheduler failed for some
  reason.

## Memory Information

Under the tag _=memory_ is shown information similar to what can be obtained on
a living node with [`erlang:memory()`](`erlang:memory/0`).
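
For comparison with the _=memory_ section of a dump, the same kind of figures can be collected on a live node. This is a small sketch; the module `memory_snapshot` and the function `top/1` are hypothetical names.

```erlang
-module(memory_snapshot).
-export([top/1]).

%% Returns the N largest entries of erlang:memory/0, largest first.
%% Each entry is a {Type, SizeInBytes} tuple.
top(N) when is_integer(N), N >= 0 ->
    Sorted = lists:reverse(lists:keysort(2, erlang:memory())),
    lists:sublist(Sorted, N).
```

Since `total` is the sum of the memory used by processes and by the system, it is expected to be the first entry.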

## Internal Table Information

Internal tables are shown under the tags _=hash_table:<table_name>_ and
_=index_table:<table_name>_. These are mostly of interest for runtime system
developers.

## Allocated Areas

Under the tag _=allocated_areas_ is shown information similar to what can be
obtained on a living node with
[`erlang:system_info(allocated_areas)`](`m:erlang#system_info_allocated_areas`).

## Allocator

Under the tag _=allocator:<A>_ is shown various information about allocator <A>.
The information is similar to what can be obtained on a living node with
[`erlang:system_info({allocator, <A>})`](`m:erlang#system_info_allocator_tuple`).
For more information, see also [`erts_alloc(3)`](erts_alloc.md).

## Process Information

The Erlang crash dump contains a listing of each living Erlang process in the
system. The following fields can exist for a process:

- **_=proc:<pid>_** - Heading. States the process identifier.

- **_State_** - The state of the process. This can be one of the following:

  - **_Scheduled_** - The process was scheduled to run but is currently not
    running ("in the run queue").

  - **_Waiting_** - The process was waiting for something (in `receive`).

  - **_Running_** - The process was currently running. If the BIF
    `erlang:halt/1` was called, this was the process calling it.

  - **_Exiting_** - The process was on its way to exit.

  - **_Garbing_** - The process was garbage collecting when the crash dump was
    written. This is bad luck, because the rest of the information for this
    process is then limited.

  - **_Suspended_** - The process is suspended, either by the BIF
    `erlang:suspend_process/1` or because it tries to write to a busy port.

- **_Registered name_** - The registered name of the process, if any.

- **_Spawned as_** - The entry point of the process, that is, what function was
  referenced in the `spawn` or `spawn_link` call that started the process.

- **_Last scheduled in for | Current call_** - The current function of the
  process. These fields do not always exist.

- **_Spawned by_** - The parent of the process, that is, the process that
  executed `spawn` or `spawn_link`.

- **_Started_** - The date and time when the process was started.

- **_Message queue length_** - The number of messages in the process' message
  queue.

- **_Number of heap fragments_** - The number of allocated heap fragments.

- **_Heap fragment data_** - Size of fragmented heap data, in words. This is
  data either created by messages sent to the process or by the Erlang BIFs.
  This amount depends on so many things that this field is usually
  uninteresting.

- **_Link list_** - Process IDs of processes linked to this one. Can also
  contain ports. If process monitoring is used, this field also tells in which
  direction the monitoring is in effect. That is, a link "to" a process tells
  you that the "current" process was monitoring the other, and a link "from" a
  process tells you that the other process was monitoring the current one.

- **_Reductions_** - The number of reductions consumed by the process.

- **_Stack+heap_** - The size of the stack and heap, in words (they share memory
  segment).

- **_OldHeap_** - The size of the "old heap", in words. The Erlang virtual
  machine uses generational garbage collection with two generations. There is
  one heap for new data items and one for data that has survived two garbage
  collections. The assumption (which is almost always correct) is that data
  surviving two garbage collections will live for a long time and can therefore
  be "tenured" to a heap that is garbage collected less often. This is a common
  technique in virtual machines. Together, the heaps and the stack constitute
  most of the allocated memory of the process.

- **_Heap unused, OldHeap unused_** - The amount of unused memory on each heap,
  in words. This information is usually useless.

- **_Memory_** - The total memory used by this process, in bytes. This includes
  call stack, heap, and internal structures. Same as
  [`erlang:process_info(Pid,memory)`](`erlang:process_info/2`).

- **_Program counter_** - The current instruction pointer. This is only of
  interest for runtime system developers. The function into which the program
  counter points is the current function of the process.

- **_CP_** - The continuation pointer, that is, the return address for the
  current call. Usually useless for other than runtime system developers. This
  can be followed by the function into which the CP points, which is the
  function calling the current function.

- **_Arity_** - The number of live argument registers. The live argument
  registers, if any, follow this field. They can contain the arguments of the
  function if they have not yet been moved to the stack.

- **_Internal State_** - A more detailed internal representation of the state of
  this process.

See also section [Process Data](crash_dump.md#process-data).
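
A long message queue is a common first thing to look for in the process listing. The following sketch scans dump text for the processes with the longest queues. The module `crash_dump_queues` is hypothetical, and the code assumes that each field is written as `Name: Value` on its own line, that lines end with a plain newline, and that every line starting with `=` is a section tag.

```erlang
-module(crash_dump_queues).
-export([longest/2]).

%% Dump is the crash dump content as a binary. Returns up to N
%% {Pid, QueueLength} pairs, longest queue first.
longest(Dump, N) ->
    Lines = binary:split(Dump, <<"\n">>, [global]),
    Pairs = collect(Lines, undefined, []),
    lists:sublist(lists:reverse(lists:keysort(2, Pairs)), N).

%% The second argument is the pid of the =proc section being read, or
%% 'undefined' when the current section is not a =proc section.
collect([<<"=proc:", Pid/binary>> | Rest], _Current, Acc) ->
    collect(Rest, Pid, Acc);
collect([<<"=", _/binary>> | Rest], _Current, Acc) ->
    %% Any other tag ends the current =proc section.
    collect(Rest, undefined, Acc);
collect([<<"Message queue length: ", Len/binary>> | Rest], Pid, Acc)
  when Pid =/= undefined ->
    collect(Rest, Pid, [{Pid, binary_to_integer(Len)} | Acc]);
collect([_ | Rest], Current, Acc) ->
    collect(Rest, Current, Acc);
collect([], _Current, Acc) ->
    Acc.
```

With the file content read by `file:read_file/1`, `crash_dump_queues:longest(Dump, 10)` gives a short list of candidate processes to inspect further in the Crashdump Viewer tool.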

## Port Information

This section lists the open ports, their owners, any linked processes, and the
name of their driver or external process.

## ETS Tables

This section contains information about all the ETS tables in the system. The
following fields are of interest for each table:

- **_=ets:<owner>_** - Heading. States the table owner (a process identifier).

- **_Table_** - The identifier for the table. If the table is a `named_table`,
  this is the name.

- **_Name_** - The table name, regardless of whether it is a `named_table` or
  not.

- **_Hash table, Buckets_** - Shown if the table is a hash table, that is, if
  it is not an `ordered_set`.

- **_Hash table, Chain Length_** - If the table is a hash table. Contains
  statistics about the table, such as the maximum, minimum, and average chain
  length. Having a maximum much larger than the average, and a standard
  deviation much larger than the expected standard deviation is a sign that the
  hashing of the terms behaves badly for some reason.

- **_Ordered set (AVL tree), Elements_** - If the table is an `ordered_set`.
  (The number of elements is the same as the number of objects in the table.)

- **_Fixed_** - If the table is fixed using `ets:safe_fixtable/2` or some
  internal mechanism.

- **_Objects_** - The number of objects in the table.

- **_Words_** - The number of words allocated to data in the table.

- **_Type_** - The table type, that is, `set`, `bag`, `duplicate_bag`, or
  `ordered_set`.

- **_Compressed_** - If the table was compressed.

- **_Protection_** - The protection of the table.

- **_Write Concurrency_** - If `write_concurrency` was enabled for the table.

- **_Read Concurrency_** - If `read_concurrency` was enabled for the table.
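
Several of these fields have counterparts that can be read on a live node with `ets:info/2`, which helps when comparing a dump with a running system. The sketch below is only an illustration: the module `ets_fields` is hypothetical, and the pairing of `ets:info/2` items with dump fields is an assumption based on their names.

```erlang
-module(ets_fields).
-export([describe/1]).

%% Collects the ets:info/2 items whose names correspond to fields in
%% the ETS section of a crash dump.
describe(Tab) ->
    Items = [owner, name, type, size, memory, protection, compressed,
             write_concurrency, read_concurrency, fixed],
    [{Item, ets:info(Tab, Item)} || Item <- Items].
```

Here `size` is the number of objects in the table and `memory` is the number of words allocated to it.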

## Timers

This section contains information about all the timers started with the BIFs
`erlang:start_timer/3` and `erlang:send_after/3`. The following fields exist for
each timer:

- **_=timer:<owner>_** - Heading. States the timer owner (a process identifier),
  that is, the process to receive the message when the timer expires.

- **_Message_** - The message to be sent.

- **_Time left_** - Number of milliseconds left until the message would have
  been sent.
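
As an illustration of what ends up in this section, the sketch below starts one timer with each BIF and reads the remaining time with `erlang:read_timer/1`, which corresponds to the _Time left_ field. The module `timer_demo` is hypothetical, and the timers are cancelled again so that the example leaves nothing behind.

```erlang
-module(timer_demo).
-export([run/0]).

run() ->
    %% start_timer/3 delivers {timeout, Ref, tick} to the receiver,
    %% while send_after/3 delivers the message tock as it is.
    Ref1 = erlang:start_timer(60000, self(), tick),
    Ref2 = erlang:send_after(60000, self(), tock),
    %% Milliseconds left until each timer would have fired.
    Left1 = erlang:read_timer(Ref1),
    Left2 = erlang:read_timer(Ref2),
    erlang:cancel_timer(Ref1),
    erlang:cancel_timer(Ref2),
    {Left1, Left2}.
```

If the node crashed between the start and the cancellation of these timers, the calling process would be the owner of both.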

## Distribution Information

If the Erlang node was alive, that is, set up for communicating with other
nodes, this section lists the connections that were active. The following fields
can exist:

- **_=node:<node_name>_** - The node name.

- **_no_distribution_** - If the node was not distributed.

- **_=visible_node:<channel>_** - Heading for a visible node, that is, an alive
  node with a connection to the node that crashed. States the channel number for
  the node.

- **_=hidden_node:<channel>_** - Heading for a hidden node. A hidden node is the
  same as a visible node, except that it is started with the `"-hidden"` flag.
  States the channel number for the node.

- **_=not_connected:<channel>_** - Heading for a node that was connected to the
  crashed node earlier. References (that is, process or port identifiers) to the
  not connected node existed at the time of the crash. States the channel number
  for the node.

- **_Name_** - The name of the remote node.

- **_Controller_** - The port controlling communication with the remote node.

- **_Creation_** - An integer (1-3) that together with the node name identifies
  a specific instance of the node.

- **_Remote monitoring: <local_proc> <remote_proc>_** - The local process was
  monitoring the remote process at the time of the crash.

- **_Remotely monitored by: <local_proc> <remote_proc>_** - The remote process
  was monitoring the local process at the time of the crash.

- **_Remote link: <local_proc> <remote_proc>_** - A link existed between the
  local process and the remote process at the time of the crash.

## Loaded Module Information

This section contains information about all loaded modules.

First, the memory use by the loaded code is summarized:

- **_Current code_** - Code that is the current latest version of the modules.

- **_Old code_** - Code where there exists a newer version in the system, but
  the old version is not yet purged.

Then, all loaded modules are listed. The following fields exist:

- **_=mod:<module_name>_** - Heading. States the module name.

- **_Current size_** - Memory use for the loaded code, in bytes.

- **_Old size_** - Memory use for the old code, in bytes.

- **_Current attributes_** - Module attributes for the current code. This field
  is decoded when looked at by the Crashdump Viewer tool.

- **_Old attributes_** - Module attributes for the old code, if any. This field
  is decoded when looked at by the Crashdump Viewer tool.

- **_Current compilation info_** - Compilation information (options) for the
  current code. This field is decoded when looked at by the Crashdump Viewer
  tool.

- **_Old compilation info_** - Compilation information (options) for the old
  code, if any. This field is decoded when looked at by the Crashdump Viewer
  tool.

## Fun Information

This section lists all funs. The following fields exist for each fun:

- **_=fun_** - Heading.

- **_Module_** - The name of the module where the fun was defined.

- **_Uniq, Index_** - Identifiers.

- **_Address_** - The address of the fun's code.

- **_Refc_** - The number of references to the fun.

## Process Data

For each process there is at least one _=proc_stack_ and one _=proc_heap_ tag,
followed by the raw memory information for the stack and heap of the process.

For each process there is also a _=proc_messages_ tag if the process message
queue is non-empty, and a _=proc_dictionary_ tag if the process dictionary
(accessed with [`put/2`](`put/2`) and [`get/1`](`get/1`)) is non-empty.

The raw memory information can be decoded by the Crashdump Viewer tool. You can
then see the stack dump, the message queue (if any), and the dictionary (if
any).

The stack dump is a dump of the Erlang process stack. Most of the live data
(that is, variables currently in use) are placed on the stack; thus this can be
interesting. One has to "guess" what is what, but as the information is
symbolic, thorough reading of this information can be useful. For instance, the
state variable of the Erlang primitive loader can be found on lines `(5)` and
`(6)` in the following example:

```erlang
(1)  3cac44   Return addr 0x13BF58 (<terminate process normally>)
(2)  y(0)     ["/view/siri_r10_dev/clearcase/otp/erts/lib/kernel/ebin",
(3)            "/view/siri_r10_dev/clearcase/otp/erts/lib/stdlib/ebin"]
(4)  y(1)     <0.1.0>
(5)  y(2)     {state,[],none,#Fun<erl_prim_loader.6.7085890>,undefined,#Fun<erl_prim_loader.7.9000327>,
(6)            #Fun<erl_prim_loader.8.116480692>,#Port<0.2>,infinity,#Fun<erl_prim_loader.9.10708760>}
(7)  y(3)     infinity
```

When interpreting the data for a process, it is helpful to know that anonymous
function objects (funs) are given the following:

- A name constructed from the name of the function in which they are created
- A number (starting with 0) indicating the number of that fun within that
  function
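
The naming scheme can be observed on a live node with `erlang:fun_info/2`. The following is a small sketch: the module `fun_names` is hypothetical, and the exact format of the generated name is an implementation detail that can differ between releases.

```erlang
-module(fun_names).
-export([make/0, describe/1]).

%% The only fun created in make/0, so it is fun number 0 within
%% that function.
make() ->
    fun(X) -> X + 1 end.

%% Returns the module, the generated name, and the identifiers of a fun.
describe(Fun) when is_function(Fun) ->
    [erlang:fun_info(Fun, Item) || Item <- [module, name, index, uniq]].
```

Calling `fun_names:describe(fun_names:make())` returns a name that contains both `make/0` and the number of the fun, which makes it possible to trace a fun seen in a dump back to the function that created it.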

## Atoms

This section presents all the atoms in the system. This is only of interest if
one suspects that dynamic generation of atoms can be a problem; otherwise this
section can be ignored.

Notice that the last created atom is shown first.

## Disclaimer

The format of the crash dump evolves between OTP releases. Some information
described here may not apply to your version. A description like this will never
be complete; it is meant as an explanation of the crash dump in general and as a
help when trying to find application errors, not as a complete specification.
