1 Match specifications in Erlang
A "match specification" (match_spec) is an Erlang term describing a small "program" that tries to match something (typically the parameters to a function, as used in the erlang:trace_pattern/2 BIF). In many ways a match_spec works like a small function in Erlang, but it is interpreted/compiled by the Erlang runtime system into something much more efficient than calling an Erlang function. A match_spec is also very limited compared to the expressiveness of real Erlang functions.
Match specifications are given to the BIF erlang:trace_pattern/2 to match function arguments and to define actions to be taken when the match succeeds (the MatchBody part).
The most notable difference between a match_spec and an Erlang fun is of course the syntax. Match specifications are Erlang terms, not Erlang code. A match_spec also has a somewhat unusual concept of exceptions. An exception (e.g., badarg) in the MatchCondition part, which resembles an Erlang guard, causes immediate failure, while an exception in the MatchBody part, which resembles the body of an Erlang function, is implicitly caught and results in the single atom 'EXIT'.
1.1 Grammar
A match_spec can be described in this informal grammar:
- MatchExpression ::= [ MatchFunction, ... ]
- MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
- MatchHead ::= MatchVariable | '_' | [ MatchHeadPart, ... ]
- MatchHeadPart ::= term() | MatchVariable | '_'
- MatchVariable ::= '$<number>'
- MatchConditions ::= [ MatchCondition, ... ] | []
- MatchCondition ::= { BoolFunction } | { BoolFunction, ConditionExpression, ... }
- BoolFunction ::= is_atom | is_constant | is_float | is_integer | is_list | is_number | is_pid | is_port | is_reference | is_tuple | is_binary | is_function | is_record | is_seq_trace | 'and' | 'or' | 'not' | 'xor' | andthen | orelse
- ConditionExpression ::= ExprMatchVariable | { GuardFunction } | { GuardFunction, ConditionExpression, ... } | TermConstruct
- ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'
- TermConstruct ::= {{}} | {{ ConditionExpression, ... }} | [] | [ ConditionExpression, ... ] | NonCompositeTerm | Constant
- NonCompositeTerm ::= term() (not list or tuple)
- Constant ::= {const, term()}
- GuardFunction ::= BoolFunction | abs | element | hd | length | node | round | size | tl | trunc | '+' | '-' | '*' | 'div' | 'rem' | 'band' | 'bor' | 'bxor' | 'bnot' | 'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' | '=:=' | '==' | '=/=' | '/=' | self | get_tcw
- MatchBody ::= [ ActionTerm ]
- ActionTerm ::= ConditionExpression | ActionCall
- ActionCall ::= {ActionFunction} | {ActionFunction, ActionTerm, ...}
- ActionFunction ::= set_seq_token | get_seq_token | message | return_trace | process_dump | enable_trace | disable_trace | display | caller | set_tcw | silent
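To show how the productions above fit together, the following sketch builds a single MatchFunction from its three parts and evaluates it with erlang:match_spec_test/3 using the trace type, which compiles and runs a specification without enabling any tracing. The module name ms_grammar_sketch, the function names and the sample argument list are assumptions made for this illustration only.

```erlang
-module(ms_grammar_sketch).
-export([spec/0, check/0]).

%% Hypothetical example spec; names and values are illustrative only.
spec() ->
    %% MatchHead: a list of MatchHeadParts, binding the first argument to '$1'.
    Head = ['$1', '_'],
    %% MatchConditions: one MatchCondition whose outer call is the
    %% BoolFunction 'and'; its arguments are ConditionExpressions.
    Conditions = [{'and',
                   {is_integer, '$1'},
                   {'>', '$1', {const, 0}}}],   % Constant ::= {const, term()}
    %% MatchBody: an ActionCall whose argument is a TermConstruct {{...}}.
    Body = [{message, {{'$1', {self}}}}],
    %% MatchExpression ::= [ MatchFunction, ... ]
    [{Head, Conditions, Body}].

%% Dry run of the spec against the argument list [5, ignored].
check() ->
    erlang:match_spec_test([5, ignored], spec(), trace).
```

Calling ms_grammar_sketch:check() should give a tuple of the form {ok, Result, Flags, Warnings}, where Result is the term that {message, ...} would append to the trace message, here a two-tuple holding the first argument and the pid of the calling process.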
1.2 Function descriptions
The different functions allowed in match_spec work like this:
is_atom, is_constant, is_float, is_integer, is_list, is_number, is_pid, is_port, is_reference, is_tuple, is_binary, is_function: Like the corresponding guard tests in Erlang, return true or false.
is_record: Takes an additional parameter, which SHALL be the result of record_info(<record_type>, size), like in {is_record, '$1', rectype, record_info(rectype, size)}.
is_seq_trace: Returns true if a sequential trace token is set for the current process, otherwise false.
'not': Negates its single argument (anything other than false gives false).
'and': Returns true if all its arguments (variable length argument list) evaluate to true, else false. Evaluation order is undefined.
'or': Returns true if any of its arguments evaluates to true. Variable length argument list. Evaluation order is undefined.
andthen: Like 'and', but quits evaluating its arguments as soon as one argument evaluates to something other than true. Arguments are evaluated left to right.
orelse: Like 'or', but quits evaluating as soon as one of its arguments evaluates to true. Arguments are evaluated left to right.
'xor': Takes exactly two arguments, of which one has to be true and the other false to return true; otherwise 'xor' returns false.
abs, element, hd, length, node, round, size, tl, trunc, '+', '-', '*', 'div', 'rem', 'band', 'bor', 'bxor', 'bnot', 'bsl', 'bsr', '>', '>=', '<', '=<', '=:=', '==', '=/=', '/=', self: Work like the corresponding Erlang BIFs (or operators). In case of bad arguments, the result depends on the context. In the MatchConditions part of the expression, the test fails immediately (like in an Erlang guard), but in the MatchBody, exceptions are implicitly caught and the call results in the atom 'EXIT'.
set_seq_token: Works like seq_trace:set_token/2, but returns true on success and 'EXIT' on error or bad argument. Only allowed in the MatchBody part.
get_seq_token: Works just like seq_trace:get_token/0, and is only allowed in the MatchBody part.
message: Sets an additional message appended to the trace message sent. Only one additional message can be set in the body; subsequent calls replace the appended message. As a special case, {message, false} disables sending of trace messages for this function call, which can be useful if only the side effects of the MatchBody are desired. Another special case is {message, true}, which sets the default behavior: the trace message is sent with no extra information (if no other calls to message are placed before {message, true}, it is in fact a "noop"). Takes one argument, the message. Returns true and can only be used in the MatchBody part.
return_trace: Causes a trace message to be sent upon return from the current function. Takes no arguments, returns true and can only be used in the MatchBody part.
NOTE! If the traced function is tail recursive, this match spec function destroys that property. Hence, if a match spec executing this function is used on a perpetual server process, it may only be active for a limited time, or the emulator will eventually use all memory in the host machine and crash.
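The effect of return_trace can be inspected without tracing a real function. The sketch below is a dry run with erlang:match_spec_test/3, which per its documentation reports, for the trace type, the list of trace flags the specification would enable. The module name ms_return_trace_sketch and the sample arguments are assumptions for this example.

```erlang
-module(ms_return_trace_sketch).
-export([spec/0, check/0]).

%% Matches any argument list and requests a trace message on return as well.
spec() ->
    [{'_', [], [{return_trace}]}].

%% Dry run: no tracing is enabled; the spec is only compiled and evaluated
%% against a made-up argument list.
check() ->
    erlang:match_spec_test([some, arguments], spec(), trace).
```

The third element of the returned {ok, Result, Flags, Warnings} tuple is expected to contain return_trace. Given the note above about tail recursion, a specification like this would normally be installed with erlang:trace_pattern only for a bounded period.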
process_dump: Returns some textual information about the current process as a binary. Takes no arguments and is only allowed in the MatchBody part.
enable_trace: With one parameter this function turns on tracing like the Erlang call erlang:trace(self(), true, [P]), where P is the parameter to enable_trace. With two parameters, the first parameter should be either a process identifier or the registered name of a process. In this case tracing is turned on for the designated process in the same way as in the Erlang call erlang:trace(P1, true, [P2]), where P1 is the first and P2 is the second argument. The process P1 gets its trace messages sent to the same tracer as the process executing the statement uses. P1 cannot be one of the atoms all, new or existing (unless, of course, they are registered names). Returns true and may only be used in the MatchBody part.
disable_trace: With one parameter this function disables tracing like the Erlang call erlang:trace(self(), false, [P]), where P is the parameter to disable_trace. With two parameters it works like the Erlang call erlang:trace(P1, false, [P2]), where P1 can be either a process identifier or a registered name and is given as the first argument to the match_spec function. Returns true and may only be used in the MatchBody part.
caller: Returns the calling function as a tuple {Module, Function, Arity} or the atom undefined if the calling function cannot be determined. May only be used in the MatchBody part.
Note that if a "technically built in function" (i.e. a function not written in Erlang) is traced, the caller function will always return the atom undefined. The calling Erlang function is not available during such calls.
display: For debugging purposes only; displays the single argument as an Erlang term on stdout, which is seldom what is wanted. Returns true and may only be used in the MatchBody part.
get_tcw: Takes no argument and returns the value of the node's trace control word. The same is done by erlang:system_info(trace_control_word).
The trace control word is an unsigned integer intended for generic trace control. Its width is determined by the underlying processor and hardware (today 32 bits). If the value of the trace control word does not fit in 24 bits it may have to be handled as a big integer, which is not as efficient as a small one. The trace control word can be tested and set both from within trace match specifications and with BIFs.
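As a rough sketch of using the trace control word as a switch, the code below sets the word with the BIF erlang:system_flag(trace_control_word, Value), tests it from a match condition with {get_tcw}, and then restores the previous value. The module name ms_tcw_sketch and the idea of gating a specification on one particular value are assumptions for illustration, not a prescribed usage.

```erlang
-module(ms_tcw_sketch).
-export([check/1]).

%% Temporarily sets the node's trace control word to Value, evaluates a spec
%% that only matches while the control word equals Value, and then restores
%% the previous word. Note that the control word is global to the node.
check(Value) when is_integer(Value), Value >= 0 ->
    Old = erlang:system_flag(trace_control_word, Value),
    Spec = [{'_', [{'=:=', {get_tcw}, Value}], []}],
    try
        %% Dry run against a made-up argument list; no tracing is enabled.
        erlang:match_spec_test([any, args], Spec, trace)
    after
        erlang:system_flag(trace_control_word, Old)
    end.
```

Because the word is shared by the whole node, a real tracing setup would have to coordinate who changes it; the try ... after here only protects this small experiment.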
set_tcw: Takes one unsigned integer argument, sets the value of the node's trace control word to the value of the argument and returns the previous value. The same is done by erlang:system_flag(trace_control_word, Value). It is only allowed to use set_tcw in the MatchBody part.
silent: Takes one argument. If the argument is true, the call trace message mode for the current process is set to silent for this call and all subsequent calls, i.e. call trace messages are inhibited even if {message, true} is called in the MatchBody part for a traced function.
This mode can also be activated with the silent flag to erlang:trace/3.
If the argument is false, the call trace message mode for the current process is set to normal (non-silent) for this call and all subsequent calls.
If the argument is neither true nor false, the call trace message mode is unaffected.
Note that all "function calls" have to be tuples, even if they take no arguments. The value of self is the atom() self, but the value of {self} is the pid() of the current process.
1.3 Variables and literals
Variables take the form '$<number>', where <number> is an integer between 0 (zero) and 100000000 (1e+8). The behavior if the number is outside these limits is undefined. In the MatchHead part, the special variable '_' matches anything and never gets bound (like _ in Erlang). In the MatchCondition/MatchBody parts, no unbound variables are allowed, which is why '_' is interpreted as itself (an atom). Variables can only be bound in the MatchHead part. In the MatchBody and MatchCondition parts, only variables bound previously may be used. As a special case, in the MatchCondition/MatchBody parts, the variable '$_' expands to the whole expression which matched the MatchHead (i.e., the whole parameter list to the possibly traced function) and the variable '$$' expands to a list of the values of all bound variables in order (i.e. ['$1','$2', ...]).
In the MatchHead part, all literals (except the variables noted above) are interpreted as is. In the MatchCondition/MatchBody parts, however, the interpretation is in some ways different. Literals in the MatchCondition/MatchBody can either be written as is, which works for all literals except tuples, or by using the special form {const, T}, where T is any Erlang term. For tuple literals in the match_spec, one can also use double tuple parentheses, i.e., construct them as a tuple of arity one containing a single tuple, which is the one to be constructed. The "double tuple parenthesis" syntax is useful to construct tuples from already bound variables, like in {{'$1', [a,b,'$2']}}. Some examples:
Expression               Variable bindings        Result
{{'$1','$2'}}            '$1' = a, '$2' = b       {a,b}
{const, {'$1', '$2'}}    doesn't matter           {'$1', '$2'}
a                        doesn't matter           a
'$1'                     '$1' = []                []
['$1']                   '$1' = []                [[]]
[{{a}}]                  doesn't matter           [{a}]
42                       doesn't matter           42
"hello"                  doesn't matter           "hello"
$1                       doesn't matter           49 (the ASCII value for the character '1')
Literals in the MatchCondition/MatchBody parts of a match_spec
1.4 Execution of the match
The execution of the match expression, when the runtime system decides whether a trace message should be sent, goes as follows:
For each tuple in the MatchExpression list and while no match has succeeded:
- Match the MatchHead part against the arguments to the function, binding the '$<number>' variables (much like in ets:match/2). If the MatchHead cannot match the arguments, the match fails.
- Evaluate each MatchCondition (where only '$<number>' variables previously bound in the MatchHead can occur) and expect it to return the atom true. As soon as a condition does not evaluate to true, the match fails. If any BIF call generates an exception, the match also fails.
- Evaluate each ActionTerm in the same way as the MatchConditions, but completely ignore the return values. Regardless of what happens in this part, the match has succeeded.
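The three steps above can be observed with a small two-clause specification. In this sketch the first clause raises an exception in its condition for non-list arguments (so the clause simply fails and the next one is tried), while the second clause raises in its body, where the exception is caught and becomes the atom 'EXIT'. The module name ms_exec_sketch and the atom long_list are assumptions for this example.

```erlang
-module(ms_exec_sketch).
-export([spec/0, run/1]).

spec() ->
    [%% Tried first. {length, '$1'} raises badarg for a non-list argument;
     %% in the MatchConditions part that just makes this clause fail.
     {['$1'], [{'>', {length, '$1'}, 1}], [{message, long_list}]},
     %% Tried only if the first clause failed. {hd, '$1'} may raise in the
     %% MatchBody; the exception is caught and the call yields 'EXIT'.
     {['$1'], [], [{message, {hd, '$1'}}]}].

%% Returns the term that would be appended to the trace message for a
%% one-argument call with argument Arg (dry run, no tracing enabled).
run(Arg) ->
    {ok, Result, _Flags, _Warnings} =
        erlang:match_spec_test([Arg], spec(), trace),
    Result.
```

With this specification, run([a, b]) is handled by the first clause, run([a]) falls through to the second clause, and run(42) still counts as a successful match even though its body raised an exception.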
1.5 Examples
Match an argument list of three where the first and third arguments are equal:
    [ { ['$1', '_', '$1'], [], [] } ]
Match an argument list of three where the second argument is a number greater than three:
    [ { ['_', '$1', '_'], [ { '>', '$1', 3} ], [] } ]
Match an argument list of three, where the third argument is a tuple containing argument one and two or a list beginning with argument one and two (i.e. [a,b,[a,b,c]] or [a,b,{a,b}]):
    [ { ['$1', '$2', '$3'],
        [ {orelse,
           {'=:=', '$3', {{'$1','$2'}}},
           {'and', {'=:=', '$1', {hd, '$3'}}, {'=:=', '$2', {hd, {tl, '$3'}}} } } ],
        [] } ]
The above problem may also be solved like this:
    [ { ['$1', '$2', {'$1', '$2'}], [], [] },
      { ['$1', '$2', ['$1', '$2' | '_']], [], [] } ]
Match two arguments where the first is a tuple beginning with a list which in turn begins with the second argument times two (i.e. [{[4,x],y},2] or [{[8], y, z},4]):
    [ { ['$1', '$2'], [ {'=:=', {'*', 2, '$2'}, {hd, {element, 1, '$1'}}} ], [] } ]
Match three arguments. When all three are equal and are numbers, append the process dump to the trace message, else let the trace message be as is, but set the sequential trace token label to 4711:
    [ { ['$1', '$1', '$1'], [{is_number, '$1'}], [{message, {process_dump}}] },
      { '_', [], [{set_seq_token, label, 4711}] } ]
As can be noted above, the parameter list can be matched against a single MatchVariable or an '_'. To replace the whole parameter list with a single variable is a special case. In all other cases the MatchHead has to be a proper list.
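Some of the specifications in this section can be tried out with erlang:match_spec_test/3 instead of a live trace. The sketch below wraps that call in a hypothetical helper, matches/2, and also shows a single MatchVariable standing for the whole parameter list. The module and function names are assumptions for this example.

```erlang
-module(ms_examples_sketch).
-export([matches/2, demo/0, whole_list/1]).

%% True if Spec matches the argument list Args. Assumes the body never
%% uses {message, false}, which would also be reported as false here.
matches(Args, Spec) ->
    case erlang:match_spec_test(Args, Spec, trace) of
        {ok, false, _Flags, _Warnings} -> false;
        {ok, _Result, _Flags, _Warnings} -> true
    end.

%% Runs three of the example specs against sample argument lists.
demo() ->
    EqualEnds = [{['$1', '_', '$1'], [], []}],
    PairOrPrefix = [{['$1', '$2', {'$1', '$2'}], [], []},
                    {['$1', '$2', ['$1', '$2' | '_']], [], []}],
    Doubled = [{['$1', '$2'],
                [{'=:=', {'*', 2, '$2'}, {hd, {element, 1, '$1'}}}],
                []}],
    [matches([a, b, a], EqualEnds),
     matches([a, b, c], EqualEnds),
     matches([a, b, {a, b}], PairOrPrefix),
     matches([a, b, [a, b, c]], PairOrPrefix),
     matches([{[4, x], y}, 2], Doubled),
     matches([{[8], y, z}, 4], Doubled)].

%% A single MatchVariable as MatchHead binds the whole argument list,
%% which is then echoed back as the trace message term.
whole_list(Args) ->
    Spec = [{'$1', [], [{message, '$1'}]}],
    {ok, Result, _Flags, _Warnings} =
        erlang:match_spec_test(Args, Spec, trace),
    Result.
```

Only the second call in demo/0 is meant to fail to match, since [a, b, c] does not have equal first and third arguments.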