xmerl_scan (xmerl v2.1.9)

Copy Markdown View Source

Single pass XML scanner.

This module is the interface to the XML parser, it handles XML 1.0. The XML parser is activated through xmerl_scan:string/[1,2] or xmerl_scan:file/[1,2]. It returns records of the type defined in xmerl.hrl.

See also the "Customization functions" tutorial.

Summary

Types

An XML document.

The global state of the scanner, represented by the #xmerl_scanner{} record.

Options allow to customize the behaviour of the scanner. See also the "Customization functions" tutorial.

Record #xmlDocument{}.

Record #xmlElement{}.

Functions

Accumulate and normalize whitespace.

Fetch the ContinuationState.

Set the ContinuationState, to be used in a continuation function.

Fetch the EventState.

Set the EventState, to be used in an event function.

Fetch the FetchState.

Set the FetchState, to be used in a fetch function.

Parse a file containing an XML document

Fetch the HookState.

Set the HookState, to be used in a hook function.

Fetch the RulesState.

Set the RulesState, to be used in a rules function.

Parse a string containing an XML document

Fetch the UserState.

Set the UserState, to be used in a user function.

Types

document()

-type document() :: xmlElement() | xmlDocument().

An XML document.

The document returned by xmerl_scan:string/[1,2] and xmerl_scan:file/[1,2]. The type of the returned record depends on the value of the document option passed to the function.

global_state()

(not exported)
-type global_state() :: xmerl_scanner().

The global state of the scanner, represented by the #xmerl_scanner{} record.

option_list()

(not exported)
-type option_list() :: [{atom(), term()} | {atom(), fun(), term()} | {atom(), fun(), fun(), term()}].

Options allow to customize the behaviour of the scanner. See also the "Customization functions" tutorial.

Possible options are:

  • {acc_fun, Fun} - Call back function to accumulate contents of entity.
  • {continuation_fun, Fun} | {continuation_fun, Fun, ContinuationState} - Call back function to decide what to do if the scanner runs into EOF before the document is complete.

  • {event_fun, Fun} | {event_fun, Fun, EventState} - Call back function to handle scanner events.

  • {fetch_fun, Fun} | {fetch_fun, Fun, FetchState} - Call back function to fetch an external resource.

  • {hook_fun, Fun} | {hook_fun, Fun, HookState} - Call back function to process the document entities once identified.

  • {close_fun, Fun} - Called when document has been completely parsed.
  • {rules, ReadFun, WriteFun, RulesState} | {rules, Rules} - Handles storing of scanner information when parsing.

  • {user_state, UserState} - Global state variable accessible from all customization functions.
  • {fetch_path, PathList} - PathList is a list of directories to search when fetching files. If the file in question is not in the fetch_path, the URI will be used as a file name.
  • {space, Flag} - preserve (default) to preserve spaces, normalize to accumulate consecutive whitespace and replace it with one space.
  • {line, Line} - To specify starting line for scanning in document which contains fragments of XML.
  • {namespace_conformant, Flag} - Controls whether to behave as a namespace conformant XML parser, false (default) to not otherwise true.
  • {validation, Flag} - Controls whether to process as a validating XML parser: off (default) no validation, or validation dtd by DTD or schema by XML Schema. false and true options are obsolete (i.e. they may be removed in a future release), if used false equals off and true equals dtd.
  • {schemaLocation, [{Namespace,Link}|...]} - Tells explicitly which XML Schema documents to use to validate the XML document. Used together with the {validation,schema} option.
  • {quiet, Flag} - Set to true if Xmerl should behave quietly and not output any information to standard output (default false).
  • {doctype_DTD, DTD} - Allows to specify DTD name when it isn't available in the XML document. This option has effect only together with {validation,dtd} option.
  • {xmlbase, Dir} - XML Base directory. If using string/1 default is current directory. If using file/1 default is directory of given file.
  • {encoding, Enc} - Set default character set used (default UTF-8). This character set is used only if not explicitly given by the XML declaration.
  • {document, Flag} - Set to true if Xmerl should return a complete XML document as an xmlDocument record (default false).
  • {comments, Flag} - Set to false if Xmerl should skip comments otherwise they will be returned as xmlComment records (default true).
  • {default_attrs, Flag} - Set to true if Xmerl should add to elements missing attributes with a defined default value (default false).
  • {allow_entities, Flag} - Set to true if xmerl_scan shouldn't fail when there is an ENTITY declaration in the XML document (default false).

xmlDocument()

(not exported)
-type xmlDocument() :: xmerl:xmlDocument().

Record #xmlDocument{}.

The record definition is found in xmerl.hrl.

xmlElement()

-type xmlElement() :: xmerl:xmlElement().

Record #xmlElement{}.

The record definition is found in xmerl.hrl.

Functions

accumulate_whitespace/4

-spec accumulate_whitespace(Text, global_state(), How, Acc) -> {NewAcc, NewText, global_state()}
                               when
                                   Text :: string(),
                                   How :: preserve | normalize,
                                   Acc :: string(),
                                   NewAcc :: string(),
                                   NewText :: string().

Accumulate and normalize whitespace.

cont_state/1

-spec cont_state(global_state()) -> ContinuationState when ContinuationState :: term().

Fetch the ContinuationState.

See the "Customization functions" tutorial.

cont_state/2

-spec cont_state(ContState :: term(), global_state()) -> global_state().

Set the ContinuationState, to be used in a continuation function.

The continuation function is called when the parser encounters the end of the byte stream. See the "Customization functions" tutorial.

event_state/1

-spec event_state(global_state()) -> EventState when EventState :: term().

Fetch the EventState.

See the "Customization functions" tutorial.

event_state/2

-spec event_state(EventState :: term(), global_state()) -> global_state().

Set the EventState, to be used in an event function.

The event function is called at the beginning and at the end of a parsed entity. See the "Customization functions" tutorial.

fetch_state/1

-spec fetch_state(global_state()) -> FetchState when FetchState :: term().

Fetch the FetchState.

See the "Customization functions" tutorial.

fetch_state/2

-spec fetch_state(FetchState :: term(), global_state()) -> global_state().

Set the FetchState, to be used in a fetch function.

The fetch function is and called when the parser fetches an external resource (eg. a DTD). See the "Customization functions" tutorial.

file(Filename)

-spec file(Filename :: string()) -> {xmlElement(), Rest} | {error, Reason}
              when Rest :: string(), Reason :: term().

Equivalent to file(Filename, []).

file(F, Options)

-spec file(Filename :: string(), option_list()) -> {dynamic(), Rest} | {error, Reason}
              when Rest :: string(), Reason :: term().

Parse a file containing an XML document

hook_state/1

-spec hook_state(global_state()) -> HookState when HookState :: term().

Fetch the HookState.

See the "Customization functions" tutorial.

hook_state/2

-spec hook_state(HookState :: term(), global_state()) -> global_state().

Set the HookState, to be used in a hook function.

The hook function is and called when the parser has parsed a complete entity. See the "Customization functions" tutorial.

rules_state/1

-spec rules_state(global_state()) -> RulesState when RulesState :: term().

Fetch the RulesState.

See the "Customization functions" tutorial.

rules_state/2

-spec rules_state(RulesState :: term(), global_state()) -> global_state().

Set the RulesState, to be used in a rules function.

The rules function is and called when the parser store scanner information in a rules database. See the "Customization functions" tutorial.

string(Text)

-spec string(Text :: string()) -> {xmlElement(), Rest} when Rest :: string().

Equivalent to string(Text, []).

string(Str, Options)

-spec string(Text :: string(), option_list()) -> {dynamic(), Rest} when Rest :: string().

Parse a string containing an XML document

user_state/1

-spec user_state(global_state()) -> UserState when UserState :: term().

Fetch the UserState.

See the "Customization functions" tutorial.

user_state(UserState, G)

-spec user_state(UserState :: term(), G :: global_state()) -> global_state().

Set the UserState, to be used in a user function.

See the "Customization functions" tutorial.