MarkupParseContext

class MarkupParseContext(**kwargs)

A parse context is used to parse a stream of bytes that you expect to contain marked-up text.

See new(), MarkupParser, and so on for more details.

Constructors

class MarkupParseContext

classmethod new(parser: MarkupParser, flags: MarkupParseFlags, user_data: None, user_data_dnotify: Callable[[None], None]) → MarkupParseContext

Creates a new parse context. A parse context is used to parse marked-up documents. You can feed any number of documents into a context, as long as no errors occur; once an error occurs, the parse context can’t continue to parse text (you have to free it and create a new parse context).

Parameters:

parser – a MarkupParser
flags – one or more MarkupParseFlags
user_data – user data to pass to MarkupParser functions
user_data_dnotify – user data destroy notifier called when the parse context is freed

Methods

class MarkupParseContext

end_parse() → bool

Signals to the MarkupParseContext that all data has been fed into the parse context with parse().

This function reports an error if the document isn’t complete, for example if elements are still open.

free() → None

Frees a MarkupParseContext.

This function can’t be called from inside one of the MarkupParser functions or while a subparser is pushed.

get_element() → str

Retrieves the name of the currently open element.

If called from the start_element or end_element handlers this will give the element_name as passed to those functions. For the parent elements, see get_element_stack().

Added in version 2.2.

get_element_stack() → list[str]

Retrieves the element stack from the internal state of the parser.

The returned list (a GSList in the C API) contains strings: the first item is the currently open tag (as would be returned by get_element()) and the next item is its immediate parent.

This function is intended to be used in the start_element and end_element handlers where get_element() would merely return the name of the element that is being processed.

Added in version 2.16.

get_position() → tuple[int, int]

Retrieves the current line number and the number of the character on that line. Intended for use in error messages; there are no strict semantics for what constitutes the “current” line number other than “the best number we could come up with for error messages.”

get_user_data() → None

Returns the user_data associated with context.

This will either be the user_data that was provided to new() or to the most recent call of push().

Added in version 2.18.

parse(text: str, text_len: int) → bool

Feed some data to the MarkupParseContext.

The data need not be valid UTF-8; an error will be signaled if it’s invalid. The data need not be an entire document; you can feed a document into the parser incrementally, via multiple calls to this function. Typically, as you receive data from a network connection or file, you feed each received chunk of data into this function, aborting the process if an error occurs. Once an error is reported, no further data may be fed to the MarkupParseContext; all errors are fatal.

Parameters:

text – chunk of text to parse
text_len – length of text in bytes

pop() → None

Completes the process of a temporary sub-parser redirection.

This function exists to collect the user_data allocated by a matching call to push(). It must be called in the end_element handler corresponding to the start_element handler during which push() was called. You must not call this function from the error callback – the user_data is provided directly to the callback in that case.

This function is not intended to be directly called by users interested in invoking subparsers. Instead, it is intended to be used by the subparsers themselves to implement a higher-level interface.

Added in version 2.18.

push(parser: MarkupParser, user_data: None) → None

Temporarily redirects markup data to a sub-parser.

This function may only be called from the start_element handler of a MarkupParser. It must be matched with a corresponding call to pop() in the matching end_element handler (except in the case that the parser aborts due to an error).

All tags, text and other data between the matching tags is redirected to the subparser given by parser. user_data is used as the user_data for that parser. user_data is also passed to the error callback in the event that an error occurs. This includes errors that occur in subparsers of the subparser.

The end tag matching the start tag for which this call was made is handled by the previous parser (which is given its own user_data) which is why pop() is provided to allow “one last access” to the user_data provided to this function. In the case of error, the user_data provided here is passed directly to the error callback of the subparser and pop() should not be called. In either case, if user_data was allocated then it ought to be freed from both of these locations.

This function is not intended to be directly called by users interested in invoking subparsers. Instead, it is intended to be used by the subparsers themselves to implement a higher-level interface.

As an example, see the following implementation of a simple parser that counts the number of tags encountered.

typedef struct
{
  gint tag_count;
} CounterData;

static void
counter_start_element (GMarkupParseContext  *context,
                       const gchar          *element_name,
                       const gchar         **attribute_names,
                       const gchar         **attribute_values,
                       gpointer              user_data,
                       GError              **error)
{
  CounterData *data = user_data;

  data->tag_count++;
}

static void
counter_error (GMarkupParseContext *context,
               GError              *error,
               gpointer             user_data)
{
  CounterData *data = user_data;

  g_slice_free (CounterData, data);
}

static GMarkupParser counter_subparser =
{
  counter_start_element,
  NULL,
  NULL,
  NULL,
  counter_error
};

In order to allow this parser to be easily used as a subparser, the following interface is provided:

void
start_counting (GMarkupParseContext *context)
{
  CounterData *data = g_slice_new (CounterData);

  data->tag_count = 0;
  g_markup_parse_context_push (context, &counter_subparser, data);
}

gint
end_counting (GMarkupParseContext *context)
{
  CounterData *data = g_markup_parse_context_pop (context);
  int result;

  result = data->tag_count;
  g_slice_free (CounterData, data);

  return result;
}

The subparser would then be used as follows:

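static void
start_element (GMarkupParseContext  *context,
               const gchar          *element_name,
               const gchar         **attribute_names,
               const gchar         **attribute_values,
               gpointer              user_data,
               GError              **error)
{
  if (strcmp (element_name, "count-these") == 0)
    start_counting (context);

  // else, handle other tags...
}

static void
end_element (GMarkupParseContext  *context,
             const gchar          *element_name,
             gpointer              user_data,
             GError              **error)
{
  if (strcmp (element_name, "count-these") == 0)
    g_print ("Counted %d tags\n", end_counting (context));

  // else, handle other tags...
}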

Added in version 2.18.

Parameters:

parser – a MarkupParser
user_data – user data to pass to MarkupParser functions