Scanner

class Scanner(*args, **kwargs)

GScanner provides a general-purpose lexical scanner.

You should set input_name after creating the scanner, since it is used by the default message handler when displaying warnings and errors. If you are scanning a file, the filename would be a good choice.

The user_data and max_parse_errors fields are not used by the scanner itself. If you need to associate extra data with the scanner, you can store it in these fields.

If you want to use your own message handler you can set the msg_handler field. The type of the message handler function is declared by ScannerMsgFunc.

Methods

class Scanner
cur_line() int

Returns the current line in the input stream (counting from 1). This is the line of the last token parsed via get_next_token().

cur_position() int

Returns the current position in the current line (counting from 0). This is the position of the last token parsed via get_next_token().

cur_token() TokenType

Gets the current token type. This is simply the token field in the Scanner structure.

destroy() None

Frees all memory used by the Scanner.

eof() bool

Returns True if the scanner has reached the end of the file or text buffer.

get_next_token() TokenType

Parses the next token just like peek_next_token() and also removes it from the input stream. The token data is placed in the token, value, line, and position fields of the Scanner structure.

input_file(input_fd: int) None

Prepares to scan a file.

Parameters:

input_fd – a file descriptor

input_text(text: str, text_len: int) None

Prepares to scan a text buffer.

Parameters:
  • text – the text buffer to scan

  • text_len – the length of the text buffer

lookup_symbol(symbol: str) None

Looks up a symbol in the current scope and returns its value. If the symbol is not bound in the current scope, None is returned.

Parameters:

symbol – the symbol to look up

peek_next_token() TokenType

Parses the next token, without removing it from the input stream. The token data is placed in the next_token, next_value, next_line, and next_position fields of the Scanner structure.

Note that, while the token is not removed from the input stream (i.e. the next call to get_next_token() will return the same token), it will not be reevaluated. This can lead to surprising results when changing scope or the scanner configuration after peeking the next token. Getting the next token after switching the scope or configuration will return whatever was peeked before, regardless of any symbols that may have been added or removed in the new scope.
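The caching behaviour described above can be illustrated with a small pure-Python model. This is only a sketch: ToyScanner is a hypothetical stand-in written for this example, not the Scanner class itself. It assumes whitespace-separated words and reports a word as a SYMBOL when it is bound in the current scope and as an IDENTIFIER otherwise.

```python
class ToyScanner:
    """Hypothetical stand-in that mimics peek/get caching; not the real Scanner."""

    def __init__(self, text):
        self._words = text.split()
        self._index = 0
        self._peeked = None        # cached result of the last peek, if any
        self._scopes = {0: {}}     # scope_id -> {symbol: value}
        self.scope_id = 0

    def scope_add_symbol(self, scope_id, symbol, value):
        self._scopes.setdefault(scope_id, {})[symbol] = value

    def set_scope(self, scope_id):
        old, self.scope_id = self.scope_id, scope_id
        return old

    def _classify(self, word):
        # Classification depends on the scope that is current *right now*.
        table = self._scopes.get(self.scope_id, {})
        if word in table:
            return ("SYMBOL", table[word])
        return ("IDENTIFIER", word)

    def peek_next_token(self):
        # Evaluate the next word once and remember the result.
        if self._peeked is None and self._index < len(self._words):
            self._peeked = self._classify(self._words[self._index])
        return self._peeked

    def get_next_token(self):
        # Reuse a cached peek instead of re-evaluating the word.
        token = self.peek_next_token()
        if token is not None:
            self._index += 1
        self._peeked = None
        return token


scanner = ToyScanner("foo bar")
scanner.scope_add_symbol(1, "foo", 42)
scanner.scope_add_symbol(1, "bar", 7)

peeked = scanner.peek_next_token()   # evaluated in scope 0
scanner.set_scope(1)                 # "foo" is a symbol in scope 1
first = scanner.get_next_token()     # still the cached scope-0 result
second = scanner.get_next_token()    # evaluated fresh in scope 1
print(peeked, first, second)
```

In this model the first token stays an identifier even though scope 1 binds "foo", because it was evaluated before the scope switch. Only the second token reflects the new scope.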

scope_add_symbol(scope_id: int, symbol: str, value: None) None

Adds a symbol to the given scope.

Parameters:
  • scope_id – the scope id

  • symbol – the symbol to add

  • value – the value of the symbol

scope_foreach_symbol(scope_id: int, func: Callable[[...], None], *user_data: Any) None

Calls the given function for each of the symbol/value pairs in the given scope of the Scanner. The function is passed the symbol and value of each pair, and the given user_data parameter.

Parameters:
  • scope_id – the scope id

  • func – the function to call for each symbol/value pair

  • user_data – user data to pass to the function

scope_lookup_symbol(scope_id: int, symbol: str) None

Looks up a symbol in a scope and returns its value. If the symbol is not bound in the scope, None is returned.

Parameters:
  • scope_id – the scope id

  • symbol – the symbol to look up

scope_remove_symbol(scope_id: int, symbol: str) None

Removes a symbol from a scope.

Parameters:
  • scope_id – the scope id

  • symbol – the symbol to remove

set_scope(scope_id: int) int

Sets the current scope and returns the previous scope id.

Parameters:

scope_id – the new scope id
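The scope methods can be pictured as operations on one symbol table per scope id. The following is a minimal sketch of that idea, assuming a plain dictionary of dictionaries. ToyScopes is a hypothetical class written for this example, it is not the Scanner implementation, and it ignores the scanner configuration entirely.

```python
class ToyScopes:
    """Hypothetical model of per-scope symbol tables; not the real Scanner."""

    def __init__(self):
        self._tables = {}      # scope_id -> {symbol: value}
        self.scope_id = 0      # the current scope

    def scope_add_symbol(self, scope_id, symbol, value):
        self._tables.setdefault(scope_id, {})[symbol] = value

    def scope_remove_symbol(self, scope_id, symbol):
        self._tables.get(scope_id, {}).pop(symbol, None)

    def scope_lookup_symbol(self, scope_id, symbol):
        # Returns None when the symbol is not bound in the scope.
        return self._tables.get(scope_id, {}).get(symbol)

    def lookup_symbol(self, symbol):
        # Same lookup, but against the current scope.
        return self.scope_lookup_symbol(self.scope_id, symbol)

    def scope_foreach_symbol(self, scope_id, func, *user_data):
        # Iterate over a copy so func may add or remove symbols safely.
        for symbol, value in list(self._tables.get(scope_id, {}).items()):
            func(symbol, value, *user_data)

    def set_scope(self, scope_id):
        old, self.scope_id = self.scope_id, scope_id
        return old


scopes = ToyScopes()
scopes.scope_add_symbol(0, "if", "KEYWORD_IF")
scopes.scope_add_symbol(1, "if", "MACRO_IF")

seen = []
scopes.scope_foreach_symbol(1, lambda s, v, tag: seen.append((tag, s, v)), "scope1")

in_default = scopes.lookup_symbol("if")    # looked up in scope 0
previous = scopes.set_scope(1)             # switch to scope 1
in_scope_1 = scopes.lookup_symbol("if")    # now looked up in scope 1
scopes.scope_remove_symbol(1, "if")
after_removal = scopes.lookup_symbol("if")
```

The same symbol name can carry a different value in each scope, and removing it from one scope leaves the other scopes untouched.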

sync_file_offset() None

Rewinds the file descriptor to the current buffer position and discards the file read-ahead buffer. This is useful for third-party code that uses the scanner's file descriptor and needs to pick up at the current scanning position.

unexp_token(expected_token: TokenType, identifier_spec: str, symbol_spec: str, symbol_name: str, message: str, is_error: int) None

Outputs a message through the scanner’s msg_handler, resulting from an unexpected token in the input stream. Note that you should not call peek_next_token() followed by unexp_token() without an intermediate call to get_next_token(), as unexp_token() evaluates the scanner’s current token (not the peeked token) to construct part of the message.

Parameters:
  • expected_token – the expected token

  • identifier_spec – a string describing how the scanner’s user refers to identifiers (None defaults to “identifier”). This is used if expected_token is IDENTIFIER or IDENTIFIER_NULL.

  • symbol_spec – a string describing how the scanner’s user refers to symbols (None defaults to “symbol”). This is used if expected_token is SYMBOL or any token value greater than LAST.

  • symbol_name – the name of the symbol, if the scanner’s current token is a symbol.

  • message – a message string to output at the end of the warning/error, or None.

  • is_error – if True it is output as an error. If False it is output as a warning.

Fields

class Scanner
buffer
config

Link into the scanner configuration

input_fd
input_name

Name of the input stream, used by the default message handler

line

Line number of the last token from get_next_token()

max_parse_errors

Unused

msg_handler

Handler function for warnings and errors

next_line

Line number of the last token from peek_next_token()

next_position

Character position of the last token from peek_next_token()

next_token

Token parsed by the last peek_next_token()

next_value

Value of the last token from peek_next_token()

parse_errors

error() increments this field

position

Character position of the last token from get_next_token()

qdata

Quarked data

scope_id
symbol_table
text
text_end
token

Token parsed by the last get_next_token()

user_data

Unused

value

Value of the last token from get_next_token()