Scanner#
- class Scanner(*args, **kwargs)#
GScanner
provides a general-purpose lexical scanner.
You should set input_name
after creating the scanner, since
it is used by the default message handler when displaying
warnings and errors. If you are scanning a file, the filename
would be a good choice.
The user_data and max_parse_errors fields are not used by the scanner itself.
If you need to associate extra data with the scanner, you
can store it in user_data.
If you want to use your own message handler, you can set the msg_handler field.
The type of the message handler function is declared by ScannerMsgFunc.
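As a rough illustration of the handler shape, the sketch below defines a function that takes the scanner, the message text, and an error flag, which is the argument order GLib documents for ScannerMsgFunc. The `SimpleNamespace` stand-in, the `format_scanner_message` helper, and the output format are assumptions made for this example. It does not show how a handler is installed on a real scanner.

```python
from types import SimpleNamespace


def format_scanner_message(scanner, message, error):
    """Hypothetical helper: build one diagnostic line from scanner state."""
    # Fall back to a placeholder when input_name was never set.
    name = scanner.input_name or "<unnamed input>"
    severity = "error" if error else "warning"
    return f"{name}:{scanner.line}: {severity}: {message}"


def my_msg_handler(scanner, message, error):
    """Handler with the (scanner, message, error) argument shape."""
    print(format_scanner_message(scanner, message, error))


# Stand-in object exposing only the two fields the handler reads.
fake_scanner = SimpleNamespace(input_name="settings.conf", line=12)
my_msg_handler(fake_scanner, "unexpected character", True)
```

Keeping the formatting in a separate helper makes the handler easy to check without a live scanner.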
Methods#
- class Scanner
- cur_line() int #
Returns the current line in the input stream (counting from 1). This is the line of the last token parsed via get_next_token().
- cur_position() int #
Returns the current position in the current line (counting from 0). This is the position of the last token parsed via get_next_token().
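The two counting conventions (lines from 1, positions from 0) can be sketched in plain Python. The `line_and_position` helper below is hypothetical and works on character offsets into a string. It illustrates only the numbering, not how the real scanner measures a token's position.

```python
def line_and_position(text, offset):
    """Return (line, position) for a character offset into text.

    line counts from 1; position counts from 0 within that line.
    """
    before = text[:offset]
    line = before.count("\n") + 1
    # rfind returns -1 when there is no earlier newline, so the
    # first line starts at index 0.
    position = offset - (before.rfind("\n") + 1)
    return line, position


sample = "alpha = 1\nbeta = 2\n"
print(line_and_position(sample, 0))   # start of "alpha"
print(line_and_position(sample, 10))  # start of "beta"
```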
- cur_token() TokenType #
Gets the current token type. This is simply the token field in the Scanner structure.
- get_next_token() TokenType #
Parses the next token just like peek_next_token() and also removes it from the input stream. The token data is placed in the token, value, line, and position fields of the Scanner structure.
- input_text(text: str, text_len: int) None #
Prepares to scan a text buffer.
- Parameters:
text – the text buffer to scan
text_len – the length of the text buffer
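For non-ASCII text, the number of characters in a Python str differs from the number of bytes in its UTF-8 encoding. The sketch below assumes text_len is meant as a byte count of the UTF-8 data, as in the underlying C API. That assumption, and the `utf8_length` helper name, are not confirmed by this page.

```python
def utf8_length(text):
    """Hypothetical helper: byte length of text once encoded as UTF-8."""
    return len(text.encode("utf-8"))


source = "naïve = 1"
# Character count and UTF-8 byte count differ because of the "ï".
print(len(source), utf8_length(source))
```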
- lookup_symbol(symbol: str) None #
Looks up a symbol in the current scope and returns its value. If the symbol is not bound in the current scope, None is returned.
- Parameters:
symbol – the symbol to look up
- peek_next_token() TokenType #
Parses the next token, without removing it from the input stream. The token data is placed in the next_token, next_value, next_line, and next_position fields of the Scanner structure.
Note that, while the token is not removed from the input stream (i.e. the next call to get_next_token() will return the same token), it will not be reevaluated. This can lead to surprising results when changing scope or the scanner configuration after peeking the next token. Getting the next token after switching the scope or configuration will return whatever was peeked before, regardless of any symbols that may have been added or removed in the new scope.
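The caching behaviour described above can be modelled with a small toy class. `TinyScanner` is a hypothetical pure-Python stand-in written for this example, not the GLib implementation: it classifies a word as a symbol or an identifier at peek time and hands back that cached result on the next get, even if the symbol set changed in between.

```python
class TinyScanner:
    """Toy model of peek caching; not the GLib implementation."""

    def __init__(self, words, symbols):
        self._words = list(words)
        self._index = 0
        self.symbols = set(symbols)  # stands in for the current scope
        self._peeked = None          # cached result of the last peek

    def _classify(self):
        if self._index >= len(self._words):
            return ("EOF", None)
        word = self._words[self._index]
        kind = "SYMBOL" if word in self.symbols else "IDENTIFIER"
        return (kind, word)

    def peek_next_token(self):
        # Classify once and remember the result.
        if self._peeked is None:
            self._peeked = self._classify()
        return self._peeked

    def get_next_token(self):
        # Reuse the cached peek if there is one; otherwise classify now.
        token = self._peeked if self._peeked is not None else self._classify()
        self._peeked = None
        if token[0] != "EOF":
            self._index += 1
        return token


scanner = TinyScanner(["while", "x"], symbols=[])
peeked = scanner.peek_next_token()  # classified with an empty symbol set
scanner.symbols.add("while")        # "scope change" after the peek
got = scanner.get_next_token()      # still the stale classification
print(peeked, got)
```

In this model, the way to avoid the stale result is to change the symbol set before peeking rather than after.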
- scope_add_symbol(scope_id: int, symbol: str, value: None) None #
Adds a symbol to the given scope.
- Parameters:
scope_id – the scope id
symbol – the symbol to add
value – the value of the symbol
- scope_foreach_symbol(scope_id: int, func: Callable[[...], None], *user_data: Any) None #
Calls the given function for each of the symbol/value pairs in the given scope of the Scanner. The function is passed the symbol and value of each pair, and the given user_data parameter.
- Parameters:
scope_id – the scope id
func – the function to call for each symbol/value pair
user_data – user data to pass to the function
- scope_lookup_symbol(scope_id: int, symbol: str) None #
Looks up a symbol in a scope and returns its value. If the symbol is not bound in the scope, None is returned.
- Parameters:
scope_id – the scope id
symbol – the symbol to look up
- scope_remove_symbol(scope_id: int, symbol: str) None #
Removes a symbol from a scope.
- Parameters:
scope_id – the scope id
symbol – the symbol to remove
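The four scope methods above behave like operations on one symbol table per scope id. The `ScopeTable` class below is a hypothetical pure-Python model of that idea, reusing the method names only for readability. It is not the GLib implementation and ignores any scanner configuration that may influence lookups.

```python
class ScopeTable:
    """Toy model of per-scope symbol tables; not the GLib implementation."""

    def __init__(self):
        self._scopes = {}  # scope_id -> {symbol: value}

    def scope_add_symbol(self, scope_id, symbol, value):
        self._scopes.setdefault(scope_id, {})[symbol] = value

    def scope_lookup_symbol(self, scope_id, symbol):
        # None when the symbol is not bound in that scope.
        return self._scopes.get(scope_id, {}).get(symbol)

    def scope_remove_symbol(self, scope_id, symbol):
        self._scopes.get(scope_id, {}).pop(symbol, None)

    def scope_foreach_symbol(self, scope_id, func, *user_data):
        # Iterate over a snapshot so func may safely modify the scope.
        for symbol, value in list(self._scopes.get(scope_id, {}).items()):
            func(symbol, value, *user_data)


table = ScopeTable()
table.scope_add_symbol(0, "if", 1)
table.scope_add_symbol(1, "if", 2)
table.scope_add_symbol(1, "else", 3)

seen = []
table.scope_foreach_symbol(1, lambda sym, val, out: out.append((sym, val)), seen)
table.scope_remove_symbol(1, "else")
print(table.scope_lookup_symbol(0, "if"), table.scope_lookup_symbol(1, "else"), seen)
```

The same symbol name can carry different values in different scopes, and a lookup in one scope does not see bindings from another in this model.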
- sync_file_offset() None #
Rewinds the file descriptor to the current buffer position and discards the file read-ahead buffer. This is useful for third-party code that uses the scanner’s file descriptor and needs it to match the current scanning position.
- unexp_token(expected_token: TokenType, identifier_spec: str, symbol_spec: str, symbol_name: str, message: str, is_error: int) None #
Outputs a message through the scanner’s msg_handler, resulting from an unexpected token in the input stream. Note that you should not call peek_next_token() followed by unexp_token() without an intermediate call to get_next_token(), as unexp_token() evaluates the scanner’s current token (not the peeked token) to construct part of the message.
- Parameters:
expected_token – the expected token
identifier_spec – a string describing how the scanner’s user refers to identifiers (None defaults to “identifier”). This is used if expected_token is IDENTIFIER or IDENTIFIER_NULL.
symbol_spec – a string describing how the scanner’s user refers to symbols (None defaults to “symbol”). This is used if expected_token is SYMBOL or any token value greater than G_TOKEN_LAST.
symbol_name – the name of the symbol, if the scanner’s current token is a symbol.
message – a message string to output at the end of the warning/error, or None.
is_error – if True it is output as an error. If False it is output as a warning.
Fields#
- class Scanner
- buffer#
- config#
Link into the scanner configuration
- input_fd#
- input_name#
Name of input stream, featured by the default message handler
- line#
Line number of the last token from get_next_token()
- max_parse_errors#
Unused
- msg_handler#
Handler function for _warn and _error
- next_line#
Line number of the last token from peek_next_token()
- next_position#
Char number of the last token from peek_next_token()
- next_token#
Token parsed by the last peek_next_token()
- next_value#
Value of the last token from peek_next_token()
- parse_errors#
error() increments this field
- position#
Char number of the last token from get_next_token()
- qdata#
Quarked data
- scope_id#
- symbol_table#
- text#
- text_end#
- token#
Token parsed by the last get_next_token()
- user_data#
Unused
- value#
Value of the last token from get_next_token()