penman.lexer
Classes and functions for lexing PENMAN strings.
Module Constants
-
penman.lexer.PATTERNS
A dictionary mapping token names to regular expressions. For instance:
'ROLE': r':[^\s()\/,:~^]*'
The token names are used later by the TokenIterator to help with parsing.
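As a rough illustration of how a name-to-regex table like this can drive a lexer, the sketch below joins the entries into one alternation of named groups, so each match reports its token name. Only the ROLE pattern is taken from the text above. The other entries, the `patterns` and `combined` names, and the `token_types` helper are assumptions made for the example, not the library's own definitions.

```python
import re

# Hypothetical token table in the style of PATTERNS. Only the ROLE
# entry comes from the documentation; the rest are illustrative.
patterns = {
    'ROLE': r':[^\s()\/,:~^]*',
    'LPAREN': r'\(',
    'RPAREN': r'\)',
    'SLASH': r'/',
    'SYMBOL': r'[^\s()\/,:~^]+',
}

# One named group per token type; alternation order sets priority.
combined = re.compile(
    '|'.join(f'(?P<{name}>{regex})' for name, regex in patterns.items())
)


def token_types(s):
    """Return (token name, matched text) pairs for every match in s."""
    return [(m.lastgroup, m.group()) for m in combined.finditer(s)]


pairs = token_types('(b / bark :ARG0 (d / dog))')
for name, text in pairs:
    print(name, repr(text))
```

Because `lastgroup` names the group that matched, a parser can branch on the token name rather than re-inspecting the text.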
-
penman.lexer.PENMAN_RE
A compiled regular expression pattern for lexing PENMAN graphs.
-
penman.lexer.TRIPLE_RE
A compiled regular expression pattern for lexing triple conjunctions.
Module Functions
-
penman.lexer.lex(lines, pattern=None)
Yield PENMAN tokens matched in lines.
By default, this lexes the strings in lines using the basic pattern for PENMAN graphs. If pattern is given, it is used for lexing instead.
- Parameters
lines – iterable of lines to lex
pattern – pattern to use for lexing instead of the default pattern
- Returns
A TokenIterator object
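The default-versus-custom pattern behavior described above can be sketched with a small stand-in generator. This is not the library's implementation: `Tok`, `DEFAULT_RE`, and `lex_sketch` are hypothetical names, the default pattern is a simplified guess, and lines are numbered from 0 here, which may differ from the library's convention.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional, Pattern


class Tok(NamedTuple):
    """Minimal stand-in token: type, text, line number, column offset."""
    type: str
    text: str
    lineno: int
    offset: int


# Hypothetical default pattern; not the library's PENMAN_RE.
DEFAULT_RE = re.compile(
    r'(?P<ROLE>:[^\s()\/,:~^]*)'
    r'|(?P<PAREN>[()])'
    r'|(?P<SLASH>/)'
    r'|(?P<SYMBOL>[^\s()\/,:~^]+)'
)


def lex_sketch(
    lines: Iterable[str],
    pattern: Optional[Pattern[str]] = None,
) -> Iterator[Tok]:
    """Yield tokens line by line, using pattern if one is given."""
    regex = DEFAULT_RE if pattern is None else pattern
    for lineno, line in enumerate(lines):
        for m in regex.finditer(line):
            yield Tok(m.lastgroup, m.group(), lineno, m.start())


tokens = list(lex_sketch(['(b / bark', '   :ARG0 (d / dog))']))
words = list(lex_sketch(['a b'], re.compile(r'(?P<WORD>\w+)')))
```

Passing a compiled pattern replaces the default entirely, which mirrors the "used for lexing instead" wording of the pattern parameter.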
Classes
-
class penman.lexer.Token
A lexed token.
-
property line
The line the token appears in.
-
property lineno
The line number the token appears on.
-
property offset
The character offset of the token.
-
property text
The matched string for the token.
-
property type
The token type.
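One plausible use of these fields is error reporting: lineno, offset, and line together are enough to point at a token in its source line. The sketch below assumes a plain named tuple with the five documented fields. `TokenSketch` and `point_at` are hypothetical names for illustration, not part of the library.

```python
from typing import NamedTuple


class TokenSketch(NamedTuple):
    """Stand-in record carrying the five documented token fields."""
    type: str
    text: str
    lineno: int
    offset: int
    line: str


def point_at(tok: TokenSketch) -> str:
    """Render the source line with a caret under the token's first character."""
    return f'line {tok.lineno}\n  {tok.line}\n  {" " * tok.offset}^'


tok = TokenSketch('ROLE', ':ARG0', 2, 3, '   :ARG0 (d / dog))')
report = point_at(tok)
print(report)
```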
-
class penman.lexer.TokenIterator(iterator)
An iterator of Tokens with L1 (one-token) lookahead.
-
accept(*choices)
Return the next token if its type is in choices.
The iterator is advanced if successful. If unsuccessful, None is returned.
-
expect(*choices)
Return the next token if its type is in choices.
The iterator is advanced if successful.
- Raises
DecodeError – if the next token type is not in choices
-
next()
Advance the iterator and return the next token.
- Raises
StopIteration – if the iterator is already exhausted.
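The difference between accept, expect, and next is easiest to see in a small stand-in that buffers one upcoming token. The class below is a minimal sketch of that one-token lookahead idea, not the library's TokenIterator. `Tok`, `LookaheadTokens`, and `DecodeErrorSketch` are hypothetical names, and the error message format is an assumption.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Tok(NamedTuple):
    type: str
    text: str


class DecodeErrorSketch(Exception):
    """Stand-in for the library's DecodeError."""


class LookaheadTokens:
    """Token stream that buffers exactly one upcoming token."""

    def __init__(self, iterator: Iterable[Tok]) -> None:
        self._it: Iterator[Tok] = iter(iterator)
        self._next: Optional[Tok] = next(self._it, None)

    def next(self) -> Tok:
        """Return the buffered token and refill the buffer."""
        if self._next is None:
            raise StopIteration
        current = self._next
        self._next = next(self._it, None)
        return current

    def accept(self, *choices: str) -> Optional[Tok]:
        """Consume the next token only if its type is in choices."""
        if self._next is not None and self._next.type in choices:
            return self.next()
        return None

    def expect(self, *choices: str) -> Tok:
        """Like accept, but a mismatch is an error rather than None."""
        tok = self.accept(*choices)
        if tok is None:
            found = self._next.type if self._next else 'end of input'
            raise DecodeErrorSketch(f'expected one of {choices}, found {found}')
        return tok


stream = LookaheadTokens(
    [Tok('LPAREN', '('), Tok('SYMBOL', 'b'), Tok('RPAREN', ')')]
)
opened = stream.expect('LPAREN')
skipped = stream.accept('SLASH')  # no match: returns None, stream not advanced
symbol = stream.expect('SYMBOL', 'STRING')
```

In a parser built on this shape, accept suits optional elements, while expect suits mandatory ones where a mismatch should stop decoding.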