penman.lexer

Classes and functions for lexing PENMAN strings.

Module Constants

penman.lexer.PATTERNS

A dictionary mapping token names to regular expressions. For instance:

'ROLE':  r':[^\s()\/,:~^]*'

The token names are used later by the TokenIterator to help with parsing.
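To illustrate what the ROLE expression accepts, the short sketch below applies it with Python's standard `re` module. Only the regular expression is taken from the entry above; the sample strings and variable names are made up for illustration.

```python
import re

# The ROLE pattern exactly as shown in the PATTERNS entry above.
ROLE = r':[^\s()\/,:~^]*'

# Hypothetical sample input; every role starts with a colon.
text = '(b / bark :ARG0 (d / dog) :polarity -)'
roles = re.findall(ROLE, text)
print(roles)  # [':ARG0', ':polarity']

# "~" is excluded from the character class, so the match stops before it.
aligned = re.match(ROLE, ':ARG1~e.4')
print(aligned.group())  # :ARG1
```

Because the character class is followed by `*`, a bare colon also matches on its own.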

penman.lexer.PENMAN_RE

A compiled regular expression pattern for lexing PENMAN graphs.

penman.lexer.TRIPLE_RE

A compiled regular expression pattern for lexing triple conjunctions.

Module Functions

penman.lexer.lex(lines, pattern=None)

Yield PENMAN tokens matched in lines.

By default, this lexes strings in lines using the basic pattern for PENMAN graphs. If pattern is given, it is used for lexing instead.

Parameters
  • lines – iterable of lines to lex

  • pattern – pattern to use for lexing instead of the default

Returns

A TokenIterator object
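The fallback between the default pattern and a caller-supplied one can be pictured with the standard-library sketch below. It is not the library's implementation: `lex_sketch` and `DEFAULT_RE` are hypothetical names, only the ROLE expression comes from PATTERNS, and the sketch yields plain (name, text) pairs rather than returning a TokenIterator.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical default pattern; only the ROLE regex comes from PATTERNS.
DEFAULT_RE = re.compile(
    r'(?P<ROLE>:[^\s()\/,:~^]*)'
    r'|(?P<SYMBOL>[^\s()\/,:~^]+)'
    r'|(?P<LPAREN>\()'
    r'|(?P<RPAREN>\))'
    r'|(?P<SLASH>\/)'
)


def lex_sketch(
    lines: Iterable[str],
    pattern: Optional[re.Pattern] = None,
) -> Iterator[Tuple[str, str]]:
    """Yield (token name, matched text) pairs for each line."""
    # Use the default pattern unless the caller supplies one.
    regex = DEFAULT_RE if pattern is None else pattern
    for line in lines:
        for match in regex.finditer(line):
            # lastgroup is the name of the alternative that matched.
            yield match.lastgroup, match.group()


lines = ['(b / bark', '   :ARG0 (d / dog))']
default_tokens = list(lex_sketch(lines))

# A caller-supplied pattern replaces the default entirely.
roles_only = re.compile(r'(?P<ROLE>:[^\s()\/,:~^]*)')
role_tokens = list(lex_sketch(lines, pattern=roles_only))
print(role_tokens)  # [('ROLE', ':ARG0')]
```

Naming each alternative as a group lets a single compiled expression report which kind of token it matched, which is one plausible reason for keeping the patterns in a name-to-regex dictionary.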

Classes

class penman.lexer.Token

A lexed token.

property line

The line the token appears in.

property lineno

The line number the token appears on.

property offset

The character offset of the token.

property text

The matched string for the token.

property type

The token type.
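As a rough illustration of how these five properties fit together, the sketch below models a token as a named tuple and uses `line` and `offset` to point at the token in its source line. `TokenSketch` and `point_at` are hypothetical names, and the sketch assumes that line numbers start at 1 and that the offset counts characters from the start of the line; neither assumption is confirmed by the text above.

```python
from typing import NamedTuple


class TokenSketch(NamedTuple):
    """Hypothetical stand-in with the same five fields as Token."""
    type: str
    text: str
    lineno: int  # assumed to start at 1
    offset: int  # assumed to count from the start of the line
    line: str


def point_at(token: TokenSketch) -> str:
    """Show the token's source line with a caret under its first character."""
    return f'line {token.lineno}\n{token.line}\n{" " * token.offset}^'


line = '(b / bark :ARG0 (d / dog))'
token = TokenSketch('ROLE', ':ARG0', lineno=1, offset=line.index(':ARG0'), line=line)
message = point_at(token)
print(message)
```

Keeping the whole line on the token makes this kind of error message possible without going back to the original input.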

class penman.lexer.TokenIterator(iterator)

An iterator of Tokens with one token of lookahead.

accept(*choices)

Return the next token if its type is in choices.

The iterator is advanced if successful. If unsuccessful, None is returned.

expect(*choices)

Return the next token if its type is in choices.

The iterator is advanced if successful.

Raises

DecodeError – if the next token type is not in choices

next()

Advance the iterator and return the next token.

Raises

StopIteration – if the iterator is already exhausted.

peek()

Return the next token but do not advance the iterator.

Raises

DecodeError – if the iterator is exhausted
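The difference between peek(), next(), accept(), and expect() may be easier to see in a small standalone sketch. The class below is a hypothetical stand-in written for this purpose, not the library's TokenIterator: tokens are simplified to (type, text) tuples, and `ValueError` stands in for DecodeError.

```python
from typing import Iterable, Iterator, Optional, Tuple

Tok = Tuple[str, str]  # (type, text); a simplified stand-in for Token


class LookaheadSketch:
    """Hypothetical one-token-lookahead wrapper; not the library's class."""

    def __init__(self, iterator: Iterable[Tok]) -> None:
        self._iterator: Iterator[Tok] = iter(iterator)
        # Buffer the upcoming token; None marks exhaustion.
        self._next: Optional[Tok] = next(self._iterator, None)

    def peek(self) -> Tok:
        if self._next is None:
            # ValueError stands in for the documented DecodeError.
            raise ValueError('unexpected end of input')
        return self._next

    def next(self) -> Tok:
        if self._next is None:
            raise StopIteration
        current = self._next
        self._next = next(self._iterator, None)
        return current

    def accept(self, *choices: str) -> Optional[Tok]:
        # Advance only when the buffered token has one of the given types.
        if self._next is not None and self._next[0] in choices:
            return self.next()
        return None

    def expect(self, *choices: str) -> Tok:
        token = self.accept(*choices)
        if token is None:
            raise ValueError(f'expected one of: {", ".join(choices)}')
        return token


tokens = LookaheadSketch([('LPAREN', '('), ('SYMBOL', 'd'), ('RPAREN', ')')])
first = tokens.peek()                       # looks ahead without consuming
opened = tokens.expect('LPAREN')            # consumes the token just peeked
skipped = tokens.accept('ROLE')             # None; SYMBOL is still next
symbol = tokens.expect('SYMBOL', 'STRING')  # any listed type is allowed
closed = tokens.next()
```

In a parser, accept() suits optional elements, where a missing token is not an error, while expect() suits required ones, where a mismatch should stop decoding.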