penman.codec

Serialization of PENMAN graphs.

class penman.codec.PENMANCodec(model=None)

An encoder/decoder for PENMAN-serialized graphs.

ATOMS = {'FLOAT', 'INTEGER', 'STRING', 'SYMBOL'}

The valid non-node targets of edges.

decode(s, triples=False)

Deserialize PENMAN-notation string s into its Graph object.

Parameters
  • s – a string containing a single PENMAN-serialized graph

  • triples – if True, parse s as a triple conjunction

Returns

The Graph object described by s.

Example

>>> codec = PENMANCodec()
>>> codec.decode('(b / bark :ARG1 (d / dog))')
<Graph object (top=b) at ...>
>>> codec.decode(
...     'instance(b, bark) ^ instance(d, dog) ^ ARG1(b, d)',
...     triples=True
... )
<Graph object (top=b) at ...>

iterdecode(lines, triples=False)

Yield graphs parsed from lines.

Parameters
  • lines – a string or open file with PENMAN-serialized graphs

  • triples – if True, parse each graph in lines as a triple conjunction

Returns

The Graph objects described in lines.

parse(s)

Parse PENMAN-notation string s into its tree structure.

Parameters

s – a string containing a single PENMAN-serialized graph

Returns

The tree structure described by s.

Example

>>> codec = PENMANCodec()
>>> codec.parse('(b / bark :ARG1 (d / dog))')
Tree(('b', [('/', 'bark', []), ('ARG1', ('d', [('/', 'dog', [])]), [])]))
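
The nested structure shown above can be walked with ordinary tuple unpacking. The following is a minimal sketch, not part of the penman API: it assumes the plain-tuple shape printed in the example, where a node is (variable, branches) and each branch is (role, target, epidata). The name tree_to_triples is hypothetical.

```python
def tree_to_triples(node):
    """Flatten a (variable, branches) node into (source, role, target) triples.

    Hypothetical helper; it operates on the plain nested tuple shown in the
    parse() example, not on a Tree object.
    """
    var, branches = node
    triples = []
    for role, target, _epidata in branches:
        if isinstance(target, tuple):
            # Nested node: link to its variable, then descend into it.
            triples.append((var, role, target[0]))
            triples.extend(tree_to_triples(target))
        else:
            # Atomic target such as a concept or constant.
            triples.append((var, role, target))
    return triples


node = ('b', [('/', 'bark', []), ('ARG1', ('d', [('/', 'dog', [])]), [])])
print(tree_to_triples(node))
```

The epidata slot of each branch is ignored here; it is empty in the example.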

parse_triples(s)

Parse a triple conjunction from s.
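
To illustrate the input format, here is a standard-library sketch that reads a triple conjunction of the form role(source, target) ^ role(source, target). It is not the library's parser and makes no claim about what parse_triples returns; the function name, the regular expression, and the (source, role, target) result order are assumptions made for the example. It does not handle quoted string targets that contain spaces, commas, parentheses, or '^'.

```python
import re

# One triple: role(source, target), with optional surrounding whitespace.
_TRIPLE = re.compile(r'\s*([^\s(),^]+)\s*\(\s*([^\s(),]+)\s*,\s*([^\s(),]+)\s*\)\s*')


def parse_triple_conjunction(s):
    """Return (source, role, target) tuples from a '^'-separated conjunction.

    Hypothetical helper for illustration only.
    """
    triples = []
    for part in s.split('^'):
        match = _TRIPLE.fullmatch(part)
        if match is None:
            raise ValueError(f'not a triple: {part!r}')
        role, source, target = match.groups()
        triples.append((source, role, target))
    return triples


print(parse_triple_conjunction('instance(b, bark) ^ instance(d, dog) ^ ARG1(b, d)'))
```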

encode(g, top=None, triples=False, indent=-1, compact=False)

Serialize the graph g into PENMAN notation.

Parameters
  • g – the Graph object

  • top – if given, the node to use as the top in serialization

  • triples – if True, serialize as a conjunction of triples

  • indent – how to indent formatted strings

  • compact – if True, put initial attributes on the first line

Returns

The PENMAN-serialized string of the Graph g.

Example

>>> from penman.graph import Graph
>>> codec = PENMANCodec()
>>> codec.encode(Graph([('h', 'instance', 'hi')]))
'(h / hi)'
>>> codec.encode(Graph([('h', 'instance', 'hi')]),
...              triples=True)
'instance(h, hi)'

format(tree, indent=-1, compact=False)

Format tree into a PENMAN string.

format_triples(triples, indent=True)

Return the formatted triple conjunction of triples.

Parameters
  • triples – an iterable of triples

  • indent – how to indent formatted strings

Returns

The serialized triple conjunction of triples.

Example

>>> codec = PENMANCodec()
>>> codec.format_triples([('a', ':instance', 'alpha'),
...                       ('a', ':ARG0', 'b'),
...                       ('b', ':instance', 'beta')])
'instance(a, alpha) ^\nARG0(a, b) ^\ninstance(b, beta)'
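
The output above suggests a simple shape: each triple becomes role(source, target) with the leading colon dropped from the role, and the triples are joined with ' ^'. The sketch below reproduces that string using only the standard library. It is an illustration rather than the library's implementation, and both the function name and the one_per_line flag are hypothetical.

```python
def format_triple_conjunction(triples, one_per_line=True):
    """Join (source, role, target) triples into a '^'-separated conjunction.

    Hypothetical helper; one_per_line is an assumed flag for this sketch and
    is not claimed to match the indent parameter of format_triples().
    """
    delimiter = ' ^\n' if one_per_line else ' ^ '
    return delimiter.join(
        # Drop any leading ':' from the role, as in the documented output.
        f"{role.lstrip(':')}({source}, {target})"
        for source, role, target in triples
    )


triples = [('a', ':instance', 'alpha'),
           ('a', ':ARG0', 'b'),
           ('b', ':instance', 'beta')]
print(repr(format_triple_conjunction(triples)))
```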