libearth.parser.base — Base Parser

Common interfaces used by both the Atom parser and the RSS2 parser.

class libearth.parser.base.ParserBase(parser=None)

ParserBase is the base object for defining parsers. A defined parser takes an XML element and returns a parsed Element object. Every parser is defined together with a path of elements to take (e.g. 'channel/item'), given through the path() decorator.

Every decorated function becomes a child parser of the parser that decorates it.

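from libearth.parser.base import ParserBase

# Root of the parser hierarchy.
rss2_parser = ParserBase()

@rss2_parser.path('channel')
def channel_parser(element, session):
    ...  # parse the <channel> element

@channel_parser.path('item')
def item_parser(element, session):
    ...  # parse each <item> under <channel>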
path(element_name, namespace_set=None, attr_name=None)

The decorator that defines a parser, either at the top of the parser hierarchy or as one of its child parsers.

Parameters:
  • element_name (str) – The element id, which consists of an XML namespace and an element name. The parser should return a libearth.feed.Element that matches it.
  • attr_name – The descriptor attribute name on the parent libearth.feed.Element for the designated Element.
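
To make the decorator mechanism concrete, here is a minimal sketch of how a path-style decorator can build a parser hierarchy. ToyParser and everything in it are hypothetical stand-ins written for illustration, not libearth's implementation; the sketch ignores sessions, namespace_set and attr_name entirely.

```python
import xml.etree.ElementTree as ET


class ToyParser:
    """Hypothetical stand-in for a path-based parser; not libearth's code."""

    def __init__(self, func=None):
        self.func = func      # function that handles a matched element
        self.children = {}    # element path -> child ToyParser

    def path(self, element_name):
        """Register the decorated function as a child parser."""
        def decorator(func):
            child = ToyParser(func)
            self.children[element_name] = child
            # Return the parser, so the decorated name has path() too.
            return child
        return decorator

    def __call__(self, element):
        # Run this parser's own function (the root has none), then
        # hand every matching sub-element to the registered children.
        result = self.func(element) if self.func else None
        parsed_children = []
        for name, child in self.children.items():
            for sub_element in element.findall(name):
                parsed_children.append(child(sub_element))
        return result, parsed_children


root_parser = ToyParser()


@root_parser.path('channel')
def channel_parser(element):
    return element.findtext('title')


@channel_parser.path('item')
def item_parser(element):
    return element.findtext('title')


doc = ET.fromstring(
    '<rss><channel><title>Feed</title>'
    '<item><title>First</title></item>'
    '<item><title>Second</title></item>'
    '</channel></rss>'
)
parsed = root_parser(doc)
```

Because the decorator returns a parser object rather than the plain function, channel_parser can itself be used to decorate item_parser, which is what makes the nesting possible.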
class libearth.parser.base.SessionBase

Additional data needed for parsing elements. For example, xml:base is needed to resolve the full URI when a relative URI is given in an Atom element. A session object is passed from the root parser to its child parsers, and a change to the session affects only the parser where the change occurs and its child parsers.
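
One way to get this scoping behaviour is to derive a new session instead of mutating the one that was passed in. The sketch below assumes that approach; ToySession, with_base and resolve are hypothetical names and are not part of SessionBase, whose actual attributes are not described here.

```python
from urllib.parse import urljoin


class ToySession:
    """Hypothetical session holding only an xml:base value."""

    def __init__(self, xml_base):
        self.xml_base = xml_base

    def with_base(self, new_base):
        # Return a new session rather than mutating this one, so the
        # change is seen only by the parser that made it and by the
        # children it passes the new session to.
        return ToySession(urljoin(self.xml_base, new_base))

    def resolve(self, uri):
        # Turn a possibly relative URI into a full one.
        return urljoin(self.xml_base, uri)


root_session = ToySession('http://example.com/feed/')
entry_session = root_session.with_base('entries/')

full_uri = entry_session.resolve('1.html')
sibling_uri = root_session.resolve('1.html')
```

Here full_uri is resolved against the narrowed base, while sibling_uri still uses the root session's base, since the root session was never modified.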

libearth.parser.base.get_element_id(name_space, element_name)

Returns a string combining name_space and element_name. The return value has the form '{namespace}element_name'.
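
The '{namespace}element_name' form is the same notation that Python's xml.etree.ElementTree uses for qualified tag names. The sketch below uses a stand-in function, combine_element_id (a hypothetical name, written here from the description above rather than copied from libearth), to show how such an id can be used to look up a namespaced element.

```python
import xml.etree.ElementTree as ET


def combine_element_id(name_space, element_name):
    # Stand-in written from the documented return format; it is an
    # assumption that libearth builds the string the same way.
    return '{' + name_space + '}' + element_name


ATOM_NS = 'http://www.w3.org/2005/Atom'

doc = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom"><title>T</title></feed>'
)

element_id = combine_element_id(ATOM_NS, 'title')
title = doc.find(element_id)
```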