libearth.parser.rss2 — RSS 2.0 parser

Parses RSS 2.0 feeds.

libearth.parser.rss2.guess_default_tzinfo(root, url)

Guess the time zone implied by the feed, based on the TLD of url and the feed's <language> tag.
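
The heuristic described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not libearth's actual implementation: the function name guess_tz_sketch, the two lookup tables, the TLD-before-language order, and the UTC fallback are all assumptions made for the example.

```python
import datetime
import urllib.parse
import xml.etree.ElementTree as ET

# Hypothetical lookup tables (UTC offsets in hours), for illustration only.
TLD_OFFSETS = {'kr': 9, 'jp': 9}
LANGUAGE_OFFSETS = {'ko': 9, 'ja': 9}


def guess_tz_sketch(root, url):
    """Guess a tzinfo from the URL's TLD, then from <language>."""
    host = urllib.parse.urlparse(url).hostname or ''
    tld = host.rpartition('.')[2].lower()
    if tld in TLD_OFFSETS:
        offset = datetime.timedelta(hours=TLD_OFFSETS[tld])
        return datetime.timezone(offset)
    # <language> holds values such as "ko" or "ko-KR"; keep the primary tag.
    language = root.findtext('channel/language') or ''
    primary = language.strip().lower().split('-')[0]
    if primary in LANGUAGE_OFFSETS:
        offset = datetime.timedelta(hours=LANGUAGE_OFFSETS[primary])
        return datetime.timezone(offset)
    # Assumed fallback when neither hint matches.
    return datetime.timezone.utc


root = ET.fromstring(
    '<rss version="2.0"><channel><language>ko-KR</language></channel></rss>'
)
tz = guess_tz_sketch(root, 'http://example.com/rss.xml')
```

Here root is the parsed <rss> element, so the <language> tag is looked up under its <channel> child.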

libearth.parser.rss2.parse_rss(xml, feed_url=None, parse_entry=True)

Parse RSS 2.0 XML and translate it into Atom.

To make the feed data valid in Atom format, the id and link[rel=self] fields are set to the URL of the feed.

If pubDate is not present, the updated field is taken from the latest entry’s updated time or, failing that, from the time the feed was crawled.
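
The two rules above (feed URL as id and self link, and the fallback chain for updated) might look like this in a standalone sketch. It uses only the standard library and returns a plain dict rather than a libearth Feed; the helper name atom_fields_sketch and its crawled_at parameter are hypothetical, and each item's pubDate stands in for the entry's updated time.

```python
import datetime
import email.utils
import xml.etree.ElementTree as ET


def atom_fields_sketch(xml, feed_url, crawled_at):
    """Derive Atom-style id, self link and updated from RSS 2.0 XML."""
    channel = ET.fromstring(xml).find('channel')
    pub_date = channel.findtext('pubDate')
    if pub_date:
        # RSS 2.0 dates use the RFC 822 format.
        updated = email.utils.parsedate_to_datetime(pub_date)
    else:
        item_dates = [
            email.utils.parsedate_to_datetime(text)
            for text in (item.findtext('pubDate')
                         for item in channel.findall('item'))
            if text
        ]
        # Latest entry time if any, otherwise the crawl time.
        updated = max(item_dates) if item_dates else crawled_at
    return {
        'id': feed_url,
        'links': [{'rel': 'self', 'href': feed_url}],
        'updated': updated,
    }


sample = '''<rss version="2.0"><channel><title>Example</title>
<item><pubDate>Mon, 06 Sep 2010 00:01:00 +0000</pubDate></item>
<item><pubDate>Tue, 07 Sep 2010 00:01:00 +0000</pubDate></item>
</channel></rss>'''
crawled = datetime.datetime(2020, 1, 1, tzinfo=datetime.timezone.utc)
fields = atom_fields_sketch(sample, 'http://example.com/rss.xml', crawled)
```

Because the sample channel has no pubDate of its own, updated comes from the later of the two items; with no items at all it would fall back to crawled.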

Parameters:
  • xml (str) – RSS 2.0 XML string to parse
  • parse_entry (bool) – whether to parse items (entries) as well. Skipping items is useful when only the <source> is being retrieved. True by default
Returns:

a pair of (Feed, crawler hint)

Return type:

tuple
