libearth.parser.autodiscovery — Autodiscovery

This module provides functions to autodiscover feed URLs in a document.

libearth.parser.autodiscovery.ATOM_TYPE = 'application/atom+xml'

(str) The MIME type of Atom format (application/atom+xml).

libearth.parser.autodiscovery.RSS_TYPE = 'application/rss+xml'

(str) The MIME type of RSS 2.0 format (application/rss+xml).

libearth.parser.autodiscovery.TYPE_TABLE = {<function parse_rss at 0x7fe1db2b5140>: 'application/rss+xml', <function parse_atom at 0x7fe1db2b1668>: 'application/atom+xml'}

(collections.Mapping) The mapping table of feed types.

class libearth.parser.autodiscovery.AutoDiscovery

Parse the given HTML and try to find the actual feed URLs in it.

Changed in version 0.3.0: It now finds icon links as well. The find_feed_url() method (which returned only feed links) was removed, and find() (which returns a pair of feed links and icon links) was introduced in its place.
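
As a rough illustration of the idea, the sketch below collects feed links and icon links from link elements using only the standard library's html.parser. LinkCollector and its attributes are hypothetical names made up for this example, and the matching rules (rel="alternate" plus a feed MIME type, rel containing "icon") are assumptions; this is not libearth's implementation.

```python
from html.parser import HTMLParser

# MIME types taken from the module constants documented on this page.
ATOM_TYPE = 'application/atom+xml'
RSS_TYPE = 'application/rss+xml'


class LinkCollector(HTMLParser):
    """Hypothetical stand-in for AutoDiscovery (illustration only)."""

    def __init__(self):
        super().__init__()
        self.feed_links = []  # (type, href) pairs
        self.icon_links = []  # href strings

    def handle_starttag(self, tag, attrs):
        if tag != 'link':
            return
        attrs = dict(attrs)
        rel = (attrs.get('rel') or '').lower().split()
        href = attrs.get('href')
        if not href:
            return
        if 'alternate' in rel and attrs.get('type') in (ATOM_TYPE, RSS_TYPE):
            self.feed_links.append((attrs['type'], href))
        elif 'icon' in rel:
            self.icon_links.append(href)

    def find(self, html):
        """Return a pair of (feed links, icon links)."""
        self.feed(html)
        return self.feed_links, self.icon_links


html = '''<html><head>
<link rel="alternate" type="application/atom+xml" href="/feed.atom">
<link rel="shortcut icon" href="/favicon.ico">
</head><body></body></html>'''
feeds, icons = LinkCollector().find(html)
```

The href values are returned exactly as written in the page; resolving relative ones against the page URL is a separate step.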

Namedtuple which is a pair of type and url.

type

Alias for field number 0

url

Alias for field number 1
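
A pair like this can be sketched with collections.namedtuple. The definition below is an assumption for illustration: the name FeedLink is borrowed from the return-value description of autodiscovery() on this page, and the field order (type first, url second) follows the order given in the description above.

```python
from collections import namedtuple

# Assumed definition, for illustration only: field 0 is the MIME type,
# field 1 is the feed URL.
FeedLink = namedtuple('FeedLink', 'type url')

link = FeedLink(type='application/atom+xml',
                url='http://example.com/feed.atom')

# Fields are reachable by name or by position, and the tuple unpacks
# like an ordinary pair.
mime_type, href = link
```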

exception libearth.parser.autodiscovery.FeedUrlNotFoundError(msg)

Exception raised when a feed URL cannot be found in HTML.

libearth.parser.autodiscovery.autodiscovery(document, url)

If the given url refers to an actual feed, it returns the given url without any change.

If the given url is the URL of an ordinary web page (i.e. text/html), it finds the URLs of the corresponding feeds. It returns feed URLs in lexicographical order of their feed types.

If autodiscovery fails, it raises FeedUrlNotFoundError.

Parameters:
  • document (str) – HTML or XML string
  • url (str) – the URL used to retrieve the document. If a feed URL in the HTML is relative, it is resolved against this URL

Returns: list of FeedLink objects

Return type:
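
The post-processing described above (resolving relative feed URLs against the page URL, ordering by feed type, and raising when nothing is found) might look roughly like the sketch below. resolve_feed_links is a hypothetical helper, the exception class is a local stand-in for the one documented on this page, and plain (type, href) tuples are used in place of FeedLink objects; none of this is libearth's actual code.

```python
from urllib.parse import urljoin


class FeedUrlNotFoundError(Exception):
    """Local stand-in for the exception documented on this page."""


def resolve_feed_links(links, page_url):
    """Resolve and order (type, href) pairs found in a page.

    Hypothetical helper: `links` is a list of (type, href) pairs.
    """
    if not links:
        raise FeedUrlNotFoundError('Cannot find feed url')
    # Relative hrefs are rebuilt on top of the URL the page came from.
    resolved = [(type_, urljoin(page_url, href)) for type_, href in links]
    # Sorting by the MIME type string puts Atom before RSS.
    return sorted(resolved, key=lambda pair: pair[0])


found = [
    ('application/rss+xml', 'rss.xml'),
    ('application/atom+xml', '/feeds/atom.xml'),
]
result = resolve_feed_links(found, 'http://example.com/blog/index.html')
```

With these inputs the Atom link comes first, because 'application/atom+xml' sorts before 'application/rss+xml'.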



Guess the syndication format of an arbitrary document.

Parameters: document (str, bytes) – document string to guess
Returns: a function that can parse the given document
Return type: collections.Callable

Changed in version 0.2.0: Before 0.2.0 the function was in the libearth.parser.heuristic module (which has since been removed); it now lives in libearth.parser.autodiscovery.
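
One plausible way to guess a syndication format is to inspect the root element of the document. The sketch below does that with xml.etree.ElementTree and returns a parser function, which can then be looked up in a table shaped like TYPE_TABLE. guess_format and the two empty parser functions are placeholders written for this example, and returning None for unrecognised input is an assumption, not documented behaviour.

```python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'


def parse_atom(document):
    """Placeholder parser, defined only so there is something to return."""


def parse_rss(document):
    """Placeholder parser, defined only so there is something to return."""


# Same shape as TYPE_TABLE above: parser function -> MIME type.
TYPE_TABLE = {
    parse_atom: 'application/atom+xml',
    parse_rss: 'application/rss+xml',
}


def guess_format(document):
    """Return a parser function for the document, or None (assumed)."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return None  # not well-formed XML, e.g. typical HTML
    if root.tag == '{%s}feed' % ATOM_NS:
        return parse_atom
    if root.tag == 'rss':
        return parse_rss
    return None


parser = guess_format('<rss version="2.0"><channel/></rss>')
mime_type = TYPE_TABLE[parser]
```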