libearth.repository — Repositories

A repository abstracts the storage backend, e.g. the file system. Some platforms, such as iOS, offer no direct access to the file system; there, the repository concept lets you store data directly in Dropbox or Google Drive instead. In most cases, however, we simply use FileSystemRepository, even when the data are synchronized using Dropbox or rsync.

To keep the repository highly configurable, the module provides a way to look up and instantiate a repository from a URL. For example, the following URL loads a FileSystemRepository whose path is set to /home/dahlia/.earthreader/:

file:///home/dahlia/.earthreader/

For extensibility, every repository class has to implement the from_url() and to_url() methods and be registered as an entry point in the libearth.repositories group, e.g.:

[libearth.repositories]
file = libearth.repository:FileSystemRepository

Note that the entry point name (file in the above example) becomes the URL scheme used to look up the corresponding repository class (libearth.repository.FileSystemRepository in the above example).
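The scheme-based lookup described above can be pictured roughly as follows. This is a minimal sketch, not libearth's implementation: `DemoFileRepository`, the `REGISTRY` dict (standing in for the libearth.repositories entry point group), and `load_repository()` are hypothetical names used only for illustration.

```python
from urllib.parse import urlparse


class DemoFileRepository:
    """Hypothetical stand-in for a repository class (not libearth's)."""

    def __init__(self, path):
        self.path = path

    @classmethod
    def from_url(cls, url):
        # url is a parsed URL tuple; accept only the file:///... form.
        if url.scheme != 'file' or url.netloc:
            raise ValueError('unsupported url: ' + url.geturl())
        return cls(url.path)


# Assumption: a plain dict plays the role of the entry point group,
# mapping a scheme name to a repository class.
REGISTRY = {'file': DemoFileRepository}


def load_repository(url):
    """Pick a class by URL scheme and let it build itself from the URL."""
    parsed = urlparse(url)
    try:
        cls = REGISTRY[parsed.scheme]
    except KeyError:
        raise LookupError('no repository for scheme: ' + parsed.scheme)
    return cls.from_url(parsed)


repo = load_repository('file:///home/dahlia/.earthreader/')
```

In this sketch, adding a new backend only means adding one more scheme-to-class pair to the registry; the lookup function itself does not change.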

class libearth.repository.FileIterator(path, buffer_size)

Read a file through the iterator protocol, closing the file automatically when it ends.

Parameters:
  • path (str) – the path of the file
  • buffer_size (numbers.Integral) – the number of bytes produced at each step
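The idea of reading a file in fixed-size chunks and closing it once it is exhausted can be sketched with a generator. This is an illustration of the technique only; `iter_file()` is a hypothetical helper, not the FileIterator class itself.

```python
import os
import tempfile


def iter_file(path, buffer_size):
    """Yield the file's content in chunks of at most buffer_size bytes."""
    # The with block closes the file once the loop below finishes.
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            yield chunk


# Write a small file to a temporary directory, then read it back in chunks.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo.bin')
with open(demo_path, 'wb') as f:
    f.write(b'abcdefghij')

chunks = list(iter_file(demo_path, 4))
```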
exception libearth.repository.FileNotFoundError

Raised when a given path does not exist.

class libearth.repository.FileSystemRepository(path, mkdir=True, atomic=False)

Builtin implementation of Repository interface which uses the ordinary file system.

Parameters:
  • path (str) – the directory path to store keys
  • mkdir (bool) – create the directory if it doesn’t exist yet. True by default
  • atomic – make the update invisible until it’s complete. False by default
Raises:

path = None

(str) The path of the directory to read and write data files. It should be readable and writable.
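The source does not say how the atomic option is implemented. One common way to make an update invisible until it is complete is to write to a temporary file and then rename it over the target; the sketch below shows that technique under this assumption. `atomic_write()` is a hypothetical helper.

```python
import os
import tempfile


def atomic_write(path, chunks):
    """Write chunks to path so readers never see a half-written file."""
    directory = os.path.dirname(path) or '.'
    # Create the temporary file in the same directory so that the final
    # rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as f:
            for chunk in chunks:
                f.write(chunk)
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        os.unlink(tmp_path)  # discard the partial temporary file
        raise


target_dir = tempfile.mkdtemp()
target = os.path.join(target_dir, 'feed.xml')
atomic_write(target, [b'<feed>', b'</feed>'])
with open(target, 'rb') as f:
    written = f.read()
```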

exception libearth.repository.NotADirectoryError

Raised when a given path is not a directory.

class libearth.repository.Repository

Repository interface agnostic to its underlying storage implementation. Stage objects can deal with documents to be stored using the interface.

All content in a repository is accessible through keys. Keys abstract the “filenames” of “file systems”, so they share the same basic concepts as filenames. Keys are hierarchical, like file paths, so each key consists of a sequence of strings, e.g. ['dir', 'subdir', 'key']. You can also list() all subkeys under a parent key, e.g.:

repository.list(['dir', 'subdir'])
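To make the analogy between keys and file paths concrete, the following sketch maps a key onto a directory tree and lists the names under a parent key. It assumes a file-system-backed store; `key_to_path()` is a hypothetical helper, not part of libearth.

```python
import os
import tempfile


def key_to_path(root, key):
    """Join a hierarchical key onto a root directory (hypothetical helper)."""
    return os.path.join(root, *key)


root = tempfile.mkdtemp()
os.makedirs(key_to_path(root, ['dir', 'subdir']))
for name in ('a', 'b'):
    with open(key_to_path(root, ['dir', 'subdir', name]), 'wb') as f:
        f.write(b'content')

# Listing a key yields plain names, not full key lists.
subkeys = set(os.listdir(key_to_path(root, ['dir', 'subdir'])))
```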
exists(key)

Return whether the key exists or not. It returns False if it doesn’t exist instead of raising RepositoryKeyError.

Parameters:key (collections.Sequence) – the key to find whether it exists
Returns:True if the given key exists, False otherwise
Return type:bool

Note

Every subclass of Repository has to override the exists() method to implement details.

classmethod from_url(url)

Create a new instance of the repository from the given url. It’s used for configuring the repository in plain text e.g. *.ini.

Note

Every subclass of Repository has to override the from_url() static/class method to implement details.

Parameters:url (urllib.parse.ParseResult) – the parsed url tuple
Returns:a new repository instance
Return type:Repository
Raises ValueError:
 when the given url is invalid
list(key)

List all subkeys in the key.

Parameters:key (collections.Sequence) – the incomplete key that might have subkeys
Returns:the set of subkeys (set of strings, not set of string lists)
Return type:collections.Set
Raises RepositoryKeyError:
 the key cannot be found in the repository, or it’s not a directory

Note

Every subclass of Repository has to override the list() method to implement details.

read(key)

Read the content from the key.

Parameters:key (collections.Sequence) – the key which stores the content to read
Returns:byte string chunks
Return type:collections.Iterable
Raises RepositoryKeyError:
 the key cannot be found in the repository, or it’s not a file

Note

Every subclass of Repository has to override the read() method to implement details.

to_url(scheme)

Generate a URL that from_url() can accept. It’s used for configuring the repository in plain text, e.g. *.ini. The URL scheme is determined by the caller and passed in as the argument.

Note

Every subclass of Repository has to override the to_url() method to implement details.

Parameters:scheme – the url scheme chosen by the caller
Returns:a url that from_url() can accept
Return type:str
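The contract between to_url() and from_url() is a round trip: whatever to_url() produces, from_url() should be able to turn back into an equivalent repository. A minimal sketch, assuming a repository identified only by a directory path; `DemoRepository` and its URL layout are hypothetical, not libearth's own.

```python
from urllib.parse import urlparse


class DemoRepository:
    """Hypothetical repository identified only by a directory path."""

    def __init__(self, path):
        self.path = path

    @classmethod
    def from_url(cls, url):
        # Accepts the parsed URL tuple; only the path component is used.
        return cls(url.path)

    def to_url(self, scheme):
        # The caller supplies the scheme; the repository supplies the rest.
        return '{0}://{1}'.format(scheme, self.path)


original = DemoRepository('/home/dahlia/.earthreader/')
url = original.to_url('file')
restored = DemoRepository.from_url(urlparse(url))
```

Passing the scheme in from outside keeps the class independent of the name it happens to be registered under.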
write(key, iterable)

Write the iterable into the key.

Parameters:
  • key (collections.Sequence) – the key under which to store the iterable
  • iterable (collections.Iterable) – an iterable object that yields chunks of the whole content. Every chunk has to be a byte string

Note

Every subclass of Repository has to override the write() method to implement details.
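To show how the four storage methods fit together, here is a minimal dict-backed sketch. `MemoryRepository` is hypothetical and deliberately does not subclass libearth's Repository; it omits from_url() and to_url() and raises a plain KeyError where the documentation above specifies RepositoryKeyError.

```python
class MemoryRepository:
    """Hypothetical dict-backed store mirroring exists/list/read/write."""

    def __init__(self):
        self.files = {}  # maps tuple(key) -> bytes

    def exists(self, key):
        key = tuple(key)
        # A key exists if it is a file or a prefix ("directory") of one.
        return any(k[:len(key)] == key for k in self.files)

    def list(self, key):
        key = tuple(key)
        # Collect the next key component of every file stored below key.
        subkeys = {k[len(key)] for k in self.files
                   if len(k) > len(key) and k[:len(key)] == key}
        if not subkeys:
            raise KeyError(key)  # stand-in for RepositoryKeyError
        return subkeys

    def read(self, key):
        # Raises KeyError for a missing key; yields a single chunk.
        return iter([self.files[tuple(key)]])

    def write(self, key, iterable):
        # Every chunk has to be a byte string.
        self.files[tuple(key)] = b''.join(iterable)


mem_repo = MemoryRepository()
mem_repo.write(['dir', 'subdir', 'key'], [b'hello ', b'world'])
content = b''.join(mem_repo.read(['dir', 'subdir', 'key']))
```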

exception libearth.repository.RepositoryKeyError(key, *args, **kwargs)

Exception raised when the requested key cannot be found in the repository.

key = None

(collections.Sequence) The requested key.

libearth.repository.from_url(url)

Load the repository instance from the given configuration url.

Note

If setuptools is not installed, only the file:// scheme and FileSystemRepository are supported.

Parameters:url (str, urllib.parse.ParseResult) – a repository configuration url
Returns:the loaded repository instance
Return type:Repository
Raises:
  • LookupError – when the corresponding repository type to the given url scheme cannot be found
  • ValueError – when the given url is invalid