Utilities#

from cached_path import *

Functions#

cached_path.set_cache_dir(cache_dir)[source]#

Set the global default cache directory.

Return type

None

cached_path.get_cache_dir()[source]#

Get the global default cache directory.

Return type

Path

cached_path.add_scheme_client(client)[source]#

Add a new SchemeClient.

This can be used to extend cached_path.cached_path() to handle custom schemes, or handle existing schemes differently.

Return type

None

cached_path.resource_to_filename(resource, etag=None)[source]#

Convert a resource into a hashed filename in a repeatable way. If etag is specified, append its hash to the resource's hash, delimited by a period.

This is essentially the inverse of filename_to_url().

Return type

str
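The naming scheme described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the function name sketch_resource_to_filename is hypothetical, and the choice of SHA-256 is an assumption, since the hash function is not stated here.

```python
import hashlib
from typing import Optional


def sketch_resource_to_filename(resource: str, etag: Optional[str] = None) -> str:
    """Hypothetical stand-in: hash the resource, then append '.' + hash(etag)."""
    # SHA-256 is an assumption made for this sketch.
    filename = hashlib.sha256(resource.encode("utf-8")).hexdigest()
    if etag:
        # The ETag hash is appended after a period, as the description states.
        filename += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return filename


url = "https://example.com/model.tar.gz"  # placeholder URL
name_without_etag = sketch_resource_to_filename(url)
name_with_etag = sketch_resource_to_filename(url, etag="abc123")
```

Because a hash cannot be reversed, recovering the URL from a filename has to rely on stored metadata, which is consistent with filename_to_url() raising FileNotFoundError when that metadata is missing.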

cached_path.filename_to_url(filename, cache_dir=None)[source]#

Return the URL and etag (which may be None) stored for filename. Raises FileNotFoundError if filename or its stored metadata do not exist.

This is essentially the inverse of resource_to_filename().

Return type

Tuple[str, Optional[str]]

cached_path.find_latest_cached(url, cache_dir=None)[source]#

Get the path to the latest cached version of a given resource.

Return type

Optional[Path]

cached_path.check_tarfile(tar_file)[source]#

Tar files can contain files outside of the extraction directory, or symlinks that point outside the extraction directory. We also don’t want any block devices, FIFOs, or other unusual file types extracted. This checks for those issues and raises an exception if there is a problem.
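The kinds of checks described above might look like the following sketch, written against the standard tarfile module. The names sketch_check_tarfile and make_tar are hypothetical, and raising ValueError is an assumption, since the exception type is not specified here.

```python
import io
import os
import tarfile
import tempfile


def sketch_check_tarfile(tar: tarfile.TarFile, dest: str) -> None:
    """Hypothetical check: raise ValueError if any member looks unsafe."""
    base = os.path.realpath(dest)

    def is_inside(path: str) -> bool:
        return os.path.commonpath([base, os.path.realpath(path)]) == base

    for member in tar.getmembers():
        target = os.path.join(base, member.name)
        if not is_inside(target):
            raise ValueError(f"{member.name!r} would extract outside {dest!r}")
        if member.issym():
            # Symlink targets are resolved relative to the link's own directory.
            link_target = os.path.join(os.path.dirname(target), member.linkname)
        elif member.islnk():
            # Hard link targets are resolved relative to the archive root.
            link_target = os.path.join(base, member.linkname)
        else:
            link_target = None
        if link_target is not None:
            if not is_inside(link_target):
                raise ValueError(f"{member.name!r} links outside {dest!r}")
        elif not (member.isfile() or member.isdir()):
            raise ValueError(f"{member.name!r} is not a regular file or directory")


def make_tar(name: str, member_type: bytes = tarfile.REGTYPE, linkname: str = "") -> tarfile.TarFile:
    """Build a one-member, in-memory archive for demonstration."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.type = member_type
        info.linkname = linkname
        tar.addfile(info)  # empty member, no data needed
    buffer.seek(0)
    return tarfile.open(fileobj=buffer, mode="r")


def is_rejected(tar: tarfile.TarFile) -> bool:
    try:
        sketch_check_tarfile(tar, dest)
    except ValueError:
        return True
    return False


# Placeholder extraction directory; it does not need to exist for the check.
dest = os.path.join(tempfile.gettempdir(), "sketch_extract_dir")
safe_rejected = is_rejected(make_tar("data/file.txt"))
traversal_rejected = is_rejected(make_tar("../evil.txt"))
symlink_rejected = is_rejected(make_tar("link", tarfile.SYMTYPE, "../../outside"))
fifo_rejected = is_rejected(make_tar("pipe", tarfile.FIFOTYPE))
```

Only metadata is inspected, so nothing is written to disk before the archive has been validated.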

cached_path.get_download_progress(quiet=False)[source]#

Return type

Progress

Classes#

class cached_path.SchemeClient(resource)[source]#

A client used for caching remote resources corresponding to URLs with a particular scheme.

Subclasses must define the scheme class variable and implement the abstract methods get_etag(), get_resource(), and get_size().

Important

Take care when implementing subclasses to raise the right error types from get_etag() and get_resource().

abstract get_etag()[source]#

Get the ETag or an equivalent version identifier associated with the resource.

Returns

The ETag as a str or None if there is no ETag associated with the resource.

Return type

Optional[str]

Raises
  • FileNotFoundError – If the resource doesn’t exist.

  • Recoverable error – Any error type defined in SchemeClient.recoverable_errors will be treated as a recoverable error. This means that when one of these is caught by cached_path(), it will look for cached versions of the given resource and return the latest version if there are any. Otherwise the error is propagated.

  • Other errors – Any other error type can be raised. These errors will be treated as non-recoverable and will be propagated immediately by cached_path().

abstract get_resource(temp_file)[source]#

Download the resource to the given temporary file.

Raises
  • FileNotFoundError – If the resource doesn’t exist.

  • Other errors – Any other error type can be raised. These errors will be treated as non-recoverable and will be propagated immediately by cached_path().

Return type

None

abstract get_size()[source]#

Get the size of the resource in bytes (if known).

Returns

The size (in bytes).

Return type

Optional[int]

Raises
  • FileNotFoundError – If the resource doesn’t exist.

  • Recoverable error – Any error type defined in SchemeClient.recoverable_errors will be treated as a recoverable error. This means that when one of these is caught by cached_path(), the size will be ignored.

  • Other errors – Any other error type can be raised. These errors will be treated as non-recoverable and will be propagated immediately by cached_path().

recoverable_errors: ClassVar[Tuple[Type[BaseException], ...]] = (<class 'requests.exceptions.ConnectionError'>, <class 'requests.exceptions.Timeout'>)#

Subclasses can override this to define error types that will be treated as recoverable.

If cached_path() catches one of these errors while calling get_etag(), it will log a warning and return the latest cached version if there is one; otherwise it will propagate the error.
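The fallback behaviour described above can be sketched as follows. This is an illustration of the control flow only, not the library's code: sketch_resolve is a hypothetical function, and the built-in ConnectionError and TimeoutError stand in for the requests exception types listed as the default.

```python
import logging
from typing import Callable, Optional, Tuple, Type

logger = logging.getLogger("sketch")

# Stand-ins for SchemeClient.recoverable_errors (assumption for this sketch).
RECOVERABLE_ERRORS: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError)


def sketch_resolve(get_etag: Callable[[], Optional[str]], latest_cached: Optional[str]) -> str:
    """Return a cached path on recoverable errors, otherwise proceed or re-raise."""
    try:
        etag = get_etag()
    except RECOVERABLE_ERRORS as err:
        if latest_cached is not None:
            logger.warning("Falling back to cached version after %r", err)
            return latest_cached
        raise  # nothing cached, so the error propagates
    # Any other exception type is not caught here and propagates immediately.
    return f"fresh-download(etag={etag})"


def offline_get_etag() -> Optional[str]:
    raise ConnectionError("network unreachable")


def forbidden_get_etag() -> Optional[str]:
    raise PermissionError("access denied")


fallback = sketch_resolve(offline_get_etag, latest_cached="/cache/abc.123")

try:
    sketch_resolve(offline_get_etag, latest_cached=None)
    propagated_without_cache = False
except ConnectionError:
    propagated_without_cache = True

try:
    sketch_resolve(forbidden_get_etag, latest_cached="/cache/abc.123")
    propagated_non_recoverable = False
except PermissionError:
    propagated_non_recoverable = True
```

The third case shows why the error type matters: a non-recoverable error propagates even though a cached copy exists.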

scheme: ClassVar[Union[str, Tuple[str, ...]]] = ()#

The scheme or schemes that the client will be used for (e.g. “http”).
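To illustrate the shape of a subclass, here is a sketch of a client for a made-up "mem" scheme backed by an in-memory dictionary. So that it runs without the library installed, it defines SchemeClientStandIn, a local stand-in that mirrors only the interface documented above; MemClient, FAKE_STORE, and the "mem" scheme are all hypothetical.

```python
import hashlib
import io
from abc import ABC, abstractmethod
from typing import IO, ClassVar, Dict, Optional, Tuple, Type, Union


class SchemeClientStandIn(ABC):
    """Local stand-in mirroring the documented SchemeClient interface."""

    recoverable_errors: ClassVar[Tuple[Type[BaseException], ...]] = ()
    scheme: ClassVar[Union[str, Tuple[str, ...]]] = ()

    def __init__(self, resource: str) -> None:
        self.resource = resource

    @abstractmethod
    def get_etag(self) -> Optional[str]: ...

    @abstractmethod
    def get_size(self) -> Optional[int]: ...

    @abstractmethod
    def get_resource(self, temp_file: IO) -> None: ...


# Hypothetical "remote" store keyed by URL.
FAKE_STORE: Dict[str, bytes] = {"mem://greeting": b"hello"}


class MemClient(SchemeClientStandIn):
    scheme = "mem"
    recoverable_errors = (TimeoutError,)

    def _data(self) -> bytes:
        try:
            return FAKE_STORE[self.resource]
        except KeyError:
            # A missing resource is reported with FileNotFoundError.
            raise FileNotFoundError(self.resource) from None

    def get_etag(self) -> Optional[str]:
        # A content hash serves as the version identifier in this sketch.
        return hashlib.sha256(self._data()).hexdigest()

    def get_size(self) -> Optional[int]:
        return len(self._data())

    def get_resource(self, temp_file: IO) -> None:
        temp_file.write(self._data())


client = MemClient("mem://greeting")
buffer = io.BytesIO()
client.get_resource(buffer)
downloaded = buffer.getvalue()

try:
    MemClient("mem://missing").get_etag()
    missing_raises = False
except FileNotFoundError:
    missing_raises = True
```

With the library installed, such a class would presumably derive from cached_path.SchemeClient instead of the stand-in and be registered through add_scheme_client(); check that function's signature before relying on this.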