Utilities

from cached_path import *

Functions

cached_path.set_cache_dir(cache_dir: Union[str, os.PathLike]) → None

Set the global default cache directory.

cached_path.get_cache_dir() → Union[str, os.PathLike]

Get the global default cache directory.

cached_path.file_friendly_logging(on: bool = True) → None

Turn on (or off) file-friendly logging globally.

You can also control this through the environment variable FILE_FRIENDLY_LOGGING.

cached_path.add_scheme_client(client: Type[cached_path.schemes.scheme_client.SchemeClient]) → None

Add a new SchemeClient.

This can be used to extend cached_path.cached_path() to handle custom schemes, or handle existing schemes differently.

cached_path.resource_to_filename(resource: str, etag: Optional[str] = None) → str

Convert a resource into a hashed filename in a repeatable way. If etag is specified, its hash is appended to the resource’s hash, delimited by a period.

This is essentially the inverse of filename_to_url().
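The naming scheme described above can be illustrated with a minimal sketch. The function name hashed_filename and the choice of SHA-256 are assumptions made for illustration; this is not necessarily how the library computes its filenames.

```python
import hashlib
from typing import Optional


def hashed_filename(resource: str, etag: Optional[str] = None) -> str:
    """Hypothetical stand-in for resource_to_filename(); SHA-256 is an assumption."""
    # Hash the resource string so the name is filesystem-safe and repeatable.
    filename = hashlib.sha256(resource.encode("utf-8")).hexdigest()
    if etag is not None:
        # Append the hash of the ETag, delimited by a period.
        filename += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return filename


plain = hashed_filename("https://example.com/model.tar.gz")
versioned = hashed_filename("https://example.com/model.tar.gz", etag="abc123")
```

Because a hash cannot be reversed, the inverse lookup has to rely on metadata stored alongside the cached file, which is consistent with filename_to_url() raising FileNotFoundError when that metadata is missing.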

cached_path.filename_to_url(filename: str, cache_dir: Optional[Union[str, os.PathLike]] = None) → Tuple[str, Optional[str]]

Return the URL and etag (which may be None) stored for filename. Raises FileNotFoundError if filename or its stored metadata do not exist.

This is essentially the inverse of resource_to_filename().

cached_path.find_latest_cached(url: str, cache_dir: Optional[Union[str, os.PathLike]] = None) → Optional[str]

Get the path to the latest cached version of a given resource.

cached_path.check_tarfile(tar_file: tarfile.TarFile)

Tar files can contain files outside of the extraction directory, or symlinks that point outside the extraction directory. We also don’t want any block devices, FIFOs, or other unusual file types extracted. This function checks for those issues and raises an exception if it finds one.
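The kinds of checks described above might look roughly like the following sketch, written with the standard tarfile module only. The names check_members and make_tar, the dest parameter, and the use of ValueError are all assumptions for illustration; the library's own implementation and exception type may differ.

```python
import io
import os
import tarfile


def check_members(tar: tarfile.TarFile, dest: str = ".") -> None:
    """Hypothetical checker in the spirit of check_tarfile(); raises ValueError."""
    base = os.path.realpath(dest)

    def inside(path: str) -> bool:
        # True if `path` resolves to a location at or below the extraction directory.
        return os.path.commonpath([base, os.path.realpath(path)]) == base

    for member in tar.getmembers():
        target = os.path.join(base, member.name)
        if not inside(target):
            raise ValueError(f"{member.name!r} would be extracted outside {dest!r}")
        if member.issym() or member.islnk():
            # Symlink targets are relative to the link's own directory;
            # hard link targets are relative to the archive root.
            link_base = os.path.dirname(target) if member.issym() else base
            if not inside(os.path.join(link_base, member.linkname)):
                raise ValueError(f"link {member.name!r} points outside {dest!r}")
        elif not (member.isfile() or member.isdir()):
            # Block and character devices, FIFOs, and other special types.
            raise ValueError(f"{member.name!r} has an unsupported file type")


def make_tar(member: tarfile.TarInfo) -> tarfile.TarFile:
    """Build a one-member archive in memory (helper for the demo only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        tar.addfile(member)
    buffer.seek(0)
    return tarfile.open(fileobj=buffer, mode="r")


# A well-behaved member passes silently.
check_members(make_tar(tarfile.TarInfo("ok/data.txt")), "extract_here")

# A member that escapes the extraction directory is rejected.
try:
    check_members(make_tar(tarfile.TarInfo("../evil.txt")), "extract_here")
    rejected = False
except ValueError:
    rejected = True
```

Running the check before extraction, rather than during it, means nothing is written to disk if any single member is unsafe.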

Classes

class cached_path.SchemeClient(resource: str)

A client used for caching remote resources corresponding to URLs with a particular scheme.

Subclasses must define the scheme class variable and implement get_etag() and get_resource().

Important

Take care when implementing subclasses to raise the right error types from get_etag() and get_resource().

get_etag() → Optional[str]

Get the ETag or an equivalent version identifier associated with the resource.

Returns

The ETag as a str or None if there is no ETag associated with the resource.

Return type

Optional[str]

Raises
  • FileNotFoundError – If the resource doesn’t exist.

  • Recoverable error – Any error type defined in SchemeClient.recoverable_errors will be treated as a recoverable error. This means that when one of these is caught by cached_path(), it will look for cached versions of the given resource and return the latest version if there are any. Otherwise the error is propagated.

  • Other errors – Any other error type can be raised. These errors will be treated as non-recoverable and will be propagated immediately by cached_path().

get_resource(temp_file: IO) → None

Download the resource to the given temporary file.

Raises
  • FileNotFoundError – If the resource doesn’t exist.

  • Other errors – Any other error type can be raised. These errors will be treated as non-recoverable and will be propagated immediately by cached_path().

recoverable_errors: ClassVar[Tuple[BaseException, ...]] = (<class 'requests.exceptions.ConnectionError'>, <class 'requests.exceptions.Timeout'>)

Subclasses can override this to define error types that will be treated as recoverable.

If cached_path() catches one of these errors while calling get_etag(), it will log a warning and return the latest cached version if there is one; otherwise it will propagate the error.
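The fallback behaviour described here can be sketched as a small control-flow example. The helper etag_or_cached and its callable parameters are hypothetical and exist only to show the shape of the logic; the built-in ConnectionError stands in for the requests exceptions listed above. This is not the library's actual code.

```python
import logging
from typing import Callable, Optional, Tuple, Type

logger = logging.getLogger(__name__)


def etag_or_cached(
    get_etag: Callable[[], Optional[str]],
    find_cached: Callable[[], Optional[str]],
    recoverable_errors: Tuple[Type[BaseException], ...],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (etag, None) normally, or (None, cached_path) on the fallback path."""
    try:
        return get_etag(), None
    except recoverable_errors as err:
        cached = find_cached()
        if cached is None:
            # Nothing cached to fall back on, so the error propagates.
            raise
        logger.warning("Could not reach resource (%s); using cached copy %s", err, cached)
        return None, cached


def offline() -> Optional[str]:
    # Simulates a client whose get_etag() fails with a recoverable error.
    raise ConnectionError("network unreachable")


online_result = etag_or_cached(lambda: "v1", lambda: None, (ConnectionError,))
fallback_result = etag_or_cached(offline, lambda: "/cache/abc", (ConnectionError,))
```

Errors that are not listed in recoverable_errors, including FileNotFoundError, are never caught by the except clause in this sketch and therefore propagate immediately.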

scheme: ClassVar[Union[str, Tuple[str, ...]]] = ()

The scheme or schemes that the client will be used for (e.g. “http”).
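As a rough illustration of the subclass contract described in this section, the sketch below implements a client for a made-up "mem" scheme. To stay runnable without the library installed, it uses a small stand-in base class (SchemeClientStandIn) that mirrors only the documented constructor and scheme attribute; the in-memory store, the MemClient name, and the resource attribute are likewise assumptions. In real use the subclass would derive from cached_path.SchemeClient and be registered with add_scheme_client().

```python
import io
from typing import IO, ClassVar, Dict, Optional, Tuple, Union


class SchemeClientStandIn:
    """Stand-in mirroring the documented interface; not the library's class."""

    scheme: ClassVar[Union[str, Tuple[str, ...]]] = ()

    def __init__(self, resource: str) -> None:
        self.resource = resource


# Hypothetical in-memory "remote" store: resource -> (etag, content).
FAKE_STORE: Dict[str, Tuple[str, bytes]] = {"mem://greeting": ("v1", b"hello")}


class MemClient(SchemeClientStandIn):
    # The scheme (or tuple of schemes) this client handles.
    scheme = "mem"

    def get_etag(self) -> Optional[str]:
        if self.resource not in FAKE_STORE:
            # Missing resources must be reported as FileNotFoundError.
            raise FileNotFoundError(self.resource)
        return FAKE_STORE[self.resource][0]

    def get_resource(self, temp_file: IO) -> None:
        if self.resource not in FAKE_STORE:
            raise FileNotFoundError(self.resource)
        # Write the resource's bytes into the temporary file provided by the caller.
        temp_file.write(FAKE_STORE[self.resource][1])


buffer = io.BytesIO()
client = MemClient("mem://greeting")
etag = client.get_etag()
client.get_resource(buffer)
```

Raising FileNotFoundError for missing resources, rather than a scheme-specific error, is what lets the caller distinguish "does not exist" from recoverable and non-recoverable failures.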