Utilities¶
from cached_path import *
Functions¶
- cached_path.set_cache_dir(cache_dir: Union[str, os.PathLike]) → None ¶
Set the global default cache directory.
- cached_path.get_cache_dir() → Union[str, os.PathLike] ¶
Get the global default cache directory.
- cached_path.file_friendly_logging(on: bool = True) → None ¶
Turn on (or off) file-friendly logging globally.
- cached_path.add_scheme_client(client: Type[cached_path.schemes.scheme_client.SchemeClient]) → None ¶
Add a new SchemeClient.
This can be used to extend cached_path.cached_path() to handle custom schemes, or to handle existing schemes differently.
- cached_path.resource_to_filename(resource: str, etag: Optional[str] = None) → str ¶
Convert a resource into a hashed filename in a repeatable way. If etag is specified, append its hash to the resource's hash, delimited by a period.
This is essentially the inverse of filename_to_url().
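The naming scheme described above can be sketched with the standard library alone. This is an illustration rather than the library's implementation: the choice of SHA-256 and the name sketch_resource_to_filename are assumptions made for the example.

```python
import hashlib
from typing import Optional


def sketch_resource_to_filename(resource: str, etag: Optional[str] = None) -> str:
    """Hypothetical stand-in: hash the resource, then append the ETag's hash after a period."""
    # Assumption: SHA-256 is used here purely for illustration.
    filename = hashlib.sha256(resource.encode("utf-8")).hexdigest()
    if etag:
        filename += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return filename


# The same inputs always produce the same name, so the mapping is repeatable.
name_without_etag = sketch_resource_to_filename("https://example.com/model.tar.gz")
name_with_etag = sketch_resource_to_filename(
    "https://example.com/model.tar.gz", etag='"abc123"'
)
```

Because a hash cannot be reversed, going from a filename back to a URL has to rely on metadata stored for the cached file, which is why the filename_to_url() entry mentions stored metadata.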
- cached_path.filename_to_url(filename: str, cache_dir: Optional[Union[str, os.PathLike]] = None) → Tuple[str, Optional[str]] ¶
Return the URL and etag (which may be None) stored for filename. Raises FileNotFoundError if filename or its stored metadata do not exist.
This is essentially the inverse of resource_to_filename().
- cached_path.find_latest_cached(url: str, cache_dir: Optional[Union[str, os.PathLike]] = None) → Optional[str] ¶
Get the path to the latest cached version of a given resource.
- cached_path.check_tarfile(tar_file: tarfile.TarFile)¶
Tar files can contain files outside of the extraction directory, or symlinks that point outside the extraction directory. We also don’t want any block devices, FIFOs, or other unusual file types extracted. This function checks for those issues and raises an exception if there is a problem.
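The checks described above might look roughly like the following sketch, built only on the standard tarfile module. The function name sketch_check_tarfile, the extract_dir parameter, and the use of ValueError are assumptions for illustration; the library's own check may differ in detail.

```python
import io
import os
import tarfile


def sketch_check_tarfile(tar_file: tarfile.TarFile, extract_dir: str = ".") -> None:
    """Hypothetical checker: raise ValueError if any member looks unsafe to extract."""
    base = os.path.realpath(extract_dir)

    def is_inside(path: str) -> bool:
        return os.path.commonpath([base, os.path.realpath(path)]) == base

    for member in tar_file.getmembers():
        target = os.path.join(base, member.name)
        if not is_inside(target):
            raise ValueError(f"{member.name} would extract outside {extract_dir}")
        if member.issym() or member.islnk():
            # Symlink targets resolve relative to the link's own directory;
            # hard-link targets are named relative to the archive root.
            link_base = os.path.dirname(target) if member.issym() else base
            if not is_inside(os.path.join(link_base, member.linkname)):
                raise ValueError(f"{member.name} links outside {extract_dir}")
        elif not (member.isfile() or member.isdir()):
            raise ValueError(f"{member.name} is not a regular file or directory")


def build_tar(member_name: str, member_type: bytes = tarfile.REGTYPE) -> tarfile.TarFile:
    """Build a one-member, in-memory archive for demonstration."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=member_name)
        info.type = member_type
        archive.addfile(info)
    buffer.seek(0)
    return tarfile.open(fileobj=buffer, mode="r")


def is_rejected(archive: tarfile.TarFile) -> bool:
    try:
        sketch_check_tarfile(archive)
    except ValueError:
        return True
    return False


safe_rejected = is_rejected(build_tar("data/ok.txt"))
traversal_rejected = is_rejected(build_tar("../evil.txt"))
fifo_rejected = is_rejected(build_tar("pipe", tarfile.FIFOTYPE))
```

Running the check before extraction, rather than filtering during it, means a problematic archive is rejected as a whole instead of being partially unpacked.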
Classes¶
- class cached_path.SchemeClient(resource: str)¶
A client used for caching remote resources corresponding to URLs with a particular scheme.
Subclasses must define the scheme class variable and implement get_etag() and get_resource().
Important
Take care when implementing subclasses to raise the right error types from get_etag() and get_resource().
- connection_error_types: ClassVar[Tuple[BaseException, ...]] = (<class 'requests.exceptions.ConnectionError'>,)¶
Subclasses can override this to define error types that will be treated as retriable connection errors.
- get_etag() → Optional[str] ¶
Get the ETag or an equivalent version identifier associated with the resource.
- Returns
The ETag as a str, or None if there is no ETag associated with the resource.
- Return type
Optional[str]
- Raises
FileNotFoundError – If the resource doesn’t exist.
Connection error – Any error type defined in SchemeClient.connection_error_types will be treated as a retriable connection error.
Other errors – Any other error type can be raised, in which case cached_path() will log the error and move on to try to fetch the resource without the ETag.
- get_resource(temp_file: IO) → None ¶
Download the resource to the given temporary file.
- Raises
FileNotFoundError – If the resource doesn’t exist.
Connection error – Any error type defined in SchemeClient.connection_error_types will be treated as a retriable connection error.
Other errors – Any other error type can be raised, in which case cached_path() will fail and propagate the error.
- scheme: ClassVar[Union[str, Tuple[str, ...]]] = ()¶
The scheme or schemes that the client will be used for (e.g. “http”).
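As an illustration of the subclassing contract described above, here is a minimal sketch of a client for a made-up "localcopy" scheme that reads from the local filesystem. To stay runnable without the library installed, it subclasses SketchSchemeClient, a stand-in written for this example that mirrors only the documented interface; the resource attribute, the LocalCopyClient class, and the ETag format are all assumptions.

```python
import os
import shutil
import tempfile
from typing import IO, ClassVar, Optional, Tuple, Type, Union


class SketchSchemeClient:
    """Stand-in for the documented interface (assumption, not the library class)."""

    scheme: ClassVar[Union[str, Tuple[str, ...]]] = ()
    connection_error_types: ClassVar[Tuple[Type[BaseException], ...]] = (ConnectionError,)

    def __init__(self, resource: str) -> None:
        self.resource = resource  # attribute name is an assumption

    def get_etag(self) -> Optional[str]:
        raise NotImplementedError

    def get_resource(self, temp_file: IO) -> None:
        raise NotImplementedError


class LocalCopyClient(SketchSchemeClient):
    """Hypothetical client for "localcopy://<path>" resources."""

    scheme = "localcopy"

    def _path(self) -> str:
        return self.resource[len("localcopy://"):]

    def get_etag(self) -> Optional[str]:
        path = self._path()
        if not os.path.isfile(path):
            # A missing resource must surface as FileNotFoundError.
            raise FileNotFoundError(path)
        stat = os.stat(path)
        # Modification time plus size serves as an ETag-like version identifier.
        return f"{stat.st_mtime_ns}-{stat.st_size}"

    def get_resource(self, temp_file: IO) -> None:
        path = self._path()
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
        with open(path, "rb") as source:
            shutil.copyfileobj(source, temp_file)


with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = os.path.join(tmp_dir, "weights.bin")
    with open(source_path, "wb") as handle:
        handle.write(b"payload")

    client = LocalCopyClient("localcopy://" + source_path)
    etag = client.get_etag()
    with tempfile.TemporaryFile() as temp_file:
        client.get_resource(temp_file)
        temp_file.seek(0)
        downloaded = temp_file.read()

    missing = LocalCopyClient("localcopy://" + os.path.join(tmp_dir, "absent.bin"))
    try:
        missing.get_etag()
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

With the library itself, such a class would presumably subclass cached_path.SchemeClient instead of the stand-in and be registered by passing the class to cached_path.add_scheme_client().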