Cloud Storage

GcsUtility

class gwrappy.storage.GcsUtility(**kwargs)

Initializes an object for interacting with the Google Cloud Storage API.

By default, Application Default Credentials are used.
If the gcloud SDK isn’t installed, credential files have to be specified using the kwargs json_credentials_path and client_id.
Parameters:
  • max_retries (integer) – Argument specified with each API call to natively handle retryable errors.
  • chunksize (integer) – Chunk size used for uploads and downloads.
  • client_secret_path – File path for client secret JSON file. Only required if credentials are invalid or unavailable.
  • json_credentials_path – File path for automatically generated credentials.
  • client_id – Credentials are stored as a key-value pair per client_id to facilitate multiple clients using the same credentials file. For simplicity, using one’s email address is sufficient.
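A minimal sketch of constructing the client, assuming gwrappy is installed and that the keyword arguments listed above are passed directly to the constructor. The helper names build_gcs_kwargs and make_client are hypothetical, and the path and email values are placeholders.

```python
def build_gcs_kwargs(json_credentials_path=None, client_id=None,
                     client_secret_path=None, max_retries=None, chunksize=None):
    """Return only the keyword arguments that were actually supplied."""
    candidates = {
        "json_credentials_path": json_credentials_path,
        "client_id": client_id,
        "client_secret_path": client_secret_path,
        "max_retries": max_retries,
        "chunksize": chunksize,
    }
    return {key: value for key, value in candidates.items() if value is not None}


def make_client(**kwargs):
    """Create a GcsUtility; with no kwargs, Application Default Credentials apply."""
    # Imported lazily so build_gcs_kwargs can be used without gwrappy installed.
    from gwrappy.storage import GcsUtility
    return GcsUtility(**kwargs)


# Placeholder values for a machine without the gcloud SDK.
kwargs = build_gcs_kwargs(
    json_credentials_path="credentials.json",
    client_id="me@example.com",
)
# gcs = make_client(**kwargs)
```

Calling make_client() with no arguments corresponds to the default case, where Application Default Credentials are used.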
list_buckets(project_id, max_results=None, filter_exp=None)

Abstraction of buckets().list() method with inbuilt iteration functionality. [https://cloud.google.com/storage/docs/json_api/v1/buckets/list]

Parameters:
  • project_id (string) – Unique project identifier.
  • max_results (integer) – If None, all results are iterated over and returned.
  • filter_exp (function) – Function that filters entries if filter_exp evaluates to True.
Returns:

List of dictionary objects representing bucket resources.
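A short sketch of filtering buckets client-side with filter_exp. It assumes that entries for which filter_exp returns True are kept, and that gcs is an already constructed GcsUtility. The helper name bucket_names_in_location is hypothetical; the name and location keys come from the bucket resource in the JSON API.

```python
def bucket_names_in_location(gcs, project_id, location):
    """Return the names of buckets in project_id stored in the given location."""
    # Assumption: filter_exp keeps the entries for which it returns True.
    buckets = gcs.list_buckets(
        project_id,
        filter_exp=lambda bucket: bucket.get("location") == location,
    )
    return [bucket["name"] for bucket in buckets]
```

Because max_results is left as None, all matching buckets are iterated over and returned.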

list_objects(bucket_name, max_results=None, prefix=None, projection=None, filter_exp=None)

Abstraction of objects().list() method with inbuilt iteration functionality. [https://cloud.google.com/storage/docs/json_api/v1/objects/list]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • max_results (integer) – If None, all results are iterated over and returned.
  • prefix (string) – Pre-filter (on API call) results to objects whose names begin with this prefix.
  • projection – Set of properties to return.
  • filter_exp (function) – Function that filters entries if filter_exp evaluates to True.
Returns:

List of dictionary objects representing object resources.
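As an illustration, the sketch below sums the size of every object under a prefix. The helper name total_size_bytes is hypothetical and gcs is assumed to be a constructed GcsUtility. The JSON API reports the size property as a string, hence the conversion.

```python
def total_size_bytes(gcs, bucket_name, prefix=None):
    """Sum the 'size' property of all objects whose names begin with prefix."""
    # prefix is applied on the API call, so only matching objects are returned.
    objects = gcs.list_objects(bucket_name, prefix=prefix)
    # 'size' arrives as a string in the object resource; convert before summing.
    return sum(int(obj.get("size", 0)) for obj in objects)
```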

get_object(bucket_name, object_name, projection=None)

Abstraction of objects().get() method. [https://cloud.google.com/storage/docs/json_api/v1/objects/get]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
  • projection – Set of properties to return.
Returns:

Dictionary object representing object resource.
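A minimal sketch of reading a few properties from the returned resource. The helper name object_summary and the bucket and object names in the comments are hypothetical; gcs is assumed to be a constructed GcsUtility. How a list-form object_name is joined into a path is not stated above, so the second call is shown only as the documented alternative form.

```python
def object_summary(gcs, bucket_name, object_name):
    """Return the name, size and content type of a single object."""
    resource = gcs.get_object(bucket_name, object_name)
    # Missing properties come back as None rather than raising KeyError.
    return {key: resource.get(key) for key in ("name", "size", "contentType")}


# String form of the object name:
#   object_summary(gcs, "my-bucket", "logs/2016/app.log")
# List form denoting the path to the object:
#   object_summary(gcs, "my-bucket", ["logs", "2016", "app.log"])
```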

update_object(bucket_name, object_name, predefined_acl=None, projection=None, **object_resource)

Abstraction of objects().update() method. [https://cloud.google.com/storage/docs/json_api/v1/objects/update]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
  • predefined_acl – Apply a predefined set of access controls to this object.
  • projection – Set of properties to return.
  • object_resource – Supply optional properties [https://cloud.google.com/storage/docs/json_api/v1/objects/insert#request-body]
Returns:

Dictionary object representing object resource.
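A hedged sketch of passing an object resource property as a keyword argument. It assumes **object_resource accepts the property names of the JSON API object resource (contentType here) and that gcs is a constructed GcsUtility. The helper name set_content_type is hypothetical.

```python
def set_content_type(gcs, bucket_name, object_name, content_type):
    """Update an object's contentType and return the updated resource."""
    # Assumption: keyword arguments are forwarded as object resource properties.
    return gcs.update_object(bucket_name, object_name, contentType=content_type)
```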

delete_object(bucket_name, object_name)

Abstraction of objects().delete() method. [https://cloud.google.com/storage/docs/json_api/v1/objects/delete]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
Raises:

AssertionError if unsuccessful. Response should be empty string if successful.
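Since failure is signalled with AssertionError, a caller that prefers a boolean result could wrap the call as sketched below. The helper name try_delete is hypothetical and gcs is assumed to be a constructed GcsUtility.

```python
def try_delete(gcs, bucket_name, object_name):
    """Delete an object; return True on success, False if the delete failed."""
    try:
        gcs.delete_object(bucket_name, object_name)
    except AssertionError:
        # Raised by delete_object when the response is not as expected.
        return False
    return True
```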

download_object(bucket_name, object_name, write_path)

Downloads object in chunks.

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
  • write_path (string) – Local path to write object to.
Returns:

GcsResponse object.

Raises:

HttpError if non-retryable errors are encountered.
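A sketch of combining list_objects and download_object to fetch everything under a prefix into one local directory. The helper name download_prefix is hypothetical, gcs is assumed to be a constructed GcsUtility, and dest_dir is assumed to exist already. Objects that share a base name would overwrite each other in this simplified version.

```python
import os


def download_prefix(gcs, bucket_name, prefix, dest_dir):
    """Download every object under prefix into dest_dir; return the responses."""
    responses = []
    for obj in gcs.list_objects(bucket_name, prefix=prefix):
        name = obj["name"]
        if name.endswith("/"):
            # Skip placeholder objects that only represent a folder.
            continue
        write_path = os.path.join(dest_dir, os.path.basename(name))
        responses.append(gcs.download_object(bucket_name, name, write_path))
    return responses
```

Non-retryable errors still surface as HttpError from download_object; this sketch does not catch them.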

upload_object(bucket_name, object_name, read_path, predefined_acl=None, projection=None, **object_resource)

Uploads object in chunks.

Optional parameters and valid object resources are listed here [https://cloud.google.com/storage/docs/json_api/v1/objects/insert]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
  • read_path (string) – Local path of object to upload.
  • predefined_acl – Apply a predefined set of access controls to this object.
  • projection – Set of properties to return.
  • object_resource – Supply optional properties [https://cloud.google.com/storage/docs/json_api/v1/objects/insert#request-body]
Returns:

GcsResponse object.

Raises:

HttpError if non-retryable errors are encountered.
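A minimal sketch of uploading a local file under an optional prefix, using the string form of object_name. The helper name upload_file and its object_prefix parameter are hypothetical; gcs is assumed to be a constructed GcsUtility.

```python
import os


def upload_file(gcs, bucket_name, read_path, object_prefix=""):
    """Upload read_path to bucket_name, naming the object after the file."""
    filename = os.path.basename(read_path)
    prefix = object_prefix.strip("/")
    # Build "prefix/filename", or just "filename" when no prefix is given.
    object_name = prefix + "/" + filename if prefix else filename
    return gcs.upload_object(bucket_name, object_name, read_path)
```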

Misc Classes/Functions

class gwrappy.storage.utils.GcsResponse(description)

Wrapper for GCS upload and download responses, mainly for calculating/parsing job statistics into human readable formats for logging.

Parameters:
  • description – String descriptor for specific function of job.
load_resp(resp, is_download)

Loads json response from API.

Parameters:
  • resp (dictionary) – Response from API
  • is_download (boolean) – If True (download), time taken is calculated from the stop time; if False (upload), it is calculated from the ‘updated’ field in the response.
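The timing rule can be illustrated with the sketch below. It is not gwrappy's implementation: the function elapsed_seconds, its start and stop arguments, and the timestamp format are assumptions made for illustration. It assumes the ‘updated’ field is a UTC timestamp with fractional seconds such as 2020-01-01T00:00:05.500Z, and that start and stop are naive UTC datetimes.

```python
from datetime import datetime

# Assumed format of the 'updated' field in the API response.
UPDATED_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def elapsed_seconds(start, resp, is_download, stop=None):
    """Return the seconds elapsed since start for an upload or a download."""
    if is_download:
        # Downloads: measured up to the locally recorded stop time.
        end = stop
    else:
        # Uploads: measured up to the 'updated' timestamp in the response.
        end = datetime.strptime(resp["updated"], UPDATED_FORMAT)
    return (end - start).total_seconds()
```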