tsadar.utils.misc

Functions

download_file(fname, artifact_uri, ...)

Downloads a file from an MLflow artifact URI to a specified local destination.

export_run(run_id[, prefix, step])

Exports an MLflow run and uploads its artifacts to an S3 bucket.

get_cfg(artifact_uri, temp_path)

Downloads configuration files from the specified artifact URI to a temporary path.

log_mlflow(cfg[, which, step])

Logs the parameters from the input deck in the parameters section of MLflow.

update(base_dict, new_dict)

Combines two dictionaries, overwriting common fields.

upload_dir_to_s3(local_directory, bucket, ...)

Uploads the contents of a local directory to an S3 bucket, preserving the directory structure.

tsadar.utils.misc.log_mlflow(cfg, which='params', step=0)

Logs the parameters from the input deck in the parameters section of MLflow.

Parameters:

cfg – input dictionary (the input deck) whose parameters are logged

Returns:

tsadar.utils.misc.update(base_dict, new_dict)

Combines two dictionaries, overwriting common fields.

Parameters:
  • base_dict – dictionary to be modified

  • new_dict – dictionary containing new or additional values to be inserted

Returns:

combined_dict – combined dictionary with the updated values
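The merge behaviour described above can be illustrated with a minimal sketch. This page does not state whether nested dictionaries are merged recursively or whether base_dict is mutated in place, so the sketch assumes a recursive merge that returns a new dictionary. The name merge_dicts and the example keys are hypothetical and are not part of tsadar.

```python
from copy import deepcopy


def merge_dicts(base_dict, new_dict):
    """Hypothetical stand-in for update(): recursive merge where new_dict wins."""
    combined = deepcopy(base_dict)  # assumption: leave the caller's dict untouched
    for key, value in new_dict.items():
        if isinstance(value, dict) and isinstance(combined.get(key), dict):
            # Both sides hold a dict under this key: merge them field by field.
            combined[key] = merge_dicts(combined[key], value)
        else:
            # New key, or a common non-dict field: the new value overwrites.
            combined[key] = deepcopy(value)
    return combined


# Hypothetical defaults and user inputs.
defaults = {"optimizer": {"method": "adam", "lr": 0.01}, "seed": 0}
inputs = {"optimizer": {"lr": 0.001}, "name": "run-a"}
merged = merge_dicts(defaults, inputs)
```

Under these assumptions, the common field lr takes the value from inputs, while method and seed are carried over from defaults.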

tsadar.utils.misc.upload_dir_to_s3(local_directory: str, bucket: str, destination: str, run_id: str, prefix='ingest', step=0)

Uploads the contents of a local directory to an S3 bucket, preserving the directory structure. After uploading all files, creates a marker file indicating completion and uploads it to the bucket.

Parameters:
  • local_directory (str) – Path to the local directory to upload.

  • bucket (str) – Name of the S3 bucket to upload to.

  • destination (str) – S3 key prefix (folder path) where files will be uploaded.

  • run_id (str) – Identifier for the current run, used in the marker filename.

  • prefix (str, optional) – Prefix for the marker filename. Defaults to “ingest”.

  • step (int, optional) – Step number for the marker filename. Defaults to 0.

Returns:

None
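To show what "preserving the directory structure" could mean in practice, the sketch below maps each local file to an S3 key under destination. It is not the tsadar implementation: plan_s3_keys is a hypothetical helper, and the marker file is left out because its filename format is not given on this page.

```python
import os
import posixpath


def plan_s3_keys(local_directory, destination):
    """Return (local_path, s3_key) pairs; hypothetical helper, not part of tsadar."""
    pairs = []
    for root, _dirs, files in os.walk(local_directory):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            # Path relative to the uploaded directory, e.g. "sub/inner.txt".
            relative = os.path.relpath(local_path, local_directory)
            # S3 keys always use "/" regardless of the local OS separator.
            key = posixpath.join(destination, *relative.split(os.sep))
            pairs.append((local_path, key))
    return pairs


# Each pair could then be sent with boto3 (assumed to be the client in use):
#   boto3.client("s3").upload_file(local_path, bucket, key)
```

Separating key planning from the upload call keeps the mapping easy to check without network access.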

tsadar.utils.misc.export_run(run_id, prefix='ingest', step=0)

Exports an MLflow run and uploads its artifacts to an S3 bucket.

Parameters:
  • run_id (str) – The unique identifier of the MLflow run to export.

  • prefix (str, optional) – Prefix to use when uploading to S3. Defaults to “ingest”.

  • step (int, optional) – Step number or identifier for the upload process. Defaults to 0.

Side Effects:
  • Exports the specified MLflow run to a temporary directory.

  • Uploads the exported run directory to the specified S3 bucket and path.

  • Prints the time taken for export and upload operations.

Environment Variables:

BASE_TEMPDIR: If set, used as the base directory for the temporary export directory.

Raises:

Any exceptions raised by MLflow or S3 upload operations will propagate.
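The BASE_TEMPDIR behaviour described above can be sketched with the standard library alone. This is an assumption about how such a directory might be created, not the code used by export_run; make_export_dir is a hypothetical name.

```python
import os
import tempfile


def make_export_dir():
    """Create a temporary export directory, honouring BASE_TEMPDIR if it is set."""
    # os.environ.get returns None when the variable is unset, in which case
    # tempfile falls back to the system default temporary location.
    base = os.environ.get("BASE_TEMPDIR")
    return tempfile.TemporaryDirectory(dir=base)


with make_export_dir() as export_dir:
    # An exported run would be written here and uploaded before the
    # directory is removed on exit from the with-block.
    print(export_dir)
```

Setting BASE_TEMPDIR is useful when the default temporary location is too small or is not shared between the processes that need the export.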

tsadar.utils.misc.get_cfg(artifact_uri, temp_path)

Downloads configuration files from the specified artifact URI to a temporary path. Allows configuration files to be locked at queue time.

Parameters:
  • artifact_uri (str) – The URI of the artifact containing the configuration files.

  • temp_path (str) – The temporary directory path where the files will be downloaded.

Returns:

None

Note

This function currently downloads ‘defaults.yaml’ and ‘inputs.yaml’ files but does not load or return their contents.

tsadar.utils.misc.download_file(fname, artifact_uri, destination_path)

Downloads a file from an MLflow artifact URI to a specified local destination. Supports downloading from both S3 and local file system artifact URIs.

Parameters:
  • fname (str) – The name of the file to download.

  • artifact_uri (str) – The MLflow artifact URI indicating the storage location.

  • destination_path (str) – The local directory path where the file should be saved.

Returns:

str or None – The full local path to the downloaded file if successful, otherwise None.

Raises:

None – Any exceptions are handled internally and None is returned on failure.
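As an illustration of how S3 and local artifact URIs might be told apart, the sketch below classifies a URI by its scheme. How download_file does this internally is not documented here; split_artifact_uri and the example URIs are hypothetical.

```python
from urllib.parse import urlparse


def split_artifact_uri(artifact_uri):
    """Classify an artifact URI; hypothetical helper, not part of tsadar.

    Returns (kind, bucket, path), or None for an unsupported scheme,
    mirroring the "return None on failure" convention described above.
    """
    parsed = urlparse(artifact_uri)
    if parsed.scheme == "s3":
        # s3://bucket/key/prefix -> bucket name plus key prefix without "/".
        return "s3", parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme in ("", "file"):
        # Plain paths and file:// URIs both refer to the local file system.
        return "local", None, parsed.path
    return None


print(split_artifact_uri("s3://my-bucket/1/abc/artifacts"))
print(split_artifact_uri("file:///tmp/mlruns/0/abc/artifacts"))
```

Because download_file returns None instead of raising, callers should check the result before opening the returned path.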