This repository has been archived by the owner on Feb 3, 2021. It is now read-only.
Hello @jafreck @timotheeguerin
Right now, when aztk.spark.client.Client.submit() is called in the AZTK Spark SDK, it assumes that the jars and files fields of ApplicationConfiguration contain paths to local files.
In our case, the Spark job resources are already uploaded to Azure Blob Storage, so we want to avoid downloading and uploading them again.
From what I see, aztk.spark.client.Client.submit() calls generate_task, which uploads the files to blob storage, generates ResourceFiles for them, replaces the local paths with file names in the application config, and uploads that config to blob storage as an application.yml file.
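To make the flow I'm describing concrete, here is a rough sketch of what I understand generate_task to be doing. It is not the actual AZTK code: ResourceFile, AppConfig, generate_task_sketch and the upload callback are all stand-ins I made up for illustration, and the step that writes application.yml is left out.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ResourceFile:
    # Hypothetical stand-in: a blob URL plus the file name used on the node.
    blob_source: str
    file_path: str


@dataclass
class AppConfig:
    # Hypothetical, trimmed-down stand-in for ApplicationConfiguration.
    name: str
    jars: List[str] = field(default_factory=list)
    files: List[str] = field(default_factory=list)


def generate_task_sketch(
    app: AppConfig, upload: Callable[[str], str]
) -> Tuple[AppConfig, List[ResourceFile]]:
    """Upload every local path and rewrite the config to bare file names.

    `upload` is an assumed callback that takes a local path and returns
    the URL of the uploaded blob.
    """
    resource_files: List[ResourceFile] = []

    def stage(local_path: str) -> str:
        file_name = os.path.basename(local_path)
        url = upload(local_path)  # the upload we would like to skip
        resource_files.append(ResourceFile(blob_source=url, file_path=file_name))
        return file_name

    staged = AppConfig(
        name=app.name,
        jars=[stage(p) for p in app.jars],
        files=[stage(p) for p in app.files],
    )
    # The real flow would now serialize `staged` and upload it as
    # application.yml; that part is omitted here.
    return staged, resource_files
```

The point is that every entry in jars and files goes through the upload step, even when the same file already lives in a blob container.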
I would like to have an option to provide resource_files directly to Client.submit() and thus skip uploading files.
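Roughly, the behavior I have in mind could look like the sketch below. Again, this is only an illustration under my own assumptions: the ResourceFile stand-in, the resolve_resource_files helper and the upload callback are hypothetical names, not part of AZTK.

```python
import os
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ResourceFile:
    # Hypothetical stand-in: a blob URL plus the file name used on the node.
    blob_source: str
    file_path: str


def resolve_resource_files(
    local_paths: List[str],
    upload: Callable[[str], str],
    resource_files: Optional[List[ResourceFile]] = None,
) -> List[ResourceFile]:
    """Return the resource files for a task.

    If the caller already has resource files pointing at existing blobs,
    use them as-is and never call `upload`. Otherwise fall back to
    uploading each local path, as happens today.
    """
    if resource_files is not None:
        return list(resource_files)
    return [
        ResourceFile(blob_source=upload(path), file_path=os.path.basename(path))
        for path in local_paths
    ]
```

With something like this, passing resource_files would bypass the upload entirely, while omitting it would keep the existing local-file behavior unchanged.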
Right now we use a workaround where we basically reimplement generate_task and generate the resource_files for our blobs ourselves. This seems brittle, as it is coupled to the AZTK SDK implementation and can break when AZTK changes in the future.
I think this is a great feature. We should support both scenarios - local upload and referencing existing files in storage. Thanks for the feature request!