Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(python): add capability to read unity catalog (uc://) uris #3113

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

omkar-foss
Copy link
Contributor

Description

This adds capability to read directly from uc:// uris using the local catalog-unity crate. This also exposes the UC temporary credentials in storage_options of the DeltaTable instance so polars or similar readers can use it.

Related Issue(s)

Documentation

N/A

@github-actions github-actions bot added binding/python Issues for the Python package binding/rust Issues for the Rust crate labels Jan 10, 2025
Copy link

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@omkar-foss omkar-foss force-pushed the feat-uc-python branch 2 times, most recently from 8fcd7f4 to 0824abe Compare January 10, 2025 07:03
@omkar-foss omkar-foss changed the title feat(python): Add capability to read Unity Catalog (uc://) uris feat(python): add capability to read unity catalog (uc://) uris Jan 10, 2025
@@ -266,14 +266,14 @@ pub struct TableSummary {
pub struct Table {
/// Username of table creator.
#[serde(default)]
pub created_by: String,
pub created_by: Option<String>,
Copy link
Contributor Author

@omkar-foss omkar-foss Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are instances I encountered where created_by and updated_by are null in UC Get Tables response, so we need to keep this as an Option<String> to handle it.


let mut options = storage_options.clone().unwrap_or_default();
if !uc_token.is_empty() {
options.insert("UNITY_CATALOG_TEMPORARY_TOKEN".to_string(), uc_token);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have added this key UNITY_CATALOG_TEMPORARY_TOKEN for Polars or similar readers to be able reuse the credentials. cc: @ion-elgreco

/// Allow http url (e.g. http://localhost:8080/api/2.1/...)
/// Supported keys:
/// - `unity_allow_http_url`
AllowHttpUrl,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows users to work with a local (non-https) Unity Catalog REST API with delta-rs.

Comment on lines +173 to +186
let (table_path, uc_token) = if UnityCatalogBuilder::is_unity_catalog_uri(table_uri) {
match rt().block_on(UnityCatalogBuilder::get_uc_location_and_token(table_uri)) {
Ok(tup) => tup,
Err(err) => return Err(PyRuntimeError::new_err(err.to_string())),
}
} else {
(table_uri.to_string(), "".to_string())
};

let mut options = storage_options.clone().unwrap_or_default();
if !uc_token.is_empty() {
options.insert("UNITY_CATALOG_TEMPORARY_TOKEN".to_string(), uc_token);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be a better fit for code inside an object store factory

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Als you might have to wait before @hntd187's code gets merged (#3078)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Als you might have to wait before @hntd187's code gets merged (#3078)

Yep will have to wait for #3078 to get merged before this one, the temp credentials part is in there :)

I think this would be a better fit for code inside an object store factory

Yes, this code is currently duplicated in new() and is_deltatable(). Shall I move it to a private function? Or object store factory works too, let me know what you have in mind about that.

This adds capability to read directly from uc:// uris using the
local catalog-unity crate. This also exposes the UC temporary
credentials in storage_options of the `DeltaTable` instance so
polars or similar readers can use it.

Signed-off-by: Omkar P <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Python-side support for Unity Catalog
3 participants