Support for conditional column writes #3066

panbamid-r · 2025-01-08T13:54:08Z

Is your feature request related to a problem? Please describe.
I'm working on a project where I need to overwrite a specific subset of columns in an Iceberg table via Athena. The current implementation doesn't support this, and it isn't feasible to load all the data into memory and process it in Python just so that I can overwrite that subset.

Describe the solution you'd like
I would like the to_iceberg function in the awswrangler.athena module to support partial column overwrites. This could be an additional argument for the function.

Describe alternatives you've considered
The alternative would be to load everything into memory (as described above).

Additional context


GrumpyCat51 · 2025-01-08T15:25:16Z

Here is a minimal example of what we are trying to do:

We have an Iceberg table structured like this:

id	label	...(several other columns)
1	0	...
2	0	...
3	0	...
4	0	...
5	0	...
6	0	...
7	0	...

Then we calculate updated labels for a subset of the rows, e.g.:

id	label
3	1
4	1

Now we want to update the original table with these values, without first having to download all the additional columns, so that the result is:

id	label	...(several other columns)
1	0	...
2	0	...
3	1	...
4	1	...
5	0	...
6	0	...
7	0	...
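The intended semantics can be sketched in plain Python, independent of awswrangler and Athena: only the `label` value changes for rows whose `id` matches, and every other column is left untouched. The function name `apply_partial_update` and the `other` column are made up for illustration and are not part of any library.

```python
def apply_partial_update(rows, updates, key, columns):
    """Return copies of `rows` with `columns` overwritten where `key` matches."""
    # Index the updates by key so each row needs a single lookup.
    updates_by_key = {u[key]: u for u in updates}
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are not mutated
        match = updates_by_key.get(row[key])
        if match is not None:
            for col in columns:
                new_row[col] = match[col]
        result.append(new_row)
    return result


# Hypothetical stand-in for the table above; "other" represents the extra columns.
table = [{"id": i, "label": 0, "other": f"payload-{i}"} for i in range(1, 8)]
updates = [{"id": 3, "label": 1}, {"id": 4, "label": 1}]

updated = apply_partial_update(table, updates, key="id", columns=["label"])
```

The point of the request is to get this result without materializing `table` on the client, which is what the in-memory version above has to do.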

The current implementation cannot do this: with fill_missing_columns_in_df = False it tries to change the table structure and raises an exceptions.InvalidArgumentCombination error, or it replaces the values in all additional columns with None/NULL.
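One way to express such an update on the Athena side is a `MERGE INTO` statement that sets only the chosen columns. The sketch below merely builds the SQL string. It assumes the updated ids and labels have already been written to a staging table, and the helper name `build_partial_update_sql` and the table names `my_db.my_table` and `my_db.label_updates` are hypothetical. It has not been run against Athena and is not necessarily how the fork mentioned in this thread works.

```python
def build_partial_update_sql(target, staging, key_cols, update_cols):
    """Build a MERGE INTO statement that updates only `update_cols`."""
    # Rows are matched on the key columns ...
    on = " AND ".join(f't."{c}" = s."{c}"' for c in key_cols)
    # ... and only the listed columns are assigned; the rest stay as they are.
    sets = ", ".join(f'"{c}" = s."{c}"' for c in update_cols)
    return (
        f"MERGE INTO {target} t USING {staging} s ON ({on}) "
        f"WHEN MATCHED THEN UPDATE SET {sets}"
    )


# Hypothetical table names, for illustration only.
sql = build_partial_update_sql(
    target="my_db.my_table",
    staging="my_db.label_updates",
    key_cols=["id"],
    update_cols=["label"],
)
```

Because there is no `WHEN NOT MATCHED` clause, ids that are missing from the target would be ignored rather than inserted with NULLs in the other columns.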

I'd be happy to open a PR for this if desired. As a suggestion, I've already made a fork here main...GrumpyCat51:aws-sdk-pandas:main that we have tested successfully, and I'm glad to change, improve, or adapt it as needed.

panbamid-r added the feature label Jan 8, 2025
