Fix updating ID for resource #3239

ThibaudDauce · 2025-01-02T14:57:43Z

No description provided.

bolinocroustibat

LGTM, that was quickly done!
Would it make sense, in the future, to create the ID on the udata server side?

ThibaudDauce · 2025-01-03T08:52:45Z

> LGTM, that was quickly done! Would it make sense, in the future, to create the ID on the udata server side?

The front-end uses client-side ID generation to send all the chunks in parallel, so I'm not sure it's possible without changing the API (for example, having a client-side UUID, then a real UUID saved in the database at the end of the upload process)… https://github.com/datagouv/front-end/blob/700fa2e57f9ae347c7b59e8eb1e5dbbff659a475/utils/datasets.ts#L175-L214
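
A minimal sketch of the idea floated above: chunks are grouped by a client-side UUID while the upload is in flight, and a separate server-side UUID is generated only once the upload is complete. `ChunkStore`, `PendingUpload`, and their methods are hypothetical names for illustration, not udata's upload code.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PendingUpload:
    """Chunks received so far for one client-side upload UUID."""

    total_chunks: int
    chunks: dict[int, bytes] = field(default_factory=dict)


class ChunkStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self) -> None:
        self._pending: dict[str, PendingUpload] = {}

    def add_chunk(self, client_uuid: str, index: int, total: int, data: bytes) -> None:
        if not 0 <= index < total:
            raise ValueError("chunk index out of range")
        # Chunks may arrive in any order because the client sends them in parallel.
        upload = self._pending.setdefault(client_uuid, PendingUpload(total))
        upload.chunks[index] = data

    def finalize(self, client_uuid: str) -> tuple[uuid.UUID, bytes]:
        upload = self._pending.get(client_uuid)
        if upload is None or len(upload.chunks) != upload.total_chunks:
            raise ValueError("upload is incomplete")
        del self._pending[client_uuid]
        content = b"".join(upload.chunks[i] for i in range(upload.total_chunks))
        # The stored ID is generated server side, independently of client_uuid.
        return uuid.uuid4(), content
```

In this sketch the client UUID never becomes the resource ID; it only correlates the chunks of one file.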

magopian

Nice! Thanks for the investigation and resolution, good job!

udata/core/dataset/api.py

udata/tests/api/test_datasets_api.py

maudetes

Thank you for the investigation and proposed solution!

I don't think the resource id field is currently set by the frontend.
The uuid header in the multipart request isn't currently used to create the Resource object; it is only used to identify which file the chunks belong to.
Comparing the uuid with the resource id shows they differ in this chunk upload test case.

I think we should make sure the resource id field is entirely readonly instead.
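
One way to picture a fully read-only ID, as a sketch only: ignore any client-supplied `id` on creation and refuse to change it on update. The functions `create_resource` and `update_resource` and the dict-based payloads are assumptions for illustration; they are not udata's API.

```python
import uuid


def create_resource(payload: dict) -> dict:
    """Build a new resource, ignoring any client-supplied ``id``."""
    data = {key: value for key, value in payload.items() if key != "id"}
    data["id"] = uuid.uuid4()  # always generated server side
    return data


def update_resource(resource: dict, payload: dict) -> dict:
    """Apply an update while keeping the existing ``id`` untouched."""
    if "id" in payload and str(payload["id"]) != str(resource["id"]):
        raise ValueError("resource id is read-only")
    changes = {key: value for key, value in payload.items() if key != "id"}
    return {**resource, **changes}
```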

maudetes

I think we'll need to deal with existing duplicated resource IDs before merging & deploying?

maudetes · 2025-01-13T14:30:50Z

udata/core/dataset/api.py

@@ -377,8 +383,9 @@ class ResourcesAPI(API):
    def post(self, dataset):
        """Create a new resource for a given dataset"""
        ResourceEditPermission(dataset).test()
-        form = api.validate(ResourceForm)
-        resource = Resource()
+        form = api.validate(ResourceFormWithoutId)


maudetes · 2025-01-13T14:32:40Z

udata/core/dataset/forms.py

@@ -104,6 +104,10 @@ class ResourceForm(BaseResourceForm):
    id = fields.UUIDField()


+class ResourceFormWithoutId(BaseResourceForm):


I'm confused we use ResourceFormWithoutId in DatasetForm?

Because otherwise we don't have the ID for the reorder, for example…
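
To illustrate why a reorder needs the IDs in the submitted form, here is a hedged sketch of reordering resources from a list of IDs. The `reorder` function and the dict-based resources are hypothetical; this is not the actual udata endpoint.

```python
def reorder(resources: list[dict], ordered_ids: list[str]) -> list[dict]:
    """Return the resources in the order given by ``ordered_ids``."""
    by_id = {str(res["id"]): res for res in resources}
    if len(by_id) != len(resources):
        raise ValueError("duplicate resource IDs in dataset")
    # The submitted IDs must be exactly the existing IDs, each listed once.
    if len(ordered_ids) != len(by_id) or set(ordered_ids) != set(by_id):
        raise ValueError("ordered IDs must match the dataset's resource IDs")
    return [by_id[res_id] for res_id in ordered_ids]
```

Without stable, unique IDs in the payload, there is nothing to match the submitted order against.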

maudetes · 2025-01-13T14:33:48Z

udata/core/dataset/models.py

+        resources_ids = set()
+        for resource in self.resources:
+            if resource.id in resources_ids:
+                raise MongoEngineValidationError(
+                    f"Duplicate resource ID {resource.id} in dataset #{self.id}."
+                )
+            resources_ids.add(resource.id)


Could we do something like?

if len(set(res.id for res in self.resources)) != len(self.resources): raise...

I think it's slower since there are two iterations, but yes, it's shorter…
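
The two variants discussed in this thread can be compared side by side in a standalone sketch. The `Resource` dataclass here is a stand-in for illustration, not the udata model, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    id: str


def first_duplicate_id(resources):
    """Loop variant: stops at the first duplicate and returns its ID, else None."""
    seen = set()
    for res in resources:
        if res.id in seen:
            return res.id
        seen.add(res.id)
    return None


def has_duplicate_ids(resources):
    """Compact variant: compares the number of distinct IDs to the list length."""
    return len({res.id for res in resources}) != len(resources)
```

The loop variant can name the offending ID in the error message and exits early; the compact variant always builds the full set but fits on one line.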

maudetes · 2025-01-13T14:35:52Z

udata/core/dataset/models.py

@@ -335,7 +335,7 @@ def to_mongo(self, *args, **kwargs):


 class ResourceMixin(object):
-    id = db.AutoUUIDField(primary_key=True)
+    id = db.AutoUUIDField(primary_key=True, unique=True)


Do we need the manual check if we already have unique=True?

Yes, I can remove unique=True; I didn't find a way to make it work.

I removed it. Adding it makes a lot of tests fail, and I don't know why… https://app.circleci.com/pipelines/github/opendatateam/udata/5863/workflows/ee92a1f9-786c-42dc-a1e9-dc5b5a3be992/jobs/33495?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-checks-link&utm_content=summary

maudetes · 2025-01-13T14:36:42Z

udata/core/dataset/models.py

@@ -638,6 +638,14 @@ def clean(self):
        if self.frequency in LEGACY_FREQUENCIES:
            self.frequency = LEGACY_FREQUENCIES[self.frequency]

+        resources_ids = set()


These manual checks only care about the current dataset though?

Yes, we cannot check across resources in different datasets (but I think resources are always scoped by the dataset ID, so it shouldn't be a problem, no?)
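
A small sketch of the assumption made in this reply: if every lookup goes through the dataset first, uniqueness only has to hold within one dataset. The `Catalog` class is hypothetical and only models that assumption; it does not show that udata always resolves resources this way.

```python
class Catalog:
    """Hypothetical store where resources are keyed per dataset."""

    def __init__(self) -> None:
        self._datasets: dict[str, dict[str, dict]] = {}

    def add_resource(self, dataset_id: str, resource: dict) -> None:
        resources = self._datasets.setdefault(dataset_id, {})
        # Uniqueness is enforced only inside the current dataset.
        if resource["id"] in resources:
            raise ValueError(
                f"Duplicate resource ID {resource['id']} in dataset #{dataset_id}."
            )
        resources[resource["id"]] = resource

    def get_resource(self, dataset_id: str, resource_id: str) -> dict:
        # Lookups are scoped by dataset, so equal IDs elsewhere do not collide.
        return self._datasets[dataset_id][resource_id]
```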

ThibaudDauce added 4 commits January 2, 2025 15:57

Fix updating ID for resource

615601f

Some cleanups

5c8ccda

Prevent creating a new resource with the same ID

bd0ca4a

add comment

17af9af

bolinocroustibat self-requested a review January 2, 2025 17:29

bolinocroustibat approved these changes Jan 2, 2025

magopian approved these changes Jan 6, 2025

maudetes reviewed Jan 7, 2025

ThibaudDauce and others added 11 commits January 13, 2025 11:19

Generate a new UUID instead of using the one sent by the client

00bd6d8

Fix tests

df51cd7

Add check for duplicate resource ID in dataset clean

210fbe6

Merge branch 'master' into fix_same_id_for_resources

89b7a95

Update changelog

868d742

Try adding one more test

2ca5df0

Fix test reorder

070f95d

Try to add unique index in Mongo

5741cb5

Try to fix all the tests

25b2a92

Do check of ids in add_resource

6fda0a2

Add back request import

8b35df7

maudetes reviewed Jan 13, 2025

Remove unique=True

94f4f2d

maudetes mentioned this pull request Jan 14, 2025

More safeguards in the resource reorder endpoint #3243

Draft
