Add a user ID column to the CSV import format #75

Open
tlater-famedly opened this issue Oct 18, 2024 · 2 comments
Comments

@tlater-famedly
Contributor

To track changes to users between updates, we need to be able to uniquely identify them. Since the source is authoritative, an ID from the source must be used to do this.

Currently, the CSV source reuses the user's email address as this ID. Because an email address is PII, we cannot pseudonymise CSV entries. We should add a dedicated ID column to make that possible.

Things to consider:

  • This will be a very breaking change, and since the tool is already used in production here and there, we might need a way to migrate.
    • This could simply be a check for existing users with the given email address as an external ID, but it would be nice to find a way to prevent us having to lug around code for this forever...
  • Technically the ID column can be entirely free-form, but we may want to enforce it containing UUIDs or simple numeric IDs to prevent user error causing PII leaks.
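
To illustrate the last point, here is a minimal sketch of what "enforce UUIDs or simple numeric IDs" could look like. It is written in Python purely for illustration and is not the tool's code. The column names `external_id` and `email`, and both function names, are hypothetical.

```python
import csv
import io
import re
import uuid

# ASCII digits only; "\d" would also accept other Unicode digits.
_NUMERIC_ID = re.compile(r"[0-9]+")


def is_acceptable_external_id(value: str) -> bool:
    """Accept plain numeric IDs or canonically formatted UUIDs, nothing else."""
    if _NUMERIC_ID.fullmatch(value):
        return True
    try:
        # uuid.UUID also parses braces, "urn:uuid:" prefixes and unhyphenated
        # hex, so compare against the canonical form to stay strict.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def read_rows(text: str):
    """Yield CSV rows, rejecting any row whose ID is not a UUID or a number.

    The `external_id` column name is an assumption made for this sketch.
    """
    reader = csv.DictReader(io.StringIO(text))
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        ext_id = row.get("external_id") or ""
        if not is_acceptable_external_id(ext_id):
            # Deliberately do not echo the value: it may be an email address.
            raise ValueError(
                f"line {line_no}: external_id must be a UUID or a numeric ID"
            )
        yield row
```

With a check like this, someone pasting email addresses into the ID column gets an error at import time instead of silently reintroducing PII as the identifier.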
@jannden
Contributor

jannden commented Dec 13, 2024

After clarification with Niklas, this task is about external user IDs. This is different from #96, which is about Zitadel user IDs.

@tlater-famedly
Contributor Author

tlater-famedly commented Jan 3, 2025

They are in fact the same, as #96 is about using the "localpart" as the Zitadel ID, which is just the external user ID encoded with UUIDv5 in our namespace. You cannot get the original value from a UUIDv5, but we could have just had one field for the ID, and encoded it. If we still want this, we'll need some migration now, but I suppose we can also just use the UUIDv5 value for a pseudonymized logging identifier.

The only issue with that is that the CSV source now needs separate code paths for what is considered its "ID", which is awkward.
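
For readers unfamiliar with the UUIDv5 encoding mentioned above, here is a small Python sketch of the idea. The namespace value is made up for the example (the project's real namespace is not stated in this thread), and `encode_external_id` is a hypothetical name.

```python
import uuid

# Hypothetical namespace, derived only so the example is reproducible.
# The real namespace used by the tool is not given in this thread.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "sync.example.invalid")


def encode_external_id(external_id: str) -> uuid.UUID:
    """Derive a stable UUIDv5 from an external ID within NAMESPACE.

    UUIDv5 is a SHA-1 hash of namespace + name, so the same input always
    gives the same UUID, and the original value is not recoverable from it.
    """
    return uuid.uuid5(NAMESPACE, external_id)


# Usage: an email-based external ID and the identifier derived from it.
derived = encode_external_id("alice@example.org")
print(derived.version)  # UUID version of the derived identifier
```

Because the mapping is deterministic, the derived value could act as the pseudonymized logging identifier. One caveat: anyone who knows the namespace and can guess an email address can recompute its UUID, so this is pseudonymisation rather than anonymisation.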
