Private model training (DP-SGD) with sparse features #1370

Open
Lufe44 opened this issue Dec 17, 2024 · 1 comment

Lufe44 commented Dec 17, 2024

Hello,

Private model training has been recently mentioned here. One of the privacy considerations is to include DP in the training loop through DP-SGD.

There are cases where DP-SGD makes training considerably slower because it destroys the sparsity of the gradients computed during backprop, making it impossible to use optimization techniques that rely on that sparsity. This is usually the case when some features are categorical, or when working with embedding tables, as in LLMs. I am aware there is research on remedies for this, but it is not clear from the explainer linked above whether this has been considered in the context of the Protected Audience API.
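To make the sparsity problem concrete, here is a minimal sketch in plain Python. It is not taken from the explainer or from any Protected Audience code; the function names, table size, and placeholder gradient values are all assumptions chosen for illustration, and it handles a single example rather than a full batch with per-example clipping. It shows that clipping keeps an embedding gradient sparse, while the Gaussian noise step of DP-SGD touches every coordinate.

```python
import random

def embedding_gradient(vocab_size, dim, active_rows):
    """Non-private gradient of an embedding table: only looked-up rows are nonzero."""
    grad = [[0.0] * dim for _ in range(vocab_size)]
    for row in active_rows:
        grad[row] = [1.0] * dim  # placeholder per-row gradient values (assumption)
    return grad

def clip(grad, clip_norm):
    """Scale the whole gradient so its L2 norm is at most clip_norm."""
    norm = sum(v * v for row in grad for v in row) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [[v * scale for v in row] for row in grad]

def add_gaussian_noise(grad, sigma, rng):
    """DP-SGD noise step: independent Gaussian noise on every coordinate."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in grad]

def nonzero_rows(grad):
    """Count rows that would have to be written in an embedding update."""
    return sum(1 for row in grad if any(v != 0.0 for v in row))

rng = random.Random(0)
vocab_size, dim = 1000, 8
grad = embedding_gradient(vocab_size, dim, active_rows=[3, 42, 977])
clipped = clip(grad, clip_norm=1.0)
noisy = add_gaussian_noise(clipped, sigma=1.0, rng=rng)

print(nonzero_rows(grad))     # 3 rows touched without DP
print(nonzero_rows(clipped))  # still 3 after clipping
print(nonzero_rows(noisy))    # every row is nonzero after noise is added
```

With a real embedding table of millions of rows, the last step is what turns a cheap sparse update into a dense one.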

Are there any techniques being considered to address this, or any thoughts on the topic?

Thanks

@csharrison
Contributor

Hi @Lufe44. Thank you very much for the feedback. It is useful to know that this is a concern.

Yes, we are aware of some techniques to preserve sparsity in DP-SGD, e.g. https://arxiv.org/abs/2404.10881, and we are indeed exploring support for these kinds of techniques in the private model training infrastructure. However, since we are still in the research phase, we can't yet make any commitments as to whether it will be feasible for us to support them.

Did you have any specific approaches you wanted us to consider?
