Private model training was recently mentioned here. One of the privacy considerations is to include differential privacy (DP) in the training loop via DP-SGD.
There are cases where DP-SGD makes training considerably slower because it destroys the sparsity of the gradients computed during backprop, making it impossible to use optimization techniques that rely on that sparsity. This typically happens when some features are categorical, or when working with embedding tables as in LLMs (see the sketch below). I am aware there is research on remedying this, but it is not clear from the explainer linked above whether it has been considered in the context of the Protected Audience API.
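To make the concern concrete, here is a minimal sketch (plain NumPy, illustrative only; per-example clipping is collapsed into one batch gradient for brevity) of why vanilla DP-SGD densifies an embedding gradient: the gradient touches only a few rows of the table, but the Gaussian noise is added to every coordinate, so the noised gradient is dense and row-sparse optimizer tricks no longer apply.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab, dim = 10_000, 16          # embedding table: 10k rows, 16 dims
grad = np.zeros((vocab, dim))    # gradient w.r.t. the table
grad[[3, 42, 7]] = rng.normal(size=(3, dim))  # only 3 rows touched -> sparse

# Vanilla DP-SGD: clip, then add isotropic Gaussian noise to EVERY coordinate.
clip_norm, noise_mult = 1.0, 1.1
grad *= min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
noised = grad + rng.normal(scale=noise_mult * clip_norm, size=grad.shape)

print(np.count_nonzero(grad.any(axis=1)))    # 3 nonzero rows before noise
print(np.count_nonzero(noised.any(axis=1)))  # ~10_000 nonzero rows after
```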
Are any techniques being considered to address this, or are there any thoughts on the topic?
Thanks
Hi @Lufe44. Thank you very much for the feedback. It is useful to know that this is a concern.
Yes, we are aware of some techniques for preserving sparsity in DP-SGD, e.g. https://arxiv.org/abs/2404.10881, and we are indeed exploring support for these kinds of techniques in the private model training infrastructure. However, since we are still in the research phase, we can't yet commit to whether it will be feasible for us to support them.
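For readers following along, a hedged sketch of one common flavor of such techniques (not necessarily the algorithm in the linked paper, and not a commitment about the infrastructure; the threshold and noise values below are placeholders): privately select a small set of rows via noisy-count thresholding, then add Gaussian noise only to the selected rows, so the update stays row-sparse. The selection step itself must be differentially private, otherwise the row support would leak.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, dim, batch = 10_000, 16, 256

# Simulate per-example gradients that each touch a few (skewed) embedding rows.
grad = np.zeros((vocab, dim))
counts = np.zeros(vocab)
for _ in range(batch):
    rows = np.minimum(rng.zipf(1.5, size=4), vocab) - 1   # hot-row skew
    np.add.at(grad, rows, rng.normal(size=(4, dim)) / batch)
    np.add.at(counts, rows, 1.0)

# Noisy-count thresholding: keep only rows whose noised touch-count clears
# a threshold, then noise just those rows. Per-example clipping is omitted
# for brevity.
clip_norm, noise_mult, threshold = 1.0, 1.1, 20.0
noisy_counts = counts + rng.normal(scale=noise_mult, size=vocab)
selected = noisy_counts > threshold

noised = np.zeros_like(grad)
noised[selected] = grad[selected] + rng.normal(
    scale=noise_mult * clip_norm, size=(int(selected.sum()), dim))

print(int(selected.sum()), "of", vocab, "rows updated")   # stays sparse
```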
Did you have any specific approaches you wanted us to consider?