TEE Inference service #55
Hi! For more details regarding our inference proposal, please see our explainer here: https://github.com/privacysandbox/protected-auction-services-docs/blob/main/inference_overview.md

We are planning a presentation in the WICG call, tentatively for April 10th. We will add it to the agenda (https://docs.google.com/document/d/1Hk6uW-i4KPUb-u20E-EWbu8_EbPnzcptnhV9mxBP5Mo/edit#heading=h.5xmqy05su18g) once the date is finalized. Initially we will support TensorFlow (C++) and PyTorch libtorch runtimes, but not the ONNX runtime. We can investigate ONNX runtime support in the future based on feedback we receive from the community.

Akshay
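For readers skimming this thread, here is a rough sketch of what calling inference from a generateBid UDF could look like. The JSON field names (model_path, tensors, tensor_shape, tensor_content), the model path, and the return shape are illustrative assumptions, not the confirmed API; the authoritative request/response schema is in the inference explainer linked above.

```javascript
// Hypothetical sketch of invoking the inference API from a generateBid UDF.
// Field names, the model path, and the return shape are assumptions for
// illustration; see the inference explainer for the actual runInference() schema.
function generateBid(interestGroup, auctionSignals, perBuyerSignals,
                     trustedBiddingSignals, browserSignals) {
  const batchInferenceRequest = {
    request: [
      {
        model_path: "pcvr_models/v1",       // assumed model identifier
        tensors: [
          {
            tensor_name: "input",
            data_type: "FLOAT",
            tensor_shape: [1, 2],
            tensor_content: ["0.5", "0.25"], // example feature values
          },
        ],
      },
    ],
  };

  // runInference is the API the sidecar exposes to the JavaScript UDF.
  const resultJson = runInference(JSON.stringify(batchInferenceRequest));
  const result = JSON.parse(resultJson);

  // Treat the first output value as a predicted conversion rate (sketch only).
  const pcvr = Number(result.response[0].tensors[0].tensor_content[0]);
  return { bid: 100 * pcvr, render: interestGroup.ads[0].renderUrl };
}
```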
We recently open-sourced the code for inference. Feel free to check it out at: https://github.com/privacysandbox/bidding-auction-servers/tree/release-3.4/services/inference_sidecar
Hey @akshaypundle, is the following correct, that the inference service:
If that's right, what is achieved by those constraints? I'm thinking through this more broadly for server-side private auctions in general (including ASAPI), but specifically for inference, why not do something like:
The second option would open up a few really valuable possibilities:
I understand and support having carefully crafted output gates from the TEEs, but I'm wondering, both in general and here for inference, why we wouldn't open things up within the TEEs and avoid constraining operations within them.
Also @akshaypundle, re: the demo, on Wed April 10th a bunch of folks will be at the PATCG F2F.
Hi Isaac, thanks very much for the feedback.

Regarding constraints on where inference can be run: Our initial proposal is to implement inference in the generateBid and prepareDataForAdRetrieval UDFs (on the bidding servers). If there is ecosystem interest, we could expand the sidecars to Auction servers or K/V servers. This means that the same inference API can be made available to UDFs running on Auction, K/V or other TEE servers in the future.

Regarding separate scaling of servers: such considerations do exist in the current PA/PAS systems, and that is one of the reasons the K/V servers are separate from the B&A servers. Extracting inference into its own service comes with privacy challenges, though. For example, just the fact that an inference call was made from a UDF is a 1-bit leak: observing the traffic to the servicing TEE would give an observer this information. Since the number of runInference calls is not limited, a scheme could probably be devised to leak any number of bits from the UDF just by observing whether inference calls were made or not. Such cases use techniques like chaffing (see here) to reduce privacy risk, but these add to cost. We would need a detailed analysis of the threat model and mitigations before we can have a separate inference service.

Pulling out inference as its own service is something that we will consider more in the future, as we look at more use cases and at independently scaling the service, especially once machines with accelerators are available in the TEE. Our current architecture runs a gRPC service inside the sandbox (on the same host). This could be extracted into a distinct service in the future; the gRPC design helps keep our options open. The current design helps us sidestep some privacy concerns and deliver a valuable feature more quickly, so we can get feedback and iterate. As we get more data, we are open to making changes to the architecture to provide maximum flexibility and utility while preserving user privacy.

Regarding your proposal: I am not sure I fully understand it. Are you saying that you would like to run ad tech provided JavaScript code inside Roma on a different machine, which can be accessed from UDFs (all involved machines being TEEs)? Where will the inference code run? Will it run inside Roma (JavaScript / Wasm)? What backends will run inference? In our proposal, we run the C++ inference library backends (for TF and PyTorch) and make them available to the JavaScript UDFs through an API. This means the predictions run on mature, battle-tested, performant systems (the core TF and PyTorch libraries).

The next WICG-servers call is scheduled for April 24th and we are planning to do the inference presentation in that call. I'm happy to discuss this more. Thanks!
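To make the chaffing point above concrete, here is a hypothetical sketch (not part of the current design) of how a UDF could mask whether it actually needed a prediction: it always issues exactly one runInference call of the same shape and simply discards the result when it was only chaff. The helper name buildDummyRequest is an assumption for illustration.

```javascript
// Hypothetical chaffing sketch: the UDF issues exactly one runInference()
// call whether or not the bidding logic needs a prediction, so an observer
// of traffic to the inference TEE cannot learn that one bit of information.
// buildDummyRequest() is an assumed helper returning a request with the same
// model and tensor shapes as a real one, filled with constant values.
function predictWithChaff(needsPrediction, realRequest, buildDummyRequest) {
  const request = needsPrediction ? realRequest : buildDummyRequest();
  const resultJson = runInference(JSON.stringify(request)); // always called once
  return needsPrediction ? JSON.parse(resultJson) : null;   // discard the chaff result
}
```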
@akshaypundle Can you confirm whether adding inference to the key/value service is also planned? I found some documentation about the ad retrieval overview, but not about this dedicated API: https://github.com/privacysandbox/protected-auction-key-value-service/blob/release-0.16/docs/ad_retrieval_overview.md For simplicity, it seems important to be able to run inference from inside one single container, which seems to be the key/value service TEE (as described in the ad retrieval doc).
@fhoering, the
My understanding of this document is that it is not specific to Android, and that the workflow could be applied even to Chrome web on-device workflows. Today the key/value service already supports UDFs. So for on-device auctions, which is what is currently being tested, it seems natural to also support inference without having to deploy and maintain additional bidding servers.
My comment was just related to the
Thanks @fhoering and @galarragas. @fhoering, to answer your question, inference will not initially be available from the K/V codebase; it will be available only on the bidding server. The bidding server runs both the generateBid and prepareDataForAdRetrieval UDFs. In the future, it may be possible to extend inference to K/V or other TEE servers (e.g. the Auction server), but as of now the additional inference capabilities will only be available on the Bidding server.
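For context on why both bidding-server UDFs matter, here is a hedged sketch of how prepareDataForAdRetrieval might use the same runInference API, for example to produce an embedding that drives ad retrieval in the Protected App Signals flow. The function arguments, JSON schema, and model path are illustrative assumptions, not the confirmed API.

```javascript
// Hypothetical sketch: the prepareDataForAdRetrieval UDF on the bidding
// server calls the same runInference() API as generateBid, here to turn
// decoded app signals into an embedding for ad retrieval. Field names,
// arguments, and the model path are assumptions for illustration only.
function prepareDataForAdRetrieval(encodedOnDeviceSignals, auctionMetadata) {
  const request = {
    request: [
      {
        model_path: "embedding_models/v2",       // assumed model identifier
        tensors: [
          {
            tensor_name: "signals",
            data_type: "FLOAT",
            tensor_shape: [1, 3],
            tensor_content: ["0.1", "0.7", "0.2"], // would be derived from encodedOnDeviceSignals
          },
        ],
      },
    ],
  };
  const result = JSON.parse(runInference(JSON.stringify(request)));
  // Return the embedding so the ad retrieval step can use it for candidate selection.
  return { embedding: result.response[0].tensors[0].tensor_content.map(Number) };
}
```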
@akshaypundle It should be made available. How can we move forward on this?
Hi @fhoering, yes, I think creating a new GitHub issue and adding it to the agenda for discussion sounds good! Akshay
Hey @akshaypundle and @galarragas, I want to confirm in which of the 3 flows the Inference Service (runInference) will be available.
I think the documentation, plus the overlapping "Protected Audience" / "Protected App Signals" names, despite the APIs having differences, is causing me some confusion. For instance:
My best guess ATM is that it's supported in both flows on Android, and not in the Chrome one?
@akshaypundle @TrentonStarkey while there's some energy on this, a ping on the specific comment right above, re: which flows inference is available in.
@thegreatfatzby Currently, the inference service only works with Android Protected App Signals. We plan to expand this to Protected Audience auctions in B&A for both Chrome and Android, but we don't have specific timelines yet. We'll share more details in public explainers as they become available.
Hi all, the Bidding and Auction services will be updated to support inference for the Protected Audience flow in upcoming releases. This is currently scheduled for the next release (4.3.0).
Akshay
As part of last week's call, I'm raising this to request more details about the TEE inference service. Will the ONNX runtime be supported in this inference service?