[Extensibility] ZeroConf #132

DazWilkin · 2020-11-17T21:51:51Z

**Is your feature request related to a problem? Please describe.**g

I've begun prototyping (!) a ZeroConf protocol implementation for Akri

Is your feature request related to a way you would like Akri extended? Please describe.

ZeroConf is probably (!) a useful protocol implementation particularly due to its reliable naming of (transient) devices.

Describe the solution you'd like

Akri protocol implementation with Broker samples.

kate-goldenring · 2020-11-18T21:48:48Z

@DazWilkin, can you put in a PR with a proposal for what you envision? Maybe something like the one for OPC UA that provides background on zero conf, what filters/settings will be a part of the Configuration, and any security considerations?

DazWilkin · 2020-11-19T17:08:55Z

Yes, I can draft something.

To set expectations: I'm mostly unfamiliar with ZeroConf and am writing more of a proof-of-concept using it to learn more about ZeroConf, Akri and Rust. I don't have any production uses of ZeroConf.

It would be useful to read a proposal written by folks that are familiar with ZeroConf and who use it in production.

I'll draft a proposal and include programming notes, concerns, omissions, feature requests etc.

https://github.com/DazWilkin/akri/blob/proposal-zeroconf/docs/proposals/zeroconf.md

DazWilkin · 2020-11-19T20:33:09Z

@kate-goldenring @bfjelds
Should I raise this as a separate issue? Or would you prefer I continue to keep ZeroConf-related questions here?

The zeroconf crate has a bunch of Linux dependencies, see #104 (comment) and it requires that the agent be started with several daemons.

Dockerfile.agent was:

ENTRYPOINT ["./agent"]

To:

ENTRYPOINT ["bash","-c","service dbus start && service avahi-daemon start && ./agent"]

This uncommon approach yields a shell running the 2 services and the agent.

Problems:

As mentioned in [Extensibility] Too(?)-tightly coupled #104, more agents potentially bulks out the agent image
Two (more) services for ZeroConf but what happens as other protocols add more

kate-goldenring · 2020-11-23T18:43:04Z

Hey Daz,

I think we can keep this issue as our thread for Zeroconf questions.

Maybe on Slack we can make a shout out to see who has experience with Zeroconf. Like you said, it would be great to get their input on a proposal. Zeroconf is also new to me.

As to Zeroconf bulking up the Agent, hopefully this is only a temporary issue that will be mitigated once we have implemented new extensibility models like options 3 and 4 in the proposal. Instead of the agent having to contain support for all protocols, the discovery handler portion could be its own pod or the agent could be exposed as a library, only bringing in support for requested protocols. These implementations are on the roadmap for us. When implemented, we are thinking we will port all existing protocols over to the new model. So while this is messy, do you think a better extensibility model removes the mess?

DazWilkin · 2020-11-23T18:57:12Z

@kate-goldenring yes, that would be excellent.

The current issue (mega-Agent) is that everyone must necessarily run every protocol (and all their encumbrances and vulnerabilities). By moving to a per-protocol Agent, not only does the user get more focused Agents but these may be optimized for Akri's and the protocol's dependencies and maintained (patching) on a cycle specific to the protocol and its deps.

DazWilkin · 2020-11-23T19:03:09Z

Ok, I have a working prototype of ZeroConf.

A single mDNS service (18628d4f173f.local) is running

Confirmed by avahi-browse:

avahi-browse --all
+ docker0 IPv4 18628d4f173f                                  Web Site             local

After deploying the agent, its (!) logs report finding the service:

[zeroconf:new] Entered
[zeroconf:discover] Entered
[zeroconf:discovery] Started browsing
[zeroconf:discovery:λ] Service Discovered: ServiceDiscovery { name: "18628d4f173f", kind: "_http._tcp", domain: "local", host_name: "18628d4f173f.local", address: "172.17.0.3", port: 8080, txt: Some(AvahiTxtRecord(UnsafeCell)) }
[zeroconf:discovery] Stopped browsing
[zeroconf:discovery] Iterating over services
[zeroconf:discovery] Service: ServiceDiscovery { name: "18628d4f173f", kind: "_http._tcp", domain: "local", host_name: "18628d4f173f.local", address: "172.17.0.3", port: 8080, txt: Some(AvahiTxtRecord(UnsafeCell)) }

A single Akri instance is deployed:

kubectl get instances
NAME              CONFIG     SHARED   NODES    AGE
zeroconf-e7f45d   zeroconf   true     [akri]   2m

And:

for INSTANCE in $(kubectl get instances --output=jsonpath="{.items[].metadata.name}")
do
  POD="pod/akri-${INSTANCE}-pod"
  kubectl logs ${POD}
done

Yields:

[zeroconf:main] Entered
[zeroconf:new] Entered
[zeroconf:new]
  Kind: _http._tcp
  Name: 18628d4f173f
  Host: 18628d4f173f.local
  Addr: 172.17.0.3
  Port: 8080
[zeroconf:main] Service: kind: _http._tcp
name: 18628d4f173f
host: 18628d4f173f.local
addr: 172.17.0.3
port: 8080
[zeroconf:main:loop] Sleep
[zeroconf:main:loop] check_device(Service { kind: "_http._tcp", name: "18628d4f173f", host: "18628d4f173f.local", addr: "172.17.0.3", port: 8080 })
[zeroconf:read_device] Entered: Service { kind: "_http._tcp", name: "18628d4f173f", host: "18628d4f173f.local", addr: "172.17.0.3", port: 8080 }
[zeroconf:main:loop] Sleep
[zeroconf:main:loop] check_device(Service { kind: "_http._tcp", name: "18628d4f173f", host: "18628d4f173f.local", addr: "172.17.0.3", port: 8080 })
[zeroconf:read_device] Entered: Service { kind: "_http._tcp", name: "18628d4f173f", host: "18628d4f173f.local", addr: "172.17.0.3", port: 8080 }
[zeroconf:main:loop] Sleep

And, we can examine the Instance's environment:

kubectl exec --stdin --tty ${POD} -- env | grep ^AKRI

Yields:

AKRI_ZEROCONF_DEVICE_KIND=_http._tcp
AKRI_ZEROCONF=zeroconf
AKRI_ZEROCONF_DEVICE_HOST=18628d4f173f.local
AKRI_ZEROCONF_DEVICE_NAME=18628d4f173f
AKRI_ZEROCONF_DEVICE_PORT=8080
AKRI_ZEROCONF_DEVICE_ADDR=172.17.0.3

DazWilkin · 2020-11-23T20:18:40Z

Updated proposal: https://github.com/DazWilkin/akri/blob/proposal-zeroconf/docs/proposals/zeroconf.md

DazWilkin · 2020-12-07T18:55:57Z

Thanks to @kate-goldenring's suggestion, I've written a ZeroConf filter parser (using Pest for ZeroConf.

I've encountered some issues with astro-dnssd e.g. AstroHQ/astro-dnssd#17 and am currently continuing to use zeroconf because it works well on Linux and is easier for me to test.

I continue to be challenged to understand how I'll be able to easily test x-platform.

I wonder whether it would be better if I were to focus on building this protocol as a distinct (standalone repo) project as a proof-of-concept and not submit a PR here instead of continuing to bug you both? I'm happy to proceed either way but want to continue to be respectful of your time.

kate-goldenring · 2020-12-07T20:00:35Z

Hey Daz! Great to hear that the filter parsing is working out. That will be a really nice way for people to customize discovery. I think it would be great to see your Zeroconf implementation added into main. I think it is fine to go with the zeroconf crate and if the astro-dnssd folks respond to the issue thread, we can think about adding in more x-platform support in the future. And someone else can pick that up later too if that is something they are really wanting support for.

How does this sound for a plan of action:

Submit PR for Zeroconf proposal. Some of the adjustments I'd make to your current zeroconf proposal would be to readdress the outstanding questions and feature request sections. Per our conversation on slack, some notes could be added on how the K8s service doesn't have to be the same protocol as the zeroconf connection and maybe give an example broker scenario from our discussion over slack. These might go in a new/different section. Also, if you have any ideas about the security considerations -- do creds need to be passed? Those notes would be great to add. A note on x-platform support would also be great. I can make comments on it in the PR, too. We can try to get this merged in by EOW.
Put in a PR for your zeroconf discovery handler with a zeroconf-configuration.md file that explains how to customize a zeroconf helm template and filter for specific devices using your parser. I can help demystify helm templating if you need any help. I think we can add in zeroconf support without a custom broker created yet. Do you think a sample broker is necessary right away? @bfjelds what do you think?
Someone creates a custom broker. If you're up for it, great! If you are feeling zeroconfed out, we can pass the broker work off to someone else.

DazWilkin · 2020-12-07T20:30:50Z

OK, the protocol is working, so I'll submit that PR today and kick-off that review... I've really tried (again) to keep the documents current with the code but, it's tough (and more so because of the manual Agent updates, not criticizing, just is).

BTW Do you have a process for automating the generation of Markdown files to include code snippets, image references etc.? It seems (!?) as though many READMEs should be templated so that, underlying code changes are just reflected in documents when rebuilt; corollary: it's tough to keep docs sync'd with code manually.

I'll review the proposal and your feedback on Slack and here and incorporate your recommendations. I'll submit that PR today too.

I've done some Helm templating previously. I think the current approach should suffice but I'm always open to feedback.

I've a sample broker in the implementation and am using avahi-publish to create a test service (easier).

DazWilkin · 2020-12-07T20:34:45Z

Sorry, please add a branch to deislabs/akri perhaps zeroconf-extensibility and I'll submit the PR there.

kate-goldenring · 2020-12-07T21:48:21Z

Sorry, please add a branch to deislabs/akri perhaps zeroconf-extensibility and I'll submit the PR there.

Can you submit the PR into main instead of a branch? I dont see the reason for having it live in a branch.

kate-goldenring · 2020-12-07T21:51:03Z

For the Markdown templating, are you thinking of having a template for all protocols: <protocol>-configuration.md? or wondering if there is a way to point to code snippets from within the Markdown so that if the code updates the Markdown automatically updates to display the new code? Both would be cool. I am not sure how to do the latter, though.

DazWilkin · 2020-12-07T21:59:40Z

We decided to add the HTTP protocol to a branch (http-extensibility), so for ZeroConf, thinking similar for consistency!?

DazWilkin · 2020-12-07T22:13:54Z

For Markdown templating, yes, the latter.

I think it would be really useful to be able to write (assuming this comment is a Markdown doc in the project):

E.g.

You may revise the Kubernetes configuration to include a filter:

${{ https://github.com/DazWilkin/akri/blob/.../samples/brokers/zeroconf/kubernetes/zeroconf.yaml#L6-8 }}

And have the pre-processor replace that with:

  protocol:
    zeroconf:
      filter: 'kind="_http._tcp"'

Then, if the underlying configuration changes, forgetful me doesn't have to remember to come back and change the doc reference.

Also (and as common), updating docker container image references:

E.g.

Run the following command to start a device:

docker run ghcr.io/deislabs/akri/http-device:${{ git rev-parse HEAD }}

And have the pre-processor replace that with:

docker run ghcr.io/deislabs/akri/http-device:1234567890abcdef1234567890abcdef12345

kate-goldenring · 2020-12-07T22:48:45Z

We decided to add the HTTP protocol to a branch (http-extensibility), so for ZeroConf, thinking similar for consistency!?

The HTTP protocol seemed to be intended as a replacement for Nessie, or an example of how to extend Akri that was more applicable. In my head, if people want to learn how to extend Akri, they walk through extensibility.md and create the HTTP implementation. And this walk through will hopefully be easier when we get a new (more modularized) extensibility model. So, HTTP is ultimately a markdown file (extensibility.md) and a branch that exists for "checking your work".

I didn't think of HTTP as being intended to be a full-fledged protocol implementation that Akri supported since it's occurrence stemmed from the issue of Nessie not making sense or being clear enough. Maybe it should be the example and also a supported protocol? Or could be built out into being one.

Zeroconf I saw as being a protocol implementation intended to be merged into main from the start.

DazWilkin · 2020-12-08T21:26:39Z

Your call... just let me know and I'll submit the PR.

kate-goldenring · 2020-12-08T21:46:58Z

When you feel like it is in a good state, go ahead and put in a PR into main with your zeroconf implementation. Thanks for all the work!

DazWilkin · 2020-12-08T21:55:58Z

It's working.

I added the refinement that we discussed in the proposal discussion so that services are filtered by any term.

I'll sync w/ main and then if it still builds, I'll submit the PR today

Thank you for all your support!

DazWilkin · 2020-12-08T22:13:37Z

PR: #163

DazWilkin · 2020-12-18T22:35:37Z

This is in a working state and I have not received any response to the astro-dnssd isssue.

I'm at an impasse pending progress deconstructing the Agent monolith and slightly concerned at having to live with bit rot on this pending PR. Thoughts?

I'm considering a new, personal project in January but don't want to lose touch with Akri, are there other things that you think could keep me busy without me building up too much 'debt'?

bfjelds · 2020-12-19T23:22:07Z

Generally, I don't have an issue with the PR hanging out waiting for the monolith-breaker to get designed. But I get your concern!

As for other contributions, a few things you've proposed jump to mind:

capacity&allocate cleanup
add podspec and servicespec validation to create

And the big fish would be designing a gRPC interface that would allow your zeroconf to exist outside of the agent container (number 3 here: https://github.com/deislabs/akri/blob/main/docs/proposals/simple-protocol-extension.md)

DazWilkin · 2020-12-20T01:07:47Z

Thanks for taking the time to respond.

Interesting that you propose the ... Agent gRPC protocol. I think this would be very interesting.

I thought about genericising (word?) handlers recently after chatting with @kate-goldenring and tried to conceptualize how it would work.

The Agent gRPC client and server interface is straightforward but it seems it would need to marshall DiscoveryHandlerConfig as pb.Any to be generic and we'd lose strong-typing at the Agent and in the Configuration spec.

Or it needs to introspect.

Unless you all have thought through this already, perhaps I could prototype and see where it breaks?

bfjelds · 2020-12-20T01:12:46Z

Agreed. My thoughts for the CRD protocol configs are that they would become something generically structured as a map like:

"<protocol name>": 
     "<key>": "<value>"

The agent would send the config blob to all gRPC handlers registered for the <protocol name>.

I would love to see the prototype or your take on how to do this!

DazWilkin · 2021-01-11T21:18:35Z

OK, I was thinking about the gRPC protocol this morning.

I plan to submit a PR for the Webhook (hopefully) this week and then to start working on the gRPC protocol implementation.

I assume this remains a green-field and there's no reason why I should not explore (will use the map proposal).

kate-goldenring · 2021-01-11T21:41:50Z

OK, I was thinking about the gRPC protocol this morning.

I plan to submit a PR for the Webhook (hopefully) this week and then to start working on the gRPC protocol implementation.

I assume this remains a green-field and there's no reason why I should not explore (will use the map proposal).

Hey Daz! I started working on (the design for) gRPC protocol implementation today. I should have created an issue earlier to track the progress. Will do that now! I plan to have the implementation done by the end of the month. Maybe I can put up a design doc for it on HackMD and we can brainstorm there?

DazWilkin · 2021-01-11T22:31:15Z

@kate-goldenring thanks for the update... I was just reading the thread on Slack too.

I'll provide input but won't start on this.

github-actions · 2021-09-10T00:01:03Z

Issue has been automatically marked as stale due to inactivity for 45 days. Update the issue to remove label, otherwise it will be automatically closed.

DazWilkin changed the title ~~[Extensibility] ZeroConfjd~~ [Extensibility] ZeroConf Nov 17, 2020

DazWilkin closed this as completed Nov 23, 2020

DazWilkin reopened this Nov 23, 2020

DazWilkin mentioned this issue Dec 7, 2020

Proposal zeroconf #162

Merged

8 tasks

DazWilkin mentioned this issue Dec 8, 2020

[Extensibility] Zeroconf #163

Closed

8 tasks

bfjelds added the enhancement New feature or request label Dec 14, 2020

Britel assigned DazWilkin Jan 11, 2021

kate-goldenring mentioned this issue Jan 11, 2021

Modularize Discovery Protocols as Pods #198

Closed

github-actions bot added the stale label Sep 10, 2021

github-actions bot closed this as completed Dec 9, 2021

kate-goldenring added this to Akri Roadmap Oct 3, 2023

kate-goldenring moved this to Done in Akri Roadmap Oct 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Extensibility] ZeroConf #132

[Extensibility] ZeroConf #132

DazWilkin commented Nov 17, 2020

kate-goldenring commented Nov 18, 2020

DazWilkin commented Nov 19, 2020

DazWilkin commented Nov 19, 2020

kate-goldenring commented Nov 23, 2020

DazWilkin commented Nov 23, 2020

DazWilkin commented Nov 23, 2020

DazWilkin commented Nov 23, 2020

DazWilkin commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020

DazWilkin commented Dec 7, 2020

DazWilkin commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020 •

edited

Loading

DazWilkin commented Dec 7, 2020

DazWilkin commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020

DazWilkin commented Dec 8, 2020

kate-goldenring commented Dec 8, 2020

DazWilkin commented Dec 8, 2020

DazWilkin commented Dec 8, 2020

DazWilkin commented Dec 18, 2020

bfjelds commented Dec 19, 2020 •

edited

Loading

DazWilkin commented Dec 20, 2020

bfjelds commented Dec 20, 2020 •

edited

Loading

DazWilkin commented Jan 11, 2021

kate-goldenring commented Jan 11, 2021 •

edited

Loading

DazWilkin commented Jan 11, 2021

github-actions bot commented Sep 10, 2021

[Extensibility] ZeroConf #132

[Extensibility] ZeroConf #132

Comments

DazWilkin commented Nov 17, 2020

kate-goldenring commented Nov 18, 2020

DazWilkin commented Nov 19, 2020

DazWilkin commented Nov 19, 2020

kate-goldenring commented Nov 23, 2020

DazWilkin commented Nov 23, 2020

DazWilkin commented Nov 23, 2020

DazWilkin commented Nov 23, 2020

DazWilkin commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020

DazWilkin commented Dec 7, 2020

DazWilkin commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020 • edited Loading

DazWilkin commented Dec 7, 2020

DazWilkin commented Dec 7, 2020

kate-goldenring commented Dec 7, 2020

DazWilkin commented Dec 8, 2020

kate-goldenring commented Dec 8, 2020

DazWilkin commented Dec 8, 2020

DazWilkin commented Dec 8, 2020

DazWilkin commented Dec 18, 2020

bfjelds commented Dec 19, 2020 • edited Loading

DazWilkin commented Dec 20, 2020

bfjelds commented Dec 20, 2020 • edited Loading

DazWilkin commented Jan 11, 2021

kate-goldenring commented Jan 11, 2021 • edited Loading

DazWilkin commented Jan 11, 2021

github-actions bot commented Sep 10, 2021

kate-goldenring commented Dec 7, 2020 •

edited

Loading

bfjelds commented Dec 19, 2020 •

edited

Loading

bfjelds commented Dec 20, 2020 •

edited

Loading

kate-goldenring commented Jan 11, 2021 •

edited

Loading