Implement Sync #14

Closed
mistermoe opened this issue Mar 11, 2022 · 2 comments · Fixed by #290, #305 or #313
Assignees
Labels
feature New feature or request

Comments

@mistermoe
Member

mistermoe commented Mar 11, 2022

related proposal issue: decentralized-identity/decentralized-web-node#128

@mistermoe mistermoe added feature New feature or request proposal needed indicates that no proposal yet exists labels Mar 11, 2022
@mistermoe mistermoe added this to the v1.0 milestone Mar 11, 2022
@mistermoe mistermoe self-assigned this Mar 11, 2022
@mistermoe
Member Author

DWN Sync Proposal

Synchronizing a tenant's data across more than one DWN can be achieved by introducing three mechanisms:

  • An append-only event log
  • The ability to get events from the event log
  • The ability to get a message by CID

Using these mechanisms in addition to existing ones, we can break sync into two phases:

  • Push: here are all the messages I have for you
  • Pull: give me all the messages I haven't already gotten from you

Event Log

Within the context of a DWN, an append-only event log is simply an ordered K/V store where the key is an ordinal number and the value is the CID of a message.

Each DWN would maintain its own individual event log. There's no requirement for an individual DWN's event log to be in the same order as another DWN's.

Whenever a message is stored by a DWN, the CID of that message is added to its event log.
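
A minimal in-memory sketch of such an event log, assuming one log per tenant and a watermark that is simply the ordinal key serialized as a string. The names `EventLog`, `EventLogEntry`, `append`, and `getEventsAfter` are hypothetical and not part of dwn-sdk-js.

```typescript
// Hypothetical sketch: the proposal only specifies "ordinal key -> message CID".
interface EventLogEntry {
  watermark: string; // ordinal position in this DWN's log, as a string
  messageCid: string;
}

class EventLog {
  // One append-only array per tenant; array index + 1 is the ordinal key.
  private readonly logs = new Map<string, string[]>();

  // Called whenever a message is stored. Returns the new entry's watermark.
  append(tenant: string, messageCid: string): string {
    const log = this.logs.get(tenant) ?? [];
    log.push(messageCid);
    this.logs.set(tenant, log);
    return String(log.length);
  }

  // Returns every event after `watermark`; no watermark returns all events.
  getEventsAfter(tenant: string, watermark?: string): EventLogEntry[] {
    const log = this.logs.get(tenant) ?? [];
    const start = watermark === undefined ? 0 : Number(watermark);
    if (!Number.isInteger(start) || start < 0) {
      throw new Error(`invalid watermark: ${watermark}`);
    }
    return log.slice(start).map((messageCid, i) => ({
      watermark: String(start + i + 1),
      messageCid,
    }));
  }
}
```

Because the ordinal is local to each DWN, the same message can sit at different positions in different logs, which is why a watermark is only meaningful against the log that issued it.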

EventsGet

Message type used to get events from a DWN.

Message Contents

| property name | type | required (y/n) | description |
| --- | --- | --- | --- |
| method | const EventsGet | Y | |
| watermark | string | N | providing no watermark returns all events for a given tenant |
| authorization | GeneralJws | Y | same authorization format required by other DWN messages |

Response

A MessageReply where entries is an array of event objects: all events that occurred between the watermark and the time at which the message is processed.

Notes

  • Should not be usable in protocol definitions
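
As a rough sketch of how an EventsGet message and its reply could be typed and handled, assuming the reply carries a `status` object alongside `entries` and that the event log for the tenant is available as an array of CIDs. The `GeneralJws` shape follows the general JWS JSON serialization; `handleEventsGet`, `isAuthorized`, and the specific status codes are assumptions for illustration only.

```typescript
// Hypothetical types and handler; not the dwn-sdk-js implementation.
interface GeneralJws {
  payload: string;
  signatures: { protected: string; signature: string }[];
}

interface EventsGetMessage {
  method: "EventsGet";
  watermark?: string; // omitted => return all events for the tenant
  authorization: GeneralJws;
}

interface DwnEvent {
  watermark: string;
  messageCid: string;
}

interface MessageReply<T> {
  status: { code: number; detail: string }; // assumed reply envelope
  entries?: T[];
}

function handleEventsGet(
  message: EventsGetMessage,
  tenantLog: readonly string[], // CIDs in append order
  isAuthorized: (jws: GeneralJws) => boolean, // stand-in for real JWS verification
): MessageReply<DwnEvent> {
  if (!isAuthorized(message.authorization)) {
    return { status: { code: 401, detail: "Unauthorized" } };
  }
  const start = message.watermark === undefined ? 0 : Number(message.watermark);
  if (!Number.isInteger(start) || start < 0) {
    return { status: { code: 400, detail: "Invalid watermark" } };
  }
  // Everything appended after the watermark, up to the moment of processing.
  const entries = tenantLog.slice(start).map((messageCid, i) => ({
    watermark: String(start + i + 1),
    messageCid,
  }));
  return { status: { code: 200, detail: "OK" }, entries };
}
```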

MessagesGet

Message type used to get a message by CID from a DWN.

Message Contents

| property name | type | required (y/n) | description |
| --- | --- | --- | --- |
| method | const MessagesGet | Y | |
| cid | string | Y | the CID of the message you want |
| authorization | GeneralJws | Y | same authorization format required by other DWN messages |
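
A comparable sketch for MessagesGet, assuming messages are kept in a map keyed by CID. The proposal does not define the reply for an unknown CID, so the 404 here is an assumption, as are `handleMessagesGet`, `isAuthorized`, and the reply envelope.

```typescript
// Hypothetical types and handler; not the dwn-sdk-js implementation.
interface GeneralJws {
  payload: string;
  signatures: { protected: string; signature: string }[];
}

interface MessagesGetMessage {
  method: "MessagesGet";
  cid: string; // the CID of the message you want
  authorization: GeneralJws;
}

type StoredMessage = Record<string, unknown>;

interface MessagesGetReply {
  status: { code: number; detail: string }; // assumed reply envelope
  message?: StoredMessage;
}

function handleMessagesGet(
  request: MessagesGetMessage,
  store: ReadonlyMap<string, StoredMessage>,
  isAuthorized: (jws: GeneralJws) => boolean, // stand-in for real JWS verification
): MessagesGetReply {
  if (!isAuthorized(request.authorization)) {
    return { status: { code: 401, detail: "Unauthorized" } };
  }
  const message = store.get(request.cid);
  if (message === undefined) {
    return { status: { code: 404, detail: "Not Found" } };
  }
  return { status: { code: 200, detail: "OK" }, message };
}
```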

Pull

Introducing EventsGet and MessagesGet makes it possible to pull all messages from a given DWN

sequenceDiagram
autonumber

participant A as DWN A
participant B as DWN B

A->>A: grab pull-specific watermark for DWN B
A->>B: EventsGet{ watermark, tenant_sig }
B->>B: fetch all events between watermark and now
B->>A: [event_id:cid, event_id:cid, ...]
loop each event
    A->>A: message = MessagesGet{ cid, tenant_sig }
    opt message not found locally
        A->>B: MessagesGet{ cid, tenant_sig }
        B->>B: inflate message
        B->>A: message
        A->>A: store message
    end
    
    A->>A: update watermark
end
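
The pull flow in the diagram can be sketched roughly as follows. This is an in-memory, synchronous model for brevity; a real implementation would send EventsGet and MessagesGet over the network with signed messages. `InMemoryDwn`, `store`, and `pull` are hypothetical names, and passing the CID into `store` is a shortcut for computing it from the message.

```typescript
// Hypothetical, in-memory model of the pull flow; not the dwn-sdk-js API.
type DwnMessage = Record<string, unknown>;

interface DwnEvent {
  watermark: string;
  messageCid: string;
}

class InMemoryDwn {
  private readonly messages = new Map<string, DwnMessage>();
  private readonly eventLog: string[] = [];

  // Stand-in for EventsGet: events after `watermark`, or all if omitted.
  eventsGet(watermark?: string): DwnEvent[] {
    const start = watermark === undefined ? 0 : Number(watermark);
    return this.eventLog.slice(start).map((messageCid, i) => ({
      watermark: String(start + i + 1),
      messageCid,
    }));
  }

  // Stand-in for MessagesGet.
  messagesGet(cid: string): DwnMessage | undefined {
    return this.messages.get(cid);
  }

  // Stores a message and appends its CID to this DWN's own event log.
  store(cid: string, message: DwnMessage): void {
    if (this.messages.has(cid)) return; // already stored, nothing to log
    this.messages.set(cid, message);
    this.eventLog.push(cid);
  }
}

// Pulls everything `remote` logged after `pullWatermark` into `local`.
// Returns the updated pull watermark for this remote.
function pull(local: InMemoryDwn, remote: InMemoryDwn, pullWatermark?: string): string | undefined {
  let watermark = pullWatermark;
  for (const event of remote.eventsGet(watermark)) {
    // Check locally first; only fetch from the remote when the message is missing.
    if (local.messagesGet(event.messageCid) === undefined) {
      const message = remote.messagesGet(event.messageCid);
      if (message !== undefined) local.store(event.messageCid, message);
    }
    watermark = event.watermark;
  }
  return watermark;
}
```

The local lookup before fetching is what keeps a message that already exists on both sides from being transferred again.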

Push

Introducing EventsGet and MessagesGet allows a DWN to push messages to another DWN

sequenceDiagram
autonumber

participant A as DWN A
participant B as DWN B

A->>A: grab push-specific watermark for DWN B
A->>A: EventsGet{watermark, tenant_sig}
loop each event
    A->>A: MessagesGet{ cid, tenant_sig }
    A->>B: message
    B->>B: process message
    A->>A: update watermark
end
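
A rough sketch of the push loop, assuming the local DWN is reachable through two functions that stand in for EventsGet and MessagesGet and that delivery to the remote DWN is abstracted as a `send` callback. `PushSource`, `push`, and `send` are hypothetical names; transport and signing are left out.

```typescript
// Hypothetical sketch of the push flow; not the dwn-sdk-js API.
type DwnMessage = Record<string, unknown>;

interface DwnEvent {
  watermark: string;
  messageCid: string;
}

// The local DWN, seen only through its EventsGet / MessagesGet stand-ins.
interface PushSource {
  eventsGet(watermark?: string): DwnEvent[];
  messagesGet(cid: string): DwnMessage | undefined;
}

// Sends every message logged locally after `pushWatermark` to the remote DWN.
// Returns the updated push watermark for that remote.
function push(
  source: PushSource,
  send: (message: DwnMessage) => void, // delivers one message to the remote DWN
  pushWatermark?: string,
): string | undefined {
  let watermark = pushWatermark;
  for (const event of source.eventsGet(watermark)) {
    const message = source.messagesGet(event.messageCid);
    if (message !== undefined) send(message);
    watermark = event.watermark;
  }
  return watermark;
}

// Usage with a tiny in-memory source holding two messages.
const localCids = ["cidA", "cidB"];
const localSource: PushSource = {
  eventsGet: (watermark) => {
    const start = watermark === undefined ? 0 : Number(watermark);
    return localCids.slice(start).map((messageCid, i) => ({
      watermark: String(start + i + 1),
      messageCid,
    }));
  },
  messagesGet: (cid) => ({ cid }),
};
const received: DwnMessage[] = [];
const nextPushWatermark = push(localSource, (message) => received.push(message));
```

Note that the push watermark indexes the local event log, whereas the pull watermark indexes the remote one.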

Considerations

  • The only DWNs that can be synced (i.e. pushed to and pulled from) are DWNs listed in a DID Doc.
  • This approach requires that a syncing DWN maintain a push-watermark and a pull-watermark for each DWN it wants to push to and pull from.
  • There's a "sync echo" that inevitably occurs with the above approach. Imagine a scenario where there's DWN A and DWN B. They've never synced before. DWN B is empty. DWN A has 1 message in it.
    • DWN A pushes the message it has to DWN B.
    • DWN B receives the message and stores it. An event is added to DWN B's event log.
    • DWN A pulls from DWN B. DWN B only has 1 message, and it happens to be the message that was sent by DWN A.
  • Introducing 2 additional message types, SyncPush and SyncPull, as convenience messages that facilitate pushing and pulling introduces challenges:
    • Both Push and Pull necessitate EventsGet and MessagesGet messages, both of which need to be signed, which implies that keys need to be present.
      • Counter argument: I suppose the authorization JWS of SyncPush and SyncPull could have a wrapped signature that could be used for any resulting EventsGet and MessagesGet messages. Haven't fully thought this through yet.
    • There's probably not much benefit to sending SyncPush and SyncPull messages to your other DWNs. I guess it'd be a way to manually trigger your addressable DWNs to push or pull.
    • We would have to decide whether the SDK wants to take on the responsibility of storing watermarks
      • or make watermark a required property.
  • If SyncPush and SyncPull don't make sense as new message types, we could always implement them as methods that are surfaced for consumption by dwn-sdk-js.
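
The per-DWN watermark bookkeeping mentioned above could look roughly like this, assuming each remote DWN is identified by its endpoint URL and that watermarks are opaque strings. `WatermarkStore` and `SyncDirection` are hypothetical names, and where this state lives (SDK or caller) is exactly the open question raised in the list.

```typescript
// Hypothetical sketch: one push watermark and one pull watermark per remote DWN.
type SyncDirection = "push" | "pull";

class WatermarkStore {
  private readonly watermarks = new Map<string, string>();

  // Direction never contains ":", so the composite key is unambiguous.
  private static key(peerEndpoint: string, direction: SyncDirection): string {
    return `${direction}:${peerEndpoint}`;
  }

  // Undefined means "never synced in this direction", i.e. start from the beginning.
  get(peerEndpoint: string, direction: SyncDirection): string | undefined {
    return this.watermarks.get(WatermarkStore.key(peerEndpoint, direction));
  }

  set(peerEndpoint: string, direction: SyncDirection, watermark: string): void {
    this.watermarks.set(WatermarkStore.key(peerEndpoint, direction), watermark);
  }
}

// Usage: the two directions advance independently for the same peer.
const watermarkStore = new WatermarkStore();
watermarkStore.set("https://dwn-b.example", "push", "7");
watermarkStore.set("https://dwn-b.example", "pull", "3");
```

The two values cannot be merged into one because the push watermark is a position in the local event log while the pull watermark is a position in the remote DWN's event log.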

PoC

The chat-app demo in web5-labs is driven by a PoC of this proposed approach.

  • message-store-level-v2 implements an event log. This message store is used in web5-wallet and addressable-dwn.
  • addressable-dwn exposes an HTTP endpoint /dwn/event-log as a workaround for EventsGet not existing.
    • This PoC endpoint does not have any authn/authz verification as it's just a PoC.
  • addressable-dwn exposes an HTTP endpoint /dwn/messages/:cid as a workaround for MessagesGet not existing.
    • This PoC endpoint does not have any authn/authz verification as it's just a PoC.
  • Push and pull are implemented in web5-wallet. They run once every minute (frequency is limited by browser extension constraints).

@andresuribe87
Contributor

“There’s no requirement for an individual DWN’s event log to be in the same order as another DWN”

Can you elaborate on why this isn't a requirement? If it isn't, is there a chance that each node will have different views of what is stored in them?

@thehenrytsai thehenrytsai removed this from the Beta milestone Feb 3, 2023
@frankhinek frankhinek added this to the Bitcoin Miami '23 milestone Mar 14, 2023
@mistermoe mistermoe linked a pull request Apr 1, 2023 that will close this issue
@mistermoe mistermoe linked a pull request Apr 8, 2023 that will close this issue
@mistermoe mistermoe reopened this Apr 8, 2023
@mistermoe mistermoe removed the proposal needed indicates that no proposal yet exists label Apr 8, 2023
@mistermoe mistermoe reopened this Apr 15, 2023
@mistermoe mistermoe linked a pull request Apr 16, 2023 that will close this issue
Projects
Status: Done
4 participants