Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshots caveats #28

Closed
ronag opened this issue Apr 25, 2022 · 2 comments
Closed

Snapshots caveats #28

ronag opened this issue Apr 25, 2022 · 2 comments
Labels
documentation Improvements or additions to documentation

Comments

@ronag
Copy link
Contributor

ronag commented Apr 25, 2022

There are some caveats regarding snapshots and read/write ordering that are not entirely obvious.

e.g. in the following case:

const promise = db.put(...)
await db.get(...)
await promise

The put will not be visible for following reads until after the put has been completed, i.e. it is not enough to schedule the write for it to be visible in reads, it actually has to be completed first.

@ronag ronag changed the title Snapshots Snapshots caveats Apr 25, 2022
@vweevers vweevers added the documentation Improvements or additions to documentation label Apr 25, 2022
@ronag
Copy link
Contributor Author

ronag commented Apr 25, 2022

If napi_create_async_work is ordered we could actually do GetSnapshot() inside the worker queue in which case the following reads would immediately see the writes, i.e. the root problem here is that we defer writes to the worker queue while we read the snapshots immediately on the main thread. So the ordering gets inverted.

vweevers added a commit to Level/abstract-level that referenced this issue Sep 25, 2022
At the moment, this is a documentation-only PR, acting as an RFC. In
particular I'd like review of my choice to use a token-based
approach (as first suggested by Rod Vagg in
Level/community#45) as opposed to a
dedicated snapshot API surface (as suggested by Julian Gruber in
Level/community#47). Main reasons for not
choosing the latter:

- It would be a third API surface. We have the main database API,
  the sublevel API which is equal to that except for implementation-
  specific additional methods, and would now add another API which
  only has read methods (get, iterator, etc, but not put and batch).
  If the API was reusable, meaning you could pass around a snapshot
  as if it was a regular database (like you can with sublevels) then
  I'd be cool with it. But for that to happen, we'd have to
  implement transactions as well, which I consider to be out of
  scope although transactions are in fact a use case of snapshots.
- It would have a higher complexity once you factor in sublevels.
  I.e. to make `db.sublevel().snapshot().get(key)` read from the
  snapshot but also prefix the given key. By instead doing
  `db.sublevel().get(key, { snapshot })`, the sublevel can just
  forward that snapshot option to its parent database.
- Furthermore, with the token approach, you can pass a snapshot to
  multiple sublevels, which cleanly solves the main use case of
  retrieving data from an index (I included an example in the docs).

I renamed the existing snapshot mechanism to "implicit snapshots"
and attempted to clarify the behavior of those as well.

If accepted, some related issues can be closed, because this PR:

- Includes (a more complete write-up than)
  Level/classic-level#40.
- Mentions Level/leveldown#796 (which
  should be moved rather than closed).
- Answers Level/classic-level#28.
- Solves the main use case as described in
  Level/leveldown#486.
- Effectively completes Level/awesome#19.
@vweevers
Copy link
Member

Level/abstract-level#42

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

2 participants