Add clean/delete function to ArchiveBuilder(s) #183

Open
VeaaC opened this issue May 15, 2020 · 3 comments

@VeaaC
Collaborator

VeaaC commented May 15, 2020

Often the user wants to overwrite an existing archive, and we should provide a safe way to do so:

  • check that only flatdata files are in the directory
  • only then delete it

This could either be another parameter to open, or a separate function such as remove.
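The check-then-delete idea above could be sketched roughly as follows. This is only an illustration using the Rust standard library, not an existing flatdata function: the names remove_dir_if_only and all_entries_match are hypothetical, and the rule for what counts as a flatdata file is left to a caller-supplied predicate, since the issue does not define it.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Removes `dir` recursively, but only if every non-directory entry below it
/// satisfies `is_expected`. Returns Ok(true) if the directory was removed and
/// Ok(false) if an unexpected entry was found (nothing is deleted in that case).
/// Hypothetical helper, not part of flatdata.
fn remove_dir_if_only<F>(dir: &Path, is_expected: &F) -> io::Result<bool>
where
    F: Fn(&Path) -> bool,
{
    // First pass: scan everything without touching the filesystem.
    if !all_entries_match(dir, is_expected)? {
        return Ok(false);
    }
    // Second pass: only now delete the whole tree.
    fs::remove_dir_all(dir)?;
    Ok(true)
}

/// Walks `dir` recursively and checks every non-directory entry with `is_expected`.
fn all_entries_match<F>(dir: &Path, is_expected: &F) -> io::Result<bool>
where
    F: Fn(&Path) -> bool,
{
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        if entry.file_type()?.is_dir() {
            // Subdirectories (e.g. sub-archives) are checked recursively.
            if !all_entries_match(&path, is_expected)? {
                return Ok(false);
            }
        } else if !is_expected(&path) {
            return Ok(false);
        }
    }
    Ok(true)
}
```

Scanning first and deleting afterwards means a stray file anywhere in the tree leaves the directory untouched. The scan and the delete are not atomic, so a file created in between would still be removed.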

@VeaaC VeaaC changed the title Add clean/delete function to ArchieBuilder(s) Add clean/delete function to ArchiveBuilder(s) May 15, 2020
@gliderkite
Contributor

Hi, we use flatdata in Rust and are also interested in this particular feature. We create in-memory resource storages and subdirectories, and given a flatdata storage we would like to be able to replace the existing one with a new one (at the same path).

Currently I can't find a way either to edit the in-memory storage (the BTreeMap keys), or to create a new archive builder from an already existing storage. The only option is to call the builder's new, which internally calls create_archive, and that raises an error when the storage already exists at the given path.

Even just being able to overwrite an existing storage with empty data without raising an error would be a nice addition. Otherwise the only workaround that comes to mind is to avoid the subdir feature and create/drop a new in-memory storage every time, that is, to handle the archive storage structure outside of this library.

Please let me know if there are better workarounds, thanks!

@VeaaC
Collaborator Author

VeaaC commented Jun 27, 2023

One thing to consider with regard to flatdata and Rust's memory safety guarantees is that Rust requires that the contents of memory-mapped flatdata resources do not change as long as they are open (since they are exposed as immutable data, not mut).

This means that if you want to "release" new data for a sub-archive you need to delete the old files (which will be kept alive by the OS as long as they are still memory mapped), and create new files in their place, and then open a new flatdata archive instance in the same location (which would load the new files). Any existing instance of the flatdata archive already loaded would still see the old deleted files (since they are kept alive by the OS).
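The delete-and-recreate behaviour described above can be observed with plain file handles. The sketch below uses only the standard library and assumes a Unix-like OS, where a deleted file stays alive while a handle to it is open. It uses a regular File instead of a memory map to stay dependency-free, and the function name replace_while_open is hypothetical.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// Writes `old` to `path`, opens it, then deletes the file and creates a new
/// one with `new` at the same path. Returns what the previously opened handle
/// reads and what a fresh open of the path reads.
/// Assumes Unix-like unlink semantics; hypothetical helper, not flatdata API.
fn replace_while_open(path: &Path, old: &[u8], new: &[u8]) -> io::Result<(Vec<u8>, Vec<u8>)> {
    fs::write(path, old)?;
    // A consumer that opened the data before the update.
    let mut old_handle = File::open(path)?;

    // "Release" new data: delete the old file, create a new one in its place.
    fs::remove_file(path)?;
    fs::write(path, new)?;

    // The old handle still refers to the deleted file.
    let mut seen_by_old = Vec::new();
    old_handle.read_to_end(&mut seen_by_old)?;

    // Opening the path again picks up the new file.
    let seen_by_new = fs::read(path)?;
    Ok((seen_by_old, seen_by_new))
}
```

Overwriting the file in place (without deleting it first) would instead change the bytes under any existing reader, which is exactly what the immutability requirement rules out.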

The way I have seen this usually handled is to move the data that is regularly updated into a separate (top-level) archive, and handle it in a transactional/copy-on-write filesystem/DB layer, e.g. have a folder with multiple "in use" versions of an archive, regularly publish new data, and delete unused versions, and have the "consumer" of the data regularly check for updates, re-opening archives when needed.

A more concrete example:
You might have map data in a main flatdata archive Map, and then a folder with an archive TrafficOverlay. A traffic-producer could regularly publish new TrafficOverlays into that folder (and clean up unused versions), while a map-rendering-service could regularly check the folder for new data and replace the TrafficOverlay archive it has loaded in memory with a new one. All the other threads of the map-rendering-service continue processing with the version they hold until they fetch a new version for the next request.
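On the consumer side, swapping the loaded overlay while other threads keep working could look roughly like this. It is a minimal sketch using only the standard library: TrafficOverlay here is a stand-in struct rather than a generated flatdata archive, and OverlaySlot, get and publish are hypothetical names.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for an opened, immutable archive (hypothetical; not a flatdata type).
struct TrafficOverlay {
    version: u64,
}

/// Holds the overlay version that new requests should use.
struct OverlaySlot {
    current: RwLock<Arc<TrafficOverlay>>,
}

impl OverlaySlot {
    fn new(initial: TrafficOverlay) -> Self {
        OverlaySlot {
            current: RwLock::new(Arc::new(initial)),
        }
    }

    /// Called at the start of a request: clones the Arc, so the request keeps
    /// its version alive even if a newer one is published meanwhile.
    fn get(&self) -> Arc<TrafficOverlay> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Called by the update checker after opening a newly published archive.
    fn publish(&self, next: TrafficOverlay) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}
```

A request that called get() before publish() keeps reading its old version. That version is dropped once the last Arc pointing to it goes away, and the on-disk copy can then be cleaned up by the producer.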

That's why I suspect that this feature might not help you much in achieving what you want to achieve. It is more useful for local development: You are building one flatdata archive after another and do not want to rm -r my_archive all the time.

@gliderkite
Contributor

Thanks for your reply.
Yes, in my "local development" scenario I cannot rm -r my_archive because there is no such directory: we use in-memory storages. To give a more concrete example, let's say that

  1. I create a new in-memory storage:
     let storage = MemoryResourceStorage::new("/in-memory");
  2. I then create a subdirectory:
     let storage_sub = storage.subdir("subdir_name");
  3. I create an archive builder:
     let builder = MyArchiveBuilder::new(storage_sub).unwrap();

Given storage as input, what is now the best way to create a new archive builder for the same in-memory subdirectory of the original archive, with all the data previously written to that subdirectory cleaned?

As mentioned above, constructing the archive builder again raises the error Io(Custom { kind: AlreadyExists }).
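To illustrate the kind of operation being asked for, here is a toy model of a path-keyed in-memory storage. It is not flatdata's MemoryResourceStorage: ToyStorage, create and clear_subdir are hypothetical names, and the only assumption taken from this thread is that resources are kept in a BTreeMap keyed by path.

```rust
use std::collections::BTreeMap;

/// Toy in-memory storage keyed by full resource path
/// (hypothetical; not flatdata's MemoryResourceStorage).
#[derive(Default)]
struct ToyStorage {
    resources: BTreeMap<String, Vec<u8>>,
}

impl ToyStorage {
    /// Mimics a "create" that refuses to overwrite an existing resource.
    fn create(&mut self, path: &str, data: Vec<u8>) -> Result<(), String> {
        if self.resources.contains_key(path) {
            return Err(format!("already exists: {}", path));
        }
        self.resources.insert(path.to_string(), data);
        Ok(())
    }

    /// Removes every resource below `dir` and returns how many were removed.
    fn clear_subdir(&mut self, dir: &str) -> usize {
        // The trailing slash keeps "subdir_name2" from matching "subdir_name".
        let prefix = format!("{}/", dir.trim_end_matches('/'));
        let before = self.resources.len();
        self.resources.retain(|key, _| !key.starts_with(&prefix));
        before - self.resources.len()
    }
}
```

With something like clear_subdir available, the sequence "clear the subdirectory, then construct the builder again" would no longer hit the already-exists error in this toy model. Whether the real storage can offer this safely depends on whether any reader still holds the old data.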
