Currently, fs-index keeps track of collisions regardless of the hash function used to compute the resource ID, whether it's cryptographic or not.
This adds unnecessary computation in some cases. For cryptographic hash functions, collisions between different contents are practically impossible, so collision tracking can be skipped.
Note: some users might still want collision counting to track files with the same content.
Plan
We plan to include a compilation flag in the fs-index crate to manage collision tracking. This was initially intended for #42, but we postponed it until we update the ResourceIndex API, which will simplify the collision tracking code. For more details, see this comment.
Notes
The compilation flag might be better named collision-counting since it will now only track the number of occurrences of the same hash.
An alternative approach could be to create a separate crate for the alternative implementation.
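As a rough illustration of the plan above, the flag could be expressed as a Cargo feature that compiles the counting code in or out. This is only a sketch under assumptions: `IndexSketch`, its fields, and its methods are hypothetical and do not reflect the actual `ResourceIndex` API, and `collision-counting` is the feature name suggested in the notes, not one that exists yet.

```rust
// Only pull in HashMap when the (hypothetical) feature is enabled.
#[cfg(feature = "collision-counting")]
use std::collections::HashMap;

/// Hypothetical index type; a plain u64 stands in for the resource ID.
#[derive(Default)]
pub struct IndexSketch {
    ids: Vec<u64>,
    // The counting map only exists when the feature is compiled in.
    #[cfg(feature = "collision-counting")]
    collisions: HashMap<u64, usize>,
}

impl IndexSketch {
    /// Record a resource ID; count occurrences only if the feature is on.
    pub fn insert(&mut self, id: u64) {
        self.ids.push(id);
        #[cfg(feature = "collision-counting")]
        {
            *self.collisions.entry(id).or_insert(0) += 1;
        }
    }

    /// Number of inserted entries, available with or without the feature.
    pub fn len(&self) -> usize {
        self.ids.len()
    }

    /// Occurrences of an ID; this method is absent when the feature is off.
    #[cfg(feature = "collision-counting")]
    pub fn occurrences(&self, id: u64) -> usize {
        self.collisions.get(&id).copied().unwrap_or(0)
    }
}
```

With the feature disabled, neither the map nor the bookkeeping in `insert` is compiled, so users of a cryptographic hash would pay nothing for it. The feature would also need to be declared under `[features]` in the crate's `Cargo.toml`.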
Collision counting (the current implementation) might be useful, but the developer needs to know how to interpret the counts and what to do when they are not just duplicates:
with a cryptographic hash function, a collision means the data is duplicated
with a non-cryptographic hash function, a collision could mean either a duplicate or a real collision
Collision tracking would be more useful for consumer apps, when used together with cryptographic hash functions, because we would also provide the developer with a list of duplicates for each resource id. For example, a photo app or documents vault could implement deduplication features using this.
tl;dr:
collision counting is ResourceId -> usize
collision tracking is ResourceId -> Vec<PathBuf>
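The difference between the two mappings can be sketched as follows. This is a minimal illustration, not the crate's implementation: `ResourceId` here is a hypothetical stand-in wrapping a `u64`, and `count_collisions` and `track_collisions` are made-up helper names.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical stand-in for the crate's resource ID type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ResourceId(pub u64);

/// Collision counting: ResourceId -> usize.
/// Only the number of paths that share an ID is kept.
pub fn count_collisions(entries: &[(ResourceId, PathBuf)]) -> HashMap<ResourceId, usize> {
    let mut counts = HashMap::new();
    for (id, _path) in entries {
        *counts.entry(*id).or_insert(0) += 1;
    }
    counts
}

/// Collision tracking: ResourceId -> Vec<PathBuf>.
/// Every path that shares an ID is kept, in input order.
pub fn track_collisions(
    entries: &[(ResourceId, PathBuf)],
) -> HashMap<ResourceId, Vec<PathBuf>> {
    let mut paths: HashMap<ResourceId, Vec<PathBuf>> = HashMap::new();
    for (id, path) in entries {
        paths.entry(*id).or_default().push(path.clone());
    }
    paths
}
```

With a cryptographic hash, any entry in the tracking map holding more than one path is a set of duplicate files, which is what a deduplication feature would need; the counting map alone says only how many such files there are.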