-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a salt to the model output hash #275
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
import asyncio | ||
import logging | ||
import time | ||
from functools import lru_cache | ||
from hashlib import sha1 | ||
from pathlib import Path | ||
from shutil import copy2 | ||
|
@@ -18,10 +19,27 @@ | |
logger = logging.getLogger(__name__) | ||
|
||
|
||
@lru_cache | ||
def _get_salt() -> bytes: | ||
""" | ||
Returns a salt to use when hashing Stan programs if one was set during the Docker build. | ||
This helps ensure that models don't get cached by the browser across different deployments. | ||
""" | ||
salt_file = Path(__file__).parent.parent.parent.parent / "hash-salt.txt" | ||
if not salt_file.exists(): | ||
logger.warning("No hash salt file found at %s", salt_file) | ||
salt = b"" | ||
else: | ||
salt = salt_file.read_bytes().strip() | ||
logger.info("Using hash salt from %s: %s", salt_file, repr(salt)) | ||
return salt | ||
|
||
Comment on lines
+22
to
+36
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should I be seeing the side effect of this function every time I compile a new (cache miss) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, it's not even that predictable when the function reruns and when it doesn't. I don't think this much matters, but I like to note everything. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You should see it once per thread, so up to 4 times, and then no more There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there any rhyme or reason to when a new thread is spawned? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure, that would be a uvicorn implementation detail. It seems like it’s roughly round-robin |
||
|
||
def _compute_stan_program_hash(program_file: Path) -> str: | ||
stan_program = program_file.read_text() | ||
# MAYBE: replace stan_program with a canonical form? | ||
return sha1(stan_program.encode()).hexdigest() | ||
hasher = sha1(_get_salt()) | ||
hasher.update(stan_program.encode()) | ||
return hasher.hexdigest() | ||
|
||
|
||
def make_canonical_model_dir(src_file: Path, built_model_dir: Path) -> Path: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason to sort them before hashing? Isn't the goal just for it to be different from last time (vs being reproducible between builds)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it doesn't matter much either way, but if no files changed we actually wouldn't mind a user's local cache being warm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I suppose redeploying the server isn't exactly something we do so often that it's worth worrying about a spurious sort.