Is your feature request related to a problem? Please describe.
tartufo will not detect Linux passwords that have been hashed using many common algorithms. These may not be detected because individual strict-base64-valid substrings are not long enough to be registered by the entropy checker and the default regex library does not contain expressions that match these patterns.
These hashes are sometimes written directly into /etc/shadow and similar files during configuration, when a user needs to have a known (initial) password.
Describe the solution you'd like
The Debian crypt(5) manpage explicitly lists all current hashing algorithms, with a regex for each that matches valid password hashes generated by that algorithm. My manpage lists 13 regexes.
These regexes should be added to default_regexes.json.
This would increase the number of default regular expressions from 39 (present) to 52 (contemplated), increasing the cost of a regex scan. Most (but not all) have fixed prefixes, which would reduce testing costs.
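To illustrate why fixed prefixes keep the added cost low, here is a minimal sketch that runs a regex only when its literal prefix appears in the line. The two patterns are modeled on the sha256crypt and sha512crypt entries in crypt(5) and should be verified against your own manpage; `PREFIXED_PATTERNS` and `scan_line` are hypothetical names, not part of tartufo.

```python
import re

# Assumption: patterns modeled on crypt(5); verify before relying on them.
# Maps a fixed literal prefix to (label, compiled pattern).
PREFIXED_PATTERNS = {
    "$5$": (
        "sha256crypt hashed password",
        re.compile(r"\$5\$(rounds=[1-9][0-9]+\$)?[^$:\n]{1,16}\$[./0-9A-Za-z]{43}"),
    ),
    "$6$": (
        "sha512crypt hashed password",
        re.compile(r"\$6\$(rounds=[1-9][0-9]+\$)?[^$:\n]{1,16}\$[./0-9A-Za-z]{86}"),
    ),
}


def scan_line(line):
    """Return labels of matching patterns, skipping regexes whose prefix is absent."""
    if "$" not in line:
        # Cheap early exit: none of the prefixed patterns can match.
        return []
    hits = []
    for prefix, (label, pattern) in PREFIXED_PATTERNS.items():
        # Substring test first; the regex runs only for plausible candidates.
        if prefix in line and pattern.search(line):
            hits.append(label)
    return hits
```

Lines without a `$` never reach a regex at all, so most source lines would pay only for one substring check. Patterns without a fixed prefix would still need an unconditional scan.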
Describe alternatives you've considered
We could place these regexes in a separate list that is invoked only when desired. This would eliminate overhead for users who "know" they don't have Linux hashes, but it requires people to opt in and potentially fails to assist the oblivious owner who needs this help the most.
In general, trying to detect these using entropy seems inferior to targeted regular expressions, but tweaks there are possible:
We could tune the entropy checker so that it would detect hashes that currently are tuned out by the base64 scan, by adjusting the valid character set and/or the minimum string length. However, this could impact present detection by subtly altering existing entropy scoring in ways that are difficult to quantify.
Alternatively, we could add a third entropy checking pass (in addition to base64 and hex), which would have the virtue of not perturbing existing entropy scoring, but at the cost of increasing entropy check effort by ~50% (assuming overall costs are relatively insensitive to exact character set and minimum length).
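For comparison, the third-pass idea could look roughly like the sketch below. It is not tartufo's implementation: the character set, `min_length=20`, and `threshold=4.0` are assumptions chosen for illustration, and `crypt_pass` is a hypothetical name.

```python
import math
import re

# Assumption: the 64-character alphabet used by the hash component of many
# crypt(3) schemes. Some schemes use a different set.
CRYPT_CHARS = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
CRYPT_RUN = re.compile(r"[./0-9A-Za-z]+")


def shannon_entropy(text, alphabet):
    """Shannon entropy of text in bits per character, over the given alphabet."""
    if not text:
        return 0.0
    entropy = 0.0
    for ch in alphabet:
        p = text.count(ch) / len(text)
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def crypt_pass(line, min_length=20, threshold=4.0):
    """Hypothetical third pass: return long, high-entropy runs of crypt characters."""
    return [
        run
        for run in CRYPT_RUN.findall(line)
        if len(run) >= min_length and shannon_entropy(run, CRYPT_CHARS) > threshold
    ]
```

Because this pass uses its own character set and leaves the base64 and hex passes alone, existing scores would be unchanged. The trade-off is that it can only report a generic high-entropy finding.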
Teachability, Documentation, Adoption, Migration Strategy
Ideally (if regexes are added to the default list), users observe no changes in usage or behavior (except for detection of sensitive material which may be overlooked by the current implementation).
From a different thread, so it isn't lost in the shuffle:
For example from my system's /etc/shadow file, obfuscated by swapping a few characters:
$1$SlhiQ2ZF$KusSU.GcrueRsVJXAj6zw1
This would be ignored entirely because the longest base64-compatible substring is only 16 characters long, but any knowledgeable human looking at it would likely recognize it as a Linux password hash (which unfortunately for us uses a family of mutant base64-ish hash methods for some of the components).
Also, I neglected to mention above that regex-based testing is superior because we can report something like "md5crypt hashed password" instead of "high-entropy" (which might leave people guessing about why tartufo flagged it).
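Both points can be checked against the sample hash with a short sketch. The base64 character class below is an assumption about what a strict base64 scan accepts, the md5crypt pattern is modeled on crypt(5) and should be verified there, and the function names are hypothetical.

```python
import re

# The obfuscated sample hash quoted above.
SAMPLE = "$1$SlhiQ2ZF$KusSU.GcrueRsVJXAj6zw1"

# Assumption: a strict base64 scan accepts only these characters.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]+")

# Assumption: md5crypt pattern modeled on crypt(5).
MD5CRYPT = re.compile(r"\$1\$[^$]{1,8}\$[./0-9A-Za-z]{22}")


def longest_base64_run(text):
    """Return the longest substring made only of base64 characters."""
    return max(BASE64_RUN.findall(text), key=len, default="")


def describe(text):
    """Return a specific finding label if the text contains an md5crypt hash."""
    return "md5crypt hashed password" if MD5CRYPT.search(text) else None


print(len(longest_base64_run(SAMPLE)))  # 16
print(describe(SAMPLE))                 # md5crypt hashed password
```

The `$` and `.` characters split the hash into base64 runs of at most 16 characters, while the targeted pattern matches the whole string and supplies a label that explains the finding.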