Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
When Python's bytes.decode() method encounters encounters a byte sequence that cannot be decoded, it will take an action dependent on its second argument: 'strict': raise UnicodeDecodeError exception (default) 'replace': insert U+FFFD 'ignore': skip to next character While most inputs appear to be sanitized, dmesg output is passed to _parse_dmesg() as is, and can contain data that escapes to invalid Unicode. When the parser attempts to decode this data it immediately raises an exception and dies, as seen in xrmx#77. To prevent this issue, I set the error handling method to 'replace', as 'ignore' can hide decoding errors from developers working with *really* broken dmesg logs. The parts of dmesg the parser looks at should be in a standard format anyway, so a U+FFFD (Replacement Character) or two after the timestamp shouldn't be too harmful.
- Loading branch information