fix: remove ansi escape sequences from escaped xunit output #4527
Isn't there a way to escape or remove all invalid characters for XML?
There is a related
This PR hasn't had any recent activity, and I'm labeling it
This issue has been hanging around for 8 months now, and I'm unsure what I can do to get it moving again. There is an open thread that has multiple possible solutions to the problem, but no expression of preference from the Mocha team.
Hi @JoshuaKGoldberg – that's unexpected – but not unwelcome. I'm happy to rebase the PR to the latest
Great! Let's go ahead and reopen this one. We also discussed that we don't like the requirement of rebasing, so if you want to do a more casual merge that works too. Whatever's easiest on your end. 🙂
Ok, let me get that CLA signed, then.
@JoshuaKGoldberg can I get some eyes on this, please?
Yes! Thanks for the prod - this slipped off my radar. Reviewing now!
Swell! A nice direct change, with a test case - thank you! 🙌
Will wait for another maintainer to review. In the interim, just leaving one small nitpick, nothing particularly important.
Co-authored-by: Josh Goldberg ✨ <[email protected]>
There is already a test that verifies that invalid XML tags are escaped:
Lines 657 to 665 in 2f3fedc
it('replaces invalid xml characters', function () {
  expect(
    utils.escape('\x1B[32mfoo\x1B[0m'),
    'to be',
    '&#x1B;[32mfoo&#x1B;[0m'
  );
  // Ensure we can handle non-trivial unicode characters as well
  expect(utils.escape('💩'), 'to be', '&#x1F4A9;');
});
This should extend those tests rather than introduce a new xunit specific one, as this change is not specific to the xunit parts of the code.
Also added a suggested code change that actually makes use of a regexp from he
return he
  .encode(String(html), {useNamedReferences: false})
  .replace(/&#x([0-9A-F]);/g, (match) => {
    const val = Number.parseInt(match);
    return val === 0x9 || val === 0xA || val === 0xD || (val >= 0xE000 && val <= 0xFFFD) || (val >= 0x10000 && val <= 0x10FFFF) ? `&#x${match};` : '';
  });
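As quoted here, the pattern matches only a single hex digit and the callback parses the whole match rather than the captured digits, so it may not filter what it is meant to. Below is a minimal sketch of what the filter appears to be aiming for, assuming the goal is to keep only hex character references that the XML 1.0 `Char` production allows. The names `isValidXmlCodePoint` and `dropInvalidHexReferences` are hypothetical and not part of Mocha or `he`; the `0x20` to `0xD7FF` range is added from the XML 1.0 definition and is not in the quoted code.

```javascript
// Hypothetical helper: true if the code point is allowed by the XML 1.0 Char production.
function isValidXmlCodePoint(cp) {
  return (
    cp === 0x9 ||
    cp === 0xa ||
    cp === 0xd ||
    (cp >= 0x20 && cp <= 0xd7ff) ||
    (cp >= 0xe000 && cp <= 0xfffd) ||
    (cp >= 0x10000 && cp <= 0x10ffff)
  );
}

// Hypothetical helper: removes hex character references (such as &#x1B;)
// whose code point is not valid in XML 1.0, and leaves everything else alone.
function dropInvalidHexReferences(encoded) {
  return encoded.replace(/&#x([0-9A-Fa-f]+);/g, (match, hex) =>
    // Parse only the captured digits, in base 16.
    isValidXmlCodePoint(Number.parseInt(hex, 16)) ? match : ''
  );
}

console.log(dropInvalidHexReferences('&#x1B;[32mfoo&#x1B;[0m')); // → [32mfoo[0m
console.log(dropInvalidHexReferences('&#x1F4A9;')); // → &#x1F4A9;
```

This operates on already-encoded text, so it relies on the encoder emitting hex references for every character that needs checking.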
Suggestion:
Turns out that he already has this in itself, just doesn't expose it other than through strict: true, which throws when it encounters it.
Maybe we could use that regexp and try to land a PR that either exposes it or adds a skipInvalid: true option or such?
- return he
-   .encode(String(html), {useNamedReferences: false})
-   .replace(/&#x([0-9A-F]);/g, (match) => {
-     const val = Number.parseInt(match);
-     return val === 0x9 || val === 0xA || val === 0xD || (val >= 0xE000 && val <= 0xFFFD) || (val >= 0x10000 && val <= 0x10FFFF) ? `&#x${match};` : '';
-   });
+ return he.encode(String(html).replace(regexInvalidRawCodePoint, ''), {useNamedReferences: false});
Requires this to be added as well:
// From https://github.com/mathiasbynens/he/blob/36afe179392226cf1b6ccdb16ebbb7a5a844d93a/he.js#L53
var regexInvalidRawCodePoint = /[\0-\x08\x0B\x0E-\x1F\x7F-\x9F\uFDD0-\uFDEF\uFFFE\uFFFF]|[\uD83F\uD87F\uD8BF\uD8FF\uD93F\uD97F\uD9BF\uD9FF\uDA3F\uDA7F\uDABF\uDAFF\uDB3F\uDB7F\uDBBF\uDBFF][\uDFFE\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g;
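For illustration, here is a minimal, self-contained sketch of the strip-then-encode order suggested above. To keep it runnable without dependencies, `encodeForXml` is a hypothetical, rough stand-in for `he.encode(..., {useNamedReferences: false})` and does not reproduce the library's exact behavior; `escapeStrippingInvalid` is likewise a made-up name. Only the regular expression is taken from the comment above.

```javascript
// Copied from the suggestion above (originally from he.js).
const regexInvalidRawCodePoint = /[\0-\x08\x0B\x0E-\x1F\x7F-\x9F\uFDD0-\uFDEF\uFFFE\uFFFF]|[\uD83F\uD87F\uD8BF\uD8FF\uD93F\uD97F\uD9BF\uD9FF\uDA3F\uDA7F\uDABF\uDAFF\uDB3F\uDB7F\uDBBF\uDBFF][\uDFFE\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g;

// Hypothetical stand-in for he.encode: encodes markup characters and
// everything above printable ASCII as uppercase hex character references.
function encodeForXml(text) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    const isMarkup = '&<>"\''.includes(ch);
    return isMarkup || cp > 0x7e
      ? `&#x${cp.toString(16).toUpperCase()};`
      : ch;
  }).join('');
}

// Strip raw invalid code points first, then encode what is left.
function escapeStrippingInvalid(text) {
  return encodeForXml(String(text).replace(regexInvalidRawCodePoint, ''));
}

console.log(escapeStrippingInvalid('\x1B[32mfoo\x1B[0m')); // → [32mfoo[0m
console.log(escapeStrippingInvalid('💩')); // → &#x1F4A9;
```

Stripping before encoding means the invalid characters never become character references in the first place. One caveat worth checking: the pattern as quoted does not match `\x0C` (form feed), which XML 1.0 also disallows, so an XML-specific variant may need an extra case.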
Requirements
Description of the Change
The utils.escape function is amended to strip the escape sequence using a regular expression.
Alternate Designs
he library: dismissed because the direct fix is likely faster to become usable
he: dismissed because the correct escaping in JavaScript is harder to find
Why should this be in core?
The xunit reporter is in core and should not be buggy
Benefits
xunit reporter output can be consumed by tools that expect valid XML
Possible Drawbacks
Tools that process the XML as plain (unicode) text (unlikely) can no longer see the ANSI escape sequence.
Applicable issues
fixes #4526
This is a bug fix (patch release)