tl;dr: Zero width character becomes ​ in output
Hi there,
I was able to create a minimal test case that reproduces the issue with PHPUnit:
<?php
public function testQueryPathIssue()
{
$html = '<html><head></head><body>Hello!</body>';
require_once(APP . 'Vendor' . DS . 'QueryPath' . DS . 'qp.php');
$qpOptions = [
'convert_from_encoding' => 'UTF-8',
'convert_to_encoding' => 'UTF-8',
'strip_low_ascii' => FALSE,
];
$qp = htmlqp($html, NULL, $qpOptions);
// result is:
/*
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><head></head><body>​Hello!</body></html>
*/
}
As you see, it replaces the zero-width character with ​ - is this normal?
It has similar results with odd quote marks, like this character: ’
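To pin down what actually ends up in the output, it can help to list the code points of every non-ASCII character before and after parsing. This is only a minimal sketch, assuming PHP 7.2+ with the mbstring extension; `nonAsciiCodePoints` is a hypothetical helper name and not part of QueryPath:

```php
<?php
// Hypothetical helper (not part of QueryPath): returns the code point of
// every non-ASCII character in a UTF-8 string, in order of appearance.
function nonAsciiCodePoints(string $text): array
{
    // The /u flag makes PCRE treat the subject as UTF-8, so each match is
    // one whole character rather than a single byte.
    $count = preg_match_all('/[^\x00-\x7F]/u', $text, $matches);
    if ($count === false || $count === 0) {
        // No non-ASCII characters, or the input was not valid UTF-8.
        return [];
    }

    return array_map(
        function (string $char): string {
            return sprintf('U+%04X', mb_ord($char, 'UTF-8'));
        },
        $matches[0]
    );
}
```

Comparing `nonAsciiCodePoints($html)` with the same call on the generated markup should show whether the character survived unchanged, was turned into an ASCII-only entity (in which case it drops out of the list), or was re-encoded into different characters.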
It's not a show-stopping issue. We're working around it by running this on the HTML before using QueryPath on it:
Maybe that helps someone else. Is this an issue with QueryPath, PHP, or the encoding?
The issue does not happen if I remove the convert_from_encoding and convert_to_encoding parameters.
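If dropping those options is not possible and the unwanted output turns out to be a numeric character reference (for example `&#8203;` for U+200B; that is an assumption, the exact output may differ), one option might be to decode only the references above the ASCII range after serializing. A rough sketch using mbstring's `mb_decode_numericentity`; the function name `decodeHighNumericEntities` is made up for illustration:

```php
<?php
// Hypothetical post-processing step (not part of QueryPath): turns numeric
// character references above the ASCII range back into UTF-8 characters.
function decodeHighNumericEntities(string $html): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    // Only references in U+0080..U+10FFFF are decoded, so markup-significant
    // references such as &#60; are left as they are.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_decode_numericentity($html, $map, 'UTF-8');
}
```

Named entities such as `&amp;` or `&rsquo;` are not touched by this function, so it would not help if the output uses named entities instead of numeric ones.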