Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate name-start code points for identifier #185

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

pierlon
Copy link

@pierlon pierlon commented Jan 22, 2020

When given the following CSS:

body {
    transition: all .3s ease-in-out;
    -webkit-transition: all .3s ease-in-out;
    -moz-transition: all .3s ease-in-out;
    -0-transition: all .3s ease-in-out;
}

The parser does not remove or throw an exception for the malformed rule -0-transition, as it is a syntax error:

image

This occurs because the check to validate if the rule is an identifier does not validate if the first 3 code points are valid.


This PR adds a check to validate the first 3 code points in an identifier. If a malformed identifier is given, an UnexpectedTokenException exception will be thrown before the identifier is consumed.

@westonruter
Copy link
Contributor

@sabberworm How's this look?

@raxbg
Copy link
Contributor

raxbg commented Jan 31, 2020

This doesn't seem to recognize some escape sequences. For example:

body { background: height: ca\lc(100% - 1px); }

The proposed changes cause the background rule here to get dropped.

@raxbg
Copy link
Contributor

raxbg commented Jan 31, 2020

The CSS I meant to post in my previous message was actually:

body { height: ca\lc(100% - 1px); }

And the dropped rule is the height.

@pierlon
Copy link
Author

pierlon commented Jan 31, 2020

I would assume that should be the intended behavior. A UnexpectedTokenException would be thrown with the error message: Identifier expected. Got “- 1px”. This is inline with the W3C CSS validator:

image

@raxbg
Copy link
Contributor

raxbg commented Jan 31, 2020

It looks like this will introduce breaking changes (at least in lenient mode). I am not fully on board with the idea of running this kind of validation while parsing, at least not without introducing something like a new parsing mode.

The syntax of the example calc should be valid according to this https://developer.mozilla.org/en-US/docs/Web/CSS/calc and is also properly recognized by the browsers.

The proposed validation can also be done post parse. Are there cases where doing it while parsing is truly necessary? It would definitely be nice to have this validation as part of the parser, so maybe it can be implemented as a method of Document? In fact a validation method in Document can be quite useful and more flexible.

@sabberworm
Copy link
Contributor

sabberworm commented Jan 31, 2020

I agree with @raxbg. We throw errors when we can‘t make sense of the input (and we try to recover in lenient mode). But throwing errors when we can understand the structure and it just happens to be invalid is a different thing entirely.

Yes, a validation after parsing would be cool though. It could also catch tons of other issues. There‘s lots of CSS that‘s syntactically valid but semantically incorrect. I would not put it on the Document class though. I‘ve put too many extraneous stuff on the main classes already. It should be a separate Validator class (or even a distinct repo/composer package).

@pierlon
Copy link
Author

pierlon commented Jan 31, 2020

It looks like this will introduce breaking changes (at least in lenient mode).

Yes that would the case. Adding a check for lenient mode would resolve that and allow the parser to recover:

--- lib/Sabberworm/CSS/Parsing/ParserState.php	(revision 8fbd0fe82aa08ad2650def1b44f2f77154211e30)
+++ lib/Sabberworm/CSS/Parsing/ParserState.php	(date 1580502992173)
@@ -72,7 +72,7 @@
             $sResult = $this->parseCharacter(true);
         }
 
-		if ($sResult === null) {
+		if (!$this->oParserSettings->bLenientParsing && $sResult === null) {
 			throw new UnexpectedTokenException($sResult, $this->peek(5), 'identifier', $this->iLineNo);
 		}
 		$sCharacter = null;

Validation of the CSS after parsing would be ideal, but that would be for another PR.

The thrown exception will be caught when in lenient mode
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants