
Appendix G. Grammar of CSS 2.1

Note: Several sections of this specification have been updated by other specifications. Please see "Cascading Style Sheets (CSS) — The Official Definition" in the latest CSS Snapshot for a list of specifications and the sections they replace.

The CSS Working Group is also developing CSS level 2 revision 2 (CSS 2.2).

This appendix is non-normative.

The grammar below defines the syntax of CSS 2.1. It is, however, in some sense a superset of CSS 2.1, as this specification imposes additional semantic constraints not expressed in this grammar. A conforming UA must also adhere to the forward-compatible parsing rules, the selectors notation, the property and value notation, and the unit notation. However, not all syntactically correct CSS can take effect, since the document language may impose restrictions that are not in CSS, e.g., HTML imposes restrictions on the possible values of the "class" attribute.

G.1 Grammar

The grammar below is LALR(1) (but note that most UAs should not use it directly, since it does not express the parsing conventions, only the CSS 2.1 syntax). The format of the productions is optimized for human consumption and some shorthand notation beyond Yacc (see [YACC]) is used:

*: 0 or more
+: 1 or more
?: 0 or 1
|: separates alternatives
[ ]: grouping

The productions are:

stylesheet
  : [ CHARSET_SYM STRING ';' ]?
    [S|CDO|CDC]* [ import [ CDO S* | CDC S* ]* ]*
    [ [ ruleset | media | page ] [ CDO S* | CDC S* ]* ]*
  ;
import
  : IMPORT_SYM S*
    [STRING|URI] S* media_list? ';' S*
  ;
media
  : MEDIA_SYM S* media_list '{' S* ruleset* '}' S*
  ;
media_list
  : medium [ COMMA S* medium]*
  ;
medium
  : IDENT S*
  ;
page
  : PAGE_SYM S* pseudo_page?
    '{' S* declaration? [ ';' S* declaration? ]* '}' S*
  ;
pseudo_page
  : ':' IDENT S*
  ;
operator
  : '/' S* | ',' S*
  ;
combinator
  : '+' S*
  | '>' S*
  ;
unary_operator
  : '-' | '+'
  ;
property
  : IDENT S*
  ;
ruleset
  : selector [ ',' S* selector ]*
    '{' S* declaration? [ ';' S* declaration? ]* '}' S*
  ;
selector
  : simple_selector [ combinator selector | S+ [ combinator? selector ]? ]?
  ;
simple_selector
  : element_name [ HASH | class | attrib | pseudo ]*
  | [ HASH | class | attrib | pseudo ]+
  ;
class
  : '.' IDENT
  ;
element_name
  : IDENT | '*'
  ;
attrib
  : '[' S* IDENT S* [ [ '=' | INCLUDES | DASHMATCH ] S*
    [ IDENT | STRING ] S* ]? ']'
  ;
pseudo
  : ':' [ IDENT | FUNCTION S* [IDENT S*]? ')' ]
  ;
declaration
  : property ':' S* expr prio?
  ;
prio
  : IMPORTANT_SYM S*
  ;
expr
  : term [ operator? term ]*
  ;
term
  : unary_operator?
    [ NUMBER S* | PERCENTAGE S* | LENGTH S* | EMS S* | EXS S* | ANGLE S* |
      TIME S* | FREQ S* ]
  | STRING S* | IDENT S* | URI S* | hexcolor | function
  ;
function
  : FUNCTION S* expr ')' S*
  ;
/*
 * There is a constraint on the color that it must
 * have either 3 or 6 hex-digits (i.e., [0-9a-fA-F])
 * after the "#"; e.g., "#000" is OK, but "#abcd" is not.
 */
hexcolor
  : HASH S*
  ;
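The hexcolor production accepts any HASH token, so the 3-or-6 hex-digit constraint stated in the grammar comment has to be checked separately. The following Python fragment is a minimal sketch of such a check, not part of the grammar; the name is_valid_hexcolor is hypothetical, and the function is assumed to receive the token text including the leading "#".

```python
import re

# Hypothetical helper: the semantic check described in the grammar comment.
# Exactly 3 or exactly 6 hex digits must follow the "#".
_HEXCOLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")


def is_valid_hexcolor(token_text: str) -> bool:
    """Return True if a HASH token's text is usable as a color value."""
    return _HEXCOLOR.fullmatch(token_text) is not None


print(is_valid_hexcolor("#000"))   # True
print(is_valid_hexcolor("#abcd"))  # False
```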

G.2Lexical scanner

The following is the tokenizer, written in Flex (see [FLEX]) notation. The tokenizer is case-insensitive.

The "\377" represents the highest character number that current versions of Flex can deal with (decimal 255). It should be read as "\4177777" (decimal 1114111), which is the highest possible code point in Unicode/ISO-10646.
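Both escapes are octal. As a quick check of the decimal values quoted above, the conversion can be sketched in Python (the variable names are illustrative only):

```python
# Octal escapes as written in Flex notation, converted to decimal.
flex_limit = int("377", 8)         # highest character number Flex handles
unicode_limit = int("4177777", 8)  # highest Unicode/ISO-10646 code point

print(flex_limit)          # 255
print(unicode_limit)       # 1114111
print(hex(unicode_limit))  # 0x10ffff
```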

%option case-insensitive

h               [0-9a-f]
nonascii        [\240-\377]
unicode         \\{h}{1,6}(\r\n|[ \t\r\n\f])?
escape          {unicode}|\\[^\r\n\f0-9a-f]
nmstart         [_a-z]|{nonascii}|{escape}
nmchar          [_a-z0-9-]|{nonascii}|{escape}
string1         \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
string2         \'([^\n\r\f\\']|\\{nl}|{escape})*\'
badstring1      \"([^\n\r\f\\"]|\\{nl}|{escape})*\\?
badstring2      \'([^\n\r\f\\']|\\{nl}|{escape})*\\?
badcomment1     \/\*[^*]*\*+([^/*][^*]*\*+)*
badcomment2     \/\*[^*]*(\*+[^/*][^*]*)*
baduri1         url\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}
baduri2         url\({w}{string}{w}
baduri3         url\({w}{badstring}
comment         \/\*[^*]*\*+([^/*][^*]*\*+)*\/
ident           -?{nmstart}{nmchar}*
name            {nmchar}+
num             [0-9]+|[0-9]*"."[0-9]+
string          {string1}|{string2}
badstring       {badstring1}|{badstring2}
badcomment      {badcomment1}|{badcomment2}
baduri          {baduri1}|{baduri2}|{baduri3}
url             ([!#$%&*-~]|{nonascii}|{escape})*
s               [ \t\r\n\f]+
w               {s}?
nl              \n|\r\n|\r|\f

A               a|\\0{0,4}(41|61)(\r\n|[ \t\r\n\f])?
C               c|\\0{0,4}(43|63)(\r\n|[ \t\r\n\f])?
D               d|\\0{0,4}(44|64)(\r\n|[ \t\r\n\f])?
E               e|\\0{0,4}(45|65)(\r\n|[ \t\r\n\f])?
G               g|\\0{0,4}(47|67)(\r\n|[ \t\r\n\f])?|\\g
H               h|\\0{0,4}(48|68)(\r\n|[ \t\r\n\f])?|\\h
I               i|\\0{0,4}(49|69)(\r\n|[ \t\r\n\f])?|\\i
K               k|\\0{0,4}(4b|6b)(\r\n|[ \t\r\n\f])?|\\k
L               l|\\0{0,4}(4c|6c)(\r\n|[ \t\r\n\f])?|\\l
M               m|\\0{0,4}(4d|6d)(\r\n|[ \t\r\n\f])?|\\m
N               n|\\0{0,4}(4e|6e)(\r\n|[ \t\r\n\f])?|\\n
O               o|\\0{0,4}(4f|6f)(\r\n|[ \t\r\n\f])?|\\o
P               p|\\0{0,4}(50|70)(\r\n|[ \t\r\n\f])?|\\p
R               r|\\0{0,4}(52|72)(\r\n|[ \t\r\n\f])?|\\r
S               s|\\0{0,4}(53|73)(\r\n|[ \t\r\n\f])?|\\s
T               t|\\0{0,4}(54|74)(\r\n|[ \t\r\n\f])?|\\t
U               u|\\0{0,4}(55|75)(\r\n|[ \t\r\n\f])?|\\u
X               x|\\0{0,4}(58|78)(\r\n|[ \t\r\n\f])?|\\x
Z               z|\\0{0,4}(5a|7a)(\r\n|[ \t\r\n\f])?|\\z

%%

{s}                     {return S;}

\/\*[^*]*\*+([^/*][^*]*\*+)*\/       /* ignore comments */
{badcomment}                         /* unclosed comment at EOF */

"<!--"                  {return CDO;}
"-->"                   {return CDC;}
"~="                    {return INCLUDES;}
"|="                    {return DASHMATCH;}

{string}                {return STRING;}
{badstring}             {return BAD_STRING;}

{ident}                 {return IDENT;}

"#"{name}               {return HASH;}

@{I}{M}{P}{O}{R}{T}     {return IMPORT_SYM;}
@{P}{A}{G}{E}           {return PAGE_SYM;}
@{M}{E}{D}{I}{A}        {return MEDIA_SYM;}
"@charset "             {return CHARSET_SYM;}

"!"({w}|{comment})*{I}{M}{P}{O}{R}{T}{A}{N}{T}  {return IMPORTANT_SYM;}

{num}{E}{M}             {return EMS;}
{num}{E}{X}             {return EXS;}
{num}{P}{X}             {return LENGTH;}
{num}{C}{M}             {return LENGTH;}
{num}{M}{M}             {return LENGTH;}
{num}{I}{N}             {return LENGTH;}
{num}{P}{T}             {return LENGTH;}
{num}{P}{C}             {return LENGTH;}
{num}{D}{E}{G}          {return ANGLE;}
{num}{R}{A}{D}          {return ANGLE;}
{num}{G}{R}{A}{D}       {return ANGLE;}
{num}{M}{S}             {return TIME;}
{num}{S}                {return TIME;}
{num}{H}{Z}             {return FREQ;}
{num}{K}{H}{Z}          {return FREQ;}
{num}{ident}            {return DIMENSION;}

{num}%                  {return PERCENTAGE;}
{num}                   {return NUMBER;}

"url("{w}{string}{w}")" {return URI;}
"url("{w}{url}{w}")"    {return URI;}
{baduri}                {return BAD_URI;}

{ident}"("              {return FUNCTION;}

.                       {return *yytext;}
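To illustrate how the numeric rules partition their input, the following Python fragment is a rough sketch of the {num}-prefixed rules only. It is not a replacement for the Flex scanner. It assumes ASCII input, leaves out the {nonascii} and {escape} alternatives (so escaped unit letters such as "\70 x" are not recognized), and classifies one already-isolated token text rather than scanning a stream. The names UNITS and classify_numeric are hypothetical.

```python
import re
from typing import Optional

# Simplified translations of the `num` and `ident` macros.
# Assumption: ASCII only; {nonascii} and {escape} are omitted.
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"
IDENT = r"-?[_a-z][_a-z0-9-]*"

# Unit suffixes that have a dedicated token in the scanner above.
UNITS = {
    "em": "EMS", "ex": "EXS",
    "px": "LENGTH", "cm": "LENGTH", "mm": "LENGTH",
    "in": "LENGTH", "pt": "LENGTH", "pc": "LENGTH",
    "deg": "ANGLE", "rad": "ANGLE", "grad": "ANGLE",
    "ms": "TIME", "s": "TIME",
    "hz": "FREQ", "khz": "FREQ",
}

_NUMERIC = re.compile(
    rf"(?P<num>{NUM})(?:(?P<pct>%)|(?P<unit>{IDENT}))?",
    re.IGNORECASE,  # mirrors %option case-insensitive
)


def classify_numeric(text: str) -> Optional[str]:
    """Return the token name for a numeric token text, or None if not numeric."""
    m = _NUMERIC.fullmatch(text)
    if m is None:
        return None
    if m.group("pct"):
        return "PERCENTAGE"
    unit = m.group("unit")
    if unit is None:
        return "NUMBER"
    # Any identifier that is not a known unit yields DIMENSION.
    return UNITS.get(unit.lower(), "DIMENSION")


print(classify_numeric("10PT"))      # LENGTH
print(classify_numeric("1.2serif"))  # DIMENSION
print(classify_numeric("50%"))       # PERCENTAGE
```

The lookup table stands in for the rule ordering in the scanner, where the specific unit rules precede the generic {num}{ident} rule.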

G.3 Comparison of tokenization in CSS 2.1 and CSS1

There are some differences between the syntax specified in the CSS1 recommendation ([CSS1]) and the one above. Most of these are due to new tokens in CSS2 that did not exist in CSS1. Others are because the grammar has been rewritten to be more readable. However, there are some incompatible changes that were felt to be errors in the CSS1 syntax. They are explained below.

CSS1 style sheets could only be in 1-byte-per-character encodings, such as ASCII and ISO-8859-1. CSS 2.1 has no such limitation. In practice, there was little difficulty in extrapolating the CSS1 tokenizer, and some UAs have accepted 2-byte encodings.
CSS1 only allowed four hex-digits after the backslash (\) to refer to Unicode characters; CSS2 allows six. Furthermore, CSS2 allows a white space character to delimit the escape sequence. E.g., according to CSS1, the string "\abcdef" has 3 letters (\abcd, e, and f); according to CSS2 it has only one (\abcdef).
The tab character (ASCII 9) was not allowed in strings. However, since strings in CSS1 were only used for font names and for URLs, the only way this can lead to incompatibility between CSS1 and CSS2 is if a style sheet contains a font family that has a tab in its name.
Similarly, newlines (escaped with a backslash) were not allowed in strings in CSS1.
CSS2 parses a number immediately followed by an identifier as a DIMENSION token (i.e., an unknown unit); CSS1 parsed it as a number and an identifier. That means that in CSS1, the declaration 'font: 10pt/1.2serif' was correct, as was 'font: 10pt/12pt serif'; in CSS2, a space is required before "serif". (Some UAs accepted the first example, but not the second.)
In CSS1, a class name could start with a digit (".55ft"), unless it was a dimension (".55in"). In CSS2, such classes are parsed as unknown dimensions (to allow for future additions of new units). To make ".55ft" a valid class, CSS2 requires the first digit to be escaped (".\35 5ft").
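The difference in escape handling can be sketched as follows. This Python fragment is illustrative only: read_escape is a hypothetical helper that splits a leading hex escape off a string, with the maximum digit count and the optional white space terminator passed as parameters to model the CSS1 and CSS2 rules described above.

```python
import re
from typing import Tuple


def read_escape(text: str, max_digits: int, allow_ws: bool) -> Tuple[str, str]:
    """Split a leading hex escape off `text`; return (hex_digits, rest).

    Hypothetical helper. With allow_ws=True, one white space character
    (or a CR/LF pair) after the digits is consumed as a terminator.
    """
    ws = r"(?:\r\n|[ \t\r\n\f])?" if allow_ws else ""
    m = re.match(r"\\([0-9a-fA-F]{1,%d})%s" % (max_digits, ws), text)
    if m is None:
        raise ValueError("text does not start with a hex escape")
    return m.group(1), text[m.end():]


# CSS1 reading: four digits, then "e" and "f" remain as separate letters.
print(read_escape(r"\abcdef", max_digits=4, allow_ws=False))  # ('abcd', 'ef')
# CSS2 reading: all six digits belong to one escape.
print(read_escape(r"\abcdef", max_digits=6, allow_ws=True))   # ('abcdef', '')

# The escaped class name from the last item: "\35 " decodes to "5".
digits, rest = read_escape(r"\35 5ft", max_digits=6, allow_ws=True)
class_name = chr(int(digits, 16)) + rest
print(class_name)  # 55ft
```

The space after "\35" matters because "5" and "f" are themselves hex digits: without it, "\355ft" would be read as the single escape \355f followed by "t".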

G.4 Implementation note

The lexical scanner for the CSS core syntax in section 4.1.1 can be implemented as a scanner without back-up. In Lex notation, that requires the addition of the following patterns (which do not change the returned tokens, only the efficiency of the scanner):

{ident}/\\          return IDENT;
#{name}/\\          return HASH;
@{ident}/\\         return ATKEYWORD;
#/\\                return DELIM;
@/\\                return DELIM;
@/-                 return DELIM;
@/-\\               return DELIM;
-/\\                return DELIM;
-/-                 return DELIM;
\</!                return DELIM;
\</!-               return DELIM;
{num}{ident}/\\     return DIMENSION;
{num}/\\            return NUMBER;
{num}/-             return NUMBER;
{num}/-\\           return NUMBER;
[0-9]+/\.           return NUMBER;
u/\+                return IDENT;
u\+[0-9a-f?]{1,6}/- return UNICODE_RANGE;
