[haskell][BNFC]n3337 Raw-string
Define the raw string literal (Raw-string) of n3337 in BNFC.
{-
Unicode:
  Unicode scalar value
    Basic Multilingual Plane(BMP) = U+0000 - U+FFFF(*)
    Supplementary planes          = U+010000 - U+10FFFF
    *) Reserved U+D800 - U+DFFF for Surrogate Pair encode rule.
  => char

r-char:
  Any member of the source character set, except a right parenthesis )
  followed by the initial d-char-sequence (which may be empty)
  followed by a double quote.
  => (char - [")"])

d-char:
  Any member of the basic source character set except: space,
  the left parenthesis (, the right parenthesis ), the backslash \,
  and the control characters representing horizontal tab, vertical tab,
  form feed, and newline.
  => (basic source char - ["() \\\t\v\f\n"])

2.3 Character sets
  The basic source character set consists of 96 characters:
  the space character, the control characters representing horizontal tab,
  vertical tab, form feed, and new-line, plus the following 91 graphical
  characters:
  a-z, A-Z, 0-9 and _{} []#() <>%:; .?*+- /^&|~ !=,\" '
  => ["\t\v\f\n0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_{}[]#()<>%:;.?*+-/^&|~!=,\\\"\'"]
-}

LTest. Test::= Raw_string;

token Raw_string
  ({"u8"} | ["uUL"])? {"R\""}
  (["\t\n0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_{}[]#()<>%:;.?*+-/^&|~!=,\\\"\'"] - ["() \\\t\n"])*
  '(' (char - [")"])* ')'
  (["\t\n0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_{}[]#()<>%:;.?*+-/^&|~!=,\\\"\'"] - ["() \\\t\n"])*
  '"';
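The token above is a plain regular expression, so it accepts any d-char-sequence on either side of the parentheses and cannot require that the closing delimiter equal the opening one. Below is a minimal Haskell sketch of a post-check on the matched token text (for example, the String carried by Raw_string). It mirrors the shape of the token, including its simplification that the body contains no ")". The names isDChar, RawParts, splitRaw and hasMatchingDelimiters are hypothetical helpers introduced here for illustration; they are not part of the code BNFC generates.

```haskell
module Main where

import Data.List (stripPrefix)
import Data.Maybe (listToMaybe, mapMaybe)

-- | d-char as described in the grammar comment: one of the 91 graphical
--   characters of the basic source character set, minus '(' , ')' and '\\'.
--   Space and the control characters are not graphical, so they are
--   excluded automatically.
isDChar :: Char -> Bool
isDChar c = c `elem` graphical && c `notElem` "()\\"
  where
    graphical = ['a'..'z'] ++ ['A'..'Z'] ++ ['0'..'9']
             ++ "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'"

-- | Pieces of a raw string token (hypothetical record, for illustration).
data RawParts = RawParts
  { rpPrefix :: String  -- ^ "u8", "u", "U", "L" or ""
  , rpOpen   :: String  -- ^ d-char-sequence before '('
  , rpBody   :: String  -- ^ text between '(' and the first ')'
  , rpClose  :: String  -- ^ d-char-sequence after ')'
  } deriving (Eq, Show)

-- | Split token text the same way the BNFC token reads it.
--   Assumption: like the token, the body stops at the first ')'.
splitRaw :: String -> Maybe RawParts
splitRaw s = do
  (pre, afterQuote) <- listToMaybe (mapMaybe tryPrefix ["u8", "u", "U", "L", ""])
  let (open, afterOpen) = span isDChar afterQuote
  afterParen    <- stripPrefix "(" afterOpen
  let (body, afterBody) = break (== ')') afterParen
  closeAndQuote <- stripPrefix ")" afterBody
  close         <- dropFinalQuote closeAndQuote
  if all isDChar close
    then Just (RawParts pre open body close)
    else Nothing
  where
    -- Try one encoding prefix followed by R and the opening double quote.
    tryPrefix p = (,) p <$> stripPrefix (p ++ "R\"") s
    -- The token must end with a double quote; remove it.
    dropFinalQuote t = case reverse t of
      '"' : r -> Just (reverse r)
      _       -> Nothing

-- | True only when the text splits cleanly and both delimiters are equal.
hasMatchingDelimiters :: String -> Bool
hasMatchingDelimiters s = case splitRaw s of
  Just parts -> rpOpen parts == rpClose parts
  Nothing    -> False

main :: IO ()
main = mapM_ (print . hasMatchingDelimiters)
  [ "u8R\"2sS(あ)6wY\""   -- delimiters 2sS and 6wY differ
  , "R\"^^(あああ)^^\""    -- delimiters ^^ and ^^ agree
  ]
```

Run as is, this prints False for the first input and True for the second. Keeping the check outside the grammar leaves the token definition regular, at the cost of a second pass over the token text.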
~$ ./TestRawStr
u8R"2sS(あ)6wY"

Parse Successful!

[Abstract Syntax]

LTest (Raw_string "u8R\"2sS(\12354)6wY\"")

[Linearized tree]

u8R"2sS(あ)6wY"

~$
~$ ./TestRawStr
R"^^(あああ)^^"

Parse Successful!

[Abstract Syntax]

LTest (Raw_string "R\"^^(\12354\12354\12354)^^\"")

[Linearized tree]

R"^^(あああ)^^"

~$
It seems to be parsing these correctly …