{"id":1068,"date":"2017-01-21T17:14:44","date_gmt":"2017-01-21T08:14:44","guid":{"rendered":"http:\/\/randt.jp\/?p=1068"},"modified":"2017-01-21T17:28:32","modified_gmt":"2017-01-21T08:28:32","slug":"haskellbnfcunicode%e3%82%b9%e3%82%ab%e3%83%a9%e5%80%a4%e3%81%ae%e3%82%b5%e3%83%ad%e3%82%b2%e3%83%bc%e3%83%88%e3%83%9a%e3%82%a2%e7%af%84%e5%9b%b2%e8%80%83%e6%85%ae","status":"publish","type":"post","link":"https:\/\/randt.jp\/?p=1068","title":{"rendered":"[haskell][BNFC] Accounting for the surrogate pair range in Unicode scalar values"},"content":{"rendered":"<p>The Unicode code space reserves the range 0xD800-0xDFFF for surrogate pairs, and no characters can be assigned to it. For this reason, this range is removed from the range of char.<\/p>\n<pre>\r\n~$ git diff\r\ndiff --git a\/source\/src\/BNFC\/Backend\/Haskell\/CFtoAlex3.hs b\/source\/src\/BNFC\/Backend\/Haskell\/CFtoAlex3.hs\r\nindex 054d576..4afdb83 100644\r\n--- a\/source\/src\/BNFC\/Backend\/Haskell\/CFtoAlex3.hs\r\n+++ b\/source\/src\/BNFC\/Backend\/Haskell\/CFtoAlex3.hs\r\n@@ -63,7 +63,7 @@ cMacros = [\r\n   \"$s = [a-z\\\\222-\\\\255] # [\\\\247]    -- small isolatin1 letter FIXME\",\r\n   \"$d = [0-9]                -- digit\",\r\n   \"$i = [$l $d _ ']          -- identifier character\",\r\n-  \"$u = [\\\\0-\\\\255]          -- universal: any character\"\r\n+  \"$u = [\\\\x0000-\\\\x10FFFF] # [\\\\xD800 -\\\\xDFFF]         -- universal: any character\"\r\n   ]\r\n \r\n rMacros :: CF -> [String]\r\n~$\r\n<\/pre>\n<pre>\r\n~$ cat R_char.cf\r\nLTest. 
Test::= R_char_sequence;\r\ntoken R_char_sequence (char - [\")\"])+;\r\n~$ \r\n<\/pre>\n<p>BNFC reads its input as UTF-8, so 0xD800 is forcibly converted to UTF-8 here. od prints each 32-bit word in little-endian order, so the leading 0x0a is actually the last byte, a newline. It gets appended automatically, presumably by echo.<\/p>\n<pre>\r\n~$ echo -e \"\\x00\\x00\\xD8\\x00\\x00\\x00\\x00\" | iconv -f ISO-10646\/UCS4 -t ISO-10646\/UTF8 - |  od -t x4\r\n0000000 0a80a0ed\r\n0000004\r\n~$ \r\n<\/pre>\n<p>Trying to parse 0xeda080 with BNFC fails, because Haskell refuses to read the input.<\/p>\n<pre>~$ echo -e \"\\xed\\xa0\\x80\" | .\/TestRChar\r\nTestRChar: &lt;stdin&gt;: hGetContents: invalid argument (invalid byte sequence)\r\n~$\r\n<\/pre>\n<p>Code points outside the surrogate range are read without problems, so they can be parsed.<\/p>\n<pre>\r\n~$ echo \"\u3042\" | od -t x4\r\n0000000 0a8281e3\r\n0000004\r\n~$ echo \"\u3042\" | .\/TestRChar\r\n\r\nParse Successful!\r\n\r\n[Abstract Syntax]\r\n\r\nLTest (R_char_sequence \"\\12354\\n\")\r\n\r\n[Linearized tree]\r\n\r\n\u3042\r\n\r\n~$ \r\n<\/pre>\n<p>Next, modify the test driver code that BNFC generated and test with it.<\/p>\n<pre>\r\n-- only main is shown\r\nmain :: IO ()\r\nmain = do\r\n  args &lt;- getArgs\r\n  case args of\r\n    [\"--help\"] -> usage\r\n    [\"--test1\"] -> run 2 pTest \"\\x3042\" -- Test 1: \u300c\u3042\u300d should parse successfully\r\n    [\"--test2\"] -> run 2 pTest \"\\xD800\" 
-- Test 2: inside the excluded range, so this should be a parse error\r\n    [\"--test3\"] -> run 2 pTest \"\\x2F852\" -- Test 3: a character that needs a surrogate pair in UTF-16, \u300c?\u300d\r\n    [] -> getContents >>= run 2 pTest\r\n    \"-s\":fs -> mapM_ (runFile 0 pTest) fs\r\n    fs -> mapM_ (runFile 2 pTest) fs\r\n<\/pre>\n<pre>\r\n~$ .\/TestRChar --test1\r\n\r\nParse Successful!\r\n\r\n[Abstract Syntax]\r\n\r\nLTest (R_char_sequence \"\\12354\")\r\n\r\n[Linearized tree]\r\n\r\n\u3042\r\n~$ .\/TestRChar --test2\r\n\r\nParse              Failed...\r\n\r\nTokens:\r\n[Err (Pn 1 1 2)]\r\nsyntax error at line 1, column 2 due to lexer error\r\n~$ .\/TestRChar --test3\r\n\r\nParse Successful!\r\n\r\n[Abstract Syntax]\r\n\r\nLTest (R_char_sequence \"\\194642\")\r\n\r\n[Linearized tree]\r\n\r\n?\r\n~$ \r\n<\/pre>\n<p>All three tests behave as expected, so the change looks fine.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Unicode code space reserves the range 0xD800-0xDFFF for surrogate pairs, and no characters can be assigned to it. For this reason, this range is removed from the range of char. ~$ git diff diff &#8211;git a\/so &#8230;<\/p>\n<p> <a class=\"continue-reading-link\" href=\"https:\/\/randt.jp\/?p=1068\"><span>Continue reading<\/span><i class=\"crycon-right-dir\"><\/i><\/a> 
<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[145,137,19,133],"tags":[],"class_list":["post-1068","post","type-post","status-publish","format-standard","hentry","category-bnfc","category-haskell","category-19","category-133"],"_links":{"self":[{"href":"https:\/\/randt.jp\/index.php?rest_route=\/wp\/v2\/posts\/1068","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/randt.jp\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/randt.jp\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/randt.jp\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/randt.jp\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1068"}],"version-history":[{"count":5,"href":"https:\/\/randt.jp\/index.php?rest_route=\/wp\/v2\/posts\/1068\/revisions"}],"predecessor-version":[{"id":1073,"href":"https:\/\/randt.jp\/index.php?rest_route=\/wp\/v2\/posts\/1068\/revisions\/1073"}],"wp:attachment":[{"href":"https:\/\/randt.jp\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1068"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/randt.jp\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1068"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/randt.jp\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1068"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}