Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


JavaScript parse error on '\u2028' Unicode character

Whenever I use the \u2028 character literal in my JavaScript source with the content type set to "text/html; charset=utf-8", I get a JavaScript parse error.

Example:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">

<html lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>json</title>

    <script type="text/javascript" charset="utf-8">
    // The literal below originally held a raw U+2028 (LINE SEPARATOR) character, shown here as '?'.
    var string = '?    ';
    </script>
</head>
<body>

</body>
</html>

If the <meta http-equiv> element is left out, everything works as expected. I've tested this in Safari and Firefox; both exhibit the same problem.

Any ideas on why this is happening and how to properly fix this (without removing the encoding)?

Edit: After some more research, it turned out that the problem character was being returned via JSONP. The response was then interpreted by the browser as JavaScript, which reads U+2028 as a newline and throws an error about an invalid newline in a string.


He who fights dragons for too long becomes a dragon himself; gaze too long into the abyss, and the abyss gazes back…

1 Answer


Yes, it's a feature of the JavaScript language, documented in the ECMAScript standard (3rd edition section 7.3), that the U+2028 and U+2029 characters count as line endings. Consequently a JavaScript parser will treat any unencoded U+2028/9 character in the same way as a newline. Since you can't put a newline inside a string literal, you get a syntax error.
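
A minimal sketch of the line-ending behaviour, assuming a runtime where `new Function` is available (Node.js, or a browser without a restrictive Content Security Policy). Engines that implement ES2019 or later accept raw U+2028/U+2029 inside string literals, so the original string-literal error may not reproduce there; the sketch therefore uses a single-line comment, which still ends at U+2028 in every edition. The variable names are illustrative only.

```javascript
// Build the character from its code point so this file holds no raw U+2028.
const LINE_SEP = String.fromCharCode(0x2028);

// A single-line comment ends at a line terminator, so the `return`
// statement after it is parsed as ordinary code.
const withLineFeed = new Function('// comment\nreturn 1;');

// U+2028 terminates the comment in exactly the same way as "\n".
const withLineSeparator = new Function('// comment' + LINE_SEP + 'return 1;');

const lineFeedResult = withLineFeed();
const lineSeparatorResult = withLineSeparator();
```

If U+2028 were an ordinary character, the second function body would be one long comment and the call would return `undefined` instead of 1.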

This is an unfortunate oversight in the design of JSON: it is not actually a proper subset of JavaScript. Raw U+2028/9 characters are valid in string literals in JSON, and will be accepted by JSON.parse, but not so in JavaScript itself.
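
The JSON side of this can be checked with a short sketch. It assumes only the built-in `JSON` object; the variable names are made up for illustration.

```javascript
// Build the characters from code points so this file holds no raw separators.
const LS = String.fromCharCode(0x2028); // LINE SEPARATOR
const PS = String.fromCharCode(0x2029); // PARAGRAPH SEPARATOR

// JSON text whose string value contains both characters unescaped.
const jsonText = '"' + 'a' + LS + 'b' + PS + 'c' + '"';

// JSON.parse accepts the raw characters inside a string.
const parsed = JSON.parse(jsonText);

// JSON.stringify writes them back out raw; it does not \u-escape them.
const roundTripped = JSON.stringify(parsed);
const containsRawSeparators =
  roundTripped.includes(LS) && roundTripped.includes(PS);
```

Because `JSON.stringify` leaves the characters raw, its output cannot be assumed safe to paste into JavaScript source on engines that follow the older rule.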

Hence it is only safe to generate JavaScript code using a JSON encoder if you're sure it explicitly \u-escapes those characters. Some do, some don't; many \u-escape all non-ASCII characters, which avoids the problem.
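
One way to get that guarantee is to escape the two characters yourself after encoding. This is a sketch under stated assumptions, not a complete JSONP implementation: `toJsSafeJson` and `toJsonp` are hypothetical helper names, and the callback name is assumed to have been validated elsewhere.

```javascript
// Hypothetical helper: encode a value as JSON, then \u-escape the two
// characters that JavaScript treats as line terminators.
function toJsSafeJson(value) {
  return JSON.stringify(value)
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// Hypothetical JSONP wrapper; callbackName is assumed to be validated already.
function toJsonp(callbackName, value) {
  return callbackName + '(' + toJsSafeJson(value) + ');';
}

const payload = {
  text: 'line one' + String.fromCharCode(0x2028) + 'line two'
};
const body = toJsonp('handleData', payload);
```

The escaped output is still valid JSON, so `JSON.parse(toJsSafeJson(payload))` gives back the original string with the U+2028 character intact.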

