Charset in meta content does not correctly parse for trailing semi-colon #92

Open
@1619digital

Description


Reference: http://www.w3.org/html/wg/drafts/html/master/infrastructure.html#algorithm-for-extracting-a-character-encoding-from-a-meta-element

Because the ContentAttrParser looks only for a space character to terminate an unquoted charset value, the element

<meta http-equiv="Content-Type" content="charset=iso8859-2;text/html">

is incorrectly inferred to have the charset 'iso8859-2;text/html' instead of 'iso8859-2'. The fix is to add a semicolon to the spaceCharacters scanned in SkipUntil (line 860), so that an unquoted value ends at either whitespace or a semicolon.
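
For anyone checking the expected behaviour, here is a minimal Python sketch of the extraction steps as I read them in the referenced algorithm. It is not this project's ContentAttrParser code: the function name extract_charset_label and the constants are hypothetical, and it returns the raw label rather than resolving it to an encoding.

```python
import string

# ASCII whitespace as the HTML spec defines it: tab, LF, FF, CR, space.
ASCII_WHITESPACE = "\t\n\x0c\r "
# Length-preserving ASCII lowercasing, so indices stay aligned with the input.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def extract_charset_label(content):
    """Return the charset label found in a meta content value, or None.

    Hypothetical helper for illustration only.
    """
    lowered = content.translate(_ASCII_LOWER)
    n = len(content)
    position = 0
    while True:
        idx = lowered.find("charset", position)
        if idx == -1:
            return None
        position = idx + len("charset")
        # Skip whitespace between "charset" and "=".
        while position < n and content[position] in ASCII_WHITESPACE:
            position += 1
        if position >= n or content[position] != "=":
            continue  # not "charset=", look for the next occurrence
        position += 1
        # Skip whitespace after "=".
        while position < n and content[position] in ASCII_WHITESPACE:
            position += 1
        if position >= n:
            return None
        quote = content[position]
        if quote in "\"'":
            end = content.find(quote, position + 1)
            # An unmatched quote yields no result.
            return None if end == -1 else content[position + 1:end]
        # Unquoted value: ends at whitespace OR a semicolon.
        end = position
        while end < n and content[end] not in ASCII_WHITESPACE + ";":
            end += 1
        return content[position:end]
```

With the content value from this issue, extract_charset_label("charset=iso8859-2;text/html") returns 'iso8859-2'. Dropping the ";" from the terminator set in the last loop reproduces the bug described above.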

EDIT: this is as per the specification. Also, I don't know what the status of the parser tests is, but they're out of date and incorrect and (obviously) not used. Most of the tests are still valid, though, so it would not take much to bring them back into the full test regime.
