Because the ContentAttrParser looks only for a space character to terminate an unquoted charset value,
<meta http-equiv="Content-Type" content="charset=iso8859-2;text/html">
is incorrectly inferred to have the charset 'iso8859-2;text/html'. The fix is to add a semicolon to the spaceCharacters scanned in SkipUntil (line 860).
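
To make the difference concrete, here is a minimal standalone sketch of the unquoted-value scan. It is not the library's code: `unquoted_charset`, `ASCII_WHITESPACE`, and the `terminators` parameter are hypothetical names chosen for illustration, and quoted values are deliberately not handled.

```python
from typing import Optional

# Assumption: the whitespace set used for scanning (tab, LF, FF, CR, space).
ASCII_WHITESPACE = b"\t\n\x0c\r "


def unquoted_charset(content: bytes, terminators: bytes) -> Optional[bytes]:
    """Return the unquoted value after 'charset=' in content, or None.

    Hypothetical helper: scanning stops at the first byte found in
    `terminators`, or at the end of the input.
    """
    start = content.lower().find(b"charset")
    if start == -1:
        return None
    pos = start + len(b"charset")
    # Skip optional whitespace before the '=' sign.
    while pos < len(content) and content[pos] in ASCII_WHITESPACE:
        pos += 1
    if content[pos:pos + 1] != b"=":
        return None
    pos += 1
    # Skip optional whitespace after the '=' sign.
    while pos < len(content) and content[pos] in ASCII_WHITESPACE:
        pos += 1
    # Scan forward until a terminator byte is reached.
    end = pos
    while end < len(content) and content[end] not in terminators:
        end += 1
    return content[pos:end] or None


content = b"charset=iso8859-2;text/html"
# Whitespace-only terminators reproduce the reported behaviour.
buggy = unquoted_charset(content, ASCII_WHITESPACE)
# Adding ';' to the terminators stops the scan at the semicolon.
fixed = unquoted_charset(content, ASCII_WHITESPACE + b";")
```

With whitespace-only terminators the scan runs to the end of the attribute value; once the semicolon is included it stops after `iso8859-2`.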
EDIT: This fix is in line with the specification. Also, I don't know the status of the parser tests, but they are out of date, incorrect, and (evidently) not run. Most of the tests are still valid, though, so it would not take much to bring them back into the full test regime.