return false;">Convert</button><span style="padding-top: 20px;"><img src="copy.png" alt="Copy" title="Copy to clipboard" class="icon" onclick="document.getElementById('zeroX').focus();document.getElementById('zeroX').select();document.execCommand('copy')"/></span><img src="selectall.png" alt="Select" title="Select all text in the box" class="icon" onclick="document.getElementById('zeroX').focus();document.getElementById('zeroX').select();"/>
<p>Most of the time you will probably want to drop the text to be converted into the <code>Mixed input</code> field, and hit the associated <code>Convert</code> button. This will convert all escapes to characters, then convert that into each of the forms listed against the boxes below.</p>
354
-
<p>If your text contains bare numbers that you also want to convert, use one of the convert buttons to the right. (Be aware, however, that in this case something like 'ab' could be interpreted as a hex number.)</p>
355
-
<p>Note, also, that escapes of the form \x, where x is one of a-zA-Z0-9 are not recognised by default. If you check the box next to <code>Convert \x</code> only the special JavaScript escapes are recognised (eg. \b, \n, \t, \", etc.) For full CSS behaviour here, use the CSS input field.</p>
353
+
<p>Most of the time you will probably want to drop the text to be converted into the field with the green background, and hit the associated <code>Convert</code> button. This will convert all escapes to characters, then convert those into each of the forms listed against the boxes below.</p>
354
+
<p>If your text contains bare numbers that you also want to convert, use the select control to the right. (Be aware, however, that in this case something like 'ab' could be interpreted as a hex number.)</p>
355
+
<p>Note, also, that the escapes \n, \t, \b, and \" etc, are not recognised by default. If you check the box next to <code>Convert \n etc</code> they will also be recognised. For full CSS behaviour here, use the CSS input field.</p>
356
+
356
357
<p><span class="leadin">Special use.</span> If you only want to convert a specific type of escape and leave all others untouched, paste the text into one of the other boxes and hit its associated <code>Convert</code> button.</p>
357
-
<p><span class="leadin">Checkboxes.</span> Several of the output fields have checkboxes that allow you to slightly alter the results of a conversion. If an output field already contains a result when you click on a checkbox, you'll often see a change happen as you click. In some cases, however, this doesn't happen, since it is not possible to produce good results.</p>
358
-
<p><span class="leadin">Invoking via URL.</span> You can also pass a string to the page using the q parameter in the URI. For example, <a href="/apps/conversion/?q=Cr%C3%AApes">http://r12a.github.io/apps/conversion/?q=Crêpes</a>. You can also pass a string with escapes in it, but you will need to be careful to percent escape characters such as &, + and # which affect the URI syntax. For example, <a href="/apps/conversion/?q=CrU%2B00EApes"> http://r12a.github.io/apps/conversion/?q=CrU%2B00EApes</a>.</p>
358
+
359
+
<p><span class="leadin">Checkboxes.</span> Several of the output fields have checkboxes that allow you to slightly alter the results of a conversion. If an output field already contains a result when you click on a checkbox, you'll often see a change happen as you click. In a couple of cases, however, this doesn't happen, since it is not possible to produce good results.</p>
360
+
361
+
<p><span class="leadin">Invoking via URL.</span> You can also pass a string to the page using the q parameter in the URI. For example, <a href="/app-conversion/beta?q=Cr%C3%AApes">http://r12a.github.io/app-conversion/?q=Crêpes</a>. You can also pass a string with escapes in it, but you will need to be careful to percent escape characters such as &, + and # which affect the URI syntax. For example, <a href="/app-conversion/beta?q=CrU%2B00EApes"> http://r12a.github.io/app-conversion/beta?q=CrU%2B00EApes</a>.</p>
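<p>As a rough sketch of the percent-escaping step described above, the standard JavaScript function <code>encodeURIComponent</code> can build the value of the q parameter. The helper name <code>buildConversionUrl</code> is hypothetical, and the base address is simply the one shown in the example link.</p>

```javascript
// Hypothetical helper: builds a link that passes `text` through the q parameter.
// The default base address is taken from the example link and is an assumption.
function buildConversionUrl(text, base = 'http://r12a.github.io/app-conversion/') {
  // encodeURIComponent percent-escapes &, + and #, and writes
  // non-ASCII characters as percent-escaped UTF-8 bytes.
  return base + '?q=' + encodeURIComponent(text);
}

console.log(buildConversionUrl('Crêpes'));
console.log(buildConversionUrl('CrU+00EApes'));
```

<p>The second call shows why the escaping matters: an unescaped + in a query string is commonly read as a space, so it is sent as %2B instead.</p>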
359
362
</section>
360
363
<section>
361
364
<h2>Box inputs and outputs</h2>
@@ -389,7 +392,7 @@ <h3>Decimal NCRs</h3>
389
392
390
393
391
394
392
-
<h3>JavaScript/Java/C escapes</h3>
395
+
<h3>JavaScript/Java/C</h3>
393
396
<p><strong> When conversion puts something here:</strong> By default, everything except visible ASCII characters is converted to numeric escapes, and the following escapes are substituted for ASCII characters: \0, \b, \t, \v, \f, \\.</p>
394
397
<p>The default output to this field is in the ES6 style, which is much more useful when dealing with supplementary characters (such as emoji), and is well supported by major browsers, except for Internet Explorer. To generate the old style escapes, or escapes for Java, deselect the <code>ES6-style</code> checkbox. A small number of Java-only named escapes such as <code class="kw" translate="no">\e</code> are rendered as numeric escapes.</p>
395
398
<p>If <code>C-style</code> is checked, supplementary characters are rendered by a single number, eight digits long, rather than two adjacent surrogate code point numbers.</p>
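<p>The three numeric styles described above can be compared with a minimal sketch. The function name <code>toJsEscapes</code> and its option names are hypothetical, not the page's own code. It assumes uppercase hex digits and that <code>C-style</code> only applies when <code>ES6-style</code> is off, and it leaves out the named escapes (\0, \b, \t, \v, \f) apart from the backslash.</p>

```javascript
// Hypothetical sketch of the numeric escape styles; not the page's own code.
function toJsEscapes(text, { es6 = true, cStyle = false } = {}) {
  let out = '';
  for (const ch of text) {               // for...of iterates by code point
    const cp = ch.codePointAt(0);
    if (ch === '\\') { out += '\\\\'; continue; }           // backslash is doubled
    if (cp >= 0x20 && cp <= 0x7e) { out += ch; continue; }  // visible ASCII is kept
    const hex = cp.toString(16).toUpperCase();
    if (es6) {
      out += '\\u{' + hex + '}';                            // ES6 style
    } else if (cStyle && cp > 0xffff) {
      out += '\\U' + hex.padStart(8, '0');                  // one eight-digit number
    } else {
      // Old style: one \uXXXX per UTF-16 code unit, so a supplementary
      // character becomes two adjacent surrogate escapes.
      for (let i = 0; i < ch.length; i++) {
        out += '\\u' + ch.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
      }
    }
  }
  return out;
}

console.log(toJsEscapes('é😀'));
console.log(toJsEscapes('é😀', { es6: false }));
console.log(toJsEscapes('é😀', { es6: false, cStyle: true }));
```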
<p><strong> When conversion puts something here:</strong> By default, everything except visible ASCII characters is converted to \u{...} escapes, and the following escapes are substituted for ASCII characters: \0, \b, \t, \v, \f, \\. Output for other characters in the ranges U+0001-U+001F and U+0080-U+009F (ie. invisible control characters) uses the \x.. escape format.</p>
410
413
<p>If <code>\n etc</code> is checked, line feeds (\n), tabs (\t), and quotation marks (\") are also escaped.</p>
411
414
@@ -419,68 +422,75 @@ <h3>Rust/Ruby escapes</h3>
419
422
420
423
421
424
422
-
<h3>Perl escapes</h3>
425
+
<h3>Perl/UTR#18</h3>
426
+
<p><strong> When conversion puts something here:</strong> By default, everything except visible ASCII characters is converted to \x{...} escapes, and the following escapes are substituted for ASCII characters: \0, \b, \t, \v, \f, \\. Output for other characters in the ranges U+0001-U+001F and U+0080-U+009F (ie. invisible control characters) uses the \x.. escape format.</p>
427
+
<p>If <code>\n etc</code> is checked, line feeds, tabs, and quotation marks are also escaped.</p>
428
+
423
429
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and escapes. Only the following types of escape are recognised:</p>
424
430
<ul>
425
431
<li>\x{1F468}</li>
426
432
<li>\x10</li>
427
433
<li>\0 \b \t \n \r \v \f \\ \"</li>
428
434
</ul>
429
-
<p><strong> When conversion puts something here:</strong> By default, everything except visible ASCII characters is converted to \x{...} escapes, and the following escapes are substituted for ASCII characters: \0, \b, \t, \v, \f, \\. Output for other characters in the ranges U+0001-U+001F and U+0080-U+009F (ie. invisible control characters) uses the \x.. escape format.</p>
430
-
<p>If <code>\n etc</code> is checked, line feeds, tabs, and quotation marks are also escaped.</p>
431
435
432
436
433
-
<h3>CSS escapes</h3>
434
-
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and escapes.</p>
437
+
<h3>CSS</h3>
435
438
<p><strong> When conversion puts something here:</strong> It does not escape non-control ASCII characters. Output content uses 6-digit escape forms <em>followed by a space</em> for supplementary characters, and 4-digit escapes followed by a space for all other escaped characters.</p>
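<p>A minimal sketch of that output rule follows. The name <code>toCssEscapes</code> is hypothetical, and the use of uppercase hex digits is an assumption rather than something the text states.</p>

```javascript
// Hypothetical sketch of the CSS output rule; not the page's own code.
function toCssEscapes(text) {
  let out = '';
  for (const ch of text) {               // iterate by code point
    const cp = ch.codePointAt(0);
    if (cp >= 0x20 && cp <= 0x7e) { out += ch; continue; }  // non-control ASCII is kept
    const width = cp > 0xffff ? 6 : 4;   // 6 digits for supplementary, 4 otherwise
    // The trailing space ends the escape, so a following hex-like
    // character is not read as part of the number.
    out += '\\' + cp.toString(16).toUpperCase().padStart(width, '0') + ' ';
  }
  return out;
}

console.log(toCssEscapes('é😀a'));
```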
436
-
439
+
440
+
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and escapes.</p>
441
+
437
442
438
443
<h3>Percent-encoding for URIs</h3>
439
-
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and escapes. Only percent escapes are converted.</p>
440
444
<p><strong> When conversion puts something here:</strong> Characters allowed in URI syntax are not converted.</p>
441
445
446
+
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and escapes. Only percent escapes are converted.</p>
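<p>Both directions can be approximated with standard JavaScript functions. This is only a sketch: the wrapper names are hypothetical, and it assumes the field treats the same characters as "allowed in URI syntax" that <code>encodeURI</code> does.</p>

```javascript
// Hypothetical wrappers around standard functions; not the page's own code.
function percentEncode(text) {
  // encodeURI leaves characters with a role in URI syntax (such as
  // / ? & = #) alone and percent-escapes spaces and non-ASCII text.
  return encodeURI(text);
}

function percentDecode(text) {
  // Only percent escapes are decoded; other escape forms pass through.
  // Note that a malformed sequence such as a lone % throws a URIError.
  return decodeURIComponent(text);
}

console.log(percentEncode('Crêpes & café?x=1'));
console.log(percentDecode('Cr%C3%AApes \\u00E9'));
```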
442
447
443
448
444
-
<h3>Unicode U+hex notation</h3>
445
-
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and escapes. Only U+hex escapes are converted.</p>
449
+
<h3>U+hex</h3>
446
450
<p><strong> When conversion puts something here:</strong> By default, everything except ASCII characters is converted.</p>
447
-
<p>You can use the checkboxes to specify whether ANSI (Latin1) characters remain unchanged, or whether all characters are converted. Adjacent escapes (only) are separated by a space.</p>
448
-
<p class="warning"><strong>Note:</strong> These checkboxes only work during conversions, they don't change text already in the output field.</p>
449
-
<p class="warning"><strong>Hint:</strong> to separate a sequence of characters by spaces, paste the characters into the <code>Mixed</code> field or <code>Characters</code> field and click <code>Convert</code>. Then click <code>Convert</code> immediately in the <code>Unicode U+hex notation</code> field and look in the <code>Characters</code> field for the result.</p>
451
+
<p>You can use the checkboxes to specify whether Latin1 characters remain unchanged, or whether all characters are converted.</p>
452
+
<p>If you want to insert spaces between adjacent escapes (only) click on the <code>Separate</code> button. Note, however, that if you now click on the <code>Convert</code> button for that field, the output will contain those extra spaces.</p>
453
+
<p class="warning"><strong>Hint:</strong> to separate a sequence of characters by spaces, paste the characters into the field with a green background and click <code>Convert</code>. Then click <code>Separate</code> followed by <code>Convert</code> in the <code>U+hex</code> field and look in the <code>Characters</code> field for the result.</p>
454
+
455
+
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and escapes. Only U+hex escapes are converted.</p>
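<p>The U+hex behaviour, including the <code>Separate</code> hint, can be sketched as below. All three function names are hypothetical, and the rule of accepting 4 to 6 hex digits after U+ is an assumption about how escapes are recognised.</p>

```javascript
// Hypothetical sketch of the U+hex field; not the page's own code.
function toUPlus(text) {
  let out = '';
  for (const ch of text) {               // iterate by code point
    const cp = ch.codePointAt(0);
    // Default behaviour: everything except ASCII is converted.
    out += cp < 0x80 ? ch : 'U+' + cp.toString(16).toUpperCase().padStart(4, '0');
  }
  return out;
}

function separate(text) {
  // Insert a space only between two directly adjacent escapes.
  return text.replace(/(U\+[0-9A-Fa-f]{4,6})(?=U\+)/g, '$1 ');
}

function fromUPlus(text) {
  // Only U+hex escapes are converted; all other text is left as it is.
  return text.replace(/U\+([0-9A-Fa-f]{4,6})/g, (match, hex) => {
    const cp = parseInt(hex, 16);
    return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
  });
}

console.log(fromUPlus('CrU+00EApes'));
console.log(fromUPlus(separate(toUPlus('é😀'))));
```

<p>The last line mirrors the hint: the spaces added by <code>separate</code> survive the conversion back, so the characters come out separated by spaces.</p>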
456
+
450
457
458
+
<h3>0x...</h3>
459
+
<p><strong> When conversion puts something here:</strong> By default, everything except ASCII characters is converted. You can use the checkboxes to specify whether Latin1 characters remain unchanged, or whether all characters are converted.</p>
460
+
<p>If you want to insert spaces between adjacent escapes (only) click on the <code>Separate</code> button. Note, however, that if you now click on the <code>Convert</code> button for that field, the output will contain those extra spaces.</p>
461
+
<p class="warning"><strong>Hint:</strong> to separate a sequence of characters by spaces, paste the characters into the field with a green background and click <code>Convert</code>. Then click <code>Separate</code> followed by <code>Convert</code> in the <code>0x...</code> field and look in the <code>Characters</code> field for the result.</p>
451
462
452
-
<h3>0x... hexadecimal notation</h3>
453
463
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and hexadecimal 0x... escapes. Only 0x... escapes are converted.</p>
454
-
<p><strong> When conversion puts something here:</strong> By default, everything except ASCII characters is converted. You can use the checkboxes to specify whether ANSI (Latin1) characters remain unchanged, or whether all characters are converted. Adjacent escapes (only) are separated by a space.</p>
455
-
<p class="warning"><strong>Note:</strong> These checkboxes only work during conversions, they don't change text already in the output field.</p>
456
-
<p class="warning"><strong>Hint:</strong> to separate a sequence of characters by spaces, paste the characters into the <code>Mixed</code> field or <code>Characters</code> field and click <code>Convert</code>. Then click <code>Convert</code> immediately in the <code>0x... notation</code> field and look in the <code>Characters</code> field for the result.</p>
457
-
458
-
459
-
<h3>Hexadecimal code points</h3>
460
-
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and hex numbers. Only hex numbers are converted.</p>
461
-
<p class="warning"><strong>Note</strong> that a sequence of two or more characters in the range a-f, such as <samp>cafe</samp>, will be treated as a hexadecimal number representing a character.</p>
462
464
465
+
466
+
<h3>Hexadecimal code points</h3>
463
467
<p><strong> When conversion puts something here:</strong> By default, you'll see Hex numbers only, all separated by spaces. If you use the checkbox to specify whether ASCII or Latin1 (ANSI) characters remain unchanged, a space is inserted before a code point if the character just before it is in the range [A-Za-z0-9].</p>
464
468
<p class="warning"><strong>Note:</strong> These checkboxes only work during conversions, they don't change text already in the output field.</p>
465
469
<p class="warning"><strong>Note:</strong> After sending output to this box you will get a different result in the other boxes if you immediately click on <code>Convert</code> above this box.</p>
466
470
471
+
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and hex numbers. Only hex numbers are converted.</p>
472
+
<p class="warning"><strong>Note</strong> that a sequence of two or more characters in the range a-f, such as <samp>cafe</samp>, will be treated as a hexadecimal number representing a character.</p>
473
+
467
474
468
475
<h3>Decimal code points</h3>
469
-
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and decimal numbers. Only decimal numbers are converted.</p>
470
476
<p><strong> When conversion puts something here:</strong> By default, you'll see decimal numbers only, all separated by spaces.</p>
471
477
<p>If you use the checkbox to specify whether ASCII or Latin1 (ANSI) characters remain unchanged, a space is inserted before a code point if the character just before it is in the range [A-Za-z0-9].</p>
472
478
<p class="warning"><strong>Note:</strong> These checkboxes only work during conversions, they don't change text already in the output field.</p>
473
479
<p class="warning"><strong>Note:</strong> After sending output to this box you will get a different result in the other boxes if you immediately click on <code>Convert</code> above this box.</p>
474
-
480
+
481
+
<p><strong>If you start a conversion from here:</strong> It can be a mix of text and decimal numbers. Only decimal numbers are converted.</p>
482
+
475
483
476
484
<h3>UTF-8 code units</h3>
477
-
<p><strong>If you start a conversion from here:</strong> It must be hexadecimal byte codes only, separated by spaces.</p>
478
485
<p><strong> When conversion puts something here:</strong> You'll see pairs of 2-digit hexadecimal numbers representing the bytes that make up the text when encoded in UTF-8.</p>
479
-
486
+
487
+
<p><strong>If you start a conversion from here:</strong> It must be hexadecimal byte codes only, separated by spaces.</p>
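<p>For illustration, the standard <code>TextEncoder</code> and <code>TextDecoder</code> APIs give the same kind of byte view. The function names below are hypothetical, and uppercase hex output is an assumption.</p>

```javascript
// Hypothetical sketch of the UTF-8 code unit field; not the page's own code.
function toUtf8Hex(text) {
  const bytes = new TextEncoder().encode(text);   // TextEncoder always emits UTF-8
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, '0')).join(' ');
}

function fromUtf8Hex(hex) {
  // Input must be hexadecimal byte codes only, separated by spaces.
  const bytes = hex.trim().split(/\s+/).map((h) => parseInt(h, 16));
  return new TextDecoder('utf-8').decode(new Uint8Array(bytes));
}

console.log(toUtf8Hex('é😀'));
console.log(fromUtf8Hex('C3 A9 F0 9F 98 80'));
```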
488
+
480
489
481
490
<h3>UTF-16 code units</h3>
482
-
<p><strong>If you start a conversion from here:</strong> It must be hexadecimal code units only, separated by spaces.</p>
483
491
<p><strong> When conversion puts something here:</strong> You'll see hexadecimal numbers of 1 to 4 digits representing the UTF-16 code units for the text converted. Supplementary characters are represented by two code units.</p>
492
+
493
+
<p><strong>If you start a conversion from here:</strong> It must be hexadecimal code units only, separated by spaces.</p>
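<p>JavaScript strings are already sequences of UTF-16 code units, so this field is straightforward to sketch. The function names are hypothetical, and the output is left unpadded to match the "1 to 4 digits" description above.</p>

```javascript
// Hypothetical sketch of the UTF-16 code unit field; not the page's own code.
function toUtf16Hex(text) {
  const units = [];
  for (let i = 0; i < text.length; i++) {
    // charCodeAt returns one UTF-16 code unit, so a supplementary
    // character contributes two entries (a surrogate pair).
    units.push(text.charCodeAt(i).toString(16).toUpperCase());
  }
  return units.join(' ');
}

function fromUtf16Hex(hex) {
  // Input must be hexadecimal code units only, separated by spaces.
  const units = hex.trim().split(/\s+/).map((h) => parseInt(h, 16));
  return String.fromCharCode(...units);
}

console.log(toUtf16Hex('A😀'));
console.log(fromUtf16Hex('41 D83D DE00'));
```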