You know about zero-width spaces, right? They're invisible characters that don't seem very useful at first glance.
Most of my experience with them has been negative: they usually show up in a random file I'm parsing and cause a bizarre bug, or they break a copy/paste search that yields no results even though the matches are sitting right in front of my eyes.
Well, despite my antipathy toward them, I actually found a good use for them.
I've recently been building out xertz, a static site generator written in TypeScript. It's being used to build this here site. Something that had been bugging me was the indentation of the rendered HTML. I use Handlebars.js templates and include raw HTML that has been converted from Markdown. My templates look something like the following, where {{ content_html }} is the raw HTML bit:
<body>
  {{> header }}
  {{> sidebar }}
  <article class="post">
    {{{ content_html }}}
  </article>
  {{> footer }}
</body>
The problem is, lines after a newline in {{ content_html }} do not get indented, so the actual output looks something like this:
<body>
  <header>My Site</header>
  <aside>Sidebar here</aside>
  <article class="post">
    <p>Lorem ipsum dolor sit amet, usu an justo deterruisset. Est ad discere nominati,
erroribus dissentias mei ne, appetere qualisque eloquentiam sea et.</p>
<img alt="An image" src="my-image.jpg" />
<p>Lorem ipsum
dolor sit amet, usu an justo deterruisset. Est ad discere nominati</p>
  </article>
  <footer>My footer</footer>
</body>
That's a simple example. In practice it's much worse. Yes, this has no effect on the actual layout and rendering of the webpage, but I'm a developer and I care about what the source looks like.
I tried to create a Handlebars helper called "indent" so I could indent each newline. I called it like this:
...
<article class="post">
  {{{ indent content_html 2 }}}
</article>
...
The helper is pretty simple. It just replaces newlines (\n) with a newline followed by a number of spaces:
function indent(input: string, width: number) {
  // Build a string of `width` spaces.
  const indentation = " ".repeat(width);
  // Follow every newline with the indentation.
  return input.replace(/\n/g, `\n${indentation}`);
}
That worked pretty well until my <pre> code blocks started looking like this:
const my_var;
    const another_var;
The HTML looked like this:
<body>
  <header>My Site</header>
  <aside>Sidebar here</aside>
  <article class="post">
    <p>Below is a code block</p>
    <pre>const my_var;
    const another_var;</pre>
    <p>Lorem ipsum dolor sit amet, usu an justo deterruisset. Est ad discere nominati,
    dissentias mei ne</p>
  </article>
  <footer>My footer</footer>
</body>
Uh oh, I'm getting indentation in my code blocks. Yes, since my code blocks are using <pre>, any spaces inside of them will be rendered as is. <pre> means "preformatted" after all.
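To make the problem concrete, here is a minimal sketch of a naive indent applied to a string containing a <pre> block. The function name naiveIndent and the sample HTML are made up for illustration; this is not the exact helper from my site.

```typescript
// Naive approach: follow every newline with `width` spaces, no exceptions.
function naiveIndent(input: string, width: number): string {
  return input.replace(/\n/g, "\n" + " ".repeat(width));
}

// Hypothetical sample: a paragraph followed by a two-line code block.
const html =
  "<p>Below is a code block</p>\n<pre>const my_var;\nconst another_var;</pre>";

const result = naiveIndent(html, 4);

// Pull out what ended up between <pre> and </pre>.
const preMatch = /<pre>([\s\S]*?)<\/pre>/.exec(result);
const preBody = preMatch ? preMatch[1] : "";

// The second line of code now carries four leading spaces it never had.
console.log(JSON.stringify(preBody));
```

The browser renders those extra spaces faithfully, which is exactly the skewed code block shown above.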
So then I thought I needed a hint to signal to the indent helper to skip indentation in these <pre> tags.
Adding hints in the <pre> tags seemed feasible because I use Prism with Marked to convert Markdown code blocks like:
```javascript
const my_var = "Hello";
```
into <pre> blocks. It's quite easy to modify the output of these tags because you provide a function that returns something like this:
return `<pre class="${className}"><code class="${className}">${code}</code></pre>`;
Easy to modify, yes. I thought, "Can I add some character(s) to the end of lines inside <pre> tags that my indent helper could skip?" But since I'm using a RegEx to add the indentation in my indent helper, the marker has to be a single character so that I can exclude it with a negated character class (e.g. [^!]) instead of a negative look-behind (JavaScript didn't support those before ES2018).
OK, so I just need Prism to add a single invisible character to the end of each line inside a <pre> block. Then my indent helper can ignore these lines. How do I do this?
Zero-width spaces, of course!
Now, my code formatting function precedes each newline in my code blocks with a zero-width space. It looks like this:
const codeWithNewlineHints = code.replace(
  /\n/g,
  // Prepend each newline with a zero-width space character so we can signal
  // to any later formatting step to leave the formatted code alone.
  "\u200b\n"
);
return `<pre class="${className}"><code class="${className}">${codeWithNewlineHints}</code></pre>`;
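Pulled out on its own, the hinting step can be sketched roughly as follows. The names addNewlineHints and renderCodeBlock are hypothetical, and the renderer is a plain function rather than the real Marked or Prism hook, so treat it as an illustration of the idea only.

```typescript
const ZERO_WIDTH_SPACE = "\u200b";

// Hypothetical helper: put a zero-width space in front of every newline.
function addNewlineHints(code: string): string {
  return code.replace(/\n/g, `${ZERO_WIDTH_SPACE}\n`);
}

// Hypothetical renderer; the markup mirrors the snippet above.
function renderCodeBlock(code: string, className: string): string {
  const hinted = addNewlineHints(code);
  return `<pre class="${className}"><code class="${className}">${hinted}</code></pre>`;
}

const rendered = renderCodeBlock(
  "const my_var;\nconst another_var;",
  "language-javascript"
);

// The hint is invisible when printed, so count it instead.
const hintCount = rendered.split(ZERO_WIDTH_SPACE).length - 1;
console.log(hintCount); // 1
```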
In my indent helper, I simply skip any newline that is preceded by this character.
const indentation = " ".repeat(width);
// The character before the newline is captured and re-emitted ($1) so it isn't lost.
return input.replace(/([^\u200b])\n/g, `$1\n${indentation}`);
See that RegEx there? The [^\u200b]\n part means only match newlines if they are not preceded by a zero-width space (\u200b). So, with this, indentation is only added after newlines that are not preceded by this character.
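One thing to watch with a pattern like [^\u200b]\n is that it also consumes the character before the newline, so a newline that directly follows another newline (or sits at the very start of the input) is never matched. As a hedged alternative, the sketch below matches plain newlines and inspects the preceding character in a replace callback instead. The name indentSkippingHints and the sample input are assumptions for illustration, not the code running on my site.

```typescript
const ZWSP = "\u200b";

// Indent after every newline unless that newline is preceded by a
// zero-width space. Nothing but the newline itself is consumed, so
// consecutive newlines are each checked independently.
function indentSkippingHints(input: string, width: number): string {
  const indentation = " ".repeat(width);
  return input.replace(/\n/g, (match: string, offset: number) =>
    offset > 0 && input[offset - 1] === ZWSP ? match : match + indentation
  );
}

// Hypothetical sample: the newline inside <pre> carries a hint, the rest do not.
const page =
  "<p>intro</p>\n<pre>line one" + ZWSP + "\nline two</pre>\n<p>outro</p>";

const indentedPage = indentSkippingHints(page, 2);
console.log(indentedPage);
```

The tags around the code block pick up two spaces, while the hinted line inside <pre> is left alone.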
I've gained a newfound respect for zero-width spaces.