One of the ways I put my talents as a software developer to use for the church is by maintaining the Catholic Stuff You Should Know website. I love reading stories about interesting bugs other developers have encountered, so I thought I’d share my own story about a bug I recently fixed on catholicstuffpodcast.com.
To understand the bug, it will be helpful to have a little bit of the backstory. The Catholic Stuff You Should Know website was originally built with WordPress. In 2016, I migrated the website off of WordPress and onto Jekyll (which is what the site still uses today). As part of that migration, I added episode numbers to all of our existing episodes to help keep things organized.
I even still have the simple PHP script I used to pull these episode numbers out of the audio filenames (which already had them) and add them to the Jekyll frontmatter for the episode so we could show the episode numbers on the website. It’s a little bit hacky, but it got the job done.
<?php
$postFiles = scandir("./_posts");
foreach ($postFiles as $file) {
    if ($file == "." || $file == "..") {
        continue;
    }
    $contents = file_get_contents("./_posts/" . $file);
    preg_match("/audio: '(\d{1,3})-/", $contents, $matches);
    $number = $matches[1];
    print $number . "\n";
    $contents = str_replace("\ntitle:", "\nnumber: $number\ntitle:", $contents);
    file_put_contents("./_posts/" . $file, $contents);
}
Recently, one of our listeners notified us that the episode numbers for some of our episodes from summer 2010 appeared to be incorrect. I took a peek, and sure enough, the listener was right!
I looked around a little more and compiled this list of our first 24 episodes, as they appeared on our website.
06 Jan 2010 · #1 Stylites
12 Jan 2010 · #2 Indulgences
15 Jan 2010 · #3 Prayer, Contemplation, and Liturgy
20 Jan 2010 · #4 Tetragrammaton
25 Jan 2010 · #5 Le Grande Chartreuse
28 Jan 2010 · #6 Eutrapelia and The Risus Paschalis
03 Feb 2010 · #7 Who Punched Arius?
09 Feb 2010 · #008 Cecchina Cabrini
18 Feb 2010 · #009 Ash Wednesday
25 Feb 2010 · #8 Ethiopian Christianity
03 Mar 2010 · #9 Campion's Brag
11 Mar 2010 · #10 Gregorian Chant
01 Apr 2010 · #11 Tenebrae
09 Apr 2010 · #12 The Holy Sepulcher
16 Apr 2010 · #13 Bona Coniugali
23 Apr 2010 · #14 How to Make a Priest
24 May 2010 · #15 Skellig Michael
01 Jun 2010 · #019 Abstinence
01 Jun 2010 · #018 Schisms
15 Jun 2010 · #16 Quiz Show
14 Sep 2010 · #17 Eros and Agape
21 Sep 2010 · #18 New Translation of the Mass
28 Sep 2010 · #19 When Bad Popes Go Good
05 Oct 2010 · #20 Peter's Bones
If you can correctly identify the bug from that list, well done! And if you can’t… Neither could I, initially. I was perplexed – there’s nothing complicated about the way our episode numbers work. We simply pull them from the YAML frontmatter for the Jekyll blog post. From this initial list I had compiled, it looked like whatever was assigning episode numbers had somehow ignored a few episodes in the numbering.
Because this is a Jekyll site, each “episode” is just a “post”, which is just a markdown file in a folder in the repository. I broke out some Unix tools to try to pin down the problem. As I write this, the most recent episode on our website is #471. And that matches the number of files in our _posts directory.
$ ls _posts/*.md | wc -l
471
Well, it’s good news that our total number of episodes matches the current episode number. But what the heck is going on then? Knowing that our frontmatter uses a number: key for episode numbers, I tried this:
$ cat _posts/*.md | grep "number:" | cut -f 2 -d ' ' | sort -n
001
002
003
...
469
470
471
At a glance, that list looks correct. Let’s look at the frontmatter “source” of the number for every episode from 2010.
$ grep "number:" _posts/2010-*
_posts/2010-01-06-stylites.md:number: 001
_posts/2010-01-12-indulgences.md:number: 002
_posts/2010-01-15-prayer-contemplation-and-liturgy.md:number: 003
_posts/2010-01-20-tetragrammaton.md:number: 004
_posts/2010-01-25-le-grande-chartreuse.md:number: 005
_posts/2010-01-28-eutrapelia-and-the-risus-paschalis.md:number: 006
_posts/2010-02-03-who-punched-arius.md:number: 007
_posts/2010-02-09-cecchina-cabrini.md:number: 008
_posts/2010-02-18-ash-wednesday.md:number: 009
_posts/2010-02-25-ethiopian-christianity.md:number: 010
...
Everything’s apparently correct!
Still confused, I started looking at a specific post in more detail. I started with February 2010, the first month where broken posts show up. And I noticed that the “Ethiopian Christianity” episode was showing number 8 on the site but the frontmatter was number 10:
title: 'Ethiopian Christianity'
number: 010
Can you spot the bug? Our episode numbers were padded with zeros - a common practice that makes them sort correctly in filenames. (Remember, I initially pulled these episode numbers from our audio filenames.) But in this case, Ruby’s YAML parser was interpreting the number as an octal number because of the leading 0!
irb> require 'yaml'
irb> YAML.load("---\nnumber: 010")
=> {"number"=>8}
Fortunately for us, this ceases to be a problem after episode 100 (when we no longer have any leading zeros). So our most recent 371 episodes are numbered correctly. But some episodes between 1 and 100 have incorrect numbering. In fact, most do. Every zero-padded number from 010 up that uses only the digits 0-7 is read as octal and comes out wrong, because only the numbers 1-7 have the same representation in octal and base 10. Some numbers, like 009, don’t parse correctly as an octal number. I think, in this case, the YAML parser falls back to a string, which is why the leading zeros showed up on the web for these numbers.
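To see how the zero-padded numbers split into those groups, here is a minimal Ruby sketch. It was not part of the original investigation, and `classify_padded_numbers` is a made-up helper name. It loads each padded number from 001 to 099 through Ruby's standard `yaml` library and sorts the result into three buckets: parsed correctly, misread as a different integer, or left as a string.

```ruby
require 'yaml'

# Hypothetical helper (not from the site's code): load every zero-padded
# number in the range through Ruby's YAML parser and record what happens.
def classify_padded_numbers(range = 1..99)
  result = { correct: [], misread: [], string: [] }
  range.each do |n|
    padded = format('%03d', n)                      # e.g. 10 -> "010"
    parsed = YAML.load("number: #{padded}")['number']
    if parsed.is_a?(String)
      result[:string] << padded                     # e.g. "009" is not valid octal
    elsif parsed == n
      result[:correct] << padded                    # same value in octal and base 10
    else
      result[:misread] << padded                    # read as octal, e.g. "010" -> 8
    end
  end
  result
end

groups = classify_padded_numbers
groups.each do |kind, values|
  puts "#{kind}: #{values.size} (first: #{values.first})"
end
```

Assuming the parser treats a leading zero followed by the digits 0-7 as octal, as the irb session above shows, this should report 7 correct numbers (001-007), 56 misread ones, and 36 that stay strings.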
Although the problem was hard to identify, the fix is easy – we just need to remove the leading zeros from our YAML!
sed -i 's/number: [0]*\([1-9]*\)/number: \1/' _posts/*.md
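A fix like this is easy to undo by accident the next time someone copies a padded number from a filename. As a hedged sketch of a guard against that, the Ruby below finds `number:` lines that still carry a leading zero and strips the padding in the same spirit as the sed command. Both function names are hypothetical, and the regexes assume each `number:` key sits on its own line, as in the frontmatter shown earlier.

```ruby
# Hypothetical check: return the "number:" lines in one post's raw text
# whose value still starts with a padding zero.
def zero_padded_number_lines(text)
  text.lines.map(&:chomp).grep(/\Anumber:[ \t]*0\d+[ \t]*\z/)
end

# Hypothetical fix: drop leading zeros from "number:" values, leaving
# unpadded values such as 100 untouched.
def strip_number_padding(text)
  text.gsub(/^(number:[ \t]*)0+(\d+)$/, '\1\2')
end

sample = "---\ntitle: 'Ethiopian Christianity'\nnumber: 010\n---\n"
puts zero_padded_number_lines(sample).inspect
puts strip_number_padding(sample)
```

Run over every file in `_posts`, a check along these lines could fail a build whenever a padded number sneaks back in.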
So that’s how we found and fixed an episode numbering bug for the Catholic Stuff You Should Know website. I hope you enjoyed the story. This wasn’t the first bug I’ve seen related to parsing octals, and it probably won’t be the last! So it’s good to remember this can happen any time you see a leading zero on a number - particularly number literals in YAML or code. And despite this small bug, working with Jekyll for the last 4 years has been a pleasure overall.
Catholic Stuff You Should Know is a weekly podcast that explores Catholic topics. Learn more at catholicstuffpodcast.com.