Hakyll Pt. 2 – Generating a Sitemap XML File
Info
Summary | Generate a sitemap.xml file for your hakyll site. |
---|---|
Shared | 2018-11-17 |
Revised | 2023-02-11 @ 16:00 UTC |
This is part 2 of a multipart series where we will look at getting a website / blog set up with hakyll and customized a fair bit.
- Pt. 1 – Setup & Initial Customization
- Pt. 2 – Generating a Sitemap XML File
- Pt. 3 – Generating RSS and Atom XML Feeds
- Pt. 4 – Copying Static Files For Your Build
- Pt. 5 – Generating Custom Post Filenames From a Title Slug
- Pt. 6 – Pure Builds With Nix
- The hakyll-nix-template Tutorial
Overview
Adding a Sitemap Template
A sitemap.xml template, just like the templates in the last post, receives context fields to work with (variables, essentially) and outputs the result of applying that context to the template. Here is what our sitemap template will look like today in our project’s `templates/sitemap.xml`:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
      xmlns:xhtml="http://www.w3.org/1999/xhtml"
      xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0"
      xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
      xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
      <url>
        <loc>$root$</loc>
        <changefreq>daily</changefreq>
        <priority>1.0</priority>
      </url>
    $for(pages)$
      <url>
        <loc>$root$$url$</loc>
        <lastmod>$if(updated)$$updated$$else$$if(date)$$date$$endif$$endif$</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
      </url>
    $endfor$
    </urlset>

Apart from the normal sitemap boilerplate, you can see `root`, `pages`, `url`, `date` and `updated` context fields. While `date` and `updated` would come from your metadata fields defined for a post, and the `url` is built from hakyll’s `defaultContext`, the `root` and `pages` fields are custom defined in what will be our very own `sitemapCtx` context. In the next section, we’ll use this template to generate our sitemap.xml file.
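The nested `$if(updated)$ ... $else$ $if(date)$ ...` in the `<lastmod>` line is a "first one that exists" choice. As a rough illustration only, here is the same decision in plain Haskell. `PageMeta` and `lastmod` are hypothetical names made up for this sketch; they are not part of hakyll.

```haskell
import Control.Applicative ((<|>))

-- | Hypothetical stand-in for a page's metadata; not a hakyll type.
data PageMeta = PageMeta
  { metaUpdated :: Maybe String  -- the `updated` field, if present
  , metaDate    :: Maybe String  -- the `date` field, if present
  }

-- | Prefer `updated`, fall back to `date`, otherwise produce nothing.
lastmod :: PageMeta -> Maybe String
lastmod meta = metaUpdated meta <|> metaDate meta
```

Note that when a page has neither field, the template as written still emits an empty `<lastmod></lastmod>` element; if that bothers you, wrapping the whole element in a conditional is one possible tweak.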
Generating the Sitemap XML File
If you create a hakyll project from scratch, you will start out with a few files that we can add to our sitemap:
- `index.html`
- `about.rst`
- `contact.markdown`
- `posts/2015-08-12-spqr.html`
- `posts/2015-10-07-rosa-rosa-rosam.html`
- `posts/2015-11-28-carpe-diem.html`
- `posts/2015-12-07-tu-quoque.html`
You should note that your `site.hs` file also has the following:

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      match (fromList ["about.rst", "contact.markdown"]) $ do
        route $ setExtension "html"
        compile $ pandocCompiler
          >>= loadAndApplyTemplate "templates/default.html" defaultContext

      match "posts/*" $ do
        route $ setExtension "html"
        compile $ pandocCompiler
          >>= loadAndApplyTemplate "templates/post.html" postCtx
          >>= loadAndApplyTemplate "templates/default.html" postCtx

It’s important to understand that any files you want loaded and sent to `templates/sitemap.xml` must first be `match`ed and `compile`d before the sitemap can be built. If you don’t do this, you’ll pull your hair out wondering why the file (or folder) you’re trying to include in the sitemap never shows up.
Now, there is something that we are going to emulate to make this sitemap a reality (this should already be in `site.hs`):

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      create ["archive.html"] $ do
        route idRoute
        compile $ do
          posts <- recentFirst =<< loadAll "posts/*"
          let archiveCtx =
                listField "posts" postCtx (return posts) `mappend`
                constField "title" "Archives"            `mappend`
                defaultContext

          makeItem ""
            >>= loadAndApplyTemplate "templates/archive.html" archiveCtx
            >>= loadAndApplyTemplate "templates/default.html" archiveCtx

Reading the code above, this essentially says:
1. here’s a file we want to create that does not yet exist (how `create` differs from `match`)
2. when you create the route, keep the filename (what `idRoute` does)
3. when you compile, load all the posts, specify what the context to send to each template will be, then make the item (the `""` is the item’s initial, empty body… see the source for more), then pass the context to the archive template and pass that on to the default template, ultimately building up a full webpage from the inside-out
Let’s change this 3-step rule to suit our needs before we wrangle the code. We want our rules to say:
1. here’s a file we want to create that does not yet exist (`sitemap.xml`)
2. when you create the route, keep the filename (what `idRoute` does)
3. when you compile, load all the posts, load all the other pages, specify what the context to send to each template will be, then make the item, then pass the context to the sitemap template, ultimately building up an XML file
This is almost the same! Let’s write it:

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      create ["sitemap.xml"] $ do
        route idRoute
        compile $ do
          -- load and sort the posts
          posts <- recentFirst =<< loadAll "posts/*"

          -- load individual pages from a list (globs DO NOT work here)
          singlePages <- loadAll (fromList ["about.rst", "contact.markdown"])

          let -- mappend the posts and singlePages together
              pages = posts <> singlePages

              -- create the `pages` field with the postCtx
              -- and return the `pages` value for it
              sitemapCtx = listField "pages" postCtx (return pages)

          -- make the item and apply our sitemap template
          makeItem ""
            >>= loadAndApplyTemplate "templates/sitemap.xml" sitemapCtx

This is starting to look good! But what’s wrong here? Remember the `root` context bits? We’re going to need to define what that is, and the best way that I’ve found right now is simply as a `String`; if you want to do something fancy with configuration or reading it in dynamically, then go nuts.

    root :: String
    root = "https://ourblog.com"

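If you do want to go the "reading it in dynamically" route, one option is an environment variable with a fallback. This is only a sketch under assumptions: the variable name `SITE_ROOT` and the helpers `defaultRoot`, `resolveRoot` and `siteRoot` are invented for illustration and appear nowhere in the project.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- | Fallback used when the (hypothetical) SITE_ROOT variable is unset.
defaultRoot :: String
defaultRoot = "https://ourblog.com"

-- | Pure part of the decision: use the override if there is one.
resolveRoot :: Maybe String -> String
resolveRoot = fromMaybe defaultRoot

-- | Read the root from the environment, falling back to the default.
siteRoot :: IO String
siteRoot = resolveRoot <$> lookupEnv "SITE_ROOT"
```

You would then run `siteRoot` at the top of `main` and pass the result to whatever builds your contexts. The rest of this post sticks with the plain hard-coded `root`.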
With that defined, we can add it to our contexts:

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      create ["sitemap.xml"] $ do
        route idRoute
        compile $ do
          posts <- recentFirst =<< loadAll "posts/*"
          singlePages <- loadAll (fromList ["about.rst", "contact.markdown"])
          let pages = posts <> singlePages
              sitemapCtx =
                constField "root" root <> -- here
                listField "pages" postCtx (return pages)
          makeItem ""
            >>= loadAndApplyTemplate "templates/sitemap.xml" sitemapCtx

    -- ...

    postCtx :: Context String
    postCtx =
      constField "root" root      <> -- here
      dateField "date" "%Y-%m-%d" <>
      defaultContext

Hint: if the `<>` is throwing you for a loop, it’s the same thing as `mappend`.
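To see the equivalence outside of hakyll, here is a tiny base-only example using lists, which are also monoids. The names `withOperator` and `withMappend` exist only for this illustration.

```haskell
-- Both spellings combine two Monoid values the same way.
-- `<>` comes from Semigroup; in modern base, `mappend` defaults to it.
withOperator :: [String]
withOperator = ["posts"] <> ["pages"]

withMappend :: [String]
withMappend = ["posts"] `mappend` ["pages"]
```

Both values are the two-element list `["posts", "pages"]`. Contexts combine the same way, which is why the archive rule can use `mappend` while the sitemap rule uses `<>`.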
See how we defined `constField "root" root` in two places? We’re talking about two different contexts here: the sitemap context and the post context. While you could have the `postCtx` be combined with the `sitemapCtx`, thus giving the `pages` field access to the `root` field, you probably want to use `root` (and perhaps other constants) wherever you work with posts, so adding them to `postCtx` for use everywhere seems like the right thing to do.
Once you’ve got all this, run the following to build (or rebuild) your `docs/sitemap.xml` file:
1. `λ stack build`
2. `λ stack exec site clean`
3. `λ stack exec site build`

Your `docs/sitemap.xml` should now have all your pages defined in it!
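If you would rather check the result than eyeball it, a few lines of plain Haskell can pull the `<loc>` values out of the generated file. This is a quick sketch, not a real XML parser; `extractLocs` and `breakOn` are helper names made up here, and it assumes the `<loc>` tags appear exactly as written in the template.

```haskell
import Data.List (isPrefixOf)

-- | Collect the text between each <loc> and </loc> pair.
extractLocs :: String -> [String]
extractLocs [] = []
extractLocs s@(_ : rest)
  | "<loc>" `isPrefixOf` s =
      let body         = drop (length "<loc>") s
          (url, after) = breakOn "</loc>" body
      in  url : extractLocs after
  | otherwise = extractLocs rest

-- | Split a string just before the first occurrence of a marker.
breakOn :: String -> String -> (String, String)
breakOn _ [] = ([], [])
breakOn marker s@(c : cs)
  | marker `isPrefixOf` s = ([], s)
  | otherwise =
      let (before, after) = breakOn marker cs
      in  (c : before, after)
```

In GHCi, something like `extractLocs <$> readFile "docs/sitemap.xml"` should list every URL that made it into the sitemap.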
Adding Other Pages and Directories
We’ve done some epic traveling in New Zealand and now want to include a bunch of pages we’ve written in the sitemap. Those pages are:
- `new-zealand/index.md`
- `new-zealand/otago/index.md`
- `new-zealand/otago/dunedin-area.md`
- `new-zealand/otago/queenstown-area.md`
- `new-zealand/otago/wanaka-area.md`
First, we make sure that our pages get compiled (we’ll use `postCtx` for them):

    main :: IO ()
    main = hakyllWith config $ do
      -- ...
      match "new-zealand/**" $ do
        route $ setExtension "html"
        compile $ pandocCompiler
          >>= loadAndApplyTemplate "templates/post.html" postCtx
          >>= loadAndApplyTemplate "templates/default.html" postCtx

And then we want to make sure we add them to our `create` function:

    main :: IO ()
    main = hakyllWith config $ do
      -- ... match code up here
      create ["sitemap.xml"] $ do
        route idRoute
        compile $ do
          posts <- recentFirst =<< loadAll "posts/*"
          singlePages <- loadAll (fromList ["about.rst", "contact.markdown"])
          nzPages <- loadAll "new-zealand/**" -- here
          let pages = posts <> singlePages <> nzPages -- here
              sitemapCtx =
                constField "root" root <>
                listField "pages" postCtx (return pages)
          makeItem ""
            >>= loadAndApplyTemplate "templates/sitemap.xml" sitemapCtx

I could not figure out how to mix globs (`new-zealand/**`) in with individual file paths (included in `fromList`), so I had to load them separately; if you figure out how, let me know!
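One avenue that may be worth trying: hakyll’s pattern module exports a `.||.` combinator that builds a pattern matching anything either of its arguments matches. The sketch below is not what this post’s code does, and I have not run it against this exact project; `sitemapPages` is a name invented for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

-- | One pattern covering both the explicit pages and the glob.
--   The string literal becomes a glob Pattern via OverloadedStrings.
sitemapPages :: Pattern
sitemapPages =
  fromList ["about.rst", "contact.markdown"] .||. "new-zealand/**"
```

Inside the `compile` block, `loadAll sitemapPages` could then stand in for the separate `singlePages` and `nzPages` loads, assuming those files are still `match`ed and `compile`d earlier.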
Once you’ve got all this, run the following to rebuild your `docs/sitemap.xml` file:
1. `λ stack build`
2. `λ stack exec site rebuild`
Wrapping Up
In this lesson we learned how to dynamically generate a sitemap.xml file using hakyll. Next time, we’ll use these same skills to generate our own RSS and Atom XML feeds.
Next up:
- Pt. 3 – Generating RSS and Atom XML Feeds
- Pt. 4 – Copying Static Files For Your Build
- Pt. 5 – Generating Custom Post Filenames From a Title Slug
- (wip) Pt. 6 – Customizing Markdown Compiler Options
Thank you for reading!
Robert