Originally published at crabnebula.dev

     

Introduction to Code Generation in Rust

This article is about generating Rust code from other Rust code, not about the code generation step of the rustc compiler. Another term for source code generation is metaprogramming, but it will be referred to as code generation here. The reader is expected to have some Rust knowledge.

What problems can it solve?

Suppose I want to ship a web frontend embedded inside a Rust binary, such as a desktop application, to end users. Projects like Tauri achieve embedding with code generation by writing Rust code that generates more Rust code. Why does Tauri choose to use code generation over less complicated solutions? Let’s take a look at what a less complicated solution might look like.

Imagine the output of our web frontend looks like:

    dist
    ├── assets
    │  ├── script-44b5bae5.js
    │  ├── style-48a8825f.css
    ├── index.html

Let’s embed these in our Rust project by using include_str!(), which adds the content of the specified file into the binary. That would look something like this:

    use std::collections::HashMap;

    fn main() {
        let mut assets = HashMap::new();
        assets.insert("/index.html", include_str!("../dist/index.html"));
        assets.insert(
            "/assets/script-44b5bae5.js",
            include_str!("../dist/assets/script-44b5bae5.js"),
        );
        assets.insert(
            "/assets/style-48a8825f.css",
            include_str!("../dist/assets/style-48a8825f.css"),
        );
    }

Straightforward enough, now we can grab those assets directly from the final binary! However, what if we don’t always know the assets’ filenames ahead of time? Let’s say we have worked more on our frontend project and now its output looks like:

    dist
    ├── assets
    │  │ # script-44b5bae5.js previously
    │  ├── script-581f5c69.js
    │  │
    │  │ # style-48a8825f.css previously
    │  ├── style-e49f12aa.css
    ├── index.html

Ah… the filenames of our assets have changed due to our frontend bundler utilizing cache busting. The Rust code no longer compiles until we fix the filenames inside of it. It would be a terrible developer experience if we had to update our Rust code every time we changed the frontend - imagine if we had dozens of assets! Tauri uses code generation to avoid this by finding the assets at compile time and generating Rust code which references the correct assets.

Tools

Let’s talk about a few tools for code generation and then use them to implement our own simple asset bundler.

  • The quote crate enables us to write Rust code that gets transformed into data which then generates syntactically correct Rust code. This crate is ubiquitous across the Rust ecosystem for writing code generation.

  • The walkdir crate provides an easy way to recursively grab all items in a directory. This crate is highly applicable for our asset bundler use-case.

  • The phf crate implements a hash map using perfect hash functions. This is useful when all keys and values in the map are known before it’s built. This crate is highly applicable for our asset bundler use-case.

Rust code generation typically occurs in build scripts or macros. We will be building our simple asset bundler using build scripts because we will be accessing the disk. While procedural macros can also do that, it can be problematic in a few ways.

Building the Assets Bundler

The source code is available on GitHub if you want to see how everything is put together afterwards.

Create our library

Let’s start off with creating a new Rust library:

    cargo new --lib asset-bundler
    cd asset-bundler

We want to create a way for applications that use this library to grab the assets, so let’s create that first. This will involve us creating a wrapper around phf::Map and a method to let callers get the content.

    cargo add phf --features macros

We don’t need too much functionality from our Assets struct, just a way to create it and a way to get at what’s inside of it. The following goes into src/lib.rs:

    pub use phf; // re-export phf so we can use it later

    type Map = phf::Map<&'static str, &'static str>;

    /// Container for compile-time embedded assets.
    pub struct Assets(Map);

    impl From<Map> for Assets {
        fn from(value: Map) -> Self {
            Self(value)
        }
    }

    impl Assets {
        /// Get the contents of the specified asset path.
        pub fn get(&self, path: &str) -> Option<&str> {
            self.0.get(path).copied()
        }
    }

Codegen

Now, we build the library that will be used in a build script to generate our code. Because we will be having multiple crates in the same repository, let’s quickly convert the project to a cargo workspace. Let’s add the following to the top of our Cargo.toml:

    [workspace]
    members = ["codegen"]

Now we are ready to continue creating our codegen library. Run these commands to create our project and grab our dependencies:

    cargo new --lib codegen --name asset-bundler-codegen
    cargo add quote walkdir --package asset-bundler-codegen

Time to think a bit about what functionality we need and boil it down into a few concrete steps.

  • We pass an assets path to our function, which we will call base.

  • We check if base exists, or else we can’t do anything.

  • Recursively gather all file paths inside base.

  • Generate code to embed all the file paths.

One last thing to mention: we want to get assets by passing in a relative path. We want assets.get("index.html"), not assets.get("../dist/index.html"). This means we will need to keep track of that base directory passed into our function. Let’s write those requirements down as code inside of codegen/src/lib.rs:

    use std::path::{Path, PathBuf};

    /// Generate Rust code to create an [`asset_bundler::Assets`] from the passed path.
    pub fn codegen(path: &Path) -> std::io::Result<String> {
        // canonicalize also checks if the path exists
        // which is the only case that makes sense for us
        let base = path.canonicalize()?;

        let paths = gather_asset_paths(&base);
        Ok(generate_code(&paths, &base))
    }

    /// Recursively find all files in the passed directory.
    fn gather_asset_paths(base: &Path) -> Vec<PathBuf> {
        todo!()
    }

    /// Generate Rust code to create an [`asset_bundler::Assets`].
    fn generate_code(paths: &[PathBuf], base: &Path) -> String {
        todo!()
    }

Let’s take on gather_asset_paths first, since it’s more specific to our project than codegen. We will use walkdir to recursively grab all the files from the passed base directory. This is a simple example project, so we will ignore errors for now by using flatten(), which flattens nested iterators. Because Result also implements IntoIterator (yielding its value for Ok and nothing for Err), we are only left with successful values. Let’s implement it in codegen/src/lib.rs:

    /// Recursively find all files in the passed directory.
    fn gather_asset_paths(base: &Path) -> Vec<PathBuf> {
        let mut paths = Vec::new();
        for entry in WalkDir::new(base).into_iter().flatten() {
            // we only care about files, ignore directories
            if entry.file_type().is_file() {
                paths.push(entry.into_path())
            }
        }

        paths
    }
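If the flatten() trick is new to you, it can be tried in isolation. The following is a minimal standard-library sketch, not part of the bundler; the keep_successes name and the sample values are made up for illustration:

```rust
/// Keep only the successful values, silently dropping every error.
/// (Hypothetical helper, only here to demonstrate `flatten()` on `Result`s.)
fn keep_successes(results: Vec<Result<u32, String>>) -> Vec<u32> {
    // `Result` implements `IntoIterator`: `Ok(v)` yields `v` once,
    // `Err(_)` yields nothing, so flattening discards the errors.
    results.into_iter().flatten().collect()
}

fn main() {
    let results = vec![Ok(1), Err("permission denied".to_string()), Ok(3)];
    let kept = keep_successes(results);
    assert_eq!(kept, vec![1, 3]);
}
```

The same thing happens to any directory entry that walkdir fails to read: it simply never reaches the loop body. That is fine for this example, but a real bundler would probably want to report those errors instead.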

Cool cool cool.

Now we have a list of all asset files that are supposed to be included in the binary. The second function will generate the actual Rust code, but let’s see what the code we are generating should look like. We need to make sure that:

  • We import the correct dependencies.

  • The phf::Map is created with all the values; we can use phf::phf_map! to help.

  • Our Assets struct from our first library is created.

The first point is pretty important: we need to make sure we are calling the correct library. We can prevent crate name collisions by using a leading :: on our use statement. Additionally, we need to make sure we use our re-exported phf, otherwise the end application will fail to compile if it doesn’t itself depend on phf.

Using the frontend example from above, this is what the generated code around phf_map! should look like:

    use ::asset_bundler::{Assets, phf::{self, phf_map}};

    let map = phf_map! {
        "index.html" => include_str!("../dist/index.html"),
        "assets/script-44b5bae5.js" => include_str!("../dist/assets/script-44b5bae5.js"),
        "assets/style-48a8825f.css" => include_str!("../dist/assets/style-48a8825f.css")
    };

    let assets = Assets::from(map);

Our first problem is that we only have the paths used in include_str!(); we don’t have the “key” paths. We also need to turn our paths into strings at some point, because that is how they are used in the generated code. Let’s first figure out how to transform our list of paths into a list of strings suitable for keys. We need to strip the base prefix we resolved earlier from all the paths, so let’s write that inside of codegen/src/lib.rs:

    /// Turn paths into relative paths suitable for keys.
    fn keys(paths: &[PathBuf], base: &Path) -> Vec<String> {
        let mut keys = Vec::new();
        for path in paths {
            // ignore this failure case for this example
            if let Ok(key) = path.strip_prefix(base) {
                keys.push(key.to_string_lossy().into())
            }
        }

        keys
    }
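To check what keys produces, here is a small standalone sketch. It rewrites the same logic in iterator style and feeds it hypothetical Unix-style absolute paths, the kind canonicalize would hand us; the /project/dist directory is an assumption made up for the example:

```rust
use std::path::{Path, PathBuf};

/// Same idea as the `keys` function above, written with iterators:
/// strip `base` from every path and skip paths that are not under it.
fn keys(paths: &[PathBuf], base: &Path) -> Vec<String> {
    paths
        .iter()
        .filter_map(|path| path.strip_prefix(base).ok())
        .map(|key| key.to_string_lossy().into_owned())
        .collect()
}

fn main() {
    // Hypothetical paths on a Unix-like system.
    let base = Path::new("/project/dist");
    let paths = vec![
        PathBuf::from("/project/dist/index.html"),
        PathBuf::from("/project/dist/assets/script-44b5bae5.js"),
        // not under `base`, so it is skipped
        PathBuf::from("/elsewhere/readme.txt"),
    ];

    let relative = keys(&paths, base);
    assert_eq!(
        relative,
        vec![
            "index.html".to_string(),
            "assets/script-44b5bae5.js".to_string()
        ]
    );
}
```

Note that the separators in the keys are whatever the platform uses, so this sketch assumes forward slashes.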

The values of the map are easier. Their paths are already the ones include_str!() needs, so we just need to turn them into strings. Let’s write this one with an Iterator, which we also could have done with keys:

    let values = paths.iter().map(|p| p.to_string_lossy());

So now we have both keys and values in usable formats. Next comes the macro part, where we will actually be generating code from all the data.

First, let’s talk about how we are about to use double braces. This is not something required when doing code generation, but in our case we want to use the resulting Assets anywhere. The outer braces belong to quote!, while the inner braces make the generated code a block expression. A block expression can be used anywhere an expression is valid, which is lots of places.
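As a quick refresher on block expressions, here is a minimal standard-library sketch. The build_table and describe names are made up for illustration, and a HashMap stands in for the phf map:

```rust
/// Build a tiny lookup table; the right-hand side of the `let` is one block expression.
fn build_table() -> usize {
    let table = {
        // items such as `use` are allowed inside a block and stay scoped to it
        use std::collections::HashMap;

        let mut map = HashMap::new();
        map.insert("index.html", "<html></html>");
        // the last expression without a semicolon is the value of the block
        map
    };
    table.len()
}

/// Any function taking a value can receive a block expression as its argument.
fn describe(len: usize) -> String {
    format!("{len} asset(s)")
}

fn main() {
    assert_eq!(build_table(), 1);

    let text = describe({
        let a = 1;
        a + 1
    });
    assert_eq!(text, "2 asset(s)");
}
```

This is why the generated code can carry its own use statement without leaking imports into the code that includes it.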

Second, we are about to use some very unfamiliar syntax for those of you who have not written macros before. While it may seem strange at first, the syntax here is widely used across the ecosystem. In particular, we are going to be using the repetition syntax of quote. This allows us to use our two collections of keys and values together.

Let’s do it:

    quote! {
        {
            use ::asset_bundler::{Assets, phf::{self, phf_map}};
            Assets::from(phf_map! {
                #( #keys => include_str!(#values) ),*
            })
        }
    }

While the syntax is surely a departure from normal Rust code, hopefully you are able to recognize some familiar patterns we already went over. Here’s a side-by-side comparison to the phf_map! example we did before:

    let keys = ["key1", "key2", "key3"];
    let values = ["value1", "value2", "value3"];
    quote! {
        phf_map! {
            #( #keys => include_str!(#values) ),*
        }
    }

    // turns into this
    phf_map! {
        "key1" => include_str!("value1"),
        "key2" => include_str!("value2"),
        "key3" => include_str!("value3")
    }
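If the repetition still feels magical, it may help to see the same pairing spelled out by hand. The sketch below uses only the standard library and builds a plain string; it is an emulation of the idea, not how quote works internally (quote builds token streams, and its string output is spaced differently). The expand_entries name is hypothetical:

```rust
/// Pair each key with its value and join the pairs with commas, mimicking
/// what `#( #keys => include_str!(#values) ),*` expresses.
fn expand_entries(keys: &[&str], values: &[&str]) -> String {
    let entries: Vec<String> = keys
        .iter()
        // walk both collections in lockstep
        .zip(values)
        // `{:?}` prints the strings as quoted string literals
        .map(|(key, value)| format!("{key:?} => include_str!({value:?})"))
        .collect();

    // the `,` before `*` in the repetition is the separator between entries
    format!("phf_map! {{ {} }}", entries.join(", "))
}

fn main() {
    let keys = ["key1", "key2"];
    let values = ["value1", "value2"];
    let code = expand_entries(&keys, &values);
    assert_eq!(
        code,
        r#"phf_map! { "key1" => include_str!("value1"), "key2" => include_str!("value2") }"#
    );
}
```

Building code through string formatting like this gets fragile quickly, which is a good part of why we reach for quote instead.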

With all that out of the way, let’s plug that into our generate_code function we created earlier to see how it interacts with the rest of the code. Inside of codegen/src/lib.rs:

    /// Generate Rust code to create an [`asset_bundler::Assets`].
    fn generate_code(paths: &[PathBuf], base: &Path) -> String {
        let keys = keys(paths, base);
        let values = paths.iter().map(|p| p.to_string_lossy());

        // double braces to make it a block expression
        let output = quote! {
            {
                use ::asset_bundler::{Assets, phf::{self, phf_map}};
                Assets::from(phf_map! {
                    #( #keys => include_str!(#values) ),*
                })
            }
        };

        output.to_string()
    }

    /// Turn paths into relative paths suitable for keys.
    fn keys(paths: &[PathBuf], base: &Path) -> Vec<String> {
        let mut keys = Vec::new();
        for path in paths {
            // ignore this failure case for this example
            if let Ok(key) = path.strip_prefix(base) {
                keys.push(key.to_string_lossy().into())
            }
        }

        keys
    }

Phew! That actually wraps up the codegen library. I’ll drop the full codegen/src/lib.rs here, and then we can skedaddle to actually using what we just worked on:

    use quote::quote;
    use std::path::{Path, PathBuf};
    use walkdir::WalkDir;

    /// Generate Rust code to create an [`asset_bundler::Assets`] from the passed path.
    pub fn codegen(path: &Path) -> std::io::Result<String> {
        // canonicalize also checks if the path exists
        // which is the only case that makes sense for us
        let base = path.canonicalize()?;

        let paths = gather_asset_paths(&base);
        Ok(generate_code(&paths, &base))
    }

    /// Recursively find all files in the passed directory.
    fn gather_asset_paths(base: &Path) -> Vec<PathBuf> {
        let mut paths = Vec::new();
        for entry in WalkDir::new(base).into_iter().flatten() {
            // we only care about files, ignore directories
            if entry.file_type().is_file() {
                paths.push(entry.into_path())
            }
        }

        paths
    }

    /// Generate Rust code to create an [`asset_bundler::Assets`].
    fn generate_code(paths: &[PathBuf], base: &Path) -> String {
        let keys = keys(paths, base);
        let values = paths.iter().map(|p| p.to_string_lossy());

        // double braces to make it a block expression
        let output = quote! {
            {
                use ::asset_bundler::{Assets, phf::{self, phf_map}};
                Assets::from(phf_map! {
                    #( #keys => include_str!(#values) ),*
                })
            }
        };

        output.to_string()
    }

    /// Turn paths into relative paths suitable for keys.
    fn keys(paths: &[PathBuf], base: &Path) -> Vec<String> {
        let mut keys = Vec::new();
        for path in paths {
            // ignore this failure case for this example
            if let Ok(key) = path.strip_prefix(base) {
                keys.push(key.to_string_lossy().into())
            }
        }

        keys
    }

Using it

We just made a simple asset bundler in 50 lines of code, and it’s time to use it! We will start off with creating a new example project to consume the two libraries we just created.

First, add a new item to the root Cargo.toml:

    [workspace]
    members = ["codegen", "example"]

Then, we create the example binary and add our dependencies:

    cargo new --bin example
    cargo add asset-bundler --path . --package example
    cargo add --build asset-bundler-codegen --path codegen --package example
    touch example/build.rs
    mkdir -p example/assets/scripts

Let’s start off the Rust code with the build script since we just created our codegen library. We will call the codegen function we created earlier to get the generated code, then write that generated Rust code somewhere our other code can use it. This goes into our example/build.rs:

    use std::path::Path;

    fn main() {
        let assets = Path::new("assets");
        let codegen = match asset_bundler_codegen::codegen(assets) {
            Ok(codegen) => codegen,
            Err(err) => panic!("failed to generate asset bundler codegen: {err}"),
        };

        let out = std::env::var("OUT_DIR").unwrap();
        let out = Path::new(&out).join("assets.rs");
        std::fs::write(out, codegen.as_bytes()).unwrap();
    }

We wrote the code to $OUT_DIR/assets.rs because Cargo sets $OUT_DIR for build scripts to a directory that is unique for each crate, and for each version of the same crate. The path we just wrote to will be important in just a second, but first let’s create some assets to actually use.
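One optional refinement, which is not part of the build script above: Cargo reads cargo:rerun-if-changed=PATH lines from a build script’s standard output and uses them to decide when the script needs to run again. A minimal sketch, assuming we only care about the assets directory; the rerun_if_changed helper is a hypothetical name:

```rust
use std::path::Path;

/// Format the Cargo directive that asks for a re-run when `path` changes.
/// (Hypothetical helper; the directive itself is what Cargo understands.)
fn rerun_if_changed(path: &Path) -> String {
    format!("cargo:rerun-if-changed={}", path.display())
}

fn main() {
    // In a real build script this line goes to stdout, where Cargo reads it.
    println!("{}", rerun_if_changed(Path::new("assets")));
}
```

Without any such directive Cargo falls back to re-running the build script when any file in the package changes, which already works for our example, so treat this as a possible tweak for larger projects.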

We want to create some assets that are somewhat representative of the example we used at the start. In this case, let’s imagine that these assets are for a webserver and the files are served to the browser. This article isn’t the place for implementing the server, but we will mimic the index.html’s script dependencies by using what asset they require as their contents. Run these commands to create them:

    echo -n "scripts/loader-a1b2c3.js" > example/assets/index.html
    echo -n "scripts/dashboard-f0e9d8.js" > example/assets/scripts/loader-a1b2c3.js
    echo -n "console.log('dashboard stuff')" > example/assets/scripts/dashboard-f0e9d8.js

It’s time to put it together and get a glimpse of how it works! We set up the examples so that there is only a single “always known” filename, index.html. Our goal is to get the content of that dashboard script using only an "index.html" literal. Each asset’s content is the key of the next asset, so we will jump from one to the next in example/src/main.rs:

    fn main() {
        // include the assets our build script created
        let assets = include!(concat!(env!("OUT_DIR"), "/assets.rs"));

        let index = assets.get("index.html").unwrap();
        let loader = assets.get(index).unwrap();
        let dashboard = assets.get(loader).unwrap();

        assert_eq!(dashboard, "console.log('dashboard stuff')");
    }
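The chain of lookups can also be tried without the build script. The sketch below swaps in a plain HashMap as a stand-in for Assets and fills it with the same contents the echo commands created; the follow and sample_assets names are made up for illustration:

```rust
use std::collections::HashMap;

/// Follow `hops` lookups starting from `start`, using each value as the next key.
fn follow<'a>(
    assets: &HashMap<&'a str, &'a str>,
    start: &'a str,
    hops: usize,
) -> Option<&'a str> {
    let mut current = start;
    for _ in 0..hops {
        // stop with `None` as soon as a key is missing
        current = assets.get(current).copied()?;
    }
    Some(current)
}

/// Stand-in for the generated `Assets`, holding the example files' contents.
fn sample_assets() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("index.html", "scripts/loader-a1b2c3.js"),
        ("scripts/loader-a1b2c3.js", "scripts/dashboard-f0e9d8.js"),
        ("scripts/dashboard-f0e9d8.js", "console.log('dashboard stuff')"),
    ])
}

fn main() {
    let assets = sample_assets();
    // index.html -> loader -> dashboard: three lookups in total
    assert_eq!(
        follow(&assets, "index.html", 3),
        Some("console.log('dashboard stuff')")
    );
    assert_eq!(follow(&assets, "missing.html", 1), None);
}
```

The difference in the real example is that nobody typed those keys by hand: the build script discovered them on disk.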

Don’t forget, you can see all the code on GitHub.

That’s it!

A very bare-bones asset bundler in 94 lines of code, including the example. Treating code generation like any other Rust code is an important aspect to keeping it understandable and maintainable. In those roughly 90 lines of code, there were only a handful of lines for doing actual code generation. Let’s break down what we did…

  • We created the asset-bundler crate that provides the Assets type and re-exports phf so that the generated code can use it.

  • We created the asset-bundler-codegen crate to hold all the functionality codegen uses, along with providing a public function codegen to utilize it.

  • We created the example build script to call the codegen function on its own assets. The generated code was written to a file which we then included in our example/src/main.rs.

While having a separate crate isn’t necessary for specifically build script code generation, it is very common. Not only does it help separate concerns and prevent unused dependencies, it also helps prevent circular dependencies on more complex projects. Having a separate crate is required for performing code generation with procedural macros.

Code generation is a powerful tool to bring advanced functionality to your Rust programs. Our example from earlier, Tauri, uses it extensively to perform code injection, compression, and validation for its own asset bundling.

Demystify code generation by writing it as regular Rust code, empowering you to build powerful software.


Author: Chip Reed, Security Engineer at CrabNebula
