
I run a website for Haskellers. People are able to put their email addresses on this website for others to contact them. These email addresses were historically protected by Mailhide, which would use a Captcha to prevent bots from scraping that information. Unfortunately, Mailhide was shut down. And from there, Sorta Secret was born.

Sorta Secret provides a pretty simple service, as well as a simple API. Using the encrypt endpoint, you can get an encrypted version of your secret. Using the show endpoint, you can get a webpage that will decrypt the information after passing a Recaptcha. That's basically it. You can go to my Haskellers profile and click "Reveal email address" to see this in action.
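Concretely, the two public endpoints boil down to the following (request shapes taken from the endpoints described later in this post; response bodies are illustrative):

```
GET /v1/encrypt?secret=<plaintext>
  -> 200, the ciphertext of <plaintext> as the response body

GET /v1/show?secret=<ciphertext>
  -> 200, an HTML page that decrypts <ciphertext> once a Recaptcha passes
```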

I originally wrote Sorta Secret a year ago in Rust using actix-web and deployed it, like most services we write at FP Complete, to our Kubernetes cluster. When Rust 1.39 was released with async/await support, and then Hyper 0.13 was released using that support, I decided I wanted to try rewriting against Hyper. But that's a story for another time.

After that, more out of curiosity than anything else, I decided to rewrite it as a serverless application using Cloudflare Workers, a serverless platform that supports Rust and WASM. To quote the Cloudflare page on the topic:

Serverless computing is a method of providing backend services on an as-used basis. A Serverless provider allows users to write and deploy code without the hassle of worrying about the underlying infrastructure.

This post will describe my experiences doing this, what I thought worked well (and not so well), and why you may consider doing something like this yourself.


Advantages

Let me start off with the major advantages of using Cloudflare Workers over my previous setup:

  • Geographic distribution: A typical hosting setup, including the Kubernetes cluster I deploy to, is set up in a single geographic location. For an embarrassingly parallel application like this, having your code run in all of Cloudflare's data centers is pretty awesome.
  • Setup time/cost: I already have access to a Kubernetes cluster. But for someone who doesn't already have a preexisting server or cluster to deploy their service, the time to set up a secure, high availability deployment environment, and the cost of running these machines, can be high. I'm currently paying $0 to host this service on Cloudflare.
  • Ease of testing/deployment: The Cloudflare team has done a great job with the Wrangler tool. Deploying an update is a call to wrangler publish. I can do testing with wrangler preview --watch. This is pretty awesome. And the publishing is fast.

Disadvantages

There are definitely some hurdles to overcome along the way.

  • Lack of examples: I found it very difficult to get even basic things working correctly. I'm hoping this post helps with that.
  • WASM libraries didn't work perfectly: Most libraries designed to help with WASM are targeted at the browser. In a Cloudflare Worker, for example, there's no Window. Instead, to call fetch, I needed a ServiceWorkerGlobalScope.
  • Slower dev cycle than I'd like: While wrangler preview is awesome, it still takes quite a bit of time to see a change. Each code change requires recompiling the Rust code, packaging up the bundle, sending it to Cloudflare, and refreshing the page. Especially since I was using compile-time-checked HTML templates, this ended up being pretty slow.
  • Secrets management: Unlike Kubernetes, there's no built-in secrets management in Cloudflare Workers. Someone on the Cloudflare team advised me that I could use their key/value store for secrets. I elected to be really dumb and compile the secrets (encryption key and Recaptcha secret key) directly into the executable.
  • Difficult debugging: It seems that the combination of async code, panics, and the bridge to JavaScript results in error messages getting completely dropped, which makes debugging very difficult.

That's enough motivation and demotivation for now. Let's see how this all fits together.

Getting started

The Cloudflare team has put together a very nice command line tool, wrangler, which happens to be written in Rust. Getting started with a brand new Cloudflare Workers Rust project is nice and easy; you don't even need to set up an account or provide any credentials.

cargo install wrangler
wrangler generate wasm-worker https://github.com/cloudflare/rustwasm-worker-template.git
cd wasm-worker
wrangler preview --watch

The problem is that this template doesn't do much. There's a Rust function called greet that returns a String. That Rust function is exposed to the JavaScript world via wasm-bindgen. There's a small JavaScript wrapper that imports that function and calls it when a new request comes in. However, we want to do a lot more in this application:

  • Perform routing inside Rust
  • Perform async operations (specifically making requests to the Recaptcha server)
  • Generate more than just 200 success status responses
  • Parse submitted JSON bodies
  • Use HTML templating

So let's dive down the rabbit hole!

wasm-bindgen

I've played with WASM a bit before this project, but not much. Coming up to speed with wasm-bindgen was honestly pretty difficult for me, and involved a lot of trial-and-error. Ultimately, I discovered that I could probably get away with one of two approaches for the binding layer between the JavaScript and Rust worlds:

  1. Have a thin wrapper in JavaScript that produces simple JSON objects, and then use serde inside Rust to turn those into nice structs
  2. Use the Request and Response types in web-sys directly

I discovered the first approach first, and went with it. I briefly played with moving over to the second approach, but it involved a significant overhaul of the code, so I stuck with approach 1. Those more skilled in this area may disagree with that choice. Anyway, here's what the JavaScript half of this looks like:

const { respond_wrapper } = wasm_bindgen;
await wasm_bindgen(wasm);

var body;
if (request.body) {
    body = await request.text();
} else {
    body = "";
}

var headers = {};
for(var key of request.headers.keys()) {
    headers[key] = request.headers.get(key);
}

const response = await respond_wrapper({
    method: request.method,
    headers: headers,
    url: request.url,
    body: body,
})
return new Response(response.body, {
    status: response.status,
    headers: response.headers,
})

Some interesting things to note here:

  • I'm pulling in the entire request body as a string. That works for our case (the only request body is form data), but isn't intelligent enough in general.
  • The respond_wrapper itself is returning a Promise on the JavaScript side. We're about to see some wasm-bindgen awesomeness.
  • There's not much work to convert between the simplified JSON values and the real JavaScript objects.

Now let's look at the Rust side of the equation. First we've got our Request and Response structs with appropriate serde deriving:

#[derive(Deserialize)]
pub struct Request {
    method: String,
    headers: HashMap<String, String>,
    url: String,
    body: String, // should really be Vec<u8>, I'm cheating here
}

#[derive(Serialize)]
pub struct Response {
    status: u16,
    headers: HashMap<String, String>,
    body: String,
}

Within the Rust world we want to deal exclusively with these types, and so our application lives inside a function with signature:

async fn respond(req: Request) -> Result<Response, Box<dyn std::error::Error>>

However, we can't export that to the JavaScript world. We need to ensure that our input and output types are things wasm-bindgen can handle. And to achieve that, we have a wrapper function that deals with the serde conversions and displaying the errors:

#[wasm_bindgen]
pub async fn respond_wrapper(req: JsValue) -> Result<JsValue, JsValue> {
    let req = req.into_serde().map_err(|e| e.to_string())?;
    let res = respond(req).await.map_err(|e| e.to_string())?;
    let res = JsValue::from_serde(&res).map_err(|e| e.to_string())?;
    Ok(res)
}

A wasm-bindgen function can accept JsValues (and lots of other types), and can return a Result<JsValue, JsValue>. In the case of an Err return, we'll get a runtime exception in the JavaScript world. We make our function pub so it can be exported. And by marking it async, we generate a Promise on the JavaScript side that can be awaited.

Other than that, it's some fairly standard serde stuff: converting from a JsValue into a Request via its Deserialize and converting a Response into a JsValue via its Serialize. In between those, we call our actual respond function, and map all error values into a String representation.

Routing

Our respond function receives a Request, and that Request has a url: String field. I was able to pull in the url crate directly, and then use its Url struct for easier processing:

let url: url::Url = req.url.parse()?;

Also, I wanted all requests to land on the www.sortasecret.com subdomain, so I added a bare domain redirect:

fn redirect_to_www(mut url: url::Url) -> Result<Response, url::ParseError> {
    url.set_host(Some("www.sortasecret.com"))?;
    let mut headers = HashMap::new();
    headers.insert("Location".to_string(), url.to_string());
    Ok(Response {
        status: 307,
        body: format!("Redirecting to {}", url),
        headers,
    })
}

if url.host_str() == Some("sortasecret.com") {
    return Ok(redirect_to_www(url)?);
}

This is already giving us some nice type safety guarantees from the Rust world, which I'm very happy to take advantage of. Next comes the routing itself. If I were more of a purist, I would check the request methods properly, return 405 "method not allowed" responses where appropriate, and so on. Instead, I went for a very hacky implementation:

Ok(match (req.method == "GET", url.path()) {
    (true, "/") => html(200, server::homepage_html()?),
    (true, "/v1/script.js") => js(200, server::script_js()?),
    (false, "/v1/decrypt") => {
        let (status, body) = server::decrypt(&req.body).await;
        html(status, body)
    }
    (true, "/v1/encrypt") => {
        let (status, body) = server::encrypt(&req.url.parse()?)?;
        html(status, body)
    }
    (true, "/v1/show") => {
        let (status, body) = server::show_html(&req.url.parse()?)?;
        html(status, body)
    }
    (_method, path) => html(404, format!("Not found: {}", path)),
})

Which relies on some helper functions:

fn html(status: u16, body: String) -> Response {
    let mut headers = HashMap::new();
    headers.insert("Content-Type".to_string(), "text/html; charset=utf-8".to_string());
    Response { status, headers, body }
}

fn js(status: u16, body: String) -> Response {
    let mut headers = HashMap::new();
    headers.insert("Content-Type".to_string(), "text/javascript; charset=utf-8".to_string());
    Response { status, headers, body }
}
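Since the dispatch is just a match on a (bool, &str) tuple, it's easy to exercise in isolation. Here's a std-only sketch with stub handlers in place of the real server::* calls:

```rust
use std::collections::HashMap;

pub struct Response {
    pub status: u16,
    pub headers: HashMap<String, String>,
    pub body: String,
}

fn html(status: u16, body: String) -> Response {
    let mut headers = HashMap::new();
    headers.insert(
        "Content-Type".to_string(),
        "text/html; charset=utf-8".to_string(),
    );
    Response { status, headers, body }
}

// Same (is_get, path) tuple match as the real router; the bodies are
// stubs standing in for server::homepage_html, server::decrypt, etc.
pub fn route(method: &str, path: &str) -> Response {
    match (method == "GET", path) {
        (true, "/") => html(200, "homepage".to_string()),
        (false, "/v1/decrypt") => html(200, "decrypted".to_string()),
        (_method, path) => html(404, format!("Not found: {}", path)),
    }
}

fn main() {
    assert_eq!(route("GET", "/").status, 200);
    // A GET against /v1/decrypt falls through to the 404 arm, since
    // that route only matches non-GET requests.
    assert_eq!(route("GET", "/v1/decrypt").status, 404);
    assert_eq!(route("PUT", "/v1/decrypt").status, 200);
    println!("routing behaves as expected");
}
```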

Let's dig in on some of these route handlers.

Templating

I'm using the askama crate for templating. This provides compile-time-parsed templates. For me, this is great because:

  • Errors are caught at compile time
  • Fewer files need to be shipped to the deployed system

The downside is you have to go through a complete compile/link step before you can see your changes.

I'm happy to report that there were absolutely no issues using askama on this project. It compiled for WASM without any changes to the code.

I have just one HTML template, which I use for both the homepage and the /v1/show route. There is only one variable in the template: the encrypted secret value. In the case of the homepage, we use some default message. For /v1/show, we use the value provided by the query string. Let's look at the entirety of the homepage logic:

#[derive(Template)]
#[template(path = "homepage.html")]
struct Homepage {
    secret: String,
}

fn make_homepage(keypair: &Keypair) -> Result<String, Box<dyn std::error::Error>> {
    Ok(Homepage {
        secret: keypair.encrypt("The secret message has now been decrypted, congratulations!")?,
    }.render()?)
}

Virtually all of the work is handled for us by askama itself. I defined a struct, added a few attributes, and then called render() on the value. Easy! I won't bore you with the details of the HTML here, but if you want, feel free to check out homepage.html on Github.

The story for script.js is similar, except it takes the Recaptcha site key as a variable.

#[derive(Template)]
#[template(path = "script.js", escape = "none")]
struct Script<'a> {
    site: &'a str,
}

pub(crate) fn script_js() -> Result<String, askama::Error> {
    Script {
        site: super::secrets::RECAPTCHA_SITE,
    }.render()
}

Cryptography

When I originally wrote Sorta Secret using actix-web, I used the sodiumoxide crate to access the sealedbox approach within libsodium. This provides a public key-based method of encrypting a secret. Unfortunately, sodiumoxide didn't compile trivially with WASM, which isn't surprising given that it's a binding to a C library. It may have been possible to brute force my way through this, but I decided to take a different approach.

Instead, I moved over to the pure-Rust cryptoxide crate. It doesn't provide the same high-level APIs of sodiumoxide, but it does provide chacha20poly1305, which is more than enough to implement symmetric key encryption.

This meant I also needed to generate some random values to create nonces, which was my first debugging nightmare. I used the getrandom crate to generate the random values, and initially added the dependency as:

getrandom = "0.1.13"

I naively assumed that it would automatically turn on the correct set of features to use WASM-relevant random data sources. Unfortunately, that wasn't the case. Instead, the calls to getrandom would simply panic about an unsupported backend. And while Cloudflare's preview system overall gives a great experience with error messages, the combination of a panic and a Promise meant that the exception was lost. By temporarily turning off the async bits and some other hacky workarounds, I eventually tracked down the problem and fixed it by replacing the above line with:

getrandom = { version = "0.1.13", features = ["wasm-bindgen"] }

If you're curious, you can check out the encrypt and decrypt methods on Github. One pleasant finding was that, once I got the code compiling, all of the tests passed the first time, which is always an experience I strive for in strongly typed languages.

Parsing query strings

Both the /v1/encrypt and /v1/show endpoints take a single query string parameter, secret. In the case of encrypt, this is a plaintext value. In the case of show, it's the encrypted ciphertext. However, they both parse initially to a String, so I used the same (poorly named) struct to handle parsing both of them. If you remember from before, I already parsed the requested URL into a url::Url value. Using serde_urlencoded makes it easy to throw all of this together:

#[derive(Deserialize, Debug)]
struct EncryptRequest {
    secret: String,
}

impl EncryptRequest {
    fn from_url(url: &url::Url) -> Option<Self> {
        serde_urlencoded::from_str(url.query()?).ok()
    }
}

Using this from the encrypt endpoint looks like this:

pub(crate) fn encrypt(url: &url::Url) -> Result<(u16, String), Box<dyn std::error::Error>> {
    match EncryptRequest::from_url(url) {
        Some(encreq) => {
            let keypair = make_keypair()?;
            let encrypted = keypair.encrypt(&encreq.secret)?;
            Ok((200, encrypted))
        }
        None => Ok((400, "Invalid parameters".into())),
    }
}

Feel free to check out the show_html endpoint too.
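For a single secret parameter, what serde_urlencoded is doing can be approximated with a hand-rolled, std-only parse. This is purely illustrative (the real code keeps serde so the struct can grow, and gets percent-decoding for free):

```rust
// Hand-rolled stand-in for EncryptRequest::from_url: scan the raw query
// string for a `secret` key. No percent-decoding here, unlike the real
// serde_urlencoded-based implementation.
fn secret_from_query(query: &str) -> Option<String> {
    query.split('&').find_map(|pair| {
        let (key, value) = pair.split_once('=')?;
        if key == "secret" {
            Some(value.to_string())
        } else {
            None
        }
    })
}

fn main() {
    assert_eq!(secret_from_query("secret=hello"), Some("hello".to_string()));
    assert_eq!(secret_from_query("a=1&secret=hi"), Some("hi".to_string()));
    assert_eq!(secret_from_query("a=1"), None);
    println!("query parsing behaves as expected");
}
```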

Parsing JSON request body

On the homepage and /v1/show page, we load up the script.js file to talk to the Recaptcha servers, get a token, and then send the encrypted secrets and that token to the /v1/decrypt endpoint. This data is sent in a PUT request with a JSON request body. We call this a DecryptRequest, and once again we can use serde to handle all of the parsing:

#[derive(Deserialize)]
struct DecryptRequest {
    token: String,
    secrets: Vec<String>,
}

pub(crate) async fn decrypt(body: &str) -> (u16, String) {
    let decreq: DecryptRequest = match serde_json::from_str(body) {
        Ok(x) => x,
        Err(_) => return (400, "Invalid request".to_string()),
    };

    ...
}

At the beginning of this post, I mentioned the possibility of using the original JavaScript Request value instead of creating a simplified JSON representation of it. If we did so, we could call out to the json method instead. As it stands now, converting the request body to a String and parsing with serde works just fine.

I haven't looked into them myself, but there are certainly performance and code-size trade-offs to weigh when deciding which approach is best here.
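For reference, the wire format implied by the DecryptRequest struct, i.e. what script.js PUTs to /v1/decrypt, looks something like this (field values hypothetical):

```json
{
  "token": "recaptcha-token-from-the-browser",
  "secrets": ["<ciphertext 1>", "<ciphertext 2>"]
}
```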

Outgoing HTTP requests

The final major hurdle was making the outgoing HTTP request to the Recaptcha server. When I did my Hyper implementation of Sorta Secret, I used the surf crate, which seemed at first to have WASM support. Unfortunately, I ended up running into two major (and difficult to debug) issues trying to use Surf for the WASM part of this:

  • The Surf code assumes that there will be a Window, and panics if there isn't. Within Cloudflare, there isn't a Window available. Instead, I had to use a ServiceWorkerGlobalScope. Debugging this was again tricky because of the dropped error messages. But I eventually fixed this by tweaking the Surf codebase with a function like:

    pub fn worker_global_scope() -> Option<web_sys::ServiceWorkerGlobalScope> {
        js_sys::global().dyn_into::<web_sys::ServiceWorkerGlobalScope>().ok()
    }
    
  • However, once I did this, I kept getting 400 invalid parameter responses from the Recaptcha servers. I eventually spun up a local server to dump all request information, used ngrok to make that service available to Cloudflare, and pointed the code at that ngrok hostname. I found out that it wasn't sending any request body at all.

I dug through the codebase a bit, and eventually found issue #26, which demonstrated that body uploads weren't supported yet. I considered trying to patch the library to add that support, but after a few initial attempts it looked like that would require deeper modifications than I was ready to attempt.

So instead, I decided to go the opposite direction and directly call the fetch API myself via the web-sys crate. This involves the following steps:

  • Create a RequestInit value
  • Fill it with the appropriate request method and form data
  • Create a Request from that RequestInit and the Recaptcha URL
  • Get the global ServiceWorkerGlobalScope
  • Call fetch on it
  • Convert some Promises into Futures and .await them
  • Use serde to convert the JsValue containing the JSON response body into a VerifyResponse

Got that? Great! Putting all of that together looks like this:

use web_sys::{Request, RequestInit, Response};
let mut opts = RequestInit::new();
opts.method("POST");
let form_data = web_sys::FormData::new()?; // web-sys should really require mut here...
form_data.append_with_str("secret", &body.secret)?;
form_data.append_with_str("response", &body.response)?;
opts.body(Some(&form_data));
let request = Request::new_with_str_and_init(
    "https://www.google.com/recaptcha/api/siteverify",
    &opts,
)?;

request.headers().set("User-Agent", "sortasecret")?;

let global = worker_global_scope().ok_or(VerifyError::NoGlobal)?;
let resp_value = JsFuture::from(global.fetch_with_request(&request)).await?;
let resp: Response = resp_value.dyn_into()?;
let json = JsFuture::from(resp.json()?).await?;
let verres: VerifyResponse = json.into_serde()?;

Ok(verres)

And with that, all was well!

Surprises

I've called out a few of these above, but let me collect the main things that surprised me while implementing this.

  • The lack of error messages during the panic and async combo was a real killer. Maybe there's a way to improve that situation that I haven't figured out yet.
  • I was pretty surprised that getrandom would panic without the correct feature set.
  • I was also surprised that Surf silently dropped all form data, and implicitly expected a Window context that wasn't there.

On the Cloudflare side itself, the only real hurdles I hit were when it came to deploying to my own domain name instead of a workers.dev domain. The biggest gotcha was that I needed to fill in a dummy A record. I eventually found an explanation here. I got more confused during the debugging of this due to DNS propagation issues, but that's entirely my own fault.

Also, I shot myself in the foot with the route syntax in wrangler.toml. I had initially put www.sortasecret.com, which meant Workers handled only the homepage and passed requests for all other paths off to my original actix-web service. I changed my route to be:

route = "*sortasecret.com/*"

I don't really blame the Cloudflare docs for that; it's pretty well spelled out, but I did overlook it.

Once all of that was in place, it's wonderful to have access to the full suite of domain management tools for Cloudflare, such as HTTP to HTTPS redirection, and the ability to set virtual CNAMEs on the bare domain name. This made it trivial to set up my redirect from sortasecret.com to www.sortasecret.com.

Conclusion

I figured this rewrite would be a long one, and it was. I was unfamiliar with basically all of the technologies I ended up using: wasm-bindgen, Cloudflare Workers, and web-sys. Given all that, I'm not disappointed with the time investment.

If I was going to do this again, I'd probably factor out a significant number of common components to a cloudflare crate I could use, and provide things like:

  • More fully powered Request and Response types
  • A wrapper function to promote an async fn (Request) -> Result<Response, Box<dyn Error>> into something that can be exported by wasm-bindgen
  • Helper functions for the fetch API
  • Possibly wrap some of the other JavaScript and WASM APIs around things like JSON and crypto (though cryptoxide worked great for me)

With those tools in place, I would definitely consider using Cloudflare Workers like this again. The cost and maintenance benefits are great, the performance promises to be great, and I get to keep the safety guarantees I love about Rust.

Are others using Cloudflare Workers with Rust? Interested in it? Please let me know on Twitter.

And if your company is considering options in the DevOps, serverless, or Rust space, please consider reaching out to our team to find out how we can help you.

