I run a website for Haskellers. People are able to put their email addresses on this website for others to contact them. These email addresses were historically protected by Mailhide, which would use a Captcha to prevent bots from scraping that information. Unfortunately, Mailhide was shut down. And from there, Sorta Secret was born.
Sorta Secret provides a pretty simple service, as well as a simple API. Using the encrypt endpoint, you can get an encrypted version of your secret. Using the show endpoint, you can get a webpage that will decrypt the information after passing a Recaptcha. That's basically it. You can go to my Haskellers profile and click "Reveal email address" to see this in action.
I originally wrote Sorta Secret a year ago in Rust using actix-web and deployed it, like most services we write at FP Complete, to our Kubernetes cluster. When Rust 1.39 was released with async/await support, and then Hyper 0.13 was released using that support, I decided I wanted to try rewriting against Hyper. But that's a story for another time.
After that, more out of curiosity than anything else, I decided to rewrite it as a serverless application using Cloudflare Workers, a serverless platform that supports Rust and WASM. To quote the Cloudflare page on the topic:
Serverless computing is a method of providing backend services on an as-used basis. A Serverless provider allows users to write and deploy code without the hassle of worrying about the underlying infrastructure.
This post will describe my experiences doing this, what I thought worked well (and not so well), and why you may consider doing something like this yourself.
Advantages
Let me start off with the major advantages of using Cloudflare Workers over my previous setup:
- Geographic distribution: A typical hosting setup, including the Kubernetes cluster I deploy to, lives in a single geographic location. For an embarrassingly parallel application like this, having your code run in all of Cloudflare's data centers is pretty awesome.
- Setup time/cost: I already have access to a Kubernetes cluster. But for someone without a preexisting server or cluster to deploy to, the time to set up a secure, high-availability deployment environment, and the cost of running those machines, can be high. I'm currently paying $0 to host this service on Cloudflare.
- Ease of testing/deployment: The Cloudflare team has done a great job with the Wrangler tool. Deploying an update is a call to wrangler publish. I can do testing with wrangler preview --watch. This is pretty awesome. And the publishing is fast.
Disadvantages
There are definitely some hurdles to overcome along the way.
- Lack of examples: I found it very difficult to get even basic things working correctly. I'm hoping this post helps with that.
- WASM libraries didn't work perfectly: Most libraries designed to help with WASM are targeted at the browser. In a Cloudflare Worker, for example, there's no Window. Instead, to call fetch, I needed a ServiceWorkerGlobalScope.
- Slower dev cycle than I'd like: While wrangler preview is awesome, it still takes quite a bit of time to see a change. Each code change requires recompiling the Rust code, packaging up the bundle, sending it to Cloudflare, and refreshing the page. Especially since I was using compile-time-checked HTML templates, this ended up being pretty slow.
- Secrets management: Unlike Kubernetes, Cloudflare Workers has no built-in secrets management. Someone on the Cloudflare team advised me that I could use their key/value store for secrets. I elected to be really dumb and compile the secrets (encryption key and Recaptcha secret key) directly into the executable.
- Difficult debugging: It seems that the combination of async code, panics, and the bridge to JavaScript results in error messages getting completely dropped, which makes debugging very difficult.
That's enough motivation and demotivation for now. Let's see how this all fits together.
Getting started
The Cloudflare team has put together a very nice command line tool, wrangler, which happens to be written in Rust. Getting started with a brand new Cloudflare Workers Rust project is nice and easy; you don't even need to set up an account or provide any credentials.
cargo install wrangler
wrangler generate wasm-worker https://github.com/cloudflare/rustwasm-worker-template.git
cd wasm-worker
wrangler preview --watch
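For reference, the Rust side of the generated template boils down to roughly the following (paraphrased from the template; the exact contents may have changed since I wrote this):

use wasm_bindgen::prelude::*;

// The template exposes a single function to JavaScript via wasm-bindgen.
#[wasm_bindgen]
pub fn greet() -> String {
    "Hello, wasm-worker!".to_string()
}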
The problem is that this template doesn't do much. There's a Rust function called greet that returns a String. That Rust function is exposed to the JavaScript world via wasm-bindgen. There's a small JavaScript wrapper that imports that function and calls it when a new request comes in. However, we want to do a lot more in this application:
- Perform routing inside Rust
- Perform async operations (specifically making requests to the Recaptcha server)
- Generate more than just 200 success status responses
- Parse submitted JSON bodies
- Use HTML templating
So let's dive down the rabbit hole!
wasm-bindgen
I've played with WASM a bit before this project, but not much. Coming up to speed with wasm-bindgen was honestly pretty difficult for me, and involved a lot of trial and error. Ultimately, I discovered that I could probably get away with one of two approaches for the binding layer between the JavaScript and Rust worlds:
- Have a thin wrapper in JavaScript that produces simple JSON objects, and then use serde inside Rust to turn those into nice structs
- Use the Request and Response types in web-sys directly
I found the first approach first, and went with it. I briefly played with moving over to the second approach, but it involved overhauling a lot of code, so I ended up sticking with approach 1. Those more experienced with wasm-bindgen may disagree with that choice. Anyway, here's what the JavaScript half of this looks like:
const { respond_wrapper } = wasm_bindgen;
await wasm_bindgen(wasm);

var body;
if (request.body) {
    body = await request.text();
} else {
    body = "";
}

var headers = {};
for (var key of request.headers.keys()) {
    headers[key] = request.headers.get(key);
}

const response = await respond_wrapper({
    method: request.method,
    headers: headers,
    url: request.url,
    body: body,
});

return new Response(response.body, {
    status: response.status,
    headers: response.headers,
});
Some interesting things to note here:
- I'm pulling in the entire request body as a string. That works for our case (the only request body we handle is the JSON sent to the decrypt endpoint), but isn't intelligent enough in general.
- The respond_wrapper itself is returning a Promise on the JavaScript side. We're about to see some wasm-bindgen awesomeness.
- There's not much work to convert between the simplified JSON values and the real JavaScript objects.
Now let's look at the Rust side of the equation. First, we've got our Request and Response structs with appropriate serde deriving:
#[derive(Deserialize)]
pub struct Request {
    method: String,
    headers: HashMap<String, String>,
    url: String,
    body: String, // should really be Vec<u8>, I'm cheating here
}

#[derive(Serialize)]
pub struct Response {
    status: u16,
    headers: HashMap<String, String>,
    body: String,
}
Within the Rust world we want to deal exclusively with these types, and so our application lives inside a function with signature:
async fn respond(req: Request) -> Result<Response, Box<dyn std::error::Error>>
However, we can't export that to the JavaScript world. We need to ensure that our input and output types are things wasm-bindgen can handle. And to achieve that, we have a wrapper function that deals with the serde conversions and displaying the errors:
#[wasm_bindgen]
pub async fn respond_wrapper(req: JsValue) -> Result<JsValue, JsValue> {
    let req = req.into_serde().map_err(|e| e.to_string())?;
    let res = respond(req).await.map_err(|e| e.to_string())?;
    let res = JsValue::from_serde(&res).map_err(|e| e.to_string())?;
    Ok(res)
}
A wasm-bindgen function can accept JsValues (and lots of other types), and can return a Result<JsValue, JsValue>. In the case of an Err return, we'll get a runtime exception in the JavaScript world. We make our function pub so it can be exported. And by marking it async, we generate a Promise on the JavaScript side that can be awaited.
Other than that, it's some fairly standard serde stuff: converting from a JsValue into a Request via its Deserialize, and converting a Response into a JsValue via its Serialize. In between those, we call our actual respond function, and map all error values into a String representation.
Routing
Our respond function receives a Request, and that Request has a url: String field. I was able to pull in the url crate directly, and then use its Url struct for easier processing:
let url: url::Url = req.url.parse()?;
Also, I wanted all requests to land on the www.sortasecret.com subdomain, so I added a bare-domain redirect:
fn redirect_to_www(mut url: url::Url) -> Result<Response, url::ParseError> {
    url.set_host(Some("www.sortasecret.com"))?;
    let mut headers = HashMap::new();
    headers.insert("Location".to_string(), url.to_string());
    Ok(Response {
        status: 307,
        body: format!("Redirecting to {}", url),
        headers,
    })
}

if url.host_str() == Some("sortasecret.com") {
    return Ok(redirect_to_www(url)?);
}
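As a quick sanity check, here's a test sketch (mine, added for illustration, not code from the repository) showing that the redirect keeps the scheme, path, and query string intact while swapping in the www host:

#[test]
fn bare_domain_redirects_to_www() {
    let url: url::Url = "https://sortasecret.com/v1/show?secret=abc".parse().unwrap();
    let res = redirect_to_www(url).unwrap();
    assert_eq!(res.status, 307);
    assert_eq!(
        res.headers.get("Location").map(String::as_str),
        Some("https://www.sortasecret.com/v1/show?secret=abc")
    );
}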
This is already giving us some nice type safety guarantees from the Rust world, which I'm very happy to take advantage of. Next comes the routing itself. If I were more of a purist, I would make sure I was checking the request methods correctly, returning 405 "method not allowed" responses in some cases, and so on. Instead, I went for a very hacky implementation:
Ok(match (req.method == "GET", url.path()) {
    (true, "/") => html(200, server::homepage_html()?),
    (true, "/v1/script.js") => js(200, server::script_js()?),
    (false, "/v1/decrypt") => {
        let (status, body) = server::decrypt(&req.body).await;
        html(status, body)
    }
    (true, "/v1/encrypt") => {
        let (status, body) = server::encrypt(&req.url.parse()?)?;
        html(status, body)
    }
    (true, "/v1/show") => {
        let (status, body) = server::show_html(&req.url.parse()?)?;
        html(status, body)
    }
    (_method, path) => html(404, format!("Not found: {}", path)),
})
Which relies on some helper functions:
fn html(status: u16, body: String) -> Response {
    let mut headers = HashMap::new();
    headers.insert("Content-Type".to_string(), "text/html; charset=utf-8".to_string());
    Response { status, headers, body }
}

fn js(status: u16, body: String) -> Response {
    let mut headers = HashMap::new();
    headers.insert("Content-Type".to_string(), "text/javascript; charset=utf-8".to_string());
    Response { status, headers, body }
}
Let's dig in on some of these route handlers.
Templating
I'm using the askama crate for templating. It provides compile-time-parsed templates. For me, this is great because:
- Errors are caught at compile time
- Fewer files need to be shipped to the deployed system
The downside is you have to go through a complete compile/link step before you can see your changes.
I'm happy to report that there were absolutely no issues using askama on this project. It compiled for WASM without any code changes.
I have just one HTML template, which I use for both the homepage and the /v1/show route. There is only one variable in the template: the encrypted secret value. In the case of the homepage, we use some default message. For /v1/show, we use the value provided by the query string. Let's look at the entirety of the homepage logic:
#[derive(Template)]
#[template(path = "homepage.html")]
struct Homepage {
    secret: String,
}

fn make_homepage(keypair: &Keypair) -> Result<String, Box<dyn std::error::Error>> {
    Ok(Homepage {
        secret: keypair.encrypt("The secret message has now been decrypted, congratulations!")?,
    }.render()?)
}
Virtually all of the work is handled for us by askama itself. I defined a struct, added a few attributes, and then called render() on the value. Easy! I won't bore you with the details of the HTML here, but if you want, feel free to check out homepage.html on Github.
The story for script.js is similar, except it takes the Recaptcha site key as a variable.
#[derive(Template)]
#[template(path = "script.js", escape = "none")]
struct Script<'a> {
    site: &'a str,
}

pub(crate) fn script_js() -> Result<String, askama::Error> {
    Script {
        site: super::secrets::RECAPTCHA_SITE,
    }.render()
}
Cryptography
When I originally wrote Sorta Secret using actix-web, I used the sodiumoxide crate to access the sealedbox approach within libsodium. This provides a public key-based method of encrypting a secret. Unfortunately, sodiumoxide didn't compile trivially for WASM, which isn't surprising given that it's a binding to a C library. It may have been possible to brute force my way through this, but I decided to take a different approach.
Instead, I moved over to the pure-Rust cryptoxide crate. It doesn't provide the same high-level APIs as sodiumoxide, but it does provide chacha20poly1305, which is more than enough to implement symmetric-key encryption.
This meant I also needed to generate some random values to create nonces, which was my first debugging nightmare. I used the getrandom crate to generate the random values, and initially added the dependency as:
getrandom = "0.1.13"
I naively assumed that it would automatically turn on the correct set of features to use WASM-relevant random data sources. Unfortunately, that wasn't the case. Instead, the calls to getrandom would simply panic about an unsupported backend. And while Cloudflare's preview system overall gives a great experience with error messages, the combination of a panic and a Promise meant that the exception was lost. By temporarily turning off the async bits and some other hacky workarounds, I eventually found out what the problem was, and fixed it by replacing the above line with:
getrandom = { version = "0.1.13", features = ["wasm-bindgen"] }
If you're curious, you can check out the encrypt and decrypt methods on Github. One pleasant finding was that, once I got the code compiling, all of the tests passed the first time, which is always an experience I strive for in strongly typed languages.
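To give a feel for the shape of that code without reproducing it here, this is a rough sketch of symmetric encryption with cryptoxide's chacha20poly1305 plus a getrandom-generated nonce. It is not the actual Sorta Secret implementation: the helper name, the nonce-then-tag-then-ciphertext layout, and the error handling are all mine, and it assumes the cryptoxide 0.1 API that was current when I wrote this.

use cryptoxide::aead::AeadEncryptor;
use cryptoxide::chacha20poly1305::ChaCha20Poly1305;

const NONCE_LEN: usize = 8;
const TAG_LEN: usize = 16;

// Hypothetical helper: encrypt a plaintext under a 256-bit key, returning
// nonce || tag || ciphertext. The real encrypt/decrypt methods on Github differ.
fn seal(key: &[u8; 32], plaintext: &[u8]) -> Result<Vec<u8>, getrandom::Error> {
    // This is the getrandom call that panicked until the wasm-bindgen feature was enabled.
    let mut nonce = [0u8; NONCE_LEN];
    getrandom::getrandom(&mut nonce)?;

    let mut cipher = ChaCha20Poly1305::new(key, &nonce, &[]);
    let mut ciphertext = vec![0u8; plaintext.len()];
    let mut tag = [0u8; TAG_LEN];
    cipher.encrypt(plaintext, &mut ciphertext, &mut tag);

    let mut out = Vec::with_capacity(NONCE_LEN + TAG_LEN + ciphertext.len());
    out.extend_from_slice(&nonce);
    out.extend_from_slice(&tag);
    out.extend_from_slice(&ciphertext);
    Ok(out)
}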
Parsing query strings
Both the /v1/encrypt and /v1/show endpoints take a single query string parameter, secret. In the case of encrypt, this is a plaintext value. In the case of show, it's the encrypted ciphertext. However, they both initially parse to a String, so I used the same (poorly named) struct to handle parsing both of them. If you remember from before, I already parsed the requested URL into a url::Url value. Using serde_urlencoded makes it easy to throw all of this together:
#[derive(Deserialize, Debug)]
struct EncryptRequest {
    secret: String,
}

impl EncryptRequest {
    fn from_url(url: &url::Url) -> Option<Self> {
        serde_urlencoded::from_str(url.query()?).ok()
    }
}
Using this from the encrypt endpoint looks like this:
pub(crate) fn encrypt(url: &url::Url) -> Result<(u16, String), Box<dyn std::error::Error>> {
    match EncryptRequest::from_url(url) {
        Some(encreq) => {
            let keypair = make_keypair()?;
            let encrypted = keypair.encrypt(&encreq.secret)?;
            Ok((200, encrypted))
        }
        None => Ok((400, "Invalid parameters".into())),
    }
}
Feel free to check out the show_html endpoint too.
Parsing JSON request body
On the homepage and the /v1/show page, we load up the script.js file to talk to the Recaptcha servers, get a token, and then send the encrypted secrets and that token to the /v1/decrypt endpoint. This data is sent in a PUT request with a JSON request body. We call this a DecryptRequest, and once again we can use serde to handle all of the parsing:
#[derive(Deserialize)]
struct DecryptRequest {
    token: String,
    secrets: Vec<String>,
}

pub(crate) async fn decrypt(body: &str) -> (u16, String) {
    let decreq: DecryptRequest = match serde_json::from_str(body) {
        Ok(x) => x,
        Err(_) => return (400, "Invalid request".to_string()),
    };
    ...
}
At the beginning of this post, I mentioned the possibility of using the original JavaScript Request value instead of creating a simplified JSON representation of it. If we did so, we could call out to its json method instead. As it stands now, converting the request body to a String and parsing with serde works just fine.
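For comparison, parsing the body via that json method would look something like this sketch (not how the code above is structured; the helper is hypothetical):

use wasm_bindgen::JsValue;
use wasm_bindgen_futures::JsFuture;

// Hypothetical alternative: read the JSON body straight off a web_sys::Request
// instead of stringifying it in the JavaScript shim first.
async fn parse_decrypt_request(req: &web_sys::Request) -> Result<DecryptRequest, JsValue> {
    let json = JsFuture::from(req.json()?).await?;
    json.into_serde().map_err(|e| JsValue::from(e.to_string()))
}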
I haven't measured them myself, but there are certainly performance and code size trade-offs to consider when deciding which approach is best here.
Outgoing HTTP requests
The final major hurdle was making the outgoing HTTP request to the Recaptcha server. When I did my Hyper implementation of Sorta Secret, I used the surf crate, which seemed at first to have WASM support. Unfortunately, I ended up running into two major (and difficult to debug) issues trying to use Surf for the WASM part of this:
- The Surf code assumes that there will be a Window, and panics if there isn't. Within Cloudflare, there isn't a Window available. Instead, I had to use a ServiceWorkerGlobalScope. Debugging this was again tricky because of the dropped error messages. But I eventually fixed this by tweaking the Surf codebase with a function like:
pub fn worker_global_scope() -> Option<web_sys::ServiceWorkerGlobalScope> {
    js_sys::global().dyn_into::<web_sys::ServiceWorkerGlobalScope>().ok()
}
- However, once I did this, I kept getting 400 "invalid parameter" responses from the Recaptcha servers. I eventually spun up a local server to dump all request information, used ngrok to make that service available to Cloudflare, and pointed the code at that ngrok hostname. I found out that it wasn't sending any request body at all.
I dug through the codebase a bit, and eventually found issue #26, which demonstrated that body uploads weren't supported yet. I considered trying to patch the library to add that support, but after a few initial attempts it looked like that would require deeper modifications than I was ready to attempt.
So instead, I decided to go the opposite direction and directly call the fetch API myself via the web-sys crate. This involves the following steps:
- Create a RequestInit value
- Fill it with the appropriate request method and form data
- Create a Request from that RequestInit and the Recaptcha URL
- Get the global ServiceWorkerGlobalScope
- Call fetch on it
- Convert some Promises into Futures and .await them
- Use serde to convert the JsValue containing the JSON response body into a VerifyResponse
Got that? Great! Putting all of that together looks like this:
use web_sys::{Request, RequestInit, Response};

let mut opts = RequestInit::new();
opts.method("POST");

let form_data = web_sys::FormData::new()?; // web-sys should really require mut here...
form_data.append_with_str("secret", body.secret)?;
form_data.append_with_str("response", &body.response)?;
opts.body(Some(&form_data));

let request = Request::new_with_str_and_init(
    "https://www.google.com/recaptcha/api/siteverify",
    &opts,
)?;
request.headers().set("User-Agent", "sortasecret")?;

let global = worker_global_scope().ok_or(VerifyError::NoGlobal)?;
let resp_value = JsFuture::from(global.fetch_with_request(&request)).await?;
let resp: Response = resp_value.dyn_into()?;
let json = JsFuture::from(resp.json()?).await?;
let verres: VerifyResponse = json.into_serde()?;
Ok(verres)
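One piece not shown above is the VerifyResponse struct itself. It's just another serde Deserialize target; a minimal sketch, based on the fields Recaptcha documents for its siteverify response (the real struct in the repository may differ), looks like:

#[derive(Deserialize)]
struct VerifyResponse {
    success: bool,
    // Only present for Recaptcha v3 responses.
    score: Option<f64>,
    #[serde(rename = "error-codes")]
    error_codes: Option<Vec<String>>,
}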
And with that, all was well!
Surprises
I've called out a few of these above, but let me collect some of the points that surprised me while implementing this.
- The lack of error messages during the panic and async combo was a real killer. Maybe there's a way to improve that situation that I haven't figured out yet.
- I was pretty surprised that getrandom would panic without the correct feature set.
- I was also surprised that Surf silently dropped all form data, and implicitly expected a Window context that wasn't there.
On the Cloudflare side itself, the only real hurdles I hit were when it came to deploying to my own domain name instead of a workers.dev domain. The biggest gotcha was that I needed to fill in a dummy A record. I eventually found an explanation here. I got more confused during the debugging of this due to DNS propagation issues, but that's entirely my own fault.
Also, I shot myself in the foot with the route syntax in the wrangler.toml. I had initially put www.sortasecret.com, which meant it used Workers to handle the homepage, but passed off requests for all other paths to my original actix-web service. I changed my route to be:
route = "*sortasecret.com/*"
I don't really blame the Cloudflare docs for that; it's pretty well spelled out, but I did overlook it.
Once all of that was in place, it was wonderful to have access to the full suite of domain management tools Cloudflare provides, such as HTTP-to-HTTPS redirection and the ability to set virtual CNAMEs on the bare domain name. This made it trivial to set up my redirect from sortasecret.com to www.sortasecret.com.
Conclusion
I figured this rewrite would be a long one, and it was. I was unfamiliar with basically all of the technologies I ended up using: wasm-bindgen, Cloudflare Workers, and web-sys. Given all that, I'm not disappointed with the time investment.
If I were going to do this again, I'd probably factor out a significant number of common components into a cloudflare crate I could reuse, providing things like:
- More fully powered Request and Response types
- A wrapper function to promote an async fn (Request) -> Result<Response, Box<dyn Error>> into something that can be exported by wasm-bindgen (sketched below)
- Helper functions for the fetch API
- Possibly wrappers for some of the other JavaScript and WASM APIs around things like JSON and crypto (though cryptoxide worked great for me)
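For example, the wrapper function from that wishlist might have a shape roughly like this (purely speculative; no such crate exists as far as I know):

use wasm_bindgen::JsValue;

// Speculative: generalize respond_wrapper so any handler with this shape
// can be bridged to JavaScript the same way.
pub async fn run_handler<F, Fut>(req: JsValue, handler: F) -> Result<JsValue, JsValue>
where
    F: FnOnce(Request) -> Fut,
    Fut: std::future::Future<Output = Result<Response, Box<dyn std::error::Error>>>,
{
    let req: Request = req.into_serde().map_err(|e| e.to_string())?;
    let res = handler(req).await.map_err(|e| e.to_string())?;
    JsValue::from_serde(&res).map_err(|e| JsValue::from(e.to_string()))
}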
With those tools in place, I would definitely consider using Cloudflare Workers like this again. The cost and maintenance benefits are great, the performance promises to be great, and I get to keep the safety guarantees I love about Rust.
Are others using Cloudflare Workers with Rust? Interested in it? Please let me know on Twitter.
And if your company is considering options in the DevOps, serverless, or Rust space, please consider reaching out to our team to find out how we can help you.