Most of the web services I've written in Rust have used actix-web
. Recently, I needed to write something that will provide some reverse proxy functionality. I'm more familiar with the hyper-powered HTTP client libraries (reqwest
in particular). I decided this would be a good time to experiment again with hyper on the server side as well. The theory was that having matching Request
and Response
types between the client and server would work nicely. And it certainly did.
In the process, I ended up with an interesting example of battling ownership through closures and async blocks. This is a topic I typically mention in my Rust training sessions as the hardest thing I had to learn when learning Rust. So I figure a blog post demonstrating one of these crazy cases would be worthwhile.
Side note: If you're interested in learning more about Rust, we'll be offering a free Rust training course in December. Sign up for more information.
Cargo.toml
If you want to play along, you should start off with a cargo new
. I'm using the following [dependencies]
in my Cargo.toml
[dependencies]
hyper = "0.13"
tokio = { version = "0.2", features = ["full"] }
log = "0.4.11"
env_logger = "0.8.1"
hyper-tls = "0.4.3"
I'm also compiling with Rust version 1.47.0. If you'd like, you can add 1.47.0
to your rust-toolchain
. And finally, my full Cargo.lock
is available as a Gist.
Basic web service
To get started with a hyper-powered web service, we can use the example straight from the hyper homepage:
use std::{convert::Infallible, net::SocketAddr};
use hyper::{Body, Request, Response, Server};
use hyper::service::{make_service_fn, service_fn};
async fn handle(_: Request<Body>) -> Result<Response<Body>, Infallible> {
Ok(Response::new("Hello, World!".into()))
}
#[tokio::main]
async fn main() {
let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
let make_svc = make_service_fn(|_conn| async {
Ok::<_, Infallible>(service_fn(handle))
});
let server = Server::bind(&addr).serve(make_svc);
if let Err(e) = server.await {
eprintln!("server error: {}", e);
}
}
It's worth explaining this a little bit, since at least in my opinion the distinction between make_service_fn
and service_fn
wasn't clear. There are two different things we're trying to create here:
- A
MakeService
, which takes a &AddrStream
and gives back a Service
- A
Service
, which takes a Request
and gives back a Response
This glosses over a number of details, such as:
- Error handling
- Everything is async (
Future
s are everywhere)
- Everything is expressed in terms of general purpose
trait
s
To help us with that "glossing", hyper provides two convenience functions for creating MakeService
and Service
values, make_service_fn
and service_fn
. Each of these will convert a closure into their respective types. Then the MakeService
closure can return a Service
value, and the MakeService
value can be provided to hyper::server::Builder::serve
. Let's get even more concrete from the code above:
async fn handle(_: Request<Body>) -> Result<Response<Body>, Infallible> {...}
let make_svc = make_service_fn(|_conn| async {
Ok::<_, Infallible>(service_fn(handle))
});
The handle
function takes a Request<Body>
and returns a Future<Output=Result<Response<Body, Infallible>>>
. The Infallible
is a nice way of saying "no errors can possibly occur here." The type signatures at play require that we use a Result
, but morally Result<T, Infallible>
is equivalent to T
.
service_fn
converts this handle
value into a Service
value. This new value implements all of the appropriate traits to satisfy the requirements of make_service_fn
and serve
. We wrap up that new Service
in its own Result<_, Infallible>
, ignore the input &AddrStream
value, and pass all of this to make_service_fn
. make_svc
is now a value that can be passed to serve
, and we have "Hello, world!"
And if all of this seems a bit complicated for a "Hello world," you may understand why there are lots of frameworks built on top of hyper to make it easier to work with. Anyway, onwards!
Initial reverse proxy
Next up, we want to modify our handle
function to perform a reverse proxy instead of returning the "Hello, World!" text. For this example, we're going to hard-code https://www.fpcomplete.com
as the destination site for this reverse proxy. To make this happen, we'll need to:
- Construct a
Request
value, based on the incoming Request
's request headers and path, but targeting the www.fpcomplete.com
server
- Construct a
Client
value from hyper with TLS support
- Perform the request
- Return the
Response
as the response from handle
- Introduce error handling
I'm also going to move over to the env-logger
and log
crates for producing output. I did this when working on the code myself, and switching to RUST_LOG=debug
was a great way to debug things. (When I was working on this, I forgot I needed to create a special Client
with TLS support.)
So from the top! We now have the following use
statements:
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Client, Request, Response, Server};
use hyper_tls::HttpsConnector;
use std::net::SocketAddr;
We next have three constants. The SCHEME
and HOST
are pretty self-explanatory: the hardcoded destination.
const SCHEME: &str = "https";
const HOST: &str = "www.fpcomplete.com";
Next we have some HTTP request headers that should not be forwarded onto the destination server. This blacklist approach to HTTP headers in reverse proxies works well enough. It's probably a better idea in general to follow a whitelist approach. In any event, these six headers have the potential to change behavior at the transport layer, and therefore cannot be passed on from the client:
/// HTTP headers to strip, a whitelist is probably a better idea
const STRIPPED: [&str; 6] = [
"content-length",
"transfer-encoding",
"accept-encoding",
"content-encoding",
"host",
"connection",
];
And next we have a fairly boilerplate error type definition. We can generate a hyper::Error
when performing the HTTP request to the destination server, and a hyper::http::Error
when constructing the new Request
. Arguably we should simply panic if the latter error occurs, since it indicates programmer error. But I've decided to treat it as its own error variant. So here's some boilerplate!
#[derive(Debug)]
enum ReverseProxyError {
Hyper(hyper::Error),
HyperHttp(hyper::http::Error),
}
impl From<hyper::Error> for ReverseProxyError {
fn from(e: hyper::Error) -> Self {
ReverseProxyError::Hyper(e)
}
}
impl From<hyper::http::Error> for ReverseProxyError {
fn from(e: hyper::http::Error) -> Self {
ReverseProxyError::HyperHttp(e)
}
}
impl std::fmt::Display for ReverseProxyError {
fn fmt(&self, fmt: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(fmt, "{:?}", self)
}
}
impl std::error::Error for ReverseProxyError {}
With all of this in place, we can finally start writing our handle
function:
async fn handle(mut req: Request<Body>) -> Result<Response<Body>, ReverseProxyError> {
}
We're going to mutate the incoming Request
to have our new destination, and then pass it along to the destination server. This is where the beauty of using hyper for client and server comes into play: no need to futz around with changing body or header representations. The first thing we do is strip out any of the STRIPPED
request headers:
let h = req.headers_mut();
for key in &STRIPPED {
h.remove(*key);
}
Next, we're going to construct the new request URI by combining:
- The hard-coded scheme (
https
)
- The hard-coded authority (
www.fpcomplete.com
)
- The path and query from the incoming request
let mut builder = hyper::Uri::builder()
.scheme(SCHEME)
.authority(HOST);
if let Some(pq) = req.uri().path_and_query() {
builder = builder.path_and_query(pq.clone());
}
*req.uri_mut() = builder.build()?;
Panicking if req.uri().path_and_query()
is None
would be appropriate here, but as is my wont, I'm avoiding panics if possible. Next, for good measure, let's add in a little bit of debug output:
log::debug!("request == {:?}", req);
Now we can construct our Client
value to perform the HTTPS request:
let https = HttpsConnector::new();
let client = Client::builder().build(https);
And finally, let's perform the request, log the response, and return the response:
let response = client.request(req).await?;
log::debug!("response == {:?}", response);
Ok(response)
Our main
function looks pretty similar to what we had before. I've added in initialization of env-logger
with a default to info
level output, and modified the program to abort
if the server produces any errors:
#[tokio::main]
async fn main() {
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
let addr = SocketAddr::from(([0, 0, 0, 0], 3000));
let make_svc = make_service_fn(|_conn| async {
Ok::<_, ReverseProxyError>(service_fn(handle))
});
let server = Server::bind(&addr).serve(make_svc);
log::info!("Server started, bound on {}", addr);
if let Err(e) = server.await {
log::error!("server error: {}", e);
std::process::abort();
}
}
The full code is available as a Gist. This program works as expected, and if I cargo run
it and connect to http://localhost:3000
, I see the FP Complete homepage. Yay!
Wasteful Client
The problem with this program is that it constructs a brand new Client
value on every incoming request. That's expensive. Instead, we would like to produce the Client
once, in main
, and reuse it for each request. And herein lies the ownership puzzle. While we're at this, let's move away from using const
s for the scheme and host, and instead bundle together the client, scheme, and host into a new struct
:
struct ReverseProxy {
scheme: String,
host: String,
client: Client<HttpsConnector<hyper::client::HttpConnector>>,
}
Next, we'll want to change handle
from a standalone function to a method on ReverseProxy
. (We could equivalently pass in a reference to ReverseProxy
for handle
, but this feels more idiomatic):
impl ReverseProxy {
async fn handle(&self, mut req: Request<Body>) -> Result<Response<Body>, ReverseProxyError> {
...
}
}
Then, within handle
, we can replace SCHEME
and HOST
with &*self.scheme
and &*self.host
. You may be wondering "why &*
and not &
." Without &*
, you'll get an error message:
error[E0277]: the trait bound `hyper::http::uri::Scheme: std::convert::From<&std::string::String>` is not satisfied
--> src\main.rs:59:14
|
59 | .scheme(&self.scheme)
| ^^^^^^ the trait `std::convert::From<&std::string::String>` is not implemented for `hyper::http::uri::Scheme`
This is one of those examples where the magic of deref coercion seems to fall apart. Personally, I prefer using self.scheme.as_str()
instead of &*self.scheme
to be more explicit, but &*self.scheme
is likely more idiomatic.
Anyway, the final change within handle
is to remove the let https = ...;
and let client = ...;
statements, and instead construct our response with:
let response = self.client.request(req).await?;
With that, our handle
method is done, and we can focus our efforts on the true puzzle: the main
function itself.
The easy part
The easy part of this is great: construct a ReverseProxy
value, and provide the make_svc
to serve
:
#[tokio::main]
async fn main() {
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
let addr = SocketAddr::from(([0, 0, 0, 0], 3000));
let https = HttpsConnector::new();
let client = Client::builder().build(https);
let rp = ReverseProxy {
client,
scheme: "https".to_owned(),
host: "www.fpcomplete.com".to_owned(),
};
// here be dragons
let server = Server::bind(&addr).serve(make_svc);
log::info!("Server started, bound on {}", addr);
if let Err(e) = server.await {
log::error!("server error: {}", e);
std::process::abort();
}
}
That middle part is where the difficulty lies. Previously, this code looked like:
let make_svc = make_service_fn(|_conn| async {
Ok::<_, ReverseProxyError>(service_fn(handle))
});
We no longer have a handle
function. Working around that little enigma doesn't seem so bad initially. We'll create a closure as the argument to service_fn
:
let make_svc = make_service_fn(|_conn| async {
Ok::<_, ReverseProxyError>(service_fn(|req| {
rp.handle(req)
}))
});
While that looks appealing, it fails lifetimes completely:
error[E0597]: `rp` does not live long enough
--> src\main.rs:90:13
|
88 | let make_svc = make_service_fn(|_conn| async {
| ____________________________________-------_-
| | |
| | value captured here
89 | | Ok::<_, ReverseProxyError>(service_fn(|req| {
90 | | rp.handle(req)
| | ^^ borrowed value does not live long enough
91 | | }))
92 | | });
| |_____- returning this value requires that `rp` is borrowed for `'static`
...
101 | }
| - `rp` dropped here while still borrowed
Nothing in the lifetimes of these values tells us that the ReverseProxy
value will outlive the service. We cannot simply borrow a reference to ReverseProxy
inside our closure. Instead, we're going to need to move ownership of the ReverseProxy
to the closure.
let make_svc = make_service_fn(|_conn| async {
Ok::<_, ReverseProxyError>(service_fn(move |req| {
rp.handle(req)
}))
});
Note the addition of move
in front of the closure. Unfortunately, this doesn't work, and gives us a confusing error message:
error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
--> src\main.rs:90:16
|
90 | rp.handle(req)
| ^^^^^^
|
note: first, the lifetime cannot outlive the lifetime `'_` as defined on the body at 89:47...
--> src\main.rs:89:47
|
89 | Ok::<_, ReverseProxyError>(service_fn(move |req| {
| ^^^^^^^^^^
note: ...so that closure can access `rp`
--> src\main.rs:90:13
|
90 | rp.handle(req)
| ^^
= note: but, the lifetime must be valid for the static lifetime...
note: ...so that the type `hyper::proto::h2::server::H2Stream<impl std::future::Future, hyper::Body>` will meet its required lifetime bounds
--> src\main.rs:94:38
|
94 | let server = Server::bind(&addr).serve(make_svc);
| ^^^^^
error: aborting due to previous error
Instead of trying to parse that, let's take a step back, reassess, and then try again.
So many layers!
Remember way back to the beginning of this post. I went into some details around the process of having a MakeService
, which would be run for each new incoming connection, and a Service
, which would be run for each new request on an existing connection. The way we've written things so far, the first time we handle a request, that request handler will consume the ReverseProxy
. That means that we would have a use-after-move for each subsequent request on that connection. We'd also have a use-after-move for each subsequent connection we receive.
We want to share our ReverseProxy
across multiple different MakeService
and Service
instantiations. Since this will occur across multiple system threads, the most straightforward way to handle this is to wrap our ReverseProxy
in an Arc
:
let rp = std::sync::Arc::new(ReverseProxy {
client,
scheme: "https".to_owned(),
host: "www.fpcomplete.com".to_owned(),
});
Now we're going to need to play around with clone
ing this Arc
at appropriate times. In particular, we'll need to clone twice: once inside the make_service_fn
closure, and once inside the service_fn
closure. This will ensure that we never move the ReverseProxy
value out of the closure's environment, and that our closure can remain a FnMut
instead of an FnOnce
.
And, in order to make that happen, we'll need to convince the compiler through appropriate usages of move
to move ownership of the ReverseProxy
, instead of borrowing a reference to a value with a different lifetime. This is where the fun begins! Let's go through a series of modifications until we get to our final mind-bender.
Adding move
To recap, we'll start with this code:
let rp = std::sync::Arc::new(ReverseProxy {
client,
scheme: "https".to_owned(),
host: "www.fpcomplete.com".to_owned(),
});
let make_svc = make_service_fn(|_conn| async {
Ok::<_, ReverseProxyError>(service_fn(|req| {
rp.handle(req)
}))
});
The first thing I tried was adding an rp.clone()
inside the first async
block:
let make_svc = make_service_fn(|_conn| async {
let rp = rp.clone();
Ok::<_, ReverseProxyError>(service_fn(|req| {
rp.handle(req)
}))
});
This doesn't work, presumably because I need to stick some move
s on the initial closure and async
block like so:
let make_svc = make_service_fn(move |_conn| async move {
let rp = rp.clone();
Ok::<_, ReverseProxyError>(service_fn(|req| {
rp.handle(req)
}))
});
This unfortunately still doesn't work, and gives me the error message:
error[E0507]: cannot move out of `rp`, a captured variable in an `FnMut` closure
--> src\main.rs:88:60
|
82 | let rp = std::sync::Arc::new(ReverseProxy {
| -- captured outer variable
...
88 | let make_svc = make_service_fn(move |_conn| async move {
| ____________________________________________________________^
89 | | let rp = rp.clone();
| | --
| | |
| | move occurs because `rp` has type `std::sync::Arc<ReverseProxy>`, which does not implement the `Copy` trait
| | move occurs due to use in generator
90 | | Ok::<_, ReverseProxyError>(service_fn(|req| {
91 | | rp.handle(req)
92 | | }))
93 | | });
| |_____^ move out of `rp` occurs here
It took me a while to grok what was happening. And in fact, I'm not 100% certain I grok it yet. But I believe what is happening is:
- The closure grabs ownership of
rp
(good)
- The
async
block grabs ownership of rp
, which seemed good, but isn't
- Inside the
async
block, we make a clone of rp
- When the
async
block is dropped, its ownership of the original rp
is dropped
- Since the
rp
was moved out of the closure, the closure is now an FnOnce
and cannot be called a second time
That's no good! It turns out the trick to fixing this isn't so difficult. Don't grab ownership in the async
block. Instead, clone the rp
in the closure, before the async
block:
let make_svc = make_service_fn(move |_conn| {
let rp = rp.clone();
async move {
Ok::<_, ReverseProxyError>(service_fn(|req| {
rp.handle(req)
}))
}
});
Woohoo! One clone
down. This code still doesn't compile, but we're closer. The next change to make is simple: stick a move
on the inner closure:
let make_svc = make_service_fn(move |_conn| {
let rp = rp.clone();
async move {
Ok::<_, ReverseProxyError>(service_fn(move |req| {
rp.handle(req)
}))
}
});
This also fails, but going back to our description before, it's easy to see why. We still need a second clone
, to make sure we aren't moving the ReverseProxy
value out of the closure. Making that change is easy, but unfortunately doesn't fully solve our problem. This code:
let make_svc = make_service_fn(move |_conn| {
let rp = rp.clone();
async move {
Ok::<_, ReverseProxyError>(service_fn(move |req| {
let rp = rp.clone();
rp.handle(req)
}))
}
});
Still gives us the error message:
error[E0515]: cannot return value referencing local variable `rp`
--> src\main.rs:93:17
|
93 | rp.handle(req)
| --^^^^^^^^^^^^
| |
| returns a value referencing data owned by the current function
| `rp` is borrowed here
What's going on here?
Did your Future borrow my reference?
Again, referring to the introduction, I mentioned that the service_fn
parameter had to return a Future<Output...>
. This is an example of the impl Trait
approach. I've previously blogged about ownership and impl trait. There are some pain points around this combination. And we've hit one of them.
The return type of our handle
method doesn't indicate what underlying type is implementing Future
. That underlying implementation may choose to hold onto references passed into the handle
method. That would include references to &self
. And that means if we return that Future
outside of our closure, a reference may outlive the value.
I can think of two ways to solve this problem, though there are probably more. The first one I'll show you isn't the one I prefer, but is the one that likely gets the idea across more clearly. Our handle
method is taking a reference to ReverseProxy
. But if it didn't take a reference, and instead received the ReverseProxy
by move, there would be no references to accidentally end up in the Future
.
Cloning the ReverseProxy
itself is expensive. Fortunately, we have another option: pass in the Arc<ReverseProxy>
!
impl ReverseProxy {
async fn handle(self: std::sync::Arc<Self>, mut req: Request<Body>) -> Result<Response<Body>, ReverseProxyError> {
...
}
}
Without changing any code inside the handle
method or the main
function, this compiles and behaves correctly. But like I said: I don't like it very much. This is limiting the generality of our handle
method. It feels like putting the complexity in the wrong place. (Maybe you'll disagree and say that this is the better solution. That's fine, I'd be really interested to hear people's thoughts.)
Instead, another possibility is to introduce an async move
inside main
. This will take ownership of the Arc<ReverseProxy>
, and ensure that it lives as long as the Future
generated by that async move
block itself. This solution looks like this:
let make_svc = make_service_fn(move |_conn| {
let rp = rp.clone();
async move {
Ok::<_, ReverseProxyError>(service_fn(move |req| {
let rp = rp.clone();
async move { rp.handle(req).await }
}))
}
});
We need to call .await
inside the async
block to ensure we don't return a future-of-a-future. But with that change, everything works. I'm not terribly thrilled with this. It feels like an ugly hack. I don't have any recommendations, but I hope there are improvements to the impl Trait
ownership story in the future.
One final improvement
One final tweak. We put async move
after the first rp.clone()
originally. This helped make the error messages more tractable. But it turns out that that move
isn't doing anything useful. The move
on the inner closure
already forces a move of the cloned rp
. So we can simplify our code by removing just one move
:
let make_svc = make_service_fn(move |_conn| {
let rp = rp.clone();
async {
Ok::<_, ReverseProxyError>(service_fn(move |req| {
let rp = rp.clone();
async move { rp.handle(req).await }
}))
}
});
This final version of the code is available as a Gist too.
Conclusion
I hope this was a fun trip down ownership lane. If this seemed overly complicated, keep in mind a few things:
- It's strongly recommended to use a higher level web framework when writing server side code in Rust
- We ended up implementing something pretty sophisticated in about 100 lines of code
- Hopefully the story around ownership and
impl Trait
will improve over time
If you enjoyed this, you may want to check out our Rust Crash Course, or sign up for our free December Rust training course.
Subscribe to our blog via email
Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.
Do you like this blog post and need help with Next Generation Software Engineering, Platform Engineering or Blockchain & Smart Contracts? Contact us.