Let’s talk about exceptions. Programs do a thing successfully
all the time, except sometimes when things don’t work out. So we
have “exceptions”, which, like anything fun in programming
languages, were invented in Lisp in the 1960s.
They’re not perfect in Haskell. They’re not perfect in any
language, really. We’re always making adjustments to what we think
is the best way to handle exceptional circumstances. We’re muddling
through, as a field. Let’s discuss the options in Haskell.
There have been other blog posts which discuss how to raise
exceptions in Haskell, from using the IO exceptions (throw), to
ExceptT-like monad transformers, to just returning an
Either Error a, but we’re going to look at how we define the
exception type itself.
Use String
A very common way to define exceptions in Haskell is to use the
String type (aka [Char]): yes, that list of Char that we abuse for
everything. There are actually many advantages to this:
- The first advantage is that it is built into the Haskell
language; it’s available from any package anywhere.
- It’s already a human readable error message, by
definition!
- You can already throw this using the error function, which comes
in the Prelude module.
- Furthermore, many packages in Haskell such as parsec and
attoparsec, among others, already use the String type for providing
error messages.
- Any instance of Exception can be converted to a String using
either show or displayException, meaning you can easily include
other exception types in your representation.
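The points above can be sketched in a few lines. This is only an illustration: parseAge, ageOrDie, describeFailure and demo are hypothetical names made up for this example, while error, try, evaluate and displayException are the real functions from base.

```haskell
import Control.Exception (SomeException, displayException, evaluate, try)

-- Hypothetical helper: report failure as a plain String.
parseAge :: String -> Either String Int
parseAge s = case reads s of
  [(n, "")] | n >= 0 -> Right n
  _                  -> Left ("not a valid age: " ++ show s)

-- Throw the String as an impure exception with 'error' from the Prelude.
ageOrDie :: String -> Int
ageOrDie = either error id . parseAge

-- Flatten any exception into a String with 'displayException'.
describeFailure :: IO a -> IO (Either String a)
describeFailure action = do
  result <- try action
  pure $ case result of
    Left e  -> Left (displayException (e :: SomeException))
    Right a -> Right a

-- Usage: force the pure value inside IO so the exception can be caught.
demo :: IO (Either String Int)
demo = describeFailure (evaluate (ageOrDie "forty-two"))
```

Note that the exact text produced by displayException can vary between versions of base, which is already a hint of the inspection problem discussed next.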
It’s not all tea and crumpets, though.
Many existing error messages in the library ecosystem of Haskell
fail to include key information in their string error message. If
the error was provided as a data type then it could have included
what went wrong in the data type, and that would let you access the
information yourself.
Consider, e.g.
if waterTemp < boilingPoint
  then error "The tea isn't coming!"
  else ...
This message doesn’t tell me why the tea isn’t coming. Is the
kettle broken? Did someone forget to turn it on? Ideally, this
error string should say:
if waterTemp < idealTemp
  then error ("The tea isn't coming, water temperature (" ++
              show waterTemp ++ ") is below ideal temp (" ++
              show idealTemp ++ ")")
  else ...
This message is an improvement because it gives me some details
about why something failed. But it’s still not ideal.
Another problem is that you have to construct messages ahead of
time, which means that the user of your exception does not have
the option to display the message how they like, such as in
different languages (bye, bye, i18n), or with different layouts, or
in different file formats (e.g. generating a JSON message). What if
the user of your function wants to display your exception in
Mandarin? Or display it nicely formatted in HTML?
Furthermore, sometimes exceptions contain sensitive information,
like a password or an API key: this might be useful for the
developer, but it’s not something you always want displayed to the
whole world, either in log files or to the end-user. With a String
type, stripping out this information is at best a moving target.
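To make the contrast concrete, here is a minimal sketch of what keeping the data (rather than a pre-rendered message) buys you. The LoginError type, its fields and both renderers are hypothetical, invented for this example; the JSON rendering is deliberately naive and assumes no characters need escaping.

```haskell
import Data.List (intercalate)

-- Hypothetical structured error: the data is kept, not a pre-rendered message.
data LoginError = LoginError
  { leUser   :: String
  , leApiKey :: String   -- sensitive: never rendered by the functions below
  , leStatus :: Int
  }

-- Rendering for a log file, with the key masked.
renderLog :: LoginError -> String
renderLog e =
  "login failed for " ++ leUser e
    ++ " (status " ++ show (leStatus e) ++ ", key=***)"

-- Rendering as a small JSON object (naive: assumes nothing needs escaping).
renderJson :: LoginError -> String
renderJson e = "{" ++ intercalate "," fields ++ "}"
  where
    fields =
      [ "\"user\":" ++ show (leUser e)
      , "\"status\":" ++ show (leStatus e)
      ]
```

The caller picks the rendering, and the sensitive field only leaks if a renderer chooses to print it. With a String, that choice was made when the message was built.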
It gets worse. Because it’s a string type, you cannot rely on
the message as a means of inspection. For example, if I receive an
email from Elizabeth II <[email protected]>, I can do an SPF record
lookup on the domain name’s DNS records to see whether the IP
sending me that email is a valid sender for that domain.
But if I use a library to do it, I can get an error like
this:
> queryTXT (Network.DNS.Name "palace.gov.uk")
*** Exception: user error (res_query(3) failed)
Well, this is true! The query did fail. But I need more
information than that. It could be that my connection is having
trouble, it could be that the domain has no SPF record, or it could
be that the domain name is not valid. If the domain has no SPF
record, that’s a problem. If my connection is having trouble,
that’s fine: I can try again later. With only a string to go on, I
cannot make that decision.
Lastly, after all this woe, you do not have exhaustiveness
checking on a String like you would in a sum type, so you don’t
know whether you have handled all the available cases.
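For comparison, here is a sketch of how the SPF decision could look if the failure were a sum type. SpfLookupError and Decision are hypothetical types written for this post, not the API of any DNS library, and treating an invalid domain as a rejection is my own assumption.

```haskell
-- Hypothetical error type for an SPF lookup; not the API of any DNS library.
data SpfLookupError
  = NetworkTrouble String   -- transient: resolver unreachable, timeout, ...
  | NoSpfRecord             -- the domain exists but publishes no SPF record
  | InvalidDomain String    -- the name itself is malformed
  deriving (Eq, Show)

data Decision = RetryLater | RejectMail
  deriving (Eq, Show)

-- With -Wincomplete-patterns (part of -Wall), GHC warns if a case is missing.
decide :: SpfLookupError -> Decision
decide err = case err of
  NetworkTrouble _ -> RetryLater
  NoSpfRecord      -> RejectMail
  InvalidDomain _  -> RejectMail  -- assumption: treat a bad name like no record
```

Both complaints go away here: the caller can branch on the cause, and the compiler can tell you when a new cause has not been handled.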
The mega exception type
Another very common way of expressing errors is to use a large
sum type where every constructor represents a different error case.
This is used extensively in, for example, http-conduit, stack,
etc.
Example from http-client:
data HttpExceptionContent
= StatusCodeException (Response ()) S.ByteString
| TooManyRedirects [Response L.ByteString]
| OverlongHeaders
| ResponseTimeout
| ...
The advantage of this approach is that all types of exceptions
are centralised in one location, which includes being able to
inspect every possible error case when you pattern match (with
GHC’s exhaustiveness checking), like:
catch (makeRequest)
(\case
StatusCodeException r s -> ...
OverlongHeaders -> ...
...)
Additionally, when you look at the exception type in Haddock you
are able to discern what things can go wrong ahead of time. You can
also put any relevant data into the constructor.
In other words, this approach is self-documenting. It’s
transparent.
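As a self-contained sketch of the same idea, here is a small mega exception type with its Exception instance and a handler that matches every constructor. CacheException, fetch and fetchOrExplain are hypothetical names for illustration; throwIO and catch are the real functions from Control.Exception.

```haskell
{-# LANGUAGE LambdaCase #-}
import Control.Exception (Exception, catch, throwIO)

-- Hypothetical mega exception type for a small cache library.
data CacheException
  = KeyNotFound String
  | ValueTooLarge Int Int   -- actual size, size limit
  | BackendUnavailable
  deriving (Eq, Show)

instance Exception CacheException

-- Hypothetical operation that always fails, for demonstration.
fetch :: String -> IO String
fetch key = throwIO (KeyNotFound key)

-- One handler covers every constructor; GHC can check the match is exhaustive.
fetchOrExplain :: String -> IO String
fetchOrExplain key =
  fetch key `catch` \case
    KeyNotFound k      -> pure ("no such key: " ++ k)
    ValueTooLarge n m  -> pure ("value of " ++ show n ++ " bytes exceeds " ++ show m)
    BackendUnavailable -> pure "backend unavailable"
```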
The disadvantage is that you have to centralise all of
your work in one place, which imposes an extra maintenance burden
over simply writing an error as a string. And whenever you add a
new constructor it causes a breaking change for downstream users of
your library. This may be considered an advantage or a
disadvantage.
It can also be difficult to add context to your constructors,
especially if you don’t want to repeat that context in every single
constructor. For example, the exception type in the http-conduit
package has about 30 constructors. You wouldn’t want to copy the
same context to every constructor. Instead, it’s probably better to
have a separate exception type which contains the context and then
has a field for your sum type. In fact, this is the approach taken
by the http-conduit package in the HttpException type:
data HttpException
= HttpExceptionRequest Request HttpExceptionContent
| InvalidUrlException String String
The Request is the context, and the HttpExceptionContent is the
actual problem that occurred.
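The same split can be sketched for a made-up domain. JobContext, JobFailure, JobException and describe are all hypothetical names; the point is only that the context is stored once, next to whichever failure occurred.

```haskell
-- Hypothetical types mirroring the context-plus-content split.
data JobContext = JobContext
  { jobId   :: Int
  , jobName :: String
  } deriving (Eq, Show)

data JobFailure
  = OutOfDisk
  | Timeout Int   -- seconds waited
  deriving (Eq, Show)

-- Context is stored once, alongside whichever failure occurred.
data JobException = JobException JobContext JobFailure
  deriving (Eq, Show)

describe :: JobException -> String
describe (JobException ctx failure) =
  "job " ++ show (jobId ctx) ++ " (" ++ jobName ctx ++ "): " ++ reason
  where
    reason = case failure of
      OutOfDisk -> "out of disk space"
      Timeout s -> "timed out after " ++ show s ++ "s"
```

Adding a new failure case touches JobFailure only, and adding a new piece of context touches JobContext only.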
One final point: the mega exception type implies some kind of
completeness; that if you catch this exception type when using a
library, you’ve handled all possible exceptions. But that’s not
true: a library can still throw a different exception type. So this
approach may give people a false sense of security.
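Here is a small sketch of that false sense of security. LibError, libraryCall, guarded and outcome are hypothetical; the "library" function fails with an ordinary IOException, which a handler for LibError never sees.

```haskell
import Control.Exception (Exception, IOException, try)

-- Hypothetical library exception type.
data LibError = LibTimeout | LibBadInput String
  deriving Show

instance Exception LibError

-- Hypothetical library call that fails with something other than LibError.
libraryCall :: IO Int
libraryCall = ioError (userError "socket closed")

-- Catches LibError only; any other exception type propagates past this.
guarded :: IO (Either LibError Int)
guarded = try libraryCall

-- An outer 'try' shows that the IOException escaped the LibError handler.
outcome :: IO String
outcome = do
  result <- try guarded
  pure $ case result of
    Left e          -> "escaped: " ++ show (e :: IOException)
    Right (Left _)  -> "handled as LibError"
    Right (Right _) -> "succeeded"
```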
Individual exception types
This approach is similar to the mega exception type except each
error condition has a separate data type. The advantage of this
shows in the Either case: you know exactly what will go wrong
because there is only one possible error case for this type of
exception, whereas with the mega exception type, you can catch the
type, but then you might have 30 different error conditions that
could happen. It’s unlikely that all 30 of those error cases could
occur at any given call, but you would have to assume they could,
because of the type.
For example, if a file doesn’t exist, that’s one type of
exception. If a directory doesn’t exist, that is another type of
exception. So a function such as openFile can claim to throw a
product of these exception types: the file not existing, the
directory not existing, or even that you do not have access to the
directory, etc.
Throws (FileNotFound,DirectoryNotFound,AccessDenied)
The disadvantage is it’s hard to combine them. Product types
such as tuples don’t really scale. Type signatures also become very
unwieldy when dozens of different types come into play, such as in
the IO APIs in base. It’s hard to really manage that.
Just consider the large number of exceptions throwable by base.
Another issue is that users will have to know about all the
different types of exceptions that can be thrown in order to catch
everything from your library in the impure throw case, whereas in
the mega exception case it’s easy to catch everything because
everything has already been put in one place for you.
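In the impure case, that looks roughly like the sketch below. The three exception types reuse the names from the Throws example above, but their definitions here, along with openDemo and openOrExplain, are hypothetical; catches and Handler are the real Control.Exception API.

```haskell
import Control.Exception (Exception, Handler (..), catches, throwIO)

-- Hypothetical one-type-per-condition exceptions.
newtype FileNotFound = FileNotFound FilePath deriving Show
newtype DirectoryNotFound = DirectoryNotFound FilePath deriving Show
newtype AccessDenied = AccessDenied FilePath deriving Show

instance Exception FileNotFound
instance Exception DirectoryNotFound
instance Exception AccessDenied

-- Hypothetical stand-in for an open operation; it never touches the disk.
openDemo :: FilePath -> IO String
openDemo path
  | path == "/secret" = throwIO (AccessDenied path)
  | otherwise         = throwIO (FileNotFound path)

-- The caller must list a handler per type; a forgotten type propagates.
openOrExplain :: FilePath -> IO String
openOrExplain path =
  openDemo path `catches`
    [ Handler (\(FileNotFound p)      -> pure ("no file: " ++ p))
    , Handler (\(DirectoryNotFound p) -> pure ("no directory: " ++ p))
    , Handler (\(AccessDenied p)      -> pure ("access denied: " ++ p))
    ]
```

Nothing checks that the handler list is complete, which is exactly the bookkeeping the mega exception type does for you.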
Abstract exception type
This approach is where you have an exception type which is
entirely opaque and not inspectable except by use of accessors. An
example of this is the IOError type in Haskell, which is standard
and used throughout the IO library. It comes with predicates such
as isDoesNotExistError.
The advantage of this is that you can change the internals
without breaking the API. Another advantage is that you can easily
capture the context of an error, because you just expose that
through an accessor.
Another advantage is that predicates can be applied to more than
one constructor, such as isFileOpenError; this could be an accessor
to indicate that you could not open a file but the reason can be
more detailed, such as “unable to access directory”, “no such
file”, or whatever.
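With IOError, working through accessors looks something like this. isDoesNotExistError, isPermissionError and ioeGetFileName are real functions from System.IO.Error; explainReadFailure and readOrExplain are hypothetical helpers written for this sketch.

```haskell
import Control.Exception (try)
import System.IO.Error (ioeGetFileName, isDoesNotExistError, isPermissionError)

-- Classify a failure using only the accessors of the opaque IOError type.
explainReadFailure :: IOError -> String
explainReadFailure e
  | isDoesNotExistError e = "missing: " ++ maybe "<unknown>" id (ioeGetFileName e)
  | isPermissionError e   = "permission denied"
  | otherwise             = "other I/O error"

-- Read a file, turning any IOError into a description.
readOrExplain :: FilePath -> IO (Either String String)
readOrExplain path = do
  result <- try (readFile path)
  pure (either (Left . explainReadFailure) Right result)
```

At no point does the caller see a constructor, so the representation behind IOError is free to change.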
The disadvantage of this is that it’s not self-documenting in
Haddock in the way the transparent exception type is, so you’re
essentially hiding what the different options are from your
users.
Plus, maybe you should break your users’ code when you
change how errors can be thrown; maybe hiding the details just
makes things worse.
Why not both?
Another option is to provide both of these things. So, you
provide a set of constructors but you also provide a set of
predicates and other accessors which can be used on these types.
You could even use pattern synonyms to provide a documented
accessor set, without exposing the internal data type. This would
give you some flexibility, but I do not know of any example in the
wild which implements this approach.
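Since I don’t have a real-world example to point at, here is a rough sketch of what the pattern synonym variant might look like. Everything in it (StoreError, Kind, the two patterns, isRetryable) is hypothetical, and in a real library the MkStoreError constructor and Kind would be left out of the module’s export list.

```haskell
{-# LANGUAGE PatternSynonyms #-}
import Control.Exception (Exception)

-- Internal representation. In a real library, 'MkStoreError' and 'Kind' would
-- not be exported; only the patterns and predicates below would be.
data Kind = Missing | Locked
  deriving (Eq, Show)

data StoreError = MkStoreError Kind String
  deriving (Eq, Show)

instance Exception StoreError

-- Documented, constructor-like views onto the opaque type.
pattern KeyMissing :: String -> StoreError
pattern KeyMissing key = MkStoreError Missing key

pattern StoreLocked :: String -> StoreError
pattern StoreLocked owner = MkStoreError Locked owner

-- Tell the exhaustiveness checker that these two patterns cover the type.
{-# COMPLETE KeyMissing, StoreLocked #-}

-- A predicate that can span several cases and survive internal changes.
isRetryable :: StoreError -> Bool
isRetryable (StoreLocked _) = True
isRetryable (KeyMissing _)  = False
```

Users get names they can match on and that show up in the documentation, while the representation stays free to change.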
Matt Parsons explores an approach to errors using
prisms and generic-lens that’s worth taking a look at.
Terminate the program
Finally, we can take the C approach and terminate the program,
set the return code to -1. The disadvantage of this is that you
will have your Haskell card torn up and you will be banished to
work on node.js projects forever.
Conclusions
It seems like the base Haskell packages favor the opaque
approach, and many standard task libraries use the mega exception
approach. We’ve discussed the trade-offs. At this point, how to
model your exceptions is more a judgment call than a clear-cut
decision.
The only thing that is clear-cut to me is that String (or Text)
for error messages is always the wrong decision, for the reasons
outlined above. In parser libraries, for example, the old approach
has been this one. But in more modern libraries, such as
megaparsec, the error type is provided by the caller of the
library. So there isn’t a need to decide on a concrete type ahead
of time.
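The idea of a caller-supplied error type can be sketched with a toy parser. This is not megaparsec’s API: Parser, digits, ConfigError and parsePort are hypothetical and exist only to show the error type appearing as a type parameter.

```haskell
import Data.Char (isDigit)

-- Hypothetical mini-parser: the error type 'e' is chosen by the caller.
newtype Parser e a = Parser { runParser :: String -> Either e (a, String) }

-- The caller supplies the error value to report when no digits are found.
digits :: e -> Parser e Int
digits onFail = Parser $ \input ->
  case span isDigit input of
    ("", _)    -> Left onFail
    (ds, rest) -> Right (read ds, rest)

-- A caller-defined error type; the parser code above knows nothing about it.
data ConfigError = MissingPort | MissingTimeout
  deriving (Eq, Show)

parsePort :: String -> Either ConfigError (Int, String)
parsePort = runParser (digits MissingPort)
```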
If you need any help with Haskell please contact
us.