At FP Complete we have experience writing Medical Device software that has to go through
rigorous compliance steps and eventually be approved by a
government regulatory body such as the US Food and Drug Administration (FDA).
In this post we'd like to share some of the best practices and
pitfalls we have learned when working in this area.
You may find this blog post especially relevant if you are
- a Software Engineer working on any form of medical
software,
- a Researcher or Data Scientist trying to turn your research
into a product, or
- an Engineering Manager or Product Manager trying to deliver a
medical product,
but of course you are also welcome to read and discuss this
topic with us even if you are none of these.
Before we get to the problems and best practices, we'll give
some context by describing what a common project setup for Medical
Device software may look like.
Typical Medical Device project setup
A common team structure inside a company working on a Medical
Device might be:
- 10 researchers (mathematicians, statisticians, chemists,
medical experts, data scientists),
- 10 software engineers,
- a project manager, some people managers, some regulatory
experts, and a product owner/business representative.
Together, they want to develop a software product that makes
some form of medical statement (which could be a diagnosis of a
disease, a forecast of how a patient will react to some treatment,
or a recommendation of treatment), or one that takes medical action
(e.g. control logic of a physical device performing a treatment or
administering a medicine).
Because of that, the software will be classified as a “Medical Device” by regulatory bodies such as the
FDA or EMA, even though it is just a computer program.
A common project history is:
- There was an internal R&D phase in which the key algorithms were
discovered; this phase was free of any form of regulation, mainly
containing researchers and no engineers.
- Now the project is entering a productisation phase where the
“device” is built based on that R&D, but many algorithm details
are still unclear.
- Researchers need to continually run smaller experiments on
their machines, and from time to time longer batch experiments that
may run overnight. They often do so on separately purchased data
sets to check and tune their algorithms.
During the productisation phase, you are obliged to operate in
“regulation-safe mode”, meaning that all processes and decisions
need to be well-informed and documented. They must be able to
withstand a regulatory audit; otherwise you risk that your
regulator denies approval, and the product cannot be used or
marketed.
The regulatory experts on your team will help you with this,
telling you what certifications you'll need to get and in which
order to perform which steps. However, they are typically not
experts at Software Engineering, and will rarely be able to provide
concrete advice on how to do your Software Engineering to support
the “regulation-safe mode” as much as possible.
Best practices and pitfalls
Now that the project setup is clear, let's get into some best
practices to exercise and pitfalls to avoid when working on Medical
Device software.
Special considerations for working in a regulated medical environment
Make a list of “reserved terminology keywords”.
The fields of programming and medical regulation have some
overlap in terminology that can result in disastrous
miscommunication unless special care is taken to avoid this. For
example, “unit testing” may mean two different things from the
engineering and regulation perspectives. Your regulatory expert may
completely misinterpret what regulatory steps you have already
completed when you tell them that you've just finished writing some
“unit tests”.
Disambiguate it to e.g. “engineering unit tests” and
“regulatory unit tests” and enforce across your team that
everybody use only these explicitly qualified phrases, and never
“unit tests” alone.
A list of terminology that we found ambiguous between engineers
and regulatory people includes:
- unit testing
- code review
- verification
- quality
- performance
- “the device”
Many companies in the medical space may have no experience with software.
Consequently they may try to apply processes to software that
were designed for other products, and do not apply to software.
A common example is the assumption that after the product is
“done”, it will never change again. This is a sensible
expectation for a drug.
Of course this doesn't work with software: continuous modifications
are needed, if only for routine security updates. (You can
drastically reduce the frequency of such updates being necessary by
using an advanced programming language such as Haskell, which is designed for safety and
reliability and thus our language of choice for medical software;
however you will never be able to entirely rule out the need for
post-release updates.)
While this is obvious and natural to any programmer, it may not
be obvious to medical experts, and it is not well understood in many
medical companies. You may meet heavy resistance to any form of agile
development model, continuous deployment setup, and frequent code
changes after the release of the software. You should ensure that
you train the managers and medical experts of the project on this
aspect of software before you start the project, define clear
boundaries between “device-software updates” and “security-software
updates”, and set expectations, e.g. that the software may have to
be recompiled and re-deployed should a security update for an
underlying software library be necessary.
Make CI the central point where all work comes together.
Continuous Integration (CI) means merging everybody's work together
frequently and running automated tests on it. While CI is common in
software teams by now, researchers and data scientists may not be
used to it. They may be more familiar with the workflow of
developing their own, often one-off scripts and programs on their
PCs and rarely sharing the code with their team members, instead
only sharing the results.
For a regulated project, you should enforce that
everyone on the team checks any code ever
produced for any purpose of the project into source code version
control, and that the results produced by this code are generated
or reproduced on the shared CI servers, as opposed to being generated
only on a researcher's own PC. This ensures that it is recorded
which exact code produced which exact results in which exact
environment, which helps a lot when making regulatorily relevant
statements such as “our experiments have confirmed our thesis X”.
It also speeds up development, because everybody on the team can
see what everybody else does, or get notified by the CI server when
accidentally breaking somebody else's program or workflow. You
should, where possible, refuse to accept results as certain unless
you have seen them produced by your CI server, and train everybody
on the team how to follow this workflow.
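As a minimal sketch of what recording "which exact code produced which exact results in which exact environment" could look like, the following Python snippet builds a small provenance record for one result. The names `provenance_record` and `to_json`, the field names, and the idea that the CI job passes in the commit ID as a string are assumptions made for illustration; they are not the API of any particular CI product.

```python
import hashlib
import json
import platform
import sys


def provenance_record(commit_id: str, result_bytes: bytes) -> dict:
    """Describe which code produced which result in which environment."""
    return {
        # Version-control revision of the code that was run (assumed to be
        # supplied by the CI job).
        "commit": commit_id,
        # Content hash of the produced result, so later copies can be verified.
        "result_sha256": hashlib.sha256(result_bytes).hexdigest(),
        # Environment details that commonly influence computed results.
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


def to_json(record: dict) -> str:
    """Serialise with sorted keys so the record text is stable across runs."""
    return json.dumps(record, sort_keys=True, indent=2)
```

A CI job could store the output of `to_json` next to each result file, so that a statement about an experiment can later be traced back to a commit and an environment.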
Advertise your tools in the right language.
When we as programmers use advanced technical tooling like
Haskell, we
can easily enumerate the various features that will make the
software more correct and reliable. However, these features may
mean nothing to a medical expert, and thus may not be easily used
by your team for advertising or explaining to a regulator why your
software is especially safe. Consequently you should do research on
what terms will be understood by medical experts, and map your
tools and features into their terminology.
For example, if you use a compiler featuring static analysis,
you might explicitly advertise this as a form of “formal software
verification”, which is a term most medical experts are familiar
with.
Here's a list of cool tools we've used in the past that fall
under “formal software verification”:
- strong typing
- referential transparency (pure functions)
- parametricity
- generative testing
- code coverage
- model checking
- theorem proving
Unexpected changes are the worst.
Code and product changes
As a programmer, you should:
- Try to make all programs deterministic, ideally
up to byte-identical output.
This will drastically help you get needed code refactorings past
regulatory review, as you can provide evidence that your changes
did not change the functioning of the device.
- Set up “gold-standard” testing in CI to notice
any change.
Gold-standard testing means that you store the last-approved
outputs of your algorithms on a (large) data set of inputs in your
version control system or (if it doesn't fit in there) in another
form of storage. Each code commit message should then indicate
whether it is expected to change the results or not. After the
results have been computed by CI, proceed according to the
table:
| | results are identical with gold-standard | results are different from gold-standard |
|---|---|---|
| commit message does not expect change | good to merge | not good to merge, investigate why results changed |
| commit message expects change | not good to merge, investigate why change didn't have the desired effect | possibly good to merge, let medical / data science team sign off the changed results, then update the gold-standard outputs |
Note how this is different from engineering unit-testing:
In engineering unit-testing, the programmer defines and
understands precisely what the output of the algorithm is for each
single test case. In gold-standard testing, the idea is not to
understand the output for each input, but to get notified when
outputs change (independent of what exactly the outputs look like).
Because of this, gold-standard tests are easier to write: They
require no thinking effort from the programmer, they only require
input data to run on.
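The decision procedure for gold-standard testing can be sketched in a few lines of Python. This is only an illustration under assumptions: outputs are compared via SHA-256 digests keyed by an input name, the "expects change" flag has already been extracted from the commit message, and the names `Verdict`, `digest` and `gold_standard_verdict` are hypothetical.

```python
import hashlib
from enum import Enum
from typing import Dict


class Verdict(Enum):
    MERGE = "good to merge"
    INVESTIGATE_UNEXPECTED_CHANGE = "not good to merge: investigate why results changed"
    INVESTIGATE_MISSING_CHANGE = "not good to merge: investigate why the change had no effect"
    NEEDS_SIGN_OFF = "possibly good to merge: sign off results, then update the gold standard"


def digest(output: bytes) -> str:
    """Fingerprint one output; byte-identical outputs give identical digests."""
    return hashlib.sha256(output).hexdigest()


def gold_standard_verdict(
    gold: Dict[str, str], current: Dict[str, str], change_expected: bool
) -> Verdict:
    """Compare per-input digests against the stored gold standard."""
    # Dict equality also catches inputs that were added or went missing.
    results_identical = gold == current
    if results_identical:
        return Verdict.INVESTIGATE_MISSING_CHANGE if change_expected else Verdict.MERGE
    return Verdict.NEEDS_SIGN_OFF if change_expected else Verdict.INVESTIGATE_UNEXPECTED_CHANGE
```

Comparing digests rather than full outputs keeps the stored gold standard small, but it only works when the programs are deterministic down to the byte level.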
Make only controlled changes:
- Make people announce when they expect a change.
- Roll back any unexpected change.
- Every change must be traceable to a concrete requirement. This
bit can be done with low overhead by having commits and code
comments reference issue tracker entries, and the issue tracker
being well maintained to link together code features with technical
requirements (“feature X shouldn't crash and should be easy to
understand”), regulatory requirements (“computation X must not
store user data”), or business requirements (“computation X must
finish in under an hour”).
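One low-overhead way to check traceability is to have CI reject commits whose messages reference no issue tracker entry. The sketch below assumes issue keys of the form `DEV-123` (uppercase project prefix, hyphen, number); that format, the example keys and the function names are hypothetical and would need adapting to the issue tracker you actually use.

```python
import re
from typing import Dict, List

# Assumed issue key format, e.g. "DEV-123"; adapt to your issue tracker.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def referenced_issues(commit_message: str) -> List[str]:
    """Return all issue keys mentioned in a commit message, in order."""
    return ISSUE_KEY.findall(commit_message)


def untraceable_commits(commits: Dict[str, str]) -> List[str]:
    """Return the IDs of commits whose message references no issue."""
    return [
        commit_id
        for commit_id, message in commits.items()
        if not referenced_issues(message)
    ]
```

Such a check only proves that a reference exists; whether the referenced entry really links to a requirement still depends on the issue tracker being well maintained.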
Process changes
While software engineers love to upgrade their stack and switch
tools and processes frequently, medical people tend to hate it.
However, there are ways to make them more comfortable with it.
As a product manager or similar role, when you
want to make a process change, stick to a predictable order such
as:
- Analyse what change needs to be made
- Announce that the team will be moving to a new approach X in
the future, with a concrete proposal.
- Collect feedback, inviting everyone whose workflow might be
touched by this move to provide input of how and when it should be
done to reduce disruption to a minimum.
- Give it a memorable name that people can use for referring to
the motion.
- Perform coordinated switchover at a pre-announced time, making
sure everybody knows about it in advance.
Here is an example:
Let's say it is necessary that data scientists switch their
working environment operating system (OS) from Windows to Linux so
that developers can more easily reproduce their results in the
production software.
- Investigate in audience-limited conversations (e.g. with
programmers) whether the data scientists' desktop OS has to be
changed, or whether it is sufficient that they connect to a Linux
machine from their current Windows machines.
- Announce that the team would like to move the data scientist
workspaces from Windows to Linux within the next three months, and
present your concrete proposal so far which may include a
video-tutorial based training on how to use the new work spaces
remotely from Windows, as well as a dedicated engineer to help with
the migration.
- Collect feedback such as a data scientist saying that some
scripts don't work on Linux. Discuss with this data scientist
(but in public) whether an engineer helping to port these scripts to
Linux before the move would address that issue. Another data
scientist may point out that the move should be done after
producing results X but before starting feature Y. Refine the
schedule accordingly.
- Call the motion “Datasci-Linux”.
- Ensure everybody knows that “Datasci-Linux” will be performed
in the last week of April.
Team organisation
Have a real ops, tools and help team.
A lean “DevOps”-only approach usually
doesn't work with researchers.
While developers like to control machines and servers themselves
and the team can be made more efficient that way, researchers like
to have their heavy machinery moved by people who understand what
they are doing.
Thus, as a manager, you should make sure that:
- Ops takes care of researchers' working environments,
software needed, computing clusters and so on, so that researchers
don't have to spend time on trial and error (unless they
want to learn it).
- If a researcher wants to do some overnight computation job,
assign them an engineer to execute it properly.
- Recurring jobs are coded up so they can be more automated.
Non-software people are surprisingly unfamiliar with that idea and
will happily do the same manual task again and again.
- The rule of thumb is: Do it manually 3 times, then code it up
(this is a good rule for general software development, but you may
have to emphasise it especially in a medical environment where
manual procedures that cannot be automated are very common).
- Ensure you have people who can continually help with everyday
issues with tools the team uses, and are tasked to train everybody
in using and understanding version control software and the
development model. A lot of time can be wasted if somebody does not
understand how to get their changes in the right place with git,
pushes things to the wrong branch, and so on.
Separate roles
Define ahead of time what role can block what activity to avoid
unnecessary project slowdowns.
As a Project Manager, you should make sure that:
- Regulatory people don't use their almost unlimited veto power
to block decisions that are outside of their domain. For example, a
regulatory reviewer should not use their veto to enforce changes
that are irrelevant for regulatory review.
- Programmers are able to block researcher or regulatory
decisions when they are not realisable, such as using a given
method when it cannot be implemented correctly, accurately, or in
time.
- You (the Project Manager) are actually able to exercise the
power over the schedule and work items that was given to you. A
Project Manager's responsibility is to ensure realistic estimates,
also at times pushing back against features that executives may
want to see in short time, if, based on programmer or researcher
feedback, they cannot be realised that quickly.
Managing code, processes and documentation
Version control
Enforce that all code be checked into version
control. Make no exceptions here.
Arrange for personal scrap spaces in version
control, that are clearly marked as not being under the same
scrutiny as “device code”. If you do not do this, researchers and
programmers will not check their experiments into version control,
and the project will suffer. Examples for such scrap spaces are
branches prefixed with wip/ (for work-in-progress),
and a personal-workspaces/username directory hierarchy.
In general, always clearly separate device-code and
non-device code. This need not mean that they should be in
independent source code repositories (as that would forbid ensuring
experimental scripts work with the latest version of device-code).
Instead, use other explicit means as separation, such as having one
directory for device code, and one for non-device code.
Relatedly, separate the device from the
platform needed to run the device (such as deployment
infrastructure and server tools). As mentioned earlier, this is
especially important for infrastructure security updates.
You should optimise version control usage for
efficiency. For example: Have branches with a doc- prefix only run
documentation builds, and skip
the big or costly stages other builds may include. People will hate
tools for structured working such as version control and CI if they
make their workflow slow. Always provide fast ways to do
things.
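How a CI setup might pick stages based on the branch name can be sketched as follows. The stage names and the function `stages_for_branch` are hypothetical; only the doc- prefix rule comes from the text above, and a real pipeline would express this in its own configuration format.

```python
from typing import List

# Hypothetical stage names; a real pipeline would define its own.
ALL_STAGES = ["build", "engineering-unit-tests", "gold-standard-tests", "docs"]


def stages_for_branch(branch: str) -> List[str]:
    """Select the CI stages to run for a given branch name."""
    if branch.startswith("doc-"):
        # Documentation-only branches skip the big or costly stages.
        return ["docs"]
    # Every other branch runs the full pipeline.
    return list(ALL_STAGES)
```

Keeping the rule in one small, reviewed function (or its configuration equivalent) also makes it easy to document which branches are allowed to skip which checks.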
If possible, use a linear development model in
version control (such as a “rebasing” workflow in git).
In an environment where reproducibility is of
utmost importance, being able to do automatic
bisections to find regressions is more important than
developers having to resolve more merge conflicts.
Be especially careful with development practices that can scare
regulatory people.
TODOs
As a programmer or data scientist,
Don't write: TODO: fix this code
- This may suggest there is a flaw in the device that can make it
unsafe, or that it is unfinished.
- Assume that regulatory reviewers have no understanding of
programming and take you literally by the words you write.
Do write: TODO-ENG: Future performance
enhancement: While this computes the correct result and is safe to
use, we should make this faster by doing XYZ.
For each project, define and document clear criteria for labels
like TODO.
For example, you might designate TODO-ENG as a
label to mean “irrelevant for the medical device operating
correctly, but engineering would like to change this”, and
TODO-DEVICE as a label to mean “this must be
changed before the release or next major milestone on the
roadmap”. You can then ensure before the next milestone that
all TODO-DEVICE labels are gone.
Ensure everybody (including regulatory people) knows which label
means what. Add this information to your documentation. Also see
the next point for more on that.
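A label policy like this can be enforced automatically. The following Python sketch flags bare or unknown TODO labels, and optionally any remaining TODO-DEVICE label before a milestone. The function name `todo_violations`, the `before_milestone` flag and the message texts are assumptions for illustration; the two allowed labels are the example labels from above.

```python
import re
from typing import List, Tuple

# The example labels from the text; a project would document its own set.
ALLOWED_LABELS = {"TODO-ENG", "TODO-DEVICE"}
# Matches "TODO" optionally followed by an uppercase qualifier such as "-ENG".
TODO_PATTERN = re.compile(r"\bTODO(?:-[A-Z]+)?\b")


def todo_violations(text: str, before_milestone: bool = False) -> List[Tuple[int, str]]:
    """Return (line number, problem) pairs for labels that break the policy."""
    violations = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label in TODO_PATTERN.findall(line):
            if label not in ALLOWED_LABELS:
                violations.append((line_no, f"unqualified or unknown label {label}"))
            elif before_milestone and label == "TODO-DEVICE":
                violations.append((line_no, "TODO-DEVICE must be resolved before the milestone"))
    return violations
```

Run in CI on every commit, such a check keeps bare TODO comments out of the code base; run with `before_milestone=True` on a release branch, it confirms that no TODO-DEVICE labels are left.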
Enforce documentation for all coding processes
Whenever you make a decision of how things are done in the
project, write it down, ideally in version control.
Don't propagate engineering, review, and other process rules by
word of mouth. One way regulators assess you is whether you stick
to your own processes; they will not be able to find evidence of
you doing so if you haven't written the processes down.
Finding documentation
Only having documentation is not enough. It also needs
to be discoverable.
Use simple and
obvious ways for people to find any documentation they might need.
An approach that works well is to place a README file
in each sub-project's top level directory (of course under version
control), and link to other documents from this entry point.
Use a simple tagging scheme, such as tags in brackets (e.g.
[ALIEN-SALIVA-DENSITY-ESTIMATION]) that allows you to
place textual anchors and references to them in code and
documentation. This is because linking from documentation to
documentation (which may be easier, e.g. using hyperlinks) is not
enough; you will also need to link from code to docs and from docs
to code (and referring to file name plus line number is obviously
not a good choice given that code can move around).
Medical device software tends to have a lot of
documentation, so you will have many links and references in your
project. At the time of an audit, you don't want auditors unable to
follow outdated documentation links. Have your tools team write
tooling to find dangling links and references, possibly also to
produce simple graphs so that you can easily visualise
documentation references.
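A dangling-reference checker for bracketed tags can be quite small. The sketch below rests on an assumed convention that is not from the original scheme: an anchor is written as `anchor: [TAG]`, and every other occurrence of `[TAG]` counts as a reference. The function name, the file names in the usage example and the tag `[SAMPLE-PREPARATION]` are likewise hypothetical.

```python
import re
from typing import Dict, Set

# A tag is uppercase words joined by hyphens inside brackets, with at least
# one hyphen so that ordinary index expressions like [0] or [X] do not match.
_TAG_BODY = r"[A-Z][A-Z0-9]*(?:-[A-Z0-9]+)+"
TAG = re.compile(r"\[(" + _TAG_BODY + r")\]")
# Assumed convention: "anchor: [TAG]" defines the place a tag points to.
ANCHOR = re.compile(r"anchor:\s*\[(" + _TAG_BODY + r")\]")


def dangling_references(files: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each file path to the tags it mentions that have no anchor anywhere."""
    anchors: Set[str] = set()
    for text in files.values():
        anchors.update(ANCHOR.findall(text))
    dangling: Dict[str, Set[str]] = {}
    for path, text in files.items():
        missing = set(TAG.findall(text)) - anchors
        if missing:
            dangling[path] = missing
    return dangling
```

For example, given a README containing `anchor: [ALIEN-SALIVA-DENSITY-ESTIMATION]` and a source file that mentions both that tag and `[SAMPLE-PREPARATION]`, the function reports only `SAMPLE-PREPARATION` for the source file. The same tag-to-file mapping could feed a graph visualisation of documentation references.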
Interaction between researchers and engineers
You cannot simply throw a bunch of engineers and researchers
together and expect that they will work in perfect symbiosis and
produce the desired results.
In many companies, R&D and Engineering may be separate
departments that may have developed different ways of working and
communicating. This may be even more true when one of the two sides
is brought in by a different company or via contracting. Bringing
them together often warrants extra planning and being more explicit
than usual when setting up joint workflows.
Define clearly who is in charge at each stage of the
project.
- In the R&D phase, engineering should
likely assist researchers to get good results, quickly.
- In the productisation phase, researchers
should likely assist engineers to make an excellent product.
Discourage walled-off thinking.
Make clear that the success of the project depends on the
successful interaction between researchers and engineers.
Most importantly, be aware of the “my side is
fine” problem.
Researchers like to think:
These are my preconditions, and
they have to be provided by the engineers. If those are provided,
we'll be fine.
Engineers like to think:
As long as I code up these maths
written by the researchers, I'll be safe.
As a result, neither of the two sides makes sure that the
critical preconditions that make the system work are actually
provided.
To avoid this, you should make sure each side understands the
other well, that the interface between them is understood
especially well by both, and that they talk often about it.
Encourage mutual training: Have Researchers train
Engineers to understand their maths, and Engineers train
Researchers to read their code.
Establish critical thinking and a culture where everyone can
ask everything.
This is one of the most important bits when trying to make a
safe device.
Allow and encourage any form of understanding
question. “Is this safe to do, and why?” should
be a common thing to be heard and written in your project.
Establish that this does not question anybody’s
reputation. Employ blame-free evaluation and analysis
techniques.
Summary
Hopefully you have found these insights useful or
interesting.
If you'd like our help with delivering Medical Device software,
don't hesitate to contact us.