Best Practices for Developing Medical Device Software

At FP Complete we have experience writing Medical Device software that has to go through rigorous compliance steps and eventually be approved by a government regulatory body such as the US Food and Drug Administration (FDA).

In this post we'd like to share some of the best practices and pitfalls we have learned when working in this area.

You may find this blog post especially relevant if you are

a Software Engineer working on any form of medical software,
a Researcher or Data Scientist trying to turn your research into a product, or
an Engineering Manager or Product Manager trying to deliver a medical product,

but of course you are also invited to read and discuss this topic with us if you are none of these.

Before we get to the problems and best practices, we'll give some context by describing what a common project setup for Medical Device software may look like.

Typical Medical Device project setup

A common team structure inside a company working on a Medical Device might be:

10 researchers (mathematicians, statisticians, chemists, medical experts, data scientists),
10 software engineers,
a project manager, some people managers, some regulatory experts, and a product owner/business representative.

Together, they want to develop a software product that makes some form of medical statement (which could be a diagnosis of a disease, a forecast of how a patient will react to some treatment, or a recommendation of treatment), or one that takes medical action (e.g. control logic of a physical device performing a treatment or administering a medicine).

Because of that, the software will be classified as a “Medical Device” by regulatory bodies such as the FDA or EMA, even though it is just a computer program.

A common project history is:

There was an internal R&D phase in which the key algorithms were discovered; this phase was free of any form of regulation, mainly containing researchers and no engineers.
Now the project is entering a productisation phase where the “device” is built based on that R&D, but many algorithm details are still unclear.
Researchers need to continually run smaller experiments on their machines, and from time to time longer batch experiments that may run overnight. They often do so on separately purchased data sets to check and tune their algorithms.

During the productisation phase, you are obliged to operate in “regulation-safe mode”, meaning that all processes and decisions need to be well-informed and documented. They must be able to withstand a regulatory audit; otherwise you risk that your regulator denies approval and your product cannot be used or marketed.

The regulatory experts on your team will help you with this, telling you what certifications you'll need to get and in which order to perform which steps. However, they are typically not experts at Software Engineering, and will rarely be able to provide concrete advice on how to do your Software Engineering to support the “regulation-safe mode” as much as possible.

Best practices and pitfalls

Now that the project setup is clear, let's get into some best practices to exercise and pitfalls to avoid when working on Medical Device software.

Special considerations for working in a regulated medical environment

Make a list of “reserved terminology keywords”.

The fields of programming and medical regulation have some overlap in terminology that can result in disastrous miscommunication unless special care is taken to avoid this. For example, “unit testing” may mean two different things from the engineering and regulation perspectives. Your regulatory expert may completely misinterpret what regulatory steps you have already completed when you tell them that you've just finished writing some “unit tests”.

Disambiguate it to e.g. “engineering unit tests” and “regulatory unit tests” and enforce across your team that everybody use only these explicitly qualified phrases, and never “unit tests” alone.

A list of terminology that we found ambiguous between engineers and regulatory people includes:

unit testing
code review
verification
quality
performance
“the device”

Many companies in the medical space may have no experience with software.

Consequently they may try to apply processes to software that were designed for other products, and do not apply to software.

A common example is the assumption that after the product is “done”, it will never change again. This is a sensible expectation for a drug. Of course this doesn't work with software: continuous modifications are needed, if only for routine security updates. (You can drastically reduce the frequency of such updates being necessary by using an advanced programming language such as Haskell, which is designed for safety and reliability and thus our language of choice for medical software; however you will never be able to entirely rule out the need for post-release updates.)

While this is obvious and natural to any programmer, it may not be to medical experts, and not understood in many medical companies. You may meet heavy resistance to any form of agile development model, continuous deployment setup, and frequent code changes after the release of the software. You should ensure that you train the managers and medical experts of the project on this aspect of software before you start the project, define clear boundaries between “device-software updates” and “security-software updates”, and set expectations, e.g. that the software may have to be recompiled and re-deployed should a security update for an underlying software library be necessary.

Make CI the central point where all work comes together.

Continuous Integration (CI) means merging everybody's work together frequently and running automated tests on it. While CI is common in software teams by now, researchers and data scientists may not be used to it. They may be more familiar with the workflow of developing their own, often one-off scripts and programs on their PCs and rarely sharing the code with their team members, instead only sharing the results.

For a regulated project, you should enforce that everyone on the team checks any code ever produced for any purpose of the project into source code version control, and that the results produced by this code are generated or reproduced on the shared CI servers, as opposed to being generated only on a researcher's own PC. This ensures that it is recorded which exact code produced which exact results in which exact environment, which helps a lot when making regulatorily relevant statements such as “our experiments have confirmed our thesis X”. It also speeds up development, because everybody on the team can see what everybody else does, or get notified by the CI server when accidentally breaking somebody else's program or workflow. You should, where possible, refuse to accept results as certain unless you have seen them produced by your CI server, and train everybody on the team how to follow this workflow.
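As a minimal sketch of recording “which exact code produced which exact results in which exact environment”, a CI job could write a small provenance record next to every result file. The function names and record fields below are hypothetical, and the commit identifier is assumed to be handed to the job by whatever CI system you use.

```python
import hashlib
import json
import platform
import sys


def provenance_record(commit_id: str, result_bytes: bytes) -> dict:
    """Describe which code produced which result in which environment.

    `commit_id` is assumed to be supplied by the CI server. The field
    names are hypothetical and should follow your own documented process.
    """
    return {
        "commit": commit_id,
        "result_sha256": hashlib.sha256(result_bytes).hexdigest(),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


def write_record(record: dict) -> str:
    # sort_keys gives a stable field order, so stored records diff cleanly.
    return json.dumps(record, sort_keys=True, indent=2)
```

Storing the hash of the result rather than only its file name means that a later statement such as “experiment X confirmed our thesis” can be tied to one specific output produced by one specific commit.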

Advertise your tools in the right language.

When we as programmers use advanced technical tooling like Haskell, we can easily enumerate the various features that will make the software more correct and reliable. However, these features may mean nothing to a medical expert, and thus may not be easily used by your team for advertising or explaining to a regulator why your software is especially safe. Consequently you should do research on what terms will be understood by medical experts, and map your tools and features into their terminology.

For example, if you use a compiler featuring static analysis, you might explicitly advertise this as a form of “formal software verification”, which is a term most medical experts are familiar with.

Here's a list of cool tools we've used in the past that fall under “formal software verification”:

strong typing
referential transparency (pure functions)
parametricity
generative testing
code coverage
model checking
theorem proving

Unexpected changes are the worst.

Code and product changes

As a programmer, you should:

Try to make all programs deterministic, ideally up to byte-identical output.

This will drastically help you get needed code refactorings past regulatory review, as you can provide evidence that your changes did not change the functioning of the device.
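Typical sources of non-determinism are unseeded random number generators and serialisation that depends on incidental ordering. The following is a minimal sketch of avoiding both; `run_analysis` is a hypothetical toy stand-in for a real algorithm, not a recommended design.

```python
import hashlib
import json
import random
from typing import List


def run_analysis(samples: List[float], seed: int) -> dict:
    """Toy stand-in for a device algorithm (hypothetical).

    It uses its own seeded generator instead of global random state, so
    the same inputs and seed give the same result on every run.
    """
    rng = random.Random(seed)
    subsample = sorted(rng.sample(samples, k=min(3, len(samples))))
    return {
        "n": len(samples),
        "subsample": subsample,
        "mean": sum(samples) / len(samples),
    }


def serialise(result: dict) -> bytes:
    # Fixed key order and separators make the bytes independent of the
    # order in which the dictionary happened to be filled.
    text = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def fingerprint(result: dict) -> str:
    # A hash of the serialised output is cheap evidence that a
    # refactoring left the output byte-identical.
    return hashlib.sha256(serialise(result)).hexdigest()
```

Comparing `fingerprint` values before and after a refactoring is one simple way to back up the claim that the change did not alter the output.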

Set up “gold-standard” testing in CI to notice any change.

Gold-standard testing means that you store the last-approved outputs of your algorithms on a (large) data set of inputs in your version control system or (if it doesn't fit in there) in another form of storage. Each code commit message should then indicate whether it is expected to change the results or not. After the results have been computed by CI, proceed according to the table:

	results are identical to gold-standard	results are different from gold-standard
commit message does not expect change	good to merge	not good to merge, investigate why results changed
commit message expects change	not good to merge, investigate why change didn't have the desired effect	possibly good to merge, let medical / data science team sign off the changed results, then update the gold-standard outputs

Note how this is different from engineering unit-testing:

In engineering unit-testing, the programmer defines and understands precisely what the output of the algorithm is for each single test case. In gold-standard testing, the idea is not to understand the output for each input, but to get notified when outputs change (independent of what exactly the outputs look like). Because of this, gold-standard tests are easier to write: They require no thinking effort from the programmer, they only require input data to run on.
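The four cases of the gold-standard decision can be encoded directly in a CI step. This is a sketch under assumptions: the `Gold-Standard: changes` commit message line is a hypothetical convention for announcing an expected change, and the verdict names are made up for illustration.

```python
from enum import Enum


class Verdict(Enum):
    MERGE = "good to merge"
    UNEXPECTED_CHANGE = "not good to merge: investigate why results changed"
    MISSING_CHANGE = "not good to merge: investigate why the change had no effect"
    NEEDS_SIGN_OFF = "possibly good to merge: get sign-off, then update gold standard"


def commit_expects_change(message: str) -> bool:
    # Hypothetical convention: a line "Gold-Standard: changes" in the
    # commit message announces that results are expected to change.
    return any(
        line.strip().lower() == "gold-standard: changes"
        for line in message.splitlines()
    )


def gold_standard_verdict(expects_change: bool, results_identical: bool) -> Verdict:
    if not expects_change and results_identical:
        return Verdict.MERGE
    if not expects_change and not results_identical:
        return Verdict.UNEXPECTED_CHANGE
    if expects_change and results_identical:
        return Verdict.MISSING_CHANGE
    # Expected change and changed results: humans must review the new outputs.
    return Verdict.NEEDS_SIGN_OFF
```

Note that the code never inspects what the outputs mean; it only needs to know whether they differ from the stored ones, which is exactly what makes this kind of test cheap to set up.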

Make only controlled changes:

Make people announce when they expect a change.
Roll back any unexpected change.
Every change must be traceable to a concrete requirement. This bit can be done with low overhead by having commits and code comments reference issue tracker entries, and the issue tracker being well maintained to link together code features with technical requirements (“feature X shouldn't crash and should be easy to understand”), regulatory requirements (“computation X must not store user data”), or business requirements (“computation X must finish in under an hour”).
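Traceability of commits can be checked mechanically. The sketch below assumes a hypothetical issue key format with the prefixes `TECH-`, `REG-` and `BIZ-` for technical, regulatory and business requirements; substitute whatever your issue tracker actually uses.

```python
import re
from typing import Dict, List

# Hypothetical issue key format, e.g. "REG-12"; adapt to your tracker.
ISSUE_REF = re.compile(r"\b(?:TECH|REG|BIZ)-\d+\b")


def untraceable_commits(commits: Dict[str, str]) -> List[str]:
    """Return the ids of commits whose message references no issue.

    `commits` maps a commit id to its full commit message.
    """
    return [
        commit_id
        for commit_id, message in commits.items()
        if not ISSUE_REF.search(message)
    ]
```

Run in CI, a check like this can reject a merge request until every commit names the requirement it belongs to.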

Process changes

While software engineers love to upgrade their stack and switch tools and processes frequently, medical people tend to hate it. However, there are ways to make them more comfortable with it.

As a product manager or similar role, when you want to make a process change, stick to a predictable order such as:

Analyse what change needs to be made.
Announce that the team will be moving to a new approach X in the future, with a concrete proposal.
Collect feedback, inviting everyone whose workflow might be touched by this move to provide input of how and when it should be done to reduce disruption to a minimum.
Give it a memorable name that people can use for referring to the motion.
Perform coordinated switchover at a pre-announced time, making sure everybody knows about it in advance.

Here is an example:

Let's say it is necessary that data scientists switch their working environment operating system (OS) from Windows to Linux so that developers can more easily reproduce their results in the production software.

Investigate in audience-limited conversations (e.g. with programmers) whether the data scientists' desktop OS has to be changed, or whether it is sufficient that they connect to a Linux machine from their current Windows machines.
Announce that the team would like to move the data scientist workspaces from Windows to Linux within the next three months, and present your concrete proposal so far which may include a video-tutorial based training on how to use the new work spaces remotely from Windows, as well as a dedicated engineer to help with the migration.
Collect feedback such as a data scientist saying that some scripts don't work on Linux. Discuss with this data scientist (but in public) whether an engineer helping to port these scripts to Linux before the move would address that issue. Another data scientist may point out that the move should be done after producing results X but before starting feature Y. Refine the schedule accordingly.
Call the motion “Datasci-Linux”.
Ensure everybody knows that “Datasci-Linux” will be performed in the last week of April.

Team organisation

Have a real ops, tools and help team.

A lean “DevOps”-only approach usually doesn't work with researchers.

While developers like to control machines and servers themselves and the team can be made more efficient that way, researchers like to have their heavy machinery moved by people who understand what they are doing.

Thus, as a manager, you should make sure that:

Ops takes care of researchers' working environments, software needed, computing clusters and so on, so that researchers don't have to spend time on trial and error (unless they want to learn it).
If a researcher wants to do some overnight computation job, assign them an engineer to execute it properly.
Recurring jobs are coded up so they can be more automated. Non-software people are surprisingly unfamiliar with that idea and will happily do the same manual task again and again.
The rule of thumb is: Do it manually 3 times, then code it up (this is a good rule for general software development, but you may have to emphasise it especially in a medical environment where manual procedures that cannot be automated are very common).
Ensure you have people who can continually help with every-day issues with tools the team uses, and are tasked to train everybody in using and understanding version control software and the development model. A lot of time can be wasted if somebody does not understand how to get their changes in the right place with git, pushes things to the wrong branch, and so on.

Separate roles

Define ahead of time what role can block what activity to avoid unnecessary project slowdowns.

As a Project Manager, you should make sure that:

Regulatory people don't use their almost unlimited veto power to block decisions that are outside of their domain. For example, a regulatory reviewer should not use their veto to enforce changes that are irrelevant for regulatory review.
Programmers should be able to block researcher or regulatory decisions when they are not realisable, such as using a given method when it cannot be implemented correctly, accurately, or in time.
You (the Project Manager) are actually able to exercise the power over the schedule and work items that was given to you. A Project Manager's responsibility is to ensure realistic estimates, also at times pushing back against features that executives may want to see on a short timeline, if, based on programmer or researcher feedback, they cannot be realised that quickly.

Managing code, processes and documentation

Version control

Enforce that all code be checked into version control. Make no exceptions here.

Arrange for personal scrap spaces in version control, that are clearly marked as not being under the same scrutiny as “device code”. If you do not do this, researchers and programmers will not check their experiments into version control, and the project will suffer. Examples for such scrap spaces are branches prefixed with wip/ (for work-in-progress), and a personal-workspaces/username directory hierarchy.

In general, always clearly separate device-code and non-device code. This need not mean that they should be in independent source code repositories (as that would forbid ensuring experimental scripts work with the latest version of device-code). Instead, use other explicit means as separation, such as having one directory for device, and one for non-device code.

Relatedly, separate the device from the platform needed to run the device (such as deployment infrastructure and server tools). As mentioned earlier, this is especially important for infrastructure security updates.

You should optimise version control usage for efficiency. For example: Have branches with a doc- prefix only run documentation builds, and skip the big or costly stages other builds may include. People will hate tools for structured working such as version control and CI if they make their workflow slow. Always provide fast ways to do things.
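Selecting CI stages by branch prefix might look like the sketch below. The stage names are hypothetical, and giving `wip/` scrap branches a reduced set of stages is an assumption made for illustration; only the `doc-` rule is taken from the text above.

```python
from typing import List


def stages_for_branch(branch: str) -> List[str]:
    """Pick the CI stages to run for a branch (stage names are hypothetical)."""
    if branch.startswith("doc-"):
        # Documentation-only branches skip all costly stages.
        return ["build-docs"]
    if branch.startswith("wip/"):
        # Assumption: scrap branches get quick feedback only.
        return ["build", "fast-tests"]
    # Everything else gets the full pipeline.
    return ["build", "fast-tests", "gold-standard-tests", "build-docs"]
```

Most CI systems can express such rules in their own configuration format; the point is that the rule is written down, versioned, and keeps the common case fast.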

If possible, use a linear development model in version control (such as a “rebasing” workflow in git). In an environment where reproducibility is of utmost importance, being able to do automatic bisections to find regressions is more important than developers having to resolve more merge conflicts.
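The reason a linear history helps is that finding the first bad commit becomes a plain binary search, which is conceptually what `git bisect` does. A minimal sketch, assuming the commits are ordered oldest to newest, the first is known to be good, the last is known to be bad, and `is_good` is a hypothetical callback that builds and tests one commit:

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit in a linear history.

    Assumes commits[0] is good, commits[-1] is bad, and that once a
    commit is bad all later commits are bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

With a deterministic, automated `is_good` check (for example a gold-standard comparison), the search needs only about log2(n) test runs for n commits.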

Be especially careful with development practices that can scare regulatory people.

TODOs

As a programmer or data scientist,

Don't write: TODO: fix this code.

This may suggest there is a flaw in the device that can make it unsafe, or that it is unfinished.
Assume that regulatory reviewers have no understanding of programming and take the words you write literally.

Do write:

TODO-ENG: Future performance enhancement: While this computes the correct result and is safe to use, we should make this faster by doing XYZ.

For each project, define and document clear criteria for labels like TODO.

For example, you might designate TODO-ENG as a label to mean “irrelevant for the medical device operating correctly, but engineering would like to change this”, and TODO-DEVICE as a label to mean “this must be changed before the release or next major milestone on the roadmap”. You can then ensure before the next milestone that all TODO-DEVICE labels are gone.

Ensure everybody (including regulatory people) knows which label means what. Add this information to your documentation. Also see the next point for more on that.
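A label policy like this is easy to enforce with a small checker. The following sketch assumes the two labels from the example above (TODO-ENG and TODO-DEVICE) are the only permitted ones; the function name and return shape are hypothetical.

```python
import re
from typing import List, Tuple

ALLOWED_LABELS = {"TODO-ENG", "TODO-DEVICE"}
# Matches a bare "TODO" as well as "TODO-<UPPERCASE LABEL>".
TODO_PATTERN = re.compile(r"\bTODO(?:-[A-Z]+)?\b")


def scan_todos(source: str) -> Tuple[List[int], List[int]]:
    """Return (unrecognised, device) lists of 1-based line numbers.

    `unrecognised` lists lines with a bare or unknown TODO label;
    `device` lists lines that must be resolved before the next milestone.
    """
    unrecognised, device = [], []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TODO_PATTERN.finditer(line):
            label = match.group(0)
            if label == "TODO-DEVICE":
                device.append(lineno)
            elif label not in ALLOWED_LABELS:
                unrecognised.append(lineno)
    return unrecognised, device
```

CI could fail on any unrecognised label at all times, and fail on remaining TODO-DEVICE labels only when building a milestone release.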

Enforce documentation for all coding processes

Whenever you make a decision of how things are done in the project, write it down, ideally in version control.

Don't propagate engineering, review, and other process rules by word of mouth. One way regulators assess you is whether you stick to your own processes; they will not be able to find evidence of you doing so if you haven't written the processes down.

Finding documentation

Only having documentation is not enough. It also needs to be discoverable.

Use simple and obvious ways for people to find any documentation they might need. An approach that works well is to place a README file in each sub-project's top level directory (of course under version control), and link to other documents from this entry point.

Use a simple tagging scheme, such as tags in brackets (e.g. [ALIEN-SALIVA-DENSITY-ESTIMATION]) that allows you to place textual anchors and references to them in code and documentation. This is because linking from documentation to documentation (which may be easier, e.g. using hyperlinks) is not enough; you will also need to link from code to docs and from docs to code (and referring to file name plus line number is obviously not a good choice given that code can move around).

Medical device software tends to have a lot of documentation, so you will have many links and references in your project. At the time of an audit, you don't want auditors unable to follow outdated documentation links. Have your tools team write tooling to find dangling links and references, possibly also to produce simple graphs so that you can easily visualise documentation references.
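A dangling-reference check over bracket tags can be quite small. The sketch below makes one loud assumption that is not from the text above: an anchor is written as `ANCHOR [TAG]`, and every other occurrence of `[TAG]` counts as a reference. Your project will need its own documented convention for telling the two apart.

```python
import re
from typing import Dict, Set

# Assumed convention: "ANCHOR [TAG]" defines a tag, a plain "[TAG]" refers
# to it. Tags are upper-case with digits and hyphens, at least 3 characters.
TAG = re.compile(r"(ANCHOR )?\[([A-Z][A-Z0-9-]{2,})\]")


def dangling_references(files: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each referenced but undefined tag to the files referencing it.

    `files` maps a file path to its text content (code or documentation).
    """
    anchors: Set[str] = set()
    references: Dict[str, Set[str]] = {}
    for path, text in files.items():
        for is_anchor, tag in TAG.findall(text):
            if is_anchor:
                anchors.add(tag)
            else:
                references.setdefault(tag, set()).add(path)
    return {tag: paths for tag, paths in references.items() if tag not in anchors}
```

The same `references` mapping could also feed a simple graph of which files refer to which tags.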

Interaction between researchers and engineers

You cannot simply throw a bunch of engineers and researchers together and expect that they will work in perfect symbiosis and produce the desired results.

In many companies, R&D and Engineering may be separate departments that may have developed different ways of working and communicating. This may be even more true when one of the two sides is brought in by a different company or via contracting. Bringing them together often warrants extra planning and being more explicit than usual when setting up joint workflows.

Define clearly who is in charge at each stage of the project.

In the R&D phase, engineering should likely assist researchers to get good results, quickly.
In the productisation phase, researchers should likely assist engineers to make an excellent product.

Discourage walled-off thinking.

Make clear that the success of the project depends on the successful interaction between researchers and engineers.

Most importantly, be aware of the “my side is fine” problem.

Researchers like to think:

These are my preconditions, and they have to be provided by the engineers. If those are provided, we'll be fine.

Engineers like to think:

As long as I code up these maths written by the researchers, I'll be safe.

As a result, neither of the two sides makes sure that the critical preconditions that make the system work are actually provided.

To avoid this, you should make sure each side understands the other well, that the interface between them is understood especially well by both, and that they talk often about it. Encourage mutual training: Have Researchers train Engineers to understand their maths, and Engineers train Researchers to read their code.

Establish critical thinking and a culture where everyone can ask everything.

This is one of the most important bits when trying to make a safe device.

Allow and encourage any question aimed at understanding. “Is this safe to do, and why?” should be a common thing to be heard and written in your project. Establish that this does not question anybody’s reputation. Employ blame-free evaluation and analysis techniques.

Summary

Hopefully you have found these insights useful or interesting.

If you'd like our help with delivering Medical Device software, don't hesitate to contact us.
