People used to think software development was “done” when the code was written and passed all its tests. But modern IT systems aren’t done until they are online, running, and integrated with their data feeds, storage, networks, and administration systems. This can involve very elaborate operations steps, such as dynamically creating a whole array of virtual servers, storage devices, accounts, software configurations, and network configurations.
These days, almost every company wants DevOps to automate such powerful infrastructure, speed up release cycles, and improve reliability and uptime. As an ultra-high-value and rapidly changing industry, Financial Technology lives near the cutting edge of innovation. Its DevOps can include continuous integration and continuous deployment (CI/CD), automated testing, containerization (as with Docker and Kubernetes), system monitoring, use of cloud features (like AWS, Azure, or a private cloud), virtual private clouds, extensive firewalls, advanced network security, and more.
FinTech IT projects have a lot at stake, and wise engineers will hand-pick DevOps priorities to match the project’s objectives and exposures. Let’s look first at what FinTech overall should expect from DevOps, and then at how different subfields should emphasize additional, specialized DevOps requirements.
What every FinTech solution needs from DevOps
Compared to other industries, FinTech places an unusually high priority on:
- Maintainability: improved analyses, features, and data integrations must roll out very frequently, with very low latency
- Quality: an uncaught mistake or security hole quickly runs to millions of US Dollars in cost, sometimes much more
- Data integration: FinTech applications are fundamentally about digesting a never-ending stream of new information, and the more feeds (or the more atomic inputs) that can be handled, the better.
DevOps is maintainability’s best friend. As far back as 2013, here at FP Complete, we were releasing large upgrades to major server applications every few weeks. These days, at large Internet companies it is routine to see daily release cycles, and faster is quite doable.
FP Complete recommends a fully automated Continuous Integration and Continuous Deployment (CI/CD) system, including automated builds, an automated test suite, immutable containerized servers, and post-deployment health checks with rollback capability (blue-green deployment). If you haven’t implemented containers yet, you almost certainly should.
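The post-deployment check can be sketched as a simple gate: the new environment only takes traffic if it passes repeated health probes, and otherwise the old environment stays live. This is a minimal illustration, not a real deployment tool; `health_check` and `switch_traffic` are hypothetical stand-ins for actual probes and load-balancer updates.

```python
# Sketch of a blue-green cutover gate: stand up "green" alongside "blue",
# switch traffic only if health checks pass, otherwise keep "blue" live.
# `health_check` and `switch_traffic` are placeholders for real probes
# (HTTP checks, metrics) and a real load-balancer flip.

def blue_green_cutover(current, candidate, health_check, switch_traffic):
    """Return the environment left serving traffic."""
    if all(health_check(candidate) for _ in range(3)):  # repeated probes
        switch_traffic(candidate)   # e.g. repoint the load balancer
        return candidate            # green is now live; blue can be deleted
    return current                  # candidate unhealthy: automatic rollback

# Usage: a candidate that passes its probes gets the traffic.
live = blue_green_cutover("blue", "green",
                          health_check=lambda env: env == "green",
                          switch_traffic=lambda env: None)
print(live)  # green
```

The point of the structure is that rollback is the default: nothing has to "undo" a bad release, because traffic never moved.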
FP Complete also recommends a formal Quality Assurance system.
Data integration can be quite application-specific, but FP Complete recommends choosing a very small number of supported data formats, and having a clean layer providing these formats after data ingestion, normalization, and cleansing from a more diverse set of inputs. A service-oriented architecture (SOA) can make it easy to add new data-feed parsers in a completely language-independent manner, ensuring system extensibility. (See my comments on “Modular Design” here.) Automated deployment and monitoring let you run more services as separate processes, by eliminating the need to manually examine each server’s status constantly.
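The normalization layer described above can be sketched as a table of per-feed parsers that all emit one small canonical record format, so downstream consumers never see the diversity of the inputs. The feed names and field layouts here are purely illustrative.

```python
# Sketch of a normalization layer: per-feed parsers convert diverse raw
# inputs into one canonical record format. Feed names, fields, and input
# shapes are illustrative, not a real API.

from datetime import datetime, timezone

def parse_feed_a(raw):
    """Feed A arrives as CSV text, e.g. 'AAPL,189.30,1718000000'."""
    sym, price, ts = raw.split(",")
    return {"symbol": sym, "price": float(price),
            "time": datetime.fromtimestamp(int(ts), tz=timezone.utc)}

def parse_feed_b(raw):
    """Feed B arrives as JSON-like dicts with its own field names."""
    return {"symbol": raw["ticker"], "price": float(raw["px"]),
            "time": datetime.fromisoformat(raw["when"])}

PARSERS = {"feed_a": parse_feed_a, "feed_b": parse_feed_b}

def normalize(feed_name, raw):
    """Single clean entry point: every consumer sees one record format."""
    return PARSERS[feed_name](raw)
```

Adding a new feed then means adding one parser and one table entry; in a service-oriented deployment, each parser can even run as its own process in its own language, as long as it emits the canonical format.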
So DevOps can be a great help to FinTech in general. But there is still far more to be had -- and our next priorities depend on which kind of FinTech solution we are building.
DevOps for Cryptocurrency
Cryptocurrency systems are of course sensitive to attacks in which a person attempts to steal the coins. Numerous real-world losses have been traced to a failure to implement proper DevOps, leaving opportunities for criminals.
If you’re implementing or trading a cryptocurrency, here are some DevOps issues to focus on right away:
- Automated testing. If your build system allows you to release code that has not been through your test suite -- or worse, allows you to be unsure whether the released code was tested -- you are taking undue risk. Quality assurance automation should be a core part of your build system. This is even truer if you are using CI/CD, where code improvements may be released quite frequently. “I write quality code in the first place” is great, but it’s not a substitute for automated testing.
- Component isolation. To minimize the chances of malfeasance, sensitive systems should be modular in design, and unrelated components should run in separate processes -- ideally in completely separate VMs separated by firewalls. A defect, code injection, privilege escalation, or social-engineering attack on one service or component should still be unable to tamper with another.
- Storage redundancy. It’s amazing to think that some people implement trading and coin-storage systems without redundant storage. With many cryptocurrencies, your coins can be permanently lost if a unique code number, only a few kilobytes in length, is lost. Use automated deployments on the virtual cloud to ensure that all your trading and management systems are always deployed to redundant cloud storage with inherent fault tolerance and permanent backups.
- Separation of roles. The amount of value accessed by some cryptocurrency components is so high that you must consider the impact of a compromised person. If your deployment architecture has a single “admin” role that gives one person the ability to deploy code, access storage, turn off monitors, and change audit logs, you are asking for that person to get into trouble. Don’t tempt anyone to pressure your staff: make it impossible for any one person to change where large sums of money go, or at least make it impossible for them to do so without setting off alarms. Create different admin roles for different system components and layers -- roles that are not available to the same person at the same time.
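Separation of roles can be checked mechanically: declare which role combinations are mutually exclusive, then scan role assignments for anyone holding a conflicting pair. This is a toy policy checker with illustrative role names; a real system would enforce the same rule in its IAM or access-control layer.

```python
# Sketch of a separation-of-roles audit: no single person may hold both
# halves of a conflicting pair (e.g. the ability to move money AND the
# ability to hide that it happened). Role names are illustrative.

CONFLICTING_PAIRS = [
    {"deploy_code", "edit_audit_log"},
    {"access_storage", "disable_monitoring"},
]

def violations(assignments):
    """assignments: {person: set of role names}. Returns offenders."""
    return [person for person, roles in assignments.items()
            if any(pair <= roles for pair in CONFLICTING_PAIRS)]

# Usage: alice and bob each hold one half of a pair -- fine.
# eve holds both halves -- flagged.
print(violations({
    "alice": {"deploy_code"},
    "bob": {"edit_audit_log"},
    "eve": {"deploy_code", "edit_audit_log"},
}))  # ['eve']
```

Running such a check in CI against your infrastructure-as-code definitions turns a governance policy into a failing build rather than a post-incident discovery.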
DevOps for automated trading
If you are trusting your computer system to move and trade assets autonomously, you need absolute correctness, inviolable security, and very rapid response to trouble incidents. In addition to the concerns I listed for cryptocurrency, pay attention to these DevOps priorities:
- Automated load testing and regression testing. Since you are likely to update your trading algorithms many times a year, there are many opportunities to introduce performance problems. If a slow trade can be worse than no trade, no build should be allowed to go into production without automated performance testing under heavy simulated load. It’s not enough to say “my code runs fast,” you need to be able to say “my whole deployed system runs fast, even when bombarded by fast inputs.”
- Immutable servers. It is incredibly tempting to patch production systems with improvements. But without careful controls, this leads to having production servers in a state that is completely unprecedented in your test environment. Instead, use automated deployment to create new copies of your servers with the new code already in place -- and when these pass your test suite, swap them in, make sure they’re up, then shut down and delete the old unpatched virtual servers. This kind of roll-forward can be completely automated with tools like Kubernetes, and can take advantage of cautious switch-over techniques like blue-green deployments or canary deployments.
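The performance-testing gate above can be made concrete: measure worst-case latency over a burst of simulated inputs and fail the build if it exceeds a budget. This is a minimal sketch; the handler, input burst, and latency budget are placeholders for a real load test against the whole deployed system.

```python
# Sketch of an automated performance gate: refuse to promote a build if
# tail latency under a simulated input burst exceeds a budget. The
# handler and budget here are illustrative placeholders.

import time

def measure_p99(handler, inputs):
    """Return roughly the 99th-percentile latency (seconds) over a burst."""
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        handler(x)
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return latencies[max(0, int(len(latencies) * 0.99) - 1)]

def performance_gate(handler, inputs, budget_seconds):
    """True iff the build may be promoted to production."""
    return measure_p99(handler, inputs) <= budget_seconds
```

Note that the gate measures the system's behavior under load, not a single call: "my code runs fast" becomes a property the pipeline verifies on every build.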
DevOps for human-assisted trading
This is a relatively forgiving application if your users are in-house or otherwise very tolerant of imperfection. (If you’re providing trading services to external parties, you can expect to be held to a very high standard.) Ask yourself, “what is the cost of a typical failure to our business?” Systems with a human in the loop are sometimes more error-tolerant than systems without a human in the loop.
However, you will need to think much more about usability testing, because a confusing UI update can introduce human error; and your automated test suite should include tests that drive the system through the UI, to detect coding bugs in that layer.
Human users of FinTech systems are often very powerful people, and a small number of unhappy users can make for a very bad day. So in addition to extensive testing, your DevOps practices may need to include gradual deployment of new versions to a test population of a few users (canary deployments), and support for both halts and rollbacks in case of significant trouble reports.
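Canary routing can be sketched as a deterministic bucketing function: hash each user into a stable bucket, and send only the buckets below the canary percentage to the new version. The function name and scheme here are illustrative, but the hashing trick is standard, and setting the percentage to zero acts as an instant halt.

```python
# Sketch of canary routing: deterministically send a small fraction of
# users to the new version. Hashing the user id keeps each user's
# assignment stable across requests, so trouble stays contained to the
# same small test population.

import hashlib

def serves_canary(user_id, canary_percent):
    """Stable per-user bucket in [0, 100); True means 'new version'."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < canary_percent

# Rolling back is just lowering the percentage; 0 halts the canary
# entirely, and 100 completes the rollout.
```

A gradual rollout then becomes a sequence of percentage increases, each gated on the absence of trouble reports from the canary population.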
DevOps for asset valuation, market analysis, and research
These tasks often amount to a medium amount of math, performed on a large number of input feeds and databases. Many firms construct unique asset valuation formulas by insightfully combining data that no one else was combining, or doing complex combinations with uniquely clever functions and formulas. Competitive advantage comes from generating unique insights, and these come from the ability to scale up innovative formulas and innovative data integrations quickly.
At FP Complete we regularly hear from FinTech firms that have built important analyses running on just one or a few desktops, who need these scaled up to a reliable server-based system. Beyond all the usual software engineering and cloud deployment problems, a key concern is maintainability. Analysts are used to updating their formulas several times per month and then sharing them with colleagues. And it’s important that new versions don’t automatically take the old versions offline, since a colleague may still be using them.
A version control system attached to a CI/CD system works wonders for safe maintainability. But it needs to be coupled to a simple metadata system that keeps track of which versions are now running at which addresses -- and which allows versions with zero remaining clients to be shut down.
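The metadata system described above can be sketched as a small registry: it records which version of each analysis runs at which address, tracks attached clients, and reports which versions have zero remaining clients and are therefore safe to retire. Names and addresses here are illustrative.

```python
# Sketch of the version-tracking metadata registry: which analysis
# versions run where, who is still using them, and which are retirable.
# Class and field names are illustrative.

class VersionRegistry:
    def __init__(self):
        self.running = {}   # (name, version) -> deployed address
        self.clients = {}   # (name, version) -> set of client ids

    def deploy(self, name, version, address):
        self.running[(name, version)] = address
        self.clients.setdefault((name, version), set())

    def attach(self, name, version, client):
        self.clients[(name, version)].add(client)

    def detach(self, name, version, client):
        self.clients[(name, version)].discard(client)

    def retirable(self):
        """Versions with zero remaining clients: safe to shut down."""
        return [key for key in self.running if not self.clients[key]]
```

Wired into a CI/CD pipeline, `deploy` runs on every release and `retirable` drives automated cleanup, so old formula versions linger exactly as long as a colleague still depends on them and no longer.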
DevOps for consumer banking and account management
These applications require an exceptional amount of integration with legacy systems, some very old. They have to maintain an extremely consistent user interface, for use by clients who can be upset by unnecessary change. They have all the requirements of an e-commerce application, such as resistance to sudden surges in demand. And they are subject to extremely large-scale Web-based security attacks, as the payoff for a successful criminal break-in could be enormous.
DevOps for voting
Voting, such as for shareholder votes or Board of Directors elections, is a particularly sensitive subject, with huge decisions being made and significant legal exposure. Ordinary voting is rather similar to consumer banking and e-commerce (using votes as the currency), but where governance rules require anonymity, standards increase enormously. You must earn voter confidence, and your systems should be able to pass a really rigorous audit, including against insider malfeasance, while protecting the privacy of each voter.
For such systems we recommend a DataOps solution in which raw inputs (from user interaction) are fully quarantined from the apps that handle persistent data storage, using very assertive firewalls and very low-permission accounts. An auditor must be able to verify that systems holding private user data are completely inaccessible from unauthorized locations.
Since the anonymization steps at the application layer may be intentionally irreversible, anonymized data should be stored with very high redundancy: once the identity data has been discarded, the anonymized records may be impossible to auditably reconstruct from scratch.
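An intentionally irreversible anonymization step can be as simple as replacing the voter identity with a salted hash, where the salt is destroyed after the election. This is a toy illustration of why the stored records need redundancy, not a complete voting protocol; field names are illustrative.

```python
# Sketch of an irreversible anonymization step: a salted hash replaces
# the voter identity before storage. Once the salt is destroyed, the
# mapping cannot be recomputed -- which is exactly why the anonymized
# records themselves must be stored with high redundancy.

import hashlib
import secrets

ELECTION_SALT = secrets.token_bytes(16)   # destroyed after the election

def anonymize_ballot(voter_id, choice):
    """Replace the identity with a stable, non-reversible token."""
    token = hashlib.sha256(ELECTION_SALT + voter_id.encode()).hexdigest()
    return {"voter_token": token, "choice": choice}
```

Within one election the token is stable, so double-voting can still be detected, yet the raw identity never reaches persistent storage.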
Compliance, regulation, and auditability
For applications subject to extensive outside controls, it’s important to demonstrate adherence to the spec (application verification) and to be able to trace concerning behavior back from the running system to the code and checkins that caused it (traceability).
For verification, ensure that you have an automated test suite with organized test case management and that it is automatically fired up as a part of your CI/CD system. Be sure your full test suite is run before real deployments, not just your quick check test suite (sometimes called the smoke test) which is automatically run every time a build is done.
For traceability, ensure that your CI/CD system inserts serial numbers into your distributable software containers and other built artifacts, and records what artifact version numbers were used (including source code, libraries, and tools). And require that checkins to your version control system include links back to the requirements they were meant to satisfy.
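The traceability record can be sketched as a build manifest the CI system emits alongside each artifact: a serial number plus the exact source commit, library versions, and tool versions that went into the build. The field names here are illustrative.

```python
# Sketch of a build manifest for traceability: the CI system records a
# serial number plus the exact inputs to the build, so any behavior in
# production can be traced back to code and checkins. Fields are
# illustrative.

import json

def build_manifest(serial, git_commit, libraries, tools):
    """Emit a manifest to ship alongside (or inside) the artifact."""
    manifest = {
        "serial": serial,
        "source_commit": git_commit,
        "libraries": libraries,   # e.g. {"openssl": "3.0.13"}
        "tools": tools,           # e.g. {"gcc": "13.2"}
    }
    return json.dumps(manifest, sort_keys=True)

def trace(manifest_json):
    """From a running artifact's manifest, recover the commit that built it."""
    return json.loads(manifest_json)["source_commit"]
```

With checkins also linked to requirements, an auditor can walk the full chain: observed behavior, to artifact serial, to commit, to the requirement it was meant to satisfy.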
To ensure that what went into production is still what’s in production, don’t grant permission to apply manual changes to running servers. Create admin accounts with limited permissions that can’t be used as “back doors,” and use immutable servers so that a new deployment is required when someone wants to change what’s on a server.
Security and endurance against direct attack
If your application is on the public Web, the server cloud design, the software maintenance schedule, and the network/firewall design all need to withstand malicious treatment. The average IT organization currently spends 12% to 15% of its budget on security.
DevOps can do much more to defend you than many people realize, and can make the most of your security budget.
Many security breaches happen through social engineering. Reduce these opportunities by automating control of your servers under distinct robot admin accounts, ones that normal users never use. Keep dangerous permissions away from regular IT staff going about their days.
Other critical breaches have famously happened through old, unpatched software with known vulnerabilities. Routinely audit the versions of operating systems and runtime components that are installed on all your servers, to ensure you don’t have obsolete ones in production. This can be largely automated.
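The version audit described above can be automated as a comparison of what is installed on each server against a minimum-supported table. This sketch uses illustrative package data; a real audit would pull the installed versions from configuration management or package-manager queries.

```python
# Sketch of an automated version audit: compare installed package
# versions on each server against a minimum-supported floor and flag
# anything obsolete. The inventory data is illustrative.

def parse_version(v):
    """'3.0.13' -> (3, 0, 13), so versions compare numerically."""
    return tuple(int(part) for part in v.split("."))

def audit(installed, minimum_supported):
    """installed: {server: {package: version}}. Returns flagged findings."""
    findings = []
    for server, packages in installed.items():
        for pkg, version in packages.items():
            floor = minimum_supported.get(pkg)
            if floor and parse_version(version) < parse_version(floor):
                findings.append((server, pkg, version))
    return findings
```

Run on a schedule and wired to alerting, this turns "we think everything is patched" into a daily machine-checked fact.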
Security breaches are worsened by having far too much access available in a single place, allowing a small intrusion to escalate into a big one. Take advantage of cloud network configurability, separate VMs and containers, and firewalls, to ensure that critical attack targets (like databases and production servers) are hard to reach and extra hard to enter as administrator, and to ensure that critical attack vectors (like front-end servers) are quarantined, firewalled, and monitored for unauthorized activity.
How do we get there from here?
Unlike some older technologies, DevOps is not monolithic and can be implemented in small steps over an arbitrary period of time. Even the sequence of these steps is flexible. FP Complete recommends an incremental approach.
If you already have traditional software engineering and traditional system operations and monitoring in place, focus next on (A) streamlining your software engineering environment, or (B) containerizing and automating your deployment systems. Either makes a great next step and will put you well on the road to complete DevOps.
As always, remember that FP Complete is available to do a readiness assessment project for your DevOps, cloud, and other IT systems engineering. With experience on numerous advanced IT projects, we’re happy to team up with you with planning, design, implementation, knowledge transfer, audits, and upgrades.
For More Information
- Ten Common Mistakes to Avoid in FinTech Software Development
- How to Measure the Success of DevOps
- How do I build my DevOps Team?