How to start a Disaster Recovery Plan for your website
“Why do websites fail, Sir?”
“So that we can learn to back them up”
I’m pretty sure that’s what Alfred meant to say in the now famous scene from Batman Begins.
Most modern organisations view their website as their primary marketing channel. When your entire marketing, lead generation and customer engagement platform is digital – you can’t afford for it to be insecure, unstable, unresponsive, or worse still, offline.
Nobody wants to see their website fail, so having a solid recovery plan in place is the best way to limit the damage if you do suffer an outage.
And these failures most certainly do happen. Remember at the beginning of 2017 when United Airlines grounded all flights due to a computer outage? Or the Salesforce outage that purportedly cost the company $20 million?
While these failures will often make the headlines, it’s the way you handle them that determines whether your company will survive. Earlier this year, GitLab, a start-up funded to the tune of $25 million, suffered an 18-hour outage after a staff member deleted the wrong files. However, GitLab actually earned itself more customers and a stronger reputation to boot by being transparent about what went wrong, getting up and running again as quickly as possible, and putting new procedures in place to ensure the same failure couldn’t happen again.
Extremely impressed with the level of transparency- good luck getting it cleaned up.
— Adam Caudill (@adamcaudill) February 1, 2017
Our Technical Lead, Jonathan Rhodes, was recently interviewed for a Business Guide on Disaster Recovery and so we thought you’d be interested to learn more about DR strategies and the security precautions that can be employed to mitigate the risk of website failures. Hopefully these will help you in the event that you ever find yourself in the same position as those described above.
The Weakest Link
What causes the disasters that our websites need to recover from?
Obviously, power outages at your webhost and other similar scenarios do happen, but their impact can often be reduced by using cloud-based hosting like AWS, or edge caching services like Cloudflare, to keep a basic version of your site running even if extended functionality is unavailable. Hacking is undoubtedly a serious concern, depending on the type of information your platform provides access to, and there are many examples of exploits (some, like the Shadow Brokers Exploit, dating back to 2013) that are still causing problems now. Here, we’re going to focus on security-related areas that you should consider addressing.
The security of your platform can be compromised in a number of ways, including disgruntled or misinformed employees, lack of encryption, out-of-date software and external contractors. Following the advice below should help put you on the right track to bolstering your security.
Humans can be unpredictable. While in most cases this is a good thing and creates an interesting and diverse workplace, sometimes a disgruntled employee can take direct action against the company, and being so close to your systems, they can do the most damage.
The principle of least privilege (PoLP; also known as the principle of least authority) is an important concept in computer security, promoting minimal user profile privileges on computers, based on users’ job necessities.
- Identify privileged accounts and terminate those that are no longer in use or that belong to employees who are no longer at the company.
- Monitor, control and manage privileged credentials to prevent exploitation.
- Implement necessary protocols and infrastructure to track, log and record privileged account activity, and create alerts, to allow for a quick response to malicious activity and mitigate potential damage early in the attack cycle.
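The first step in the list above, finding privileged accounts that should be switched off, can be sketched in a few lines. This is a minimal illustration rather than a real integration: the `Account` record, the `accounts_to_disable` function and the 90-day idle limit are all assumptions made for the example, and in practice the data would come from your own user directory or CMS.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Account:
    """Hypothetical account record; field names are assumptions."""
    username: str
    privileged: bool
    employed: bool                   # False once the owner has left the company
    last_login: Optional[datetime]   # None if the account has never been used


def accounts_to_disable(
    accounts: List[Account],
    now: datetime,
    max_idle: timedelta = timedelta(days=90),  # assumed policy, adjust to suit
) -> List[str]:
    """Return usernames of privileged accounts that are orphaned or idle."""
    flagged = []
    for acct in accounts:
        if not acct.privileged:
            continue  # only privileged accounts are in scope here
        idle = acct.last_login is None or now - acct.last_login > max_idle
        if not acct.employed or idle:
            flagged.append(acct.username)
    return flagged
```

Running a check like this on a schedule, and reviewing the flagged names before disabling anything, keeps the list of privileged accounts in step with who actually needs them.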
While a disgruntled employee can mean bad news for your business, sometimes a well-meaning but misinformed employee can do just as much damage.
- Train staff in security best practices and keep them up-to-date.
- Use strong passwords and consider enforcing policies around how frequently staff should reset them. The Force Password Change module allows this functionality on a Drupal website.
- Don’t give away the keys to the city: Use separate passwords for each website account.
- Use multifactor authentication where available. Drupal websites can provide two-factor authentication through integration with third-party services like Authy and Duo.
- Encrypt data at rest with strong encryption, such as AES 256-bit, as well as when it is being transferred over a network. Drupal has the Encrypt module for that.
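To give a feel for what the multifactor bullet above involves, here is a minimal sketch of the time-based one-time password (TOTP) scheme defined in RFC 6238, which many authenticator apps use for their six-digit codes. It is an illustration using only the Python standard library, not the implementation used by Drupal, Authy or Duo, and the function names are our own.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low 4 bits pick the offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step count."""
    return hotp(secret, unix_time // step, digits)


def verify(secret: bytes, submitted: str, unix_time: int, step: int = 30) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, unix_time, step), submitted)
```

The point of the second factor is that the code depends on a shared secret and the current time, so a stolen password on its own is not enough to log in. A production setup would also need to handle clock drift and rate-limit attempts, which this sketch leaves out.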
The reality is that no software or website is ever 100% free from bugs. While these bugs are often benign, sometimes they leave the door wide open for malicious attacks. Fortunately, with Open Source platforms like Drupal, there is a large community of developers constantly working to identify and fix bugs in the underlying CMS platform, as well as in the thousands of contributed modules, and supplying these fixes to the wider community as patches or updates. But it is the responsibility of the website owner to make sure these updates are applied.
- Inside your business, keep software up-to-date on all computers, including appliances such as routers and file servers.
- Consider instituting a centralised patch management program to ensure that devices and software are kept up to date at all times.
- Ensure that your core website, third-party frameworks and server environment are always running the latest versions of any required software. This is something our Website Support and Optimisation Team look after for many of our clients.
Be mindful of who you grant access to your website and software. It’s much more difficult to control what an external contractor is doing, or the environment that they are operating in. Shoulder surfing is not just something we do on the beaches of Australia!
- Validate that any third party follows remote access security best practices, such as enforcing multi-factor authentication, requiring unique credentials for each user, setting least-privilege permissions and capturing a comprehensive audit trail of all remote access activity.
- Disable third-party accounts as soon as they are no longer needed, monitor failed login attempts, and set up alerts so that you are notified right away of a suspected attack.
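Monitoring failed logins, as suggested above, usually comes down to counting failures per account within a rolling time window. The sketch below shows the idea; the `FailedLoginMonitor` class, the threshold of five failures and the five-minute window are assumptions chosen for illustration, and a real deployment would feed it from your web server or CMS logs and hook the alert into email or chat.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flags an account once it has too many recent failed logins."""

    def __init__(self, threshold: int = 5, window_seconds: int = 300):
        self.threshold = threshold       # failures needed to raise an alert
        self.window = window_seconds     # how far back failures are counted
        self._failures = defaultdict(deque)

    def record_failure(self, account: str, timestamp: float) -> bool:
        """Record a failed login; return True if an alert should be raised."""
        recent = self._failures[account]
        recent.append(timestamp)
        # Drop failures that have fallen outside the rolling window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold
```

Keeping the window short and the threshold low for contractor accounts is a reasonable starting point, since those accounts tend to be used less often and from less predictable locations.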
Data loss prevention (DLP) is a strategy for making sure that users do not send sensitive or critical information outside the corporate network. Tools are available to enforce business rules, and to be effective must cover all outbound Internet traffic, on all devices. For example, if an employee tried to forward a business email outside the corporate domain or upload a corporate file to a consumer cloud storage service like Dropbox, the employee would be denied permission.
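The email example above can be expressed as a very small rule. This is only a sketch of the kind of check a DLP tool applies, not how any particular product works; the `blocked_recipients` function and the domain names in the usage example are hypothetical.

```python
from typing import Iterable, List


def blocked_recipients(recipients: Iterable[str],
                       allowed_domains: Iterable[str]) -> List[str]:
    """Return the recipients whose domain is outside the allowed list."""
    allowed = {domain.lower() for domain in allowed_domains}
    blocked = []
    for address in recipients:
        # Everything after the last "@" is treated as the domain; an address
        # with no "@" never matches an allowed domain, so it is blocked.
        domain = address.rsplit("@", 1)[-1].lower()
        if domain not in allowed:
            blocked.append(address)
    return blocked


# Example usage with made-up addresses:
outside = blocked_recipients(
    ["colleague@example.com", "friend@personal-mail.test"],
    allowed_domains=["example.com"],
)
if outside:
    print("Message held, external recipients:", outside)
```

Real DLP tools go much further, inspecting attachments and uploads as well as recipients, but the principle of checking outbound traffic against business rules is the same.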
Build your Disaster Recovery Plan
Conduct a risk assessment to identify where your valuable data resides and what controls or procedures are in place to protect it. Then build out a comprehensive incident response and disaster recovery/business continuity plan, determining who will be involved, from IT, to legal, to PR, to executive management, and test it.
This testing phase is one of the most important components of a strong DR strategy. On top of confirming that the DR plan works, a DR drill gives all teams involved a better idea of how much data could be lost and, equally important, how long it will take to recover. So if the worst does happen, you’re prepared, rehearsed and in the best possible position to recover.
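The two numbers a drill produces, how much data could be lost and how long recovery takes, are often compared against a recovery point objective (RPO) and a recovery time objective (RTO). Those terms and the `drill_report` function below are our additions for illustration; the sketch simply turns three timestamps from a drill into those two measurements.

```python
from datetime import datetime, timedelta


def drill_report(last_backup: datetime, failure: datetime, restored: datetime,
                 rpo_target: timedelta, rto_target: timedelta) -> dict:
    """Summarise a DR drill against assumed RPO and RTO targets."""
    data_loss_window = failure - last_backup  # work done since the last backup
    downtime = restored - failure             # time taken to get back online
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo_target,
        "rto_met": downtime <= rto_target,
    }
```

If a drill shows either target being missed, that points to where the plan needs work: more frequent backups in the first case, a faster or better rehearsed restore process in the second.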