Disaster Recovery Plan

Overview

This document outlines the Seal platform Disaster Recovery Plan (DRP) and incident response procedure as part of our commitment to maintaining service reliability. We take incidents seriously and have established a structured approach to handle any service disruptions or technical issues.

Objectives

The principal objective of the DRP is to develop, test and document a well-structured and easily understood plan which will help the company recover as quickly and effectively as possible from an unforeseen disaster or emergency which interrupts information systems and business operations. Additional objectives include the following:

The need to ensure that all employees fully understand their duties in implementing such a plan
The need to ensure that operational policies are adhered to within all planned activities
The need to ensure that proposed contingency arrangements are cost-effective
The need to consider implications on other company sites
Disaster recovery capabilities as applicable to key customers, vendors and others

Disaster Recovery Plan

1. Initial Response

1.1 Incident Detection

When an incident is detected, the Seal Team initiates a well-coordinated response to ensure timely action and clear communication. The first step involves establishing a dedicated incident response channel where all relevant communications are recorded for transparency and future review.

Simultaneously, the necessary technical team members are assembled to begin diagnosing and addressing the issue. Incident coordination protocols are activated, and the customer deployment team is looped in to ensure alignment across all fronts.

1.2 Communication

Communication during an incident is handled with urgency and transparency.

Our customer deployment team immediately notifies any affected customers, providing timely and consistent status updates throughout the resolution process. We prioritise keeping customers informed about the incident's impact and our ongoing efforts to restore normal operations.

2. Technical Response

2.1 Deployment Rollback Protocol

If the incident is determined to be associated with a recent deployment, our technical team engages a structured rollback protocol designed to restore service with minimal disruption.

2.1.1 Initial Assessment

The rollback process begins with an initial assessment. The team first verifies whether any recent database schema or content changes were involved. From there, they evaluate the safest rollback strategy and identify the most recent known stable version of the platform to target for restoration.

2.1.2 Recovery Actions

Based on the assessment, the team takes decisive action. If no database changes are involved, a rapid rollback to the previously stable version is executed to quickly bring systems back online. If database changes are present, a more controlled recovery process is implemented. This ensures data consistency and minimises any risk of corruption or data loss during the rollback.
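The assessment and recovery decision described in 2.1.1 and 2.1.2 can be sketched in a few lines of Python. This is an illustrative sketch only, not Seal's actual tooling: the `Deployment` record, its fields, the `RollbackStrategy` names, and the `plan_rollback` function are all hypothetical. It assumes a deployment history ordered oldest to newest, in which the last entry is the suspect release.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RollbackStrategy(Enum):
    RAPID = "rapid"            # no database changes involved
    CONTROLLED = "controlled"  # database changes must be handled carefully


@dataclass(frozen=True)
class Deployment:  # hypothetical record, for illustration only
    version: str
    stable: bool          # True if the release is known to be stable
    has_db_changes: bool  # True if it shipped schema or content changes


def plan_rollback(
    history: list[Deployment],
) -> tuple[Optional[Deployment], RollbackStrategy]:
    """Pick the most recent known stable release and a rollback strategy.

    `history` is ordered oldest to newest; the last entry is the suspect release.
    """
    target_index = None
    # Walk backwards, starting from the release just before the suspect one.
    for i in range(len(history) - 2, -1, -1):
        if history[i].stable:
            target_index = i
            break
    if target_index is None:
        # No known stable release: assume the cautious path.
        return None, RollbackStrategy.CONTROLLED
    # Every release after the target would be undone by the rollback.
    undone = history[target_index + 1:]
    if any(d.has_db_changes for d in undone):
        return history[target_index], RollbackStrategy.CONTROLLED
    return history[target_index], RollbackStrategy.RAPID
```

Falling back to the controlled path when no stable release is found is a design assumption of this sketch: when the history is unclear, it treats the situation as if database changes were present.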

2.2 Service Restoration

Once the rollback or fix has been applied, efforts focus on service restoration.

The team will work to restore API services, verify that the frontend application is functioning correctly, and validate the overall stability of the system to ensure all components are operating as expected.

This step ensures that both internal and external-facing components of the platform are functioning correctly before the incident is declared resolved.
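One way to make the "all components verified before resolution" rule explicit is a small checklist runner. The sketch below is an assumption about how such a gate could look, not a description of Seal's tooling: `verify_restoration`, `is_resolved`, and the check names in the usage example are hypothetical, and each check is any callable that returns True when its component is healthy.

```python
from typing import Callable

Check = Callable[[], bool]


def verify_restoration(checks: dict[str, Check]) -> dict[str, bool]:
    """Run every check and record whether it passed.

    A check that raises an exception is recorded as failed rather than
    aborting the remaining checks.
    """
    results: dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def is_resolved(results: dict[str, bool]) -> bool:
    """The incident counts as resolved only if checks ran and all passed."""
    return bool(results) and all(results.values())
```

A usage example with placeholder checks (real checks would probe the API, the frontend, and overall stability):

```python
results = verify_restoration({
    "api": lambda: True,
    "frontend": lambda: True,
    "stability": lambda: False,  # placeholder for a failing check
})
print(is_resolved(results))  # False, because one check failed
```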

2.3 System Monitoring

Throughout the technical response, robust system monitoring plays a critical role. We continuously track key metrics including:

Application error rates and performance metrics
Database query performance
API service status and response times
Background task processing
Cloud infrastructure metrics
SSL certificate status
External service dependencies
Frontend application health

This monitoring enables the team to identify secondary issues early and verify that the system remains healthy post-restoration.
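As a rough illustration of how tracked metrics can surface secondary issues, the sketch below compares current readings against thresholds. The metric names, the threshold values, and the `breached` helper are hypothetical and chosen only for the example; the source does not specify which monitoring tools or limits are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    limit: float
    higher_is_worse: bool = True  # False for metrics such as days until expiry


def breached(metrics: dict[str, float], thresholds: dict[str, Threshold]) -> list[str]:
    """Return the sorted names of metrics that violate their threshold.

    A metric with a threshold but no current reading is also reported,
    since missing data may itself indicate a problem.
    """
    alerts = []
    for name, rule in thresholds.items():
        value = metrics.get(name)
        if value is None:
            alerts.append(name)
        elif rule.higher_is_worse and value > rule.limit:
            alerts.append(name)
        elif not rule.higher_is_worse and value < rule.limit:
            alerts.append(name)
    return sorted(alerts)


# Hypothetical thresholds, for illustration only.
THRESHOLDS = {
    "error_rate": Threshold(0.01),
    "api_p95_ms": Threshold(500),
    "ssl_days_remaining": Threshold(14, higher_is_worse=False),
}
```

Running the same evaluation after restoration gives a simple, repeatable signal that the system has returned to a healthy state.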

3. Post-Incident

After services are fully restored, the focus shifts to communication and continuous improvement.

3.1 Customer Communication

The customer deployment team provides a detailed summary of the incident, ensuring that affected customers are clearly informed about the resolution and any actions taken. This direct communication reinforces our commitment to transparency and accountability.

3.2 Internal Review

Internally, the technical team conducts a comprehensive review of the incident. This includes documenting every action taken, analysing the root cause, and developing in-depth technical documentation for any infrastructure changes made during the event. The goal is to understand the factors that contributed to the incident and identify opportunities to prevent similar issues in the future.

3.3 Process Improvement

The final step involves integrating lessons learned into our processes. We review and refine our incident response procedures, implement technical improvements where gaps were identified, and enhance monitoring and alert systems to better detect and prevent similar issues in the future. Key insights are also shared across technical teams to reinforce a culture of learning and operational excellence.

Practice DRP Exercises

Practising disaster recovery exercises is a fundamental component of a comprehensive DRP. These exercises serve not only to validate the plan but also to build confidence and readiness across all teams involved. This practice ensures that Seal's disaster recovery procedures are actionable, effective, and responsive to real-world scenarios.

Seal conducts DRP exercises every quarter. DRP exercises come in various forms, including (but not limited to):

Tabletop exercises where teams walk through scenarios in a discussion-based format
Walkthroughs that review documentation step by step
Simulation exercises that mimic real-world events without affecting operations
Interruption tests that shut down systems to test recovery

Each type offers unique insights and value depending on the organisation’s maturity and risk tolerance. Following each session, a thorough review is conducted to capture what worked well, what did not, and actionable steps for improvement.

There is no pass or fail in a DRP exercise; its primary purpose is to show what needs to be improved and how those improvements can be implemented. Exercising the plan ensures that emergency teams are familiar with their assignments and, more importantly, are confident in their capabilities.

Our Commitment

We are committed to:

Quick and effective incident response
Clear communication throughout the incident
Thorough post-incident analysis
Continuous improvement of our systems and procedures

This incident response framework ensures we handle technical issues efficiently while maintaining transparency with our customers throughout the process.
