Why do failure planning and reliable redundancy matter?
Reliable service delivery and network infrastructure are more important than ever. The two complement each other, and both are critical to the end-user experience on which any successful business depends. For many companies, their services (delivered over their networks) are the single point of contact they have with their customer base.
Any network disruption that impacts service delivery, whether caused by human error, hardware failure, or cyberattack, can have damaging reputational and financial consequences and can be devastating to an organisation.
Regardless of whether the services and network rely on a cloud provider or a third-party data centre, a backup plan and redundant systems are essential to ‘keep the lights on’ and to minimise or negate the disruption caused by an outage.
Data Zoo runs its operations on both kinds of infrastructure: cloud services (across multiple regions) and third-party data centres (in multiple geographic locations).
Despite the best of intentions, data centres and cloud providers are not infallible. According to an annual report from the Uptime Institute, in 2019, 34 per cent of data centre operators had an outage, while 50 per cent had experienced an outage or an incident of severe network degradation in the last three years.
Network Redundancy and Infrastructure
For redundancy planning with third-party data centres, organisations must opt for geographically diverse locations; if one location is disrupted, offline, or overloaded, traffic should be automatically load-balanced and re-routed to a secondary location. Frequent synchronisation between these sites will also minimise the impact on end-users. Some organisations believe, and plan on the basis, that simply having multiple servers in the same location is a sufficient redundancy plan. This was evidenced recently by an outage affecting one of Australia’s leading gambling operators, whose systems were offline for several hours.
At Data Zoo, we have redundancy and backup plans in place which include load balancing, automated failover switching, and continuous synchronisation of data and logs across our third-party data centres.
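To make the idea of automatic re-routing concrete, here is a minimal sketch of how a router might choose between geographically diverse locations. It is an illustration only, not a description of Data Zoo's systems: the `Site` class, the `pick_site` function, the site names, and the `MAX_LOAD` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    name: str      # hypothetical label for a data centre location
    healthy: bool  # result of the most recent health check
    load: float    # fraction of capacity in use, from 0.0 to 1.0


# Assumed threshold above which a location is treated as overloaded.
MAX_LOAD = 0.85


def pick_site(sites: List[Site]) -> Optional[Site]:
    """Return the first site, in priority order, that is healthy and not overloaded."""
    for site in sites:
        if site.healthy and site.load < MAX_LOAD:
            return site
    return None  # every location is offline or overloaded


# The primary location is offline, so traffic falls through to the secondary.
sites = [
    Site("primary", healthy=False, load=0.20),
    Site("secondary", healthy=True, load=0.40),
]
chosen = pick_site(sites)
```

Note that if both entries in `sites` described servers in the same building, a single site-wide incident would mark them both unhealthy and `pick_site` would return `None`, which is why the locations need to be geographically separate.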
Cloud-hosted service infrastructure is only as strong as its weakest link, so the Data Zoo cloud services offer redundancy through the elimination of any single point of failure. Our services have redundant, geographically diverse deployments that are load-balanced, with regular status checks reporting each regional deployment’s health and load levels. Depending on the response, the computing resources allocated to a region may be bolstered, or the services may be restarted in case of outages. This process runs continuously in the background, enabling automated, seamless failover in the rare event of a server outage.
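The status-check logic described above can be sketched as a small decision function. This is a simplified illustration under stated assumptions rather than the actual Data Zoo implementation: the `Action` enum, the `decide_action` function, and the `HIGH_LOAD` threshold are hypothetical names and values.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"          # deployment is healthy and has spare capacity
    SCALE_UP = "scale_up"  # allocate more computing resources to the region
    RESTART = "restart"    # restart services after a failed status check


# Assumed load level (fraction of capacity) at which a region is bolstered.
HIGH_LOAD = 0.80


def decide_action(responded: bool, load: float) -> Action:
    """Map one status-check result for a regional deployment to an action."""
    if not responded:
        return Action.RESTART
    if load >= HIGH_LOAD:
        return Action.SCALE_UP
    return Action.NONE
```

In a setup like this, a scheduler would call `decide_action` for every region at a fixed interval and hand the result to whatever tooling performs the scaling or restart.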
Planning vs Risk
Organisations will always have to weigh the cost of redundancy plans against the risk of disruption. Without an effective redundancy plan and strategy, one point of failure can bring down an entire network. Resolving that failure point could take many hours or even several days, resulting in lost revenue and reputational damage, all the more so if the business is publicly listed.
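One rough way to frame this trade-off is to compare the expected annual loss from outages with and without redundancy against what the redundancy costs. The sketch below is a deliberately simple model with hypothetical function names, and every figure in it is made up purely for illustration; it also ignores reputational damage, which is much harder to put a number on.

```python
def expected_annual_loss(outages_per_year: float,
                         hours_per_outage: float,
                         loss_per_hour: float) -> float:
    """Expected yearly revenue lost to outages under a simple linear model."""
    return outages_per_year * hours_per_outage * loss_per_hour


def redundancy_pays_off(loss_without: float,
                        loss_with: float,
                        redundancy_cost: float) -> bool:
    """True when the loss avoided by redundancy exceeds its yearly cost."""
    return (loss_without - loss_with) > redundancy_cost


# Illustrative figures only: one outage every two years, costing 10,000 per hour.
loss_without = expected_annual_loss(0.5, 8.0, 10_000)   # 8 hours to recover
loss_with = expected_annual_loss(0.5, 0.25, 10_000)     # 15 minutes with failover
worth_it = redundancy_pays_off(loss_without, loss_with, redundancy_cost=20_000)
```

With these invented inputs, the avoided loss is larger than the assumed redundancy cost, so `worth_it` is true; different inputs can just as easily tip the balance the other way.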
Whether it be databases that are highly available across geographically segregated data centres or core service deployments that are similarly distributed, everything at Data Zoo is built with the expectation that server outages can be a regular occurrence and yet must not impact uptime or the delivery of services to clients.
With insight and planning, implementing and executing a full redundancy strategy is possible, and any organisation should consider it best practice.