Late on Thursday night, the fragility of the global computing infrastructure was starkly revealed. An error in a software update to a leading cybersecurity company's Falcon Sensor tool led to a catastrophic failure, causing a large number of Windows-based computers to crash with a blue screen of death.
The problem was triggered by a "single content update" for customers running Windows personal computers. The faulty code went undetected until after it had been downloaded and installed on many machines. Once installed, the flawed update disrupted the core functions of the affected computers, causing Microsoft's blue error screen to appear with a message stating, "Your PC ran into a problem and needs to restart." As long as the update remained in place, restarting the machine only led to the same error.
This major event has caused widespread chaos for the countless individuals who rely on such a pervasive yet basic dependency, and has potentially erased billions of dollars from the global economy through extensive and unexpected downtime. Yet this is not an isolated case. In the past, we have witnessed numerous significant internet outages caused by failures at cloud hosting and other system providers. The world ought to have taken heed of those events. Consequently, the first issue that we highlight is the nature, manner and extent to which the advice given after each of these incidents was actually implemented.
Clearly and inevitably, each digital disaster brings new vulnerabilities to light. What can we take away from this?
It is crucial to scrutinize and enhance our systems and processes to prevent future disruptions. The impressive technological solutions we rely on daily may not be as polished as they appear. Although they seem expertly engineered and are marketed as impersonal products, they are built by humans who write each line of code and examine every detail, and humans, however meticulous, can make mistakes. A risk assessment of technology and IT systems would therefore reveal the scope of both systemic and human vulnerabilities. Likewise, your business continuity plan should identify the specific functions that must operate in an enterprise high availability mode, with the rest covered under business continuity or disaster recovery modes. It is also essential to reevaluate recovery time objectives (RTO) and recovery point objectives (RPO), supported by a definitive restoration plan for each point of failure and continuous testing to evaluate operational effectiveness. Organizations might also want to review their crisis management policy and the supporting operational framework to gauge how they would perform in an event of this kind.
Drawing lessons from this incident is crucial to reducing the chances of recurrence. The repetition of such events should dispel the notion that these systems are infallible. Indeed, criminals are likely to draw lessons as well, studying how to exploit the kind of vulnerabilities that caused disruptions at television stations, airports, railway stations, markets and insurance companies. We should plan for these eventualities and establish adequate safeguards to enhance resilience. That is the new reality in which we must operate.
We trust that you found this alert insightful. Please do not hesitate to write to us at contactus@mgcglobal.co.in if you wish to provide feedback or require any assistance.
Best regards
Markets Team
MGC Global Risk Advisory