It was half past midnight Eastern Time when Andrew Rosenberg, an anesthesiologist and critical care doctor who works as chief information officer at Michigan Medicine, suddenly noticed that a substantial number of computers across the health care center had ceased to function. In the hospital’s parlance, it counted as a “catastrophic major incident.”
“We do some fairly sophisticated automatic monitoring of our core systems, and when those suddenly went offline, that triggered alerts,” says Rosenberg. “In a couple of our units, the majority of their computers all had the blue screen of death.”
It soon became clear that this was not an isolated incident. A cybersecurity company called CrowdStrike had made a routine update to its Falcon antivirus product, utilized by companies ranging from banks to airlines to hospitals. That update contained a bug, an error that caused all computers running the software on a Windows operating system to crash.
Around the globe, doctors, nurses, and hospital administrators were going into panic mode as they raced to manage the consequences of the largest IT outage in history. Mass General Brigham, one of America’s biggest health care systems, canceled all nonurgent surgeries, procedures, and medical visits. In the UK, Royal Surrey NHS Foundation Trust declared a critical incident affecting the systems used to deliver radiotherapy treatments. Hospitals in Canada, Germany, and Israel announced issues with their digital services, while the 911 emergency service in some US states was reported to be down. A WIRED reporter found both Baylor hospital network, one of the largest nonprofit health care systems in the country, and Quest Diagnostics unable to process routine bloodwork. Donna Rossi, a spokesperson at the Phoenix Police Department, explained that while calls were still going through, the lack of working internet meant that officers had to be dispatched manually.
The extent of the disruption appeared to vary both between and within health care systems. “Our hospital is fully down due to #Crowdstrike issue,” Dana Chandler, a nurse at GBMC HealthCare in Maryland, posted on X. “No phones, no computers, no safety nets. It’s an all-hands-on-deck kind of day. I hope our patients remain safe.” Rosenberg says that at Michigan Medicine, where he has been awake since 1 am dealing with the crisis, anywhere from 15 to 60 percent of the computers were not working, depending on the unit.
“The impact is massive,” he says. “It affects all aspects of modern digital health systems. Luckily, in units where the computers are running the whole time, like the ICUs and emergency departments, the computers didn’t take the CrowdStrike application upgrade, whereas in areas of health care which are more episodic, like operating rooms, the disruption is much greater.”
Rosenberg says that the areas of greatest disruption have been so-called “digital bottlenecks,” which require communication between multiple computer systems. He gives the example of the critical practice of cleaning, disinfecting, and sterilizing medical devices and patient care supplies. This is monitored through digital tools across several computers, to ensure that best practices are followed and the risk of potentially lethal infections is minimized.
“If one of those computers is affected, suddenly all of your sterilization procedures have to slow down or even stop, and then operations stop,” he says.
With large health care systems employing thousands of personnel and looking after vast numbers of patients (last year Michigan Medicine had more than 2.7 million outpatient visits), modern health care has become reliant on digitization as a matter of necessity, from systems which relay communications between busy departments to electronic medical records, or EMRs, which store vital information about individual patients.
But in recent years, concerning reports have emerged about the potential consequences of those systems breaking down. Studies have shown that during electronic medical record downtime, laboratory testing results are delayed by an average of 62 percent compared with normal operations, while in the NHS, IT failings have been directly linked with cases of patient harm.
In April, Sofia Mettler, at the time a resident physician at Mount Auburn Hospital, published a paper in JAMA Internal Medicine in which she described a day where the hospital’s EMR system was down for a period of seven to eight hours. The disruption meant that samples for morning lab tests were unable to be collected because the phlebotomy team did not know which patient needed which tests, while the results of tests conducted before the downtime could not be disseminated, making it harder to assess overnight progress.
Mettler, now a pulmonary and critical care fellow at Brigham and Women's Hospital, says that experience pales in comparison to the consequences of the CrowdStrike outage.
“This time, the extent of the system downtime is way more profound,” she says. “We are currently unable to use any software that relies on digital data transmission. For example, we are unable to review CT scans, because the radiology software is down as well. It is difficult to make clinical decisions without access to what has become the essential part of medicine. We are using bedside ultrasound machines, but it is not nearly as good as CT scans in telling us what is going on in the lungs.”
Dean Sittig, a professor of biomedical informatics at the University of Texas Health Science Center at Houston, says that in case of such incidents, hospitals are supposed to have paper backup systems, and to ensure that vital devices such as IV pumps, blood pressure monitors, and ventilators which are controlled on the internal network are isolated from the internet. However, this doesn’t always happen. “Every hospital has fire drills, but they should have things like downtime drills as well, where they turn off the computer and make sure that everything still functions,” he says.
According to Sittig, there are many reasons why computer failures can lead to patient safety issues, like delays in prescribing certain medications. However, some of the biggest problems are subtler, such as a lack of manpower. With a health care center relying on lab test results being passed on by hand, there can be delays in discharging patients, which means they stay in hospital for longer and become more vulnerable to contracting infections.
“One of the real problems with becoming entirely reliant on electronic systems, we’ve got rid of people,” he says. “And so when the computer's not working, there's just not enough people to do the work, and everything becomes a lot more inefficient.”
At Michigan Medicine, Rosenberg says that the worst of the problem will be fixed within 24 hours but that it will likely take several days for the situation to fully resolve, especially as some of the data center servers as well as individual computers within the health care center have had to be restarted. “Our best estimate is, per computer, it'll take us about 15 to 20 minutes to fix,” he says. “Not a big deal. But when you’re talking about thousands and thousands of computers, that will take several days.”
He says that the center’s personnel have benefited from drawing up a strategic “sequence of recovery” list several years ago for just such an eventuality, which has allowed them to focus immediately on which software tools, out of more than 1,000 different applications, need to be fixed first.
“So not the real-cool software which the neurosurgeon is using to do this innovative 3D mapping of the brain, but the top three applications that we really need,” he says. “So we’re still getting through, especially for critical cases. Our emergency department never closed down, and we’re doing the most life-threatening and critical operations. But we’re getting through in a less efficient, less capable manner.”
However, in the wake of this incident, Rosenberg believes that health care systems need to diversify the software they use, especially in critical areas, to make them as resilient as possible.
“I think more reliance on diverse cloud computing will help us, because I would suspect in a modern cloud environment, they would have different timings of upgrades, for just this reason,” he says. “So part of the trick would be to run your systems in cloud operators, and stagger upgrades, so when these things happen, you can still run your systems from a different data center while the other one is fixed.”