By Monday morning, most of the major disruptions from the recent CrowdStrike security update had been resolved. Flight delays and cancellations were no longer front-page news, and several Starbucks locations near me had resumed taking orders through their app.
However, the cleanup effort is still ongoing. Microsoft estimates that approximately 8.5 million Windows systems were impacted by the issue, which stemmed from a faulty .sys file automatically pushed to PCs running CrowdStrike Falcon security software. This update caused Windows systems to display the Blue Screen of Death (BSOD) and enter a boot loop.
“While software updates may occasionally cause disturbances, significant incidents like the CrowdStrike event are infrequent,” wrote Microsoft VP of Enterprise and OS Security David Weston in a blog post. “We currently estimate that CrowdStrike’s update affected 8.5 million Windows devices, or less than one percent of all Windows machines. While the percentage was small, the broad economic and societal impacts reflect the use of CrowdStrike by enterprises that run many critical services.”
The documented fix from both CrowdStrike (whose fault it directly is) and Microsoft (which has taken a lot of the blame for it in mainstream reporting, partly due to an unrelated July 18 Azure outage) was to repeatedly reboot affected systems in hopes they would download a new update file before crashing. For systems where this method hasn’t worked (and Microsoft has recommended up to 15 reboots), users need to manually delete the bad .sys file. This allows the system to boot and download a fixed file, resolving the crashes without leaving machines unprotected.
To streamline this process, Microsoft released a recovery tool over the weekend. This tool automates the repair process on some affected systems by creating bootable media using a 1GB-to-32GB USB drive. Booting from that USB drive allows users to repair their system using one of two options. For devices that can’t boot via USB, Microsoft also documents a PXE boot option for booting over a network.
WinPE to the Rescue
The bootable drive uses the WinPE environment, a lightweight, command-line-driven version of Windows used by IT administrators for applying Windows images and performing recovery and maintenance operations.
One repair option boots directly into WinPE and deletes the affected file without requiring administrator privileges. However, if your drive is protected by BitLocker or another disk-encryption product, you’ll need to manually enter your recovery key so WinPE can read the drive data and delete the file. According to Microsoft’s documentation, the tool should automatically delete the bad CrowdStrike update once it can read the disk.
If you’re using BitLocker, the second recovery option attempts to boot Windows into Safe Mode using the recovery key stored in your device’s TPM to automatically unlock the disk, as happens during a normal boot. Safe Mode loads the minimum set of drivers needed for Windows to boot, allowing you to locate and delete the CrowdStrike driver file without encountering the BSOD issue. On affected systems, the file is located at Windows/System32/Drivers/CrowdStrike/C-00000291*.sys. Alternatively, users can run “repair.cmd” from the USB drive to automate the fix.
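For readers who want to see what the manual cleanup amounts to, here is a minimal Python sketch of the "find and delete" step. It is not Microsoft's repair.cmd or anything CrowdStrike ships. Only the directory layout and the C-00000291*.sys pattern come from the path quoted above; the function names and the dry_run flag are hypothetical, and the sketch assumes Python is available and the Windows volume is already unlocked and readable, which will not be true inside a stock WinPE session.

```python
from pathlib import Path
from typing import List

# File pattern quoted in the article; everything else here is illustrative.
BAD_FILE_PATTERN = "C-00000291*.sys"


def find_bad_update_files(windows_dir: Path) -> List[Path]:
    """Return files matching the faulty-update pattern under a Windows directory."""
    driver_dir = windows_dir / "System32" / "Drivers" / "CrowdStrike"
    if not driver_dir.is_dir():
        # No CrowdStrike driver folder: nothing to clean up.
        return []
    return sorted(p for p in driver_dir.glob(BAD_FILE_PATTERN) if p.is_file())


def delete_bad_update_files(windows_dir: Path, dry_run: bool = True) -> List[Path]:
    """List matching files and, unless dry_run is set, delete them.

    dry_run is a hypothetical safety switch for this sketch: it defaults to
    True so that nothing is removed until the matches have been reviewed.
    """
    matches = find_bad_update_files(windows_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Calling delete_bad_update_files(Path("C:/Windows")) would only report the matching files. Passing dry_run=False removes them, which is the same outcome as deleting the file by hand in Safe Mode.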
CrowdStrike has set up a “remediation and guidance hub” for affected customers. As of Sunday, the company said it was “testing a new technique to accelerate impacted system remediation,” though it hasn’t shared more details yet. Other fixes include rebooting multiple times, manually deleting the affected file, or using Microsoft’s boot media to help automate the fix.
The CrowdStrike outage affected more than just flights and coffee orders. It impacted doctors’ offices and hospitals, 911 emergency services, hotel check-in and key card systems, and work-issued computers that were online when the flawed update was sent out. Microsoft has been working with Google Cloud Platform, Amazon Web Services, and other cloud providers to provide fixes for Windows virtual machines in their clouds.