Medical algorithms are used across the health-care spectrum to diagnose disease, offer prognoses, monitor patients’ health and assist with administrative tasks such as appointment scheduling. But the history of such technology’s use in the U.S. is filled with stories of it running amok. From victims of sexual trauma being unfairly labeled as high risk for substance use to diagnostic failures by a sepsis-detection algorithm used by more than 100 health systems nationwide to clinical decision support (CDS) software discouraging necessary referrals to complex care for millions of Black patients, the problems abound. These issues might be extending the current pandemic as well: a 2021 review of dozens of machine-learning algorithms designed to detect or predict the progression of COVID found none to be clinically useful.

The kicker: many of the medical algorithms in use today did not require approval from the U.S. Food and Drug Administration, and the ones that did often did not undergo clinical trials.

To understand why, let us take a brief dive into history. In 1976, after the Dalkon Shield—an intrauterine contraceptive device—was implicated in several reported deaths and numerous hospitalizations, Congress amended the Federal Food, Drug, and Cosmetic Act to mandate that medical devices demonstrate safety and effectiveness through clinical trials. To fast-track the device-approval process, a provision known as 510(k) was created to take advantage of the thousands of medical devices already being marketed. Under the 510(k)-clearance pathway, a device would not need clinical trial data if its manufacturers could demonstrate “substantial equivalence” in materials, purpose and mechanism of action to another device that was on the market. At the time this seemed like a perfectly reasonable compromise for policy makers. Why make a surgical-glove manufacturer undergo rigorous clinical trials if surgical gloves have already been approved?

Of course, medical devices became more complex over time. Surgical gloves were overshadowed by surgical robots; knee braces were trumped by prosthetic knees. Eventually it became too hard to determine whether any two devices were truly equivalent. Beginning in the 1990s, the federal government moved to address the increasing complexity by simplifying some regulations, including by broadening the definition of substantial equivalence to include devices that had significantly different mechanisms and designs as long as they had similar safety profiles. The goal was to encourage innovation, but eventually this change led to a greater number of unsafe and ineffective devices.

To complicate matters, a device cleared via 510(k) could remain on the market even if its predicate device was later recalled for quality and safety issues. This practice has led to a “collapsing building” phenomenon: many devices currently used in hospitals are based on failed predecessors. Of the more than 3,000 devices recalled between 2008 and 2017, 97 percent had received 510(k) clearance.

Under current law, medical algorithms are classified as medical devices and can be cleared through the 510(k) pathway. They are, however, less transparent, far more complex, more likely to reflect preexisting human bias, and more apt to evolve (and fail) over time than the medical devices of the past. Additionally, Congress excluded certain health-related software from the definition of a medical device in the 21st Century Cures Act of 2016. As a result, some medical algorithms, such as those used for CDS, can evade FDA oversight altogether. This is particularly concerning given the ubiquity of these algorithms in health care: in 2017, 90 percent of hospitals in the U.S.—roughly 5,580 hospitals—had CDS software in place.

Ultimately, regulation needs to evolve with innovation. Given the threats that unregulated medical algorithms pose for patients and communities, we believe that the U.S. must urgently improve regulation and oversight of these new-age devices. We recommend three specific action items that Congress should pursue.

First, Congress must lower the threshold for FDA evaluation. For medical algorithms, the definition of substantial equivalence under 510(k) should be narrowed to consider whether the data sets or machine-learning techniques used by the new device and its predicate are similar. Such a measure would prevent, for example, a network of algorithms for evaluating kidney disease risk from being cleared simply because they all predict kidney disease. Furthermore, the CDS systems that are ubiquitous in U.S. hospitals should not be exempt from FDA review. Although the results of CDS algorithms are not intended to be the sole determinants of care plans, health-care workers often rely on them heavily for clinical decision-making—meaning that they often affect patient outcomes.

Second, Congress should dismantle systems that foster health-care workers’ overreliance on medical algorithms. Mandates for prescription-drug monitoring that require health practitioners to consult substance-use risk-scoring algorithms before prescribing opioids, for example, should include comprehensive exemptions in all states, such as for cancer treatment, emergency department visits and hospice care. And in general, unless their decisions lead to patient harm, doctors should not face significant penalties for using their own clinical judgment instead of accepting the recommendations of medical algorithms. An algorithm may label a patient as high risk for drug misuse, but a doctor’s understanding of that patient’s history of trauma adds critical nuance to the interpretation.

Third, Congress must establish systems of accountability for technologies that can evolve over time. There is already some movement toward this goal. A few years ago Representative Yvette Clarke of New York introduced the Algorithmic Accountability Act of 2019. This bill would require companies that create “high-risk automated decision systems” involving personal information to conduct impact assessments reviewed by the Federal Trade Commission as frequently as deemed necessary. For medical algorithms used in health-care settings, the FTC could require more frequent assessments to monitor changes over time.

This bill and several similar ones that were introduced have yet to reach the president’s desk. Still, we remain hopeful that momentum will increase in the months ahead. The FDA is meanwhile taking initial steps to improve oversight of medical algorithms. Last year the agency released its first action plan tailored specifically to these technologies. It has, however, yet to clarify issues around CDS exemptions and appropriate 510(k) clearances.

We know algorithms in health care can often be biased or ineffective. But the U.S. must pay more attention to the regulatory system that lets them enter the public domain to begin with. If a decision affects a patient’s life, “do no harm” must apply—even to computer algorithms.