Joseph Shapiro
Faculty Fellow, UC Berkeley

Agricultural & Resource Economics

Joseph Shapiro is an Associate Professor in the Department of Agricultural and Resource Economics at UC Berkeley’s College of Natural Resources. He has studied at Stanford University, the University of Oxford, the London School of Economics and Political Science, and the Massachusetts Institute of Technology.

Spark Award Project

Machine Learning Simplifies Compliance with Government Regulation

This project aims to reduce costs and uncertainty for firms complying with government regulations, initially focusing on the US Clean Water Act. This law regulates “Waters of the United States,” but does not clearly define which specific areas are covered. As a result, developers often face costly delays and compliance expenses. The solution is a machine learning algorithm that predicts the likelihood of Clean Water Act regulation for any US location. It uses ground-truth data from the Army Corps of Engineers together with various geophysical layers, achieving high accuracy.

Joseph Shapiro’s Story

How Machine Learning Can Interpret Clean Water Act Regulations

By: Niki Borghei

September 17, 2024

Navigating government regulations can be a complex process for researchers, businesses, governments, and non-governmental organizations alike. For example, the US Clean Water Act, which regulates the “Waters of the United States,” requires clear and consistent guidelines to effectively manage environmental protection. However, ambiguity in these guidelines can make them difficult to follow, increasing both compliance costs and the risk of environmental harm.

Joseph Shapiro, Associate Professor in the Department of Agricultural and Resource Economics at UC Berkeley’s College of Natural Resources, aims to address these issues in his Bakar Spark Award project. His project introduces a machine learning algorithm designed to predict the likelihood of Clean Water Act regulations for any location across the US. By leveraging ground-truth data from the Army Corps of Engineers and integrating various geophysical layers, this innovative solution promises to enhance accuracy and support businesses and organizations in complying with environmental regulations while decreasing the cost of compliance.

Q: What are some of the challenges that businesses, organizations, and researchers face with environmental compliance?

A: In many settings, government regulation of important domains like the environment can have limited effectiveness and high costs. Many regulations involve numerous pages of dense legal requirements. Determining precisely what actions firms must undertake to comply with these regulations requires valuable public and private resources and creates uncertainty, which can undermine the goal of the regulation (e.g., protecting the environment) and increase compliance costs.

Q: How does your solution address the challenges of environmental compliance?

A: We use machine learning to understand regulation, and so make regulation more effective and less costly. We initially focus on the Clean Water Act, where policymakers, environmental organizations, and the private sector have enormous ambiguity about which water resources the Act regulates. We train algorithms to predict which areas the Clean Water Act regulates.

This ambiguity constrains land development across the US in areas with wetlands, streams, and other water resources. It creates a challenge in many industries, but especially for the development of renewable energy projects (e.g., wind and particularly solar power plants), which are often sited in areas with ephemeral streams or isolated wetlands, where uncertainty about what the Clean Water Act regulates can increase the costs of and delays in expanding clean energy.

We train algorithms by using data on over 150,000 decisions from the Army Corps of Engineers, each determining whether the Clean Water Act regulates one specific water resource. To predict the Army Corps decisions, the algorithms use dozens of high-resolution geophysical inputs, like aerial imagery, maps of streams and wetlands, soil characteristics, micro-climate, and elevation data. The algorithms have high accuracy and can describe the probability that individual development sites are regulated, or report aggregate patterns for regions, under different interpretations of the Clean Water Act (e.g., under the Obama, Trump, or Supreme Court rulings).
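The workflow described above — training a classifier on Army Corps jurisdictional determinations using geophysical predictors, then reporting a probability that a given site is regulated — can be sketched in outline. This is a hypothetical illustration only: the feature names, synthetic data, and choice of a gradient-boosted classifier are assumptions for demonstration, not the project’s actual inputs or model.

```python
# Hypothetical sketch of the approach described above: predict the
# probability that a site is regulated under the Clean Water Act from
# geophysical features. All data here is synthetic and illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins for geophysical layers (elevation, soil wetness,
# distance to a mapped stream, wetland indicator, etc.)
X = np.column_stack([
    rng.normal(300, 50, n),     # elevation (m)
    rng.uniform(0, 1, n),       # soil wetness index
    rng.exponential(500, n),    # distance to nearest mapped stream (m)
    rng.integers(0, 2, n),      # mapped wetland present (0/1)
])

# Synthetic "determination" labels: sites that are wetter, closer to
# streams, or in wetlands are more likely to be regulated.
logit = 2.0 * X[:, 1] - 0.002 * X[:, 2] + 1.5 * X[:, 3] - 0.5
y = (rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Per-site probability of regulation, the quantity the project reports
probs = model.predict_proba(X_test)[:, 1]
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```

In the real project, each row would be one Army Corps determination and the columns would come from high-resolution spatial layers (imagery, hydrology, soils, climate, elevation) rather than random draws.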

Q: Sounds promising! When did you decide you wanted to tackle this problem?

A: I have researched the Clean Water Act since beginning my Ph.D. over 15 years ago. Controversy and ambiguity over what the Clean Water Act regulates have generated front-page news almost annually over the last decade, and repeated rulings by the US Supreme Court and Environmental Protection Agency under Presidents Obama, Trump, and Biden. Clean Water Act regulation has changed so often that regulators refer to this domain as “regulatory ping pong.” Several years ago, I served on an academic panel that the Alfred P. Sloan Foundation funded to review recent government economic analyses of Clean Water Act regulation. In this review, I noticed that although the Army Corps had conducted tens of thousands of legally binding determinations of what the Clean Water Act regulated, existing aggregate analyses used case studies or other general descriptions of what this law regulated. Given enormous progress in machine learning algorithms over the last decade, this setting seemed like a powerful opportunity to apply these algorithms to benefit research, policy, business, and the environment.

Q: What led you to conclude that entrepreneurship would be the most effective approach for solving this problem?

A: In January 2024, we published a paper in Science which described findings from applying these algorithms, and which estimated that they could create several billion dollars in commercial value each year by providing rapid and low-cost estimates of the probability that the Clean Water Act regulates a given site. Several readers commented that commercial users could have demand for this algorithm, and that further developing it could also benefit policymakers and the environment. Research can easily sit in academic journal articles without directly affecting policy, the environment, or business practice. This setting seemed like an unusually good opportunity for one set of academic analysis to simultaneously advance peer-reviewed research, benefit the environment, and support policy and business users.

Q: What are some key challenges you’ve faced in commercializing your research, and how have you overcome them?

A: We are learning how research can effectively and efficiently translate to policy and commercial use. Put another way, we are learning how to continue focusing effort on producing high-impact, peer-reviewed research while ensuring that policy, nonprofit, and business users can readily use results from this research. Speaking to academics, entrepreneurs, lawyers, and investors has provided insights. Few if any economists have commercialized algorithms, so we are learning as we go. Our research and IP agreements allow free use for government and non-profit users. We are learning that each potential group of users has different needs, and we are working to understand how to meet those needs.

Q: When do you expect to see the initial application of your project?

A: The initial algorithm is already being rolled out, and we expect to expand it over the three-year timeframe of the Bakar project.