50 States, One (1) Platform

Connecting the dots between 50 state public utility commissions

There's a moment every engineer experiences when they realize the internet is held together with duct tape and good intentions. At Halcyon, our goal is to help make energy information more accessible and useful. That means, as a first step, we need to systematically and reliably discover, collect, and ingest that information from all its (many, many) different sources. As we marched to 50 state public utility commissions (PUCs), Halcyon’s engineering team has lived this duct-tape moment Groundhog Day-style for the last six+ months.

America's federal system creates an operationally burdensome paradox for anyone trying to access public energy information at scale. Each state has its own utility commission, budget, procurement process, and relationship with whatever web development contractor won the bid in 2003 (or earlier!). There's no federal mandate for standardized information systems — just 50 different approaches to putting regulatory dockets and filings on the internet.

Learning to be a user

Navigating 50 different websites means that not only do we have to contend with 50 different definitions and 50 different degrees of completeness and reproducibility, but also that we have to accurately reconcile 50 distinct schemas within the context of how we store all the information in our own platform.  

Said another way, before you can collect and organize information published on a website, you have to figure out how to use it. We have a saying at Halcyon that we have to walk a mile in our customers’ shoes as a prerequisite to providing them value. This sounds nice until you're staring at a search interface at 2 AM, wondering whether "Document Date" means when a document was filed…or published, or served.

Regulatory proceedings are typically organized by date, but each filing can carry many conflicting dates, some of them ambiguous or wildly incorrect (e.g., from the year 3199, or "01/11/8"). Some documents were filed before the age of the internet, and when some commissioners uploaded archives from 1968 in 2007, those documents were automatically, and erroneously, recorded as 'filed' in 2007. Some websites let you search by date, but only for a single day (if you want to know what happened in March, for example, that’s 31 different searches), or only if you also specify some other search filter.
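To make the date problem concrete, here is a minimal Python sketch of the kind of defensive handling it calls for. It is an illustration under stated assumptions, not Halcyon's actual pipeline: the function names, the MM/DD/YYYY format, and the 1900 lower bound are all hypothetical choices. It rejects dates that are malformed or implausible rather than guessing, and it enumerates every day in a month for sites that only accept single-day searches.

```python
import calendar
from datetime import date, datetime
from typing import List, Optional

# Assumed lower bound for a plausible filing date; a real bound would
# depend on how far back each commission's archive goes.
EARLIEST_PLAUSIBLE = date(1900, 1, 1)


def parse_filing_date(raw: str, today: Optional[date] = None) -> Optional[date]:
    """Return a date only if `raw` is a well-formed, plausible MM/DD/YYYY value."""
    today = today or date.today()
    try:
        # %Y requires a four-digit year, so a value like "01/11/8" is rejected.
        parsed = datetime.strptime(raw.strip(), "%m/%d/%Y").date()
    except ValueError:
        return None
    # Reject dates outside the plausible window (for example, the year 3199).
    if not (EARLIEST_PLAUSIBLE <= parsed <= today):
        return None
    return parsed


def days_in_month(year: int, month: int) -> List[date]:
    """List every day of a month, one entry per single-day search."""
    _, last_day = calendar.monthrange(year, month)
    return [date(year, month, day) for day in range(1, last_day + 1)]
```

Returning `None` instead of a best guess keeps bad dates visible, so they can be reconciled against other fields later. Note that a plausibility check cannot catch the 1968-archive case, because a 2007 date looks perfectly valid on its own.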

Some websites have undocumented hidden APIs (which are great!), but they implicitly synchronize state between the browser and server, and will randomly emit nonsense if that state is even slightly off. Some websites take 10 minutes to load a search result, making them effectively unusable. 


Unexpected diversity

We quickly realized that the states putting the most resources into providing modern web services aren’t always the ones you’d expect, and vice versa. One particularly large, relatively well-funded state PUC has two separate, ancient websites for tracking the same regulatory dockets. Both are incomplete. Both require cross-referencing. Imagine reading a book where only every odd page is printed, and the only way to get the full story is to locate another version of the book with every even page printed, and then read them both side by side. 

(The Florida Public Service Commission, however, has a powerful public-facing API that deserves recognition.)

We found that many of these state PUC websites are built in a way that requires them to remember the user’s browser session history. Every click depends on the previous click. You can't just request a page — you have to carefully preserve the website's memory of your entire session. This also means all internal URLs are unstable. Fine, but annoying.

On a positive note, some of the best websites look like they were built in 1998. Tennessee, for example, has beautifully simple HTML tables listing every docket number and link. No gimmicks, no BS. You click a link, you get a document. If only it were all so simple everywhere else.

When websites fight back

Beyond the ways the websites themselves are built, we also had to contend with the way that the underlying data was structured and exposed. One state makes you choose which specific database to search before you start. Another state website interprets any invalid search parameters as a request for all historical data since the beginning of time (taking so long that it almost always times out).

One state has "booby traps" scattered throughout its docket tracking system: invisible characters sprinkled through text, empty table rows, and deliberately misaligned cells and columns, all built to make the page's spacing appear correct. These are most likely “hacks” left over from a time before CSS largely solved visual layout.
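As a rough illustration of how this kind of layout debris can be scrubbed, here is a small Python sketch. It assumes the table has already been extracted into rows of strings, and the helper names are hypothetical. It handles invisible characters and empty rows only; realigning deliberately shifted cells depends on each site's quirks and is not attempted here.

```python
import unicodedata
from typing import List


def clean_cell(text: str) -> str:
    """Strip invisible characters and normalize whitespace in one cell."""
    # Unicode category "Cf" (format) covers characters such as the
    # zero-width space U+200B and the byte order mark U+FEFF.
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # str.split() with no arguments also splits on non-breaking spaces,
    # so this collapses every whitespace run to a single space.
    return " ".join(visible.split())


def clean_rows(rows: List[List[str]]) -> List[List[str]]:
    """Clean every cell, then drop rows that contain no text at all."""
    cleaned = [[clean_cell(cell) for cell in row] for row in rows]
    return [row for row in cleaned if any(row)]
```

Cleaning cells before checking for emptiness matters: a spacer row made entirely of zero-width characters only looks empty after the invisible characters are removed.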

This vintage element speaks to a broader issue; these public websites are living things that will evolve over time, which means that maintenance becomes a critical task. One state rebuilt its entire website less than one week after we had figured out how to traverse and collect data from it. Another updated the number of documents that could be returned via search, effectively breaking our ingestion pipeline. While the work required to fix these issues was minimal, the work required to catch these changes in a timely manner, across 50 states, was not.
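One inexpensive way to notice a redesign is to fingerprint a page's structure rather than its content. The Python sketch below is an assumption about how such a check could look, not a description of our production monitoring, and the function names are hypothetical. It hashes the set of distinct tag and attribute-name combinations on a page, so new dockets or a different number of result rows leave the fingerprint unchanged while a rebuilt layout changes it.

```python
import hashlib
from html.parser import HTMLParser
from typing import Set


class _SignatureCollector(HTMLParser):
    """Collect one signature per distinct tag/attribute-name combination."""

    def __init__(self) -> None:
        super().__init__()
        self.signatures: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        # Attribute values are ignored; only the names contribute.
        names = ",".join(sorted(name for name, _ in attrs))
        self.signatures.add(f"{tag}[{names}]")


def structure_fingerprint(html: str) -> str:
    """Hash the page's structural signatures, ignoring text and row counts."""
    collector = _SignatureCollector()
    collector.feed(html)
    collector.close()
    joined = "|".join(sorted(collector.signatures))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def layout_changed(previous_fingerprint: str, html: str) -> bool:
    """Compare a stored fingerprint against a freshly fetched page."""
    return structure_fingerprint(html) != previous_fingerprint
```

A check like this would not catch a changed result limit, since the markup stays the same. That case needs a separate comparison of how many records a search returns against what was expected.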

The bigger picture

What struck us most, in the end, was the creativity required to achieve transparency across 50 states. Every broken interface became a puzzle. Every weird data structure forced us to write more flexible code. We learned to build systems that work with websites that function correctly less than half the time. 

To be very clear: we appreciate the hard work of everyone at the PUCs, and we understand that these public servants very likely had no meaningful input into website decisions or development. But these institutions are inherently smaller than the grid that connects our country, and 50 discrete processes applied to interstate products necessarily result in a lot of complexity.

As a result, there’s no single standard for what "public access" means in practice. That's worth remembering the next time you're trying to figure out why your electricity bill changed. If you need help making sense of this fragmented, complex, and hugely consequential information landscape, drop us a line: sayhi@halcyon.io
