Vectors and Vectors

by Nat Bullard August 21, 2025

Halcyon has just updated two of its data products: the Large Load Tariff Tracker and the Gas Power Plant Tracker. We ship updates monthly, and for good reasons. The first is that these markets are changing fast, and matching a fast-moving market with fast-iterating products is, as they say, table stakes. The second is that these markets are highly uncertain, while outcomes within these markets are highly significant. The best way to narrow the aperture of uncertainty is to have not just the most recent, but the best-documented and most-transparent information possible.

That’s the why of what we do with data products. (It is worth stating plainly, but our customers, in a good way, already get it.) But equally important is how we do it, and how what we do creates better and more meaningful data products.

Vectors and vectors

A vector is a numerical representation of data in multi-dimensional space, typically produced by machine learning models such as transformers. Vectors are particularly useful because they work well with unstructured information such as text and images. Halcyon uses vectors.

But I also think of our work in terms of another kind of vector: an object with both magnitude and direction. These vectors move us in the direction of building better products.

Our first vector, so to speak, is our coverage. We traverse and then collect all of the unstructured information flow from every U.S. state public utility/service/corporation commission, the U.S. Regional Transmission Organizations and Independent System Operators (RTOs/ISOs), and a handful of other agencies and institutions. We’re at 4.5 million documents at the moment; eventually, it will be 100 million, and then more.

Our second vector is our ingestion and search capability. Halcyon has a purpose-built information pipeline that collects, organizes, and then embeds all of the information in our coverage universe in a way that allows us to search for similar semantic meaning across documents and across institutions. Search in this fashion gives us the ability to treat all of this information as relational outside its specific domain — allowing us to look not just into California, for instance, but into and across California, Nebraska, Florida, and Vermont (and every other state) at once. This is known as semantic search.
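For readers who want to see the mechanics, here is a minimal sketch of the idea behind semantic search: documents and a query are each represented as a vector, and results are ranked by cosine similarity. This is not Halcyon's pipeline. The document titles, the three-dimensional vectors, and the function names are all made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical documents with hand-written toy embeddings (assumption:
# in practice an embedding model would produce these vectors).
documents = {
    "CA: large load interconnection tariff": [0.9, 0.1, 0.2],
    "NE: data center service rate filing": [0.8, 0.2, 0.3],
    "FL: storm cost recovery docket": [0.1, 0.9, 0.1],
    "VT: net metering rule update": [0.2, 0.3, 0.9],
}


def search(query_vector, docs, top_k=2):
    """Return the top_k document titles most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), title)
        for title, vector in docs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_k]]


# Stand-in for the embedding of a query such as "tariffs for large loads".
query = [0.85, 0.15, 0.25]
results = search(query, documents)
print(results)
```

In this toy setup, the California and Nebraska filings rank highest even though they come from different states and share no exact wording with the query, which is the cross-institution behavior described above.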

Our third vector integrates these processes with the necessary amount of expertise. To us, that expertise is a combination of knowing what to ask of this information landscape (people who have sector experience) and informing the abstractions we build atop our data. Defining and standardizing filing types, establishing authoritativeness, classifying topics — you can only do this if you have the requisite data first and then use domain knowledge to abstract it. When you have both the data and abstractions in place, then you use semantic search to actually find what you’re looking for.

Incidentally: Large language models (LLMs) matter here, but are also the most commodifiable element of what Halcyon does. That’s not to say that models don’t matter to us — improvements in LLMs strengthen the three vectors above (and millions more in our platform). They make semantic search better, they improve our ranking ability, and they allow us to use the data we’ve collected to generate better answers.

We manifest all of that into data products.

Large Load Tariff Tracker

This week we shipped the latest update of our Large Load Tariff Tracker (LLTT), which collects and systematizes information from tariffs currently working their way through regulatory processes.

The name is fairly self-explanatory, but it’s worth adding a bit of detail: these tariffs cover agreements between large demand sources and utility power providers, with qualifying demand thresholds in individual tariffs as high as 500 megawatts. It almost goes without saying, but these specialized tariffs are of particular interest to data center operators, whose new demand might stress or completely overwhelm a utility service territory if not managed properly.

[Chart: Large Load Tariff Tracker]

The Large Load Tariff Tracker now includes 50 tariffs, up from 35 in last month's update. More are arriving every week, and each subsequent update will include them.

Gas Power Plant Tracker

This week we also shipped the latest update of our Gas Power Plant Tracker (GPPT). Three bullet points showing the information landscape month-to-month say a lot already:

June 2025: 116 assets, 55 gigawatts of generating capacity
July 2025: 134 assets, 65 gigawatts of generating capacity
August 2025: 146 assets, 72 gigawatts of generating capacity
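The month-to-month changes implied by those three data points are simple arithmetic, sketched below using only the figures listed above. The variable and function names are my own.

```python
# (month, assets, gigawatts) as listed in the three data points above.
snapshots = [
    ("June 2025", 116, 55),
    ("July 2025", 134, 65),
    ("August 2025", 146, 72),
]


def month_over_month(rows):
    """Return (month, change in assets, change in GW) for each consecutive pair."""
    changes = []
    for (_, prev_assets, prev_gw), (month, assets, gw) in zip(rows, rows[1:]):
        changes.append((month, assets - prev_assets, gw - prev_gw))
    return changes


deltas = month_over_month(snapshots)
for month, added_assets, added_gw in deltas:
    print(f"{month}: +{added_assets} assets, +{added_gw} GW")
# July 2025: +18 assets, +10 GW
# August 2025: +12 assets, +7 GW
```

Over the two updates combined, that is 30 more assets and 17 more gigawatts than in June.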

These high-level numerical changes are notable enough, but more valuable are all of the newly emerged details on new plants, as well as the discrete changes within existing assets. For instance: the Fort St. Vrain gas plant in Colorado now reports the cost of its turbines ($121 million) and also says that, had it not reserved those turbines earlier, the cost would be closer to $200 million. The Buck combined cycle plant in North Carolina requires $28 million for interconnection facilities and associated upgrade costs.

And the 400-megawatt Valmy peaking plant in Nevada, announced in May of last year, has now filed its Certificate of Public Convenience and Necessity (CPCN) and, with it, updated its capex cost by $148 million.

There are dozens of these changes every month, everything from a minor shift in timeline to a nine-figure cost increase. Halcyon publishes every one in a month-to-month changelog built into our GPPT and LLTT. It is one of my favorite product features so far.
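To make the changelog idea concrete, here is a rough sketch of how a month-to-month changelog could be derived by comparing two snapshots of the same tracker. It is an illustration under stated assumptions, not a description of how Halcyon builds the feature: the asset names ("Plant A", "Plant B"), the field names, and every value are made-up placeholders.

```python
def build_changelog(previous, current):
    """Compare two snapshots keyed by asset name and list what changed.

    Each entry is (asset, field_or_event, old_value, new_value).
    """
    entries = []
    for asset, fields in current.items():
        if asset not in previous:
            entries.append((asset, "added", None, None))
            continue
        for field, new_value in fields.items():
            old_value = previous[asset].get(field)
            if old_value != new_value:
                entries.append((asset, field, old_value, new_value))
    for asset in previous:
        if asset not in current:
            entries.append((asset, "removed", None, None))
    return entries


# Hypothetical snapshots; names, fields, and numbers are placeholders only.
previous = {
    "Plant A": {"capex_musd": 400, "online_year": 2028},
}
current = {
    "Plant A": {"capex_musd": 450, "online_year": 2028},
    "Plant B": {"capex_musd": 300, "online_year": 2029},
}

changelog = build_changelog(previous, current)
for entry in changelog:
    print(entry)
# ('Plant A', 'capex_musd', 400, 450)
# ('Plant B', 'added', None, None)
```

The design choice worth noting is that unchanged fields produce no entries, so the log stays readable whether a month brings a minor timeline shift or a large cost revision.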

[Chart: Gas Power Plant Tracker]

We fight for the users

Ultimately, all of this effort must provide more value than what exists already. Halcyon needs to provide higher-frequency information than is readily available right now, and at higher fidelity than even an expert team can efficiently provide on its own. To slightly adapt a movie line that shows this author’s age: we fight for the users.

And by extension, we fight for their use cases. Those include informing commodity trading models, power and gas supply marketing strategies, asset development plans, capital allocations, and regional differentiation. Those also include corporate strategy and planning, and policy decision making. And, we hope, those will one day include use cases that have not yet emerged: difficult to imagine without the right information, and impossible to implement without it.

Subscribe for more content like this; reach out with questions: sayhi@halcyon.io; follow us on LinkedIn and Twitter
