Orange Button data taxonomy for solar financial reporting launches with developer meeting

Today SunSpec held a developers meeting for the Orange Button data taxonomy.  This is a Department of Energy project, led by the SunSpec Alliance, aiming to reduce some of the “soft costs” in solar power – specifically, the processing of financial and investment documents related to solar power systems.  The Orange Button project is tasked with developing a common data model or data taxonomy or reporting document format that can be used across the solar industry.  By reducing the overhead in solar project financing and management, it’s hoped that total costs for solar power systems will be reduced, and solar deployment can be sped up.

The Orange Button team has worked towards today for several years.  What’s special about today is that it constitutes a kind of “Launch” event, where the Orange Button data taxonomy was presented to a large audience of interested stakeholders and potential users.  The last several months have been about defining the data taxonomy and data formats, identifying some software tools, developing examples, describing use cases, writing documentation, and more.

But — I should back up and explain some things.

Data Taxonomy?

A data taxonomy means an organized system for describing data and information.  In this case the data is the description of solar arrays (and the components going into such a system), power production from the array, the ownership of the array, and all kinds of details about the array, its ownership, its financing, and so on.  There are over 4000 types of data involved in the Orange Button data taxonomy, nearly 200 kinds of documents have been defined, and an ecosystem of data exchange and processing tools/services is envisioned by the stakeholders.

For this project, SunSpec partnered with XBRL to help define the taxonomy AND the file formats for exchanging data.  XBRL US and XBRL International shepherd the XBRL format, which is very widely used within the financial industry.  For example, in the USA the Securities and Exchange Commission (the SEC) requires all financial filing documents to use the XBRL format.  As a result XBRL documents are widely used in all kinds of companies, and the document formats are used around the world.

The Dept. of Energy and SunSpec hope to achieve, for the solar industry, the same sort of gains that open data exchange has achieved in the financial markets.  As an example, those of us who invest in stocks are familiar with online services where you can get all kinds of financial data, and corporate information, about every publicly traded company.  Did you ever think about how the exact same data got to be presented on so many different websites?  It relies on standardized reporting to the SEC, and a standardized reporting format.

The problem with current solar financial reporting

According to SunSpec and others I’ve talked with, handling solar power system documents is a huge burden.  There is simply no commonality in reporting formats, document formats, or even the meaning of numbers reported in the documents.  Those tasked with generating documents like monthly reports must resort to manually extracting the data from arcane documents, to construct spreadsheets that are hopefully accurate.  The manual processes mean that mistakes get made, and they limit how many projects a given staff can handle.

One speaker, Leigh Zanone of 8minuteenergy, described a 6-month-old project which had 1400 documents already, and each month he has a new pile of documents to deal with.  Many are 100+ page contracts with multiple attachments describing technical things like the design of the solar arrays in the project.  Another speaker, from Wells Fargo, said they oversee 350 projects in the USA, which generate 2000 documents per year.

Typically the documents are PDFs, from which it’s nigh-on-impossible to automatically extract useful data.

An example is the difficulty with configuring a software/data description of the design of solar arrays in a PV system.  The solar designer produces what’s called a “Single Line Diagram” which shows all the components from the PV modules, to combiner boxes, to inverters, to overload protection devices (fuses), to the service panel and connection to the electric grid.  Converting the single line diagram to a data structure aids with producing reports based on solar performance data collected by a monitoring system.  But, a single line diagram in a PDF cannot be automagically converted to a data structure.  Instead someone must manually read the single line diagram and enter the components into a spreadsheet or application.
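To make "converting the single line diagram to a data structure" a little more concrete, here is a minimal sketch in Python.  The class name, the field names and the component kinds are my own illustration, not anything defined by Orange Button; the point is only that once the diagram exists as linked records, software can walk it.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One element of a single line diagram (illustrative, not Orange Button)."""
    kind: str                  # e.g. "pv_module", "inverter"
    label: str                 # human-readable name from the drawing
    feeds: list["Component"] = field(default_factory=list)  # downstream parts


def path_to_grid(start: Component) -> list[str]:
    """Follow the first downstream connection from a component to the end."""
    path = [start.kind]
    node = start
    while node.feeds:
        node = node.feeds[0]
        path.append(node.kind)
    return path


# Build the chain described above, starting from the grid and working back.
grid = Component("grid_connection", "Utility grid")
panel = Component("service_panel", "Main service panel", [grid])
fuse = Component("fuse", "Overload protection", [panel])
inverter = Component("inverter", "Inverter 1", [fuse])
combiner = Component("combiner_box", "Combiner 1", [inverter])
module = Component("pv_module", "Module string A", [combiner])

print(" -> ".join(path_to_grid(module)))
```

Today somebody has to type in the equivalent of those six `Component(...)` lines by hand while reading the PDF.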

Rich data in boring documents

Let’s face it – financial reporting documents aren’t the most exciting reading in the world.  The prose is dry and precise, but the content is extremely important.  Excruciating accuracy is a necessity because of the importance that everyone correctly interpret what’s in these documents.  In the case of documents for solar arrays – an example is that system owners are paid dollars based on whether the system performed to its expected level.  The folks paying those dollars require accurate reporting, and I imagine it could be a major legal problem if it was discovered the information was wrong.

Data tables in these documents are not just a pile of numbers.  Each number or word or phrase has several layers of informational meaning that must be interpreted correctly.  For example, take this solar inverter data sheet:

A solar inverter spec sheet may look like a simple data table, but each cell is rich with a variety of information.

This data sheet describes some of SMA’s solar inverters, and the example focuses on one of the cells in the information table.  There are multiple aspects to that one data item: it is a 3000 Watt inverter, with the product identification “Sunny Boy 3.0-US 208”, and so on.  Every other data item in the table has similar layers of informational meaning.
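One way to picture those layers is as a small record that carries the number together with its meaning.  This is only a sketch: the concept name `InverterRatedPowerAC` is a placeholder I made up, not an actual Orange Button taxonomy term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A single data item plus the context needed to interpret it."""
    concept: str   # what is being measured (hypothetical concept name)
    value: float   # the number printed in the table cell
    unit: str      # the unit the number is expressed in
    entity: str    # the product the number describes


def describe(fact: Fact) -> str:
    """Render a fact as one readable line."""
    return f"{fact.entity}: {fact.concept} = {fact.value:g} {fact.unit}"


# The table cell discussed above: a 3000 Watt rating for one inverter model.
fact = Fact(
    concept="InverterRatedPowerAC",  # assumption, not a real taxonomy name
    value=3000,
    unit="W",
    entity="Sunny Boy 3.0-US 208",
)

print(describe(fact))
```

A bare "3000" in a PDF table carries none of this; the reader has to supply the unit, the concept and the product from the row and column headings.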

The characteristics of the inverter are used in calculating the ratio of actual performance to the expected performance over every time period.  Actual performance is what it sounds like, the actual kilowatt-hours produced by the array, derived by monitoring the output and adding up numbers.  Expected performance is derived by a mathematical model accounting for the inverter, the type of modules, the number of modules, the weather, whether the system is near a dusty road or near pollen sources, and so on.
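The ratio itself is simple arithmetic once both numbers are available.  In this sketch the hourly readings and the expected figure are invented for illustration; a real expected value would come out of the kind of model described above, which I am not attempting to reproduce here.

```python
def performance_ratio(actual_kwh: float, expected_kwh: float) -> float:
    """Ratio of energy actually produced to energy the model expected."""
    if expected_kwh <= 0:
        raise ValueError("expected_kwh must be positive")
    return actual_kwh / expected_kwh


# Made-up monitoring readings for one period, in kilowatt-hours per hour.
hourly_kwh = [0.0, 1.2, 2.5, 2.8, 1.9, 0.4]

# "Adding up numbers": actual production is the sum of the readings.
actual = sum(hourly_kwh)

# Made-up output of an expected-performance model for the same period.
expected = 9.6

ratio = performance_ratio(actual, expected)
print(f"actual={actual:.1f} kWh, expected={expected:.1f} kWh, ratio={ratio:.1%}")
```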

All of those characteristics (and more) are defined in the Orange Button data taxonomy.

What it means is a dry boring document like an inverter spec sheet could become a rich tapestry of multiple layers of data.

What’s required is that, for example, inverter manufacturers produce spec sheets in a format where data items can be extracted in the Orange Button taxonomy format.  It’s not just inverter spec sheets but every other kind of document associated with solar projects.

The broad scope of the Orange Button data taxonomy

This slide (it and the previous one came from online training sessions in January 2018) is meant to convey the broad scope of what’s covered by the Orange Button data taxonomy.  As I said earlier, there are over 150 document types supported by the taxonomy.  The data in each document is described using concepts covered under the Data column in this slide.  Each box in the Data column includes dozens or hundreds of concepts.

For example, a Site Lease document is a contract naming the terms of the lease.  The terms can easily dive into technical descriptions of equipment and of course include lists of requirements, payment schedules, and more.  One thing which will be included is a list of the stakeholders, and each stakeholder is a rich data item including name(s), biographical descriptions, website addresses, physical addresses, telephone numbers, corporate association, and more.

The Orange Button taxonomy defines a Project as containing one or more Sites.  Each Site contains one or more Systems, and each System contains one or more Solar Arrays, each containing one or more SubArrays, and so on.  For example a Project might be a corporate campus which could be spread across many acres, covering several buildings and/or parking lots, and could even cross city boundaries.  A Site might correspond to one building, and there might be several Systems on the building and/or covering the parking lot next to the building.

A Monthly Operating Report for a Project might break down solar performance to the Array or even SubArray level, as well as aggregating performance data to the System or Site or Project level.
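The containment hierarchy and the roll-up of performance data can be sketched as a small tree.  The level names follow the ones above, but the `Node` class, the example names and the kilowatt-hour figures are all my own invention for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of the hierarchy: Project, Site, System, Array or SubArray."""
    level: str
    name: str
    kwh: float = 0.0  # measured production; only meaningful on leaf nodes
    children: list["Node"] = field(default_factory=list)

    def total_kwh(self) -> float:
        """Production for this node: its own reading, or the sum of its children."""
        if not self.children:
            return self.kwh
        return sum(child.total_kwh() for child in self.children)


# A made-up campus: one building with a rooftop system and a carport system.
roof_array = Node("Array", "Roof A", children=[
    Node("SubArray", "Roof A1", kwh=120.0),
    Node("SubArray", "Roof A2", kwh=80.0),
])
carport_array = Node("Array", "Carport", children=[
    Node("SubArray", "Carport 1", kwh=150.0),
])
site = Node("Site", "Building 1", children=[
    Node("System", "Rooftop system", children=[roof_array]),
    Node("System", "Carport system", children=[carport_array]),
])
project = Node("Project", "Corporate campus", children=[site])

# The same data answers questions at any level of the hierarchy.
for node in (roof_array, site, project):
    print(f"{node.level} {node.name}: {node.total_kwh():.0f} kWh")
```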

All the data in the Monthly report must be tagged with Orange Button taxonomy terms.  Why?  It’s for the same reason that corporate SEC filings (10Q or 10K documents) are tagged with XBRL financial taxonomy terms.  It’s so that the monthly operating report can be digested as data and used in other reporting systems.  For example the 12 monthly reports for each year can be aggregated into a yearly report.  Or a Project owner may own several projects, and need to aggregate all the monthly reports into a combined monthly report.
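Here is a rough sketch of why the tagging matters for aggregation.  Each monthly report is reduced to a dictionary keyed by taxonomy term; the two term names and all the numbers are placeholders of mine, and the simple summing only makes sense for quantities that are additive across months.

```python
from collections import defaultdict


def aggregate(reports: list[dict[str, float]]) -> dict[str, float]:
    """Sum every tagged value across a list of reports, keyed by tag."""
    totals: dict[str, float] = defaultdict(float)
    for report in reports:
        for tag, value in report.items():
            totals[tag] += value
    return dict(totals)


# Twelve made-up monthly reports using two hypothetical tags.
monthly_reports = [
    {"EnergyProducedKWh": 1000 + 100 * month, "RevenueUSD": 200 + 10 * month}
    for month in range(12)
]

yearly = aggregate(monthly_reports)
for tag, total in sorted(yearly.items()):
    print(f"{tag}: {total:.0f}")
```

The same `aggregate` function would combine the monthly reports of several projects into one owner-level report, because the tags, not the layout of each document, say which numbers belong together.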

Broad range of data/information standards

The Orange Button team drew on many related standards to define the data taxonomy

The Orange Button team did not invent the data taxonomy from scratch.  They stood on the shoulders of giants, meaning they consulted many different standards to bring together definitions for data types, for units of measurement, and more.

GAAP, for example, stands for Generally Accepted Accounting Principles, the accounting standard used in corporate SEC filings in the USA.  Solar performance turns into financial performance: it determines what folks are paid for operating their solar projects, which in turn repays the investors on their investment.  It’s therefore necessary to use GAAP to organize the financial reporting.

The envisioned direction for Orange Button

So far the result has been defining the Orange Button taxonomy, and getting a few software developers working on Orange Button products.   Without products, without documents utilizing the Orange Button taxonomy, etc, the envisioned gains will not happen.

Where Orange Button comes into play is when a Document is shared with others.  The Document must be set up to be digestible as Orange Button data.

A common model is for a monthly operating report to be uploaded to a service provider.  That service provider parses the report, storing the data in its internal data structures.  The report might even be thrown away at that point, if the service provider designed their system to be able to reproduce the report from the stored data.  Once the service provider has a pool of operating reports, it can easily construct new reports by slicing and dicing the data as necessary.
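That upload, parse, store and reproduce cycle can be sketched in a few lines.  To be clear about the assumptions: the XML below is a simplified stand-in I made up, not real XBRL instance syntax, and the concept names, project name and figures are placeholders rather than Orange Button terms.

```python
import xml.etree.ElementTree as ET

# A made-up, simplified report. Real Orange Button documents use XBRL.
REPORT_XML = """
<report project="Campus A" period="2018-01">
  <fact concept="EnergyProducedKWh" unit="kWh">12500</fact>
  <fact concept="ExpectedEnergyKWh" unit="kWh">13000</fact>
</report>
"""


def parse_report(xml_text: str) -> list[dict]:
    """Turn an uploaded report into rows suitable for an internal data store."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "project": root.get("project"),
            "period": root.get("period"),
            "concept": fact.get("concept"),
            "unit": fact.get("unit"),
            "value": float(fact.text),
        }
        for fact in root.findall("fact")
    ]


def render_report(rows: list[dict]) -> str:
    """Rebuild a readable report from stored rows alone."""
    first = rows[0]
    lines = [f"Monthly operating report: {first['project']} ({first['period']})"]
    lines += [f"{r['concept']}: {r['value']:g} {r['unit']}" for r in rows]
    return "\n".join(lines)


stored_rows = parse_report(REPORT_XML)  # the original document could now be discarded
print(render_report(stored_rows))
```

Once the rows are in a data store, "slicing and dicing" is just querying them by project, period or concept.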

The organization producing that report may have its own data pool, with a software system to produce the report.

It’s expected that Orange Button documents will be exchanged between data systems.

This diagram is my attempt to clarify what I just described.  The cloudy thing in the middle is of course The Internet.  It’s expected the data will be translated in and out of Orange Button document formats as needed.

This is of course a simplified diagram, because there is a whole tapestry of document exchanges involved with solar projects.  All those documents are being exchanged today.  It’s hoped that by ensuring those documents contain easily digestible data, all processes around financing solar projects can be streamlined.  The documents can become carriers of data, rather than simply text.


About David Herron

David Herron is a writer and software engineer living in Silicon Valley. He primarily writes about electric vehicles, clean energy systems, climate change, peak oil and related issues. When not writing he indulges in software projects and is sometimes employed as a software engineer. David has written for sites like PlugInCars and TorqueNews, and worked for companies like Sun Microsystems and Yahoo.

