The Unexpected Impact of Representative Test Procedures

[Image: a window air conditioning unit on a cabin]


Many people join the sustainability industry because they want to make a difference. They enter with visions of planting trees and scrubbing crude oil from ducklings, not realizing that there are other, less romantic ways to make an impact on the world. In fact, lots of dry and “boring” projects end up having the greatest impact. Some of the most important and impactful work we’ve done at Cadeo is developing and advocating for representative test procedures.

Good to Know

Test procedure – The set of instructions a laboratory follows when testing a piece of equipment. The test procedure prescribes setup parameters, instrumentation, and lab conditions, and it identifies the measurements required to calculate the metric (e.g., power, temperature, pressure).
Metric – A metric is the standardized measure of a piece of equipment’s efficiency. Metrics are the “result” of a test procedure. A test procedure prescribes how the metric is calculated—often using an equation that combines lab-measured parameters, physical constants, real-world conditions, and weighting factors.

Test procedures are used to rate the energy efficiency of a variety of equipment, including refrigerators, window air conditioner (AC) units, industrial air compressors, water pumps, lightbulbs, and gas furnaces. Because each type of equipment is different, each test procedure and efficiency metric used for rating is also different. The goal of test procedure design is to make sure that equipment that uses less energy gets a better rating and that each test procedure is “representative” of how equipment operates in real life.

The term “representative test procedure” is pretty opaque, so we’re going to discuss how we assess representativeness while also illustrating key concepts and sharing some of the things we advocate for to improve test procedures.

Options for Efficiency Metrics

To understand whether a test procedure is representative, you first have to understand the different options for metrics. There is a world of variation within “efficiency” metrics, but there are two key types worth recognizing. Efficiency (in its simplest form) is the ratio of power delivered to power consumed at a single point in time. For example, say you have a box fan running at high speed on a hot day. Its efficiency is the amount of air it’s moving compared to the amount of electricity it’s using. Depending on the type of electric motor and/or the shape of the fan blades, the efficiency of a fan will vary.

The Weighted Average Efficiency

Unlike that simplest efficiency, a weighted average efficiency (sometimes called an input power metric) is not an instantaneous measure of efficiency. Instead, it accounts for all power delivered and consumed over a period of use, which could be a few hours, a season, or a year. With the same box fan, if you turn it down to half speed, the instantaneous efficiency may actually be worse, but the fan will use less power at the lower speed. If you measure the input power at both speeds over a day, the low-speed setting uses less energy. There are no universal terms for these types of metrics, and both fall within the general category of an “efficiency metric.” The key concept is that there is variability between metrics, and each type will give you different information.
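To make the box fan example concrete, here is a minimal Python sketch. The power and airflow figures are made up purely for illustration; real fan ratings would come from lab measurements.

```python
# Hypothetical box-fan numbers, for illustration only.
# "Efficiency" here is airflow delivered per watt consumed (CFM per W).
HIGH = {"power_w": 100.0, "airflow_cfm": 2000.0}  # 20.0 CFM/W
LOW = {"power_w": 40.0, "airflow_cfm": 700.0}     # 17.5 CFM/W

def instantaneous_efficiency(speed):
    """Simplest efficiency: output delivered per unit of power consumed."""
    return speed["airflow_cfm"] / speed["power_w"]

def energy_over_period_wh(speed, hours):
    """Energy consumed (watt-hours) running at one speed for a period."""
    return speed["power_w"] * hours

# Low speed is less efficient at any given instant...
assert instantaneous_efficiency(LOW) < instantaneous_efficiency(HIGH)
# ...but uses far less energy over a full day of operation.
assert energy_over_period_wh(LOW, 24) < energy_over_period_wh(HIGH, 24)
```

The two assertions capture the whole point: an instantaneous efficiency metric and a period-of-use energy metric can rank the same two operating modes in opposite order.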

Test procedures that rate efficiency at one speed or one operating condition are easier to develop and implement. Many of the first-generation equipment test procedures used this simple approach for their metrics. Efficiency metrics like this are good for addressing low-hanging fruit, but most of that easy energy savings has already been achieved. Simple efficiency metrics are bad at capturing the energy savings of equipment that can operate at multiple speeds to precisely match demand, or equipment that is part of a larger system (like heating, ventilation, and air-conditioning [HVAC] equipment). Because of this, we advocate for test procedures that use efficiency metrics that have been appropriately weighted for an average period of use and that account for both active and standby operating modes.

Average Period of Use

The average period of use for a piece of equipment will vary depending on how it’s used. For our box fan example, a reasonable period of use might be “when the fan is operating.” If the fan has only one speed, you can test at that speed, and you’ve appropriately rated the average period of use. But if the fan has two speeds, you might want to test the fan’s efficiency at both speeds and take an average of the two to get the fan’s average efficiency. Or, if you know from a (hypothetical) recently published box fan study that most Americans run their portable fans at low speed 90% of the time and high speed only 10% of the time, you could instead weight the low speed to represent 90% of the final rating.
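That 90/10 weighting is simple arithmetic. A short sketch, with hypothetical efficiency values for the two fan speeds, shows how the weighted rating differs from a naive average:

```python
# Hypothetical usage weights from the (made-up) box fan study:
# low speed 90% of operating time, high speed 10%.
weights = {"low": 0.90, "high": 0.10}
efficiency_cfm_per_w = {"low": 17.5, "high": 20.0}  # illustrative values

# A naive average treats both speeds as equally common.
simple_average = sum(efficiency_cfm_per_w.values()) / 2  # 18.75 CFM/W

# The weighted rating reflects how people actually run the fan.
weighted_rating = sum(
    weights[s] * efficiency_cfm_per_w[s] for s in weights
)  # 0.9 * 17.5 + 0.1 * 20.0 = 17.75 CFM/W
```

Because the fan spends most of its life at the less efficient low speed, the weighted rating comes out lower than the naive average—closer to what an owner would actually experience.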

Well… what if you also know the fan uses a small amount of energy even when it’s turned off? A truly representative metric would account for all energy consumed, so you can change your average period of use to “one summer,” but now you need to do more testing to get that standby energy consumption and do more research to figure out how to appropriately weight standby in your final metric.
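A sketch of that summer-long accounting, with every number hypothetical (hours of use, speed shares, and standby power are all assumptions for illustration):

```python
# A summer-long period of use, with hypothetical numbers throughout.
SUMMER_HOURS = 92 * 24  # roughly June through August

active_hours = 0.25 * SUMMER_HOURS   # assume the fan runs 25% of the time
standby_hours = SUMMER_HOURS - active_hours

power_w = {"low": 40.0, "high": 100.0, "standby": 0.5}
active_share = {"low": 0.90, "high": 0.10}  # share of active hours per speed

active_wh = sum(power_w[s] * active_share[s] * active_hours for s in active_share)
standby_wh = power_w["standby"] * standby_hours

# Total seasonal energy: the quantity a consumer would see on a utility bill.
seasonal_kwh = (active_wh + standby_wh) / 1000.0
```

Even with a tiny standby draw, the accounting now depends on research-derived weights (how often the fan runs, at which speed, and how long it sits idle), which is exactly why extending the period of use adds testing and research burden.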

The Case for Longer Periods of Use

We advocate for longer periods of use, generally a year of use or a heating or cooling season. This makes efficiency metrics more complex, but better represents how the equipment is actually used and the total energy use consumers will see on their utility bills. Even a simple piece of equipment like a box fan can get complicated fast when you consider multiple speeds or modes of operation and varying conditions over the period of use. HVAC ratings are even more difficult to determine because HVAC units have modulating control and multiple components that impact energy consumption. Testing equipment at one operating condition is much easier; it will give you some information about energy use, but as we’ve shown, that’s usually not the whole story.

Attributes of a Representative Test Procedure

The previous sections describe two things that we look at to improve representativeness of a test procedure (and metric). However, there are many things to consider before we can crown a test procedure as representative. There are three high-level checks we use[1] to determine whether a test procedure is representative:

  1. Does it give the correct rank order?
  2. Does it account for all the things that use or save energy?
  3. Do the test conditions match reality?

Rank Order

As we’ve said, a test procedure is used to develop a final rating for each piece of equipment. These ratings differentiate equipment from each other. If a test procedure is representative, the rank order of equipment will align with total energy consumption over the period of use. This means that equipment that uses less energy will have higher ratings and equipment that uses more energy will have lower ratings. This informs consumers making purchasing decisions so they can correctly weigh upfront cost against energy costs over time. Correct rank order also means that setting federal standards will be fair. Because federal standards have a big impact on where a manufacturer will spend research and design dollars, a nonrepresentative test procedure can mean manufacturers spend time developing products with higher ratings and not necessarily developing products that use less energy.
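The rank-order check boils down to comparing two sorted lists. A sketch with hypothetical models, ratings, and measured energy figures:

```python
# Rank-order check (sketch): sort equipment by its published rating and by
# its measured seasonal energy use; a representative metric produces the
# same ordering. All models and numbers below are hypothetical.
models = {
    "Model A": {"rating": 14.0, "seasonal_kwh": 900},
    "Model B": {"rating": 13.0, "seasonal_kwh": 1100},
    "Model C": {"rating": 15.5, "seasonal_kwh": 820},
}

# Best first: highest rating, and lowest energy consumption.
best_by_rating = sorted(models, key=lambda m: models[m]["rating"], reverse=True)
best_by_energy = sorted(models, key=lambda m: models[m]["seasonal_kwh"])

rank_order_holds = best_by_rating == best_by_energy
```

In practice the "measured" side of this comparison requires field data or detailed modeling over the period of use, which is what makes the check expensive to run but valuable when it fails.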

Accounts for All the Things

The second check is to ensure the test procedure accounts for all the energy-saving and energy-consuming features of the equipment. Most test procedures are good at accounting for the major energy-consuming features, such as the electricity consumed by an AC’s refrigeration compressor during full-load operating mode. Test procedures are getting better at considering auxiliary energy consumption, such as standby energy, controls, defrost, and crank case heaters.

Generally, though, test procedures are bad at accounting for energy losses or gains that are part of a system but not directly measured during full- or part-load operation. A key example of this is economizers and heat recovery equipment for commercial HVAC products. An economizer will bring in cool outside air to meet a building’s cooling load instead of running the AC to re-cool air recycled from the building. Heat recovery can recover energy from exhaust air streams and transfer it to fresh ventilation air. Both technologies can save a significant amount of energy (about 10 to 50 percent of seasonal energy)[2] because they reduce how often the AC or gas heating must run. Existing test procedures don’t value these impacts. A representative test procedure can account for these energy-saving components and ensure equipment with them gets credit in the final rating.

Test Conditions Match Reality

The third attribute of a representative test procedure is test conditions that match reality. While testing a furnace at room temperature in a lab is convenient, a furnace realistically only runs when it is cold outside, so the combustion air will be much colder than room temperature. Another example is faucets and showerheads, which laboratories should test at water pressures commonly available in residential settings.

Not all test procedures used today satisfy all three checks, and some do better than others on each one. These high-level checks help us identify where to look for improvements. We also need to revise test procedures periodically because a representative test procedure is a moving target. New energy-saving features enter the market every year and average climate conditions are changing—we need to be able to adjust.


In an ideal world, energy efficiency consultants would make a recommendation, and test procedure development committees, field researchers, and our government would jump to implement it. Let me tell you—we have some ideas for creating the perfect commercial rooftop unit test procedure. Unfortunately, we know that the theoretical metric would be a huge, complicated, expensive hassle to implement. Often, we must reconcile our ideas for improvements with the burden of more complex testing and the value added to the metric. Changing test procedures has real impacts on manufacturers’ business and consumers’ ability to afford equipment. This is why it’s so important to consider the effects of our work. That being said, we find opportunities every day to improve test procedures that make sense for all stakeholders.

Test procedure development is arguably one of the drier areas of the energy efficiency world, but it’s hugely impactful. Voluntary programs, energy codes, regulations, marketing materials, and anyone trying to decide what equipment to buy all use test procedures. Making them representative does not make them any more interesting, but it does ensure that they’re accurate, useful, and fair.

[1] There are other criteria for test procedure design, such as repeatability and reproducibility. We focus on representativeness in our work.

[2] Cadeo Group, Energy Savings from Efficient Rooftop Units in Heating Dominated Climates, prepared for the Northwest Energy Efficiency Alliance (NEEA), Portland, Oregon: NEEA, 2022.

Rebecca Hovey
Rebecca is a senior associate on the Emerging Tech team at Cadeo.
