Calculating PV Degradation Rates Using Open-Source Software

Got sensor drift, inverter clipping or data shifts due to maintenance events? RdTools, a new freeware toolkit, can handle any of these scenarios. It calculates robust degradation rates despite common performance data quality challenges.

The degradation rate (Rd) quantifies the rate at which PV systems or modules lose performance over time. Rd values not only drive the results of long-term energy production estimates for financial projections and other studies, but also help provide consumers and investors with an indicator of PV system quality and durability. In conjunction with taking other quality assurance steps, project stakeholders can also use the Rd to guide product selection and determine whether PV products or installations meet warranty terms. Accurate Rd data are therefore essential to the solar industry’s long-term success.

Here we provide an introduction to RdTools, a free and publicly available software package intended to help users evaluate Rd more easily and quickly. One of the benefits of this open-source toolkit for calculating degradation rates is that it can accommodate common challenges associated with real-world performance data, including sensor drift, clipped power curves or data shifts due to maintenance events. Since accurate methods for calculating PV degradation rates are important for manufacturers, insurers, engineers, utilities, installers, investors, businesses and consumers alike, many solar industry stakeholders may find RdTools useful.

Developing RdTools

Like module efficiency, Rd values are expressed as a percentage. However, module efficiency and module degradation rates represent very different quantities. Rd is relative to a baseline of 100% initial production. As an example, if a 22% efficient module degrades linearly at a rate of -0.6%/year, then after 25 years it retains 85% of its initial output, and its efficiency would be 22% × 0.85 = 18.7%. In RdTools and in this article, a negative degradation rate indicates a decrease in production.
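
As a quick check on this kind of arithmetic, the following minimal Python sketch computes the remaining efficiency under a strictly linear loss (relative to the initial 100% baseline) and, for comparison, under a loss that compounds each year. The function name and its arguments are illustrative only and are not part of RdTools.

```python
def efficiency_after(initial_eff_pct, rd_pct_per_year, years, compound=False):
    """Return module efficiency (%) after `years` of degradation.

    rd_pct_per_year is negative for a loss, matching the sign convention in the text.
    """
    rate = rd_pct_per_year / 100.0
    if compound:
        remaining = (1.0 + rate) ** years  # loss applied to each year's remaining output
    else:
        remaining = 1.0 + rate * years     # loss relative to the initial 100% baseline
    return initial_eff_pct * remaining


linear = efficiency_after(22.0, -0.6, 25)
compounding = efficiency_after(22.0, -0.6, 25, compound=True)
print(f"linear: {linear:.1f}%  compounding: {compounding:.1f}%")
# linear: 18.7%  compounding: 18.9%
```

The two models differ by only about 0.2 percentage points of efficiency after 25 years at this rate, but the gap grows with steeper degradation rates.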

Scientists and industry experts have long sought ways to consistently calculate accurate PV degradation rates. This is a challenging undertaking for a number of reasons. First of all, to establish a reliable basis of comparison, you must account for performance transients when establishing the 100% performance baseline. PV module performance stabilizes over a period of days or months, depending on cell technology, and it is important to use the post-stabilization value as the starting point for Rd calculations. Depending on the module technology and the project construction schedule, project stakeholders may be able to account for stabilization effects by simply waiting until after the completion of system commissioning activities to establish the 100% performance baseline value.

Additional challenges arise after commissioning and stabilization. Since degradation is not necessarily linear, analyses must tolerate nonlinearity. More importantly, a number of scenarios can degrade the quality of the data used to calculate Rd values. These complicating factors include highly variable weather, data outliers, poorly maintained sensors, seasonal soiling or shading, and data shifts from maintenance events.

To address these challenges, researchers at the National Renewable Energy Laboratory (NREL)—notably Michael Deceglie, Chris Deline, Dirk Jordan and Ambarish Nag—developed RdTools in collaboration with Greg Kimball from SunPower and Adam Shinn from kWh Analytics. In addition to being relatively accurate and easy to use, RdTools provides project stakeholders with a consensus methodology for calculating PV degradation rates in the real world. To estimate the Rd for a PV system with RdTools, users need ambient temperature data, irradiance data from a sensor or reference cell, and 2 or more years’ worth of granular (hourly or better) performance data.

The developers wrote RdTools in Python, a freely available programming language widely used for scientific computing, and built it on Sandia National Laboratories’ open-source PVLIB modeling software (see Resources). Users can run RdTools on any computer that has Python installed. Interested parties can access, download and customize RdTools via the software development platform GitHub.

How It Works

To get started with RdTools, users first enter system configuration details such as longitude, latitude, time zone and PV system mounting configuration. Although RdTools does require some source of on-site irradiance data, on-site temperature measurements are not essential as the software can model these values. Upon start-up, RdTools automatically conducts a prescreening step to check the granularity of the collected data. At present, RdTools is set up to use high-frequency performance data such as 1-minute, 15-minute or hourly values.

Two different analysis methods are available in RdTools. The sensor-based method is best if high-quality temperature and irradiance data are available, which assumes that technicians regularly clean and calibrate the project’s sensors and reference cells. The clear-sky method, which normalizes the data based on clear-sky conditions, is best if sensors have low accuracy or in cases where low-accuracy satellite measurements are the source of the data. The clear-sky method currently still requires some source of irradiance data to identify times of sunny conditions, but it does not demand perfectly cleaned or calibrated sensors.

As detailed below, RdTools follows a four-step data analysis process: First, it normalizes the data, adjusting performance relative to irradiance and temperature; second, it filters the data; third, it aggregates the data and generates periodic totals; and lastly, it calculates the median rate of degradation.

Step 1: Data normalization. In this step, RdTools divides measured production data by modeled ideal values to calculate performance ratio (PR) values. The software derives the modeled values based on meteorological and system configuration details by passing these data into a PVLIB performance model. Currently, RdTools uses PVWatts as the default PVLIB performance model.

There are two possible workflows in the data normalization step. The sensor-based method passes site-measured irradiance and temperature data directly into the PVLIB performance model, in which case the calculations may incorrectly attribute sensor errors to system degradation. Alternatively, the clear-sky method calculates PR values by normalizing site data against modeled clear-sky irradiance and long-term monthly site temperature averages, which produces results that are relatively insensitive to drifting or erroneous ground-based sensors.
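
The normalization step can be sketched in a few lines of plain Python. This is a simplified illustration, not the RdTools or PVLIB implementation: it uses the basic PVWatts DC relationship (power proportional to irradiance, with a linear temperature correction), and the array rating, temperature coefficient and input values are all hypothetical. It also assumes that cell temperature has already been estimated from ambient temperature.

```python
def pvwatts_dc_power(poa_w_m2, cell_temp_c, pdc0_w, gamma_pdc=-0.004, temp_ref_c=25.0):
    """Modeled DC power (W) from a PVWatts-style model.

    pdc0_w is the array's DC rating at 1000 W/m^2 and temp_ref_c;
    gamma_pdc is the power temperature coefficient (1/degC). Both the
    rating and the -0.4%/degC default are illustrative assumptions.
    """
    return (poa_w_m2 / 1000.0) * pdc0_w * (1.0 + gamma_pdc * (cell_temp_c - temp_ref_c))


def performance_ratio(measured_w, poa_w_m2, cell_temp_c, pdc0_w):
    """Measured power divided by modeled power, or None if the model output is not positive."""
    modeled_w = pvwatts_dc_power(poa_w_m2, cell_temp_c, pdc0_w)
    if modeled_w <= 0:
        return None
    return measured_w / modeled_w


# Sensor-based workflow: the model inputs come from site sensors.
pr_sensor = performance_ratio(measured_w=700.0, poa_w_m2=800.0, cell_temp_c=45.0, pdc0_w=1000.0)

# Clear-sky workflow: the model inputs come from a clear-sky irradiance model
# and long-term average temperatures (hypothetical values shown here).
pr_clearsky = performance_ratio(measured_w=700.0, poa_w_m2=820.0, cell_temp_c=43.0, pdc0_w=1000.0)

print(round(pr_sensor, 3), round(pr_clearsky, 3))
```

The only difference between the two workflows in this sketch is where the irradiance and temperature inputs come from, which is why a drifting sensor affects the first result but not the second.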

Step 2: Data filtering. This step filters data to remove problematic points, including power-curve clipping from a high dc-to-ac ratio, low and anomalous high-irradiance values, and improbable temperature measurements. For the clear-sky method, the software also filters data points based on the clear-sky index to specifically consider sunny conditions.
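
A filtering pass of this kind might look like the following sketch. Every threshold here (the irradiance and temperature bounds, the 99% clipping margin and the ±20% clear-sky index tolerance) is an illustrative assumption rather than a documented RdTools default, and the function name is hypothetical.

```python
def keep_point(power_w, poa_w_m2, cell_temp_c, clip_limit_w,
               poa_bounds=(200.0, 1200.0), temp_bounds=(-50.0, 110.0),
               clearsky_poa_w_m2=None, csi_tolerance=0.20):
    """Return True if a data point passes every filter (all thresholds are assumptions)."""
    # Drop low and anomalously high irradiance values.
    if not poa_bounds[0] <= poa_w_m2 <= poa_bounds[1]:
        return False
    # Drop improbable temperature measurements.
    if not temp_bounds[0] < cell_temp_c < temp_bounds[1]:
        return False
    # Drop points at or near the inverter's power limit (clipping).
    if power_w >= 0.99 * clip_limit_w:
        return False
    # Clear-sky method only: keep points whose clear-sky index is close to 1.
    if clearsky_poa_w_m2 is not None:
        if clearsky_poa_w_m2 <= 0:
            return False
        clearsky_index = poa_w_m2 / clearsky_poa_w_m2
        if abs(clearsky_index - 1.0) > csi_tolerance:
            return False
    return True


points = [
    # (power_w, poa_w_m2, cell_temp_c)
    (500.0, 800.0, 40.0),   # ordinary mid-day point
    (995.0, 1000.0, 50.0),  # clipped at a 1000 W inverter limit
    (60.0, 150.0, 20.0),    # irradiance too low
]
kept = [p for p in points if keep_point(*p, clip_limit_w=1000.0)]
print(kept)
```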

Step 3: Data aggregation. In this step, the analysis averages the filtered and irradiance-weighted PR data over the aggregation period. This results in a single PR value per aggregation period, which is typically daily.
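
As a rough illustration of irradiance-weighted daily aggregation, the sketch below groups PR samples by calendar day and weights each sample by its irradiance, so that mid-day points count for more than early-morning ones. The function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def daily_weighted_pr(samples):
    """samples: iterable of (timestamp, pr, poa_w_m2).

    Returns {date: irradiance-weighted mean PR} for days with nonzero irradiance.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for timestamp, pr, poa in samples:
        day = timestamp.date()
        weighted_sum[day] += pr * poa
        weight_total[day] += poa
    return {day: weighted_sum[day] / weight_total[day]
            for day in sorted(weighted_sum) if weight_total[day] > 0}


samples = [
    (datetime(2017, 6, 1, 10, 0), 0.90, 400.0),
    (datetime(2017, 6, 1, 12, 0), 1.00, 800.0),
    (datetime(2017, 6, 2, 12, 0), 0.95, 900.0),
]
daily = daily_weighted_pr(samples)
for day, pr in daily.items():
    print(day, round(pr, 4))
```

Switching to a weekly period would only require changing the grouping key, for example to the ISO year and week of each timestamp.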

Step 4: Rd calculation. RdTools utilizes a year-on-year (YOY) method of analysis to calculate degradation rates. In this step, the software calculates a series of slopes between any two daily values that are separated by 365 days. This means that if there are 3 years of production data, the software will calculate 730 annual slopes. In the event that there are no data for a particular day—due to data filtering or an outage—the software will not calculate slopes to or from that date. Once RdTools has calculated all the annual slopes, it generates a histogram based on the combined data and reports the median value as the system’s rate of degradation.
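
The year-on-year calculation can be illustrated with a short, self-contained sketch on synthetic data. It is not the RdTools source code: in particular, expressing each slope as a percentage of the earlier value is an assumption made here for simplicity, and the data are an idealized linear decline with no noise.

```python
from datetime import date, timedelta
from statistics import median


def yoy_slopes(daily_pr):
    """daily_pr: {date: PR}. Returns %/year slopes for every pair of days 365 days apart."""
    slopes = []
    for day, pr_start in daily_pr.items():
        later = day + timedelta(days=365)
        # Skip the pair if either end is missing (filtered out or an outage).
        if later in daily_pr and pr_start > 0:
            slopes.append(100.0 * (daily_pr[later] - pr_start) / pr_start)
    return slopes


# Three years of synthetic daily PR values losing 0.5% of initial output per year.
start = date(2015, 1, 1)
daily_pr = {start + timedelta(days=i): 1.0 - 0.005 * i / 365 for i in range(3 * 365)}

slopes = yoy_slopes(daily_pr)
rd = median(slopes)
print(len(slopes), round(rd, 2))  # 730 slopes, median near -0.5 %/year

# Removing one day drops the slope that starts there and the slope that ends there.
with_gap = dict(daily_pr)
del with_gap[start + timedelta(days=400)]
print(len(yoy_slopes(with_gap)))  # 728
```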

Customization. One of the best features of RdTools is that users are free to customize the software to fit their needs. Users with some knowledge of the Python programming language can customize aspects of RdTools to better match system and data characteristics. For example, while the default data aggregation period is daily, users can easily change this to a weekly period. Customizability provides users with the ability to optimize RdTools on a per-project basis. Users can adjust data filtering parameters to account for climates that are more or less cloudy than normal or to account for inverter power limiting in systems with a high dc-to-ac ratio. Users can also customize PVLIB system performance models based on specific system configuration details. Due to the open-source nature of RdTools, software developers can communicate with one another, report bugs, review code and propose new functionality via the GitHub repository.

Methodological Improvements

Researchers at NREL and other industry stakeholders have tested and compared many Rd calculation methodologies, weighing factors such as ease of use and the amount of time needed to determine a degradation rate with a relatively low degree of uncertainty. RdTools not only compares favorably in these regards, but also offers statistically robust analysis in relation to common problems associated with data quality. Specifically, RdTools avoids errors associated with linear regressions and tolerates imperfect sensor data as well as seasonality and seasonal soiling.

YOY analysis. The YOY analysis method in RdTools represents an improvement over classic linear regression analyses. The problem with regression line slopes is that these are sensitive to data outliers near the beginning or end of the line, as terminal data have high statistical leverage in regression analyses. Objectively filtering for outliers in a regression analysis is complex, as the filter needs to move in tandem with the unknown degradation rate to follow the gradual downward shift of the data.

In their IEEE Journal of Photovoltaics article “Robust PV Degradation Methodology and Application” (see Resources), Dirk Jordan and his co-authors found that a YOY method of calculating Rd reduced uncertainty relative to two different types of linear regression analyses. Because the YOY analysis calculates a median value from a distribution of Rd slopes, it is less sensitive to data outliers, as well as snow and soiling events. The YOY method is also resilient to data shifts, which often occur as the result of software changes or maintenance events such as sensor replacement.

If a data shift is subtle enough to go unnoticed, it can influence the results of linear regression analyses. By contrast, a median YOY Rd value is resistant to the influence of this type of data shift, as it will appear as an outlier on the histogram in a YOY analysis. Missing data have a similar effect. If end-of-year data are missing, data analysts conducting a linear regression analysis need to eliminate data for the last fraction of the year so that seasonal effects do not have an undue influence on the Rd results. The YOY technique, meanwhile, is tolerant of seasonal issues, meaning that analysts can use the full data collection time span, including fractional years.
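
The effect of an unnoticed data shift can be demonstrated on synthetic data. The sketch below builds six years of noiseless daily values with a true loss of 0.5%/year and a hypothetical 3% downward shift at the start of year four, then compares an ordinary least-squares fit with the median of the year-on-year slopes. All values and function names are illustrative.

```python
from statistics import median

DAYS = 6 * 365
# True loss of 0.5%/year, plus a 3% downward shift from the start of year 4.
pr = [0.995 ** (d / 365) * (0.97 if d >= 3 * 365 else 1.0) for d in range(DAYS)]


def yoy_median(values):
    """Median of the %/year slopes between values 365 days apart."""
    slopes = [100.0 * (values[d + 365] - values[d]) / values[d]
              for d in range(len(values) - 365)]
    return median(slopes)


def least_squares_rd(values):
    """Ordinary least-squares slope, in %/year relative to the fitted intercept."""
    n = len(values)
    years = [d / 365 for d in range(n)]
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 100.0 * slope / intercept


yoy_rd = yoy_median(pr)
ols_rd = least_squares_rd(pr)
print(f"YOY median: {yoy_rd:.2f} %/year, least squares: {ols_rd:.2f} %/year")
```

In this example only one-fifth of the year-on-year pairs straddle the shift, so they sit in the tail of the distribution and the median stays at the true rate, while the regression line reports a loss more than twice as steep. The median's resistance depends on the affected pairs remaining a minority; with only three years of data and a shift in the middle, half of the pairs would straddle it.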

Another problem with linear regression analyses is that they assume linearity. In the real world, however, linearity is not necessarily the case, as Jordan and others have shown in the Progress in Photovoltaics article “PV Degradation Curves: Non-Linearities and Failure Modes” (see Resources). RdTools’ YOY analysis method limits the impact of nonlinearity by showing a distribution of degradation rates rather than a single value. If a system has, for example, two different degradation rates, switching from one to another at some point in time, users may see a pair of bumps in the histogram instead of a single peak. To detect nonlinear degradation, RdTools users can analyze multiple periods of time in 2-plus–year increments to estimate these different Rd values.
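
The idea of analyzing separate 2-year periods can be sketched as follows, using synthetic data in which the degradation rate changes from -0.3%/year to -1.0%/year after the second year. The rates, the window boundaries and the function name are hypothetical, and real data would include noise that this example omits.

```python
from statistics import median


def yoy_median(values):
    """Median of the %/year slopes between values 365 days apart."""
    slopes = [100.0 * (values[d + 365] - values[d]) / values[d]
              for d in range(len(values) - 365)]
    return median(slopes)


# Four years of daily values: -0.3%/year for two years, then -1.0%/year.
switch = 2 * 365
pr = []
for d in range(4 * 365):
    if d < switch:
        pr.append(0.997 ** (d / 365))
    else:
        pr.append(0.997 ** 2 * 0.99 ** ((d - switch) / 365))

first_period = yoy_median(pr[:switch])   # years 1-2 only
second_period = yoy_median(pr[switch:])  # years 3-4 only
whole_record = yoy_median(pr)            # mixes both regimes

print(round(first_period, 2), round(second_period, 2), round(whole_record, 2))
```

The single value for the whole record falls between the two period values, which is why the histogram, rather than the median alone, is the better indicator that the rate has changed.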

The assumption of linearity is also problematic if Rd calculations are conflated with the accuracy of nameplate ratings. The “Compendium of Photovoltaic Degradation Rates” (see Resources) compiles more than 11,000 degradation rates and reveals that studies relying on a single performance measurement produce different findings than those based on more detailed analysis. In particular, taking only one unconfirmed performance data point and comparing it to the nameplate rating, rather than to measured performance data, may be inaccurate, especially in the case of older modules where the nameplate rating may have been slightly under- or overestimated. Newer PV modules have tended to demonstrate more accurate nameplate ratings, and quality modules are typically rated to take into account initial stabilization. The authors conclude that Rd calculations based on multiple clear-sky measurements, including initial post-stabilization values, are more accurate than those based on nameplate ratings and a single performance data point only.

Imperfect sensor data. Based on their experience analyzing numerous fielded PV systems, the team of developers responsible for RdTools observed that irradiance sensors are not always well maintained in the real world. It was important, therefore, to develop an analysis method that could tolerate imperfect sensor data. The data presented in the IEEE Journal of Photovoltaics article demonstrate RdTools’ usefulness in this regard.

The data in Figure 1a, for example, aggregate measured plane-of-array irradiance (Gpoa) values for a variety of sensors. The blue diamonds represent a regularly maintained reference cell; the red circles represent the median of 10 regularly maintained pyranometers; and the green triangles, black squares and purple triangles represent unmaintained sensors. The data for photodiodes 1 and 2 and reference cell 2 illustrate the sensor drift that can occur when technicians do not regularly clean and calibrate irradiance sensors in the field. Compared to the reference cell, data from the unmaintained sensors drift by as much as 1.5% per year. 

The data in Figure 1b illustrate the extent to which the clear-sky method in RdTools tolerates imperfect sensor data. The red circles in this figure represent YOY degradation rates according to RdTools’ sensor-based calculation method; the blue diamonds represent YOY degradation rates according to the clear-sky methodology. The dashed line shows the median of 10 different conventional methods of Rd calculation, including time-series analysis and quarterly I-V measurements, and the green lines show the range of one confidence interval for these values. These data illustrate that while sensor-based Rd calculations are sensitive to the quality of ground-based measurements, the clear-sky method is considerably more robust.

As shown in Figure 2, analysts can also use RdTools to compare clear-sky–based (2a, top) versus sensor-based (2b, bottom) results. For the graphs on the left, RdTools normalizes performance ratio data to 1 and charts these values by year. The graphs on the right aggregate these YOY data into a histogram and report the median value as the Rd. The confidence interval represents one standard deviation of a bootstrap distribution. In this example, which assumes a clear-sky index filter of ±20%, the drop-off at the end of the sensor-based data in 2016–17 indicates a recent sensor problem. In this case, the clear-sky–based results are likely to be more accurate than the sensor-based results.
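
A bootstrap estimate of the uncertainty in a median can be sketched as follows. This is a generic illustration of the technique, not the RdTools implementation: the slopes are randomly generated stand-ins for a real YOY distribution, and the number of resamples, the seed and the function name are arbitrary assumptions.

```python
import random
from statistics import median, stdev


def bootstrap_median(slopes, n_resamples=1000, seed=0):
    """Return (median, standard deviation of the bootstrap medians)."""
    rng = random.Random(seed)
    # Resample the slopes with replacement and record each resample's median.
    medians = [median(rng.choices(slopes, k=len(slopes))) for _ in range(n_resamples)]
    return median(slopes), stdev(medians)


# Stand-in for a distribution of YOY slopes, centered on -0.5 %/year.
rng = random.Random(42)
slopes = [rng.gauss(-0.5, 1.0) for _ in range(500)]

rd, spread = bootstrap_median(slopes)
print(f"Rd = {rd:.2f} %/year, one standard deviation = {spread:.2f} %/year")
```

The spread shrinks as more slopes become available, but it only reflects scatter within the data set; it says nothing about a systematic error such as a drifting sensor.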

When the clear-sky and sensor-based results disagree, analysts should suspect a sensor problem and, if possible, arrange for sensor testing, calibration, cleaning or replacement. Sensor maintenance is a best practice, as there is likely an upper limit to the degree of sensor error that either analysis method can tolerate. It is important to note that the mathematical uncertainty represented by a confidence interval reflects the degree of variation within the given data set, but does not account for a problem such as a defective or unmaintained sensor. Confidence intervals are susceptible to the garbage-in, garbage-out challenge of all data analysis. However, YOY analysis with clear-sky normalization enables analysts to utilize, rather than discard, some poorly maintained sensor data.

If a system has well-maintained irradiance and temperature sensors, the clear-sky and sensor methods are likely to produce similar results and graphs. In a system with well-maintained sensors, the best option is probably to use the sensor-based degradation rate calculation since the uncertainty represented by the confidence interval can be lower compared to the clear-sky method, as is the case in Figures 2 and 3b.

Seasonality and seasonal soiling. Many PV systems experience predictable seasonal performance variations based on annual weather patterns, haze, spectral sensitivity, partial shading, snow or soiling. Whereas linear regression analyses are vulnerable to seasonal effects, the YOY methods that RdTools uses to calculate Rd values are more robust.

As an example, the repetitive data patterns in Figure 3a (top) are the result of variations in power production for a PV system in California due to seasonal soiling. As shown in the inset detail, soil builds up on the array throughout the dry season, resulting in a steadily decreasing performance ratio; cleaning or rain events produce a noticeable upward data shift. The data in Figure 3b (bottom) show that a standard least squares (SLS) linear regression analysis overestimates the rate of degradation compared to degradation rates obtained using clear-sky– and sensor-based YOY methods. These results suggest that the two YOY methods are robust in relation to seasonal soiling events, a characteristic that likely extends to other seasonal effects such as haze or partial shading.

While the YOY and clear-sky methods are less sensitive to a number of common data quality issues, analysts still require quality input data and good analysis decisions to achieve high-quality results. Prior to proceeding with any calculations, data analysts should assess data quality, check the PV system’s maintenance log and look for issues that can mimic module degradation. Is there evidence that overgrown weeds or trees may be shading the system? Has the site experienced tracker outages? If so, analysts can determine an appropriate response, such as applying data filters or removing certain time periods from the analysis. While it is still essential that input data are reasonably accurate, the RdTools software package provides system owners and data analysts with consistent and validated methods for calculating PV degradation rates.


Katherine Jordan / Complex Review / Denver

Michael Deceglie / NREL / Golden, CO

Chris Deline / NREL / Golden, CO

Dirk Jordan / NREL / Golden, CO


Jordan, Dirk, et al., “Compendium of Photovoltaic Degradation Rates,” Progress in Photovoltaics, February 2016

Jordan, Dirk, et al., “PV Degradation Curves: Non-Linearities and Failure Modes,” Progress in Photovoltaics, July 2017

Jordan, Dirk, et al., “Robust PV Degradation Methodology and Application,” IEEE Journal of Photovoltaics, December 2017

Stein, J.S., et al., “PVLIB: Open-Source Photovoltaic Performance Modeling Functions for Matlab and Python,” IEEE 43rd Photovoltaic Specialists Conference, 2016
