Fits, misfits and cannibals - challenges in renewables power production

Last March, more than 35% of Germany’s power was produced from renewable sources, with wind farms contributing a major share. But have you ever wondered whether this figure will remain stable, or even rise steadily, over the coming months, and how this could be accomplished?

This question cannot be answered easily, and I do not promise a crystal ball. What we can do is sketch future production with statistical means in order to get a sensible feeling for the uncertainty in renewables production profiles. Moreover, we will make our life easier by showing how to invest in a portfolio with the right renewables asset mix to secure stable production figures.

To start things off, this post sets up a nice, small model that can forecast the daily production volume of renewable power plants while remaining analytically tractable. We will enhance and modify the model in upcoming posts, jumping between fits and misfits of the model and its data and testing the quality of its predictions. All calculations are done in R and implement methods from the cited academic papers. As usual, the starting point is an investigation of the data, which in our case consists of daily historical production volumes.

Data Description

What we see in the plot below are the daily production profiles of a German wind farm (turquoise) and a Romanian solar park (golden).

Production profile of wind and solar, source: Energy Transparency Report Statistics

A first look at the chart reveals a clear yearly seasonal pattern, which results from the availability of the natural resources wind and sun. Taking a closer look at the wind time series, we see that there is a lot more wind in winter than in summer. When summer ends, the wind production volume rises steadily again and stays high throughout the winter. Relying only on wind in your renewables portfolio is evidently not a good idea, as you will suffer from lower production in summer.

The solar production profile, by contrast, shows quite the opposite behaviour. Of course, sun intensity is higher in summer than in winter. Consequently, solar production should rise in the summer season, which is clearly what we observe in the chart. So we note that wind and sun have opposite production cycles. In general, for both asset classes the seasonality can be broken down further into monthly and weekly components, which make the observed production profiles as curvy as they are.

What both charts have in common is a production behaviour that quickly returns to its seasonal mean after a sudden spike. Evidently, these spikes appear far more often in wind production than in solar production. This is because wind is generally far less steady than sun intensity. Consequently, the wind asset class should be far more volatile than the Romanian photovoltaic power plant. Qualitatively, balancing wind farms with PV should therefore give us quite stable production throughout the year, which already hints at the answer to our initial question. But as this blog is of a quantitative nature, let us see what the numbers say in the end.

The Model

Having reflected a bit on the features of the data, we proceed to model selection. We stick to the model of Lucia & Schwartz or Kluge and adapt it to the power production case. We will evaluate the forecasting performance of this model after discussing its properties. In my opinion, these kinds of models are attractive because they are tractable and their parameters are relatively easy to calculate. Besides, using this formulation for production simulation purposes is not as unusual as it might seem at first sight (cf. Benth).

$\begin{align*} S(t) &= \exp(f(t)+X(t)) \\ dX(t) &= (\mu-\alpha X(t))\,dt+\sigma\,dW(t) \\ f(t) &= a_0+\sum_{i=1}^6\left[a_i \cos(2\pi\gamma_i t)+b_i \sin(2\pi\gamma_i t)\right] \end{align*}$

The dynamics of this logarithmic model are driven by a mean-reverting Ornstein-Uhlenbeck process. The parameter $\alpha$ describes the mean reversion speed, in other words how fast production returns to its seasonal mean after a spike. As usual, $\sigma$ represents the volatility of the underlying. Honestly speaking, I am a bit undecided about the parameter $\mu$. I have seen models where this parameter is left out because calibration often sets $\mu$ to zero, but for our exercise we will carry it along.

The truncated Fourier series $f(t)$ represents the seasonality seen in the data above. Many different ways of capturing seasonality (cf. Hyndsight) have been tried out; we adhere to the following approach, as it captures yearly and weekly seasonality together with their harmonics. With $t$ measured in years, we set the parameters $\gamma_1 = 1, \gamma_2 = 2, \gamma_3 = 4, \gamma_4 = \frac{365}{7}, \gamma_5 = 2\cdot\frac{365}{7}, \gamma_6 = 4\cdot\frac{365}{7}$, which correspond to annual, semi-annual, quarterly, weekly, semi-weekly and quarter-weekly seasonality factors.

Calibration

No model is useful unless it captures the features observed in real data, so this section sheds light on calibration. As stressed a few times already, seasonality plays a major role in our time series, which is why it is our first calibration objective. A simple linear regression of the logarithm of observed daily production volumes on trigonometric basis functions should do the job. To uncover yearly seasonality in our data, we use three years of historical daily production volumes. Results of the linear model are as follows:

Seasonality per asset estimated by truncated Fourier series

After determining the seasonality, we deseasonalise our log-production data by simply subtracting the seasonality function from it.

Deseasonalised observed production volume
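The seasonality regression and the deseasonalisation step can be sketched in a few lines. The post's own calculations were done in R; the following is only a Python illustration under stated assumptions: time is measured in years, production is strictly positive, and the function names (`design_matrix`, `fit_seasonality`, `deseasonalise`) as well as the synthetic data are hypothetical and not taken from the original analysis.

```python
import numpy as np

# Seasonal frequencies gamma_1..gamma_6 in cycles per year, as listed in the text
GAMMAS = np.array([1, 2, 4, 365 / 7, 2 * 365 / 7, 4 * 365 / 7])


def design_matrix(t_years):
    """Columns: intercept a_0, cosine terms a_1..a_6, sine terms b_1..b_6."""
    t = np.asarray(t_years, dtype=float)
    angles = 2 * np.pi * np.outer(t, GAMMAS)
    return np.column_stack([np.ones_like(t), np.cos(angles), np.sin(angles)])


def fit_seasonality(t_years, production):
    """Ordinary least squares of log production on the trigonometric basis."""
    basis = design_matrix(t_years)
    coef, *_ = np.linalg.lstsq(basis, np.log(production), rcond=None)
    return coef


def seasonality(t_years, coef):
    """Evaluate the fitted seasonality function f(t)."""
    return design_matrix(t_years) @ coef


def deseasonalise(t_years, production, coef):
    """Log production minus the fitted seasonality, i.e. the estimate of X(t)."""
    return np.log(production) - seasonality(t_years, coef)


# Synthetic example (assumption): three years of noise-free daily data
t = np.arange(3 * 365) / 365.0
production = np.exp(
    2.0 + 0.5 * np.cos(2 * np.pi * t) + 0.1 * np.sin(2 * np.pi * (365 / 7) * t)
)
coef = fit_seasonality(t, production)
x = deseasonalise(t, production, coef)
```

On real data the residual series `x` is what the later calibration steps work with; days with zero production would need special treatment before taking the logarithm.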

To carry out a sensible calibration of the model parameters $\{\alpha, \sigma, \mu\}$, we have to remove disturbing patterns from the time series in order to arrive at the pure stochastic part, and deseasonalisation gets us only halfway there. For a wind farm there are a lot of artifacts, which result e.g. from O&M measures at the park. We winsorise them out so that they do not disturb our estimation of $\{\alpha, \sigma, \mu\}$. After doing so, we arrive at the following stage:

Deseasonalised and winsorised observed production volumes
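Winsorisation itself is a one-liner: values beyond chosen quantiles are pulled back to those quantiles. The post does not state which limits were used, so the 1% and 99% defaults below are purely an assumption, as is the function name `winsorise`.

```python
import numpy as np


def winsorise(x, lower=0.01, upper=0.99):
    """Clip values below the `lower` quantile and above the `upper` quantile."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)


# Small example: the values 0..100 clipped to their 5% and 95% quantiles
example = np.arange(101.0)
clipped = winsorise(example, lower=0.05, upper=0.95)
```

Unlike trimming, this keeps the length of the series unchanged, which matters because the calibration that follows relies on consecutive daily observations.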

Interestingly, deseasonalisation and winsorisation clearly made different volatility regimes visible in our data. Especially for our solar plant, we see a lot more volatility in winter than in summer, which is quite understandable when we remember the weather of these seasons. With our model in mind, the hypothesis of constant volatility seems fair enough for the moment, but is perhaps too simple for future use.

Returning to our technical analysis, the remaining calibration is carried out by maximum likelihood estimation. We use the fact that the transition densities of the Ornstein-Uhlenbeck process in our model are known in closed form. This helps to boil the calibration down to the numerical estimation of $\alpha$. The likelihood is maximised with a Brent solver; the remaining Ornstein-Uhlenbeck parameters are then calculated explicitly from the estimated $\alpha$.
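To make this concrete, here is a hedged Python sketch of such a profile likelihood. It assumes the dynamics $dX(t) = (\mu - \alpha X(t))\,dt + \sigma\,dW(t)$, for which the transition over a step $\Delta$ is Gaussian: $X(t+\Delta) \mid X(t) \sim N\big(X(t)e^{-\alpha\Delta} + \tfrac{\mu}{\alpha}(1-e^{-\alpha\Delta}),\ \tfrac{\sigma^2}{2\alpha}(1-e^{-2\alpha\Delta})\big)$. For a fixed $\alpha$ the intercept and the variance have explicit maximisers, so only $\alpha$ is searched numerically. SciPy's bounded scalar minimiser (a Brent-type method) stands in for the Brent solver mentioned above. The function name `calibrate_ou`, the search bounds and the synthetic test series are assumptions, not the original R implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def calibrate_ou(x, dt=1 / 365):
    """Profile maximum likelihood for dX = (mu - alpha X) dt + sigma dW."""
    x = np.asarray(x, dtype=float)
    x0, x1 = x[:-1], x[1:]
    n = x0.size

    def profile(alpha):
        b = np.exp(-alpha * dt)          # AR(1) coefficient implied by alpha
        resid = x1 - b * x0
        c = resid.mean()                 # explicit MLE of the intercept
        v = np.mean((resid - c) ** 2)    # explicit MLE of the step variance
        return b, c, v

    def neg_loglik(alpha):
        # Negative profile log-likelihood up to an additive constant
        return 0.5 * n * np.log(profile(alpha)[2])

    # Search interval for alpha is an assumption (up to one reversion per step)
    res = minimize_scalar(neg_loglik, bounds=(1e-6, 1.0 / dt), method="bounded")
    alpha = res.x
    b, c, v = profile(alpha)
    mu = c * alpha / (1 - b)
    sigma = np.sqrt(2 * alpha * v / (1 - b ** 2))
    return alpha, mu, sigma


# Synthetic check (assumption): simulate with the exact transition density
rng = np.random.default_rng(0)
alpha_true, mu_true, sigma_true, dt = 50.0, 0.0, 5.0, 1 / 365
b_true = np.exp(-alpha_true * dt)
sd = sigma_true * np.sqrt((1 - b_true ** 2) / (2 * alpha_true))
x = np.zeros(3000)
for k in range(1, x.size):
    drift = mu_true / alpha_true * (1 - b_true)
    x[k] = drift + b_true * x[k - 1] + sd * rng.standard_normal()

alpha_hat, mu_hat, sigma_hat = calibrate_ou(x, dt)
```

With daily data the estimate of $\alpha$ is usually the least precise of the three, which fits the observation that short estimation windows give unstable results.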

Testing of calibration results

This section demonstrates the goodness of our model, or, strictly speaking, whether the model reproduces enough of the features seen in historical production that one may safely use it for future economic decisions. Disjoint quarterly, semi-annual and annual estimation horizons were tested, and I recommend using at least half a year of daily observations in order to arrive at sensible estimation results. For the sake of fun I tried out quarterly observation horizons, but arriving at a density plot that has something to do with the actual production profile seemed to be a game of roulette. If we would like to use quarterly observation horizons, we should include hourly production figures to guarantee enough data points. To demonstrate goodness of fit, we compare the density of historical log returns against an artificial normal distribution built from the calibrated parameters $\alpha$ and $\sigma$.

Calibration results with annual time windows for German wind park

Let us do the same exercise with the Romanian solar park:

Calibration results with annual time windows for Romanian solar park

Visually, we see a reasonable fit between the real data and the artificial density function with our estimated parameters. The longer the estimation horizon, the better the fit. Nevertheless, the normal distribution is not capable of capturing the tails of the empirical distribution, especially for wind production. This arises not only from the limitations of the normal distribution, but also from the already mentioned non-constant volatility in the data. Personally, I am a big fan of the Weibull distribution in this context. But before diving into the depths of the distribution hypothesis, let us draw a few normally distributed samples with a Monte-Carlo simulation of the SDE above and compare them with real data.

We will calibrate seasonality and SDE parameters to the data of 2016-2017 and plot simulated sample paths against daily production observations in 2018. Honestly speaking it would not be necessary to install Monte-Carlo machinery, as it should be sufficient to compare the seasonality function with the real data. But we will need the Monte-Carlo machine in the end, so let us deploy it now.

Carrying out a plain-vanilla simulation approach (Euler-Scheme) of the above SDE, we produce - thanks to the fantastic grammar of graphics - the following plot:

Daily simulation of German wind farm with calibration to 2016-2017 and simulation draws in 2018.

Daily simulation of Romanian solar plant with calibration to 2016-2017 and simulation draws in 2018.
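A plain Euler scheme for the model fits in a short function. The sketch below is in Python rather than the R used for the plots, assumes the dynamics $dX(t) = (\mu - \alpha X(t))\,dt + \sigma\,dW(t)$ with a daily step of $1/365$ years, and uses made-up parameter values and a made-up seasonality function. The name `simulate_production` is hypothetical.

```python
import numpy as np


def simulate_production(f_values, alpha, mu, sigma, x0=0.0,
                        n_paths=100, dt=1 / 365, rng=None):
    """Euler scheme for X, mapped to production via S = exp(f + X).

    f_values holds the seasonality function on the simulation grid.
    Returns an array of shape (n_paths, len(f_values)).
    """
    rng = np.random.default_rng() if rng is None else rng
    f_values = np.asarray(f_values, dtype=float)
    x = np.empty((n_paths, f_values.size))
    x[:, 0] = x0
    for k in range(1, f_values.size):
        z = rng.standard_normal(n_paths)
        drift = (mu - alpha * x[:, k - 1]) * dt
        x[:, k] = x[:, k - 1] + drift + sigma * np.sqrt(dt) * z
    return np.exp(f_values + x)


# Illustrative inputs (assumptions, not the calibrated values from the post)
t = np.arange(365) / 365.0
f_values = 2.0 + 0.5 * np.cos(2 * np.pi * t)
paths = simulate_production(f_values, alpha=50.0, mu=0.0, sigma=5.0,
                            rng=np.random.default_rng(1))
mean_path = paths.mean(axis=0)

# Without noise and drift the scheme reproduces the seasonal curve exactly
noise_free = simulate_production(f_values, alpha=50.0, mu=0.0, sigma=0.0,
                                 n_paths=1)
```

The mean over the simulated paths is the quantity compared with the realised production when judging the error. Because the transition density is Gaussian and known, the paths could also be drawn exactly instead of with an Euler step.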

I think some explanation of the coloring is necessary. The black jagged line is our already familiar real historical data for 2018. The noise in the background shows 100 Monte-Carlo sample paths, generated by simulating the equation throughout 2018 with parameters calibrated to 2016/2017. Red lines represent standard errors, and the grey smoothed curve is meant to give a more general impression of the error. It was generated by a spline interpolation of the absolute difference between the real data and the mean of our Monte-Carlo simulation. For the solar plant, the error is reasonably low, confirming our visual impression. Of course the simulation did not reproduce exactly the same line as the real production, but I would say the model fits the data well. For the wind farm the situation is more subtle. The periods around January 2018 and the fourth quarter of 2018 are too noisy, which results in a seasonality that underestimates the enormous spikes seen in the data. For the remaining periods, by contrast, the fit seems reasonable. Further analysis of this wind aspect is postponed to another post, but it is worth mentioning that this trait probably also has something to do with non-constant volatility.

Let us summarise what we have achieved so far: we have established a model which shows a reasonable fit to real data, but has some limitations when it comes to volatility. Simulating the model equation yields paths which are reasonable in comparison to actual production volumes. Now let us try to achieve something economically useful with the model.

Returning to our introductory question, we would like to have the most stable production possible throughout the year, or at least a production which never falls below a predefined threshold (red line).

Iteratively changing weights in mixture density distribution for our portfolio of wind and solar assets

This can be achieved by constantly shifting the production of a solar and a wind power plant up and down, which is represented by adjusting the weights in our mixture distribution from day to day. So what is the basic problem formulation in economic terms? Compensate for adverse production periods by curtailing the production of one plant and boosting the production of another that is exposed to more favourable circumstances. In mathematical terms: on a daily basis we adjust the weights of the marginal densities in the mixture distribution so that, with a probability of 99%, the left tail of the mixture distribution does not fall below a certain negative return. In other words, we optimise our plant loadings so that estimated production does not fall below e.g. the average daily production of a 50% wind and 50% solar asset mix of the year before. Well, that’s the idea, and now that we have all our technical ingredients together, let us put them into a bowl and create a nice algorithm that constantly re-balances our portfolio consisting of one big German wind farm and two smaller Romanian PV parks.

Results of this daily optimisation algorithm are shown below:

Optimised portfolio balancing for wind and solar assets.

Again, the red line indicates the average production of a 50%:50% asset mix of wind and solar. The golden line represents an asset mix of 10% wind and 90% solar; it evidently starts with a huge winter downside while performing well in summer. The blue line shows a 90% wind and 10% solar asset mix, hence the opposite picture to the 10%:90% mix. Results first: the Monte-Carlo optimised loadings of the plants (black line) show a more stable production trend at the cost of losing some upside in summertime. I also added a trend interpolation in blue throughout the year, so that we can clearly see the more stable trend. In January we started with low production in all three plants, so in terms of portfolio allocation the wind park seems not to be optimal. A wind park with a more pronounced seasonal peak in winter should have produced a more stable black line. For the sake of illustrating the idea of Monte-Carlo optimisation of a portfolio, the asset allocation used here should be sufficient.
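The post does not spell out the re-balancing algorithm, so the following Python sketch is only one possible reading of the idea for a single day, not the author's implementation. Given Monte-Carlo samples for wind and solar, it computes the 1% quantile of the two-component mixture for each candidate weight on a grid, picks the weight with the highest quantile and reports whether it clears the threshold. The function names, the grid, the selection rule and the toy samples are all assumptions.

```python
import numpy as np


def mixture_quantile(wind, solar, w, q=0.01):
    """q-quantile of the mixture w * F_wind + (1 - w) * F_solar.

    Uses the weighted empirical distribution of the pooled samples.
    """
    wind = np.asarray(wind, dtype=float)
    solar = np.asarray(solar, dtype=float)
    values = np.concatenate([wind, solar])
    weights = np.concatenate([
        np.full(wind.size, w / wind.size),
        np.full(solar.size, (1.0 - w) / solar.size),
    ])
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    # First sorted sample whose cumulative weight reaches q
    idx = min(int(np.searchsorted(cum, q)), values.size - 1)
    return values[order][idx]


def best_weight(wind, solar, threshold, q=0.01, grid=None):
    """Grid search for the wind weight with the highest mixture q-quantile."""
    grid = np.linspace(0.0, 1.0, 101) if grid is None else np.asarray(grid)
    quantiles = np.array([mixture_quantile(wind, solar, w, q) for w in grid])
    best = int(np.argmax(quantiles))
    return grid[best], quantiles[best], bool(quantiles[best] >= threshold)


# Toy samples for one day (assumption): solar clearly dominates wind
wind_samples = np.linspace(0.0, 1.0, 1001)
solar_samples = np.linspace(2.0, 3.0, 1001)
w_star, q_star, meets_threshold = best_weight(wind_samples, solar_samples,
                                              threshold=1.0)
```

Repeating this for every day of the simulation horizon, with samples taken from the simulated paths of each asset, gives a daily weight series. Practical constraints such as limits on how far a plant can actually be curtailed or boosted are not modelled here.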

Key Takeaways and further steps

We have a model which quantifies production uncertainty and an optimisation method for our portfolio to overcome low production cycles. Hence we can fix the production for some time with the right asset mix. That’s great …


Jens Juergens
