www.elsevier.com/locate/energy
A multivariate qualitative model for the prediction of daily
global radiation from three hourly global radiation values
L. Ramirez a, Ll. Mora-López b, M. Sidrach-de-Cardona c,*
a Dpto. Energías Renovables/CIEMAT, Avd. Complutense 22, 28040 Madrid, Spain
b Dpto. Lenguajes y C. Computación, Campus de Teatinos, Universidad de Málaga, 29071 Málaga, Spain
c Dpto. Física Aplicada II, E.T.S.I. Informática, Campus de Teatinos, Universidad de Málaga, 29071 Málaga, Spain
Received 10 December 1998
Abstract
A new model for the prediction of daily global radiation using three hourly radiation values is proposed. The model is obtained by multivariate regression analysis. The hourly clearness index and several qualitative variables are used as independent variables. The hourly values are obtained from ground-network measurements of hourly global radiation corresponding to the hours in which Meteosat secondary images are available over Europe. The qualitative variables allow us to include additional non-numerical information, specifically the season of the year. The proposed model is the same for all the locations analysed. It can be used for the prediction of daily global radiation based on hourly global radiation data obtained from satellite images. © 2001 Elsevier Science Ltd. All rights reserved.
1. Introduction
One important limitation for the development of solar energy systems is the often scarce availability of information about solar radiation resources. Appropriate knowledge of energy resources is necessary for many important tasks, such as site selection for big solar plants or the right design and size of solar installations. At present, data obtained from ground stations are used for these tasks. Unfortunately, these data are not available for all locations, especially in underdeveloped countries. For this reason, other sources are used to analyse the availability of solar energy [1,2]. Several years ago the evaluation of solar radiation was proposed using geostationary satellite images [3]. The use of these satellite images, the resolution of which ranges from 2.5 to 5 km² (at the subsatellite point), allows simultaneous analysis of different geographical zones.
* Corresponding author. Tel.:+34-95-2132722; fax:+34-95-2131450.
E-mail address: [email protected] (M. Sidrach-de-Cardona).
0360-5442/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0360-5442(00)00056-6
In order to determine solar radiation from satellite images, several steps must be carried out. First, global radiation must be estimated from a single image. Several methods have been used for this: one calculates instantaneous global radiation from a specific image [4], while others calculate the hourly global radiation value corresponding to the time the image was taken [5–8]. The second approach is the most widely used; in Europe in particular, hourly values of global radiation are estimated from Meteosat secondary images, and these are the ones used in this study. In this approach, the second step is the estimation of daily global radiation from the few available hourly values. Fig. 1 shows the steps followed in order to perform this estimation. Several models have been used for this purpose. Noia et al. [9,10] give a review of statistical and physical methods. The most commonly used models are the following:
- The model proposed by Diabaté [11,12]: monthly mean daily radiation is calculated from monthly mean hourly values.
- The model proposed by Raschke [13]: this model is based on the hypothesis that the ratio between daily radiation and the sum of the hourly values is the same for terrestrial data as for extraterrestrial data. This is as follows:

$$\frac{G_d}{\sum_{j=1}^{3} G_{h,j}} = \frac{G_{d0}}{\sum_{j=1}^{3} G_{h0,j}} \qquad (1)$$

where $G_d$ and $G_{d0}$ are daily global radiation and extraterrestrial daily global radiation, respectively, and $G_{h,j}$ and $G_{h0,j}$ are hourly global radiation and extraterrestrial hourly global radiation, respectively, for each hour ($j$).
- The model proposed by C. Delorme [14]: in this model a clearness ratio for clear sky is defined ($KE_1$) and equated with the ratio between daily global radiation and the sum of the three hourly values:

$$KE_1 = \frac{G_{d1}}{\sum_{j=1}^{3} G_{h1,j}} \quad \text{and} \quad \frac{G_d}{\sum_{j=1}^{3} G_{h,j}} = \frac{G_{d1}}{\sum_{j=1}^{3} G_{h1,j}} \qquad (2)$$

where $G_{d1}$ is the daily global radiation for clear sky, and $G_{h1,j}$ is the hourly global radiation for clear sky, for each hour ($j$). As Delorme proposes, this ratio can also be calculated with a simple equation based on the Julian day and latitude:

$$KE_1 = KE_1(EQ) + 0.1\,(JD - 12) + 2 \cdot 10^{-4}\,\phi\,(JD - 12)^2 \qquad (3)$$

where $KE_1(EQ)$ is the value of $KE_1$ during the equinoxes, $JD$ is the Julian day and $\phi$ is the latitude.

Fig. 1. Steps to follow in order to obtain daily global radiation from satellite images. $G_h$ is each of the hourly global [...]
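As a rough illustration of Eq. (1), the sketch below scales the sum of the three hourly values by the extraterrestrial daily-to-hourly ratio. The function name and the sample numbers are hypothetical and were chosen only to show the arithmetic; they are not data from this study.

```python
def raschke_daily(gh, gh0, gd0):
    """Estimate daily global radiation by rearranging Eq. (1).

    gh  : the three hourly global radiation values
    gh0 : the matching extraterrestrial hourly values
    gd0 : extraterrestrial daily global radiation
    All inputs must share the same energy unit (e.g. Wh/m2).
    """
    if len(gh) != 3 or len(gh0) != 3:
        raise ValueError("exactly three hourly values are expected")
    return gd0 * sum(gh) / sum(gh0)


# Made-up values, for illustration only.
gd_estimate = raschke_daily(gh=[500.0, 650.0, 600.0],
                            gh0=[900.0, 1100.0, 1050.0],
                            gd0=8000.0)
```

Eq. (2) has the same structure, with clear-sky quantities in place of the extraterrestrial ones, so the same function could be reused with clear-sky inputs.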
The daily global radiation obtained with our model will be compared with the results of the model proposed by Raschke. Comparing our model with Delorme’s model would be more difficult because the sources of error in the two models could be very different. In particular, the error in Delorme’s model could be due to the type of clear sky model applied, to Eq. (3) used to calculate KE1, and to the fact that the hours must be “centred” in relation to midday, which is not the case in the model we propose. On the other hand, comparisons with the model proposed by Diabaté are not possible because his model is only applicable to mean daily radiation values.
This paper proposes a new model for estimating a daily value from three hourly values. The model is based on a multivariate regression analysis in which the dependent variable (daily solar radiation) is a function of three numerical independent variables (the hourly values) and of six new qualitative, or binary, variables. These new variables allow us to take into account non-numerical information, such as the season in which the data were gathered and the clearness index at midday.
The following hypothesis has been made: if there is a high correlation between daily global radiation and the three values of global hourly radiation (incorporating the binary variables), then it is possible to use this correlation to estimate daily global radiation from hourly values obtained from satellite images. The hourly radiation values used for the model adjustment correspond to the hours the Meteosat satellite secondary images were taken.
2. Data set
In the development of a model, two stages must be differentiated: the training stage and the application stage. Although the final objective of this paper is the calculation of daily global radiation from three hourly values obtained by processing satellite images (the application stage), pyranometrical hourly values are used during model training. Training the model on pyranometrical data allows it to be studied carefully and will enable us to identify possible sources of error when hourly global radiation from processed satellite images is used directly.
The data used in the training stage are hourly global radiation values from eleven Spanish locations whose latitudes range from 36° to 43°, recorded by the Spanish National Meteorological Institute between 1975 and 1986. The daily global radiation is obtained as the sum of all hourly data during a day.
First of all, the hours for which Meteosat secondary images are available in Spain were evaluated. The best way to evaluate solar radiation over Spain is by using only one image covering the whole of the Iberian Peninsula in the visible channel. Regarding secondary images, only the format C2D covers this area, and only three slots (30 minute intervals) are available. In Table 1, we show a detailed timetable of the hours linked to these three images [15].
Once the times were evaluated, the pyranometrical hourly radiation data corresponding to the hours labelled as 11:00, 13:00, and 14:00, together with the daily value, were extracted from the database. In the development of the new model we use a derived variable instead of global radiation: the clearness index. It is defined as the ratio of terrestrial to extraterrestrial global radiation:

$$K_d = \frac{G_d}{G_{d0}} \quad \text{and} \quad K_h = \frac{G_h}{G_{h0}} \qquad (4)$$

where $K_d$ is the daily clearness index and $K_h$ is the hourly clearness index.
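A minimal sketch of Eq. (4) follows. The function name and the radiation values are hypothetical placeholders; in practice the extraterrestrial values would come from a solar geometry calculation that is outside the scope of this sketch.

```python
def clearness_index(g, g0):
    """Ratio of terrestrial to extraterrestrial global radiation (Eq. 4)."""
    if g0 <= 0:
        raise ValueError("extraterrestrial radiation must be positive")
    return g / g0


# Made-up values in Wh/m2, for illustration only.
kd = clearness_index(4500.0, 8000.0)
kh = [clearness_index(g, g0)
      for g, g0 in zip([500.0, 650.0, 600.0], [900.0, 1100.0, 1050.0])]
```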
A summary of the evaluated data is shown in Table 2: the mean daily global radiation (Gd), the mean value of the daily clearness index (Kd), the total number of days with data, the period covered, the total number of years, and the latitude and longitude of each location.

Table 1
Detailed timetable of the hours related to the three selected images

| Slot | Scanning start time | Iberian Peninsula scan time (a) | Hour centred on the scan time (b) | Pyranometrical data time (c) | Label of the time in treatment |
|---|---|---|---|---|---|
| 21 | 10:00 | 10:21 | (10–11) | 11 | 1 |
| 25 | 12:00 | 12:21 | (12–13) | 13 | 2 |
| 27 | 13:00 | 13:21 | (13–14) | 14 | 3 |

(a) The satellite needs 25 minutes to scan 2500 lines. As the centre of the Iberian Peninsula is around line 2100, its scan time is approximately twenty minutes later than the scanning start time.
(b) As the pyranometrical data are recorded in whole hours, it is necessary to select the most similar periods. In this case, the difference is only ten minutes, which is not very significant.

Table 2
Summary of the data set used in the analysis

| Location | Gd (kWh) (a) | Kd (b) | Days | Period | Years | Latitude | Longitude |
|---|---|---|---|---|---|---|---|
| Madrid | 4.5 | 0.55 | 2920 | 79/86 | 8.0 | 40°28′ | −3°34′ |
| Badajoz | 4.7 | 0.55 | 2731 | 76/83 | 7.5 | 38°52′ | −6°58′ |
| Sevilla | 4.7 | 0.57 | 2872 | 75/84 | 7.9 | 37°22′ | −5°59′ |
| Palma | 4.3 | 0.52 | 3530 | 75/84 | 9.7 | 39°33′ | +2°43′ |
| Logroño | 4.1 | 0.51 | 1460 | 81/84 | 4.0 | 42°27′ | −2°20′ |
| Castellón | 4.4 | 0.54 | 1680 | 79/84 | 4.7 | 39°57′ | −0°04′ |
| Málaga | 4.7 | 0.55 | 3255 | 75/84 | 8.9 | 36°40′ | −4°29′ |
| Murcia | 4.6 | 0.56 | 3438 | 75/84 | 9.4 | 38°00′ | −1°10′ |
| Oviedo | 3.1 | 0.40 | 3377 | 75/84 | 9.3 | 43°21′ | −5°52′ |
| Santiago | 3.3 | 0.41 | 1694 | 75/84 | 4.7 | 42°53′ | −8°25′ |
| Tortosa | 4.1 | 0.51 | 1580 | 80/84 | 4.3 | 40°49′ | +0°29′ |

(a) Total mean of the daily global radiation.
(b) Total mean of the clearness index.

3. Proposed model
The proposed model is the final result of an evolution which started from simpler models [16]. The first step was the analysis of the relationship between daily global radiation and hourly global radiation. The results of the linear regression, fitted first with all the locations as a whole and then location by location, were evaluated. The correlation was high: 95% (R² = 0.95) of the variance of the daily global radiation could be explained by using the three hourly values as independent variables, with no dependence on location. On the other hand, there was a seasonal component in this relationship that had to be eliminated in order to satisfy the basic assumptions of regression analysis. To eliminate it, clearness index values were used instead of global radiation values. The linear regression of the daily clearness index on the three hourly clearness index values gives R² = 0.96.
In the next step, dummy variables related to the seasons were incorporated. An analysis of the residuals revealed a structure related to the seasons and, in order to remove it, dummy variables were introduced as independent variables. These variables group the observations according to the season of the year and so allow non-numerical information to be included in the model. Their introduction makes it unnecessary to perform a separate analysis for each season.
The last step addressed the remaining differences between the distribution function of the estimated series and that of the measured series. In most ranges of the daily clearness index, the estimates of the new model were closer to the measured values. The probability distribution function of the hourly clearness index was evaluated, and a breakpoint around 0.5 was detected for each of the three hours. The hourly clearness index at 13:00 was selected to group the observations.
Table 3
Estimates of coefficients $a_i$ of Eq. (5)

| $a_1$ | $a_2$ | $a_3$ |
|---|---|---|
| 0.3894 | 0.1126 | 0.4293 |

With these dummy variables, it was possible to group our observations according to the information available. The dummy variables were defined for each observation as follows: $F_{i,j} = 1$ if $K_{h,2} \in I_i$ ($1 \le i \le 2$) and month $\in M_j$ ($1 \le j \le 3$), and $F_{i,j} = 0$ otherwise. $I_i$ represents the group to which the clearness index ($K_{h,2}$) of the observation belongs. These groups are defined as $I_1 = [0, 0.5)$ and $I_2 = [0.5, 1]$. $M_j$ represents the seasonal group to which the observation belongs. These groups are defined as follows:
$M_1$ (winter) = months 1, 2, 11, 12;
$M_2$ (autumn-spring) = months 3, 4, 9, 10;
$M_3$ (summer) = months 5, 6, 7, 8.
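The grouping rules for the dummy variables can be sketched as follows. The function and constant names are hypothetical, and assigning the boundary value 0.5 to the upper group is an assumption based on the half-open range 0 ≤ Kh2 < 0.5 used in Table 4.

```python
SEASON_MONTHS = {
    1: (1, 2, 11, 12),   # M1, winter
    2: (3, 4, 9, 10),    # M2, autumn-spring
    3: (5, 6, 7, 8),     # M3, summer
}


def season_group(month):
    """Return j such that the month belongs to M_j."""
    for j, months in SEASON_MONTHS.items():
        if month in months:
            return j
    raise ValueError("month must be in 1..12")


def clearness_group(kh2):
    """Return i such that Kh2 belongs to I_i.

    Assumption: 0.5 itself is assigned to I2.
    """
    return 1 if kh2 < 0.5 else 2


def dummy_variables(kh2, month):
    """Return {(i, j): F_ij}; exactly one entry equals 1."""
    i_obs, j_obs = clearness_group(kh2), season_group(month)
    return {(i, j): int(i == i_obs and j == j_obs)
            for i in (1, 2) for j in (1, 2, 3)}


# A July observation with a 13:00 clearness index of 0.62 falls in (I2, M3).
flags = dummy_variables(kh2=0.62, month=7)
```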
The dependent variable in our model is the daily clearness index, $K_d$, and the independent variables are the hourly clearness indices and the dummy variables. That is,

$$K_d = \sum_{j=1}^{3} a_j K_{h,j} + \sum_{i=1}^{2} \sum_{j=1}^{3} g_{i,j} F_{i,j} + \text{Error} \qquad (5)$$

where $K_{h,1}$, $K_{h,2}$ and $K_{h,3}$ are the hourly clearness index values at 11:00, 13:00, and 14:00, respectively, and $a_j$ ($1 \le j \le 3$) and $g_{i,j}$ ($1 \le i \le 2$, $1 \le j \le 3$) are unknown parameters. This equation was estimated by ordinary least squares.
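To make the estimation step concrete, the following sketch fits a model with the structure of Eq. (5) by ordinary least squares on synthetic data. It is only an illustration under stated assumptions: the data are randomly generated from arbitrary parameters (not the paper's), the helper names are hypothetical, and solving the normal equations by Gaussian elimination is one simple way to obtain the least-squares solution, not necessarily the procedure used by the authors.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def design_row(kh, month):
    """Three hourly indices followed by the six dummies F_ij of Eq. (5)."""
    i_obs = 1 if kh[1] < 0.5 else 2
    j_obs = 1 if month in (1, 2, 11, 12) else 2 if month in (3, 4, 9, 10) else 3
    dummies = [float(i == i_obs and j == j_obs)
               for i in (1, 2) for j in (1, 2, 3)]
    return list(kh) + dummies


# Synthetic, noise-free data from arbitrary "true" parameters (an assumption).
random.seed(0)
true_beta = [0.4, 0.1, 0.4, 0.02, 0.03, 0.04, 0.01, 0.02, 0.03]
rows, y = [], []
for _ in range(600):
    kh = [random.uniform(0.1, 0.8) for _ in range(3)]
    row = design_row(kh, random.randint(1, 12))
    rows.append(row)
    y.append(sum(b * v for b, v in zip(true_beta, row)))

beta_hat = ols(rows, y)  # a1..a3 followed by g11, g12, g13, g21, g22, g23
```

Because each observation has exactly one dummy equal to 1, the six dummies together play the role of an intercept, which is why no separate constant term is added to the design matrix.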
The results of this fit are shown in Tables 3 and 4. The coefficient of determination (R²) is 0.97. It has also been verified that the results do not improve if this regression model is fitted separately for each location.
4. Results
The daily clearness index obtained using our model (hereafter, proposed model) and the value obtained using the model proposed by Raschke (model 1) are compared with the measured data.
Table 4
Estimates of coefficients $g_{i,j}$ of Eq. (5)

| Month | (1, 2, 11, 12) | (3, 4, 9, 10) | (5, 6, 7, 8) |
|---|---|---|---|
| $0 \le K_{h,2} < 0.5$ | 0.0191 | 0.0195 | 0.0194 |
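As a worked illustration of how the tabulated coefficients enter Eq. (5), the sketch below evaluates the fitted model (without the error term) for an observation whose 13:00 clearness index lies below 0.5. The function and variable names are hypothetical, the input values are made up, and the function covers only the coefficient row listed in Table 4.

```python
A = (0.3894, 0.1126, 0.4293)                 # a1, a2, a3 from Table 3
G_LOW = {1: 0.0191, 2: 0.0195, 3: 0.0194}    # g_1,j from Table 4 (Kh2 < 0.5)


def predict_kd_low_clearness(kh, season):
    """Evaluate Eq. (5) without the error term for group I1.

    kh     : (Kh1, Kh2, Kh3), hourly clearness at 11:00, 13:00, 14:00
    season : 1 (winter), 2 (autumn-spring) or 3 (summer)
    """
    if not 0 <= kh[1] < 0.5:
        raise ValueError("only the 0 <= Kh2 < 0.5 coefficients are listed")
    return sum(a * k for a, k in zip(A, kh)) + G_LOW[season]


# Made-up winter observation with all three hourly indices equal to 0.40.
kd_hat = predict_kd_low_clearness(kh=(0.40, 0.40, 0.40), season=1)
```

With these inputs the estimate is 0.9313 × 0.40 + 0.0191 ≈ 0.392, slightly below the hourly indices themselves.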
Fig. 2. Daily clearness index calculated from measured data versus daily clearness index estimated from Model 1 for the localities considered.
Figs. 2 and 3 plot the measured daily clearness index against the index estimated with model 1 and with the proposed model, respectively. From Fig. 2 we can conclude that, for all values of Kd, the estimates of model 1 are higher than the actual values. Moreover, this overestimation increases with Kd. In Fig. 3, on the other hand, the estimates are very similar to the measured data for all values of Kd.
The results of the proposed regression model do not improve if a model is estimated separately for each location, i.e., the model is general for all the locations we have analysed. For instance, Figs. 4 and 5 show the measured values for Madrid together with the estimates of clearness index obtained with model 1 and the proposed model, respectively. It can be seen that these figures are similar to the figures obtained with all the locations as a whole (Figs. 2 and 3).
Fig. 3. Daily clearness index calculated from measured data versus daily clearness index estimated from the proposed model for the localities considered.
Fig. 4. Daily clearness index calculated from measured data versus daily clearness index estimated from model 1 (data from Madrid).
Fig. 5. Daily clearness index calculated from measured data versus daily clearness index estimated from the proposed model (data from Madrid).
The comparison between measured and estimated series was made by analysing the distribution function. Once it was verified that both series had similar distribution functions, the differences between the models were evaluated using the mean relative error and the root mean square error. The differences between the distribution functions were evaluated with the Kolmogorov–Smirnov test [17]. For each range of daily global radiation, the value of $D_n$ is calculated as:

$$D_n = |T_n(x) - T(x)| \qquad (6)$$

where $T_n(x)$ is the distribution function of the measurements and $T(x)$ is the distribution function of the evaluated series. The critical value calculated for n = 28,537 is ≈ 0.01. Fig. 6 shows the distance between the distribution function of each series and that of the measured one. The critical value is drawn as a horizontal line. For an estimated series to be considered as having the same distribution as the measured series, the distance between them must not exceed this critical value. If the distance is lower in all ranges, it can be assumed that the measured and estimated series have the same distribution. As shown in Fig. 6, the series estimated with the proposed model has the same distribution function as the measured series, whereas none of the previous models yields similar results.
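The distance of Eq. (6) and a large-sample critical value can be sketched as below. Two points are assumptions rather than details taken from the text: the sketch takes the largest distance over all observed values instead of evaluating it range by range, and it uses the common textbook coefficient 1.36 (5% significance level), since the significance level behind the quoted critical value is not stated. The function names are hypothetical.

```python
from bisect import bisect_right
from math import sqrt


def ecdf(sorted_values, x):
    """Empirical distribution function of a sorted sample, evaluated at x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def max_cdf_distance(measured, estimated):
    """Largest |Tn(x) - T(x)| over all observed values (cf. Eq. 6)."""
    a, b = sorted(measured), sorted(estimated)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


def critical_value(n, coefficient=1.36):
    """Large-sample critical distance.

    Assumption: 1.36 corresponds to a 5% significance level.
    """
    return coefficient / sqrt(n)


d_crit = critical_value(28537)
```

Under this assumption the critical distance is about 0.008, the same order of magnitude as the value of approximately 0.01 quoted above.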
The mean relative errors are calculated as:

$$RE = \frac{100}{N} \sum_i \frac{|G_{d,i} - G_{med,i}|}{G_{med,i}} \qquad (7)$$

where $G_d$ represents one of the calculated series and $G_{med}$ the measured one (both in Wh/m²) for each day ($i$), and $N$ is the total number of observations (28,537). The results for the models studied are:

$RE_{\text{model 1}} = 13.8\%$, $RE_{\text{model 2}} = 25.1\%$, $RE_{\text{proposed model}} = 5.4\%$
As can be seen, the best accuracy is given by the proposed model, followed by model 1, and finally model 2.
Fig. 6. Results of the Kolmogorov–Smirnov test. The horizontal line in the figure marks the critical value of the distance between the estimated series and the measured one.
The root mean square error (RMSE) is a standard parameter evaluated in most related work, and allows us to compare these results with them. The RMSE is defined as follows:

$$RMSE = \sqrt{\frac{\sum_i (G_{d,i} - G_{med,i})^2}{N}} \qquad (8)$$

The RMSEs calculated for each series (in Wh/m²) are:

$RMSE_{\text{model 1}} = 170$, $RMSE_{\text{model 2}} = 292$, $RMSE_{\text{proposed model}} = 87$.
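Both error measures, the mean relative error of Eq. (7) and the RMSE of Eq. (8), can be computed with a few lines of code. The sketch below uses hypothetical function names and three made-up daily values in Wh/m²; it only illustrates the formulas and does not reproduce the results reported here.

```python
from math import sqrt


def mean_relative_error(estimated, measured):
    """Eq. (7): mean of |Gd - Gmed| / Gmed, expressed in percent."""
    n = len(measured)
    return 100.0 / n * sum(abs(e - m) / m for e, m in zip(estimated, measured))


def rmse(estimated, measured):
    """Eq. (8): root mean square error, in the same unit as the inputs."""
    n = len(measured)
    return sqrt(sum((e - m) ** 2 for e, m in zip(estimated, measured)) / n)


# Made-up daily values in Wh/m2, for illustration only.
measured = [4000.0, 5000.0, 2000.0]
estimated = [4200.0, 4900.0, 2100.0]
re_pct = mean_relative_error(estimated, measured)
rmse_wh = rmse(estimated, measured)
```

For these three days the relative errors are 5%, 2% and 5%, giving a mean relative error of 4%, while the RMSE is the square root of 20,000, about 141 Wh/m².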
5. Conclusions
Most estimations of daily global radiation from satellite images require models that calculate the daily value from a small number of images per day (usually three). Different methodologies have been suggested to do this. One of them estimates the hourly values of global radiation from satellite images and then calculates the daily value. In this work, a model for predicting daily global radiation from three hourly values is proposed. The model has been obtained using multivariate regression analysis. Its independent variables are three values of the hourly clearness index and six qualitative variables. These qualitative variables allow us to include non-numerical information in the model; in previous models, only quantitative variables were used. The coefficient of determination of the proposed regression is 0.97, and the predicted and measured values of the dependent variable yield similar distribution functions. The mean relative error is 5.4%, and the root mean square error is 87 Wh/m². These errors are smaller than the ones obtained with the other methods. The results of the proposed regression model do not improve if a model is estimated separately for each location, i.e., the model is general for all the locations we analysed.
The results have also been compared to those obtained with a similar model proposed earlier. In the locations under study, the new model provides better results than others. It remains to test whether the proposed model continues to yield good results in locations whose radiation levels differ from those found in the locations we used. Since the model has not been tested for different latitudes, it can only be said that it can be applied to a similar range of latitudes to the ones used in its development.
Acknowledgements
This work has been partially supported by the Science and Technology Interministry Commission, Spain, Project No. CLI 98/0847. We are grateful to the Spanish National Meteorological Institute (Madrid) for providing us with the data used in this work.
References
[1] Perez R, Seals R, Stewart R, Zelenka A, Estrada-Cajigal V. Using satellite-derived insolation data for the site/time specific simulation of solar energy systems. Solar Energy 1994;53(6):491–5.
[2] Perez R, Seals R, Stewart R, Zelenka A. Comparing satellite remote sensing and ground network measurements for the production of site/time specific irradiance data. Solar Energy 1997;60(2):89–96.
[3] Katsaros KB. Solar radiation data from satellite images. Book review. Bull American Meteorol Soc 1987;68:1441.
[4] Hay JE, Hanson KJ. A satellite-based methodology for determining solar irradiance at the ocean surface during GATE. Bull American Meteorol Soc 1978;59:1549.
[5] Tarpley JD. Estimating incident solar radiation at the surface from geostationary satellite data. J Appl Meteor 1979;18:1172–83.
[6] Raphael C, Hay JE. An assessment of models which use satellite data to estimate solar irradiance at the earth’s surface. J Clim App Meteor 1984;23:832–44.
[7] Cano D, Monget JM, Albuisson M, Guillard H, Regas N, Wald L. A method for the determination of the global solar radiation from the meteorological satellite data. Solar Energy 1986;37(1):31–9.
[8] Beyer HG, Constanzo C, Reise C. Multiresolution analysis of satellite-derived irradiance maps: an evaluation of a new tool for the spatial characterization of hourly irradiance fields. Solar Energy 1985;55(1):9–20.
[9] Noia M, Ratto CF, Festa R. Solar irradiance estimation from geostationary satellite data: I, Statistical models. Solar Energy 1993;51(6):449–56.
[10] Noia M, Ratto CF, Festa R. Solar irradiance estimation from geostationary satellite data: II, Physical models. Solar Energy 1993;51(6):457–65.
[11] Diabaté L, Demercq M, Michaud-Regas N, Wald L. Estimating incident solar radiation at the surface from images of the earth transmitted by geostationary satellites: the Heliosat Project. Int J Sol Energy 1988;5:261–78.
[12] Diabaté L, Moussu G, Wald L. Description of an operational tool for determination of solar radiation at ground using geostationary satellite images. Solar Energy 1989;42(3):201–7.
[13] Raschke E, Gratzki A, Reiland M. Estimates of global radiation at the ground from reduced data sets of the international satellite cloud climatology project. J of Climatology 1985;7:205–13.
[14] Delorme C, Mohamed C, Otmani A. Détermination d’une irradiation solaire journalière à partir de trois irradiations horaires à 9, 12 et 15 h. Revue Phys Appl 1989;24:1023–7.
[15] EUMETSAT. The Meteosat system. EUM TD 05, 1998.
[16] Ramírez L. Desarrollo y aplicación de modelos de cálculo de la radiación global diaria. Internal Report. University of Murcia (Spain), 1998.
[17] Rohatgi VK. An introduction to probability theory and mathematical statistics. New York: John Wiley and Sons, 1976.