Downscaling ECMWF seasonal precipitation forecasts in Europe using the RCA model

The operational performance and usefulness of regional climate models at seasonal time scales are assessed by downscaling an ensemble of global seasonal forecasts. The Rossby Centre RCA regional model was applied to downscale a five-member ensemble from the ECMWF System3 global model in the European Atlantic domain for the period 1981–2001. One-month lead time global and regional precipitation predictions were compared over Europe, and particularly over Spain, focusing on dry events in autumn (SON). A robust tercile-based probabilistic validation approach was applied to compare the forecasts from the global and regional models, obtaining significant skill in both cases, but over a wider area for the latter. Finally, we also analyse the performance of a mixed ensemble combining both forecasts.


Introduction
Seasonal forecasting is nowadays a well-established operational area, and different centres around the world run global seasonal forecasting systems (Kirtman and Pirani, 2009), such as the Climate Forecast System from the National Centers for Environmental Prediction, NCEP CFS (Saha et al., 2006), the Australian POAMA (Wang et al., 2001), or the European EURO-SIP multimodel (Palmer et al., 2004; Vitart et al., 2007). Moreover, different validation studies have shown that these models have some skill in forecasting temperature and precipitation months in advance in tropical regions (see e.g. Palmer et al., 2004; Gutiérrez et al., 2005) and in mid-latitude regions such as Spain (Sordo et al., 2008; Frías et al., 2010), although the predictability is lower in the latter case and restricted to particular seasons and events (e.g. dry autumn seasons in Spain; Frías et al., 2010).
Seasonal forecasts have many potential applications for different socio-economic sectors, such as agriculture (Challinor et al., 2005) and energy (García-Morales and Dubus, 2007). However, existing application models typically require higher resolution predictions than those delivered by the global models, particularly in Spain, which has high spatial and temporal variability (Témez, 2005) and marked interannual drought indices (Almarza, 2000). Therefore, it is necessary to perform some form of downscaling in order to cope with regional scales. Regional numerical models have been extensively applied in short- and medium-range weather forecasting and also to study local effects of global climate change. However, by comparison, little has been done for the seasonal time scale (see Laprise, 2008, for a review of regional modelling). Previous efforts are mostly centred on regions strongly sensitive to the El Niño Southern Oscillation (ENSO) signal. There are studies based on a single global prediction (Gershunov et al., 2000) and on ensembles (Nobre et al., 2001; Druyan et al., 2002; Misra et al., 2003; Sun et al., 2005); however, the latter cannot fully exploit a probabilistic validation framework due to the low number of seasons simulated (the studies mentioned used a maximum of 3 yr). The results reported so far are mainly based on the reproduction of the seasonal mean for a few seasons or on comparisons of the spread among members in the global and regional models. The spatial distribution of precipitation is usually improved due to the finer resolution but, as far as we are aware, no improvement in predictability has been reported so far. Moreover, most of these studies used observed sea surface temperatures (SSTs), which are not available in a real application, instead of running a fully coupled atmosphere-ocean global circulation model.
Even though the need for a systematic analysis of the performance of dynamical downscaling at seasonal time scales is suggested in several studies (Feddersen, 2003; Doblas-Reyes et al., 2006), this analysis is still lacking. In a previous study, Díez et al. (2005) analysed the results of a regional model coupled to the ECMWF DEMETER output over a mid-latitude region with low seasonal predictability (Spain), reporting some successful results, which were only preliminary due, again, to the low number of seasons simulated (1997/1999); in that work, an operational seasonal forecast system was used instead of prescribed SST simulations. Moreover, some recent studies have reported skill for global predictions over this challenging region, in particular in the autumn (SON) season (Sordo et al., 2008; Frías et al., 2010). This study builds on these works and considers the hindcast of the latest generation ensemble seasonal forecast system at ECMWF (System3) for a period of 21 yr. The Rossby Centre (RCA) regional model was applied by AEMET to dynamically downscale these global predictions over the Euro-Atlantic region, considering the 12 initializations, month by month, around the year (Orfila et al., 2007). This long period allowed us to perform a robust probabilistic analysis of the results. Moreover, since System3 is operational and RCA can be run in operational mode, the improvements shown in this study can be directly applied by AEMET to improve the seasonal prediction over Spain, at least in the key SON season, which is the beginning of the hydrological year.
The main goal of this paper is to analyse the performance of a regional climate model (the RCA model) coupled over Europe to the global seasonal predictions of ECMWF System3 (Section 2). To this aim, a robust probabilistic validation framework is applied to compute the skill and the corresponding statistical significance of the global and regional predictions; the results are compared at a continental and a regional (Spain) level (Section 3).

Data and area of study
The RCA climate model version 3 (Kjellstrom et al., 2005) has been used in this work considering 40 levels in the vertical and a horizontal resolution of 0.5° × 0.5°. The area covered was the European Atlantic domain (15.5°N–65°N and 67°W–31°E) and the model was integrated for a period of 21 yr (1981–2001) forced using the boundary conditions provided by the ECMWF System3 hindcast. System3 has a horizontal resolution of 1.125° and 62 levels in the vertical, so only the odd levels were considered. The System3 hindcast includes 11 members which run over 7-month periods initialized every month. However, in this study we only considered five members as boundary conditions in the RCA simulations (hereafter we refer to the global five-member ensemble as SYS.5 and to the corresponding regional one as RCA.5). All the aforementioned restrictions were mainly due to limitations in the archiving of the boundaries from System3 and the large computational resources required to run the RCA for such a long hindcast period in ensemble mode.
We considered both the autumn (SON) and spring (MAM) seasons, using 1-month lead time predictions from the System3 hindcast starting in August and February, respectively. The study is focused on the results from autumn since this period corresponds to the highest signal of skill found for precipitation in Spain by Frías et al. (2010) using seasonal forecasts from the EU DEMETER project (Palmer et al., 2004). Note that the regional model is not expected to improve a prediction if it is a noisy one, with no skill. The results from spring were also analysed since this season also shows high precipitation variability over Spain; however, no significant skill was obtained in this case and, therefore, the results are not shown.
In this study we also used the E-OBS 0.5° resolution daily gridded precipitation data set over Europe produced in the ENSEMBLES project (Haylock et al., 2008); moreover, in Spain we consider an alternative 0.5° resolution daily grid prepared using the same data and methodology described in Herrera et al. (2010), covering the period from 1950 to 2002 with higher station density than E-OBS (hereafter Spain05). Note that these data sets are well suited to validating the skill of the regional seasonal forecasts, since they have the same spatial resolution as the regional model. Figure 1 illustrates the comparison of the climatologies from the observations and the RCA and System3 simulations. It is shown that the RCA model climatology captures overall the amounts of accumulated precipitation and the spatial patterns. However, it can also be seen that model precipitation is very much dominated by the orography, which introduces a bias in the predictions. Therefore, in order to assess the skill of the predictions, we need to consider an appropriate framework which is robust to bias.

Validation results
A robust probabilistic tercile-based validation approach was used in this paper to compare the results obtained from the global and regional models. This method is invariant to any increasing transformation of the predictions and/or observations and has been introduced in a previous study to assess the skill of seasonal predictions from DEMETER and System2 over Spain (see Frías et al., 2010, for further details).
This approach allows us to explore whether the forecasting system captures correctly the observed dry/normal/wet anomalies. These three probabilities represent a coarse-grained description of the forecast PDF and provide a reasonable and commonly used first choice for seasonal forecast validation. The tercile probabilities are estimated from the ensemble prediction as the fraction of the members that fall in each category (counting method). This approach is applied to the precipitation series from the five-member ensemble predictions SYS.5 and RCA.5 to obtain a probabilistic forecast. For each member and season the daily predictions were averaged to obtain a single seasonal forecast. The corresponding terciles for each ensemble member and grid point were computed for the analysis period (21 yr). Thus, for each grid point the 21 yr of seasonal values are converted to a series of tercile categories by considering values above, between or below the terciles of the whole period. Observations were also expressed in terciles using the E-OBS and Spain05 data for the same period of time.
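As an illustration, the counting method described above can be sketched for a single grid point as follows. This is a minimal sketch under stated assumptions, not the code used in the study: the function names `tercile_categories` and `tercile_probabilities`, the array layout (members × years × days) and the synthetic gamma-distributed daily values are all hypothetical, and ties at the tercile boundaries are assigned to the lower category by convention.

```python
import numpy as np

def tercile_categories(series):
    """Convert a 1-D series of seasonal values into tercile categories.

    Returns 0 (T1, dry), 1 (T2, normal) or 2 (T3, wet) for each year,
    using the terciles of the series itself as thresholds.
    """
    series = np.asarray(series, dtype=float)
    lower, upper = np.quantile(series, [1 / 3, 2 / 3])
    # right=True: values <= lower -> 0, lower < values <= upper -> 1, else 2
    return np.digitize(series, [lower, upper], right=True)

def tercile_probabilities(seasonal):
    """Counting method for one grid point.

    seasonal: array (n_members, n_years) of seasonal means.
    Returns an array (n_years, 3) with the fraction of members falling in
    each tercile category, terciles being computed member by member.
    """
    seasonal = np.asarray(seasonal, dtype=float)
    cats = np.stack([tercile_categories(member) for member in seasonal])
    return np.stack([(cats == k).mean(axis=0) for k in range(3)], axis=1)

# Synthetic example (assumed shapes): 5 members, 21 years, 91 SON days.
rng = np.random.default_rng(0)
daily = rng.gamma(shape=0.8, scale=3.0, size=(5, 21, 91))
seasonal = daily.mean(axis=-1)          # one seasonal value per member and year
probs = tercile_probabilities(seasonal)  # columns: P(T1), P(T2), P(T3)
print(probs[:3])
```

Because each member is categorized against its own terciles, any increasing transformation of a member's values (for instance, a multiplicative bias) leaves the categories, and hence the forecast probabilities, unchanged.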
As an illustrative example of the probabilistic tercile-based validation scheme, Fig. 2 shows the series of forecasted probabilities for the 21 yr (autumn season) corresponding to the RCA.5 and SYS.5 predictions for a particular grid point with high/low regional/global predictability (the point is marked by an arrow in Figs 3c and d). From a visual inspection it can easily be seen that the resolution (non-uniformity) of the prediction obtained from the RCA model varies from year to year, indicating a stronger signal given by the forecast. The circles show the corresponding observed terciles for each of the years. It can also be seen that the forecast system does a good job of predicting the observed dry anomalies (first tercile, labelled as 'T1'). In the case of the System3 predictions the probabilities are more uniform and the predictions of the dry anomalies are not as good as in the previous case.
In order to obtain a quantitative measure of the skill, we used the ROC Skill Area (RSA) score. The ROC curve shows the proportion of occurrences that were correctly forecast (the hit rate) versus the proportion of non-occurrences that were incorrectly forecast, the false alarm rate (see Jolliffe and Stephenson, 2003, for an introduction to probabilistic forecast validation). The area under the curve, A, yields the skill measure RSA = 2A − 1, which is commonly used to evaluate the performance of probabilistic systems. The value of this score ranges from 1 (perfect forecast system) to −1 (perfectly bad forecast system). A value of zero indicates no skill compared with a climatological prediction.
Fig. 3. RSA for the dry (T1) probabilistic predictions in autumn for the five-member System3 and RCA models for (a,b) Europe and (c,d) Spain, respectively. Dots indicate grid points exhibiting an RSA with a confidence higher than 95%. The arrows in panels (c) and (d) indicate the illustrative grid point considered in Fig. 2.
Figures 3(a) and (b) show the spatial distribution of the RSA over Europe corresponding to the dry tercile (T1) in autumn for the SYS.5 and RCA.5 predictions. The dots in the figure indicate those grid points exhibiting an RSA that is significantly different from 0 with a confidence higher than 95%. These confidence values were computed from 1000 bootstrapping runs (Mason and Graham, 2002) performed on the sample of predictions and observations, respectively. These figures show that some spatially extended significant skill is attained in the Iberian Peninsula, in agreement with Frías et al. (2010). Therefore, we also analysed this region in further detail, considering the Spain05 observation data set, obtaining the results shown in Figs 3(c) and (d). Note that although different observation data sets have been considered, both results are in very good agreement.
Global predictions are skillful in central Spain, whereas regional ones exhibit skill over a wider area covering the Southern and Mediterranean regions.
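For reference, the RSA and its bootstrap confidence at one grid point can be sketched as below. The score is computed as 2A − 1, with A the area under the ROC curve, which matches the range from −1 to 1 quoted above; A is obtained here through the equivalent Mann-Whitney pair-counting form rather than by integrating the curve. The resampling scheme is an assumption on our part: forecasts and observations are resampled independently with replacement to build a no-skill reference distribution, which may differ in detail from the procedure of Mason and Graham (2002) applied in the study. The names `roc_skill_area` and `rsa_confidence` are hypothetical.

```python
import numpy as np

def roc_skill_area(prob, event):
    """RSA = 2 * AUC - 1 for forecast probabilities of a binary event.

    prob:  forecast probabilities of the event (e.g. dry tercile T1), one per year.
    event: booleans, True where the event was observed.
    """
    prob = np.asarray(prob, dtype=float)
    event = np.asarray(event, dtype=bool)
    pos, neg = prob[event], prob[~event]
    if pos.size == 0 or neg.size == 0:
        return np.nan  # the ROC curve is undefined without both outcomes
    # AUC as the fraction of (event, non-event) pairs ranked correctly;
    # tied pairs count one half.
    diff = pos[:, None] - neg[None, :]
    auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size
    return 2.0 * auc - 1.0

def rsa_confidence(prob, event, n_boot=1000, seed=0):
    """Observed RSA and the fraction of no-skill resamples it exceeds.

    Assumption: forecasts and observations are resampled independently,
    which destroys their pairing and yields a no-skill distribution.
    """
    rng = np.random.default_rng(seed)
    prob = np.asarray(prob, dtype=float)
    event = np.asarray(event, dtype=bool)
    observed = roc_skill_area(prob, event)
    n = prob.size
    null = np.empty(n_boot)
    for b in range(n_boot):
        p = prob[rng.integers(0, n, n)]
        e = event[rng.integers(0, n, n)]
        null[b] = roc_skill_area(p, e)
    null = null[~np.isnan(null)]  # drop resamples with a single outcome
    return observed, float(np.mean(null < observed))
```

Under this sketch, a grid point would be marked with a dot when the returned confidence exceeds 0.95.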
As a simple indicator of model performance, we computed the area with positive RSA for each of the models, over the European and Spanish domains, respectively. In the former case we obtained values close to 50% (47.52% for SYS.5 and 52.61% for RCA.5), indicating no overall skill over the European domain (note that a 50% area would correspond to a random prediction). However, in the latter, the values increased to 59.84% (for SYS.5) and 73.23% (for RCA.5), indicating a significant increase of skill over this region, in accordance with previous studies (see Fig. 3). Moreover, the average RSA values for the SYS.5/RCA.5 models over those regions (with positive RSA values) are 0.21/0.22 for the European domain and 0.21/0.27 over Spain, respectively. This indicates that, besides enlarging the skillful area, the RCA model also improves the average skill over the resulting region, reaching a value close to 0.3 in Spain. Figure 3 also shows the areas with RSA larger than 0.3, which again indicate that the RCA model outperforms the global System3 simulations.
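The two summary indicators used above can be sketched as follows. This is a minimal sketch that assumes the area is measured by counting grid points without latitude weighting (the text does not state how the area was computed) and that grid points without observations are stored as NaN; the function name `summarise_rsa` is hypothetical.

```python
import numpy as np

def summarise_rsa(rsa_map):
    """Percentage of valid grid points with positive RSA, and their mean RSA.

    rsa_map: 2-D array of RSA values, NaN where no observations exist.
    Assumption: every grid point counts equally (no area weighting).
    """
    rsa_map = np.asarray(rsa_map, dtype=float)
    valid = rsa_map[~np.isnan(rsa_map)]
    positive = valid[valid > 0]
    percent_positive = 100.0 * positive.size / valid.size
    mean_positive = positive.mean() if positive.size else np.nan
    return percent_positive, mean_positive

# Toy 2 x 2 map: two positive points, one negative, one missing.
toy = np.array([[0.2, -0.1],
                [0.4, np.nan]])
percent, mean_rsa = summarise_rsa(toy)
print(f"{percent:.2f}% of the area has positive RSA, mean {mean_rsa:.2f}")
```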
Finally, since the areas of global and regional skill seem to be complementary (see Figs 3c and d), we also analyse the performance of a mixed ensemble obtained by joining the five-member ensembles SYS.5 and RCA.5 (referred to as MIX.10). This mixed ensemble shares features of both global and regional methods as shown in Fig. 4. However the quantification of the area with positive RSA (72%) reveals a result similar to that obtained for RCA.5. The same conclusion is obtained for the spatial average of positive RSA, with a value of 0.25 for this ensemble. The usefulness of this type of ensembles is not clear and further research is necessary to understand the benefits and limitations of this approach.
Fig. 4. RSA for the dry probabilistic predictions for the 10-member mixed ensembles. Dots indicate grid points exhibiting a RSA with a confidence higher than 95%.
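One simple way to build a mixed ensemble such as MIX.10 at a grid point is to pool the tercile categories of all members before counting, as sketched below. This is an assumed construction for illustration only, since the text does not detail how the two ensembles were joined; it also assumes that both ensembles have been brought to a common grid point beforehand. The names `member_categories` and `pooled_probabilities` are hypothetical.

```python
import numpy as np

def member_categories(ensemble):
    """Tercile category (0 = T1 dry, 1 = T2, 2 = T3 wet) per member and year.

    ensemble: array (n_members, n_years); terciles are computed for each
    member separately over the whole period.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    lower = np.quantile(ensemble, 1 / 3, axis=1, keepdims=True)
    upper = np.quantile(ensemble, 2 / 3, axis=1, keepdims=True)
    return (ensemble > lower).astype(int) + (ensemble > upper).astype(int)

def pooled_probabilities(*ensembles):
    """Counting-method tercile probabilities for the union of several ensembles.

    Each argument is an array (n_members_i, n_years) for the same grid point
    and years. Returns an array (n_years, 3).
    """
    cats = np.concatenate([member_categories(e) for e in ensembles], axis=0)
    return np.stack([(cats == k).mean(axis=0) for k in range(3)], axis=1)

# Synthetic stand-ins for SYS.5 and RCA.5 at one grid point (21 years).
rng = np.random.default_rng(0)
sys5 = rng.gamma(2.0, 1.0, size=(5, 21))
rca5 = rng.gamma(2.0, 1.5, size=(5, 21))
mix10 = pooled_probabilities(sys5, rca5)
print(mix10[:3])
```

With equally sized ensembles, the pooled probabilities are the average of the two individual forecasts, and the different climatologies of the global and regional models do not need to be reconciled because each member is categorized against its own terciles.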

Conclusions
We present the results of an unprecedented hindcast experiment consisting of dynamically downscaling an ensemble (five members) of global seasonal forecasts for autumn (SON) and spring (MAM) from the ECMWF operational model System3 over Europe for a 21-yr period. This allows for a proper statistical verification and comparison of the global and regional probabilistic predictions using a robust tercile-based validation approach.
As in other seasonal forecasting studies in mid-latitudes, significant predictability is only found for particular seasons (SON), areas (mostly over the Iberian Peninsula) and events (dry). However, the regions where the regional RCA downscaled forecasts are skillful differ from those of the global System3 ensemble. Thus, in some areas, the RCA downscaled results provide additional skill which could be translated into operational benefits.
In order to take advantage of both sources of predictability, a simple initial approach for combining the global and regional dynamical forecasts in a mixed ensemble is considered. The application of the mixed ensemble over this region provides skill similar to that of the RCA forecasts. This type of mixed ensemble, with members of different resolution (as the GCMs from the IPCC AR4) or even of different nature (e.g. dynamical and statistical downscaling), is currently a topic of active research. However, further research is needed in order to explore optimum ways of combining these types of predictions and to understand the benefits of this kind of approach.

Acknowledgments
This work was partly supported by projects ENSEMBLES from the 6th FP EU (GOCE-CT-2003-505539), EXTREMBLES (CGL2010-21869) and CORWES (CGL2010-22158-C02) from the Spanish Ministry MICINN (Plan Nacional de I+D+i) and by ESCENA (200800050084265) from the Spanish Ministry MARM. The authors are also grateful to Tim Stockdale, from ECMWF, for the arrangements in making available the System3 boundaries, and to SMHI and Met Éireann, respectively, for providing RCA and facilitating its use in the ECMWF facilities. We thank two anonymous reviewers for their suggestions for improving the manuscript.