Statistical analysis

The statistical comparison of model predictions with ground tracer concentration measurements could be carried out only some time after the tracer release experiment, due to the time required by the chemical analysis of samples.

Since models simulated the first 60 hours of the experiment in real time, the statistical analysis of model predictions involved 20 values of three-hour average concentrations for each of the 168 stations.
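The count of 20 values per station follows directly from the 60-hour window and the three-hour averaging period. A minimal sketch of that arithmetic is given below; the variable names are illustrative, and the total is only an upper bound, because not every station has a complete sequence of observations.

```python
# Sizes taken from the text; the variable names are illustrative only.
SIMULATED_HOURS = 60   # period simulated in real time
AVERAGING_HOURS = 3    # length of each averaging interval
N_STATIONS = 168

values_per_station = SIMULATED_HOURS // AVERAGING_HOURS

# Upper bound on observation-prediction pairs per model
# (stations with missing samples contribute fewer pairs).
max_pairs = values_per_station * N_STATIONS

# End of each averaging interval, in hours after the release start
# (assumes the intervals are contiguous and start at the release).
interval_ends = [AVERAGING_HOURS * (k + 1) for k in range(values_per_station)]

print(values_per_station, max_pairs, interval_ends[-1])
```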

The statistical analysis was divided into time, space and global analysis. With very few exceptions, the results of these three analyses were in agreement: a model with a good time performance also had a good space and global performance. This confirms the validity of the choice of the stations selected for the time analysis and of the times selected for the space analysis.

The results of the statistical analysis indicate that a group of six to eight models were able to forecast in real time the cloud position and its horizontal extent. This holds for the period up to 48 hours; later the forecasts worsened.

On the other hand, when considering the concentration evolution at the various locations, even the best models could not always reproduce correctly the temporal trend and/or the measured values of those time series.




Space analysis

The space analysis was performed at 24, 36, 48 and 60 hours after the start of the release. For each time interval, all the sites were considered.

The 0.01 ng/m3 contour maps were evaluated, confirming what was shown on the faxed maps.

In addition to the contour maps, for each of the four times (24, 36, 48 and 60 hours after the release start) the following statistical indices were computed for each model:

In general, it was noted that models with poor spatial performance were more or less poor for all the indices. For each statistical parameter it was possible to find a number of models with good performance, but these models were not the same for all the parameters. Only five models performed well for all the indices considered in the spatial analysis, and all of them belong to the group of models that could qualitatively reproduce the cloud stretching along the direction transverse to the main transport direction.

The same indices were also computed for model predictions obtained using analysed meteorology, i.e. the subsequently observed rather than the forecast meteorological situation. On average these predictions improved, especially for the models that had performed badly with forecast meteorology. On the other hand, the models that performed best with forecast meteorology worsened when using analysed meteorology. This could be due to the different nature of forecast and analysis models: the former are based on physical principles, while the latter simply interpolate meteorological measurements. Also, forecast models generally use a finer grid.



More on: Space analysis of model predictions




Time analysis

The time analysis was conducted on the predictions at 11 sites forming approximately two arcs on the main cloud trajectory, and selected from those having a complete sequence of 20 observations. The first arc is placed at a distance of about 600 km from the source and includes stations NL5, B5, NL1 and D44. The second arc, including stations DK5, DK2, D42, D5, CR3, PL3 and H2, is at a distance of about 1200-1400 km from the source.

The 11 ground sampling stations selected for the time analysis of ETEX first release.

The first arc of stations was characterised by arrival times between 15 and 18 hours after the release start; the stations of the second arc had times of arrival equal to or beyond 30 hours after the release start. The transit duration of the tracer cloud was significantly longer at the stations of the second arc compared to those of the first arc. The concentration peak did not decrease passing from the first to the second arc stations.

Observations at the 11 sampling sites selected for the time analysis.

(Arrival time and time of peak are in hours after the release start.)

Station   Arrival time (h)   Duration (h)   Time of peak (h)   Peak conc. (ng/m3)

1st arc
NL5       18                 33             30                 1.99
B5        15                 21             27                 0.70
NL1       18                 24             33                 1.74
D44       15                 21             24                 0.68

2nd arc
DK5       36                 51             45                 2.01
DK2       39                 ≥ 51           48                 1.03
D42       30                 42             45                 2.02
D5        30                 27             42                 1.02
PL3       39                 ≥ 51           48                 0.72
CR3       30                 45             39                 0.57
H2        36                 51             42                 0.52

Only one model predicted no tracer concentration at any of the 11 stations, but at each station there were models that did not predict any tracer arrival. The stations where the largest number of models predicted a tracer episode were B5, NL1 and D44 (centre and south of the first arc) and DK2 and D42 (centre-north of the second arc). The station where the largest number of models failed was H2, at the extreme south of the second arc.

Number of models that did not predict any tracer concentration for each station

NL5   B5    NL1   D44   DK5   DK2   D42   D5    CR3   PL3   H2
11    7     5     7     10    7     6     10    13    13    18

For each of the selected stations the following quantities were computed for each model and compared with the corresponding measured quantities:

Time of arrival, duration and peak concentration were well predicted by almost all the models at the two stations NL1 (centre of the first arc) and D42 (centre of the second arc). The best four models predicted the time of arrival within 9 hours of the measured one at all 11 stations. The durations predicted by the three best-performing models differ from the measured ones by only –6 to +9 hours at almost all the stations.
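As an illustration of how arrival time, duration and peak could be extracted from a series of three-hour averages, here is a minimal Python sketch. It is not the procedure used in the evaluation. The function name `episode_summary`, the 0.01 ng/m3 threshold (borrowed from the contour level used in the space analysis), the labelling of each interval by its end time, and the definition of duration as the span between the first and last interval above the threshold are all assumptions. The two series are synthetic, not ETEX data.

```python
def episode_summary(series, threshold=0.01, step_h=3):
    """Summarise a tracer episode from consecutive fixed-length averages.

    series[k] is the average concentration (ng/m3) over the interval that
    ends (k + 1) * step_h hours after the release start (assumed labelling).
    Returns None when no value exceeds the threshold (no tracer arrival).
    """
    above = [k for k, c in enumerate(series) if c > threshold]
    if not above:
        return None
    first, last = above[0], above[-1]
    # Index of the highest value; ties resolve to the earliest interval.
    peak_k = max(range(len(series)), key=lambda k: series[k])
    return {
        "arrival_h": (first + 1) * step_h,
        "duration_h": (last - first + 1) * step_h,
        "peak_time_h": (peak_k + 1) * step_h,
        "peak_conc": series[peak_k],
    }


# Synthetic 20-value series (hypothetical numbers, for illustration only).
measured = [0, 0, 0, 0, 0, 0.3, 1.2, 0.8, 0.1, 0] + [0] * 10
predicted = [0, 0, 0, 0, 0, 0, 0.5, 0.9, 0.4, 0.2, 0.05] + [0] * 9

obs = episode_summary(measured)
mod = episode_summary(predicted)

# Differences of the kind compared in the text (prediction minus measurement).
arrival_error_h = mod["arrival_h"] - obs["arrival_h"]
duration_error_h = mod["duration_h"] - obs["duration_h"]
print(obs, mod, arrival_error_h, duration_error_h)
```

A model that predicts no tracer at a station yields `None`, which corresponds to the cases counted in the table of models that did not predict any tracer concentration.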

The following statistical indices were also calculated using the 20 observation-prediction pairs for each of the 11 selected stations:



More on: The analysis of model predictions

More on: FMT at two of the 11 stations

 

Global analysis

The global analysis considered the observed-predicted concentration pairs at all times in all the stations.

The results of the global analysis were described in terms of FA2, FA5, FOEX, bias, NMSE and Pearson correlation coefficient.

Bias and FOEX are both used to evaluate the overprediction or underprediction of a model. The difference between the two indices is that FOEX counts the number of over- or underpredictions, while bias quantifies their amount. Most of the models had a negative FOEX, indicating that most of the values were underpredicted. At the same time, almost all the models had a positive bias, which means that summing all the overpredictions and subtracting all the underpredictions gives a positive result. Taken together, this means that a few values were overpredicted by a large amount.
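The combination of a negative FOEX with a positive bias can be reproduced with a small example. The sketch below assumes the commonly used definitions, bias as the mean of prediction minus measurement and FOEX as the percentage of overpredicted pairs shifted by 50 so that it ranges from -50% to +50%. The text above does not spell out the formulas, so these are assumptions, and the data are synthetic.

```python
def bias(measured, predicted):
    """Mean of (prediction - measurement); positive means net overprediction."""
    n = len(measured)
    return sum(p - m for m, p in zip(measured, predicted)) / n


def foex(measured, predicted):
    """Factor of exceedance in percent (assumed definition).

    Share of pairs with prediction > measurement, shifted so that
    0 means as many over- as underpredictions; range is -50 to +50.
    """
    n = len(measured)
    n_over = sum(1 for m, p in zip(measured, predicted) if p > m)
    return (n_over / n - 0.5) * 100.0


# Synthetic pairs: four mild underpredictions and one large overprediction.
measured = [1.0, 1.0, 1.0, 1.0, 1.0]
predicted = [0.5, 0.5, 0.5, 0.5, 6.0]

b = bias(measured, predicted)
f = foex(measured, predicted)
print(b, f)  # positive bias together with negative FOEX
```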

FA2 and FA5 give the percentage of predictions within a factor of 2 or 5, respectively, of the measurements. Only four models had an FA5 greater than 40%.
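A minimal sketch of such a factor-of-alpha score follows. The function name `fa` and the treatment of pairs containing a zero value (counted here as misses) are assumptions, since the handling of zeros is not described above. The data are synthetic.

```python
def fa(measured, predicted, factor):
    """Percentage of pairs whose ratio predicted/measured lies in
    [1/factor, factor].

    Assumption: a pair with a zero (or negative) value on either side is
    counted as outside the factor, because the ratio is undefined or zero.
    """
    hits = 0
    for m, p in zip(measured, predicted):
        if m > 0 and p > 0 and 1.0 / factor <= p / m <= factor:
            hits += 1
    return 100.0 * hits / len(measured)


# Synthetic pairs (hypothetical numbers, for illustration only).
measured = [1.0, 1.0, 1.0, 1.0]
predicted = [1.5, 0.3, 4.0, 8.0]

fa2 = fa(measured, predicted, 2)
fa5 = fa(measured, predicted, 5)
print(fa2, fa5)
```

By construction FA5 can never be lower than FA2 for the same set of pairs, since every pair within a factor of 2 is also within a factor of 5.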

Only five models had a global NMSE lower than 10; the NMSE represents the absolute deviation of predictions from measurements. Two of these models also had an FA5 greater than 40%, confirming that the agreement holds for the majority of points. Five models had NMSEs greater than 100, two of them greater than 1000.
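For orientation, the sketch below computes the normalised mean square error in its usual form, the mean squared difference divided by the product of the mean measurement and the mean prediction. That this exact normalisation was used here is an assumption, and the numbers are synthetic. It illustrates how a single large overprediction can dominate the index.

```python
def nmse(measured, predicted):
    """Normalised mean square error (assumed standard definition):
    mean((P - M)^2) / (mean(M) * mean(P)).

    Undefined when either mean is zero; no guard is included in this sketch.
    """
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    mse = sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n
    return mse / (mean_m * mean_p)


# Synthetic pairs: three exact predictions and one large overprediction.
measured = [1.0, 1.0, 1.0, 1.0]
predicted = [1.0, 1.0, 1.0, 41.0]

score = nmse(measured, predicted)
perfect = nmse(measured, measured)
print(score, perfect)
```

Because the differences are squared, the sign of the error is lost: the NMSE says how far predictions are from measurements, not whether they are too high or too low.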

Three models had the global PCC greater than or equal to 0.5.
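The Pearson correlation coefficient over observation-prediction pairs can be written out as follows. This is the textbook formula in plain Python with synthetic data; the function name `pearson` is illustrative.

```python
import math


def pearson(measured, predicted):
    """Pearson correlation coefficient between two equally long series."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # Undefined (division by zero) if either series is constant.
    return cov / math.sqrt(var_m * var_p)


# Synthetic series (hypothetical numbers, for illustration only).
measured = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0, 4.0, 6.0, 8.0]    # systematic overprediction by a factor 2
reversed_ = [4.0, 3.0, 2.0, 1.0]  # opposite trend

r_doubled = pearson(measured, doubled)
r_reversed = pearson(measured, reversed_)
print(r_doubled, r_reversed)
```

The coefficient is insensitive to a constant scale factor, so a high correlation does not by itself imply a small bias; this is one reason it is read alongside the other indices.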

The predictions of models that used analysed meteorology improved in terms of FA5 (six models had an FA5 greater than 40%) but worsened in terms of NMSE: nine models had an NMSE greater than 100, including four with an NMSE greater than 1000. The other indices for predictions with analysed meteorology were almost unchanged with respect to predictions with forecast meteorology.

More on: Global analysis of model predictions