Objectives of the global assembling and pre-processing
The objective of the global data assembling is to produce a final, complete data set with no format errors and no duplicates. The pre-processing, which is the following step, provides a data set containing only "Good" data to the analysis centre (GHER in Liège) for the computation of the climatology.
Each National Data Centre (NODC) or Designated National Agency (DNA) compiled and formatted its own data sets and sent them to the Regional Data Centre (RDC) for further quality control and regional assembling. Each RDC then sent its regional data set to the Global Assembling Centre (GAC). The GAC loaded each regional data set into the global database and looked for format errors and duplicates. The error list was sent to the RDC; either the GAC made the corrections itself when they were small or obvious, or the data set was returned to the RDC for correction when the errors were too complicated or too numerous. This cycle was repeated until no important errors were detected.
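The duplicate search mentioned above can be illustrated with a minimal sketch. The text does not state the criteria the GAC actually used, so everything here is an assumption for illustration: the `Station` record, the `find_duplicates` helper, the sample station references, and the tolerances (60 minutes, 0.01 degree) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Station:
    ref: str        # hypothetical station reference
    time: datetime  # date and time of the station
    lat: float      # latitude in decimal degrees
    lon: float      # longitude in decimal degrees


def find_duplicates(stations, max_dt=timedelta(minutes=60), max_deg=0.01):
    """Return pairs of station refs that are close in both time and position.

    The tolerances are illustrative assumptions, not the GAC's real criteria.
    """
    ordered = sorted(stations, key=lambda s: s.time)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.time - a.time > max_dt:
                break  # sorted by time, so all later stations are too far away
            if abs(a.lat - b.lat) <= max_deg and abs(a.lon - b.lon) <= max_deg:
                pairs.append((a.ref, b.ref))
    return pairs


# Made-up stations: the first two are candidates for the same cast.
stations = [
    Station("FR-001", datetime(1995, 6, 1, 10, 0), 43.10, 5.20),
    Station("IT-417", datetime(1995, 6, 1, 10, 5), 43.10, 5.20),
    Station("GR-090", datetime(1995, 6, 1, 10, 5), 36.00, 25.00),
]
duplicate_pairs = find_duplicates(stations)
print(duplicate_pairs)
```

A check of this kind only produces candidates. Deciding which copy to keep, or whether two nearby casts are genuinely distinct, would still need the kind of exchange between the GAC and the RDC described above.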
Once all the regional data sets had been merged into the global database (meaning that no format errors or duplicates were detected any more), the GAC performed the pre-processing of the data, which consists of:
The data were extracted and interpolated parameter by parameter using the SELMEDAR software developed at IFREMER, and the final data set was sent to the Analysis Centre for the climatology computation.
Data circulation between the Regional Data Centre (RDC), the Global Assembling Centre (GAC) and the Analysis Centre (AC).
Statistics on the Global data set
Quality checks were performed on all the profiles by the Regional
Data Centres. As a result, each numerical value has a quality flag
(GTSPP flag scale).
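Because every value carries its own flag, selecting only the "Good" data for the climatology reduces to a filter on that flag. The following is a minimal sketch, not the SELMEDAR implementation: the `keep_good` helper and the sample values are hypothetical, and the only flag meaning relied on is the one stated in this document (QC flag = 1 means good).

```python
GOOD = 1  # per this document, QC flag = 1 marks a good value


def keep_good(values, flags, good_flag=GOOD):
    """Return the values whose quality flag equals good_flag.

    values and flags are parallel sequences: one flag per numerical value.
    """
    if len(values) != len(flags):
        raise ValueError("each value needs exactly one quality flag")
    return [v for v, f in zip(values, flags) if f == good_flag]


# Made-up temperature profile; the third value carries a non-good flag.
temperature = [13.21, 13.18, 99.99, 13.02]
temperature_flags = [1, 1, 4, 1]

good_temperature = keep_good(temperature, temperature_flags)
print(good_temperature)  # [13.21, 13.18, 13.02]
```

In this sketch any flag other than 1 is dropped. Whether intermediate flag values should also be kept is a choice the sketch does not attempt to settle.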
The number of quality flags assigned to header information of each
station to qualify the date, the latitude, the longitude and the bottom depth is
Most of the corrections to the header information were made to the
Date and Time of the station. Some wrong station locations were corrected
too, and only a few stations remain with a wrong Date and/or Location. These
modifications were made only for obvious errors, and the original
value is kept in the DM HISTORY field of the MEDATLAS files.
Bottom depth is often missing: 65% of the stations have no information about the bottom depth. When given, this information is sometimes doubtful (1.2%) or wrong (0.3%), and 7.6% of the stations have not been quality checked for bottom depth.
Statistics on the quality flags of each measurement were computed parameter
by parameter, and the results are presented in the next table and figure.
This shows that for temperature and salinity, the number of data
points is quite high: more than 17 million data points for temperature
and 11.5 million data points for salinity. The quality of these data
is rather good, as more than 99% of the data points have been flagged as
good (QC flag = 1).
For Phosphate, Nitrate and Silicate the quality is still good, but
there are fewer data. For oxygen measurements, more doubtful or wrong data
are found: more than 23% of the 1 million data points. This is due to some
CTD casts for which all the data points are considered to be of bad quality
because the oxygen sensor was not calibrated.
For the other parameters (Nitrite, Ammonium, Alkalinity,
pH, Chlorophyll, Total Nitrogen, Total Phosphorus), the quality is also
good. Hydrogen Sulphide is a special case: 100% of its data points are
flagged as good (QC flag = 1).
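Per-parameter statistics of this kind can be obtained by counting the flags attached to each measurement. The sketch below shows one way to do it. The `flag_percentages` helper, the parameter labels `TEMP` and `DOXY`, and the flag lists are made up for illustration and do not reproduce the figures quoted above.

```python
from collections import Counter


def flag_percentages(flags):
    """Return {flag: percentage of data points carrying that flag}."""
    counts = Counter(flags)
    total = sum(counts.values())
    if total == 0:
        return {}  # no data points for this parameter
    return {flag: 100.0 * n / total for flag, n in sorted(counts.items())}


# Made-up flags, one list per parameter (labels are hypothetical).
flags_by_parameter = {
    "TEMP": [1, 1, 1, 4],
    "DOXY": [1, 3, 4, 1],
}

stats = {name: flag_percentages(flags)
         for name, flags in flags_by_parameter.items()}
for name, percentages in stats.items():
    print(name, percentages)
```

With the made-up input, `TEMP` gives 75% of points at flag 1 and `DOXY` gives 50%. The same tally applied to the header flags would yield percentages like those quoted for bottom depth.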