Global Data Assembling

 


Objectives of the global assembling and pre-processing

The objective of the global data assembling is to produce a final, complete data set with no format errors and no duplicates. The pre-processing, which is the following step, provides the analysis centre (GHER in Liège) with a data set containing only "Good" data for the computation of the climatology.

  Methodology

Each National Oceanographic Data Centre (NODC) or Designated National Agency (DNA) compiled and formatted its own data sets and sent them to the Regional Data Centre (RDC) for further quality control and regional assembling. Each RDC then sent its regional data set to the Global Assembling Centre (GAC). The GAC loaded each regional data set into the global database and looked for format errors and duplicates. The error list was sent to the RDC. Small or obvious corrections were made by the GAC; if the errors were too complicated or too numerous, the data set was returned to the RDC for correction. This cycle was repeated until no important errors were detected.
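The duplicate search described above can be illustrated with a minimal sketch. The criteria actually used by the GAC are not given here, so the rule below (same date and time, same position after rounding to two decimals) and the field names `id`, `lat`, `lon` and `datetime` are assumptions made for illustration only.

```python
from collections import defaultdict

def find_duplicate_stations(stations, decimals=2):
    """Group station ids that share a rounded position and a date/time.

    Assumed rule, not the GAC's documented one: two stations are suspected
    duplicates when latitude and longitude agree after rounding to
    `decimals` places and the date/time strings are identical.
    """
    groups = defaultdict(list)
    for st in stations:
        key = (round(st["lat"], decimals), round(st["lon"], decimals), st["datetime"])
        groups[key].append(st["id"])
    # Keep only the groups that contain more than one station.
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical stations coming from two regional data sets.
stations = [
    {"id": "A1", "lat": 43.1234, "lon": 5.2001, "datetime": "1995-06-12T08:30"},
    {"id": "B7", "lat": 43.1241, "lon": 5.1999, "datetime": "1995-06-12T08:30"},
    {"id": "C3", "lat": 36.5000, "lon": 14.250, "datetime": "1995-06-12T08:30"},
]
duplicates = find_duplicate_stations(stations)
print(duplicates)
```

Rounding is a crude way to compare positions (two close stations can fall on either side of a rounding boundary), so a real check would more likely use distance and time tolerances.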

Once all the regional data sets had been merged into the global database (that is, once no more format errors or duplicates were detected), the GAC performed the pre-processing of the data, which consists of:

Extraction of the good data from the global data set: latitude, longitude and date flags equal to 1 (good) or 5 (modified during QC), and data flags equal to 1 (good) or 2 (out of the climatological statistics)
Interpolation to the 25 standard levels defined for MEDAR/MEDATLAS-2 (IAPSO list + 3 levels in bold characters in the list below):

0, 5, 10, 20, 30, 50, 75, 100, 125, 150, 200, 250, 300, 400, 500, 600, 800, 1000, 1200, 1500, 2000, 2500, 3000, 3500, 4000
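The two pre-processing steps can be sketched as follows. This is not the SELMEDAR code: the interpolation scheme used by SELMEDAR is not described here, so plain linear interpolation in depth is an assumption, as are the function names. The flag values and the 25 standard levels are taken from the text above.

```python
import math
from bisect import bisect_left

STANDARD_LEVELS = [0, 5, 10, 20, 30, 50, 75, 100, 125, 150, 200, 250, 300,
                   400, 500, 600, 800, 1000, 1200, 1500, 2000, 2500, 3000,
                   3500, 4000]
GOOD_HEADER_FLAGS = {1, 5}  # good, or modified during QC
GOOD_DATA_FLAGS = {1, 2}    # good, or out of the climatological statistics

def header_is_good(lat_flag, lon_flag, date_flag):
    """A station is kept only if position and date all carry flag 1 or 5."""
    return all(f in GOOD_HEADER_FLAGS for f in (lat_flag, lon_flag, date_flag))

def interpolate_to_standard_levels(depths, values, flags):
    """Keep data flagged 1 or 2, then interpolate linearly (assumed scheme).

    Standard levels outside the observed depth range are returned as NaN:
    nothing is extrapolated.
    """
    good = sorted((d, v) for d, v, f in zip(depths, values, flags)
                  if f in GOOD_DATA_FLAGS)
    if len(good) < 2:
        return [math.nan] * len(STANDARD_LEVELS)
    ds = [d for d, _ in good]
    vs = [v for _, v in good]
    out = []
    for z in STANDARD_LEVELS:
        if z < ds[0] or z > ds[-1]:
            out.append(math.nan)
            continue
        i = bisect_left(ds, z)  # first observed depth >= z
        if ds[i] == z:
            out.append(vs[i])
        else:
            w = (z - ds[i - 1]) / (ds[i] - ds[i - 1])
            out.append(vs[i - 1] + w * (vs[i] - vs[i - 1]))
    return out

# Hypothetical temperature profile; the value at 20 m is flagged 4 (wrong).
profile = interpolate_to_standard_levels(
    depths=[0, 10, 20, 50], values=[20.0, 19.0, 99.0, 15.0], flags=[1, 1, 4, 2])
print(profile[:7])
```

In this example the wrong value at 20 m is discarded before interpolation, so the 20 m standard level is rebuilt from the 10 m and 50 m observations.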

The data were extracted and interpolated parameter by parameter using the SELMEDAR software developed at IFREMER, and the final data set was sent to the Analysis Centre for the climatology computation.

 

 

Data circulation between the Regional Data Centres (RDC), the Global Assembling Centre (GAC) and the Analysis Centre (AC).

Statistics on the Global data set

Quality checks were performed on all the profiles by the Regional Data Centres. As a result, each numerical value has a quality flag (GTSPP flag scale).

The numbers of quality flags assigned to the header information of each station, qualifying the date, the latitude, the longitude and the bottom depth, are:

 

 

 

                 Flag 0   Flag 1   Flag 2             Flag 3     Flag 4   Flag 5             Flag 9
                 No QC    Good     Out of Statistics  Doubtful   Wrong    Changed during QC  Default value
DATE             -        280772   -                  15         23       5069               -
LATITUDE         -        285583   -                  91         33       172                -
LONGITUDE        -        285540   -                  87         35       217                -
BOTTOM DEPTH     21353    75682    237                3270       800      48                 184489

Most of the corrections to the header information were made to the date and time of the station. Some wrong station locations were modified too, and only a few stations remain with a wrong date and/or location. These modifications were made only for obvious errors, and the original value is kept in the “DM HISTORY” field of the MEDATLAS files.

Bottom depth is often missing: 65% of the stations have no information about the bottom depth. When given, this information is sometimes doubtful (1.2%) or wrong (0.3%), and 7.6% of the stations have not been quality checked for bottom depth.
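As a quick consistency check, the header flag counts listed above can be summed per row and turned into percentages. The sketch below only re-uses those counts; the variable names are illustrative. Every row adds up to the same number of stations, and the bottom depth shares come out close to the rounded figures quoted in the text.

```python
# Header flag counts copied from the table above (flag -> number of stations).
header_flags = {
    "DATE":         {1: 280772, 3: 15, 4: 23, 5: 5069},
    "LATITUDE":     {1: 285583, 3: 91, 4: 33, 5: 172},
    "LONGITUDE":    {1: 285540, 3: 87, 4: 35, 5: 217},
    "BOTTOM DEPTH": {0: 21353, 1: 75682, 2: 237, 3: 3270, 4: 800, 5: 48, 9: 184489},
}

# Each row should describe the same set of stations.
station_totals = {name: sum(c.values()) for name, c in header_flags.items()}
n_stations = station_totals["BOTTOM DEPTH"]

# Share of each bottom depth flag, in percent of all stations.
bottom_depth_share = {
    flag: 100.0 * n / n_stations
    for flag, n in header_flags["BOTTOM DEPTH"].items()
}

print(station_totals)
for flag, pct in sorted(bottom_depth_share.items()):
    print(f"Flag {flag}: {pct:.1f}%")
```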

Statistics on the quality flags of each measurement were computed parameter by parameter; the results are presented in the table and figure below.

These show that the number of data points for temperature and salinity is quite high: more than 17 million for temperature and about 11.4 million for salinity. The quality of these data is rather good: more than 99% of the data points other than default values are flagged as “good” (QC flag = 1).

For phosphate, nitrate and silicate the quality is still good, but there are fewer data. Oxygen measurements contain more doubtful or wrong data: more than 23% of the 1 million data points. This is due to some CTD casts for which all the data points are considered to be of bad quality because the oxygen sensor was not calibrated.

For the other parameters (nitrite, ammonium, alkalinity, pH, chlorophyll, total nitrogen, total phosphorus) the quality is also good, and for hydrogen sulphide 100% of the data points are flagged as good (QC flag = 1).

Flag 0 = No QC, Flag 1 = Good, Flag 2 = Out of Statistics, Flag 3 = Doubtful, Flag 4 = Wrong, Flag 9 = Default Value. Nb = number of data points, % = share of the total number of data points for the parameter.

PARAMETER NAME                   Flag 0         Flag 1         Flag 2         Flag 3         Flag 4         Flag 9     Total
                               Nb     %       Nb     %       Nb     %       Nb     %       Nb     %       Nb     %        Nb
SEA TEMPERATURE                 0   0.0 16904033  98.3    28339   0.2    18876   0.1    23989   0.1   220253   1.3  17195490
PRACTICAL SALINITY              0   0.0 11144172  97.7    17128   0.2     3144   0.0    21104   0.2   217035   1.9  11402583
DISSOLVED OXYGEN             2593   0.2   757638  61.9    39861   3.3   166851  13.6    78694   6.4   178611  14.6   1224248
NITRATE (NO3-N)                 0   0.0    77858  93.1     2194   2.6     2437   2.9     1109   1.3       56   0.1     83654
NITRITE (NO2-N)                 0   0.0    74704  98.4        0   0.0      211   0.3      829   1.1      164   0.2     75908
AMMONIUM                        0   0.0    30538  98.4        0   0.0      376   1.2       89   0.3       28   0.1     31031
SILICATE                        0   0.0   120163  97.2     1383   1.1      373   0.3     1576   1.3      177   0.1    123672
PHOSPHATE                       0   0.0   147977  96.0     1625   1.1     2375   1.5     1937   1.3      222   0.1    154136
ALKALINITY                      0   0.0    14357  97.7        0   0.0       72   0.5      250   1.7       22   0.1     14701
PH                           3834   2.5   144962  96.2        0   0.0      469   0.3     1215   0.8      285   0.2    150765
CHLOROPHYLL-A TOTAL             0   0.0    54360  93.2        4   0.0     3908   6.7       29   0.0        7   0.0     58308
HYDROGEN SULPHIDE (H2S)         0   0.0    11889 100.0        0   0.0        0   0.0        0   0.0        0   0.0     11889
TOTAL NITROGEN                  0   0.0      496  92.7        0   0.0        6   1.1       31   5.8        2   0.4       535
TOTAL PHOSPHORUS                0   0.0    11040  98.3        0   0.0        1   0.0      182   1.6        7   0.1     11230

 Quality flags on the data points
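The shares quoted in the text ("more than 99%" good for temperature, "more than 23%" doubtful or wrong for oxygen) differ from the percentage columns of the table, which are relative to all data points including default values. A minimal sketch, assuming the text excludes default values (flag 9) from the denominator, reproduces the quoted figures from the temperature and oxygen rows of the table. This reading is an assumption, and the function name is illustrative.

```python
# Counts per flag copied from the table of quality flags on the data points.
counts = {
    "SEA TEMPERATURE":  {0: 0, 1: 16904033, 2: 28339, 3: 18876, 4: 23989, 9: 220253},
    "DISSOLVED OXYGEN": {0: 2593, 1: 757638, 2: 39861, 3: 166851, 4: 78694, 9: 178611},
}

def share_of_measured(flag_counts, flags):
    """Percentage of the non-default data points carrying one of `flags`.

    Assumption: default values (flag 9) are left out of the denominator.
    """
    measured = sum(n for f, n in flag_counts.items() if f != 9)
    return 100.0 * sum(flag_counts[f] for f in flags) / measured

good_temperature = share_of_measured(counts["SEA TEMPERATURE"], [1])
suspect_oxygen = share_of_measured(counts["DISSOLVED OXYGEN"], [3, 4])

print(f"Temperature flagged good: {good_temperature:.1f}%")
print(f"Oxygen flagged doubtful or wrong: {suspect_oxygen:.1f}%")
```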

 

  Percentage of quality flags per parameter. (Flag 0 = No QC, Flag 1 = Good, Flag 2 = Out of statistics, Flag 3 = Doubtful, Flag 4 = Wrong)

 
