The Clean Seas project has made significant use of data from six different instruments over the past three years. Each instrument dataset has been delivered to partners of the project by a different route and in a different format. Some data, such as those from AVHRR, have been received directly from the satellite by a ground station operated by a partner to the project - in this case the Stockholm University Receiving Station. SeaWiFS data have also been available to some extent via this route, while the majority, plus the SAR data used by the project, have been obtained indirectly from a third-party receiving station. ATSR, MOS and Landsat data have been obtained from archives on CD-ROM or exabyte, in some cases many months after the data were acquired by the satellite. Each supplying agency has used a different format and, while the partner receiving the data was generally well equipped to deal with that format, the majority of the project team did not have easy access to the relevant reading software for each data type.
Figure 3-1. Frame numbering scheme for SAR data in the three test sites. The rectangles denote the location of the ERS-2 SAR frames which were regularly acquired and analysed.
An early priority for the project was agreement on a common data standard that would meet the requirements of the project and allow straightforward exchange of data, so that one of the major elements of the project could be undertaken: comparison of data from different classes of sensor. The requirements for the data standard were as follows:
It was agreed at the first Clean Seas meeting in Southampton in January 1997 that the Hierarchical Data Format (HDF) would be used for all data received by the project and distributed to other project participants. A standard naming convention was adopted based on the source and content of the file and was agreed to be of the following form:
S.INS.yymmdd.XX.V.hdf
where S is the site (L: Gulf of Lion; N: North Sea; B: Baltic Sea; K: Kattegat, which lies in both the North Sea and Baltic regions). INS specifies the instrument used, for example, SAR (Synthetic Aperture Radar) or GBT (ATSR derived gridded brightness temperature), etc. In order to specify the date in such a way as to allow straightforward chronological sorting, a six digit date using zeros to fill blank spaces was adopted as yymmdd, e.g. 970715 is 15 July 1997. XX is the frame location according to the acquisition plan, which is instrument specific. As an example, the SAR frame numbering scheme is illustrated in Figure 3-1. Finally, V is the version number, required to track changes to the processing or correction of data as algorithms improve. The extension .hdf was also included to allow PCs to identify the file type.
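The naming convention lends itself to mechanical parsing. The following Python sketch is illustrative only and is not part of the project software: the function name parse_name, the regular expression, the sample frame and version values, and the rule used to expand two-digit years are all assumptions made for this example.

```python
import re
from datetime import date
from typing import NamedTuple

# Site codes as listed in the naming convention.
SITES = {"L": "Gulf of Lion", "N": "North Sea", "B": "Baltic Sea", "K": "Kattegat"}


class CleanSeasName(NamedTuple):
    site: str        # one-letter site code
    instrument: str  # e.g. "SAR" or "GBT"
    acquired: date   # acquisition date decoded from yymmdd
    frame: str       # instrument-specific frame location
    version: str     # processing version


# Assumed pattern: fields separated by dots, six-digit date, ".hdf" extension.
_PATTERN = re.compile(
    r"^(?P<site>[LNBK])\.(?P<ins>[A-Za-z0-9]+)\.(?P<ymd>\d{6})"
    r"\.(?P<frame>[^.]+)\.(?P<ver>[^.]+)\.hdf$"
)


def parse_name(filename: str) -> CleanSeasName:
    """Split a file name of the form S.INS.yymmdd.XX.V.hdf into its fields."""
    m = _PATTERN.match(filename)
    if m is None:
        raise ValueError(f"not a recognised file name: {filename!r}")
    ymd = m["ymd"]
    yy, mm, dd = int(ymd[:2]), int(ymd[2:4]), int(ymd[4:6])
    # Assumption: two-digit years of 70 or more belong to the 1900s.
    year = 1900 + yy if yy >= 70 else 2000 + yy
    return CleanSeasName(m["site"], m["ins"], date(year, mm, dd), m["frame"], m["ver"])


if __name__ == "__main__":
    # Hypothetical file name: frame "07" and version "1" are invented.
    info = parse_name("B.SAR.970715.07.1.hdf")
    print(SITES[info.site], info.instrument, info.acquired.isoformat())
```

Because the date field is zero-padded and ordered year, month, day, a plain alphabetical sort of file names for one site and instrument is also a chronological sort within a single century.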
Irrespective of the original source of the data and independent of its type or processing level, it was a priority that the information contained in the data be made available quickly and easily. Early work in the project was strongly oriented towards this objective. A simple strategy was therefore adopted which required that whether data were received directly from the satellite, or partially processed on CD-ROM or exabyte, all data would be converted to the agreed HDF format and a browse copy at low resolution made available on dedicated world wide web sites.
Each partner undertook to set up a two level web site containing public information and a private password protected area. The public area would contain general information about the role of the partner in the project as well as relevant links, but from the point of view of the work of the project it was the private area that was essential. Within these areas, the browse images were available to all participants along with on-line access to the HDF data files for immediate download or conventional delivery by CD-ROM.
Simply putting images onto a web site was not enough to ensure the free exchange of information, and it was therefore agreed that each scene received would be inspected visually upon receipt and then annotated with a features report. This report would briefly identify, in plain language, any potentially interesting information contained in the images. Much of the data was received on a semi-operational timescale, which meant that in many cases a simple text description was available within a few hours or a few days of the data acquisition, making it straightforward to monitor the steady stream of new observations. The on-line text reports made possible the rapid identification of coincidences of features between different data streams, allowing case study possibilities to be identified.
Although the file sizes of some of the data were too large to be maintained long term on-line, the text reports were held in on-line archives. This permitted subsequent review of the contents of images in a matter of minutes, without the need for each partner to download and examine each individual image. In particular, this avoided problems due to lack of familiarity with the peculiarities of particular datasets. When an interesting feature was identified in a text report it was possible immediately to examine the browse image to establish the precise location of the feature as well as other characteristics such as its shape and size. This strategy ensured that the initial assessment of data was undertaken by somebody experienced in the interpretation of observed features as well as the risks presented by the false positives which might be identified by less experienced observers. This was particularly true of the SAR data.
Figure 3-2. Standard data projections for each test site.
Once identified, common features allowed ad-hoc working teams to be formed to study the available data and decide if there was sufficient information on which to base a case study of the time period. The initial text reports provided the basis for selection of location and time-scales for searches of other datasets which were not routinely collected by the project, such as those from Landsat or WiFS, as well as providing the constraints on the modelling activities that would be required to support the investigations.
The objectives of the Clean Seas project were broad and dealt with all possible pollution related signatures which may be seen from earth observing instruments. Combined with the thousands of images collected during the project, the strategy of simple eye-ball assessment of each image and brief text reporting to a web site has proven to be one of the strengths of the project. Although it required a substantial initial investment of time and effort to produce interpretations of scenes which were not looked at again by any members of the project team, it did identify sequences of images from single sensors as well as coincidences of features between different sensors. The net result has been to allow many of the studies in Clean Seas to make confident choices about which potential case studies might be worth pursuing.
Within each of the three Clean Seas test sites the geographical projection of results was also in a standard form. This was particularly important when multiple scenes were to be combined to form climatological datasets, for example, using the colour and temperature data. Synthetic Aperture Radar data were less well suited to this technique due to the greater variability in the signal strength, which was not necessarily a reliable indicator of a pollution type. Boundaries were agreed by each regional team, although they were subject to occasional revision as a greater understanding developed of the problems and characteristics of the areas under investigation. The boundaries provided the link between the raw data acquired and the modelling which would support some of the analyses. The regions to be used during the preparation of derived datasets such as mean monthly pigment concentrations are illustrated in Figure 3-2.
The modelling activity areas covered a subset of the data collection areas and will be described in more detail in section 4.3.
Clean Seas is based on the expert analysis and interpretation of a broad range of marine remote sensing data. To provide access to the information contained in the images acquired, simple text annotations were produced for each image.
AVHRR data from two of the NOAA satellites (NOAA-12 & NOAA-14) have been received and analysed during the first two project years. The procedure for AVHRR handling and annotation was already in place at Stockholm University prior to the Clean Seas project, but was improved and adapted to the requirements of the project. The AVHRR data flow has been high, occasionally up to 10 satellite passes per day, totalling 1 GB of raw data. A standard software package, Terascan, has been used for all processing, e.g. geometric navigation, sea surface temperature and albedo calculations. For the processing of the three sites, the geometric boundaries were pre-defined and covered the SAR frames fully. All of the satellite passes within any of these boundaries were archived and catalogued both in the original TDF (Terascan Data Format) and in the agreed HDF format.
Figure 3-3. AVHRR channel 1 image from August 9th 1997
All passes received within the defined areas were first quality checked, i.e. the percentage coverage of the area, any missing lines, and any position of the area at the edge of the swath were indicated. For the Baltic Sea area, four passes per day were processed and interpreted for the occurrence of algal blooms, suspended sediments and various temperature anomalies, both near the coast and in the open sea. For the North Sea and the Gulf of Lion areas, one pass per day per area was interpreted. All detected features were described in text and often marked in a subset image and then distributed from the local web server. An example of the extensive bloom during 1997 can be seen in Figure 3-3. The image annotations have not been transferred into the HDF files, but are available on the web server. During the intensive bloom periods in July-August, parallel interpretations of the same AVHRR image by two different operators were performed. This ad-hoc quality evaluation indicates that the differences introduced by the two operators are only minor. When a difference occurs it is most often limited to the exact location of the outer (visible) limit of the bloom (surface accumulation of the cyanobacteria).
On several occasions when features were detected, e.g. in some SAR data, a more detailed analysis of the corresponding AVHRR images was performed. Such features could be suspended sediments or algal blooms, but also temperature fronts, e.g. in the Pomeranian Bight. At a late stage of the project, a re-interpretation of two scenes per day, one night and one day, was conducted. This ensured a consistent interpretation of all evaluated images and features. For most features their presence was noted together with central position, relative size and dimensionality (point, line or area feature). In the North Sea and Gulf of Lion areas, the interpretations of features have been less detailed. General types of features in those areas, such as temperature fronts and plumes, have been noted. Area specific features were left to each regional team member to analyse.
Figure 3-4. The ATSR dual view of operation.
The ATSR is a four channel infrared radiometer designed to measure sea surface temperature (SST) to an accuracy of better than ±0.5 K for a single 1 km² pixel. The operational instrument during the Clean Seas project was an advanced version, ATSR-2, flown on ERS-2. The ATSR instrument has three principal improvements over other operational infrared radiometers:
The ATSR-2 has three channels in the visible part of the spectrum (560, 670 and 870 nm), two in the near infrared (1.6 and 3.7 µm) and two in the thermal infrared (11.0 and 12.0 µm). The two thermal infrared channels are used to determine the SST, as they correspond to atmospheric windows (wavelengths where atmospheric absorption is low) close to the peak emittance of the sea surface. The slight difference in atmospheric absorption between these two channels, combined with the two look angles, is used to correct the SST value for atmospheric effects. The 1.6 µm channel is used to detect clouds in daytime images, whilst the 3.7 µm channel is used to detect clouds, and as an additional atmospheric correction channel, in night-time images. This latter channel is not only in an atmospheric window, but is also relatively insensitive to the radiation emitted by the sea. The 3.7 µm channel is particularly sensitive to the presence of water vapour. In general, the ATSR data for night-time images will be more accurate than for daytime images because of the presence of the 3.7 µm channel for additional atmospheric correction.
The absolute accuracy of the ATSR instruments has been determined using a range of in situ calibration techniques, and is thought to be approximately ±0.3 K (Barton et al., 1995; Saunders, 1995).
The ATSR data are received at SOC from the National Remote Sensing Centre Ltd. The data are received on exabyte tape, in SADIST format, a format specific to ATSR data. The data are provided as two separate products:
The data are reformatted using IDL, to produce the standard HDF format files for distribution. Only channels and images where there is greater than 3% good data (not land, cloud or otherwise absent data) are included in the HDF files.
Using data provided during the reformatting procedure, a text file is produced automatically, giving the percentage cloud cover, percentage land cover and latitude and longitude limits of the data included in the file.
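The 3% good-data test and the automatically generated summary can be sketched as follows. This is an illustrative Python sketch, not the project's IDL code: the function name summarise_scene, the use of boolean masks, the dictionary keys and the choice to express every percentage relative to the full image are assumptions made for this example.

```python
import numpy as np

GOOD_DATA_THRESHOLD = 3.0  # percent of good data required, as stated in the text


def summarise_scene(land, cloud, missing, lat, lon):
    """Summarise one image from boolean masks and per-pixel coordinates.

    land, cloud and missing are boolean arrays of identical shape that flag
    unusable pixels; lat and lon hold the coordinates of every pixel.
    """
    bad = land | cloud | missing
    n_pixels = bad.size
    good_pct = 100.0 * np.count_nonzero(~bad) / n_pixels
    return {
        "cloud_pct": 100.0 * np.count_nonzero(cloud) / n_pixels,
        "land_pct": 100.0 * np.count_nonzero(land) / n_pixels,
        "good_pct": good_pct,
        "lat_min": float(np.min(lat)),
        "lat_max": float(np.max(lat)),
        "lon_min": float(np.min(lon)),
        "lon_max": float(np.max(lon)),
        # Only images with more than 3% good data go into the HDF file.
        "include": bool(good_pct > GOOD_DATA_THRESHOLD),
    }


def format_report(summary):
    """Render the summary as a short plain-text report (layout is invented)."""
    return (
        f"cloud cover: {summary['cloud_pct']:.1f}%\n"
        f"land cover: {summary['land_pct']:.1f}%\n"
        f"latitude: {summary['lat_min']:.2f} to {summary['lat_max']:.2f}\n"
        f"longitude: {summary['lon_min']:.2f} to {summary['lon_max']:.2f}\n"
    )
```

Keeping the threshold in a single named constant means the same value can be reused wherever the "good image" criterion is applied.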
For each image considered "good", a GIF image is produced for inclusion on the web server. The channel used for this image is selected automatically during processing. For the GSST data, the dual view is preferred and is used if it includes more than 3% good data. For the GBT data, a thermal channel is used, nadir view by preference.
These images are all examined manually for interesting features, such as a visible Rhine outflow or eddies in the Baltic or Gulf of Lion. All these features are added to the text file for each image.
The final HDF data files were written to CD-ROM for distribution to the Clean Seas partners. Each CD-ROM contained several months of HDF data for a particular region. Included on each CD-ROM was the complete set of quick-looks for that region, together with HTML files which included all the text file information for the quick-looks.
Oil films floating on the sea surface dampen the small-scale surface waves. Since these waves are responsible for the radar backscattering at oblique incidence angles (between 20° and 75°; so-called Bragg scattering), oil films are visible on SAR images as areas of reduced backscattered radar power (see, e.g., Alpers and Hühnerfuss, 1988; Gade et al., 1998a; and literature cited therein). Apart from definite oil pollution (i.e., dark patches in the SAR images which can clearly be related to oil spills) the images show a large variety of radar signatures caused by oceanic and atmospheric phenomena. Some of these look similar to radar signatures of oil spills and therefore are often called "look-alikes" (Espedal et al., 1995).
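To give a sense of scale for the surface waves involved, the first-order Bragg resonance condition relates the resonant water wavelength to the radar wavelength and incidence angle: lambda_B = lambda_radar / (2 sin(theta)). The Python sketch below evaluates it. The radar frequency (5.3 GHz, C-band) and the mid-swath incidence angle (23°) used in the example are nominal ERS SAR values that do not come from this report, and the function name is invented for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def bragg_wavelength(radar_frequency_hz: float, incidence_deg: float) -> float:
    """Return the Bragg-resonant water wavelength in metres.

    First-order Bragg condition: lambda_B = lambda_radar / (2 * sin(theta)).
    """
    radar_wavelength = SPEED_OF_LIGHT / radar_frequency_hz
    return radar_wavelength / (2.0 * math.sin(math.radians(incidence_deg)))


if __name__ == "__main__":
    # Assumed nominal ERS SAR parameters (not taken from the report).
    for angle in (20.0, 23.0, 75.0):
        wavelength_cm = 100.0 * bragg_wavelength(5.3e9, angle)
        print(f"incidence {angle:4.1f} deg -> Bragg wavelength {wavelength_cm:.1f} cm")
```

Under these assumptions the resonant waves are only a few centimetres long, which is the scale of short surface ripples that a surface film damps.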
For each of the three test areas, 15 SAR frames were chosen for routine monitoring of marine pollution (the locations of the SAR frames are depicted in Figure 3-1). For the Gulf of Lion the total number of frames is higher, because two additional frames were chosen for our analyses. Over the whole project duration (or, more precisely, between 1 December 1996 and 30 November 1998) a total of 709 ERS-2 SAR images were acquired over the three test areas and analysed.
Within Clean Seas ERS-2 SAR images have been used which were processed to a resolution of 50 m. These so-called "quick-look images" were provided by the West Freugh Ground Station in Scotland, UK. Quick-look images are advantageous because the dark signatures of oil spills can still be delineated, but the small size of the data files makes it easier to quickly process a large number of SAR images. Every SAR image has been analysed with respect to the occurrence of oil pollution. In order to ensure maximum confidence of the oil detection and, thus, of the statistics to be produced, this analysis was done by eye. The detected oil spills were then catalogued by means of their exact position, size, and their mean reduction of the radar backscattering in that particular area.
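One common way to express the mean reduction of radar backscattering is as a ratio, in decibels, between the mean backscattered power of the surrounding sea and that inside the dark patch. The Python sketch below shows this calculation; it is an assumption about how such a figure could be computed and is not taken from the project's catalogue procedure. The function name and the requirement that inputs are linear (not dB) power values are also assumptions.

```python
import math
from typing import Sequence


def mean_reduction_db(slick_power: Sequence[float],
                      background_power: Sequence[float]) -> float:
    """Return the backscatter reduction of a slick relative to its surroundings.

    Both inputs are sequences of linear power values (assumed, not in dB).
    The result is 10 * log10(mean background / mean slick), so larger
    positive values mean a darker patch.
    """
    if not slick_power or not background_power:
        raise ValueError("both pixel samples must be non-empty")
    mean_slick = sum(slick_power) / len(slick_power)
    mean_background = sum(background_power) / len(background_power)
    if mean_slick <= 0.0 or mean_background <= 0.0:
        raise ValueError("mean power must be positive")
    return 10.0 * math.log10(mean_background / mean_slick)
```

Averaging in linear power before converting to decibels avoids the bias that results from averaging values that are already logarithmic.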
The general term "ocean colour" is used to indicate the visible spectrum of upwelling radiance as observed at the sea surface. This radiance is related - by the processes of absorption and scattering - to the presence, nature and abundance of substances in the surface layer of the sea: the water constituents (i.e. planktonic pigments, the concentration of which can be related to algal biomass in surface waters; dissolved organic matter, the so-called yellow substance; or suspended inorganic sediments). Different water masses can be classified according to the kind of water constituents shaping their optical properties. In general, two main water types, referred to as case 1 and case 2 waters, can be identified in the sea (Morel and Prieur, 1977).
In case 1 water, the optical properties are dominated by biological constituents. These include the photosynthetic pigments of phytoplankton (in the following simply referred to as pigments), both from living algal cells and from associated debris originated by natural decay or zooplankton grazing, as well as other dissolved organic matter liberated by the algae and their debris. This water type is usually found in the pelagic zone, but it occurs also in coastal regions with arid climate, considerable depth and limited coastal runoff. The range of case 1 water spans from blue, oligotrophic waters (with pigment concentration below 0.1 mg m⁻³), to biologically active waters (with pigment concentration around 1 mg m⁻³), as well as to green, eutrophic waters (with pigment concentration as high as 10 mg m⁻³) like those found in upwelling areas.
In case 2 water, the same constituents found in case 1 water are accompanied by the presence of terrigenous particles from coastal runoff, suspended sediments of variable nature, dissolved organic matter (mostly degrading organic remains from land drainage, the yellow substance) and other particulate or dissolved substances originated from anthropogenic influx. This water type may occur in coastal zones with high coastal and fluvial runoff, or near mud flats, estuaries and river deltas, as well as in shallow offshore zones. The total area covered by case 2 waters, not optically dominated by plankton alone, is just a small fraction of the global marine surface. However, their profound impact on marine biology and particularly coastal ecology makes them an important topic of research. Moreover, it is in such waters that most human activities take place, that most of the marine resources to be exploited are concentrated, and that the dangers connected with environmental pollution can be most dramatic, particularly in the European Seas.
The retrieval of environmental parameters from ocean colour depends on the evaluation of in-water optical properties, and of non-marine (atmospheric) contributions to the remotely sensed signal (Gordon and Morel, 1983). In practice, it is necessary to apply algorithms and/or models describing the relationship between radiance and water constituents. First, the sensor-recorded digital counts are converted into apparent radiance values by means of a calibration algorithm, accounting also for sensor response variations. Second, the apparent radiance values (which are contaminated by an atmospheric contribution of up to 90% of the total signal) are corrected to derive upwelling water radiances, or sub-surface reflectances, by means of an atmospheric correction algorithm. Third, these estimates are used to compute the concentration of water constituents (i.e. that of chlorophyll-like pigments or, alternatively, other parameters such as total suspended matter or diffuse attenuation coefficient) by means of a bio-optical algorithm.
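The three processing steps can be pictured as a chain of functions. The Python sketch below is schematic only: the linear calibration, the single-term atmospheric correction and the power-law band-ratio pigment formula are simplified stand-ins chosen for illustration, and every function name, parameter and coefficient is a placeholder rather than a value or algorithm used by the project.

```python
import numpy as np


def calibrate(counts, gain, offset):
    """Step 1: convert digital counts to apparent (top-of-atmosphere) radiance.

    A linear response is assumed; gain and offset are placeholders.
    """
    return gain * np.asarray(counts, dtype=float) + offset


def atmospheric_correction(apparent_radiance, path_radiance, transmittance):
    """Step 2: remove the atmospheric contribution.

    Simplified model: L_total = L_path + t * L_water, solved for L_water.
    """
    return (np.asarray(apparent_radiance, dtype=float) - path_radiance) / transmittance


def pigment_from_band_ratio(lw_blue, lw_green, a, b):
    """Step 3: empirical bio-optical algorithm of the form C = a * (blue/green)**b.

    The coefficients a and b are placeholders, not published values.
    """
    ratio = np.asarray(lw_blue, dtype=float) / np.asarray(lw_green, dtype=float)
    return a * ratio ** b


def retrieve_pigment(counts_blue, counts_green, cal, atm, bio):
    """Chain the three steps for one blue and one green channel."""
    lw = {}
    for band, counts in (("blue", counts_blue), ("green", counts_green)):
        apparent = calibrate(counts, *cal[band])
        lw[band] = atmospheric_correction(apparent, *atm[band])
    return pigment_from_band_ratio(lw["blue"], lw["green"], *bio)
```

Keeping each step as a separate function mirrors the description above, so that one stage (for instance the atmospheric correction) can be replaced without touching the others.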
Time series of ocean colour data collected by the Coastal Zone Color Scanner (CZCS) (Hovis et al., 1980) were used to derive long-term statistics of water constituents for all the Clean Seas test sites. The European CZCS data set (1979-1985) originates from the OCEAN project, established in 1990 by the JRC EC and the ESA (Barale et al., 1999). The historical data (a total of 5554 images, collected on a daily basis when favourable meteorological conditions occurred) had been processed to apply sensor calibration, to correct for atmospheric contamination, and to derive chlorophyll-like pigment concentration (Sturm et al., 1999). Individual pigment images were generated for each available day, co-registered using the same geographic projection and resolution, with a 1 km pixel size, and then averaged pixel by pixel, to compute monthly and annual composites of each variable.
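Pixel-by-pixel averaging of co-registered daily images can be sketched as below. This is an illustrative Python sketch rather than the processing actually used: it assumes that every daily image is already on the same grid and that pixels without a valid retrieval (cloud, land, no coverage) are stored as NaN. The function name composite_mean is invented for the example.

```python
import numpy as np


def composite_mean(daily_images):
    """Average co-registered images pixel by pixel, ignoring invalid pixels.

    daily_images is a sequence of 2-D arrays of identical shape in which
    invalid pixels are NaN (an assumption of this sketch). Returns the mean
    image and the number of valid observations per pixel; pixels that were
    never observed stay NaN in the mean.
    """
    stack = np.stack([np.asarray(img, dtype=float) for img in daily_images])
    valid = ~np.isnan(stack)
    count = valid.sum(axis=0)
    total = np.where(valid, stack, 0.0).sum(axis=0)
    mean = np.full(count.shape, np.nan)
    np.divide(total, count, out=mean, where=count > 0)
    return mean, count
```

Returning the per-pixel count alongside the mean makes it possible to judge how many days actually contributed to each pixel of a monthly or annual composite.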
The time series of historical data was complemented, for the North Sea and Baltic Sea test sites, by a set of images collected by the Modular Opto-electronic Scanner (MOS), on board the IRS-P3 satellite (Zimmermann and Neumann, 1997). The original MOS images were pre-processed for geolocation using standard procedures provided by the data supplying agency (C. Tschentscher, pers. comm.). Once the geometry had been corrected and the sun and satellite angles had been properly estimated, an atmospheric correction scheme (Sturm, 1998) was applied to the top-of-the-atmosphere signal recorded by the sensor. The water-leaving radiances for each of the visible MOS channels, generated by the atmospheric correction, were used to derive estimates of pigment concentration by means of empirical algorithms (Barale et al., 1999) and then registered onto standard geographical grids created for the Clean Seas areas of interest (see section 3.2.1).
In addition, a full year (1998) of current ocean colour data collected by the Sea-viewing Wide-Field-of-view Sensor (SeaWiFS) (Hooker et al., 1993) was assembled and used for comparison with the historical record, and with other kinds of remotely sensed data. The SeaWiFS data (485 images, collected once or twice daily when favourable meteorological conditions occurred) were corrected for atmospheric contamination and turned into chlorophyll-like pigment concentrations, for compositing with the same method and at the same scale used for the CZCS data. The algorithms used for the SeaWiFS data processing are those provided by SeaDAS (Fu et al., 1998), a comprehensive image analysis package for the processing, display, analysis, and quality control of all SeaWiFS data products and ancillary meteorological and ozone data.

The Clean Seas project has collected together one of the first comprehensive archives of colour, temperature and radar image data to have been acquired contemporaneously. The earth observation system has consisted of two of the three classes of sensor on numerous occasions since each was first flown in 1978, but at no time until the launch of OCTS on the Japanese ADEOS satellite were SAR, radiometer and ocean colour data being collected at the same time. These satellite missions were uncoordinated and there was no strategy in place by any of the space agencies operating the spacecraft to encourage the types of synergistic uses of the data envisaged by the Clean Seas project. Fortunately, however, the thousands of images collected over each of the three test sites have allowed the project to take advantage of some serendipity of acquisition, particularly in the case of the 15 July 1997 study of the algal bloom in the Baltic Sea. The project has gone to considerable lengths to annotate and archive the data that have been acquired, and to date only a small percentage of the scenes have been examined in detail. This was due to a number of factors, such as the limitations imposed by cloud cover as well as the significant number of scenes which did not contain any pollution signatures of interest to the project objectives.
Many scenes not used by the project contain dynamic features which have been noted in the text descriptions, and there is undoubtedly a great deal of potential work that could be done to study climatic conditions in the three test sites, for example by producing the kinds of statistics derived from the ocean colour datasets by the JRC.
Licensing arrangements on the datasets mean that the HDF files used by the project cannot be made publicly available but the annotations can be distributed freely to the general scientific community along with derived browse images which highlight the location of the features discussed. An access tool has been developed and is available on the Clean Seas world wide web site which allows the text descriptions to be searched and relevant information identified. For example, searches can be undertaken on the basis of the features observed, the date on which data were acquired and the test site over which the image was acquired. This simple tool allows access to all of the annotations collected from inspection of several thousand images and will as a minimum allow identification of scenes of interest to other researchers. Depending on the data and the original distributor of the data, some of the images collected may also be available for download from the site.
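The kind of search described above (by observed feature, acquisition date and test site) can be sketched as a simple filter over annotation records. The Python below is illustrative only and does not reproduce the project's access tool: the Annotation record, its fields, the search function and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Annotation:
    site: str         # one-letter site code, e.g. "B" for the Baltic Sea
    instrument: str   # e.g. "SAR" or "AVHRR"
    acquired: date    # acquisition date of the image
    text: str         # plain-language features report


def search(annotations: Iterable[Annotation],
           feature: Optional[str] = None,
           site: Optional[str] = None,
           start: Optional[date] = None,
           end: Optional[date] = None) -> List[Annotation]:
    """Return annotations matching every criterion that is supplied.

    feature is matched as a case-insensitive substring of the report text;
    start and end are inclusive date limits.
    """
    hits = []
    for ann in annotations:
        if feature is not None and feature.lower() not in ann.text.lower():
            continue
        if site is not None and ann.site != site:
            continue
        if start is not None and ann.acquired < start:
            continue
        if end is not None and ann.acquired > end:
            continue
        hits.append(ann)
    return sorted(hits, key=lambda a: a.acquired)


if __name__ == "__main__":
    # Invented sample records for demonstration.
    records = [
        Annotation("B", "AVHRR", date(1997, 7, 15), "Extensive algal bloom in open sea"),
        Annotation("B", "SAR", date(1997, 7, 15), "Dark patches, possible bloom signature"),
        Annotation("N", "SAR", date(1997, 3, 2), "Linear dark feature, possible oil spill"),
    ]
    for hit in search(records, feature="bloom", site="B"):
        print(hit.acquired.isoformat(), hit.instrument, hit.text)
```

Searching on a shared date and site across instruments is what surfaces coincident observations from different sensors.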
The project data have now been copied to CD-ROM for long term archiving by the project partners. Subsequent exploitation in projects which build on the work of the Clean Seas project will therefore be possible, as the CD-ROMs reflect the same data standards and file structures as were used on the individual web sites, thus ensuring that consistency is maintained between the classes of data collected.