Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of correlated variables into a new set of linearly uncorrelated variables, the principal components (PCs). The number of PCs is less than or equal to the number of original variables; in the Cassini/VIMS analysis presented here the two are always equal. The PCs are ordered so that the first one accounts for the largest possible variance in the data, and each subsequent PC accounts for the largest remaining variance while being orthogonal to the preceding ones (Fig. 5.1). Furthermore, the PCs are guaranteed to be statistically independent only if the data set is jointly normally distributed.

Fig. 5.1 – The principal component (PC) transformation. The figure illustrates 3-D gene expression data located mainly within a 2-D subspace. PCA reduces the dimensionality of the dataset by concentrating the information it contains into the first PCs. (left) PC#1 and PC#2 describe the highest variance of the data. (right) Representation of the data in the 2-D component space (Image and information credit: Matthias Scholz).
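The transformation described above can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the ENVI processing used in this work; the function name pca and the synthetic 3-variable dataset are assumptions made only for the example.

```python
import numpy as np

def pca(data):
    """PCA of an (n_samples, n_variables) array via the covariance matrix.

    Returns eigenvalues (descending), eigenvectors (as columns, same order)
    and the PC scores of every sample.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort: largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs              # orthogonal change of basis
    return eigvals, eigvecs, scores

# Synthetic stand-in: 3 correlated variables lying close to a 2-D subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2)) * np.array([3.0, 1.0])
mixing = rng.normal(size=(2, 3))
data = latent @ mixing + 0.05 * rng.normal(size=(500, 3))

eigvals, eigvecs, scores = pca(data)
score_cov = np.cov(scores, rowvar=False)     # covariance of the new variables
```

In this sketch the number of PCs equals the number of input variables, and score_cov is diagonal to within rounding error, which is the "uncorrelated components" property stated in the text.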

5.1.1.1 Principal Components (PCs)

With PCA a new coordinate system is obtained that maximizes the variance of the VIMS data, which contain a large number of spectral channels (e.g. Richards, 1994). This is achieved by concentrating the relevant VIMS image information into fewer spectral parameters, the PCs, with most of it concentrated in the first PC (PC#1). However, the content of the PC images strongly depends on the original information in the input data (Singh and Harrison, 1985; Gillespie, 1980). Thus, in an analysis of VIMS data with PCA, PC#1 contains the ‘dominant’ feature as seen by VIMS, in terms of frequency of appearance across the 352 spectro-images (96 visible and 256 IR) or a selection of them, depending on the kind of research (Jolliffe, 1986, 2002, 2005; Smith, 2002; Stephan et al. 2008; Abdi and Williams, 2010). Here, we use a series of VIMS infrared images (256 in total). Indeed, the VIMS datacube is built from the VIMS-Visual and VIMS-IR channels, which do not have coincident fields of view and viewing geometry, so that using both could easily mislead the analysis. In order to avoid this, we use only the VIMS-IR channels, as suggested by the reviewer, for all the datacubes in this analysis. Hence, we use bands 97 to 352, which correspond only to the infrared observations.
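The band selection described above can be sketched as follows. The array layout (bands, lines, samples) and the helper names select_ir and cube_to_matrix are assumptions for illustration; the actual cube layout depends on how the data are read.

```python
import numpy as np

N_VIS, N_IR = 96, 256   # channel counts quoted in the text

def select_ir(cube):
    """Keep bands 97-352 (1-based), i.e. the VIMS-IR channels.

    Assumes a cube of shape (bands, lines, samples); this layout is
    an assumption of the sketch.
    """
    if cube.shape[0] != N_VIS + N_IR:
        raise ValueError("expected 352 spectral bands")
    return cube[N_VIS:]   # 0-based indices 96..351

def cube_to_matrix(cube):
    """Flatten the spatial axes: one row per pixel, one column per band."""
    return cube.reshape(cube.shape[0], -1).T

# Placeholder cube (not real VIMS data), 4 lines by 5 samples.
cube = np.arange(352 * 4 * 5, dtype=float).reshape(352, 4, 5)
ir = select_ir(cube)
pixels = cube_to_matrix(ir)
```

The resulting pixels matrix has one spectrum per row, which is the form a PCA routine typically expects.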

5.1.1.2 Selecting the appropriate PCs from the datacube

In order to extract and select the PCs with significant spectro-imaging information for the final retransformed PCA image, the following issues should be taken into account: (a) the significance of their eigenvalues (EVj); (b) the spatial variations in the PC image itself; and (c) the content of spectral information in the corresponding eigenvectors (Ej) (Stephan et al. 2008).

Eigenvalues

Eigenvalues are a special set of scalars associated with a linear system of equations (i.e. a matrix equation); they are also known as characteristic roots, characteristic values (Hoffman and Kunze, 1971), proper values, or latent roots (Marcus and Minc, 1988). Each eigenvalue is paired with an eigenvector, and in PCA it gives the variance of the data along that eigenvector. Figure 5.2 shows a plot of the eigenvalues from the Tui Regio datacube (No 4, Table 3.3). The point at which the curve flattens is often used as a criterion for the selection of PCs (e.g. Jöreskog et al. 1976; Stephan et al. 2008).

Fig. 5.2 - Eigenvalue plot for Tui Regio datacube NoX. Most of the variability is accounted for by PC#1, while PCs #2, 3 and 4 are also statistically significant for our work. Beyond these, the flattening of the curve indicates the absence of significant variability in the remaining PCs.
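One way to turn the flattening criterion into a number is sketched below. The threshold rel_drop and the function names are hypothetical choices made for this example, and the eigenvalues are invented for illustration; they are not the Tui Regio values of Fig. 5.2.

```python
import numpy as np

def explained_fraction(eigvals):
    """Fraction of the total variance carried by each PC."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals / eigvals.sum()

def n_before_flattening(eigvals, rel_drop=0.01):
    """Number of leading PCs before the eigenvalue curve flattens.

    The curve is called flat from the first PC whose drop to the next
    eigenvalue is below rel_drop times the first eigenvalue. The
    threshold is an arbitrary assumption of this sketch.
    """
    eigvals = np.asarray(eigvals, dtype=float)
    drops = eigvals[:-1] - eigvals[1:]
    flat = np.nonzero(drops < rel_drop * eigvals[0])[0]
    return int(flat[0]) if flat.size else eigvals.size

# Invented, descending eigenvalues (illustration only).
example = [100.0, 20.0, 8.0, 4.0, 0.5, 0.4, 0.3]
frac = explained_fraction(example)
n_keep = n_before_flattening(example)
```

For these invented values the sketch retains four PCs. In practice such a count would only be a first guess, to be checked against the PC images and eigenvectors as described in this section.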

Eigenvectors and PC images

The eigenvectors (Ej) are a special set of vectors associated with a linear system of equations; they are also known as characteristic vectors, proper vectors, or latent vectors (Marcus and Minc, 1988). In this analysis the eigenvectors provide information on the spectral behavior of each PC. They thus enable us to distinguish between atmospheric and surface information and to use the PCs that correspond only, or at least largely, to the surface. Figure 5.3 shows the eigenvectors extracted from the Tui Regio datacube. It appears that PC#1 does not reflect the spectrum of the surface, since it lacks any spectral response at 3 and 5 μm, where methane windows are centered. PCs #2, 3 and 4 do show such a response, followed by PCs #5, 6 and 7, whose spectra are very noisy outside the windows. I should note here that the eigenvalues discussed above show that PC#1 contains a significant part of the information present in the datacube, which indicates that a parameter other than surface information is dominant. The PC images should clarify the nature of PC#1. Moreover, PC#2 shows a distinct surface spectral behavior.

Fig. 5.3 - Calculated eigenvectors from the Tui Regio datacube. It appears that only PCs #2,3,4 are compatible with surface spectra.

Following the examination of the eigenvectors, I used ENVI to retrieve the image of each PC and validate its nature. This step is very important, since comparing the eigenvalues (EV) with the PC images shows whether PC images with a relatively high EV display spatial variations that are related to surface features, and whether PC images with a low EV are fully dominated by a distinct noise pattern (Stephan et al. 2008).

Figure 5.4 shows the first seven PCs; the first few, as mentioned earlier, retain most of the variability present in all of the original variables.

Fig. 5.4 - The first seven principal components (PCs) for the Tui Regio dataset. The first principal component (PC#1) presents the largest possible variance. In this datacube, as seen in Fig. 5.2, the information concerning the variability of the dataset is present up to PC#4.

To describe what each PC represents in the particular case of Tui Regio: PC#1 mainly reflects the atmosphere that lies over the area of Tui Regio and appears to be more prominent than any other feature. This enhances the need for a tool to evaluate and ‘remove’ the atmosphere (see the following sections). PC#2 indicates that Titan frequently shows dark surface features, while PC#3 shows that, quite often, Titan’s surface features are bright (for this dataset, PC#3 displays Tui Regio as particularly bright). The PC#4 projection indicates that Titan’s surface features are rather infrequently dark with limb darkening. Lastly, PCs #5, 6 and 7 contain only noise in this particular cube. Thus, for Tui Regio, we will hereafter use only RGB composites of the three main principal components related to the surface (here PCs #2, 3 and 4, since PC#1 refers mainly to the atmosphere) (Fig. 6.13, Chapter 6).

Fig. 5.5 shows the PC images for the Sotra Patera case, an example of a datacube in which PC#1 corresponds to the surface. This is also confirmed by the eigenvectors shown in Fig. 6.12 and by the resulting PCA image in Fig. 6.13 in Chapter 6.

Fig. 5.5 - Same as for Fig. 5.4 but for Sotra Patera. PC#1,2,3 reflect the surface.

PCA images

For any given VIMS datacube there should be a specific selection of PCs based on their eigenvalues and images, as well as on the information derived from the eigenvectors. For the Tui Regio example, following this process yields the PCA image seen in Fig. 5.6. We chose the colors so as to illustrate the distinct spectral units, with the aim of rendering the actual Tui Regio, Hotei Regio and Sotra Patera features in red within the datacubes. Hence, knowing from previous studies that these areas are particularly bright (anomalously so at 5 μm) (e.g. Barnes et al. 2006; McCord et al. 2008; Soderblom et al. 2009), we label the red spectral unit as ‘brightest’ and the green one (in this particular datacube) as ‘darkest’.

Fig. 5.6 – PCA image of Tui Regio using PCs#3,2,4 (Solomonidou et al. 2013b).
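A composite of this kind can be sketched as below, assigning PCs #3, 2 and 4 to the red, green and blue channels. The percentile contrast stretch and the function names are assumptions of the sketch; the text does not specify how the PC images were scaled for display, and the input here is random placeholder data.

```python
import numpy as np

def stretch(band, low=2.0, high=98.0):
    """Linear contrast stretch of one image to [0, 1] between two percentiles."""
    lo, hi = np.percentile(band, [low, high])
    if hi <= lo:
        return np.zeros_like(band, dtype=float)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)

def pc_rgb(pc_images, order=(3, 2, 4)):
    """Build an RGB image from PC images.

    pc_images: array of shape (n_pcs, lines, samples) with PC#1 at index 0.
    order: 1-based PC numbers assigned to R, G and B.
    """
    return np.stack([stretch(pc_images[n - 1]) for n in order], axis=-1)

# Random placeholder standing in for seven PC images of 8 x 8 pixels.
rng = np.random.default_rng(1)
pc_images = rng.normal(size=(7, 8, 8))
rgb = pc_rgb(pc_images)
```

Note that the sign of a PC is arbitrary, so whether a given unit appears bright or dark in a channel may require flipping the sign of that PC before stretching.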

An additional advantage of PCA, other than identifying and partially exploiting the atmospheric PCs (Fig. 5.4), is the ability to distinguish any clouds present in the data; owing to their very specific spectral signatures, these would appear in a separate region of interest (RoI), distinct from surface features.

5.1.1.3 Test case

By using the PCA method we aim to create units of color heterogeneity, each corresponding to a unit of distinct spectral response, in all three of our study areas. Le Mouélic et al. (2008) studied the Sinlap crater, one of the few impact craters observed on Titan, and through the use of band ratios they identified several units in the VIMS false color composites that indicate compositional heterogeneities. These units correspond to specific colors as follows: the bright ring corresponds to the ejecta blanket, the blue and dark blue area is probably enriched in water ice (Rodriguez et al. 2006) compared to the surroundings, and the brownish area corresponds to dune fields (Soderblom et al. 2007a, 2007b; Barnes et al. 2008; Rodriguez et al. 2013) (Fig. 5.7). Such a classification led to three compositionally and geologically different types of surface regions. Applying the PCA technique to the exact same VIMS cube, we arrive at the same result of three distinct regions within the same regional boundaries, as seen in Fig. 5.7 (lower right). We thus have a confirmation that our principal component application works properly (Solomonidou et al. 2013b).

Fig. 5.7 - (left) RADAR/VIMS combined pan-sharpened image of the Sinlap crater region, built from the T3 SAR image and the T13 VIMS cube CM_1525118253 (shown at 2.03 μm, upper right), on which Le Mouélic et al. (2008) found spectral diversity between three regions. These authors applied the band ratio technique to build the VIMS RGB composite as follows: R=1.59/1.27 μm; G=2.03/1.27 μm; B=1.27/1.08 μm. This diversity is confirmed by our PCA, which isolates 3 RoIs (lower right) that match the Le Mouélic et al. (2008) inferences. Following the coloring of this PCA image, the red rectangle in the left image corresponds to the bright region, the blue one to the dark brown region, and the green one to the dark blue region, as characterized by Le Mouélic et al. (2008) (Image credit: Solomonidou et al. 2013b).
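For comparison with the PCA approach, the band ratio composite quoted in the caption (R=1.59/1.27 μm; G=2.03/1.27 μm; B=1.27/1.08 μm) can be sketched as follows. The function names, the (bands, lines, samples) layout and the evenly spaced wavelength grid are assumptions made for the example; the grid is a placeholder and not the real VIMS wavelength table.

```python
import numpy as np

def nearest_band(wavelengths, target):
    """Index of the band whose wavelength is closest to target (in μm)."""
    return int(np.argmin(np.abs(np.asarray(wavelengths) - target)))

def band_ratio_rgb(cube, wavelengths,
                   ratios=((1.59, 1.27), (2.03, 1.27), (1.27, 1.08))):
    """RGB planes from three band ratios of a (bands, lines, samples) cube."""
    planes = []
    for num, den in ratios:
        n = cube[nearest_band(wavelengths, num)]
        d = cube[nearest_band(wavelengths, den)]
        # Pixels with a zero denominator are set to 0 instead of dividing.
        ratio = np.divide(n, d, out=np.zeros_like(n, dtype=float), where=d != 0)
        planes.append(ratio)
    return np.stack(planes, axis=-1)

# Placeholder wavelength grid and cube (not real VIMS values).
wavelengths = np.linspace(0.9, 5.1, 256)
cube = np.ones((256, 2, 2)) * (np.arange(256) + 1.0)[:, None, None]
rgb_ratio = band_ratio_rgb(cube, wavelengths)
```

The ratio planes are left unscaled here; a display stretch would normally be applied before viewing them as colors.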

5.1.1.4 Application on icy moons data

The principal component analysis method has previously been applied to spectral data of another icy moon by Stephan et al. (2008), who studied Ganymede using Galileo/NIMS data (0.7-5.2 μm). In that study, PCA was mainly used to improve the quality of the spectro-imaging data through the removal of noise and instrument artifacts. This was also an outcome of our PCA analysis, since we were able to remove the noise (PCs #5, 6 and 7) and detect any clouds present. Stephan et al. (2008) applied PCA to the NIMS data in order to improve the signal-to-noise ratio (SNR), an important parameter for mapping spectral properties, since the derived wavelength positions and band depths depend on it. The higher the SNR, the better the quality of the data. The SNR can be improved by averaging spectra, and PCA can be applied for the same purpose without diminishing the amount of spectral information. The application of the method produced spectral data of higher quality, notably beyond 3 μm. Such an improvement is essential for mapping and spectral studies in terms of chemical composition analysis and surface properties.

Fig. 5.8 – PC images from NIMS datacubes. On the right of the image an albedo map with the main geological features of the datacube area is shown (Image credit: Stephan et al. 2008).
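The noise-removal use of PCA mentioned above can be sketched as an inverse transformation that keeps only the leading PCs. This is a generic illustration on synthetic spectra, not the procedure of Stephan et al. (2008) or the ENVI workflow used here; the function name pca_denoise, the two synthetic end-member spectra and the noise level are all assumptions.

```python
import numpy as np

def pca_denoise(pixels, n_keep):
    """Rebuild spectra from the n_keep leading PCs only.

    pixels: array of shape (n_pixels, n_bands), one spectrum per row.
    """
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    leading = eigvecs[:, np.argsort(eigvals)[::-1][:n_keep]]
    # Project onto the retained PCs, then transform back to band space.
    return (centered @ leading) @ leading.T + mean

# Synthetic scene: mixtures of two smooth spectra plus white noise.
rng = np.random.default_rng(2)
n_pixels, n_bands = 400, 30
abundance = rng.uniform(size=(n_pixels, 2))
endmembers = np.stack([np.linspace(0.1, 0.6, n_bands),
                       np.linspace(0.5, 0.2, n_bands)])
clean = abundance @ endmembers
noisy = clean + 0.02 * rng.normal(size=clean.shape)

denoised = pca_denoise(noisy, n_keep=2)
rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((denoised - clean) ** 2))
```

Because the discarded PCs carry mostly noise in this synthetic case, the reconstructed spectra lie closer to the noise-free ones. With real data the number of PCs to keep would follow from the eigenvalue, eigenvector and PC image checks described in Section 5.1.1.2.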
