1.2. Descripción del problema
1.6.4. Calidad de Servicio
5.2.1 Identifying Variable Stars
The idea behind the method that we use for identifying variables is a simple one. Variable stars should show an RMS scatter in the lightcurves that is above the usual RMS scatter due to noise. We can use the RMS diagrams to find the stars with an RMS that is larger than that of the majority of stars at any particular magnitude.
Firstly we plot a histogram of the number of epochs in the PASS0 lightcurves and decide to consider a variable search in the 12853 lightcurves (out of 13515) with at least 100 epochs (Figure 5.1). Then we plot the RMS diagram for the chosen PASS0 lightcurves (Figure 5.2) and fit it with the model described by Equation 4.3 in Chapter 4, but adjusting the model to take a reference magnitude of 8 rather than 12.75. We obtainA = 12.1 mmag, B = 8.8 mmag and
C= 7.2mmag. The model is plotted on Figure 5.2 as the solid black line.
To identify variables, we need to choose a threshold in RMS above which any stars will be considered as variable. We must be carefull in choosing this threshold because the higher we set this threshold, the more real variables that we will fail to detect, and because the lower we set this threshold, the more constant stars that will contaminate our variable star sample.
It is clear from Figure 5.2 that our variable star detection threshold will need to be a function of magnitude, and we decide to set the threshold as some factorTtimes the fitted model (solid black
114 CHAPTER 5. VARIABLE CANDIDATES
Figure 5.1: Histogram of the number of epochs in the PASS0 lightcurves. The lightcurves with at least 500 epochs are represented by the shaded part of the histogram.
line). In Figure 5.3 we present a normalised cumulative histogram of the ratio of the lightcurve RMS to the model RMS. We compare the normalised cumulative histogram to similar histograms for chi squared distributions with degrees of freedom ranging from 1 up to the number of epochs in the longest PASS0 lightcurve (3426 epochs), where we have normalised thex-axis by the number of degrees of freedom. We find that the chi squared distribution that best fits the normalised cumulative histogram (chosen using least squares) has 27 degrees of freedom. The range in the number of epochs in the PASS0 lightcurves, combined with systematic trends present in the PASS0 lightcurves have served to increase the spread in the RMS values around the model RMS curve, which has resulted in the best-fitting chi squared distribution having an artificially small number of degrees of freedom.
To choose our threshold factor T we consider the probability P(x > T) that the lightcurve RMS to the model RMS ratio xis greater than T, which can be obtained directly from the chi squared distribution with 27 degrees of freedom. The probability thatx≤T forN lightcurves is
0.001 0.01 0.1 1 4 5 6 7 8 9 10 RMS Scatter (mag) PASS0 Magnitude
Figure 5.2: Plot of lightcurve RMS scatter against mean magnitude for the PASS0 lightcurves (red dots). The fit to the RMS diagram is shown as the solid black line, and the cut off in RMS for choosing variable star candidates is shown by the green dashed line. Variable star candidates are shown as small filled black circles. Variable star candidates that survive our further cuts in Section 5.2.2 are shown as large filled black circles.
given by:
P(x≤T for allN lightcurves) = (1−P(x > T))N (5.1) Hence the probability that at least one lightcurve exceeds our thresholdTis1−(1−P(x > T))N, and we choose to require that this false alarm probability is 5%. We can now solve for the proba- bilityP(x > T):
P(x > T) = 1−(1−0.05)1/N (5.2)
and we may then chooseT by considering the cut-off value of the chi squared distribution with 27 degrees of freedom. For theN = 12853PASS0 lightcurves that we search for variables we find that we should setT = 2.71.
Having chosen our detection threshold for variable stars, we scale up the model represented by the solid black line in Figure 5.2 by a factor ofT = 2.71, and plot it as the green dashed line. We take all stars with an RMS above this dashed line as candidate variables. This method is similar to that presented in Street et al. 2002 [78]. The 123 candidate variable stars are marked in Figure 5.2
116 CHAPTER 5. VARIABLE CANDIDATES
Figure 5.3: Normalised cumulative histogram of the ratio of the lightcurve RMS to the model RMS (solid line). We overplot the best-fit chi squared distribution with 27 degrees of freedom, where the x-axis of this distribution has been normalised by the number of degrees of freedom (dotted line).
as filled black circles, rather than the red dots for all the other stars.
5.2.2 Determining the Period of the Variable Stars
For each variable star candidate we inspect the lightcurve and remove obviously spurious lightcurves. These fake variables show the same large drop in brightness at the end of a night of observations, boosting the RMS of the lightcurve to a large value not representative of the actual scatter in the rest of the lightcurve. We also try and determine a period for each lightcurve using the Lomb- Scargle method [46],[73] via the standard periodogram. For the PASS0 lightcurves it became clear that without at least 500 epochs in a lightcurve, it was impossible to determine a period, mainly because of the poor time sampling of the lightcurves (∼2 hours in every 24 hours). Hence we further cut the set of variable lightcurve candidates to only those with at least 500 epochs (see Figure 5.1). This left us with 60 variable stars.
The Lomb-Scargle periodogram method attempts to find the best period for a set of time-series data (evenly or unevenly spaced) by fitting sine and cosine curves at different frequencies to the data, and plotting the improvement in the data residuals as a function of frequency. With a clear
Figure 5.4: Periodograms for PASS0 example variable lightcurves.
periodic signal, the period will be indicated by the peak of the periodogram. However, when the signal is not so strong, there may be multiple periodogram peaks. Also, when the time-series data does not cover the full phase of the actual period, other periodogram peaks of equal strength may appear at integer multiples of the real frequency.
The period search for each variable lightcurve was limited to the range from the Nyquist frequency (half the sampling frequency) to half the time length of the specific lightcurve. In the case of a clear periodogram peak, we take this peak as the period. In the case of multiple equal- strength peaks, we take the peak corresponding to the shortest period as the period. For stars which have periodograms without a clear peak we simply plot the unphased lightcurves.
In Figure 5.4 we plot four example periodograms corresponding to four PASS0 variables. The top left periodogram is for the star 1632-1488-1 (lightcurve in Figure 5.5) which varies on a longer time scale than the length of the observation run. In this case, the periodogram suggests a period of 1 d, which is clearly wrong, and corresponds to the time between different observation nights. Hence we cannot derive a period for this star. The top right periodogram is for the star 3215-0971-
118 CHAPTER 5. VARIABLE CANDIDATES 1 (lightcurve in Figure 5.6) which shows an eclipse on the seventh observing night, but is constant otherwise. It is impossible to derive a period with only one eclipse, and the periodogram shows no clear peak but suggests a period of greater than 10 d. Hence we do not adopt a period for this lightcurve. The bottom left periodogram is for the star 3169-3875-1 (lightcurve in Figure 5.8) and it has a clear peak at a period of 3.3979 d. The folded lightcurve is of good quality and therefore we adopt the suggested period. The bottom right periodogram is for the star 3215-1746-1 (lightcurve in Figure 5.7). The strongest peak is at a period of 0.1603 d which produces a beautiful folded lightcurve and therefore we adopt this period.
Of the 60 variable stars, we could derive a period in 30 cases. In Figures 5.5 and 5.6, we plot the 30 unphased variable star lightcurves for which a reliable period could not be found. In Figures 5.7 and 5.8, we present the remaining 30 variable star lightcurves phased on the most reliable period. We mention that for lightcurves like 1635-0568-1, 3155-1202-1 and 1631-0923- 1 there seem to be a few copies of the lightcurve pattern shifted in magnitude from the main lightcurve. This is caused by a small subset of the PASSCAL factors for the LST correction (k1(t)) being poorly determined for these lightcurves (see Section 3.2.2).
In Tables 5.1 and 5.2, we list the variable stars from the PASS0 data. We report the celestial coordinates and mean magnitude of each star. We also report the period and amplitude of the lightcurve for those lightcurves for which we could derive a period. In the last column we report the results of a query in the SIMBAD astronomical database (http://simbad.u-strasbg.fr/simbad/) for the Tycho star identifier and list the variable type if the star is already known as a variable (note that the SIMBAD database does not supply the period). We find that 37 out of the 60 variable stars are already known as variables in SIMBAD, and it is likely that the remaining 23 variables are known variables that are not reported in SIMBAD but elsewhere, simply because these are bright stars (brighter than 10th magnitude). It is not within the range of this thesis to investigate these variables further.