4.5.1 Nature of the Problems
a) Short time scale systematic fluctuations. As discussed in more detail in Chapter 5, the data recorded in the Ceduna monitoring campaign through to early 2005 contain systematic fluctuations which are most pronounced on time scales on the order of a few hours. For a discrete ACF computed at typical lag intervals of 4 hours, the highest variability frequency that can be identified is 3 days$^{-1}$, some three times higher than the 1 days$^{-1}$ frequency limit imposed by the presence of systematic fluctuations on diurnal time scales.
It seems clear that the systematic fluctuations have a thermal origin, and result from changes in the ambient temperature, but at the time of finalising this thesis, the problem has not been resolved at a telescope operational level and efforts to develop a data processing procedure to remove the fluctuations have not been successful (see Section 5.6).
The systematic fluctuations are present in the PKS B1144-379 observing period 4 data used in this chapter to illustrate scintle counting, data folding and spectral analysis methods of estimating $T_{\mathrm{char}}$ and $T_{\mathrm{period}}$. However, the true variability signal in this data set is quite strong (see Figure 4.1), and a $T_{\mathrm{period}}$ value of just over 2 days is sufficiently long that the systematic diurnal fluctuations can be ignored without significantly influencing the analysis results. Nevertheless, the systematic fluctuations are a nuisance that can hinder a variability analysis by obscuring a scintle counting exercise, introducing spurious high-frequency PSD peaks, and producing minor irregularities in data folding plots.
b) Weekly time scale systematic fluctuations. The calibrator flux densities vary slightly, usually by less than 4%, over the course of an observing period. These variations are largely systematic in nature, and are believed to be a manifestation of the thermally induced systematic fluctuations discussed above, due to the passage of weather systems. Synoptic time scales are similar to the variability time scales of the blazars monitored by Ceduna, and it is thus necessary to use the calibrator flux density data to correct the blazar flux density data in a refined manner: a bulk correction factor for each observing period is not sufficient.
c) Long time scale flux density trends. All the blazar flux density time series recorded by Ceduna exhibit significant changes over periods of months to years, presumably due to processes intrinsic to the source. The flux density thus typically changes over an observing period in a fairly steady fashion, and introduces a low frequency PSD peak which roughly corresponds to the length of the observing period. This peak can be quite strong, making it difficult for spectral analysis to identify variations on the order of several days.
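The 3 days$^{-1}$ figure quoted in item a) is consistent with the Nyquist limit for data sampled every 4 hours. As a brief check, assuming the ACF lag interval $\Delta t$ is also the effective sampling interval:

```latex
f_{\max} = \frac{1}{2\,\Delta t}
         = \frac{1}{2 \times 4\ \mathrm{h}}
         = \frac{1}{8\ \mathrm{h}}
         = 3\ \mathrm{days}^{-1}
```

This corresponds to a shortest identifiable period of 8 hours, compared with the 1 day period implied by the 1 days$^{-1}$ limit.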
4.5.2 Data Filtering
After consultation, the author decided to address the problems described in the previous section by filtering the data in the time domain. Low and high pass filtering in the frequency domain could remove the fast systematic fluctuations and the flux density trends over the observing period respectively. However, time domain filtering facilitates removal of the small systematic flux density variations on time scales of days, using concurrent calibrator data. Emeritus Professor Peter McCulloch (University of Tasmania) favoured the time domain filtering option because it enabled visual inspection of the process, which is always reassuring when dealing with the processing of data from an initial observing campaign.
Step 1. Smooth the diurnal data fluctuations with a polynomial
A polynomial fitted to the data for a given observing period enables the systematic diurnal fluctuations to be smoothed out, leaving slower fluctuations on the order of several days. Polynomial fits are notoriously bad near the start and end of the data to which they are applied, so a strategy was developed by the author whereby the smoothing polynomial is fitted to an extended data set in which the first and last days of data are repeated:
Extended data for an M-day observing period = {day 1 + days 1 to M + day M}
A polynomial of too low an order will over-smooth the data, failing to capture the longer term variations that are of interest, while a polynomial of too high an order will follow the unwanted faster variations. For data from an observing period of M = 10-15 days duration, a polynomial of order M+5 produces an acceptable fit to the M+2 days of data.
In practice, the success of the polynomial filter is not too sensitive to its order, so long as it is close to order M+5. The reason is shown in Figure 4.21, which plots the sum-square error between PKS B1622-253 data from observing period 10, and the fitting polynomial as a function of the polynomial order.
Figure 4.21 Sum-square error for various polynomial fits to PKS B1622-253 data.
The red dashed line highlights the gradual error reduction as high order polynomials start to follow diurnal flux density variations.
Figure 4.21 shows a rapid decrease in the sum-square error with increasing polynomial order, signifying progressively better fits smoothing through the daily means of the data. The dashed red line shows that additional reductions in the sum-square error are gradually achieved by progressively higher order polynomials starting to follow the diurnal variations. Observing period 10 has 12 days of data, so a 17th-order filtering polynomial is specified by the M+5 prescription. Figure 4.21 shows that a polynomial of order much greater than 17 would be needed for a polynomial fit to start to follow the diurnal variations.
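The behaviour described for Figure 4.21 can be explored with a short calculation. The sketch below computes the sum-square error over a range of polynomial orders for a synthetic series. The function name `sse_by_order` and the synthetic signal are illustrative assumptions, and for brevity the fit is made directly to the data rather than to the extended data set.

```python
import numpy as np
from numpy.polynomial import Polynomial


def sse_by_order(t_days, flux, orders):
    """Return the sum-square error of a least-squares fit for each order."""
    t_days = np.asarray(t_days, dtype=float)
    flux = np.asarray(flux, dtype=float)
    sse = []
    for order in orders:
        # Polynomial.fit rescales time to [-1, 1] before solving.
        poly = Polynomial.fit(t_days, flux, deg=order)
        sse.append(float(np.sum((flux - poly(t_days)) ** 2)))
    return np.array(sse)


# Synthetic 12-day series (hourly samples): a slow variation plus a
# smaller diurnal term. These values are hypothetical, not Ceduna data.
t = np.arange(12 * 24) / 24.0
flux = 0.2 * np.sin(2.0 * np.pi * t / 4.0) + 0.03 * np.sin(2.0 * np.pi * t)

orders = list(range(1, 21))
sse = sse_by_order(t, flux, orders)
```

Plotting `sse` against `orders` on a logarithmic axis gives a curve of the same general kind as Figure 4.21, although the details depend on the assumed signal.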
Importantly, overfitting is not a problem because the only polynomial values used are those computed at times corresponding to actual data. Overfitting occurs when a polynomial used to fit a set of data points produces the required values at those points, but varies wildly and incorrectly between the data points, which can happen when the polynomial order is higher than necessary, for example if a 10th-order polynomial is used to model cubic variations.
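The distinction drawn above can be illustrated numerically. In the sketch below, a 10th-order polynomial is fitted through 11 noisy samples of a cubic. The cubic, the alternating noise pattern and all variable names are hypothetical choices made purely for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial


def cubic(u):
    """Underlying cubic variation (illustrative coefficients)."""
    return 0.001 * (u - 5.0) ** 3


# Eleven samples of the cubic with small alternating "noise".
t = np.arange(11.0)
noise = 0.01 * (-1.0) ** np.arange(11)
y = cubic(t) + noise

# A 10th-order polynomial through 11 points reproduces every sample.
poly = Polynomial.fit(t, y, deg=10)
err_at_samples = np.max(np.abs(poly(t) - y))

# Between the samples the same polynomial departs strongly from the cubic.
t_fine = np.linspace(0.0, 10.0, 201)
err_between = np.max(np.abs(poly(t_fine) - cubic(t_fine)))

# Evaluated only at the sample times, it differs from the cubic by the
# noise alone, which is the situation described in the text.
err_vs_cubic_at_samples = np.max(np.abs(poly(t) - cubic(t)))
```

In this example `err_between` is many times larger than the noise amplitude, while `err_vs_cubic_at_samples` is no larger than the noise itself.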
A smoothed data set, $x^{\mathrm{Data}}_{\mathrm{smooth}}$, is obtained by evaluating the polynomial fitted through the data, $x^{\mathrm{Data}}$, at the times associated with each data point (ignoring the repeated first and last days of data that were added to ensure the polynomial was well behaved).
$$x^{\mathrm{Data}}_{\mathrm{smooth}} = \mathrm{polynomial}_{M+5}\left(x^{\mathrm{Data}}\right)$$
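As a rough illustration of Step 1, the sketch below pads a time series with copies of its first and last days and fits a polynomial of order M+5 using NumPy. The function name `smooth_with_padding`, the one-day time shift applied to the repeated days, and the synthetic signal are assumptions made for illustration; they are not taken from the processing software used for this research.

```python
import numpy as np
from numpy.polynomial import Polynomial


def smooth_with_padding(t_days, flux, n_days, extra_order=5):
    """Fit an order (n_days + extra_order) polynomial to padded data.

    Assumption: the repeated first and last days are placed one day before
    and one day after the observing period, giving n_days + 2 days of data.
    """
    t_days = np.asarray(t_days, dtype=float)
    flux = np.asarray(flux, dtype=float)
    t0 = t_days.min()

    first_day = t_days < t0 + 1.0
    last_day = t_days >= t0 + n_days - 1.0

    # Extended data = {day 1 + days 1 to M + day M}
    t_ext = np.concatenate(
        [t_days[first_day] - 1.0, t_days, t_days[last_day] + 1.0]
    )
    x_ext = np.concatenate([flux[first_day], flux, flux[last_day]])

    # Polynomial.fit rescales time to [-1, 1], which helps keep a
    # high-order least-squares fit numerically well conditioned.
    poly = Polynomial.fit(t_ext, x_ext, deg=n_days + extra_order)

    # Evaluate only at the original sample times (padding is ignored).
    return poly(t_days)


# Synthetic example: hourly samples over M = 12 days, purely diurnal signal.
M = 12
t = np.arange(M * 24) / 24.0
diurnal = 0.05 * np.sin(2.0 * np.pi * t)
x_smooth = smooth_with_padding(t, diurnal, M)
x_fast = diurnal - x_smooth  # fast residual left after smoothing
```

For this synthetic input the 17th-order polynomial cannot follow the daily cycle, so most of the signal remains in `x_fast` rather than in `x_smooth`.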
Step 2. Apply calibrator correction to the smoothed data set
The calibrator data are fitted with a (different) polynomial of the same order, M+5, which describes the minor fluctuations on time scales of days. This enables correction of the smoothed blazar data set, $x^{\mathrm{Data}}_{\mathrm{smooth}}$:
$$x^{\mathrm{Data}}_{\mathrm{smooth}} = \frac{\text{Calibrator flux density}}{x^{\mathrm{Calibrator}}_{\mathrm{smooth}}}\; x^{\mathrm{Data}}_{\mathrm{smooth}}$$
where the $x^{\mathrm{Calibrator}}_{\mathrm{smooth}}$ values are obtained by evaluating the calibrator polynomial at the same times as the $x^{\mathrm{Data}}_{\mathrm{smooth}}$ values.
Step 3. Remove long term flux density trends and convert to a zero-mean data set
Next, the smoothed data set, $x^{\mathrm{Data}}_{\mathrm{smooth}}$, is fitted by a low order polynomial to remove any flux density trend over the observing period. Any such trend is associated with intrinsic changes in the blazar’s emissions over time scales of weeks to months and, as noted in the previous section, needs to be removed to avoid a low frequency PSD peak. The polynomial must be low order to avoid following the genuine scintles in $x^{\mathrm{Data}}_{\mathrm{smooth}}$.
Figure 4.22 shows the different PSDs for PKS B1622-253 observing periods 10 and 12 that result from specifying 1st- to 4th-order polynomials. The different order polynomials produce principal PSD peaks that range from 0.20 to 0.23 days$^{-1}$ and 0.13 to 0.15 days$^{-1}$ in observing periods 10 and 12 respectively.
Figure 4.22 PSDs for PKS B1622-253 zero-mean data for observing periods 10 (top)
and 12 (bottom), using various low-order polynomials.
Removing a flux density trend over an observing period of 10-15 days rarely requires more than a linear polynomial, and this has been adopted as a general rule for the present research. The few exceptions are identified in the data processing plots (see Chapter 6).
Evaluating the low order polynomial at the times associated with each data point gives values $x^{\mathrm{Data}}_{\mathrm{trend}}$, which enable a zero-trend smoothed data set to be produced:
$$x^{\mathrm{Data}}_{\mathrm{zero}} = x^{\mathrm{Data}}_{\mathrm{smooth}} - x^{\mathrm{Data}}_{\mathrm{trend}}$$
The $x^{\mathrm{Data}}_{\mathrm{zero}}$ data set contains the genuine flux density fluctuations, with respect to a zero mean, that vary on time scales greater than a day. It is the basis for analysis of the Ceduna data considered by this research.
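Step 3 amounts to a low order least-squares fit followed by a subtraction. The sketch below applies a linear fit to a synthetic smoothed series and compares simple periodograms before and after the trend is removed. The function name `remove_trend`, the synthetic data, and the use of an FFT periodogram on evenly sampled data are assumptions for illustration; they are not a description of the PSD estimator used elsewhere in this chapter.

```python
import numpy as np
from numpy.polynomial import Polynomial


def remove_trend(t_days, x_smooth, order=1):
    """Subtract a low-order polynomial trend; return (x_zero, x_trend)."""
    t_days = np.asarray(t_days, dtype=float)
    x_smooth = np.asarray(x_smooth, dtype=float)
    x_trend = Polynomial.fit(t_days, x_smooth, deg=order)(t_days)
    return x_smooth - x_trend, x_trend


# Synthetic smoothed series (Jy): a 4-day periodic variation on top of a
# steady trend over a 12-day observing period. Values are hypothetical.
t = np.arange(12 * 24) / 24.0
x_smooth = 2.0 + 0.05 * t + 0.1 * np.sin(2.0 * np.pi * t / 4.0)
x_zero, x_trend = remove_trend(t, x_smooth, order=1)

# Simple periodograms. Bin k corresponds to k cycles per 12-day period,
# so bin 1 is the "length of the observing period" peak and bin 3 is
# the 0.25 days^-1 signal.
power_before = np.abs(np.fft.rfft(x_smooth - x_smooth.mean())) ** 2
power_after = np.abs(np.fft.rfft(x_zero)) ** 2

peak_before = int(np.argmax(power_before))
peak_after = int(np.argmax(power_after))
```

With these assumed values the strongest periodogram bin moves from the lowest frequency (set by the trend) to the bin of the 4-day variation once the linear trend has been subtracted.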
The diurnal fluctuations
Subtraction of the smoothed data set, $x^{\mathrm{Data}}_{\mathrm{smooth}}$ (after the Step 2 correction), from the original data from which it was derived, $x^{\mathrm{Data}}$, produces a data set, $x^{\mathrm{Fast}}_{\mathrm{zero}}$, that contains fluctuations that vary on diurnal time scales.
$$x^{\mathrm{Fast}}_{\mathrm{zero}} = x^{\mathrm{Data}} - x^{\mathrm{Data}}_{\mathrm{smooth}}$$
In Chapter 5, the $x^{\mathrm{Fast}}_{\mathrm{zero}}$ data sets of the two calibrator sources are compared to the $x^{\mathrm{Fast}}_{\mathrm{zero}}$ data sets of PKS B1622-253 and PKS B1519-273, the blazars considered by the present research. It is found that the $x^{\mathrm{Fast}}_{\mathrm{zero}}$ data sets of both blazars are dominated by systematic fluctuations, with little or no sign of real variability. The analysis of variability in the flux densities of these two sources can thus be based on $x^{\mathrm{Data}}_{\mathrm{smooth}}$, and the $x^{\mathrm{Fast}}_{\mathrm{zero}}$ data sets can be discarded.
However, when a procedure is developed to avoid or correct for these systematic variations, the Ceduna program should be able to monitor variability on time scales of less than a day.
Example: PKS B1144-379 data
Figure 4.23 shows the above data processing procedure applied to PKS B1144-379 data from observing period 4. For this source, the short term fluctuations (bottom plot) appear to show quite different time scales, and may include some genuine variability in addition to systematic fluctuations.
Figure 4.23 Processing of PKS B1144-379 data for observing period 4.
The top plot shows the flux density data, $x^{\mathrm{Data}}$, and the M+5 = 14th-order smoothing polynomial (blue line) that produces the $x^{\mathrm{Data}}_{\mathrm{smooth}}$ data set (Step 1). The plot also shows the linear polynomial (red line) that produces the $x^{\mathrm{Data}}_{\mathrm{trend}}$ data set.
The next plot shows the zero-mean PKS B1934-638 calibrator data, and a (different) 14th-order polynomial through these data, which produces the $x^{\mathrm{Calibrator}}_{\mathrm{smooth}}$ data set.
The third plot shows the $x^{\mathrm{Data}}_{\mathrm{zero}}$ data set evaluated at the times associated with the $x^{\mathrm{Data}}$ data points. The $x^{\mathrm{Data}}_{\mathrm{zero}}$ data set is the basis for variability analysis. It is produced by using the $x^{\mathrm{Calibrator}}_{\mathrm{smooth}}$ data set to correct for fluctuations in the calibrator data (Step 2), and using the $x^{\mathrm{Data}}_{\mathrm{trend}}$ data set to remove the flux density trend in the $x^{\mathrm{Data}}$ data set (Step 3).
The bottom plot shows the $x^{\mathrm{Fast}}_{\mathrm{zero}}$ data set containing the fast (diurnal) signal component that is dominated by systematic fluctuations.
It is worth emphasising that fluctuations in the PKS B1934-638 calibrator data are minor. On time scales longer than a day, the $x^{\mathrm{Data}}_{\mathrm{smooth}}$ data set fluctuations are 0.2 Jy, while the $x^{\mathrm{Calibrator}}_{\mathrm{smooth}}$ data set fluctuations are about 1/10th the size.