The count probability distribution function (CPDF) and its moments have been used extensively to quantify the clustering pattern of galaxies (e.g. White 1979; Peebles 1980). In this Section we outline the counts-in-cells approach, explain how the volume averaged p-point correlations are derived from the CPDF, and give a brief theoretical background. A more comprehensive discussion of the counts-in-cells approach can be found in Bernardeau et al. (2002).
3.2.1 Estimating the p-point volume averaged correlation functions
The p-point moment, or (un-reduced) correlation function, $m_p(\mathbf{r}_1, \ldots, \mathbf{r}_p) \equiv \langle \delta(\mathbf{r}_1) \cdots \delta(\mathbf{r}_p) \rangle$, can be used to fully characterise the clustering of a fluctuating field $\delta(\mathbf{r})$. The reduced p-point correlation function, $\xi_p(\mathbf{r}_1, \ldots, \mathbf{r}_p)$, is defined as the connected part of the above p-point correlation, such that $\xi_p = 0$ for $p > 2$ in the case of a Gaussian field (see Bernardeau et al. 2002 for more details). Following the standard convention, for the remainder of this paper, whenever we refer to correlations we mean the “reduced” correlations.
The p-point volume averaged galaxy correlation function, $\bar{\xi}_p(V)$, can be written as the integral of the p-point correlation function, $\xi_p$, over the sampling volume, $V$ (Peebles 1980):

$$\bar{\xi}_p(V) = \frac{1}{V^p} \int_V d^3\mathbf{r}_1 \ldots d^3\mathbf{r}_p \, \xi_p(\mathbf{r}_1, \ldots, \mathbf{r}_p). \tag{3.1}$$

A practical way to estimate $\bar{\xi}_p(V)$ is to throw cells down at random within the galaxy distribution, recording the number of times a cell contains $N$ galaxies so as to build up the galaxy CPDF, $P_N(V)$. Since we adopt spherical cells, the CPDF is a function of the sphere radius, $R$,

$$P_N(R) = \frac{N_N}{N_T}, \tag{3.2}$$

where $N_N$ is the number of cells that contain $N$ galaxies out of a total number of cells thrown down, $N_T$. The volume averaged correlation functions $\bar{\xi}_p(V)$ are then related to the moments of the CPDF, $m_p$:

$$m_p(R) = \langle (N - \bar{N})^p \rangle = \sum_{N=0}^{\infty} P_N(R) \, (N - \bar{N})^p, \tag{3.3}$$

where $\bar{N}$ is the mean number of galaxies in a cell of volume $V$ and is calculated directly from the CPDF:

$$\bar{N} = \sum_{N=0}^{\infty} N P_N. \tag{3.4}$$
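The estimator in Eqs. 3.2 to 3.4 can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions that are not part of the analysis described here: the galaxies are uniform random points in a cubic box, survey boundaries and masks are ignored, and all function names (`count_in_spheres`, `cpdf`, `mean_count`, `central_moment`) are hypothetical.

```python
import random
from collections import Counter

def count_in_spheres(points, radius, n_cells, box_size, rng):
    """Throw n_cells random spheres into a cubic box and count the points in each."""
    r2 = radius * radius
    counts = []
    for _ in range(n_cells):
        # Assumption: keep centres one radius away from the box faces,
        # so that no edge correction is needed in this toy example.
        cx, cy, cz = (rng.uniform(radius, box_size - radius) for _ in range(3))
        n = sum(1 for (x, y, z) in points
                if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2)
        counts.append(n)
    return counts

def cpdf(counts):
    """P_N = N_N / N_T (Eq. 3.2), returned as a dict {N: P_N}."""
    n_total = len(counts)
    return {n: k / n_total for n, k in Counter(counts).items()}

def mean_count(p_n):
    """N-bar = sum_N N P_N (Eq. 3.4)."""
    return sum(n * p for n, p in p_n.items())

def central_moment(p_n, order):
    """m_p = sum_N P_N (N - N-bar)^p (Eq. 3.3)."""
    nbar = mean_count(p_n)
    return sum(p * (n - nbar) ** order for n, p in p_n.items())

# Toy data: 500 uniformly distributed points in a box of side 10 (arbitrary units).
rng = random.Random(42)
box = 10.0
galaxies = [(rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
            for _ in range(500)]
counts = count_in_spheres(galaxies, radius=2.0, n_cells=300, box_size=box, rng=rng)
p_n = cpdf(counts)
nbar = mean_count(p_n)
m2 = central_moment(p_n, 2)
```

For a real survey the cell placement would have to respect the survey geometry and completeness, which this sketch deliberately leaves out.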
For the case of a continuous distribution, $\bar{\xi}_p$ is related to the corresponding cumulant, $\mu_p$, through $\bar{N}^p \bar{\xi}_p = \mu_p$, where the cumulants are defined as (see Gaztañaga 1994 for details):

$$\begin{aligned}
\mu_2 &= m_2; & \mu_3 &= m_3, \\
\mu_4 &= m_4 - 3 m_2^2; & \mu_5 &= m_5 - 10 m_3 m_2.
\end{aligned} \tag{3.5}$$

If instead we are dealing with a discrete distribution, these relations must be corrected. A Poisson shot noise model is adopted (see Baugh et al. 1995 for a discussion of this point) to give corrected estimates of the moments, $k_p$:

$$\begin{aligned}
k_2 &= \mu_2 - \bar{N}; \qquad k_3 = \mu_3 - 3 k_2 - \bar{N}, \\
k_4 &= \mu_4 - 7 k_2 - 6 k_3 - \bar{N}, \\
k_5 &= \mu_5 - 15 k_2 - 25 k_3 - 10 k_4 - \bar{N}.
\end{aligned} \tag{3.6}$$

The volume-averaged correlation functions, calculated from the galaxy CPDF, follow directly from the relation $\bar{\xi}_p = k_p / \bar{N}^p$.
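The chain from central moments to shot-noise corrected correlations (Eqs. 3.5 and 3.6) is simple arithmetic, and a short Python sketch makes the order of operations explicit. The function names are hypothetical. As a sanity check, the example feeds in the central moments of a pure Poisson distribution, for which every corrected $\bar{\xi}_p$ should vanish.

```python
def cumulants_from_moments(m2, m3, m4, m5):
    """Cumulants mu_p from the central moments m_p (Eq. 3.5)."""
    return {2: m2, 3: m3, 4: m4 - 3 * m2 ** 2, 5: m5 - 10 * m3 * m2}

def shot_noise_corrected(mu, nbar):
    """Poisson shot-noise corrected moments k_p (Eq. 3.6)."""
    k2 = mu[2] - nbar
    k3 = mu[3] - 3 * k2 - nbar
    k4 = mu[4] - 7 * k2 - 6 * k3 - nbar
    k5 = mu[5] - 15 * k2 - 25 * k3 - 10 * k4 - nbar
    return {2: k2, 3: k3, 4: k4, 5: k5}

def volume_averaged_xi(k, nbar):
    """xi-bar_p = k_p / N-bar^p."""
    return {p: kp / nbar ** p for p, kp in k.items()}

# Check with a Poisson distribution of mean lam, whose central moments are
# m2 = lam, m3 = lam, m4 = lam + 3 lam^2, m5 = lam + 10 lam^2.
lam = 4.0
mu = cumulants_from_moments(lam, lam, lam + 3 * lam ** 2, lam + 10 * lam ** 2)
xi_bar = volume_averaged_xi(shot_noise_corrected(mu, lam), lam)
```

In this Poisson case all cumulants equal the mean, so each $k_p$ reduces to zero and the unclustered distribution returns no spurious correlation signal.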
3.2.2 Scaling of the higher order moments
In the hierarchical model of clustering, all higher-order correlations can be expressed in terms of the 2-point function, $\bar{\xi}_2$, and dimensionless scaling coefficients, $S_p$:

$$\bar{\xi}_p = S_p \, \bar{\xi}_2^{\,p-1}. \tag{3.7}$$

Traditionally, $S_3 = \bar{\xi}_3 / \bar{\xi}_2^2$ is referred to as the skewness of the distribution and $S_4 = \bar{\xi}_4 / \bar{\xi}_2^3$ as the kurtosis. The hierarchical scaling of the higher order moments arises from the evolution due to gravitational instability of an initially Gaussian distribution of density fluctuations (see Bernardeau et al. 2002 and references therein).
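Inverting Eq. 3.7 gives the hierarchical amplitudes directly from the volume averaged correlations. A minimal Python sketch follows; the function name and the numerical values are illustrative assumptions, not measurements.

```python
def hierarchical_amplitudes(xi_bar):
    """S_p = xi-bar_p / xi-bar_2^(p-1) for every p > 2 present in xi_bar (Eq. 3.7)."""
    xi2 = xi_bar[2]
    return {p: xi / xi2 ** (p - 1) for p, xi in xi_bar.items() if p > 2}

# Made-up volume averaged correlations, chosen so the arithmetic is exact.
s_p = hierarchical_amplitudes({2: 0.5, 3: 0.75, 4: 2.0})
# S_3 = 0.75 / 0.5^2 = 3.0 and S_4 = 2.0 / 0.5^3 = 16.0
```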
3.2.3 Systematic effects: biased estimators
In addition to sampling errors (see Section 3.3.3 below), the estimation of the hierarchical amplitudes can be compromised by systematic effects, as discussed in some detail by Hui & Gaztañaga (1999). These authors identified two sources of error that could lead to a systematic bias in the inferred values of $S_p$. The first effect arises from biases in the estimates of the higher order correlation functions themselves, known as the “integral constraint bias” (see e.g. Bernstein 1994). The second effect originates in the nonlinear combination of $\bar{\xi}_p$ and $\bar{\xi}_2$ to form $S_p$; this is called the “ratio bias”. The latter effect dominates on large scales and tends to cause the inferred values of the $S_p$ to be biased low. Hui & Gaztañaga wrote down expressions for these biases which accurately reproduce the systematic effects seen upon estimating the hierarchical amplitudes from sub-volumes extracted from N-body simulations.
As mentioned above, we will use different volume limited samples to study the luminosity dependence of the hierarchical amplitudes, $S_p$. As the luminosity that defines a sample is made brighter, the volume of the sample increases. Thus the estimation biases tend to cause the $S_p$ to increase with sample luminosity. This spurious tendency has already been reported in the literature (see Hui & Gaztañaga 1999). For volumes of the size used in our analysis, it turns out that the predicted biases are smaller than the corresponding sampling errors (e.g. see figure 3 in Hui & Gaztañaga 1999). This is the first time that a redshift survey has been available which is large enough to overcome such systematic biases.
3.2.4 Galaxy biasing
Galaxy samples constructed using different selection criteria display different clustering patterns. This leads one to the conclusion that distinct samples of galaxies must trace the underlying mass distribution in different ways, a phenomenon that is generally known as galaxy bias.
A simple, heuristic scheme describing the impact of a local bias on the scaling of the higher order moments was proposed by Fry & Gaztañaga (1993). These authors demonstrated that in this case, the scaling of the higher order moments of the galaxy distribution should mirror that of the dark matter, though possibly with different values for the hierarchical amplitudes $S_p$. Fry & Gaztañaga made the assumption that the density contrast in the galaxy distribution, $\delta_G$, i.e. the fractional fluctuation around the mean density, could be written as a Taylor expansion of the density contrast of the dark matter, $\delta_{DM}$:

$$\delta_G = \sum_{k=0}^{\infty} \frac{b_k}{k!} \left( \delta_{DM} \right)^k. \tag{3.8}$$
On scales where the variance, $\bar{\xi}_2^{DM}$, is small, the leading order contribution to the two-point volume averaged correlation function of galaxies has the form:

$$\bar{\xi}_2^{G} = b_1^2 \, \bar{\xi}_2^{DM}, \tag{3.9}$$
where $b_1$ is the ubiquitous linear bias $b$. The leading order forms for the hierarchical amplitudes, $S_p$, for the cases $p = 3$ and $p = 4$ are:

$$S_3^{G} = \frac{1}{b_1} \left( S_3^{DM} + 3 c_2 \right) \tag{3.10}$$

$$S_4^{G} = \frac{1}{b_1^2} \left( S_4^{DM} + 12 c_2 S_3^{DM} + 4 c_3 + 12 c_2^2 \right), \tag{3.11}$$

where we use the notation $c_k = b_k / b_1$. Expressions for the hierarchical amplitudes are given up to $p = 7$ in Fry & Gaztañaga (1993).
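To make the effect of the bias coefficients concrete, Eqs. 3.10 and 3.11 can be evaluated for a toy case. The sketch below is illustrative only: the function names are hypothetical and the input values are arbitrary, not taken from any simulation or survey.

```python
def galaxy_s3(s3_dm, b1, c2):
    """Leading order galaxy skewness, S3_G = (S3_DM + 3 c2) / b1 (Eq. 3.10)."""
    return (s3_dm + 3 * c2) / b1

def galaxy_s4(s4_dm, s3_dm, b1, c2, c3):
    """Leading order galaxy kurtosis (Eq. 3.11)."""
    return (s4_dm + 12 * c2 * s3_dm + 4 * c3 + 12 * c2 ** 2) / b1 ** 2

# Arbitrary dark matter amplitudes and bias parameters.
s3_g = galaxy_s3(s3_dm=3.0, b1=2.0, c2=0.5)                        # (3 + 1.5) / 2
s4_g = galaxy_s4(s4_dm=16.0, s3_dm=3.0, b1=2.0, c2=0.5, c3=0.0)    # (16 + 18 + 0 + 3) / 4
```

With $b_1 = 1$ and $c_2 = c_3 = 0$ both functions return the dark matter values unchanged, as expected for an unbiased tracer.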
Mo, Jing & White (1997) give theoretical predictions for the coefficients bk using the Press & Schechter (1974) formalism and exploiting the framework developed by Cole & Kaiser (1989) and Mo & White (1996). For halos of mass M, the first two bias factors
(k=1 and 2) are given by:
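$$b_1 = 1 + \frac{\nu^2 - 1}{\delta_c} \tag{3.12}$$

$$b_2 = 2 \left( 1 - \frac{17}{21} \right) \frac{\nu^2 - 1}{\delta_c} + \frac{\nu^2}{\delta_c^2} \left( \nu^2 - 3 \right) \tag{3.13}$$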
where $\nu \equiv \delta_c / \sigma(M)$, $\delta_c$ is the linear theory overdensity at the time of collapse ($\delta_c = 1.686$ for $\Omega = 1$) and $\sigma(M)$ is the linear rms fluctuation on the mass scale of the halos. This is a simple model, but it nevertheless reproduces some of the expected trends. For example, a typical mass halo corresponding to $\nu = 1$ displays an unbiased variance with $b_1 = 1$, but introduces a bias in the skewness, since $c_2 = b_2 \simeq -0.7$. As a further illustration, consider massive halos defined by $\nu^2 > 3$; in this case the Mo, Jing & White theory predicts that $c_2 > 0$, while less massive halos could produce $c_2 < 0$. To get more realistic values of $b_k$ for galaxy bias, a prescription has to be adopted for populating dark matter halos of a given mass with galaxies of a given luminosity (Benson et al. 2001; Scoccimarro et al. 2001; Berlind et al. 2003).
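The quoted values for typical and massive halos can be reproduced by evaluating Eqs. 3.12 and 3.13 directly. The following Python sketch assumes $\delta_c = 1.686$ (the $\Omega = 1$ value given above); the function names and the sample values of $\nu$ are illustrative choices.

```python
DELTA_C = 1.686  # linear theory collapse overdensity for Omega = 1

def halo_b1(nu, delta_c=DELTA_C):
    """Linear halo bias, b1 = 1 + (nu^2 - 1) / delta_c (Eq. 3.12)."""
    return 1 + (nu ** 2 - 1) / delta_c

def halo_b2(nu, delta_c=DELTA_C):
    """Second order halo bias (Eq. 3.13)."""
    return (2 * (1 - 17 / 21) * (nu ** 2 - 1) / delta_c
            + (nu ** 2 / delta_c ** 2) * (nu ** 2 - 3))

def halo_c2(nu, delta_c=DELTA_C):
    """c2 = b2 / b1."""
    return halo_b2(nu, delta_c) / halo_b1(nu, delta_c)

c2_typical = halo_c2(1.0)   # nu = 1: b1 = 1, so c2 = b2 = -2 / delta_c^2
c2_massive = halo_c2(2.0)   # nu^2 = 4 > 3: positive c2
c2_low = halo_c2(0.5)       # low mass halo: negative c2
```

For $\nu = 1$ the first term of $b_2$ vanishes, leaving $b_2 = -2/\delta_c^2 \approx -0.70$, which matches the value quoted in the text.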
In the interpretation of the higher order moments presented in this paper we will make use of a relative bias, which describes the change in clustering compared with that measured for a reference sample (Norberg et al. 2001, 2002a). Using Eq. 3.9 as a guide, we define the relative bias, $b_r = b_1 / b_1^*$, of a sample as the square root of the ratio of the 2-point correlation function measured for the sample over that measured for the reference sample, denoted by an asterisk (the reference sample will be defined explicitly in Section 3.5):

$$b_r \equiv \frac{b_1}{b_1^*} = \left( \frac{\bar{\xi}_2^{G}}{\bar{\xi}_2^{G*}} \right)^{1/2}. \tag{3.14}$$
Thus, we can obtain an estimate of the relative bias from the ratio of the variances. When the linear bias is a good approximation ($c_k \simeq 0$ for $k > 1$), we can relate $S_p^{G}$ in different galaxy samples regardless of the underlying DM value of $S_p$:

$$S_p^{G} = \frac{S_p^{G*}}{b_r^{\,p-2}}. \tag{3.15}$$
More generally, one can manipulate Eq. 3.10 to write down an expression comparing $S_p^{G}$ for two galaxy samples, eliminating $S_3^{DM}$ of the underlying dark matter (e.g. see Eq. 9 in Fry & Gaztañaga 1993). For the skewness:

$$S_3^{G*} = b_r S_3^{G} - 3 \, \frac{(c_2 - c_2^*)}{b_1^*}, \tag{3.16}$$
where an asterisk denotes a quantity describing the reference sample, and $b_r = b_1 / b_1^*$ is the relative bias defined above. Any second order relative bias effects are thus given by:

$$c_2' = \frac{(c_2 - c_2^*)}{b_1^*} = \frac{1}{3} \left( b_r S_3^{G} - S_3^{G*} \right). \tag{3.17}$$

As a special case, if the reference sample is un-biased (i.e. $b_1^* = 1$ and $c_p^* = 0$), we then have $c_2' = c_2$.
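The relative bias relations (Eqs. 3.14, 3.15 and 3.17) can be checked numerically against the local bias model of Eq. 3.10. The sketch below is a consistency check under assumed toy values, not a description of the measurement pipeline; all function names and input numbers are hypothetical.

```python
import math

def relative_bias(xi2, xi2_ref):
    """b_r = sqrt(xi2_G / xi2_G*) (Eq. 3.14)."""
    return math.sqrt(xi2 / xi2_ref)

def scaled_amplitude(sp_ref, b_r, p):
    """Linear bias prediction, S_p^G = S_p^G* / b_r^(p-2) (Eq. 3.15)."""
    return sp_ref / b_r ** (p - 2)

def relative_c2(s3, s3_ref, b_r):
    """c2' = (b_r S3_G - S3_G*) / 3 (Eq. 3.17)."""
    return (b_r * s3 - s3_ref) / 3

# Toy model: unbiased reference sample (b1* = 1, c2* = 0) and a sample
# with b1 = 2, c2 = 0.5, both tracing dark matter with S3_DM = 3.
s3_dm, xi2_dm = 3.0, 0.1
b1, c2 = 2.0, 0.5
b1_ref, c2_ref = 1.0, 0.0
s3 = (s3_dm + 3 * c2) / b1              # Eq. 3.10 for the sample
s3_ref = (s3_dm + 3 * c2_ref) / b1_ref  # Eq. 3.10 for the reference
b_r = relative_bias(b1 ** 2 * xi2_dm, b1_ref ** 2 * xi2_dm)  # uses Eq. 3.9
c2_prime = relative_c2(s3, s3_ref, b_r)
```

Here `c2_prime` recovers $(c_2 - c_2^*)/b_1^* = 0.5$, which equals $c_2$ because the reference sample in this toy setup is unbiased.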