The empirical analysis uses original Italian administrative data at the individual level, covering the period 1995-2012. From these data we recover, for each province: the bivariate distribution of marriages by cultural-ethnic group of the spouses; the population vectors by ethnic group and marital status; the fertility rates by ethnic group of the spouses; the separation rates by ethnic group of the spouses; the socialization probabilities by ethnic group of the spouses; and finally the population distribution by ethnic group and sex. A more in-depth discussion of the available data and sample selection is reported in Appendix B.2, while a synthetic description of the variables of interest is provided in Table B.1.
The empirical estimation is based on a unique quasi-longitudinal dataset that we constructed by matching marriage records with birth and separation records. The exact matching procedure exploits time-invariant dimensions of the records. The longitudinal structure of the data has two main advantages. First, it allows us to follow households over time, giving a complete representation of intra-household dynamic decisions, from marital choices to subsequent potential fertility and dissolution choices. Second, the matching process allows us to fix a particular time period characterized by increasing migration inflows. The final sample consists of 4,151,528 marriages, which cover 92.58% of the universe of marriages celebrated in Italy from 1995 to 2012. Homogamous Italian marriages account for 87.28% of marital unions, while the remaining share refers to marriages that involve at least one foreign spouse. First marriages account for 88.28% of the total sample. The comparison of the two marital distributions suggests that remarriage rates are not systematically different across spouses' ethnic groups. In the sample, the fertility rate is 69.56%, with an average of 1.54 children per family. Of all marriages, 7% end in separation in the first years of the marital union.
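To fix ideas, the exact matching step can be sketched as follows. This is only a minimal illustration under stated assumptions: the key fields (marriage date, spouses' birth dates and province) and all records are hypothetical, since the text does not list the actual time-invariant dimensions used for the match.

```python
from collections import defaultdict

# Hypothetical time-invariant key; the actual matching variables are not
# listed in the text and may differ.
KEY_FIELDS = ("marriage_date", "husband_birth_date", "wife_birth_date", "province")


def key_of(record):
    """Build the exact-matching key of a record from its time-invariant fields."""
    return tuple(record[field] for field in KEY_FIELDS)


def exact_match(marriages, births, separations):
    """Attach births and separations to each marriage by exact key equality."""
    births_by_key = defaultdict(list)
    for birth in births:
        births_by_key[key_of(birth)].append(birth)
    separated_keys = {key_of(sep) for sep in separations}

    households = []
    for marriage in marriages:
        key = key_of(marriage)
        households.append({
            "marriage": marriage,
            "n_children": len(births_by_key.get(key, [])),
            "separated": key in separated_keys,
        })
    return households


def rec(marriage_date, husband_birth_date, wife_birth_date, province, **extra):
    """Helper that builds an illustrative record (all values are made up)."""
    return {"marriage_date": marriage_date,
            "husband_birth_date": husband_birth_date,
            "wife_birth_date": wife_birth_date,
            "province": province, **extra}


couple_a = ("1998-05-16", "1970-01-02", "1972-03-04", "MI")
couple_b = ("2003-09-06", "1975-07-08", "1978-11-12", "RM")

marriages = [rec(*couple_a), rec(*couple_b)]
births = [rec(*couple_a, birth_year=2000), rec(*couple_a, birth_year=2003)]
separations = [rec(*couple_b, separation_year=2007)]

households = exact_match(marriages, births, separations)
for household in households:
    print(household["marriage"]["province"],
          household["n_children"], household["separated"])
```

Each household record then carries the marital choice together with the subsequent fertility and dissolution outcomes, which is the sense in which the dataset is quasi-longitudinal.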
We restrict our attention to legal marriages; cohabitations are not included in our sample. Although the cohabitation rate has increased over the last decade, data on cohabitation are very limited and rely only on Census data, which are available every ten years. Implicitly, we interpret the difference between legal marriage and cohabitation choices in light of the fact that marriage entails an additional degree of commitment, which is especially relevant for long-term investments such as children's socialization [Lundberg and Pollak (1993) and Chiappori, Salanié, and Weiss].
We derive the population vectors by ethnic group, sex and marital status from individual Italian Census data of 2001 and 2011. We select only adult unmatched individuals (more than 18 years of age). Census data classify the marital status of an individual as: never married, currently married, separated de facto, legally separated, divorced or widowed. Because the model allows for endogenous divorce choices, we consider an individual to be unmatched if she or he has never married or is legally separated, divorced or widowed. We take into account potential measurement error due to truncation of the unmatched population vectors. That is, the unmatched men and women observed in 2011 might well marry in future years, which leads to an underestimation of marital gains. Our analysis might be hampered in the presence of systematic differences across ethnic groups in marital rates over time, namely if marital rates are systematically higher for some ethnic groups than for others over the period. To shed light on this point, we compare the vectors of unmatched men and women by ethnic group in 2001 with those in 2011. We notice that unmatched rates increase quite symmetrically for all ethnic groups. The overall Spearman rank correlation is as high as 0.88, and equal to 0.57 and 0.98 for available adult men and adult women, respectively (see footnote 15), suggesting that the rank order of ethnic groups remains stable over the period, especially for women. In addition, following Chiappori, Salanié, and Weiss, we restrict the set of unmatched individuals to men and women past their marriageable age, defined as the 90th percentile of the age-at-marriage distribution for men and women, respectively, in 2001. Gains to marriage, computed from equation (2.7), are reported in Table B.5.
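The rank-stability check can be illustrated with a short sketch that computes the Spearman coefficient as the Pearson correlation between ranks, with ties receiving their average rank. The unmatched counts below are invented purely for illustration and are not the Census figures used in the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank values."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical unmatched counts for five ethnic groups (illustrative only).
unmatched_2001 = [120, 45, 300, 80, 60]
unmatched_2011 = [150, 70, 420, 95, 100]

rho = spearman(unmatched_2001, unmatched_2011)
print(round(rho, 2))
```

A coefficient close to one indicates that groups with relatively many unmatched individuals in 2001 still rank high in 2011, regardless of whether the relationship between the two vectors is linear.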
Socialization data come from the Condition and Social Integration of Foreign Nationals Survey (2011-2012). The survey targets foreign residents in Italy, with the aim of collecting essential information on their living conditions, behaviours, attitudes and opinions. We exclude from our analysis respondents who are not married and families without children at the time of the interview. The final sample consists of 17,512 individuals belonging to 4,996 families; 18.59% of those families are either separated or divorced. The survey is intended to provide a comprehensive representation of the socio-cultural as well as economic integration of foreign residents. In particular, we focus on the language spoken at home by parents with their children to recover socialization frequencies by spouses' ethnic group.
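One possible way to tabulate such frequencies is sketched below. It assumes, purely for illustration, that each family is coded by the ethnic groups of the two spouses and by a binary indicator of whether the origin language is spoken at home with the children; the field names, the coding and the records are hypothetical and need not coincide with the survey variables or with the definition adopted in the estimation.

```python
from collections import defaultdict


def socialization_frequencies(families):
    """Share of families, by (husband_group, wife_group) cell, that speak
    the origin language at home with their children."""
    totals = defaultdict(int)
    origin_speakers = defaultdict(int)
    for family in families:
        cell = (family["husband_group"], family["wife_group"])
        totals[cell] += 1
        if family["home_language"] == "origin":
            origin_speakers[cell] += 1
    return {cell: origin_speakers[cell] / totals[cell] for cell in totals}


# Hypothetical family records (illustrative only).
families = [
    {"husband_group": "A", "wife_group": "A", "home_language": "origin"},
    {"husband_group": "A", "wife_group": "A", "home_language": "origin"},
    {"husband_group": "A", "wife_group": "A", "home_language": "italian"},
    {"husband_group": "A", "wife_group": "Italian", "home_language": "italian"},
    {"husband_group": "A", "wife_group": "Italian", "home_language": "origin"},
    {"husband_group": "A", "wife_group": "Italian", "home_language": "italian"},
    {"husband_group": "A", "wife_group": "Italian", "home_language": "italian"},
]

frequencies = socialization_frequencies(families)
for cell, share in sorted(frequencies.items()):
    print(cell, round(share, 3))
```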
Our interest in intergenerational language transmission is twofold. First, linguistic socialization is a relevant cultural dimension for parents [Dustmann (1997), Ginsburgh and Weber (2011), Clots-Figueras and Masella (2013) and Fouka (2016)]. Second, it allows us to study the degree of convergence of migrants to the host socio-economic environment. Indeed, several studies uncover a positive association between proficiency in the destination language and migrants' socio-economic integration, favouring the educational achievement of children who lag behind during compulsory schooling and fostering employment and earning opportunities [Dustmann and Fabbri (2003) and Dustmann et al. (2010)]. We delve into this relationship in our data by looking at the correlation between our measure of linguistic socialization and different measures of the socio-cultural integration of children, such as the language spoken with schoolmates or with friends outside school, or the nationality of schoolmates and of friends outside school. Table B.7 shows that our measure of linguistic socialization, by capturing the persistence of migrants' cultural-ethnic identity, is negatively correlated with the measures of socio-cultural integration.
15 The Spearman rank correlation coefficient corresponds to the Pearson correlation between the rank values of the variables considered. It assesses the monotonic relationship between variables, without imposing any linear relationship.
Finally, we derive the population distribution by ethnic group and province for the period 1995-2012 from municipality records on the movements of the foreign resident population. Population shares by ethnic group and province are calculated using municipality data on the total resident population, aggregated at the province level. The maps in Figure B.3 display the geographical heterogeneity in the population distribution across marriage markets, for the overall migrant population (first map) and separately for all other ethnic groups considered in our analysis.
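The aggregation from municipalities to provinces can be sketched as follows. The municipality codes, the municipality-to-province mapping and all counts are hypothetical and serve only to illustrate the computation of the shares.

```python
from collections import defaultdict


def province_shares(foreign_by_municipality, residents_by_municipality, province_of):
    """Aggregate municipality counts to provinces and return the share of each
    ethnic group in the total resident population of its province."""
    foreign = defaultdict(int)  # (province, group) -> foreign residents
    total = defaultdict(int)    # province -> total residents
    for (municipality, group), count in foreign_by_municipality.items():
        foreign[(province_of[municipality], group)] += count
    for municipality, count in residents_by_municipality.items():
        total[province_of[municipality]] += count
    return {(province, group): count / total[province]
            for (province, group), count in foreign.items()}


# Hypothetical inputs (illustrative only).
province_of = {"m1": "P1", "m2": "P1", "m3": "P2"}
foreign_by_municipality = {
    ("m1", "A"): 50,
    ("m2", "A"): 150,
    ("m2", "B"): 100,
    ("m3", "A"): 30,
}
residents_by_municipality = {"m1": 1000, "m2": 3000, "m3": 600}

shares = province_shares(foreign_by_municipality,
                         residents_by_municipality, province_of)
for cell, share in sorted(shares.items()):
    print(cell, round(share, 3))
```

In this sketch the shares within a province do not sum to one, since the remainder corresponds to residents who do not belong to any of the listed foreign groups.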