The previous section explained how data were collected from the data sources of the study and evaluated the suitability of those sources for answering the research questions. This section describes the potential methods that could be used to analyse the dataset of the study. First, the measurement levels of the variables in the dataset are explained. Then, based on those measurement levels, the most suitable data analysis methods are selected from the potential methods.
A.1.13 Measurement-levels of the variables in the dataset
To select the suitable statistical analysis methods, the purpose of analysis and the measurement levels of the variables in the dataset under investigation should be considered (Yang, 2010). In general, there are three measurement levels in order of increasing sophistication, as follows:
1. Nominal or Categorical level: the lowest measurement level which defines mutually exclusive categories in a dataset, such as Gender, Postcode.
2. Ordinal or Rank level: as well as defining categories in a dataset, this measurement also ranks them in order, such as customers’ opinions on a three-point scale from Unsatisfied to Satisfied.
3. Interval or ratio level: the highest measurement level in which there are equal units of distance between categories in a dataset, such as age, weight (Healey, 2010; Sheskin, 2004).
In the following subsections, the measurement levels of the operational and financial variables of the study are explained.
Measurement-level of operational variables
Companies’ operational performance data was provided in percentage format. IMechE has an assessment guideline for its assessors to evaluate the applicant companies by the same standard. The assessment guideline recommends some ranges for each question to help assessors to give scores in the correct ranges. Then they need to use their own judgement to rank companies within those ranges. Table 3-6 shows an example from this guideline for question 1.1.2.
Table 3-6 Transformation of the scores into relative categories
Question 1.1.2: Identifying future customer requirements
Mark | Total mark | Percentage | Description | Category
0-7 | 20 | 0% - 35% | Just talks to customers occasionally. | 1
7-10 | 20 | 36% - 50% | Regular dialogue with customers and potential customers to gain feedback on future needs. | 2
10-17 | 20 | 51% - 85% | A well-structured process with several appropriate techniques. | 3
17-20 | 20 | 86% - 100% | Comprehensive process probably quoting at least one example of actions taken as a result of the process. | 4
As shown in Table 3-6, for question 1.1.2, if companies only occasionally talk to their customers to identify their future needs, then their scores should be between 0 and 35%. This score should increase gradually as companies use more sophisticated approaches to identifying their future customer needs. Within the given ranges, however, the assessors had to use their own judgement to give suitable scores to companies. To reduce the impact of the assessors’ judgement, the author used the recommended ranges in the IMechE’s assessment guideline to transform companies’ scores for each question into four sequential categories, from 1 (poorest) to 4 (best performing companies). Therefore, the operational performance data in this study is at the ordinal measurement level.
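The transformation of percentage scores into the four categories can be sketched as follows. This is a minimal illustration that assumes the percentage ranges of Table 3-6 apply; the function name and the sample scores are hypothetical, and a fractional score such as 35.5% is assumed to fall into the next category up.

```python
from bisect import bisect_left

# Upper percentage bounds of categories 1 to 3, taken from Table 3-6.
# Any score above 85% falls into category 4.
UPPER_BOUNDS = [35, 50, 85]


def score_to_category(percentage):
    """Map a question score (0-100%) to an ordinal category from 1 to 4."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    return bisect_left(UPPER_BOUNDS, percentage) + 1


# Hypothetical scores of four companies on question 1.1.2.
scores = [20, 36, 85, 90]
categories = [score_to_category(s) for s in scores]
print(categories)  # [1, 2, 3, 4]
```

Coding the scores in this way discards the differences within a range, which is the intended effect: two assessors who place the same company at 40% and 48% produce the same category.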
Measurement-level of financial variables
The author used a year-to-year analysis to study companies’ financial performance between two financial periods. Horngren, et al. (2012) suggest using Horizontal analysis, which is the study of changes in a variable in two similar periods. This involves finding the difference in a ratio’s value between two periods and dividing it by the base year value, which is the first year (Horngren, et al., 2012). However, on some occasions, the financial performance of companies is a negative number, and calculating the percentage difference between a positive and a negative number is meaningless. Also, when the value for the base year is not available, the percentage difference is not computable (Gibson, 2011). So, instead of the percentage difference, the author used a simple difference between the ratios in two periods. Companies’ financial ratios were in three different formats: Percentage, Whole numbers and Number of days. Table 3-7 shows the ratios of a sample company in five consecutive financial years at the interval measurement level.
Table 3-7 Ratios of a sample company
Category | Ratio | 2005 | 2006 | 2007 | 2008 | 2009
Competitiveness | Turnover growth rate (%) | 0% | 5% | 10% | 7% | -4%
Profitability | Return on shareholders’ funds (%) | 29% | 21% | 25% | 5% | 28%
Profitability | Return on capital employed (%) | 11% | 9% | 8% | 2% | 11%
Profitability | Return on total assets (%) | 7% | 7% | 5% | 1% | 9%
Profitability | Margin on sales (Profit margin) (%) | 7% | 6% | 5% | 1% | 6%
Profitability | Gross profit margin (%) | --- | --- | --- | --- | ---
Asset management | Net assets turnover (x) | 1.47 | 1.43 | 1.56 | 1.82 | 1.72
Asset management | Stock turnover (x) | 3.19 | 3.41 | 3.52 | 3.51 | 3.69
Liquidity | Accounts receivable (days) | 35.51 | 32.61 | 34.61 | 34.30 | 30.30
Liquidity | Accounts payable (days) | 28.39 | 21.87 | 29.14 | 22.61 | 20.41
Liquidity | Current ratio (x) | 2.25 | 2.98 | 2.47 | 2.27 | 3.09
Liquidity | Liquidity ratio (x) | 1.28 | 1.73 | 1.42 | 1.03 | 1.34
Debt management | Interest cover (Times-interest-earned) (x) | 6.84 | 6.28 | 4.30 | 1.37 | 5.66
Debt management | Leverage (Gearing) (Debt-to-equity ratio) (%) | 255% | 194% | 305% | 245% | 182%
Cash flow | Cash flows to sales (%) | 12% | 10% | 9% | 7% | 12%
Cash flow | Cash Flow/Total debt (%) | 8% | 8% | 6% | 6% | 11%
Cash flow | Cash flow yield (%) | 212% | 229% | 224% | 785% | 207%
Cash flow | Cash flow (x) | 3,031 | 2,741 | 2,790 | 2,095 | 3,536
Key: % = Percentage; x = Whole number; days = Number of days. Each ratio is classed as either "the higher this ratio, the better for the business" or "the lower this ratio, the better for the business".
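The contrast between the percentage difference of horizontal analysis and the simple difference adopted here can be sketched as follows. The function names are hypothetical, and returning None for a zero base value is an added assumption (the division is undefined); the sample values are the turnover growth rate and the return on capital employed from Table 3-7.

```python
def percentage_change(base, current):
    """Horizontal analysis: change relative to the base-year value, in percent.

    Returns None where the result is not computable or not meaningful:
    a missing value, a zero base, or values of opposite sign.
    """
    if base is None or current is None:
        return None
    if base == 0 or base * current < 0:
        return None
    return (current - base) / base * 100


def simple_difference(base, current):
    """Plain difference between two periods; None if a value is missing."""
    if base is None or current is None:
        return None
    return current - base


# Turnover growth rate: 0% in 2005, 5% in 2006, -4% in 2009.
print(percentage_change(0, 5), simple_difference(0, 5))    # None 5
print(percentage_change(5, -4), simple_difference(5, -4))  # None -9
# Return on capital employed: 9% in 2006, 8% in 2007.
print(simple_difference(9, 8))                             # -1
```

The simple difference remains defined in the first two cases, where the percentage change does not.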
However, the purpose of the study is to find the operational practices of the companies that have a positive impact on their financial performance. As in the earlier studies reviewed in the last chapter (section 2.3), only the improvement or deterioration of the financial performance was important, not the scale of improvement. The author therefore converted the financial performance of the companies to a categorical level with only two possible outcomes. If a ratio had improved compared to the year of participation in the awards, the outcome was ‘1’, and if the ratio had worsened, the outcome was ‘0’. For example, Table 3-8 shows the ratios of the sample company in Table 3-7 converted to the categorical level. For this company, the year of participation is 2006, so all of its ratios in the following years are compared with its performance in 2006. In summary, the operational variables of the study were at the ordinal level and the financial variables were at the categorical level.
Table 3-8 Coded ratios for a sample company
Category | Ratio | 2006 | 2007 | 2008 | 2009
Competitiveness | Turnover growth rate (%) | 1 | 1 | 1 | 0
Profitability | Return on shareholders’ funds (%) | 0 | 1 | 0 | 1
Profitability | Return on capital employed (%) | 0 | 0 | 0 | 1
Profitability | Return on total assets (%) | 0 | 0 | 0 | 1
Profitability | Margin on sales (Profit margin) (%) | 0 | 0 | 0 | 1
Profitability | Gross profit margin (%) | --- | --- | --- | ---
Asset management | Net assets turnover (x) | 0 | 1 | 1 | 1
Asset management | Stock turnover (x) | 1 | 1 | 1 | 1
Liquidity | Accounts receivable (days) | 1 | 0 | 0 | 1
Liquidity | Accounts payable (days) | 0 | 1 | 1 | 0
Liquidity | Current ratio (x) | 1 | 0 | 0 | 1
Liquidity | Liquidity ratio (x) | 1 | 0 | 0 | 0
Debt management | Interest cover (Times-interest-earned) (x) | 0 | 0 | 0 | 0
Debt management | Leverage (Gearing) (Debt-to-equity ratio) (%) | 1 | 0 | 0 | 1
Cash flow | Cash flows to sales (%) | 0 | 0 | 0 | 1
Cash flow | Cash Flow/Total debt (%) | 0 | 0 | 0 | 1
Cash flow | Cash flow yield (%) | 1 | 0 | 1 | 0
Cash flow | Cash flow (x) | 0 | 1 | 0 | 1
Key: % = Percentage; x = Whole number; days = Number of days. Each ratio is classed as either "the higher this ratio, the better for the business" or "the lower this ratio, the better for the business".
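The coding rule behind Table 3-8 can be sketched as follows for the years after participation. The function names are hypothetical. Two points are assumptions rather than statements from the text: an unchanged ratio is coded 0, and fewer accounts receivable days are treated as an improvement. The input values are taken from Table 3-7.

```python
def code_ratio(base_value, later_value, higher_is_better=True):
    """Return 1 if the ratio improved on the base year, otherwise 0."""
    if higher_is_better:
        return int(later_value > base_value)
    return int(later_value < base_value)


def code_series(values_by_year, base_year, higher_is_better=True):
    """Code every year after base_year against the base-year value."""
    base = values_by_year[base_year]
    return {
        year: code_ratio(base, value, higher_is_better)
        for year, value in sorted(values_by_year.items())
        if year > base_year
    }


# Values from Table 3-7; the year of participation is 2006.
return_on_shareholders_funds = {2006: 21, 2007: 25, 2008: 5, 2009: 28}
accounts_receivable_days = {2006: 32.61, 2007: 34.61, 2008: 34.30, 2009: 30.30}

print(code_series(return_on_shareholders_funds, 2006))
# {2007: 1, 2008: 0, 2009: 1}

# Assumption: a lower number of receivable days is better for the business.
print(code_series(accounts_receivable_days, 2006, higher_is_better=False))
# {2007: 0, 2008: 0, 2009: 1}
```

Both results agree with the 2007 to 2009 columns of Table 3-8 for these two ratios.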
A.1.14 The potential data analysis methods
The potential statistical methods considered for use in this study are now discussed. Correlation and regression analyses are the two most commonly suggested statistical methods for analysing the relationship between two or more variables; for example, Field (2005), Kirk (2008) and Urdan (2010) all suggest them. Also, as stated in the previous chapter (section 2.3.2.4), these two methods were the most common statistical methods used in the earlier studies reviewed in this study. Therefore, these two methods were selected for this study, and the following subsections explain the potential approaches for finding the correlation and regression between its variables. Using similar data analysis methods also helps to compare the findings of this study with those of the earlier studies, because it reduces the impact that differences in the analysis methods could have on the identified relationships.
Correlation analysis can be used to explain the degree to which two variables are related to each other (Field, 2005) or, as argued by Urdan (2010), the strength of the relationship between two variables. For example, in this study, correlation analysis can be used to explain whether companies with an improved financial performance were also strong in their operational practices. After finding correlated pairs of variables, regression analysis can be used to find the dependence of an outcome variable on one or more predictor variables (Field, 2005). Regression analysis can therefore explain the nature of the relationship between the variables (Urdan, 2010). In this study, regression analysis can be used to show the dependence of the companies’ financial performance on their operational practices. The potential approaches to correlation and regression analysis that can be used in this study are discussed below.
In addition, when there is a large number of variables in a study, one common method to reduce the number of variables is factor analysis (Urdan, 2010). In this study, there are ninety operational predictor variables (appendix 2) and eighteen financial outcome variables (appendix 3). The purpose of the study is to find the impact of the individual operational variables on individual financial variables. However, factor analysis is also considered in order to find aggregated factors of operational practices, so that many operational practices can be analysed simultaneously.
Therefore, the potential approaches for conducting factor analysis are discussed first. This is followed by a discussion of the potential approaches for performing correlation and regression analyses.
Factor Analysis (Principal Component Analysis)
There are two main approaches to factor analysis, namely the Principal Component Analysis (PCA) and the Common Factor Analysis (Malhotra & Birks, 2006). The purpose of the PCA is to keep as much as possible of the variation in a dataset (Jolliffe, 2002). The purpose of the Common Factor Analysis is to find the common variances in a dataset (Malhotra & Birks, 2006). Therefore, when the purpose of the analysis is to find the minimum number of factors that explain the maximum variance in the dataset, the PCA is recommended. Otherwise, when the purpose of the analysis is to find the underlying dimensions in a dataset, then the Common Factor Analysis is suggested (Malhotra & Birks, 2006).
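The idea of keeping as much of the variation as possible can be illustrated with a small sketch of standard PCA for two interval-level variables. It is an illustration under stated assumptions only: the data are hypothetical, the function names are not from any library, and the closed-form eigenvalues used here apply to the two-variable case only.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def explained_variance_ratios(xs, ys):
    """Share of the total variance carried by each principal component.

    For two variables the covariance matrix is [[a, b], [b, c]]; its
    eigenvalues are the variances of the two principal components.
    """
    a, b, c = covariance(xs, xs), covariance(xs, ys), covariance(ys, ys)
    centre = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    first, second = centre + spread, centre - spread
    total = a + c
    return first / total, second / total


# Hypothetical measurements of two related variables.
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 3.0, 2.0, 5.0]
first_share, second_share = explained_variance_ratios(xs, ys)
print(round(first_share, 3), round(second_share, 3))
```

The more strongly the two variables co-vary, the closer the first share is to 1, meaning that a single component can replace both variables with little loss of variation.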
In this study, before starting the analysis, the dimensions of the variables are known (i.e. the group of operational questions from the MX Awards to test each of the twenty hypotheses of the study). Therefore, the PCA is suitable for reducing the dataset to the least number of variables. However, there are two approaches for performing the PCA:
1- Exploratory Factor Analysis (EFA) and
2- Confirmatory Factor Analysis (CFA)
Both approaches can be employed to find the latent factors that explain the variation or co-variation among a set of variables. Similarly, both approaches rely on the same statistical estimation methods, such as maximum likelihood (Brown, 2006). However, in the EFA, researchers are not expected to specify, before starting the analysis, the number and the dimensions of the factors that need to be extracted from the dataset (Thompson, 2004). Conversely, to perform the CFA, researchers are expected to specify those expectations (Brown, 2006). Therefore, the CFA needs a strong theoretical background, which is often gained from past studies or from an EFA procedure in the earlier stages of research (Brown, 2006).
In this study, the operational practices are classified under twenty hypotheses, and the analysis only needs to confirm whether the questions under each hypothesis have similar co-variations. Therefore, the CFA is a suitable approach for this study.
In addition, the standard format of the PCA is only suitable for variables that are measured at the interval or ratio measurement levels, and it is not suitable for categorical variables (Linting et al., 2007). For categorical variables, the Categorical Principal Component Analysis (CATPCA) has been developed, which uses an optimal scaling process to transform the category labels of the variables into numerical values while keeping the maximum variation among them (Linting & Van der Kooij, 2012). Since the predictor variables in this study are categorical, the CATPCA was selected for data reduction in this study.
The purpose of PCA is to find a subset of variables that measure different facets of the same underlying dimension (Field, 2005). Most of the reviewed studies in chapter two (section 2.3.2.4) have used the PCA. This is mainly to reduce the number of their studied variables into smaller sets of variables that represent most of the information in the original variables. An example of using the PCA for data reduction can be found in studies by Saunila et al. (2014) and Hofer et al. (2012). In addition, some studies have used PCA to compare the combined impact of their variables with the impact of the individual variables. For example, Fullerton et al.’s (2003) study found that the aggregated indicator of their studied quality measures had no significant relationship with the firms’ profitability. However, some of the individual measures in that study, such as waste-reduction practices, had a positive influence on their profitability.
In this study, PCA was used to reduce the number of explanatory variables (operational practices), so that many operational practices could be analysed simultaneously. However, as explained in the next chapter (section 4.2.2.3), there were two problems with the result of this analysis. Therefore, the result of the analysis was not used in the correlation and regression analyses.
Correlation Analysis methods
As explained by Urdan (2010), Correlation Analysis can be used to find the strength of the relationship between two variables. Table 3-9 shows the potential correlation analysis methods suggested by Brown (2014) and Bryman & Cramer (2001) based on the variables’ measurement levels.
Table 3-9 Choice of correlation analysis methods based on data measurement levels
Variable 1 | Variable 2 | Statistical significance | Strength of association
Interval | Interval | Pearson product-moment correlation (linear) or Spearman correlation (non-linear) | Coefficient of determination r²
Ordinal | Interval | Spearman correlation | Kendall's tau-b or Kendall's tau-c
Ordinal | Ordinal | Spearman correlation | Kendall's tau-b or Kendall's tau-c
Nominal | Interval | Analysis of variance | Cramer's V or Phi test
Nominal | Ordinal | Chi-square test for independence / Fisher's exact test | Cramer's V or Phi test
Nominal | Nominal | Chi-square test for independence / Fisher's exact test | Cramer's V or Phi test
Variable 1 and Variable 2 give the combination of variables; the last two columns give the potential correlation analysis methods.
In the following subsections, the suitability of each of the potential Correlation Analysis methods for the dataset of this study will be explained.
Pearson product-moment correlation:
The Pearson correlation is the most commonly used method of analysing correlation. It measures the strength and direction of the relationship between variables (Gravetter & Wallnau, 2013). To apply this method, a dataset should meet the following four basic assumptions:
1. Both independent and dependent variables should have continuous values measured on an interval scale.
2. The dataset should represent a random sample of the population of interest.
3. The relationship between the response item and predictor items should be linear.
4. The dataset should follow a normal distribution (Lehman, et al., 2005).
The predictor variables of this study (operational questions from the MX survey) are ordinal; therefore, they fail to meet the first assumption of this method. Thus, this correlation analysis method is not suitable for the selected sample.
Spearman correlation:
The Spearman correlation is a special form of the Pearson correlation, suitable for variables at the ordinal measurement level (Urdan, 2010). It is also a suitable choice when a dataset violates the assumptions of the Pearson correlation, for example when the data are not normally distributed (Field, 2005). On these two grounds, the method appears suitable for correlation analysis in this study. However, the method also requires uniformity of the dataset. Uniformity refers to the simultaneous increase or decrease of the two variables under analysis (Kirk, 2008), that is, a monotonic relationship between them.
In this study’s dataset, improvement in the operational performance variables did not always coincide with improvement in the financial variables. Therefore, the dataset failed to meet this assumption of the Spearman correlation. In short, the selected dataset met the assumptions of neither the Pearson nor the Spearman correlation analysis method.
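The relationship between the two coefficients can be sketched as follows: the Spearman coefficient is the Pearson coefficient computed on ranks, with tied values sharing their average rank. The function names and the sample data are hypothetical, and the sketch assumes that neither variable is constant.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: ordinal categories (1-4) and coded outcomes (0/1).
operational = [1, 2, 2, 3, 4, 4]
financial = [0, 0, 1, 0, 1, 1]
print(round(spearman(operational, financial), 3))
```

Because only the ranks enter the calculation, any strictly increasing relationship gives a coefficient of 1, whether or not it is linear.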
Chi-square test for independence and Fisher’s exact test:
Howell (2010) suggests using the Chi-square test for analysing the correlation between ordinal and nominal variables, as in this study. The test compares the number of observed cases in each category with the number of expected cases in that category and states whether the observed numbers are significantly different from the expected numbers (Howell, 2010). There are two assumptions for using this method. First, the collected records should have no influence on one another (Urdan, 2005); the data in this study meet this assumption. Second, the Chi-square test is not suitable for samples in which the expected number of cases in a category is fewer than five. For these samples, Fisher’s exact test is more suitable (Urdan, 2005). For some variables of the dataset, the expected number was fewer than five; therefore, the author used Fisher’s exact test instead.
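The check on expected numbers and the exact test can be sketched as follows for a table with any number of rows and two columns, which matches four operational categories against two financial outcomes. This is a minimal sketch, not the software procedure used in the study: the function names and the counts are hypothetical, and the two-sided p-value is taken as the total probability of all tables with the same margins that are no more probable than the observed one.

```python
from math import comb


def expected_counts(table):
    """Expected cell counts under independence for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def fisher_exact_rx2(table):
    """Two-sided exact p-value for an r x 2 table with fixed margins."""
    row_totals = [sum(row) for row in table]
    first_col_total = sum(row[0] for row in table)
    denominator = comb(sum(row_totals), first_col_total)

    def weight(first_col):
        # Unnormalised probability of a table, given its first column.
        w = 1
        for total, cell in zip(row_totals, first_col):
            w *= comb(total, cell)
        return w

    observed = weight([row[0] for row in table])

    def total_weight(index, remaining, chosen):
        # Enumerate every first column that reproduces the margins.
        if index == len(row_totals):
            if remaining != 0:
                return 0
            w = weight(chosen)
            return w if w <= observed else 0
        return sum(
            total_weight(index + 1, remaining - cell, chosen + [cell])
            for cell in range(min(row_totals[index], remaining) + 1)
        )

    return total_weight(0, first_col_total, []) / denominator


# Hypothetical counts: rows are categories 1-4, columns are (worsened, improved).
table = [[3, 1], [2, 2], [1, 4], [0, 5]]
too_small = any(e < 5 for row in expected_counts(table) for e in row)
print(too_small)  # True: at least one expected number is below five
print(round(fisher_exact_rx2(table), 4))
```

Working with integer weights avoids rounding problems when the probability of a candidate table is compared with that of the observed table.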
The Chi-square test and Fisher’s exact test only test the statistical significance of a relationship and do not measure its strength. Cramer’s V and Phi tests are two common methods for measuring the strength of the relationship between nominal variables (Morgan, et al., 2004). Morgan et al. (2004) suggest using the Phi test for two variables when each has only two categories and Cramer’s V for more than two categories. Since the operational performance data has more than two categories, Cramer’s V was suitable for this study.
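The strength measure can be sketched as follows, using the usual definition of Cramer's V as the square root of the Chi-square statistic divided by the sample size times one less than the smaller table dimension. The function names and the counts are hypothetical, and every row and column is assumed to contain at least one case.

```python
import math


def chi_square_statistic(table):
    """Pearson Chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def cramers_v(table):
    """Cramer's V: 0 means no association, 1 means perfect association."""
    n = sum(sum(row) for row in table)
    smaller_dimension = min(len(table), len(table[0]))
    return math.sqrt(chi_square_statistic(table) / (n * (smaller_dimension - 1)))


# Hypothetical counts: rows are categories 1-4, columns are (worsened, improved).
table = [[3, 1], [2, 2], [1, 4], [0, 5]]
print(round(cramers_v(table), 3))
```

For a table with two rows and two columns the same formula gives the absolute value of the Phi coefficient, which is why the two measures are usually presented together.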
Regression Analysis methods
Based on the measurement levels of the variables, either linear or logistic regression analysis can be used. Linear regression analysis is for variables at the interval measurement level. Binomial logistic regression is suitable for predicting dichotomous response items based on one or more independent predictor items (Kleinbaum & Klein, 2010). The response items in this study, which are financial ratios, are divided into two distinct categories (Improved and Worsened); therefore, logistic regression fits the dataset. Based on the number of predictor items, regression analysis is either simple or multiple. Multiple regression analysis predicts the response items based on more than one predictor. Simple regression analysis predicts the response variable based on only one independent predictor (Marczyk, et al., 2005).
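A simple binomial logistic regression with one predictor can be sketched as follows. Note that this sketch uses ordinary maximum-likelihood estimation by Newton-Raphson, not an exact method for small samples. The function name and the data are hypothetical, and the sketch assumes that the outcomes are not perfectly separated by the predictor, since the estimates do not exist in that case.

```python
import math


def fit_simple_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p        # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w           # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b1


# Hypothetical data: operational category (1-4) and coded outcome (0/1).
categories = [1, 1, 2, 2, 3, 3, 4, 4]
improved = [0, 0, 0, 1, 0, 1, 1, 1]
intercept, slope = fit_simple_logistic(categories, improved)
print(round(intercept, 3), round(slope, 3))
print(round(math.exp(slope), 3))  # odds ratio per one-category increase
```

A positive slope means that the odds of an improved ratio rise with each step up in the operational category; the exponentiated slope expresses that change as an odds ratio.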
Linear regression assumes a linear link between the dependent and independent items, but there is no such assumption in logistic regression (Allison, 1999). Linear regression predicts the variation of a dependent item for every unit change in the independent items, whereas logistic regression estimates the likelihood of an event happening or not happening in the response item (Allison, 1999). As with the Chi-square test in the previous subsection, logistic regression assumes a large sample size. Exact logistic regression is a variation of Fisher’s exact test in regression analysis for small samples (Hosmer & Lemeshow, 2000). Therefore, the author used exact logistic regression for the regression analysis of this study.
Overall, this section has discussed the potential methods for conducting the statistical analysis of this study. First, the Categorical Principal Component Analysis (CATPCA) was selected to find the aggregated factors of operational practices. This is because the individual operational practices might not have a direct impact on the companies’ financial results and might need to be combined with other operational practices. By using the CATPCA, the author therefore tried to increase the chance of finding the impact of the companies’ operational practices on their financial results.
Following the CATPCA, Fisher’s exact test was selected to find the correlated pairs of operational practices and financial results. Finally, for each of the identified correlated pairs of operational and financial variables, exact logistic regression was selected to find the dependence of the companies’ financial results on their operational variables.