Data analysis is a process of inspecting, cleaning, transforming and modelling data with the goal of highlighting useful information relevant to the phenomenon under investigation. The ultimate goal of the analysis is to provide the researcher with the knowledge required to answer the research questions.
I collected the data for this study over a sustained period of seven months, in close proximity to the specific situation and with a focus on naturally occurring, ordinary events in natural settings (Miles & Huberman, 1994, p. 10), which gave me a strong handle on the ‘daily life’ of WYM. Data collected in this manner are powerful for studying any process: we can go far beyond snapshots of ‘what?’ or ‘how many?’ to how and why things happen as they do, and even assess causality as it actually plays out in a particular setting (Miles & Huberman, 1994, p. 10).
However, there are a number of challenges facing the researcher when compiling field data of this nature. The challenges stem from the multiple data sources: some information comes from observations, some from interviews, some from structured ethnographic or elite sources, and some from archival records, questionnaires, surveys, videos, films, etc. (Miles & Huberman, 1994, p. 55). There is often information overload, and every researcher has to find a ‘golden mean’ in their pursuit of data, which lies somewhere between recording everything and recording too little (Yin, 2011, p. 156). In my case the data overload was certainly a challenge: there was at least one meeting each day, and sometimes two or more. As each meeting was captured on a data recorder, the challenge was to transcribe it as soon as possible, thereby capturing the exact words and staying as loyal as possible to what was said. Given the large number of meetings, had I not transcribed them at the time, my job would have been much harder, if not impossible, later on.
As is recommended in case study research (Eisenhardt & Graebner, 2007; Yin, 2009) the starting point in data analysis is to establish an analytic strategy. One might start with a research question (Yin, 2009, p. 128) and “play” with the data as described by Miles & Huberman (1994) who suggest:
Putting information into arrays
Making a matrix of categories and placing the evidence within such categories
Creating data displays – flowcharts and graphics – for examining the data
Tabulating the frequency of different events
Examining the complexity of such tabulations and their relationships
Putting information in chronological order
As my analytic strategy I used the conceptual framework explained in the literature review chapter and combined it with my research questions, which helped me to be selective about the data I analysed.
Every case study has a ‘story’ to tell. The story is not a fictional account, as it embraces real-life data. Questions such as (a) what are the distinctive features of this study, (b) how do the collected data relate to the research questions and (c) have any new insights emerged (Yin, 2011, p. 183) are always in the researcher’s mind and mark the entire analytic process. I developed a rich familiarity with WYM by synthesizing the data from the meetings, interviews and documents. The transcribed data amounted to 200 documents totalling 1,592 single-spaced pages, collected over seven months. I divided all transcribed data among five folders (daily meetings, weekly meetings, monthly meetings, observations and WYM company documents). More orderly data lead to stronger analyses and ultimately to more rigorous qualitative research (Yin, 2011, p. 182). As stated earlier, I took strategic and operational episodes as my unit of analysis, so the transcripts represented aggregated units of analysis; in this phase of my work, however, I sought to code paragraphs or sentences. I created paragraphs and sentences in the transcripts where I heard natural breaks (e.g., pauses or hesitations) in the discourse.
I analysed the transcripts using content analysis, a research technique used to determine the presence of words or concepts in collections of textual documents. It works by breaking down text into categories based on explicit rules of coding (Slater & Narver, 1995). The most common forms of content analysis are conceptual analysis and relational analysis. Conceptual analysis involves the detection of explicit and implicit concepts in the text. Relational analysis considers the relationships between concepts. As I mentioned previously, the vast quantity of text that I gathered over time would make a manual coding job impossible.
This is why I looked into using a software tool to make this job less stressful. In the following sections I briefly explain the first qualitative software that I used to analyse my data. I justify my initial choice of software, but then highlight its main drawback (it cannot prevent analytical bias) that eventually led me to look into other means of analysing my data. This led me to evaluate a second qualitative analysis software tool, which enabled me to reanalyse all my data in a systematic, comprehensive and unbiased fashion.
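The distinction between conceptual and relational analysis drawn above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the keyword lists, the prefix-matching rule and the sample segments are all hypothetical, chosen only to show that conceptual analysis records the presence of concepts while relational analysis records which concepts occur together.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical keyword lists (assumption): the real coding rules are not
# reproduced here. Concept names are taken from the conceptual framework.
CONCEPT_KEYWORDS = {
    "organizational learning": ["learn", "lesson", "feedback"],
    "uncertainty avoidance": ["risk", "uncertain", "contract"],
    "speed to market": ["deadline", "launch", "deliver"],
}

def concepts_in(segment: str) -> set[str]:
    """Return the concepts whose keywords appear in a text segment."""
    words = re.findall(r"[a-z]+", segment.lower())
    found = set()
    for concept, keywords in CONCEPT_KEYWORDS.items():
        # A keyword matches any word that starts with it (crude stemming).
        if any(w.startswith(k) for w in words for k in keywords):
            found.add(concept)
    return found

def conceptual_analysis(segments):
    """Count the number of segments in which each concept is present."""
    counts = Counter()
    for segment in segments:
        counts.update(concepts_in(segment))
    return counts

def relational_analysis(segments):
    """Count how often two concepts co-occur in the same segment."""
    pairs = Counter()
    for segment in segments:
        pairs.update(combinations(sorted(concepts_in(segment)), 2))
    return pairs

# Invented example segments, not taken from the WYM transcripts.
segments = [
    "We learned from the last launch that the deadline was too tight.",
    "The risk is that the supplier contract is uncertain.",
    "Customer feedback arrived after we delivered the prototype.",
]
```

Keyword matching of this kind only detects explicit concepts; implicit concepts still require the researcher's judgment, which is where the coding effort described in the following sections lies.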
3.5.1 NVivo qualitative analysis software
The first qualitative software that I used for my data analysis was NVivo 9, a software tool designed to analyse qualitative research data. NVivo acts as a data repository and organises data systematically into a form of database before formal analysis begins. The software also offers the ability to define a set of codes (concepts) and to allocate textual data to those codes. Bazeley (2007, pp. 2-3) recommends it as a tool to:
Manage and organize the data and keep track of the many messy records that go into making a qualitative project
Manage and organize ideas and provide rapid access to conceptual and theoretical knowledge that has been generated in the course of the study
Query data and have the software retrieve from its database all information relevant to determining an answer to the questions
Graphically model the data
Report from the data
During conventional coding analysis the researcher usually moves through five phases (Yin, 2011, p. 177): compiling, disassembling, reassembling, interpreting and concluding. The compiling phase is where the data are stored systematically inside a repository, ready to be analysed.
Following on from the compiling phase, the second analytic phase is the disassembling of the data. This is where coding occurs, that is, the assigning of new labels or codes to selected words, phrases or other chunks of data in a database (Yin, 2011, p. 187). Grounded theorists know this as open coding, where “the analyst is concerned with generating categories and their properties” (Strauss & Corbin, 1998, p. 143). Miles and Huberman (1994, p. 56) assert:
“Codes are tags or labels for assigning units of meaning to the descriptive or inferential information compiled during a study. Codes usually are attached to ‘chunks’ of varying size – words, phrases, sentences, or whole paragraphs, connected or unconnected to a specific setting.”
As a point of reference I considered all three literatures shown in Figure 2-2 and broke them down into their theoretical constructs. The constructs are ‘families’ of basic concepts. Hence the behavioural theory of the firm has the construct decision-making process, which consists of four basic concepts: quasi-resolution of conflict, organizational learning, problemistic search and uncertainty avoidance. Similarly, strategic product creation revolves around three constructs (central business strategy, product strategy and new product creation), which encapsulate concepts such as strategic flexibility, speed to market and real-time market research. Finally, the business sustainability outcomes construct draws together conceptualizations of optimum social and financial performance and minimum environmental impact. The constructs are all defined in the literature review.
Based on the constructs and their underlying concepts, I ended up with 15 theoretically defined concepts, all of which adhere to the following format:
a. Concept name – name of the concept
b. Concept definition – definition as given by theory from which it is derived
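The two-field format above maps naturally onto a small codebook structure. The following Python sketch is illustrative only: it lists the four concepts of the decision-making process construct named earlier, and the definition text is a placeholder rather than the wording given in the literature review. The names `Concept`, `CODEBOOK` and `concept_names` are assumptions introduced for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    name: str        # a. Concept name
    definition: str  # b. Concept definition, as given by the source theory

# Placeholder (assumption): the actual definitions are in the literature review.
PLACEHOLDER = "(definition as given in the literature review)"

# Concepts grouped by construct; only one construct is shown here.
CODEBOOK = {
    "decision-making process": [
        Concept("quasi-resolution of conflict", PLACEHOLDER),
        Concept("organizational learning", PLACEHOLDER),
        Concept("problemistic search", PLACEHOLDER),
        Concept("uncertainty avoidance", PLACEHOLDER),
    ],
}

def concept_names(codebook):
    """Flatten the codebook into a list of concept names."""
    return [c.name for concepts in codebook.values() for c in concepts]
```

Keeping the definition alongside the name means that anyone coding a passage has the theoretical definition to hand when deciding whether a code applies.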
The entire analytic process occurred over a period of time and was iterative rather than linear (Yin, 2011): I moved back and forth between data and coding in order to make sense of and interpret the phenomena in terms of the meanings that people bring to them.
Using NVivo I coded the data against the various concepts, which came either from the literature review or from ‘ad hoc’ concepts. In essence I sorted my data into various categories, which either emerged from the data or were extensions of existing theoretical frameworks (Lofland, Snow, Anderson, & Lofland, 2006, p. 200). This process was highly iterative, gradually building more codes or revisiting and refining the existing ones. It was also intended to increase the reliability of my findings through rechecking transcripts and cross-checking codes (Flick, 2007, p. 102). Once the data were all coded in NVivo, I used the software’s functions to query the data and ask questions of the coding, focusing on the issues that I wanted to explore further.
During the data collection and analysis phase, I often revisited and reviewed data records. Richards (2005, p. 79) advises revisiting all or part of the data many times during the analysis stage, because the researcher’s knowledge grows as the project progresses. Richards (2005, p. 97) also notes that revisiting the text coded at a category is highly beneficial, as the understanding of categories and concepts can change as the review of the data develops. This continuous iteration between data and codes revealed that the Level 1 codes could relate to each other or to the theory codes. I looked for keywords or patterns that would allow the number of codes to be reduced. This process is known as clustering, which is the forming of categories and the iterative sorting of things (events, actors, processes, settings, sites) into those categories (Miles & Huberman, 1994, p. 249). During this phase of analysis the goal was to move incrementally to an even higher conceptual level by recognising the categories within the Level 1 codes. Deciding what could be relabelled and what should be left alone was a crucial judgment call (Yin, 2011, p. 184); hence I gave careful consideration to merging codes. Looking for patterns while coding the data proved very useful. For instance, some keywords, themes or a specific context within the textual references might trigger the text to be coded against one or more codes. This exercise resulted in the clustering of all Level 1 codes into the theoretical codes.
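The clustering step can be pictured as a mapping from Level 1 codes to theory codes, followed by a query over the regrouped references. The sketch below is a simplified Python illustration and not a description of NVivo's internals: the Level 1 code names, document identifiers and the functions `cluster` and `query` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from Level 1 codes to theory codes (assumption).
CLUSTERS = {
    "lessons from past projects": "organizational learning",
    "post-meeting reviews": "organizational learning",
    "supplier risk": "uncertainty avoidance",
}

# Hypothetical coded references: (document id, Level 1 code).
references = [
    ("daily-001", "lessons from past projects"),
    ("daily-002", "supplier risk"),
    ("weekly-001", "post-meeting reviews"),
    ("weekly-001", "office layout"),  # no cluster assigned: left alone
]

def cluster(refs, clusters):
    """Group coded references under their theory code.

    Level 1 codes without a mapping are kept under their own name,
    mirroring the judgment call to leave some codes alone.
    """
    grouped = defaultdict(list)
    for doc_id, level1 in refs:
        grouped[clusters.get(level1, level1)].append((doc_id, level1))
    return dict(grouped)

def query(grouped, code):
    """Return the sorted ids of documents holding references to a code."""
    return sorted({doc_id for doc_id, _ in grouped.get(code, [])})

grouped = cluster(references, CLUSTERS)
```

Because each regrouped reference keeps its original Level 1 label, a merge can be revisited later without recoding the underlying text.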
When the research method involves codes and coding, the researcher is presented with an additional challenge: the risk of introducing bias into the results. This became a major issue for me when my supervisors and colleagues attempted to recode data samples to see whether they would arrive at conceptualisations similar to mine. Whilst there was some commonality, there was sufficient disagreement to make me question just how reliable my findings were. Hence, I first looked at length at the meaning of data reliability and for techniques that might help me demonstrate that my initial findings could be relied upon. I was able to identify support for much of what I had found, but was not fully confident. Ultimately I switched my analytical approach and software to, as one colleague put it, “take [me] out of the analysis”.
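The text does not state how the disagreement between coders was quantified. Purely as an illustration, two commonly used measures are simple percent agreement and Cohen's kappa, which corrects agreement for chance. The Python sketch below assumes two coders who each assign exactly one code to the same sequence of segments; the code labels and sample data are invented.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of segments to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same non-empty set of segments")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_observed = percent_agreement(coder_a, coder_b)
    n = len(coder_a)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability that both coders pick the same code
    # if each chose independently according to their own code frequencies.
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Invented example: OL = organizational learning, UA = uncertainty avoidance.
coder_a = ["OL", "OL", "UA", "UA", "OL", "UA"]
coder_b = ["OL", "UA", "UA", "UA", "OL", "OL"]
```

In this invented example the coders agree on four of six segments, yet kappa is only one third, which shows how raw agreement can overstate reliability when few codes are in play.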