CHAPTER 3: VALIDATION AND PRACTICAL APPLICATION OF THE PROPOSAL
3.1 VALIDATION OF THE PROPOSED MODEL BY A THEORETICAL METHOD
Over the past decade, a growing number of transit agencies have adopted AFC systems to replace traditional paper tickets and improve the management of passenger fare collection (Bagchi and White, 2005; Pelletier et al., 2011). In an AFC system, smart card fare collection usually relies on a group of devices within a UPT system, typically including the smart cards themselves (i.e., credit-card-sized plastic cards embedded with memory chips), on-board card readers, on-board GPS trackers and a central server. A smart card commonly stores a unique card ID (e.g., a series of numbers), the card type (e.g., adult, child or senior card user) and the fare deposit balance. The central server stores service information, such as routes and schedules, which is regularly transferred to the on-board card readers. When boarding (and sometimes alighting) a transit vehicle, a passenger touches his or her smart card to an on-board card reader, which verifies the card and service information. Each touch generates a transaction record that is stored in the central server. A smart card record normally contains the date, card ID, route ID, route direction, boarding stop ID and time (and sometimes alighting stop ID and time), vehicle ID and employer ID.
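The record structure described above can be sketched as a simple data type. This is a minimal illustration only: the field names, the string types and the choice to merge date and time into a single timestamp are assumptions made for this example, not the schema of any particular AFC system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class SmartCardRecord:
    """One fare transaction; all field names are hypothetical."""
    card_id: str
    route_id: str
    direction: str
    boarding_stop_id: str
    boarding_time: datetime          # date and time combined (assumption)
    vehicle_id: str
    employer_id: str
    # Many systems record boardings only, so alighting fields are optional.
    alighting_stop_id: Optional[str] = None
    alighting_time: Optional[datetime] = None


def has_alighting(record: SmartCardRecord) -> bool:
    """True when both the alighting stop and the alighting time were captured."""
    return record.alighting_stop_id is not None and record.alighting_time is not None


example = SmartCardRecord(
    card_id="0001", route_id="R1", direction="inbound",
    boarding_stop_id="S10", boarding_time=datetime(2013, 3, 4, 8, 0),
    vehicle_id="V7", employer_id="E1",
)
```

Keeping the alighting fields optional makes the boarding-only limitation explicit in the type, so later steps have to handle missing alighting information deliberately.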
Although smart cards were introduced as a new mechanism for fare charging, the transaction records they collect (i.e., smart card data) provide detailed and continuous travel itineraries of UPT passengers. As such, smart card data offer a much richer data source than travel survey data for studying UPT passenger travel behaviour and related activities. This enables the investigation of UPT passenger travel behaviour at more disaggregated levels (e.g., trip-based, activity-based) and over longer temporal ranges (e.g., months to years) (Pelletier et al., 2011; Yue et al., 2014).
A number of studies have utilised smart card data to investigate the behavioural dynamics of UPT passengers across the world. Some focused on the spatial and temporal variability of UPT passenger boarding behaviour, using a variety of techniques. Using cluster analysis, Agard et al (2006) and Morency et al (2006; 2007) detected the temporal boarding patterns of different card holder groups in Gatineau, Canada. By counting the number of distinct boarding stops used by each passenger, Morency et al (2006; 2007) further explored the spatial variety of card users’ boarding behaviour.
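The stop-counting idea can be illustrated with a few lines of code. The sketch below is not the cited authors' implementation; it assumes boardings are available as hypothetical (card ID, stop ID) pairs and simply counts the distinct stops used by each card.

```python
from collections import defaultdict


def distinct_boarding_stops(boardings):
    """Count the distinct boarding stops per card.

    boardings: iterable of (card_id, stop_id) pairs (assumed input format).
    Returns a dict mapping card_id to the number of different stops used.
    """
    stops_by_card = defaultdict(set)
    for card_id, stop_id in boardings:
        stops_by_card[card_id].add(stop_id)   # repeated stops are counted once
    return {card: len(stops) for card, stops in stops_by_card.items()}


sample = [("A", "s1"), ("A", "s1"), ("A", "s2"), ("B", "s3")]
variety = distinct_boarding_stops(sample)
```

A card with a low count relative to its number of boardings shows spatially regular behaviour, whereas a high count suggests more varied use of the network.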
Focusing on the same study context, Chu and Chapleau (2010) applied a GIS-based visualisation method to reveal the anchor points of student card holders, where recurring boarding behaviours were evident. Park et al (2008) examined temporal boarding patterns across different transit modes in Seoul, South Korea. In another related study, Nishiuchi et al (2013) compared the recurrence of boarding times and stops to identify the spatial-temporal consistency of rail transit passengers in Kochi City, Japan, highlighting that the travel patterns of student passengers were more consistent than those of other groups (e.g., adult, senior passengers).
In addition to passengers’ boarding behaviour, many researchers have also estimated the transfer stops and times of UPT passengers, shedding light on the more complicated phenomenon of transfer behaviour and linked trips within the UPT context. Rule-based algorithms are commonly applied to this problem. Drawing on empirical data, many researchers used fixed time constraints (e.g., 30 minutes, 2 hours) to estimate alighting stop and time, and thereby link trips into journeys, e.g., Bagchi and White (2005), Utsunomiya et al (2006), Hofmann and O'Mahony (2005) and Jang (2010).
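A minimal sketch of the fixed-time rule is given below. It assumes that only boarding times are available for one card, and that two consecutive boardings belong to the same journey when the gap between them does not exceed a threshold. The function name, the input format and the 30-minute default are illustrative assumptions, not the rule used by any one of the cited studies.

```python
from datetime import datetime, timedelta


def link_trips(boarding_times, max_gap=timedelta(minutes=30)):
    """Group one card's boarding times into journeys with a fixed time threshold.

    A boarding joins the current journey when it occurs within max_gap of the
    previous boarding; otherwise it starts a new journey.
    """
    journeys = []
    for t in sorted(boarding_times):
        if journeys and t - journeys[-1][-1] <= max_gap:
            journeys[-1].append(t)      # treated as a transfer
        else:
            journeys.append([t])        # start of a new journey
    return journeys


day = [
    datetime(2013, 3, 4, 8, 0),
    datetime(2013, 3, 4, 8, 20),   # 20 minutes later: linked to the first boarding
    datetime(2013, 3, 4, 17, 0),   # hours later: a separate journey
]
journeys = link_trips(day)
```

The result depends entirely on the chosen threshold, which is the arbitrariness that later methods try to address.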
Using a similar method, Devillaine et al assigned activity types to different time periods, and generated and compared the temporal activity patterns of bus and metro passengers in two cities (i.e., Santiago, Chile and Gatineau, Canada). As an alternative to the fixed-time constraint method, fixed-distance methods that assume a common walking distance between transfer stops (e.g., 400 or 800 metres) have also been applied to link trips into journeys (Barry et al., 2002; Trépanier et al., 2009; Zhao et al., 2007).
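The fixed-distance rule can be sketched in the same spirit. The example assumes that stop coordinates (latitude, longitude) are available from a supplementary source, and it uses the haversine great-circle distance as a stand-in for walking distance. Both choices, as well as the 400-metre default, are assumptions for illustration.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_transfer(stop_a, stop_b, max_walk_m=400.0):
    """Treat two stops as a transfer pair when they lie within max_walk_m of each other."""
    return haversine_m(*stop_a, *stop_b) <= max_walk_m


near = is_transfer((0.0, 0.0), (0.001, 0.0))   # roughly 111 m apart
far = is_transfer((0.0, 0.0), (0.01, 0.0))     # roughly 1.1 km apart
```

Straight-line distance understates the real walking distance along the street network, so a threshold applied this way is more permissive than its nominal value suggests.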
Considering the arbitrary nature of the fixed-time/distance constraint methods, some researchers developed more sophisticated methods to estimate linked journeys. For instance, Seaborn et al (2009) developed an elapsed time threshold method that takes differences between multi-modal transfers into account to estimate multimodal trips. By taking a number of variables into account, e.g., planned departure and arrival times for each run, stop sequence and linear distance between stops, Chu and Chapleau (2008) developed a multi-rule algorithm to detect the transfer patterns of individual passengers. Munizaga and Palma (2012) applied a generalised time approach to estimate the alighting point and the origin-destination (OD) matrix. Their method minimised the generalised time distance between two sequential boarding position-times, which is argued to be more accurate than the fixed-distance method. By reconstructing journey information and OD matrices with these various methods, raw smart card data can be transformed into prepared datasets for policy-making processes and sophisticated mathematical demand modelling.
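The idea of choosing an alighting point by minimising a generalised time can be illustrated as follows. This is a simplified sketch and not the formulation of Munizaga and Palma (2012): it assumes that, for each candidate stop, the in-vehicle time from the boarding stop and the walking time to the next boarding position are already known, and it combines them with a hypothetical walking penalty `walk_weight`.

```python
def estimate_alighting(candidates, walk_weight=2.0):
    """Pick the candidate stop with the lowest generalised time.

    candidates: list of (stop_id, ride_time_min, walk_time_min) tuples, where
    walk_time_min is the walk from that stop to the next boarding position.
    walk_weight: assumed penalty that makes walking costlier than riding.
    """
    def generalised_time(candidate):
        _, ride_time, walk_time = candidate
        return ride_time + walk_weight * walk_time

    return min(candidates, key=generalised_time)[0]


options = [
    ("s1", 5.0, 20.0),   # short ride, long walk: 5 + 2 * 20 = 45
    ("s2", 10.0, 3.0),   # 10 + 2 * 3 = 16
    ("s3", 15.0, 1.0),   # 15 + 2 * 1 = 17
]
best = estimate_alighting(options)
```

Unlike a pure nearest-stop rule, which would pick `s3` here, the combined cost trades a slightly longer walk against a shorter ride.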
Another group of studies has focused on estimating O-D matrices of UPT passengers. A critical limitation of smart card data is that many smart card systems only capture boarding information (i.e., boarding stop and time) in order to charge fares, while omitting alighting information (Pelletier et al., 2011). To address this issue, some researchers have developed algorithms that estimate the alighting stop by assuming the shortest walking distance between the destination of one trip and the origin of the next, e.g., Barry et al (2002), Trépanier et al (2007). To attain more reliable estimations, smart card data have also been applied in conjunction with other datasets that contain detailed trip and geographic information of a transit service. For example, Zhao et al (2007), Farzin (2008) and Munizaga and Palma (2012) integrated smart card data with Automatic Vehicle Location (AVL) data (or Global Positioning System (GPS) data) to infer an O-D matrix of transit passengers (bus or rail passengers) within various situational contexts. In addition, Nassir et al (2011) demonstrated a related approach by joining smart card data with Automatic Passenger Count (APC) data and General Transit Feed Specification (GTFS) data to infer passenger alighting stops in their study of Chicago’s metro transit network in the US.
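The shortest-walking-distance assumption can be sketched as a simple trip-chaining routine. The example below is illustrative rather than a reproduction of any cited algorithm. It assumes planar stop coordinates in metres, assumes that the last trip of the day returns towards the first boarding stop, ignores stop order along the route, and skips cards with a single trip. All names are hypothetical.

```python
import math
from collections import Counter


def infer_od(trips, stop_xy, route_stops):
    """Infer (origin, destination) counts for one card's trips in one day.

    trips: time-ordered list of (boarding_stop, route_id).
    stop_xy: dict stop_id -> (x, y) in metres (assumed supplementary data).
    route_stops: dict route_id -> list of stop_ids served by the route.
    """
    od = Counter()
    if len(trips) < 2:
        return od  # a single boarding gives no next origin to chain to
    for i, (board_stop, route_id) in enumerate(trips):
        # Next boarding stop; the last trip chains back to the first boarding.
        next_board = trips[(i + 1) % len(trips)][0]
        candidates = [s for s in route_stops[route_id] if s != board_stop]
        # Alighting stop: the stop on this route closest to the next boarding.
        alight = min(candidates, key=lambda s: math.dist(stop_xy[s], stop_xy[next_board]))
        od[(board_stop, alight)] += 1
    return od


stop_xy = {"A": (0.0, 0.0), "B": (1000.0, 0.0), "C": (2000.0, 0.0)}
route_stops = {"r1": ["A", "B", "C"], "r2": ["C", "B", "A"]}
od_counts = infer_od([("A", "r1"), ("C", "r2")], stop_xy, route_stops)
```

Summing such counters over all cards gives an O-D matrix. A fuller version would restrict candidates to stops downstream of the boarding stop and reject matches beyond a maximum walking distance.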
Despite the great potential demonstrated by previous studies, smart card data have rarely, if ever, been applied to investigate BRT usage dynamics. While such an application is desirable, some methodological challenges and limitations associated with smart card data should be carefully considered, as they directly affect how useful the data can be for the issue of interest (i.e., passengers’ travel behaviour related to a BRT system).
As discussed in the last section, smart card data are essentially the ‘by-product’ of AFC systems with a primary aim of assisting the ticketing process, rather than providing travel behaviour data. Hence, smart card data still lack certain important trip and personal information (Bagchi and White, 2005; Pelletier et al., 2011). In terms of trip information, the actual geographic location (e.g., the coordinates of a boarding stop) is usually not provided in the smart card data. To deal with this issue, supplementary datasets such as GPS data are necessary. In addition, personal information, including trip purpose and the socio-demographic characteristics of passengers, is normally unavailable in smart card data. This omission is largely due to concerns over the personal privacy of smart card users (Pelletier et al., 2011; Utsunomiya et al., 2006; Yue et al., 2014). Establishing a personal registration system for smart card users has been identified as a potential solution (Pelletier et al., 2011; Utsunomiya et al., 2006). Yet, the privacy concerns of smart card users remain a major drawback of this approach (Bagchi and White, 2005; Pelletier et al., 2011).
As elaborated in previous literature reviews of smart card data studies (Pelletier et al., 2011; Yue et al., 2014) and of more general ‘big data’ applications (Boyd and Crawford, 2012; Kitchin, 2013; Miller, 2010), big data sets (or the data deluge) also pose critical methodological challenges for researchers, which particularly call into question the effectiveness of conventional statistical methods in extracting meaningful results from ‘big data’. Given a large dataset (e.g., hundreds of thousands to millions of data entries), many statistical methods applied in earlier travel behaviour studies will very likely yield statistically significant results that carry little practical meaning. Hence it becomes a pressing issue to develop novel and tailored methods that can generate meaningful and interpretable results from big data (Kitchin, 2013; Miller, 2010).
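The point about statistical significance in very large samples can be made concrete with a small worked example. The figures below are hypothetical and are not drawn from any dataset discussed here: two passenger groups are assumed to differ by 0.1 minutes in mean travel time, with a common standard deviation of 10 minutes, and a two-sample z-test is applied at two sample sizes.

```python
import math


def two_sample_z(mean_diff, sd, n):
    """Two-sided z-test for a difference in means, with n observations per group.

    Assumes equal group sizes and a known, common standard deviation.
    Returns the z statistic and the two-sided p-value.
    """
    z = mean_diff / (sd * math.sqrt(2.0 / n))
    p = math.erfc(abs(z) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|z|))
    return z, p


effect_size = 0.1 / 10.0   # standardised difference of 0.01, negligible in practice

z_small, p_small = two_sample_z(0.1, 10.0, 100)         # survey-sized sample
z_large, p_large = two_sample_z(0.1, 10.0, 1_000_000)   # smart-card-sized sample
```

With 100 observations per group the difference is far from significant, whereas with one million per group the same negligible difference is highly significant (z is about 7.07). The p-value therefore reflects the sample size rather than the practical importance of the effect.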
With regard to the spatial-temporal investigation of travel behaviour, spatial visualisation (or geo-visualisation) methods have been widely advocated and applied in the exploratory analysis of big data, including smart card data (Kwan, 2000; Kwan and Lee, 2004; Andrienko and Andrienko, 2008; Shaw and Yu, 2009; Yu, 2008). Spatial visualisation of big data commonly involves a series of steps, including data processing (commonly spatial clustering and aggregation) and visualisation of the processed data. In this way, big data can be reduced to a more manageable volume of information (e.g., aggregated travel trajectories based on certain spatial nodes), and the results can reflect the interactions between travel patterns and their spatial context. Given such strengths, spatial visualisation appears to be an appropriate analytic strategy for investigating BRT usage in this research.
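The aggregation step can be illustrated with a minimal sketch that bins boarding locations into square grid cells and counts the boardings in each cell. The planar coordinates in metres, the cell size and the function name are assumptions for illustration. The cited studies use a range of clustering and aggregation techniques, of which regular gridding is only the simplest.

```python
from collections import Counter


def grid_counts(points, cell_size):
    """Aggregate (x, y) points into square cells and count the points per cell.

    points: iterable of (x, y) coordinates in metres (assumed projection).
    cell_size: side length of a grid cell in metres.
    Returns a Counter keyed by (column, row) cell index.
    """
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


boardings = [(10.0, 10.0), (90.0, 20.0), (150.0, 10.0), (-5.0, 5.0)]
cells = grid_counts(boardings, cell_size=100.0)
```

Each cell count can then be mapped, for instance as a shaded grid, so that millions of individual records are reduced to a surface that can be read against the spatial context.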
Therefore, the next section discusses the existing methods of spatial visualisation applied to big data and their implications for this research.