The computer vision community has largely turned towards the recognition of object classes, rather than specific roadway assets such as traffic signs. Current research efforts in devising a computer vision model for roadway asset detection are roughly divided into three stages: segmentation, detection, and condition assessment.
Detection of traffic signs as classified in the Manual on Uniform Traffic Control Devices (MUTCD) has received considerable attention over the past few years. Traffic signs come in hundreds of variations in dimension, color, text, and font. (Maldonado Bascón et al. 2010) presented a Support Vector Machine (SVM) to recognize road signs. (Krishnan 2009) presented a triangulation and bundle adjustment approach for identifying road signs. (Hu and Tsai 2011; Wu and Tsai 2006b) created an image recognition model for developing a sign inventory, based on a nearest-neighbor assignment of feature descriptors. Although most of these techniques have achieved a reasonable level of automation and accuracy, none of these systems uses the same visual information to locate the assets and, more importantly, to detect them in a continuous fashion.
As a first step towards addressing this problem, research in the intelligent driver assistance systems community has focused on detecting speed limit signs (Mogelmose et al. 2012). The performance of the proposed algorithms varies widely. An earlier example is (Loy and Barnes 2004), where all signs in the testing dataset were detected successfully; however, the large number of false positives (FPs) per frame remained an open research problem. For roadway asset condition assessment, a method that can detect several different types of traffic signs at a low false detection rate is more appealing than a method that detects only one specific traffic sign, even if it does that well. Table 2.3 categorizes some of the major state-of-the-art detection and classification methods based on the type of their visual features (i.e., color vs. shape).
Table 2.3 State-of-the-art Methods for Detection and Classification of Single-category Traffic Signs Categorized Based on the Type of Features

| Category | Features | Examples from the Literature |
| --- | --- | --- |
| Different Type of Features Used | Color | (Lopez and Fuentes 2007; Maldonado-Bascon et al. 2007) |
| Different Type of Features Used | Shape | (Gil-Jim et al. 2005; Kim et al. 2005) |
| Different Type of Features Used | Color, and Shape | (Fang et al. 2003; Gao et al. 2003; Miura et al. 2000; Shuang-dong et al. 2005) |
| Different Type of Features Used | Geometrical, Physical Features, and Text | (Yangxing et al. 2006) |
| Different Type of Features Used | Dimension, Color, Text, and Font | (Fatmehsan et al. 2010; Hu and Tsai 2011) |
| The Specific Type of Traffic Sign used for Detection | Rectangle and Triangle Shape | (Ballerini et al. 2005; Ruta et al. 2010; Shuang-dong et al. 2005) |
| The Specific Type of Traffic Sign used for Detection | Stop and/or Speed Limit Signs | (Fatmehsan et al. 2010; Meuter et al. 2008; Tsai and Wu 2002; Wu and Tsai 2005; Wu and Tsai 2006b; Yea-Shuan and Yun-Shin 2010; Yea-Shuan et al. 2012) |
The most recent methods in (Baro et al. 2009; Overett et al. 2011; Timofte et al. 2014) report detection rates above 90% with a relatively low number of FPs. However, all these methods have been validated on European datasets and for only a few types of traffic signs. Table 2.4 summarizes the features, detection methods, and reported performance of these methods.
Table 2.4 Overview of the Performance for the Best Detection Rates

| Paper | Features | Detection Method | Best Detection Rate | FPs for Best Detection Rate | Average Detection Rate | Average FPs | Type of Traffic Sign |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (Baro et al. 2009) | Dissociated dipoles* | Cascade Classifier | 97% | 5.6% | 92% | 4.8% | Circular speed, Triangular |
| (Overett et al. 2011) | Histogram of Oriented Gradients (HOG) | 5 Stage cascade classifier trained with LogitBoost | 98.68% | 10% | - | - | Circular Red signs |
| (Timofte et al. 2014) | Adaptive RGB threshold + Edges | Fuzzy template of a Hough derivative | 95.7% | 2.5% | 95.29% | 10.41% | Circular red and blue, Diamond white |
* A more general type of features than the Haar-like features
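As a rough illustration of how figures such as the detection rates and FP counts in Table 2.4 can be obtained, the sketch below scores a detector against annotated frames. The intersection-over-union (IoU) matching criterion, the 0.5 threshold, and the function names `iou` and `evaluate` are assumptions made for this example; the cited papers may define their detection rates and FP percentages differently.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(frames, iou_threshold=0.5):
    """Score detections against ground truth.

    frames: list of (ground_truth_boxes, detected_boxes) pairs, one per frame.
    Returns (detection_rate, false_positives_per_frame).
    The IoU threshold of 0.5 is an assumed, commonly used value.
    """
    total_gt = matched_gt = false_pos = 0
    for gt_boxes, det_boxes in frames:
        total_gt += len(gt_boxes)
        unmatched = list(gt_boxes)
        for det in det_boxes:
            # Greedily match each detection to its best remaining annotation.
            best = max(unmatched, key=lambda g: iou(det, g), default=None)
            if best is not None and iou(det, best) >= iou_threshold:
                unmatched.remove(best)
                matched_gt += 1
            else:
                false_pos += 1
    detection_rate = matched_gt / total_gt if total_gt else 0.0
    fps_per_frame = false_pos / len(frames) if frames else 0.0
    return detection_rate, fps_per_frame
```

Reporting both numbers together matters because, as the (Loy and Barnes 2004) example shows, a detector can find every sign and still be impractical if it produces many FPs per frame.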
In addition to detection, 3D localization of traffic signs from video streams has been the focus of some recent work. Examples include (Soheilian et al. 2013; Timofte et al. 2014), which mainly visualize the detected signs within sparse 3D point cloud models. (Balali and Golparvar-Fard 2014; Golparvar-Fard et al. 2012) have also proposed two methods for segmentation of roadway assets at a higher level (e.g., guardrails, signs, and safety cones) based on scalable non-parametric parsing and Semantic Texton Forest algorithms, respectively. These methods can segment a video frame into different asset categories and can serve as a basis for the task of detection and classification.
Prior work on detection and classification of traffic signs can be roughly divided into three categories: segmentation, feature extraction, and detection. The state of the art in each category is presented in the following:
a. Segmentation and candidate extraction
The purpose of segmentation is to narrow down the search space for sign candidates from the entirety of a video frame to a small number of image patches (Golparvar-Fard et al. 2012). Because traffic signs have distinct colors, the majority of the earlier segmentation methods focused on thresholding color channels. Since the Red-Green-Blue (RGB) color space is generally perceived to be sensitive to variations in brightness, methods such as (FeiXiang et al. 2009; Hsin-Han et al. 2010; Wen-Jia and Chien-Chung 2007; Xu et al. 2010) leveraged the Hue-Saturation-Value (HSV) color space. Interestingly, (Gomez-Moreno et al. 2010) reports that HSV-based color segmentation methods do not necessarily perform better than segmentation on the normalized RGB color channels. In an attempt to minimize the impact of the instabilities caused by lighting variations, (Balali et al. 2013; Prisacariu et al. 2010; Timofte et al. 2009; Timofte et al. 2014) proposed adaptive thresholds to be used on the RGB color space. While segmentation methods that use color information perform much better than shape-only methods, they struggle to detect traffic signs with a white background. For a more detailed comparison of the existing methods, readers are encouraged to consult (Geronimo et al. 2010).
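The two families of color thresholding discussed above can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the function names and all threshold values (`min_sat`, `min_val`, `hue_tol`, `min_r`, `max_g`) are assumptions chosen for the example, and the image is represented as plain rows of 8-bit (r, g, b) tuples.

```python
import colorsys


def is_red_hsv(r, g, b, min_sat=0.5, min_val=0.3, hue_tol=20 / 360):
    """True if an 8-bit RGB pixel falls in a red hue band (assumed thresholds)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    near_red = h <= hue_tol or h >= 1 - hue_tol  # hue wraps around at red
    return near_red and s >= min_sat and v >= min_val


def is_red_normalized_rgb(r, g, b, min_r=0.5, max_g=0.3):
    """True if the chromaticity r / (r + g + b) is dominated by red (assumed thresholds)."""
    total = r + g + b
    if total == 0:
        return False
    return r / total >= min_r and g / total <= max_g


def segment(image, predicate):
    """Return a binary mask for an image given as rows of (r, g, b) tuples."""
    return [[predicate(*px) for px in row] for row in image]


# Bright red, white, dark red, and blue pixels.
image = [[(200, 20, 20), (255, 255, 255)],
         [(100, 10, 10), (20, 20, 200)]]
hsv_mask = segment(image, is_red_hsv)
rgb_mask = segment(image, is_red_normalized_rgb)
```

In this toy image both predicates accept the bright and the dark red pixel and reject the white one. The rejected white pixel reflects the limitation noted above: a sign with a white background carries no chromatic cue for a color threshold to select.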
The object detection and classification problem is traditionally solved either by selective extraction of windows of interest or by exhaustive sliding-window classification. In the first approach, a small number of interest regions are selected in the images through fast and inexpensive methods. These interest regions are then subjected to a more sophisticated classification. Such an approach risks overlooking some traffic signs. The second approach considers all candidate windows in the image. Given the large number of candidates, classification easily becomes intractable (Balali et al. 2013).
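To make the cost of the exhaustive approach concrete, the sketch below enumerates the windows of a multi-scale sliding-window scan. The window size, stride, and scale set are assumed values for illustration only and do not come from any of the cited methods.

```python
def sliding_windows(width, height, win=32, stride=8, scales=(1.0, 1.5, 2.0)):
    """Yield (x, y, w, h) for every window of an exhaustive multi-scale scan."""
    for scale in scales:
        size = int(win * scale)
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield (x, y, size, size)


def count_windows(width, height, **kwargs):
    """Number of candidate windows a classifier would have to evaluate."""
    return sum(1 for _ in sliding_windows(width, height, **kwargs))


# Even a tiny 64 x 64 image already yields 35 candidate windows.
small = count_windows(64, 64)
```

Because the count grows with the image area, the number of scales, and the inverse square of the stride, every window of a full video frame cannot be passed through an expensive classifier. This is why a segmentation stage, or a cheap early-rejection stage, is attractive.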
b. Feature extraction and detection
Because traffic signs have distinct shapes, the dominant types of features used to date are edges, intensity gradients, and, more recently, principled representations such as Histogram of Oriented Gradients (HOG) (Alefs et al. 2007; Gao et al. 2006; Houben 2011; Mathias et al. 2013; Overett et al. 2011; Pettersson et al. 2008; Xie et al. 2009) and Haar-like features (Bahlmann et al. 2005; Baro et al. 2009; Keller et al. 2008; Prisacariu et al. 2010). (Creusen et al. 2010) also augmented the HOG feature vectors with CIELab and YCbCr color information for detecting blue-circular, red-circular, and triangular signs using a relatively small training dataset (on the order of tens of samples per category).
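The core idea behind HOG features can be sketched for a single cell as below. This is a simplified illustration under stated assumptions: it uses central differences on interior pixels, unsigned orientations in 9 bins, and hard bin assignment, and it omits the bilinear vote interpolation and overlapping block normalization of a full HOG descriptor. The function names are hypothetical.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram of one grayscale cell (rows of intensities).

    Orientations are unsigned (0 to 180 degrees) and each interior pixel
    votes into one bin with its gradient magnitude.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(hist, eps=1e-6):
    """Scale a histogram to unit length to reduce sensitivity to contrast."""
    norm = math.sqrt(sum(v * v for v in hist) + eps)
    return [v / norm for v in hist]


# A vertical edge: dark on the left, bright on the right.
vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
hist = cell_histogram(vertical_edge)
```

For the vertical edge all gradient energy falls into the first bin (horizontal gradient direction), while a horizontal edge would concentrate in the middle bin. The distribution of energy across bins is what lets such descriptors separate circular, triangular, and rectangular sign outlines.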
c. Classification
The selection of a classification method is constrained by the choice of features. When edges and intensity gradients are used, the dominant methods are the Hough transform and its derivatives for model fitting. For HOG and Haar-like features, SVMs, neural networks, and cascaded classifiers have been frequently reported. In particular, cascade classifiers are used more often with Haar-like features (Bahlmann et al. 2005; Baro et al. 2009; Keller et al. 2008; Prisacariu et al. 2010). The application of HOG features with a standard SVM (Creusen et al. 2010; Xie et al. 2009) and with cascade classifiers trained with boosting variants (Overett et al. 2011; Pettersson et al. 2008) has also been reported in the literature. A method using color and Haar-like features with AdaBoost cascade classifiers was also presented in (Balali and Golparvar-Fard 2014). While all these methods have shown reasonable accuracies, their performance has not been benchmarked and compared in the literature. More importantly, their application in the context of U.S. traffic signs has not been investigated.
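The early-rejection behavior that makes cascade classifiers attractive can be sketched as follows. This is a generic illustration, not the trained cascade of any cited paper: the two-element feature vector (a redness score and an edge-strength score), the decision stumps, and all weights and thresholds are hypothetical values chosen for the example.

```python
def stump(index, threshold):
    """Weak classifier: +1 if feature[index] exceeds threshold, else -1."""
    return lambda features: 1 if features[index] > threshold else -1


def make_stage(weak_classifiers, threshold):
    """Build one boosted stage as a weighted vote of weak classifiers.

    weak_classifiers: list of (weight, function) pairs.
    """
    def stage(features):
        score = sum(w * clf(features) for w, clf in weak_classifiers)
        return score >= threshold
    return stage


def cascade_predict(stages, features):
    """Return (accepted, stages_evaluated); reject as soon as any stage fails."""
    for i, stage in enumerate(stages, start=1):
        if not stage(features):
            return False, i
    return True, len(stages)


# Hypothetical features: (redness, edge_strength), both in [0, 1].
stages = [
    make_stage([(1.0, stump(0, 0.3))], 0.0),                         # cheap first test
    make_stage([(0.6, stump(0, 0.5)), (0.4, stump(1, 0.4))], 0.0),   # stricter second test
]
background = cascade_predict(stages, (0.1, 0.9))
sign_like = cascade_predict(stages, (0.8, 0.9))
```

Most background windows fail the first, cheapest stage, so the later stages run only on the few remaining candidates. In a boosted cascade such as those trained with AdaBoost or LogitBoost, the weights and thresholds would be learned from labeled data rather than set by hand.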