1.1.1 Features
As stated above, objects are described by attributes called features. According to “The International Dictionary of Artificial Intelligence” [106] a feature can be defined as a (usually) named quantity that can take on different values. These values are the feature’s domain and, in general, can be either quantitative or qualitative [41, 73].
The branch of PR which operates with quantitative features is called statistical pattern recognition [73]. In this branch the features are represented by numbers, such as integers or real numbers, for example the amplitude of a signal measured in dB. In particular, the numerical feature values are arranged as an n-dimensional vector:
$x = [x_1, x_2, \ldots, x_n], \quad x \in \mathbb{R}^n$
where each element of the vector corresponds to a specific feature, while the whole vector represents an object of the data set. The real space $\mathbb{R}^n$ is called the feature space, and each of its axes corresponds to a specific feature.
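As a minimal sketch of this notation, the following Python fragment represents one object as an n-dimensional feature vector. The feature names and values are hypothetical and chosen only for illustration; they are not taken from the data set used in this dissertation.

```python
# Hypothetical feature names; each one is an axis of the feature space R^n.
feature_names = ["amplitude_db", "frequency_hz", "rms_level"]

# One object of the data set: a point x = [x_1, ..., x_n] in R^n.
# The numbers are invented for illustration.
x = [-12.5, 50.0, 0.8]

n = len(x)  # dimension of the feature space

# Pair each coordinate with the feature (axis) it belongs to.
described = dict(zip(feature_names, x))
```

Here `described` simply makes explicit which axis of the feature space each coordinate of `x` refers to.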
The choice of a good set of features is essential to obtaining good performance in the pattern recognition process [23, 76].
In this dissertation we use quantitative features, and we will explain how they are obtained for later use in the classification process.
1.1.2 Classes
As stated above, a class should contain similar objects, while objects from different classes should be dissimilar [73]. However, the concepts of similarity and dissimilarity are not always perfectly clear. Sometimes the classes are well defined and, in the simplest case, mutually exclusive [73]. An example is handwriting recognition, where the classification system receives and interprets intelligible handwritten input from a certain source (paper documents, photographs, or other devices). Here each handwritten symbol corresponds to one and only one symbol stored in the computer, regardless of whether we are able to recognize the right matching symbol [73].
Nevertheless, in other classification problems a clear separation among the classes is not always easy to identify. For example, in medical research there is an intrinsic variability that makes it difficult to identify the classes, as well as the most discriminant features [73]. Furthermore, more than one illness can be present at the same time, which makes it even more difficult to identify the specific illness of interest [73, 140].
1.1.3 Data sets
A data set is a collection of data (objects, elements, samples), usually presented in matrix form, where each column represents a particular feature and each row (feature vector) corresponds to a given sample of the data set. Thus a data set can be represented by an N × n matrix, where N is the number of rows, i.e. the number of objects composing the data set, and n is the number of features describing each object.
Figure 1.5 Representation of a data set with N objects described by n features as an N × n matrix.
A data set is described by several parameters, including:
• the number and types of the features,
• the number of samples,
• the number of classes,
• the vector of the class labels associated with each object.
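The parameters listed above can be read directly off a data set stored as an N × n matrix with one label per row. The sketch below assumes a plain list-of-rows representation; the values, the class names `healthy` and `faulty`, and the helper `describe` are hypothetical and serve only as an illustration.

```python
# Toy data set: N = 4 objects (rows), n = 2 quantitative features (columns).
# All values and labels are invented for illustration.
data = [
    [0.1, 1.2],
    [0.3, 0.9],
    [2.1, 3.4],
    [1.9, 3.8],
]
labels = ["healthy", "healthy", "faulty", "faulty"]  # one class label per row


def describe(data, labels):
    """Return the basic parameters of a data set stored as a list of rows."""
    if len(data) != len(labels):
        raise ValueError("each object needs exactly one class label")
    n_samples = len(data)
    n_features = len(data[0]) if data else 0
    if any(len(row) != n_features for row in data):
        raise ValueError("all feature vectors must have the same length")
    classes = sorted(set(labels))
    return {
        "n_samples": n_samples,    # N, the number of rows
        "n_features": n_features,  # n, the number of columns
        "n_classes": len(classes),
        "classes": classes,
    }


summary = describe(data, labels)
```

The two checks in `describe` enforce the matrix-like structure: every row has the same number of features, and the label vector has one entry per object.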
Normally the order in which the samples are listed does not matter, so the list of objects is unordered. There are, however, cases where the order is important, such as in regression problems [19].
Data sets can be obtained in many ways. In addition, some data sets are publicly available on the Internet and can be used as benchmarks in the PR field. One such repository is the UCI Machine Learning Repository [40], at http://archive.ics.uci.edu/ml.
In this dissertation a real data set is used. This data set has been provided by Avio Propulsione Aerospaziale, via I Maggio, 99, Rivalta di Torino, Italy.
1.1.4 Classifiers
A classifier can be described by any function F:
$F : \mathbb{R}^n \to \Omega$
which, starting from the n-dimensional vector of feature values describing an object x ($x \in \mathbb{R}^n$), identifies the class $\omega_i \in \Omega$ to which x belongs. A classifier can be considered as a set of discriminant functions F [117], each yielding a score (probability) for one specific class (thus one function $f_i$ for each class $\omega_i$). More precisely, when applied to an object x, each function $f_i$ returns a value which specifies the confidence of the function in assigning x to the class $\omega_i$ [73]. Then, typically, the object x is assigned to the class with the highest score. Ties are broken randomly. Thus the classifier is the result of applying the maximum rule to this set of discriminant functions (see Fig. 1.6). Therefore, in general, a classifier performs a mapping from an n-dimensional space $\mathbb{R}^n$ to a c-dimensional space $\mathbb{R}^c$, where c is the number of classes [32, 73].
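The maximum rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `classify` and `make_distance_score`, the class centres, and the choice of a negative squared distance as the score are all hypothetical, and the scores produced here are not probabilities.

```python
import random


def classify(x, discriminants, rng=random):
    """Assign x to the class whose discriminant function gives the highest score.

    `discriminants` maps each class label to a function f_i(x) -> score.
    Ties between top-scoring classes are broken randomly.
    """
    scores = {label: f(x) for label, f in discriminants.items()}
    best = max(scores.values())
    winners = [label for label, s in scores.items() if s == best]
    return rng.choice(winners), scores


def make_distance_score(centre):
    """Build a hypothetical discriminant: negative squared distance to a centre."""
    def score(x):
        return -sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return score


# c = 2 classes on n = 2 features; the centres are invented for illustration.
discriminants = {
    "omega_1": make_distance_score([0.0, 0.0]),
    "omega_2": make_distance_score([3.0, 3.0]),
}

label, scores = classify([0.5, 0.2], discriminants)
```

The dictionary `scores` holds the c-dimensional output of the discriminant functions, and `label` is the result of applying the maximum rule to it; the point `[0.5, 0.2]` lies closer to the first centre, so it is assigned to `omega_1`.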
An example of classification performed by a Quadratic Discriminant Classifier (QDC) is represented in Fig. 1.7 [73].
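For reference, a quadratic discriminant classifier is commonly formulated by assuming that each class $\omega_i$ follows a Gaussian distribution with its own mean and covariance. The symbols $\mu_i$ (class mean), $\Sigma_i$ (class covariance matrix) and $P(\omega_i)$ (class prior) are introduced here only for illustration and are not defined elsewhere in this section. Under this assumption, one usual form of the discriminant function is:

```latex
f_i(x) = -\tfrac{1}{2}\ln\lvert\Sigma_i\rvert
         - \tfrac{1}{2}\,(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)
         + \ln P(\omega_i)
```

Because this expression is quadratic in x, the boundaries between classes obtained with the maximum rule are quadratic surfaces, which is where the name of the classifier comes from.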
There exist many types of classifiers, such as linear, quadratic, and neural network classifiers. Moreover, several classifiers can be appropriately combined. We will deal with the combination of classifiers in the next chapters.