Scale of attitudes towards gender



In the classification-based (data mining or machine learning-based) approach to sentiment analysis/extraction, a pre-labelled training corpus (exhibiting prior knowledge) is used to learn a “classifier” using some established supervised learning mechanism.

The training data comprises a collection of ordered pairs ⟨s, c⟩, where s is an instance (observation) comprised of a set of attribute (feature) values and c is a known class label for the instance taken from a set of class labels C. Once the classifier has been generated it can be used to assign documents to the “fittest” class, essentially performing a mapping sᵢ → cᵢ where cᵢ ∈ C (the set of known class labels). It has been argued that classification-based approaches in political sentiment mining tend to work well [Grijzenhout et al., 2010]. However, the need for appropriate training data is a limiting factor, and the learning process is highly dependent on the quality of the prior

ing) and testing a classifier [Bird et al., 2009]. Once a classifier has been generated it can be applied to “unseen” data, provided that the unseen data is pre-processed in the same manner as the training data used to produce the classifier. Confidence in a generated classifier is typically gained by applying the classifier to pre-labelled test data.
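The train, apply and test cycle described above can be illustrated with a minimal sketch. It is not the method used in this work: the token-overlap classifier, the function names (`preprocess`, `train`, `classify`, `accuracy`) and the toy documents are all assumptions made purely for illustration. The point it shows is that `classify` routes unseen text through the same `preprocess` step as training, and that confidence is estimated on held-out, pre-labelled test data.

```python
from collections import Counter, defaultdict

PUNCTUATION = ".,!?;:\"'()"

def preprocess(text):
    """Lower-case, split on whitespace and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCTUATION) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def train(labelled_docs):
    """Count how often each token occurs under each class label."""
    counts = defaultdict(Counter)
    for text, label in labelled_docs:
        counts[label].update(preprocess(text))
    return counts

def classify(model, text):
    """Assign the class whose training tokens overlap most with the document."""
    tokens = preprocess(text)  # same pre-processing as applied during training
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    return max(sorted(scores), key=scores.get)  # sorted() makes ties deterministic

def accuracy(model, labelled_docs):
    """Fraction of pre-labelled documents that are assigned their known label."""
    correct = sum(classify(model, text) == label for text, label in labelled_docs)
    return correct / len(labelled_docs)

# Hypothetical toy corpus; a real training corpus would be far larger.
train_docs = [
    ("An excellent and welcome proposal", "positive"),
    ("I strongly support this excellent bill", "positive"),
    ("A terrible and damaging policy", "negative"),
    ("I strongly oppose this damaging bill", "negative"),
]
test_docs = [
    ("Excellent, I support it!", "positive"),
    ("Terrible. I oppose it.", "negative"),
]

model = train(train_docs)
test_accuracy = accuracy(model, test_docs)
print(f"accuracy on pre-labelled test data: {test_accuracy:.2f}")
```

If the test documents were passed to the classifier without `preprocess`, tokens such as "Excellent," would no longer match the training vocabulary, which is why the pre-processing must be identical in both phases.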

Classifiers can be generated in a variety of different ways, which in turn dictates how they are used. The following is a brief description of some of the most commonly used machine learning classifiers (which were used with respect to the work described in this thesis, as reported in Chapter 4) for sentiment classification in sentiment mining.

• Naïve Bayes: The classifier uses the training data to learn the conditional probability of each attribute given the class label, and so generates a probabilistic model of the features. This model is then used to predict the class of a new instance by selecting the class with the highest posterior probability [Duda et al., 2001].

• Support Vector Machine: Results in a discriminative classifier based on the concept of a separating hyperplane¹ (class boundary) placed between sets of objects having different class memberships [Theodoridis and Koutroumbas, 2008]. In other words, given a labelled training dataset, an optimal hyperplane (decision plane) is defined which can then be used to classify new instances. Support Vector Machines (SVMs) have been shown to work well with respect to textual data [Joachims, 1998]. However, two notable disadvantages of SVMs are: (i) they are directed at binary classification problems and thus tend not to be suited to multi-class classification, and (ii) they are a black box technique in that it is unclear how a particular SVM, once generated, operates.

• Decision Trees: The algorithm learns a classifier from labelled training data by considering each data attribute in turn, using some measure such as information gain to determine the discriminative power of each attribute. The splitting procedure stops when all instances in a subset belong to the same class. In this manner a “decision tree” is built in which the internal nodes represent individual attributes and the leaf nodes (terminals) represent class labels [Duda et al., 2001]. Decision tree classifiers offer the advantage that they are easily understandable, in that explanations as to why a certain classification was made can be easily generated.

• Rule-based: Classifiers built using rule-based approaches consist of a set of conditional “if ... then ...” style rules. A training dataset of labelled observations is used to extract the classification rules and to build the classifier. The classification rules are applied in a given order during the prediction process so as to assign a class label to a new unlabelled observation (instance) [Duda et al., 2001]. Rule-based classifiers offer the advantage, as in the case of decision tree classifiers, that they are easily understandable by non-experts and that explanations can be easily generated.

¹ A separating hyperplane is a decision boundary which can be used for classification. The best hyperplane is the one that represents the largest separation between the two classes.

• Nearest neighbour classifier: Nearest neighbour-based algorithms have been extensively used for classification purposes. The idea is simply to find a predefined constant number k of the closest (in terms of distance) training instances to a new instance, and then to use the labels of the k identified instances to predict the label for the new instance. This is typically done using a simple majority vote [Theodoridis and Koutroumbas, 2008]. The K Nearest Neighbour (KNN) form of classification is an instance-based, or non-generalizing, learning method in that a “general model” of the application domain is not built (unlike all the foregoing methods, where such a model is built). KNN classification has been shown to be successful with respect to classification tasks with very irregular decision boundaries [Duda et al., 2001]. A disadvantage of KNN classification is the complexity of searching for the nearest neighbours, especially in the context of high dimensional feature spaces [Theodoridis and Koutroumbas, 2008].

• Baseline classifier: Baseline classifiers simply predict the most common class [Nasa and Suman, 2012]. The ZeroR algorithm [Witten et al., 1999] is an exemplar baseline classifier. Baseline classifiers have little practical use; however, they are useful in experimental contexts to provide a “baseline” against which the operation of other (real) classifiers can be compared.
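To make the descriptions in the list above more concrete, the following is a minimal sketch of three of the classifiers: a ZeroR-style baseline, a KNN classifier and a Naïve Bayes classifier. It is an illustration under stated assumptions rather than the implementation used in this work: the function names, the toy data, the use of Jaccard distance between token sets for KNN and the add-one smoothing for Naïve Bayes are all choices made here for brevity. Support vector machines, decision trees and rule-based classifiers are omitted because they do not fit in a few lines.

```python
import math
from collections import Counter, defaultdict

def zero_r(train):
    """Baseline: always predict the most common class in the training data."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda tokens: majority

def knn(train, k=3):
    """Majority vote among the k training instances closest to the new instance."""
    def distance(a, b):
        # Jaccard distance between token sets (an assumed choice of metric).
        a, b = set(a), set(b)
        return 1.0 - len(a & b) / len(a | b) if a | b else 0.0
    def predict(tokens):
        nearest = sorted(train, key=lambda pair: distance(tokens, pair[0]))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict

def naive_bayes(train):
    """Multinomial Naive Bayes with add-one smoothing (an assumed variant)."""
    class_counts = Counter(label for _, label in train)
    token_counts = defaultdict(Counter)
    for tokens, label in train:
        token_counts[label].update(tokens)
    vocab = {t for tokens, _ in train for t in tokens}
    def predict(tokens):
        def log_posterior(label):
            total = sum(token_counts[label].values())
            score = math.log(class_counts[label] / len(train))  # log prior
            for t in tokens:  # log P(token | class), smoothed
                score += math.log((token_counts[label][t] + 1) / (total + len(vocab)))
            return score
        # The predicted class is the one with the highest posterior.
        return max(sorted(class_counts), key=log_posterior)
    return predict

def accuracy(predict, test):
    """Fraction of pre-labelled test instances classified correctly."""
    return sum(predict(tokens) == label for tokens, label in test) / len(test)

# Hypothetical, already tokenised toy data.
train_data = [
    (["good", "great", "policy"], "positive"),
    (["good", "speech"], "positive"),
    (["great", "excellent", "speech"], "positive"),
    (["bad", "awful", "policy"], "negative"),
    (["bad", "poor", "speech"], "negative"),
]
test_data = [
    (["great", "good", "budget"], "positive"),
    (["bad", "awful", "budget"], "negative"),
]

builders = [("ZeroR", zero_r), ("kNN", knn), ("Naive Bayes", naive_bayes)]
results = {name: accuracy(build(train_data), test_data) for name, build in builders}
for name, score in results.items():
    print(f"{name}: {score:.2f}")
```

On this toy data the baseline scores 0.50, since it always predicts the majority class "positive", while the other two classifiers label both test instances correctly. This is the kind of comparison for which a baseline is useful; no conclusion about real corpora should be drawn from five training instances.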
