Figure 8.4 plots the f-scores of the different chunking systems relative to word frequency, that is, how often a predicate occurs in the training data. First, both systems perform very poorly on predicates that never occur, or occur only a few times, in the training data. As more instances of a particular predicate become available, the role labeling of that predicate steadily improves. The amount of training data for a particular predicate influences the full parsing based system less. We think one main reason is that syntactic information significantly abstracts the meaning away from surface strings, so a semantic processor based on full parsing is more robust than one based on partial parsing. The availability of labeled data for a particular predicate significantly limits SRL performance. We therefore think it is essential to better capture the paradigmatic relations of predicates, e.g. through hierarchical classification of verbs.
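The breakdown behind such a plot can be sketched as follows. This is only an illustrative sketch, not the evaluation code used in this work: the function name `f1_by_frequency`, the bucket boundaries, and the toy predicates are hypothetical, and arguments are assumed to be (span, role) pairs that count as correct only on an exact match.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def f1_by_frequency(train_predicates, test_instances, lower_bounds=(0, 1, 10, 100)):
    """Return {bucket lower bound: F1} grouped by predicate training frequency.

    train_predicates: iterable of predicate strings seen in training.
    test_instances: iterable of (predicate, gold_args, predicted_args), where
        both argument collections are sets of (span, role) pairs.
    lower_bounds: sorted, hypothetical bucket boundaries; must start at 0.
    """
    freq = Counter(train_predicates)
    # bucket -> [matched, predicted, gold] argument counts
    counts = defaultdict(lambda: [0, 0, 0])
    for predicate, gold_args, pred_args in test_instances:
        # Largest lower bound that does not exceed the training frequency.
        bucket = lower_bounds[bisect_right(lower_bounds, freq[predicate]) - 1]
        c = counts[bucket]
        c[0] += len(gold_args & pred_args)
        c[1] += len(pred_args)
        c[2] += len(gold_args)
    # F1 = 2 * matched / (predicted + gold), the harmonic mean of P and R.
    return {
        b: (2 * m / (p + g) if p + g else 0.0)
        for b, (m, p, g) in sorted(counts.items())
    }


# Toy data: "shuo" is frequent, "chi" is rare, "kan" is unseen in training.
train = ["shuo"] * 12 + ["chi"] * 2
test = [
    ("shuo", {((0, 1), "A0"), ((3, 5), "A1")}, {((0, 1), "A0"), ((3, 5), "A1")}),
    ("chi", {((0, 1), "A0"), ((2, 3), "A1")}, {((0, 1), "A0")}),
    ("kan", {((0, 1), "A0")}, {((0, 2), "A0")}),
]
scores = f1_by_frequency(train, test)
```

Running the same function on the outputs of two systems gives one curve per system, which is the kind of comparison the figure makes.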
8.6 Conclusion and Discussion
In this chapter, we first examined the feature engineering problem for Chinese SRL in depth. We then introduced a new method that takes either full parses or partial parses as input and detects and classifies semantic roles in a chunking way. Our evaluation on the benchmark data showed that the new full parsing based features and the new method lead to a significant improvement over the best published individual SRL system. Furthermore, we presented a series of empirical analyses to achieve a better understanding of Chinese SRL. We hope our analysis helps to enhance existing methods and to design new solutions for Chinese SRL.
Our comparative analysis of full and partial parsing based methods emphasizes the complementary strengths of the partial parsing based system relative to the full parsing based one. Our analysis suggests that Chinese SRL can benefit from combining the full and partial parsing based methods. This direction is explored in [Zhuang and Zong, 2010], which leverages an integer linear programming based post-inference to combine the outputs of different systems. If different parsers are used as pre-processing systems, even the same SRL method can produce different labeling results. In their experiments, the combination of different full parsing based systems was helpful, but the further combination with our partial parsing based system was more remarkable. This also confirms our motivation to develop a purely discriminative shallow semantic chunker.
Chapter 9
Conclusions
This chapter provides some brief concluding remarks and discusses topics for future research.
9.1 Summary of the Thesis
This thesis is motivated by the inadequacy of single view approaches in many areas of NLP. We have studied multi-view Chinese language processing, including word segmentation, POS tagging, syntactic parsing, and semantic role labeling. We consider three situations of multiple views in statistical NLP: (1) Heterogeneous methods have been designed for a given problem; (2) Heterogeneous annotation data, which could differ either in annotation scheme or in formalism, is available to train single systems; (3) Heterogeneous machine learning paradigms, which could be either supervised or unsupervised, are applicable. Table 9.1 lists all the problems and heterogeneous views we have investigated. Each discussed item is one piece of evidence for the primary argument, namely that learning language structures can benefit from multiple, heterogeneous views.
• For word segmentation, we first presented a comparative study of two state-of-the-art segmentation methods. Inspired by the diversity of the character-based and word-based views, we designed a novel stacked sub-word tagging model for joint word segmentation and POS tagging, which robustly integrates different models, even models trained on heterogeneous annotations.
• For POS tagging, we introduced two improvements: (1) integrating chart parsing results to better capture syntagmatic relations among words and (2) integrating word clusters acquired from unlabeled data to better capture paradigmatic relations among words.
Model Annotation Learning paradigm
Scheme Formalism
Word segmentation √ √ √
POS tagging √ √ √
Syntactic parsing √ √ √
Semantic role labeling √
Table 9.1: The tasks and their corresponding multi-views investigated in the thesis.
• For syntactic parsing, we focused on different linguistic annotations, including both the representation formalism and the annotation scheme. We presented a comparative analysis of generative PCFG-LA constituency parsing and discriminative graph-based dependency parsing. To explore the diversity of parsing in different formalisms, we introduced a Bagging model to effectively enhance dependency parsing. We also explored heterogeneous treebanks to improve constituency parsing via a reranking model.
• Our work on SRL focused on improving the full parsing method with linguistically rich features and a chunking strategy. Furthermore, we developed a partial parsing based semantic chunking method, which has complementary strengths to the full parsing based method.
• Finally, we introduced a feature induction method to improve a supervised word segmenter and various syntactic processing systems by harvesting string and word knowledge from unlabeled data.
Multi-view learning can be advantageous compared to learning with only a single view, especially when the learners built on different views are distinct and diverse enough. The impact of multiple views mainly stems from the diversity between learners; it is less important whether the diversity is caused by using multiple computational models, by training on heterogeneous data, or by implementing supervised or unsupervised learning paradigms. Our work has shown that view integration benefits language processing across a wide range of conditions.
An exciting but non-obvious fact is that even when one learner is much weaker than another, it can still enhance the stronger one if it is relevant and increases the diversity. According to our experiments, as well as some others, a slightly weaker word-based segmenter can help a character-based segmenter (Chapters 2 and 3), a weaker chart parsing based POS tagger can help a sequential tagger (Chapter 5), and a significantly weaker partial parsing based SRL system can help a strong full parsing based system [Zhuang and Zong, 2010].
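One way to see why a weaker but diverse learner can help is a small calculation. It rests on simplifying assumptions that are ours, not an experimental result of the thesis: decisions are binary, the learners err independently, and their outputs are combined by simple majority voting rather than by the stacking, Bagging, or reranking models used in the preceding chapters. The function name and the accuracies are hypothetical.

```python
from itertools import product


def majority_vote_accuracy(accuracies):
    """Probability that a majority of the learners is correct.

    Assumes binary decisions, statistically independent errors, and an odd
    number of learners (so that ties cannot occur).
    """
    n = len(accuracies)
    total = 0.0
    # Enumerate every pattern of which learners are right (True) or wrong.
    for outcome in product([True, False], repeat=n):
        if 2 * sum(outcome) > n:  # strictly more than half are correct
            prob = 1.0
            for correct, acc in zip(outcome, accuracies):
                prob *= acc if correct else 1.0 - acc
            total += prob
    return total


# One strong learner (0.9) combined with two weaker ones (0.8 each).
combined = majority_vote_accuracy([0.9, 0.8, 0.8])
print(round(combined, 3))  # 0.928
```

Under these assumptions the vote is correct 92.8% of the time, more than the strongest learner alone. The gain depends on the independence assumption: when the learners' errors are strongly correlated, adding them contributes little, which is consistent with the emphasis on diversity above.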
Finally, supervised and unsupervised learning paradigms usually work in very different ways, and there is no guarantee that the outputs of unsupervised learners can be directly compared to human labeled data. Nevertheless, knowledge acquired in an unsupervised manner can still help supervised systems if it is relevant to the task. In our experiments, the knowledge about how independently a string is used is not directly related to word boundaries but can enhance a strong supervised segmenter; word clusters that are only roughly related to the paradigmatic lexical relations can enhance syntactic parsing at different levels.