Vocabulary Expansion Technique for Advertisement Classification
KSII Transactions on Internet and Information Systems (TIIS), v.6 no.5, pp.1373-1387, 2012
Contextual advertising is an important revenue source for major service providers on the Web. Ads classification is one of the main tasks in contextual advertising, and it is used to retrieve ads that are semantically relevant to the content of web pages. However, traditional text classification methods struggle to achieve satisfactory performance in ads classification because ads contain few term features. In this paper, we propose a novel ads classification method that addresses the lack of term features when classifying ads with short text. The proposed method applies a vocabulary expansion technique based on semantic associations among terms learned from large-scale search query logs. The evaluation results show that our method improves the hierarchical F-measure by 4.0% to 9.7% over baseline classifiers without vocabulary expansion.
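The idea of expanding a short ad with associated terms can be illustrated with a minimal sketch. It assumes, purely for illustration, that term associations are approximated by within-query co-occurrence counts; the abstract does not specify how the associations are learned, and the function names `learn_associations` and `expand`, the `top_k` parameter, and the toy queries are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def learn_associations(queries):
    """Count how often two distinct terms occur in the same query.

    Assumption: co-occurrence within a query stands in for the
    'semantic association' mentioned in the abstract.
    """
    cooc = defaultdict(Counter)
    for query in queries:
        terms = sorted(set(query.lower().split()))
        for a, b in combinations(terms, 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def expand(ad_text, cooc, top_k=2):
    """Return the ad's terms plus up to top_k associated terms per original term."""
    terms = ad_text.lower().split()
    expanded = list(terms)
    for term in terms:
        # Terms never seen in the query log contribute no expansions.
        for related, _count in cooc.get(term, Counter()).most_common(top_k):
            if related not in expanded:
                expanded.append(related)
    return expanded


# Toy query log (hypothetical data, for illustration only).
queries = ["cheap flight tickets", "flight tickets paris", "hotel paris"]
cooc = learn_associations(queries)
expanded_ad = expand("flight deals", cooc, top_k=1)
```

The expanded term list could then be passed to any ordinary text classifier in place of the original short ad text; a real system would likely weight associations rather than use raw counts.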
Ontology alignment faces two major problems. First, the features used for ontology alignment are usually defined by experts, so some critical features may well be excluded from the feature set. Second, the semantic and structural similarities are usually computed independently and then combined in an ad-hoc way, with weights determined heuristically. This paper proposes the modified parse tree kernel (MPTK) for ontology alignment. To compute the similarity between entities in the ontologies, a tree is adopted as the representation of an ontology. After an ontology is transformed into a set of trees, their similarity is computed using MPTK without explicit enumeration of features. When computing the similarity between trees, approximate string matching is adopted so that the similarity naturally reflects not only structural information but also semantic information. According to a series of experiments on a standard data set, the kernel method outperforms other structural similarity measures such as GMO. In addition, the proposed method shows state-of-the-art performance in ontology alignment.
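How a tree similarity can combine structure with approximate label matching can be sketched as follows. This is not the MPTK from the paper, whose definition is not given in the abstract. It is a simplified recursive similarity in the spirit of parse tree kernels, and it assumes that trees are `(label, children)` tuples, that children are compared position by position, and that label similarity comes from Python's `difflib.SequenceMatcher`. The `threshold` and `decay` parameters are illustrative choices.

```python
from difflib import SequenceMatcher


def label_sim(a, b, threshold=0.8):
    """Approximate string match: similarity ratio, or 0.0 below the threshold."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio if ratio >= threshold else 0.0


def delta(n1, n2, decay=0.5):
    """Similarity of the subtrees rooted at n1 and n2 (node = (label, children))."""
    s = label_sim(n1[0], n2[0])
    if s == 0.0:
        return 0.0
    score = decay * s
    # Children are paired by position; matching children raise the score.
    for c1, c2 in zip(n1[1], n2[1]):
        score *= 1.0 + delta(c1, c2, decay)
    return score


def nodes(tree):
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree[1]:
        yield from nodes(child)


def tree_kernel(t1, t2, decay=0.5):
    """Sum subtree similarities over all node pairs, without listing features."""
    return sum(delta(a, b, decay) for a in nodes(t1) for b in nodes(t2))


# Toy ontology fragments (hypothetical data); "adress" is a deliberate misspelling.
t1 = ("Person", [("name", []), ("address", [])])
t2 = ("Person", [("name", []), ("adress", [])])
t3 = ("Vehicle", [("wheel", [])])
```

Because labels are matched approximately, `t1` and `t2` still score as highly similar despite the spelling difference, while `t1` and `t3` share no matching nodes.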
This study uses a Korean-Chinese parallel corpus to contrast the Korean connective morpheme '-myenseo' with its corresponding Chinese expressions. Learners of Korean often struggle with connective morphemes, especially when there is a lexical gap between Korean and their mother tongue. '-myenseo' is one of the most frequently used Korean connective morphemes, and it is usually matched with Chinese coordinating conjunctions. According to the corpus, however, the Chinese expressions corresponding to '-myenseo' are not limited to coordinating conjunctions. Because the parallel corpus reveals a variety of Chinese expressions related to '-myenseo', this study can help Chinese learners of Korean learn '-myenseo' more easily. The study first discusses the semantic features and syntactic characteristics of '-myenseo'. Its significant semantic features are 'simultaneous' and 'conflict', and usage examples are used in this chapter to analyse its specific usage. The study then analyses the syntactic characteristics of '-myenseo' in terms of subject constraints, predicate constraints, temporal constraints, mood constraints, and negation constraints, and summarizes them in a table. The most important part of this study is Chapter 4, which contrasts the Korean connective morpheme '-myenseo' with Chinese expressions by analysing the Korean-Chinese parallel corpus. As a result of the analysis, the frequencies of the Chinese expressions contrasted with '-myenseo' are summarized into