• Title/Summary/Keyword: free word-order language


Two-Path Language Modeling Considering Word Order Structure of Korean (한국어의 어순 구조를 고려한 Two-Path 언어모델링)

  • Shin, Joong-Hwi;Park, Jae-Hyun;Lee, Jung-Tae;Rim, Hae-Chang
    • The Journal of the Acoustical Society of Korea / v.27 no.8 / pp.435-442 / 2008
  • The n-gram model is appropriate for languages such as English, in which word order is grammatically rigid. However, it is not suitable for Korean, in which word order is relatively free. Previous work proposed a twoply HMM that reflected the characteristics of Korean but failed to capture word-order structure among words. In this paper, we define a new segment unit that combines two words in order to reflect the word-order characteristics of adjacent words that appear in verbal morphemes. Moreover, we propose a two-path language model that estimates probabilities depending on the context, based on the proposed segment unit. Experimental results show that the proposed two-path language model improves perplexity by 25.68% compared to previous Korean language models and reduces perplexity by 94.03% for the prediction of verbal morphemes where words are combined.
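
For readers unfamiliar with the metric, the following is a minimal sketch of how perplexity and a relative perplexity reduction are commonly computed. It is not the authors' implementation. The function names are hypothetical, and the abstract does not say how its percentages were derived, so the relative-reduction formula is an assumption.

```python
import math

def perplexity(token_probs):
    """Perplexity: exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

def relative_reduction(baseline_ppl, new_ppl):
    """Percentage by which new_ppl is lower than baseline_ppl (assumed formula)."""
    return 100.0 * (baseline_ppl - new_ppl) / baseline_ppl

# Illustrative numbers only, not taken from the paper.
probs = [0.25, 0.25, 0.25]
print(perplexity(probs))
print(relative_reduction(100.0, 80.0))
```

A model that assigns every token probability 1/4 has a perplexity of about 4, so lower perplexity means the model is less "surprised" by the text.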

A Simple Syntax for Complex Semantics

  • Lee, Kiyong
    • Proceedings of the Korean Society for Language and Information Conference / 2002.02a / pp.2-27 / 2002
  • As part of a long-range project that aims at establishing database-theoretic semantics as a model of computational semantics, this presentation focuses on the development of a syntactic component for processing strings of words or sentences to construct semantic data structures. For design and modeling purposes, the present treatment will be restricted to the analysis of some problematic constructions of Korean involving semi-free word order, conjunction and temporal anchoring, and adnominal modification and antecedent binding. The present work relies heavily on Hausser's (1999, 2000) SLIM theory for language, which is based on surface compositionality, time-linearity and two other conditions on natural language processing. Time-linear syntax for natural language has been shown to be conceptually simple and computationally efficient. The associated semantics is complex, however, because it must deal with situated language involving interactive multi-agents. Nevertheless, by processing input word strings in a time-linear mode, the syntax can incrementally construct the necessary semantic structures for relevant queries and valid inferences. The fragment of Korean syntax will be implemented in Malaga, a C-type implementation language that was enriched for both programming and debugging purposes and that was made particularly suitable for implementing Left-Associative Grammar. This presentation will show how the system of syntactic rules with constraining subrules processes Korean sentences in a step-by-step time-linear manner to incrementally construct semantic data structures that mainly specify relations with their argument, temporal, and binding structures.


Combinatory Categorial Grammar for Korean

  • Han, Sung-Kook;Park, Chan-Gon
    • Annual Conference on Human and Language Technology / 1990.11a / pp.164-171 / 1990
  • A commutative productive category is proposed as an addition to current CCG for the syntactic analysis of free word order languages like Korean. The introduction of this sort of category is quite natural for the categorial lexicon and functional operations. We present the theoretical basis of the productive category and examine its linguistic applicability through typical syntactic structures of Korean.


Analysis and Computational Processing of Quantifier Floating in Korean (양화사유동과 관련된 한국어의 분석과 전산처리)

  • 이진복;박종철
    • Language and Information / v.7 no.1 / pp.1-22 / 2003
  • Quantifier floating is one of the much-studied phenomena in natural languages, in which quantifying expressions may appear in places other than their original prenominal position. Its presence is especially prominent in languages such as Korean that allow more or less free word order. We find that, in addition to what is described in the literature, there are other remarkable regularities in the way the language allows quantifiers to “float” with respect to various constructions including coordination, relative clauses, and embedded clauses. These regularities are captured syntactically in a combinatory categorial grammar (CCG) framework for Korean. We also show how to derive semantic representations for Korean quantifier floating in the same CCG framework.


Language and Symbolic Reference in Whitehead′s Philosophy (화이트헤드의 언어 이해와 상징적 연관)

  • 문창옥
    • Lingua Humanitatis / v.6 / pp.147-166 / 2004
  • Whitehead's discussion of language is not to be found in any one book or article. It is interwoven with his discussion of many other questions. He was, however, greatly concerned with the problem of symbolism in general and the uses of language. He regards language, spoken or written, as an instrument devised by men to aid them in their adjustment to the environment in which they live. Language is used for many specific purposes in the process of this adjustment. Words are employed not only to refer to data and to express emotions. They may also be used to record experiences and thoughts about these experiences. Words also function as instruments in the organization of experiences as they are considered in retrospect. Thus words free us from the bondage of the immediate. Whitehead's theory of meaning is implicit in his discussion of the functions of language. According to him, the human mind is functioning symbolically when some components of its experience elicit consciousness, beliefs, emotions, and usages respecting other components of its experiences. The former set of components are the 'symbols', and the latter set constitute the 'meaning' of the symbols. Whitehead points out that one word may have several meanings, i.e. refer to several different data. Thus, in order to understand the meaning to which a word refers, it is sometimes very important to appreciate the system of thought within which a person is operating. Further, Whitehead's discussion of language includes a number of cogent warnings about the deficiencies of language, and hence the need for great care in the use of words. In fact, language developed gradually. For the most part we have created words designed to deal with practical problems. Attention focuses on the prominent features in a situation, in particular the changing aspects of things. With reference to such data our words are relatively adequate. However, this issues in an unfortunate superficiality. The enduring, the subtle, the complex and the general aspects of the universe do not have adequate verbal representation. For this reason, Whitehead's position concerning the uses of language in speculative philosophy is stated with pungent directness. The uncritical trust in the adequacy of language is one of the main errors to which philosophy is liable. Since ordinary language does not do justice to the generalities, profundities and complexities of life, it is obvious that philosophy requires new words and phrases, or at least the revision of familiar words and phrases. Proceeding to develop the theme, Whitehead contends that words and phrases must be stretched towards a generality foreign to their ordinary usage. In the same vein, Whitehead refers to the need to realize that language, which is the tool of philosophy, needs to be redesigned just as, in physical science, available physical apparatus needs to be redesigned. But even these words and phrases, stretched or redesigned, are never completely adequate in philosophical speculations. They are, in his opinion, merely a great improvement over ordinary language or the language of science, mathematics or symbolic logic.


Romanian-Lexicon-Based Sentiment Analysis for Assesing Teachers' Activity

  • Barila, Adina;Danubianu, Mirela;Gradinaru, Bogdanel
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.43-50 / 2022
  • Students' feedback is important for measuring and improving teaching performance. Many teacher performance evaluation systems are based on responses to closed questions, but free-text answers can contain useful information that should be explored. In this paper we present a lexicon-based sentiment analysis to explore students' text feedback. The data was collected from a system for the evaluation of teachers by students, developed and used in our university. The students' comments are in Romanian, so we built a Romanian sentiment word lexicon. We used this to categorize the feedback text as positive, negative or neutral. In addition, we added a new polarity, indifferent, in order to categorize blank and "I don't answer" responses.
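
The four-way categorization described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system. The toy English lexicon, the word weights, the tokenization, and the `classify` function are all assumptions, since the paper's Romanian lexicon and scoring rule are not reproduced here.

```python
import re

# Hypothetical toy lexicon (English stand-in for the paper's Romanian lexicon).
LEXICON = {
    "good": 1, "clear": 1, "helpful": 1,
    "boring": -1, "confusing": -1, "late": -1,
}

# Responses treated as "no answer", per the abstract's description.
NO_ANSWER = {"", "i don't answer"}

def classify(feedback):
    """Return 'positive', 'negative', 'neutral' or 'indifferent'."""
    text = feedback.strip().lower()
    if text in NO_ANSWER:
        return "indifferent"
    words = re.findall(r"[a-z']+", text)
    # Assumed scoring rule: sum of per-word polarities.
    score = sum(LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for sample in ["Clear and helpful lectures", "confusing slides", "", "I don't answer"]:
    print(repr(sample), classify(sample))
```

Handling blank and "I don't answer" responses before scoring keeps them from being counted as neutral opinions.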

Syntactic and Semantic Disambiguation for Interpretation of Numerals in the Information Retrieval (정보 검색을 위한 숫자의 해석에 관한 구문적.의미적 판별 기법)

  • Moon, Yoo-Jin
    • Journal of the Korea Society of Computer and Information / v.14 no.8 / pp.65-71 / 2009
  • Natural language processing is necessary to efficiently filter the tremendous amount of information produced in information retrieval on the World Wide Web. This paper proposes an algorithm for interpreting the meaning of numerals in text. The algorithm uses context-free grammars with the chart parsing technique, interprets affixes attached to the numerals, and is designed to disambiguate their meanings systematically with the support of n-gram-based words. The algorithm is also designed to use POS (part-of-speech) taggers, to automatically recognize restriction conditions on trigram words, and to gradually disambiguate the meaning of the numerals. An experiment was performed on the proposed numeral interpretation system. The results showed that the frequency-proportional method recognized the numerals with 86.3% accuracy and the condition-proportional method with 82.8% accuracy.