• Title/Summary/Keyword: Syntactic

A Research for Web Documents Genre Classification using STW (STW를 이용한 웹 문서 장르 분류에 관한 연구)

  • Ko, Byeong-Kyu;Oh, Kun-Seok;Kim, Pan-Koo
    • Journal of Information Technology and Architecture / v.9 no.4 / pp.413-422 / 2012
  • Many researchers have studied how to make machines understand the meaning of human natural language, using text-based, page-rank-based, and other approaches. In particular, URL and HTML tag information in web documents is attracting attention again as a means of automatically analyzing huge numbers of web documents. In this paper, we propose an STW (Semantic Term Weight) approach, based on the syntactic and linguistic structure of web documents, to classify their genres. For the evaluation, we analyzed more than 1,000 documents from the 20-Genre-collection corpus to train an SVM-based classifier. Afterwards, we tested on the KI-04 corpus to evaluate the performance of the proposed method. Accuracy was measured in two experiments, one using STW and one without. The proposed STW-based approach showed approximately 10.2% higher accuracy than the approach without STW.
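As a rough illustration of how HTML tag information can feed a term weight, the sketch below counts each term with a weight that depends on the tags enclosing it. The abstract does not give the STW formula, so the tag weights, the `TagWeightedTerms` class, and the `weighted_terms` helper are all hypothetical; this is not the authors' method, and the SVM training step is omitted.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights: the abstract does not state how tags are weighted.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "a": 1.5}
DEFAULT_WEIGHT = 1.0


class TagWeightedTerms(HTMLParser):
    """Accumulate a weight per term based on the currently open tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.weights = Counter()

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        # Use the largest weight among the enclosing tags.
        weight = max(
            (TAG_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.open_tags),
            default=DEFAULT_WEIGHT,
        )
        for term in re.findall(r"[a-z]+", data.lower()):
            self.weights[term] += weight


def weighted_terms(html):
    """Return a {term: weight} dict for one HTML document."""
    parser = TagWeightedTerms()
    parser.feed(html)
    parser.close()
    return dict(parser.weights)
```

A call such as `weighted_terms(page_source)` yields a feature dictionary that could then be vectorized for a classifier; terms in a title or heading count more than the same terms in body text.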

Generation of Natural Referring Expressions by Syntactic Information and Cost-based Centering Model (구문 정보와 비용기반 중심화 이론에 기반한 자연스러운 지시어 생성)

  • Roh Ji-Eun;Lee Jong-Hyeok
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1649-1659 / 2004
  • Text generation is the process of generating comprehensible texts in human languages from some underlying non-linguistic representation of information. Among the several sub-processes needed to generate coherent texts, this paper concerns referring expression generation, which produces different types of expressions to refer to previously mentioned things in a discourse. Specifically, we focus on pronominalization by zero pronouns, which occur frequently in Korean. To build a generation model of referring expressions for Korean, several features are identified based on grammatical information and a cost-based centering model, and these features are applied to various machine learning techniques. We demonstrate that our proposed features are well defined to explain pronominalization, especially pronominalization by zero pronouns in Korean, through 95 texts from three genres: descriptive texts, news, and short Aesop's fables. We also show that our model significantly outperforms previous ones at a 99.9% confidence level according to a t-test.

Detecting Inconsistent Code Identifiers (코드 비 일관적 식별자 검출 기법)

  • Lee, Sungnam;Kim, Suntae;Park, Sooyoung
    • KIPS Transactions on Software and Data Engineering / v.2 no.5 / pp.319-328 / 2013
  • Software maintainers rely heavily on source code identifiers to comprehend source code. Thus, inconsistent use of identifiers across the source code increases the cost of software maintenance. Although project participants can adopt peer reviews to handle this problem, it may be impossible to go through the entire source code when the volume of code is huge. This paper introduces an approach for automatically detecting inconsistent identifiers in Java source code. The approach consists of tokenizing and POS tagging all identifiers in the source code, classifying syntactically and semantically similar terms, and finally detecting inconsistent identifiers by applying the proposed rules. In addition, we have developed a tool, named CodeAmigo, to support the proposed approach. We applied it to two popular Java-based open source projects and computed precision to show the feasibility of the approach.
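The tokenizing step and a simple consistency check can be sketched as follows. This is a minimal illustration, not CodeAmigo: the abstract does not list the detection rules, so the `SYNONYM_GROUPS` table and both function names are hypothetical, and POS tagging is left out entirely.

```python
import re
from collections import defaultdict

# Hypothetical groups of interchangeable terms; the paper's real rules
# and dictionaries are not given in the abstract.
SYNONYM_GROUPS = [
    {"get", "fetch", "retrieve"},
    {"remove", "delete"},
]


def tokenize_identifier(identifier):
    """Split a camelCase, PascalCase, or snake_case identifier into lowercase tokens."""
    tokens = []
    for part in re.split(r"[_$]+", identifier):
        # Acronym runs (XML), capitalized or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [token.lower() for token in tokens]


def find_inconsistent_terms(identifiers):
    """Return the terms used from each synonym group when more than one appears."""
    used = defaultdict(set)
    for identifier in identifiers:
        for token in tokenize_identifier(identifier):
            for index, group in enumerate(SYNONYM_GROUPS):
                if token in group:
                    used[index].add(token)
    return [sorted(terms) for _, terms in sorted(used.items()) if len(terms) > 1]
```

For example, passing `["getUserName", "fetchUserAddress", "removeItem"]` flags `get` and `fetch` as competing terms for the same concept, while `remove` is left alone because no synonym of it is used.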

Learning Rules for Identifying Hypernyms in Machine Readable Dictionaries (기계가독형사전에서 상위어 판별을 위한 규칙 학습)

  • Choi Seon-Hwa;Park Hyuk-Ro
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.171-178 / 2006
  • Most approaches for extracting hypernyms of a noun from its definitions in a machine-readable dictionary (MRD) rely on lexical patterns compiled by human experts. Not only do these approaches incur a high cost for compiling lexical patterns, but it is also very difficult for human experts to compile a set of lexical patterns with broad coverage, because natural languages offer various expressions that represent the same concept. To alleviate these problems, this paper proposes a new method for extracting hypernyms of a noun from its definitions in an MRD. In the proposed approach, we use only syntactic (part-of-speech) patterns instead of lexical patterns to identify hypernyms, which reduces the number of patterns while keeping their coverage broad. Our experiment shows that the classification accuracy of the proposed method is 92.37%, which is significantly better than that of previous approaches.

Intensifiers in Korean, English and German: Focusing on their non-head-bound-use (한국어, 영어 그리고 독일어의 강화사: 비결속 용법을 중심으로)

  • 최규련
    • Language and Information / v.7 no.2 / pp.31-58 / 2003
  • The main goal of this paper is to describe and analyse intensifiers, especially non-head-bound-intensifiers (NHBIs), which can be included in the discussion and analysis of these elements as focus particles. In doing so, NHBIs such as Korean susulo and casin/cache, English x-self, and German selbst are dealt with from a cross-linguistic perspective. A pure and strict comparison between Korean, English and German is not intended. This paper is mainly concerned with the semantic domain where the respective contributions of the expressions in question overlap, which offers a common basis for discussing Korean, a non-European language, together with English and German, two European languages. These expressions share the semantic domain ‘intensification’ with respect to the relevant subject-NP. They introduce an ordering that distinguishes center and periphery. In contrast to head-bound-intensifiers (HBIs), however, NHBIs add self-involvement (directness of involvement) of the subject-NP to the meaning of the relevant sentence. I adopt the proposals of Konig (1991), Primus (1992) and Siemund (2000) in treating intensifiers as focus particles. However, I reject the claims of Konig (1991) that only NHBIs take scope over a whole clause, of Primus (1992) that NHBIs focus VPs rather than NPs, and of Siemund (2000) that NHBIs can be further divided into two groups, viz. NHBIs with exclusive readings and NHBIs with inclusive readings. Evidence for my position is presented mainly in the course of describing and analysing some syntactic properties and the meaning and use of NHBIs. I come to the conclusion that both the common meaning of intensifiers as focus particles and the common meaning of NHBIs in the three languages can be represented by a simple logical formalism.

Analysis of Problem-Solving Protocol of Mathematical Gifted Children from Cognitive Linguistic and Meta-affect Viewpoint (인지언어 및 메타정의의 관점에서 수학 영재아의 문제해결 프로토콜 분석)

  • Do, Joowon;Paik, Suckyoon
    • Education of Primary School Mathematics / v.22 no.4 / pp.223-237 / 2019
  • There is a close interaction between the linguistic-syntactic representation system and the affective representation system that appear in the mathematical process. In addition, since the mathematical conceptual system is fundamentally metaphoric, analyzing the structure of mathematical concepts through linguistic representation can help to identify the source of cognitive and affective obstacles that interfere with mathematics learning. In this study, we analyzed the problem-solving protocols of mathematically gifted children from the perspectives of cognitive linguistics and meta-affect to identify the relationship between the functional characteristics of the text and metaphors they use and the functional characteristics of meta-affect. As a result, the cognitive and affective characteristics exhibited by mathematically gifted children differed according to whether problem solving succeeded. In unsuccessful problem solving, the use of metaphor as an internal representation system was relatively more frequent than in successful cases. In addition, while the cognitive linguistic aspects of metaphors play an important role in problem solving, meta-affective attributes are closely related to the external representation of metaphors.

Semantic-based Keyword Search System over Relational Database (관계형 데이터베이스에서의 시맨틱 기반 키워드 탐색 시스템)

  • Yang, Younghyoo
    • Journal of the Korea Society of Computer and Information / v.18 no.12 / pp.91-101 / 2013
  • One issue with keyword search in general is its ambiguity, which can ultimately reduce the effectiveness of the search in terms of the quality of the results. This ambiguity is primarily due to the ambiguity of the contextual meaning of each term in the query. In addition to query ambiguity itself, the relationships between the keywords in the search results are crucial for the user's proper interpretation of the results and should be clearly presented. We address the keyword search ambiguity issue by adapting some existing approaches for mapping query terms to schema terms/instances. The adapted approaches capture both the syntactic similarity and the semantic similarity between the query keywords and the schema terms, giving better mappings and ultimately raising the accuracy of the results by 50%. Finally, to address the lack of clear relationships among the terms appearing in the search results, our system leverages semantic web technologies to enrich the knowledge base and to discover the relationships between the keywords.
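The syntactic half of the keyword-to-schema mapping can be illustrated with plain string similarity. The sketch below assumes character-level similarity from Python's `difflib`; the abstract does not say which measure the system uses, and the function names, the example schema terms, and the 0.6 threshold are hypothetical. Semantic similarity is not covered.

```python
from difflib import SequenceMatcher


def syntactic_similarity(keyword, schema_term):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, keyword.lower(), schema_term.lower()).ratio()


def map_keyword(keyword, schema_terms, threshold=0.6):
    """Return the most similar schema term, or None if nothing clears the threshold."""
    if not schema_terms:
        return None
    # The threshold value is an assumption chosen for illustration only.
    best_score, best_term = max(
        (syntactic_similarity(keyword, term), term) for term in schema_terms
    )
    return best_term if best_score >= threshold else None
```

With schema terms such as `["authors", "title", "publisher"]`, the query keyword `author` maps to `authors`, while a keyword with no close match returns `None` and would have to be resolved by a semantic measure instead.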

Comparison of prosodic characteristics by question type in left- and right-hemisphere-injured stroke patients (좌반구 손상과 우반구 손상 뇌졸중 환자의 의문문 유형에 따른 운율 특성 비교)

  • Yu, Youngmi;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.13 no.3 / pp.1-13 / 2021
  • This study examined the characteristics of linguistic prosody in terms of cerebral lateralization in three groups: 9 healthy speakers and 14 speakers with a history of stroke (7 with left hemisphere damage (LHD) and 7 with right hemisphere damage (RHD)). Specifically, prosodic characteristics related to speech rate, duration, pitch, and intensity were examined in three types of interrogative sentences (wh-questions, yes-no questions, alternative questions) with auditory perceptual evaluation. The statistically significant key variables revealed flaws in the production of linguistic prosody by the speakers with LHD. These variables were produced less adequately for wh-questions than for yes-no and alternative questions. This trend was particularly noticeable in variables related to pitch and speech rate. The result suggests that when Korean speakers process linguistic prosody, such as that of lexico-semantic and syntactic information in interrogative sentences, the left hemisphere seems to be superior to the right hemisphere.

Revisiting 'It'-Extraposition in English: An Extended Optimality-Theoretic Analysis

  • Khym, Han-gyoo
    • International Journal of Advanced Culture Technology / v.7 no.2 / pp.168-178 / 2019
  • In this paper I discuss a more complicated case of 'It'-Extraposition in English within Optimality Theory [1] by further modifying and extending the analysis in Khym (2018) [2], in which only the 'relatively' simple cases of 'It'-Extraposition, such as 'CP-Predicate', were dealt with. I show that the constraints and the constraint hierarchy developed to explain the 'relatively' simple cases of 'It'-Extraposition are no longer valid for the more complicated cases of 'It'-Extraposition in the 'CP-V-CP' configuration. In doing so, I also discuss two important theoretical possibilities and suggest a new view of 'It'-Extraposition. First, the long-standing question of which syntactic approach, P&P (Chomsky 1985) [3] or MP (Chomsky 1992) [4], should serve as the basis for projecting the full surface forms of candidates may boil down to a simple issue of an intrinsic property of the Gen(erator). Second, the so-called 'It'-Extraposition phenomenon may not actually be a construction derived by the optional application of the Extraposition operation. Rather, it could be just a representational construction produced by the simple application of 'It'-insertion after the structure is projected with the 'that-clause' in post-verbal position. This observation may lead to the elimination of one of the promising candidates, '$It_i \ldots [_{CP}\,that \sim]_i$', from the computation table in Khym [2], and eventually to excluding the long-named 'It'-Extraposition case from Extraposition phenomena altogether. The final constraints and constraint hierarchy explored are as follows. Constraints: $^*SSF$, AHSubj, Subj., Min-D. Constraint Hierarchy: SSF<<>>Subj.>> AHSubj.

An Approach to Chinese Conversations in the Textbook based on Social Units of Communication (중국어 회화문에 대한 의사소통 분석단위에 기초한 접근)

  • Park, Chan-Wook
    • Cross-Cultural Studies / v.49 / pp.127-150 / 2017
  • The objective of this study is to classify the conversations in Chinese textbooks according to the four social units (speech community, speech situation, speech event, speech act) adopted by Dell Hymes (1972), and to suggest how the results can be applied to the curriculum of Chinese education. To this end, this study treats every conversation in the Chinese textbooks as a coordination of specific speech events and acts under specific situations. It introduces the concept of the social unit adopted by Dell Hymes (1972) and elucidates the role of these units in conversation. Thus, this study reconsiders the conversations recorded in the textbooks not from a morphological or syntactic viewpoint but from a speech perspective. Finally, it suggests effective use of the results in Chinese conversation classes.