• Title/Summary/Keyword: Lexical Knowledge

Search Result 57, Processing Time 0.021 seconds

Analysis on Vocabulary Used in School Newsletters of Korean elementary Schools: Focus on the areas of Busan, Ulsan and Gyeongnam (한국 초등학교 가정통신문의 어휘 특성 연구 -부산·울산·경남 지역을 중심으로-)

  • Kang, Hyunju
    • Journal of Korean language education
    • /
    • v.29 no.2
    • /
    • pp.1-23
    • /
    • 2018
  • This study aims to analyze words and phrases which are frequently used in newsletters from Korean elementary schools. In order to achieve this goal, high frequent words from school newsletters were selected and classified into content and function words, and the domains of the words were looked up. For this study 1,000 school newsletters were collected in the areas of Busan, Ulsan and Gyeongnam. In terms of parts of speech, nouns, especially common nouns, most frequently appeared in the school newsletters followed by verbs and adjectives. This result shows that for immigrant women who have basic knowledge on Korean language, it is useful to give translated words to get the message of school letters. Furthermore, school related terms such as facilities, regulations and activities of school and Chinese-based vocabularies are found in school newsletters. In case of verbs, the words which contain the meaning of requests and suggestions are used the most. Adjectives which are related to positive value and evaluation, and describing weather and season is frequently used as well.

A Multimedia Bulletin Board System Providing Semantic-based Searching (의미 기반 정보 검색을 제공하는 멀티미디어 게시판 시스템)

  • Jung Eui-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.6 s.38
    • /
    • pp.75-84
    • /
    • 2005
  • Bulletin board systems have evolved to support diverse multimedia data as well as text. However, current board systems have an weakness : it takes much time and efforts for users to figure out contents of articles. Most board systems provide a searching function with lexical level data access for solving that problem, however it fails to serve users' intented searching results. Moreover, it is nearly impossible to search proper articles if they contain multimedia data. This paper proposed a bulletin board system adopting the Semantic Web to solve this issue. The proposed system provides users with new ontology which is used for describing articles' domain knowledge and multimedia features. Users can describe their own board ontology using the proposed ontology. To support semantic-based searching for diverse domain knowledge without modification of the system, the system dynamically generated input/query interface and RDF data access module according to the board ontology written by administrators. The proposed board system shows that semantic-based searching is feasible and effective for users to find their intended articles.

  • PDF

Construction of Korean Wordnet "KorLex 1.5" (한국어 어휘의미망 "KorLex 1.5"의 구축)

  • Yoon, Ae-Sun;Hwang, Soon-Hee;Lee, Eun-Ryoung;Kwon, Hyuk-Chul
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.1
    • /
    • pp.92-108
    • /
    • 2009
  • The Princeton WordNet (PWN), which was developed during last 20 years since the mid 80, aimed at representing a mental lexicon inside the human mind. Its potentiality, applicability and portability were more appreciated in the fields of NLP and KE than in cognitive psychology. The semantic and knowledge processing is indispensable in order to obtain useful information using human languages, in the CMC and HCI environment. The PWN is able to provide such NLP-based systems with 'concrete' semantic units and their network. Referenced to the PWN, about 50 wordnets of different languages were developed during last 10 years and they enable a variety of multilingual processing applications. This paper aims at describing PWN-referenced Korean Wordnet, KorLex 1.5, which was developed from 2004 to 2007, and which contains currently about 130,000 synsets and 150,000 word senses for nouns, verbs, adjectives, adverbs, and classifiers.

Constructing the Semantic Information Model using A Collective Intelligence Approach

  • Lyu, Ki-Gon;Lee, Jung-Yong;Sun, Dong-Eon;Kwon, Dai-Young;Kim, Hyeon-Cheol
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.5 no.10
    • /
    • pp.1698-1711
    • /
    • 2011
  • Knowledge is often represented as a set of rules or a semantic network in intelligent systems. Recently, ontology has been widely used to represent semantic knowledge, because it organizes thesaurus and hierarchal information between concepts in a particular domain. However, it is not easy to collect semantic relationships among concepts. Much time and expense are incurred in ontology construction. Collective intelligence can be a good alternative approach to solve these problems. In this paper, we propose a collective intelligence approach of Games With A Purpose (GWAP) to collect various semantic resources, such as words and word-senses. We detail how to construct the semantic information model or ontology from the collected semantic resources, constructing a system named FunWords. FunWords is a Korean lexical-based semantic resource collection tool. Experiments demonstrated the resources were grouped as common nouns, abstract nouns, adjective and neologism. Finally, we analyzed their characteristics, acquiring the semantic relationships noted above. Common nouns, with structural semantic relationships, such as hypernym and hyponym, are highlighted. Abstract nouns, with descriptive and characteristic semantic relationships, such as synonym and antonym are underlined. Adjectives, with such semantic relationships, as description and status, illustration - for example, color and sound - are expressed more. Last, neologism, with the semantic relationships, such as description and characteristics, are emphasized. Weighting the semantic relationships with these characteristics can help reduce time and cost, because it need not consider unnecessary or slightly related factors. This can improve the expressive power, such as readability, concentrating on the weighted characteristics. Our proposal to collect semantic resources from the collective intelligence approach of GWAP (our FunWords) and to weight their semantic relationship can help construct the semantic information model or ontology would be a more effective and expressive alternative.

Network Analysis between Uncertainty Words based on Word2Vec and WordNet (Word2Vec과 WordNet 기반 불확실성 단어 간의 네트워크 분석에 관한 연구)

  • Heo, Go Eun
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.53 no.3
    • /
    • pp.247-271
    • /
    • 2019
  • Uncertainty in scientific knowledge means an uncertain state where propositions are neither true or false at present. The existing studies have analyzed the propositions written in the academic literature, and have conducted the performance evaluation based on the rule based and machine learning based approaches by using the corpus. Although they recognized that the importance of word construction, there are insufficient attempts to expand the word by analyzing the meaning of uncertainty words. On the other hand, studies for analyzing the structure of networks by using bibliometrics and text mining techniques are widely used as methods for understanding intellectual structure and relationship in various disciplines. Therefore, in this study, semantic relations were analyzed by applying Word2Vec to existing uncertainty words. In addition, WordNet, which is an English vocabulary database and thesaurus, was applied to perform a network analysis based on hypernyms, hyponyms, and synonyms relations linked to uncertainty words. The semantic and lexical relationships of uncertainty words were structurally identified. As a result, we identified the possibility of automatically expanding uncertainty words.

An Emotion Scanning System on Text Documents (텍스트 문서 기반의 감성 인식 시스템)

  • Kim, Myung-Kyu;Kim, Jung-Ho;Cha, Myung-Hoon;Chae, Soo-Hoan
    • Science of Emotion and Sensibility
    • /
    • v.12 no.4
    • /
    • pp.433-442
    • /
    • 2009
  • People are tending to buy products through the Internet rather than purchasing them from the store. Some of the consumers give their feedback on line such as reviews, replies, comments, and blogs after they purchased the products. People are also likely to get some information through the Internet. Therefore, companies and public institutes have been facing this situation where they need to collect and analyze reviews or public opinions for them because many consumers are interested in other's opinions when they are about to make a purchase. However, most of the people's reviews on web site are too numerous, short and redundant. Under these circumstances, the emotion scanning system of text documents on the web is rising to the surface. Extracting writer's opinions or subjective ideas from text exists labeled words like GI(General Inquirer) and LKB(Lexical Knowledge base of near synonym difference) in English, however Korean language is not provided yet. In this paper, we labeled positive, negative, and neutral attribute at 4 POS(part of speech) which are noun, adjective, verb, and adverb in Korean dictionary. We extract construction patterns of emotional words and relationships among words in sentences from a large training set, and learned them. Based on this knowledge, comments and reviews regarding products are classified into two classes polarities with positive and negative using SO-PMI, which found the optimal condition from a combination of 4 POS. Lastly, in the design of the system, a flexible user interface is designed to add or edit the emotional words, the construction patterns related to emotions, and relationships among the words.

  • PDF

A study on the predictability of acoustic power distribution of English speech for English academic achievement in a Science Academy (과학영재학교 재학생 영어발화 주파수 대역별 음향 에너지 분포의 영어 성취도 예측성 연구)

  • Park, Soon;Ahn, Hyunkee
    • Phonetics and Speech Sciences
    • /
    • v.14 no.3
    • /
    • pp.41-49
    • /
    • 2022
  • The average acoustic distribution of American English speakers was statistically compared with the English-speaking patterns of gifted students in a Science Academy in Korea. By analyzing speech recordings, the duration time of which is much longer than in previous studies, this research identified the degree of acoustic proximity between the two parties and the predictability of English academic achievement of gifted high school students. Long-term spectral acoustic power distribution vectors were obtained for 2,048 center frequencies in the range of 20 Hz to 20,000 Hz by applying an long-term average speech spectrum (LTASS) MATLAB code. Three more variables were statistically compared to discover additional indices that can predict future English academic achievement: the receptive vocabulary size test, the cumulative vocabulary scores of English formative assessment, and the English Speaking Proficiency Test scores. Linear regression and correlational analyses between the four variables showed that the receptive vocabulary size test and the low-frequency vocabulary formative assessments which require both lexical and domain-specific science background knowledge are relatively more significant variables than a basic suprasegmental level English fluency in the predictability of gifted students' academic achievement.