• Title/Summary/Keyword: Standard Korean Dictionary

Sensitivity Identification Method for New Words of Social Media based on Naive Bayes Classification (나이브 베이즈 기반 소셜 미디어 상의 신조어 감성 판별 기법)

  • Kim, Jeong In;Park, Sang Jin;Kim, Hyoung Ju;Choi, Jun Ho;Kim, Han Il;Kim, Pan Koo
    • Smart Media Journal / v.9 no.1 / pp.51-59 / 2020
  • From the era of PC communication through the growth of the internet, new words have continually been coined online, and with the spread of smartphones a social media culture has formed in which newly coined words are themselves part of the culture. With social networking sites and smartphones serving as a bridge, the volume of data has grown in real time. New words have advantages: they allow short sentences that fit within the character limits of various messengers and reduce data volume. However, new words have no dictionary meaning, which limits and degrades algorithms such as data mining. In this paper, therefore, data are collected through web crawling, the new words contained in the text are extracted, and a sentiment classification is established to determine the opinion of a document. The experiment proceeds in three stages. First, new words collected from social media are trained as positive or negative. Next, to derive and verify sentiment values using standard-language documents, TF-IDF is used to score the sentiment of nouns and assign sentiment values to the data. As with the new words, the classified sentiment values are applied to verify that sentiment is classified correctly in standard-language documents. Finally, the sentiment values of the newly coined words and of the standard language are combined for a comparative analysis of the technique.
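
The classification step named in the title can be illustrated with a minimal multinomial Naive Bayes sketch. The toy training tokens (romanized slang used as placeholders), the labels, the function names `train_nb` and `classify`, and the use of add-one smoothing are all illustrative assumptions, not the authors' data or implementation.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Count documents per label and words per label from (tokens, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def classify(tokens, model):
    """Return the label with the highest log posterior under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: romanized new words standing in for crawled text.
train = [
    (["kkuljaem", "movie"], "positive"),
    (["kkuljaem", "best"], "positive"),
    (["nojaem", "movie"], "negative"),
    (["nojaem", "meh"], "negative"),
]
model = train_nb(train)
print(classify(["kkuljaem", "drama"], model))
```

A word that never appeared in training (here `drama`) contributes equally to both labels, so the decision rests on the new word that was learned.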

Development of Sensibility Vocabulary Classification System for Sensibility Evaluation of Visitors According to Forest Environment

  • Lee, Jeong-Do;Joung, Dawou;Hong, Sung-Jun;Kim, Da-Young;Park, Bum-Jin
    • Journal of People, Plants, and Environment / v.22 no.2 / pp.209-217 / 2019
  • Human sensibility is generally expressed in language. To discover the sensibility of visitors in relation to the forest environment, it is first necessary to determine the exact meanings of the terms they use. Furthermore, it is necessary to sort these terms according to their meanings based on an appropriate classification system. This study attempted to develop a classification system for forest sensibility vocabulary by extracting Korean words used by forest visitors to express their sensibilities in relation to the forest environment, and established the structure of the system to classify the accumulated vocabulary. For this purpose, we extracted forest sensibility words from a literature review of previously reported experiences as well as interviews with forest visitors, and categorized the words by meaning using the Standard Korean Language Dictionary maintained by the National Institute of the Korean Language. Next, the classification system for these words was established with reference to the vocabulary classification systems examined in previous studies of Korean language and literature. As a result, 137 forest sensibility words were collected using a documentary survey, and we categorized these words into four types: emotion, sense, evaluation, and existence. Categorizing the collected forest sensibility words based on this Korean language classification system resulted in the extraction of 40 representative sensibility words. This experiment enabled us to determine which sense (sight, hearing, smell, taste, or touch) gives rise to the sensibilities expressed in the forest, along with other aspects of how human sensibilities are expressed, such as whether the subject of a word is person-centered or object-centered. We believe that the results of this study can serve as foundational data on forest sensibility.

Enhancing Performance of Bilingual Lexicon Extraction through Refinement of Pivot-Context Vectors (중간언어 문맥벡터의 정제를 통한 이중언어 사전 구축의 성능개선)

  • Kwon, Hong-Seok;Seo, Hyung-Won;Kim, Jae-Hoon
    • Journal of KIISE:Software and Applications / v.41 no.7 / pp.492-500 / 2014
  • This paper presents a performance enhancement of automatic bilingual lexicon extraction by refining pivot-context vectors under the standard pivot-based approach, which is a very effective method for low-resource language pairs. We improve the performance step by step through two refinements of the pivot-context vectors: the first filters out unhelpful elements of the pivot-context vectors and revises the values of the vectors using bidirectional translation probabilities estimated by Anymalign, and the second removes non-noun elements from the original vectors. Experiments were conducted on two language pairs, Korean-Spanish and Korean-French, in both directions. The experimental results show that, for high-frequency words, our method achieves at least 48.5% at top 1 and up to 88.5% at top 20, and for low-frequency words at least 43.3% at top 1 and up to 48.9% at top 20.
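
A rough sketch of the ranking idea follows, assuming that source and target words are each represented as sparse vectors over pivot-language words and compared by cosine similarity (a common choice, not confirmed by the abstract). The vectors, the noun set, and the function names `cosine`, `refine`, and `top_k` are hypothetical, and the refinement shown is only the simple "drop non-noun elements" step, not the Anymalign-based revision.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def refine(vec, keep):
    """Keep only the pivot-word elements listed in `keep` (e.g. nouns)."""
    return {k: w for k, w in vec.items() if k in keep}

def top_k(src_vec, tgt_vecs, k):
    """Rank target words by similarity to the source vector; return the best k."""
    ranked = sorted(tgt_vecs, key=lambda t: cosine(src_vec, tgt_vecs[t]), reverse=True)
    return ranked[:k]

# Hypothetical pivot-context vectors over English pivot words.
source = {"school": 0.8, "student": 0.5, "the": 0.9}
targets = {
    "escuela": {"school": 0.7, "student": 0.6, "the": 0.1},
    "perro": {"dog": 0.9, "the": 0.9},
}
nouns = {"school", "student", "dog"}  # assumed output of a part-of-speech filter

refined_source = refine(source, nouns)
refined_targets = {w: refine(v, nouns) for w, v in targets.items()}
print(top_k(refined_source, refined_targets, 1))
```

In this toy case the function word `the` gives the unrelated candidate a nonzero similarity before refinement; after non-noun elements are dropped, that similarity falls to zero.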

An Analyses of the Terms used in the Information Boards of Geosites at Jeonbuk West Coast National Geopark (전북 서해안권 국가지질공원 지질명소 안내 표지판에 사용된 용어 분석)

  • Shin, Young-Jun;Cho, Kyu-Seong
    • Journal of the Korean earth science society / v.41 no.1 / pp.40-47 / 2020
  • The purpose of this study was to analyze the terms used in the Information Boards of Geosites at Jeonbuk West Coast National Geopark. Nouns were extracted from the terms used in the Information Boards, checked against the Standard Korean Language Dictionary, a glossary of earth science terms, and the data for textbook development under the 2015 revised curriculum, and classified into eight types. Seventy-one nouns (10.8%) of the extracted terms were not listed in any glossary. Most of these were compound words formed by combining [noun]+[noun] or [noun]+[affix], which made them hard to comprehend. In addition, two hundred fifty-six nouns (46%) were identified as jargon used in specific disciplines. It is therefore strongly suggested that, when National Geopark Information Boards are created, terms containing academic jargon be explained with annotations so that general visitors and students can understand them without difficulty.

An Analysis of Earth Science Vocabularies Used in the 10th Grade Science Textbooks (10학년 과학 교과서 지구과학 용어 분석)

  • Choi, Haeng-Im;Lee, Hyon-Yong;Cho, Hyun-Jun
    • Journal of the Korean earth science society / v.29 no.4 / pp.363-371 / 2008
  • The purposes of this study were to analyze the level of Earth science vocabulary in 10th grade textbooks with the Science Word Analysis (SWA) program and to investigate the words that 10th grade students selected as difficult. For this purpose, we extracted the Earth science vocabulary from eleven textbooks and classified it into scientific and non-scientific vocabulary with the SWA program, based on the Standard Korean Language Dictionary. In addition, we investigated the difficulty of each word by surveying five hundred sixty students with a questionnaire. The results showed that, in all textbooks, scientific words beyond the designated level were more frequent than words at any other level. Most of the words that students selected as difficult to understand were classified as beyond the designated level. These results suggest that students' cognitive level should be considered when developing science textbooks and that difficult words should be replaced with easier ones without changing the meaning.

A Study on the Chinese Characters Originated in Japan in Japanes in Industrial Standard (일본공업규격 "정보교환용한자부호" 에 포함된 일본한자에 대한 연구)

  • Lee Choon-Tack
    • Journal of the Korean Society for Library and Information Science / v.22 / pp.219-257 / 1992
  • Among the Chinese characters that originated in Japan, some are very ancient in origin, while others came to exist in different forms through wide use in forged Chinese books. These characters can be divided into three groups. The first group consists of Chinese characters whose forms are different. Most of these are 'hoiui' (회의) characters, made by imitating the forms of the original Chinese characters. These characters have a meaning but no pronunciation, which is one distinct feature of Chinese characters that originated in Japan. The second group consists of Chinese characters whose meaning was assigned by the Japanese people. These fall into two subgroups: characters whose meanings are entirely different from those of the original Chinese characters, and characters whose meanings are not known although their pronunciations are known. Characters with different forms may have been made out of ignorance of an existing character, or made on purpose in order to be used with different meanings. The third group consists of characters formed by partially modifying original Chinese characters. Of these three groups, the purely Japanese-made Chinese characters are those in groups one and three, since those in group two are Chinese characters of which only the meanings (or pronunciations) are Japanese. A detailed investigation of the purely Japanese-made Chinese characters in JIS X 0208-1990 found the following: 1. There are 147 purely Japanese-made Chinese characters. 2. There are 5 characters that were originally Chinese but are now considered to be Japanese-made. Among these characters, 39 are not listed in TaeHanHwaSaJon (well known as the authoritative dictionary of Chinese characters), and 47 are not found in the dictionaries of Chinese characters compiled in Korea. 3. 14 characters appear to be Japanese-made Chinese characters, although this cannot be said with certainty because of the various meanings found in several dictionaries of Chinese characters.

Design of Web Application Framework Using REDIS for Class Management (REDIS를 활용한 학급경영 웹 애플리케이션 프레임워크의 설계)

  • Park, Joonseok;Chun, Seokju
    • Journal of The Korean Association of Information Education / v.18 no.3 / pp.381-390 / 2014
  • Traditionally, class management has tended to be run at the teacher's discretion. Today, however, students and teachers need to draw up their own standards manual together and manage the class autonomously in order to cultivate capable democratic citizens. Existing class management systems therefore do not meet the needs of today's diverse classes. In this paper, we design a new web application framework for class management using REDIS (Remote Dictionary Server). REDIS is a data store that holds various key-value data and is also commonly used as a solution for developing web applications with shared memory. We designed the web application framework to maximize both convenience of use and accessibility. The scalability of the class management system can be effectively enhanced using the diverse template functions that the framework provides by default.
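
To make the key-value idea concrete, the sketch below models how class-management data might be laid out under hash-style keys. It deliberately uses a small in-memory stand-in (`TinyHashStore`, hypothetical) instead of a real Redis client so that it runs without a server; the key naming scheme and the fields are assumptions for illustration, not the paper's design.

```python
class TinyHashStore:
    """In-memory stand-in that mimics the shape of hash-style key-value commands."""

    def __init__(self):
        self._data = {}

    def hset(self, key, field, value):
        # Store one field of the hash kept under `key`.
        self._data.setdefault(key, {})[field] = value

    def hget(self, key, field):
        # Return one field, or None if the key or field is missing.
        return self._data.get(key, {}).get(field)

    def hgetall(self, key):
        # Return a copy of the whole hash (empty dict if the key is missing).
        return dict(self._data.get(key, {}))

    def keys_with_prefix(self, prefix):
        # Hypothetical helper: list keys that start with `prefix`.
        return sorted(k for k in self._data if k.startswith(prefix))


store = TinyHashStore()

# Assumed key scheme: class:<class id>:student:<student id>
store.hset("class:3-2:student:01", "name", "Student A")
store.hset("class:3-2:student:01", "role", "library helper")
store.hset("class:3-2:student:02", "name", "Student B")
store.hset("class:3-2:student:02", "role", "recycling helper")

# Class rules agreed by students and teacher, kept under a separate key.
store.hset("class:3-2:rules", "1", "Raise a hand before speaking")

for key in store.keys_with_prefix("class:3-2:student:"):
    print(key, store.hgetall(key))
```

Keeping each student and each rule set under its own key is one way to let new kinds of records be added without changing a fixed schema.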

Verb Sense Disambiguation using Subordinating Case Information (종속격 정보를 적용한 동사 의미 중의성 해소)

  • Park, Yo-Sep;Shin, Joon-Choul;Ock, Cheol-Young;Park, Hyuk-Ro
    • The KIPS Transactions:PartB / v.18B no.4 / pp.241-248 / 2011
  • Homographs can have multiple senses. In order to understand the meaning of a sentence, it is necessary to identify which sense is used for each word in the sentence. Previous research on this problem relied heavily on word co-occurrence information. However, we noticed that, in the case of verbs, information about the subordinating cases of verbs can be used to further improve the performance of word sense disambiguation, because different senses require different sets of subordinating cases. In this paper, we propose verb sense disambiguation using subordinating case information. The case information is acquired from the postposition features in the Standard Korean Dictionary. Our experiment on 12 high-frequency verb homographs shows that adding case information can improve the performance of word sense disambiguation by 1.34%, from 97.3% to 98.7%. Although the improvement may seem marginal, we think it is meaningful because the error rate is reduced to less than half, from 2.7% to 1.3%.
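
The "less than half" claim can be checked with a short calculation. The sketch below uses only the rounded accuracies quoted in the abstract; the function name `relative_error_reduction` is an illustrative choice.

```python
def relative_error_reduction(acc_before, acc_after):
    """Fraction of the original error removed, given accuracies in percent."""
    err_before = 100.0 - acc_before  # 2.7 for the baseline
    err_after = 100.0 - acc_after    # 1.3 with case information
    return (err_before - err_after) / err_before

reduction = relative_error_reduction(97.3, 98.7)
print(f"{reduction:.1%}")  # roughly 52% of the errors are removed
```

A gain of about 1.4 accuracy points therefore corresponds to removing a little over half of the remaining errors, which is why a small absolute change can still matter near the top of the scale.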

Recommending Core and Connecting Keywords of Research Area Using Social Network and Data Mining Techniques (소셜 네트워크와 데이터 마이닝 기법을 활용한 학문 분야 중심 및 융합 키워드 추천 서비스)

  • Cho, In-Dong;Kim, Nam-Gyu
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.127-138 / 2011
  • The core service of most research portal sites is providing researchers with research papers that match their interests. This kind of service is effective and easy to use only when a user can provide correct and concrete information about a paper, such as the title, authors, and keywords. Unfortunately, most users of this service are not acquainted with concrete bibliographic information, so they inevitably go through repeated trial and error in keyword-based search. Retrieving a relevant research paper is especially difficult when a user is a novice in the research domain and does not know appropriate keywords. In this case, the user has to search iteratively as follows: i) perform an initial search with an arbitrary keyword, ii) acquire related keywords from the retrieved papers, and iii) perform another search with the acquired keywords. This usage pattern implies that the service quality and user satisfaction of a portal site are strongly affected by its keyword management and search mechanism. To overcome this kind of inefficiency, some leading research portal sites adopt a keyword recommendation service based on association rule mining, similar to the product recommendation of online shopping malls. However, keyword recommendation based only on association analysis has the limitation that it can show only a simple and direct relationship between two keywords. In other words, association analysis by itself is unable to present the complex relationships among many keywords in adjacent research areas. To overcome this limitation, we propose a hybrid approach for establishing an association network among the keywords used in research papers. 
The keyword association network can be established in the following phases: i) the set of keywords specified in a paper is regarded as a set of co-purchased items, ii) association analysis is performed on the keywords to extract frequent keyword patterns that satisfy predefined thresholds of confidence, support, and lift, and iii) the frequent keyword patterns are schematized as a network to show the core keywords of each research area and the connecting keywords between two or more research areas. To estimate the practical applicability of our approach, we performed a simple experiment with 600 keywords extracted from 131 research papers published in five prominent Korean journals in 2009. In the experiment, we used SAS Enterprise Miner for association analysis and the R software for social network analysis. As the final outcome, we presented a network diagram and a cluster dendrogram for the keyword association network. We summarized the results in Section 4 of this paper. The main contributions of our proposed approach are the following: i) the keyword network can provide an initial roadmap of a research area to researchers who are novices in the domain, ii) a researcher can grasp the distribution of the many keywords neighboring a certain keyword, and iii) researchers can get ideas for converging different research areas by observing the connecting keywords in the keyword association network. Further studies should address the following. First, the current version of our approach does not implement a standard meta-dictionary. For practical use, homonyms, synonyms, and multilingual problems should be resolved with a standard meta-dictionary. Additionally, clearer guidelines for clustering research areas and defining core and connecting keywords should be provided. Finally, intensive experiments should be performed not only on Korean research papers but also on international papers.
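
The three phases above can be sketched for keyword pairs as follows. This is a simplified illustration in plain Python, not the SAS Enterprise Miner and R workflow used in the study; the toy papers, the threshold values, and the function name `pair_rules` are assumptions. Support, confidence, and lift follow their usual definitions.

```python
from collections import Counter, defaultdict
from itertools import combinations

def pair_rules(papers, min_support, min_confidence, min_lift):
    """Return (x, y, support, confidence, lift) for rules x -> y that pass all thresholds."""
    n = len(papers)
    item_count = Counter()
    pair_count = Counter()
    for keywords in papers:
        unique = sorted(set(keywords))  # each paper is one "basket" of keywords
        item_count.update(unique)
        pair_count.update(combinations(unique, 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of papers containing both keywords
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]      # P(y | x)
            lift = confidence / (item_count[y] / n)  # P(y | x) / P(y)
            if support >= min_support and confidence >= min_confidence and lift >= min_lift:
                rules.append((x, y, support, confidence, lift))
    return rules

# Hypothetical keyword lists standing in for the papers' author keywords.
papers = [
    ["data mining", "association rule", "recommendation"],
    ["data mining", "association rule"],
    ["social network", "recommendation"],
    ["social network", "data mining"],
]
rules = pair_rules(papers, min_support=0.5, min_confidence=0.6, min_lift=1.0)

# Phase iii: turn the surviving rules into an undirected keyword network.
network = defaultdict(set)
for x, y, _, _, _ in rules:
    network[x].add(y)
    network[y].add(x)

for rule in rules:
    print(rule)
```

In a network built this way, keywords with many neighbors inside one cluster would be candidates for core keywords, and keywords linking two clusters would be candidates for connecting keywords.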

An Analysis Study on the Contents of Occupation in Technology & Home Economics Textbooks for Middle School : focusing on preparation for Low Birthrate & Aging Society (저출산·고령사회 대비 관점에서 중학교 기술·가정 교과서에 제시된 직업 내용 분석)

  • Lee, Soo-jeong
    • Journal of vocational education research / v.37 no.1 / pp.139-156 / 2018
  • This study analyzed the occupational contents presented in a total of 24 Technology & Home Economics (1) and (2) textbooks for middle school under the 2009 revised curriculum. By analyzing the types of occupations shown in the textbooks based on the Korean Standard Classification of Occupations (hierarchical classification), their frequency (ratio), and the occupational contents by textbook unit and by data type, this study provides basic data for understanding the diverse aspects of occupational contents. The results show that, in the middle school Technology & Home Economics textbooks, the 'home life' area presented occupational contents at a relatively higher ratio than the 'world of technology' area, while the frequency (ratio) of occupational contents differed greatly by publisher, area, and unit. The occupation names presented in the textbooks provided very limited information, covering only 5.27% of the occupations listed in the Korea Dictionary of Occupations. In particular, because the occupational information was concentrated in the 'professionals and related workers' category of the Korean Standard Classification of Occupations (hierarchical classification), opportunities to learn about the diversity of occupations were limited. Based on these results, in addition to introducing diverse occupational contents so that students value all occupations, it is also necessary to seek institutional measures related to textbook development and career education so that students can explore careers in light of their aptitudes and interests.