• Title/Summary/Keyword: Korean Unabridged Dictionary


A Comparative Study of Mathematical Terms in Korean Standard Unabridged Dictionary and the Editing Material (표준국어대사전과 편수자료의 수학 용어 비교 조사)

  • Her, Min
    • Journal for History of Mathematics / v.33 no.4 / pp.237-257 / 2020
  • In this paper, we classify the mathematical terms in the Korean Standard Unabridged Dictionary into four groups: ① group 1 consists of the terms that coincide with mathematical terms in the 2015 Editing Material; ② group 2 consists of the terms that are synonyms, old terms, or inflected forms of mathematical terms in the Editing Material; ③ group 3 consists of the terms that belong to neither group 1 nor group 2 but relate to elementary or secondary school mathematics; ④ group 4 consists of the terms that do not relate to elementary or secondary school mathematics. We then compare these groups with the mathematical terms in the Editing Material. In this comparison, we identify the mathematical terms that appear in the Editing Material but not in the Korean Standard Unabridged Dictionary. Using the synonyms and old terms of the mathematical terms in the Editing Material, we infer a rough tendency as to which terms are included in the Editing Material. By investigating the terms in groups 3 and 4, we identify mathematical terms that could be added to the Editing Material. We also identify wrong or inconsistent explanations in the Korean Standard Unabridged Dictionary.

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has gained attention, a great deal of research on unstructured data has been carried out. Social media on the Internet generate unstructured or semi-structured data every second, and these data are often written in the natural (human) languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which can yield incorrect results that are far from users' intentions. Even though much progress has been made in recent years in enhancing the performance of search engines so that they provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, thereby avoiding expensive sense-tagging processes. It evaluates the effectiveness of the method with a Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. In the experiments, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities, using cross validation. Only nouns, the target subjects in word sense disambiguation, were selected. 
93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged with the sense indices defined by that dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating the sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, terms were extracted from the input sentences and term vectors for the sentences were then created. Given the extracted term vector and the sense vector model built during the pre-processing stage, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study shows the effectiveness of the corpus merged from the examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiments show that precision and recall are better with the merged corpus. This study suggests that the method can practically enhance the performance of Internet search engines and help to capture the meaning of a sentence more accurately in natural language processing tasks pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. The Naïve Bayes classifier assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. 
Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
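
The general idea of training a Naïve Bayes sense classifier on dictionary examples merged with a sense-tagged corpus can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the class name `NaiveBayesWSD`, the sense labels, the romanized target noun "bae", and the toy context words are all hypothetical, and the abstract does not specify the features or smoothing actually used (Laplace smoothing is assumed here).

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Multinomial Naive Bayes over bag-of-words contexts for one target noun."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant (assumed)
        self.sense_counts = Counter()            # training sentences per sense
        self.word_counts = defaultdict(Counter)  # context-word counts per sense
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (sense_id, list_of_context_words)
        for sense, words in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(words)
            self.vocab.update(words)

    def predict(self, words):
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            denom = sum(self.word_counts[sense].values()) + self.alpha * vocab_size
            score = math.log(n / total)          # log prior of the sense
            for w in words:                      # add smoothed log likelihoods
                score += math.log((self.word_counts[sense][w] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical toy data: example sentences from a dictionary and from a
# sense-tagged corpus that share the same sense indices, so they can be merged.
dictionary_examples = [
    ("bae_1_ship", ["harbor", "sail", "captain"]),
    ("bae_2_pear", ["fruit", "sweet", "eat"]),
]
corpus_examples = [
    ("bae_1_ship", ["sea", "sail", "storm"]),
    ("bae_2_pear", ["market", "fruit", "ripe"]),
]

model = NaiveBayesWSD()
model.train(dictionary_examples + corpus_examples)
print(model.predict(["sail", "sea"]))     # bae_1_ship
print(model.predict(["sweet", "fruit"]))  # bae_2_pear
```

Merging is a plain list concatenation here only because both toy sources use the same sense labels, which mirrors the abstract's remark that the Sejong Corpus is tagged with the dictionary's sense indices.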

Hanja Information in the Entries of Korean Unabridged Dictionary (국어대사전의 표제어에 나타나는 한자 정보)

  • Kim, Cheol-Su
    • The Journal of the Korea Contents Association / v.10 no.4 / pp.438-446 / 2010
  • For language information processing that includes both Hangul and Hanja, an electronic dictionary supporting Hangul and Hanja simultaneously is necessary. This paper examines statistical information on the Hanja entries of the Korean Unabridged Dictionary, such as the number of entries that include Hanja based on the KSC-5601 character set, the frequency of the pronunciation and meaning of each Hanja character included in the entries, the frequency of Hanja in entries per part of speech, and the average number of Hanja characters per entry. One or more Hanja characters appear in 303,951 of the 440,594 entries, accounting for 68.99% of the total. The 440,594 entries contain 858,595 Hanja characters, or 1.95 Hanja characters per entry. As the average syllable length of the entries is 3.56 and the average number of Hanja characters per entry is 1.95, 54.7% of all the characters in the entries are Hanja. Among the 4,888 Hanja character codes, 4,660 are used at least once, whereas 228 never appear in any entry. Five characters appear more than 4,000 times. The 858,595 Hanja characters used in all the entries correspond to 471 Hangul codes.
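
The percentages in this abstract follow from the raw counts it quotes, and a short check makes the arithmetic explicit. The variable names are illustrative; the numbers are taken from the abstract, and the 54.7% figure is assumed to be the ratio of Hanja characters per entry to the average syllable length.

```python
# Counts quoted in the abstract
entries_total = 440_594
entries_with_hanja = 303_951
hanja_chars = 858_595
avg_syllables = 3.56

share_entries = entries_with_hanja / entries_total  # entries containing Hanja
hanja_per_entry = hanja_chars / entries_total       # Hanja characters per entry
share_chars = hanja_per_entry / avg_syllables       # Hanja share of all characters

print(f"{share_entries:.2%}")    # 68.99%
print(f"{hanja_per_entry:.2f}")  # 1.95
print(f"{share_chars:.1%}")      # 54.7%

# Used and unused Hanja codes add up to the full code set
assert 4_660 + 228 == 4_888
```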

Selection of Korean General Vocabulary for Machine Readable Dictionaries (자연언어처리용 전자사전을 위한 한국어 기본어휘 선정)

  • 배희숙;이주호;시정곤;최기선
    • Language and Information / v.7 no.1 / pp.41-54 / 2003
  • According to Jeong Ho-seong (1999), Koreans use on average only 20% of the 508,771 entries of the Korean standard unabridged dictionary. To build a machine-readable dictionary (MRD) for natural language processing, it is necessary to select Korean lexical units that are used frequently and are regarded as basic words. In this study, the selection is done semi-automatically using the KAIST large corpus. Of about 220,000 morphemes extracted from the corpus of 40,000,000 eojeols, 50,637 morphemes (54,797 senses) are selected. In addition, the coverage of these morphemes in various texts is examined with two sub-corpora of different styles. The total coverage is 91.21% in formal style and 93.24% in informal style. The coverage of the 6,130 first-degree morphemes is 73.64% and 81.45%, respectively.
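
A coverage figure of this kind can be computed as the share of running morphemes in a text that belong to the selected vocabulary. The sketch below assumes token-based coverage, which the abstract does not state explicitly; the function name `coverage` and the romanized toy morphemes are hypothetical.

```python
def coverage(tokens, vocabulary):
    """Share of running tokens that belong to the selected vocabulary."""
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if t in vocabulary)
    return covered / len(tokens)


# Hypothetical basic vocabulary and a toy morpheme sequence
basic_vocab = {"hakgyo", "gada", "meokda", "bap"}
sample = ["hakgyo", "gada", "bap", "meokda", "bibimbap"]

print(f"{coverage(sample, basic_vocab):.2%}")  # 80.00%
```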


A Study on the Formation of the 'Jeokbyeokdol (Red brick)' in Modern Korea (근대 적벽돌 생산사에 관한 연구)

  • Cho, Hong-Seok;Kim, Chung-Dong
    • Journal of architectural history / v.19 no.6 / pp.99-120 / 2010
  • The ultimate goal of this study is the 'renovation of red brick architecture'; a theoretical foundation and practical conservation measures for red brick architecture, grounded in historical records, must be established without delay. First, the study analyzes the related terminology and organizes the history and features of brick architecture in order to establish the architectural authenticity of red brick architecture. It also examines the production and construction processes of brick in Korea. According to the records, brick in the traditional sense is 'Jeondol' and Western brick in the modern sense is 'red brick'; 'brick' is the common designation. This study presents definitions of the terms based on documents published up to the 19th century and on Korean language and architectural term dictionaries. These sources show that brick, in a broad sense covering different types, was called 'Chu', 'Jeon', or 'Byeok' according to the purpose of use and the period. 'Jeon' was written with several characters, such as '塼', '磚', and '甎', but '塼' was used frequently. Although words such as 'Byeok' were used alone or in combination until the late 19th century, they fell out of use during the Japanese colonial period because of Japanese terms. After liberation, it became the term for traditional brick. 'Brick' has been in general use through modern times. An unabridged Korean language dictionary defines this term as the orthodox Korean '壁乭' and '?乭'. During the Japanese colonial period, 'Yeonwa (煉瓦)' was used alongside 'brick'. Under this influence it is still partly used today, but it is not in common use. A Korean language dictionary also lists 'Yeonwa' with the same definition as 'Byeokdol'. On the other hand, the word results from translating Japanese into Korean, so an exact definition of 'Yeonwa' is needed.

Playing with Rauschenberg: Re-reading Rebus (라우센버그와 게임하기-<리버스> 다시읽기)

  • Rhee, Ji-Eun
    • The Journal of Art Theory & Practice / no.2 / pp.27-48 / 2004
  • Robert Rauschenberg's artistic career has often been regarded as having reached its culmination when the artist won the first prize at the 1964 Venice Biennale. With this victory, Rauschenberg triumphantly entered the pantheon of all-American artists and firmly secured his position in the history of American art. On the other hand, despite the artist's ongoing new experiments in his art, the seemingly precocious ripeness in his career has led the critical discourses on Rauschenberg's art to the artist's early works, most of which were done in the mid-1950s and the 1960s. The crux of Rauschenberg criticism lies not only in focusing on the artist's 50's and 60's works, but also in its large dismissal of the significance of the imagery that the artist employed in his works. As art historians Roger Cranshaw and Adrian Lewis point out, the critical discourse of Rauschenberg either focuses on the formalist concerns on the picture plane, or relies on the "culturalist" interpretation of Rauschenberg's imagery which emphasizes the artist's "Americanness." Recently, a group of art historians centered around October has applied Charles Sanders Peirce's semiotics as art historical methodology and illuminated the indexical aspects of Rauschenberg's work. The semantic inquiry into Rauschenberg's imagery has also been launched by some art historians who seek the clues in the artist's personal context. The first half of this essay will examine the previous criticism on Rauschenberg's art and the other half will discuss the artist's 1955 work Rebus, which I think intersects various critical concerns of Rauschenberg's work, and yet defies the closure of discourses in one direction. 
The categories of signs in the semiotics of Charles Sanders Peirce and the discourse of Jean-Francois Lyotard will be used in discussing the meanings of Rebus, not to search for semantic readings of the work, but to draw an analogy between the paradoxical structures of the work and the theory. The definition of rebus is as follows: Rebus 1. a representation of words or syllables by pictures of objects or by symbols whose names resemble the intended words or syllables in sound; also: a riddle made up wholly or in part of such pictures or symbols. 2. a badge that suggests the name of the person to whom it belongs. Webster's Third New International Dictionary of the English Language Unabridged. Since its creation in 1955, Robert Rauschenberg's Rebus has been one of the most intriguing works in the artist's oeuvre. This monumental 'combine' painting (6 feet × 10 feet 10.5 inches) consists of three panels covered with fabric, paper, newspaper, and printed reproductions. On top of these, oil paints and pencil and crayon drawings connect each section into a whole. The overall layout of the images is horizontal. Starting from a torn election poster, which is partially read as "THAT REPRE," on the far left side of the painting, Rebus leads us to proceed from left to right, the typical direction of reading in a Western context. Along with its seemingly proper title, Rebus, the painting has prompted many art historians to seek semantic readings of it. These art historians painstakingly reconstruct the iconography based on the artist's interviews, (auto)biography, and the artistic context of his works. The interpretation of Rebus varies from an 'image-by-image' collation with a word to a more general commentary on Rauschenberg's work overall, such as a work that "bridges between art and life." Despite the title's allusion to the legitimate purpose of the painting as a decoding of the imagery into sound, Rebus, I argue, actually hinders a reading of it. 
By reading through Peirce to Rauschenberg, I will delve into the subtle anxiety between words and images in their works. And on this basis, I suggest Rauschenberg's strategy in playing Rebus is to hide the meaning of the imagery rather than to disclose it.
