• 제목/요약/키워드: Syntactic analysis

Search Result 263, Processing Time 0.017 seconds

Characteristics and Meanings of the SF Genre in Korea - From Propaganda of Modernization to Post-Human Discourse (한국 SF의 장르적 특징과 의의 -근대화에 대한 프로파간다부터 포스트휴먼 담론까지)

  • Lee, Ji-Yong
    • Journal of Popular Narrative
    • /
    • v.25 no.2
    • /
    • pp.33-69
    • /
    • 2019
  • This thesis aims to reveal the meanings of SF as a genre in Korea. Most of the studies on the characteristics of SF novels in Korea have revealed the meanings of characteristic elements of SF, or peripherally reviewed the characteristics of works. However, these methodologies have a limitation, such as analysis through the existing methodologies, while overlooking the identity of SF texts with the characteristics as a genre. To clearly define the value of texts in the SF genre, an understanding of the customs and codes of the genre is first needed. Thus, this thesis aims to generally handle matters like the historical context in which Korean SF was accepted by Korean society, and the meanings and characteristics when they were created and built up relationships with readers. In addition to fully investigating SF as a popular narrative & genre narrative that has not been fully handled by academic discourses, this thesis aims to practically reconsider the present/future possibilities of SF, which is currently being reconsidered given that the scientific imagination is regarded as important in the 21st century. This thesis considers the basic signification of Korean SF texts in academic discourses. Through this work, numerous Korean SF that have not been fully handled in the area of literature and cultural phenomena will be evaluated for their significance within the academic discourses, and also reviewed through diverse research afterwards. As a result, this work will be helpful for the development of discourse and the expansion of the Korean narrative area that has been diversely changed since the 21st century.

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.1-13
    • /
    • 2015
  • As opinion mining in big data applications has been highlighted, a lot of research on unstructured data has made. Lots of social media on the Internet generate unstructured or semi-structured data every second and they are often made by natural or human languages we use in daily life. Many words in human languages have multiple meanings or senses. In this result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, resulting in incorrect search results which are far from users' intentions. Even though a lot of progress in enhancing the performance of search engines has made over the last years in order to provide users with appropriate results, there is still so much to improve it. Word sense disambiguation can play a very important role in dealing with natural language processing and is considered as one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-base, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method which automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries and avoids expensive sense tagging processes. It experiments the effectiveness of the method based on Naïve Bayes Model, which is one of supervised learning algorithms, by using Korean standard unabridged dictionary and Sejong Corpus. Korean standard unabridged dictionary has approximately 57,000 sentences. Sejong Corpus has about 790,000 sentences tagged with part-of-speech and senses all together. For the experiment of this study, Korean standard unabridged dictionary and Sejong Corpus were experimented as a combination and separate entities using cross validation. Only nouns, target subjects in word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. Sejong Corpus was easily merged with Korean standard unabridged dictionary because Sejong Corpus was tagged based on sense indices defined by Korean standard unabridged dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added in the named entity dictionary of Korean morphological analyzer. By using the extended named entity dictionary, term vectors were extracted from the input sentences and then term vectors for the sentences were created. Given the extracted term vector and the sense vector model made during the pre-processing stage, the sense-tagged terms were determined by the vector space model based word sense disambiguation. In addition, this study shows the effectiveness of merged corpus from examples in Korean standard unabridged dictionary and Sejong Corpus. The experiment shows the better results in precision and recall are found with the merged corpus. This study suggests it can practically enhance the performance of internet search engines and help us to understand more accurate meaning of a sentence in natural language processing pertinent to search engines, opinion mining, and text mining. Naïve Bayes classifier used in this study represents a supervised learning algorithm and uses Bayes theorem. Naïve Bayes classifier has an assumption that all senses are independent. Even though the assumption of Naïve Bayes classifier is not realistic and ignores the correlation between attributes, Naïve Bayes classifier is widely used because of its simplicity and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research need to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.

A quantitative study on the minimal pair of Korean phonemes: Focused on syllable-initial consonants (한국어 음소 최소대립쌍의 계량언어학적 연구: 초성 자음을 중심으로)

  • Jung, Jieun
    • Phonetics and Speech Sciences
    • /
    • v.11 no.1
    • /
    • pp.29-40
    • /
    • 2019
  • The paper investigates the minimal pair of Korean phonemes quantitatively. To achieve this goal, I calculated the number of consonant minimal pairs in the syllable-initial position as both raw counts and relative counts, and analyzed the part of speech relations of the two words in the minimal pair. "Urimalsaem" was chosen as the object of this study because it was judged that the minimal pair analysis should be done through a dictionary and it is the largest among Korean dictionaries. The results of the study are summarized as follows. First, there were 153 types of minimal pairs out of 337,135 examples. The ranking of phoneme pairs from highest to lowest was 'ㅅ-ㅈ, ㄱ-ㅅ, ㄱ-ㅈ, ㄱ-ㅂ, ㄱ-ㅎ, ${\ldots}$, ㅆ-ㅋ, ㄸ-ㅋ, ㅉ-ㅋ, ㄹ-ㅃ, ㅃ-ㅋ'. The phonemes that played a major role in the formation of the minimal pair were /ㄱ, ㅅ, ㅈ, ㅂ, ㅊ/, in that order, which showed a high proportion of palatals. The correlation between the raw count of minimal pairs and the relative count of minimal pairs was found to be quite high r=0.937. Second, 87.91% of the minimal pairs shared the part of speech (same syntactic category). The most frequently observed type has been 'noun-noun' pair (70.25%), and 'vowel-vowel' pair (14.77%) was the next ranking. It can be indicated that the minimal pair could be grouped into similar categories in terms of semantics. The results of this study can be useful for various research in Korean linguistics, speech-language pathology, language education, language acquisition, speech synthesis, and artificial intelligence-machine learning as basic data related to Korean phonemes.