• Title/Summary/Keyword: contextual words


An Efficient Correction Method for Misrecognized Words in Off-line Hangul Character Recognition (오프라인 한글 문자 인식을 위한 효율적인 오인식 단어 교정 방법)

  • Lee, Byeong-Hui;Kim, Tae-Gyun
    • The Transactions of the Korea Information Processing Society / v.3 no.6 / pp.1598-1606 / 1996
  • To achieve high accuracy in off-line character recognition (OCR) systems, the recognized text must pass through a post-processing stage that uses contextual information. In this paper, we reclassify Korean word classes from the standpoint of OCR word correction. We collect combinations of Korean particles (approximately 900) and of verbal forms (around 800), and we aggregate 9 Korean irregular verbal phrases defined from a Korean linguistic point of view. Using this word information and a Head-tail method, we can correct misrecognized words. The Korean character recognizer achieves 93.7% correct character recognition without the post-processing stage; with it, the overall recognition rate of our system exceeds 97%.
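The abstract does not spell out the Head-tail method, so the following is only a minimal sketch of one plausible reading: a word is accepted when it splits into a known head (stem) and a known tail (particle or ending), and otherwise it is replaced by the closest head+tail string. The dictionaries `HEADS` and `TAILS`, the use of Hamming distance, and the `max_subs` limit are all assumptions made for illustration, not details from the paper.

```python
from itertools import product

# Hypothetical toy dictionaries; the paper's real lists (about 900 particle
# combinations and about 800 verbal forms) are not reproduced here.
HEADS = {"학교", "나라", "친구"}
TAILS = {"", "가", "는", "를", "에서"}


def hamming(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_valid(word: str) -> bool:
    """A word is valid if some split yields a known head and a known tail."""
    return any(word[:i] in HEADS and word[i:] in TAILS
               for i in range(1, len(word) + 1))


def correct(word: str, max_subs: int = 1) -> str:
    """Return the word if valid, else the closest same-length head+tail string."""
    if is_valid(word):
        return word
    # OCR substitution errors keep the length, so only same-length candidates
    # are considered (an assumption of this sketch).
    candidates = [h + t for h, t in product(HEADS, TAILS)
                  if len(h + t) == len(word)]
    if not candidates:
        return word
    # Ties are broken alphabetically so the result is deterministic.
    best = min(candidates, key=lambda c: (hamming(c, word), c))
    return best if hamming(best, word) <= max_subs else word
```

For example, `correct("학고는")` yields `"학교는"` because only one character differs from a valid head+tail combination, while a word that is far from every combination is returned unchanged.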


A Method for Detection and Correction of Pseudo-Semantic Errors Due to Typographical Errors (철자오류에 기인한 가의미 오류의 검출 및 교정 방법)

  • Kim, Dong-Joo
    • Journal of the Korea Society of Computer and Information / v.18 no.10 / pp.173-182 / 2013
  • Typographical mistakes made while drafting electronic documents are more common than any other type of error. Most errors caused by mistyping remain simple typos, but a considerable number of them develop into grammatical or semantic errors. Among these, pseudo-semantic errors caused by typographical errors clash more noticeably with the senses of the surrounding context words in a sentence than pure semantic errors do. Because of this prominent contextual discrepancy, such errors can be detected and corrected by a simple algorithm based on co-occurrence frequency. I propose a detection and correction method based on co-occurrence frequency for semantic errors caused by typos. In the proposed method, co-occurrence frequency is counted only for words in an immediate dependency relation, and the cosine similarity measure is used to detect pseudo-semantic errors. According to the experimental results, the proposed method is expected to improve the detection rate of an overall proofreading system by about 2 to 3%.
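To make the idea concrete, here is a minimal sketch of detection by co-occurrence and cosine similarity. It assumes the dependency pairs are already extracted, and the names `build_vectors`, `check_word`, the toy `PAIRS` corpus, and the `threshold` value are hypothetical; the paper's actual scoring and candidate generation may differ.

```python
import math
from collections import Counter, defaultdict


def build_vectors(dep_pairs):
    """Count co-occurrence only between words in a direct dependency relation."""
    vectors = defaultdict(Counter)
    for head, dependent in dep_pairs:
        vectors[head][dependent] += 1
        vectors[dependent][head] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def check_word(word, partners, candidates, vectors, threshold=0.1):
    """Keep the word if it fits its dependency partners, else try candidates."""
    context = Counter(partners)

    def score(w):
        return cosine(vectors.get(w, Counter()), context)

    if score(word) >= threshold:
        return word
    best = max(candidates, key=score, default=word)
    return best if score(best) > score(word) else word


# Toy dependency pairs (hypothetical data, not from the paper).
PAIRS = [("drink", "coffee")] * 3 + [("brew", "coffee"), ("open", "coffer")]
VECTORS = build_vectors(PAIRS)
```

With this toy corpus, `check_word("coffer", ["drink"], ["coffee"], VECTORS)` returns `"coffee"`: "coffer" is a correctly spelled word, but it never co-occurs with "drink", whereas the spelling neighbor "coffee" does.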

Emotion Analysis Using a Bidirectional LSTM for Word Sense Disambiguation (양방향 LSTM을 적용한 단어의미 중의성 해소 감정분석)

  • Ki, Ho-Yeon;Shin, Kyung-shik
    • The Journal of Bigdata / v.5 no.1 / pp.197-208 / 2020
  • Lexical ambiguity means that a word, such as a homonym or a polysemous word, can be interpreted in two or more ways, and word sense ambiguity is common among words that express emotion. Because they project human psychology, these words convey specific and rich contexts, which results in lexical ambiguity. In this study, we propose an emotion classification model that disambiguates word senses using a bidirectional LSTM. It rests on the assumption that if the surrounding context is fully reflected, lexical ambiguity can be resolved and the emotion the sentence is meant to express can be identified as a single one. The bidirectional LSTM is frequently used in natural language processing research that requires contextual information, and it is used in this study to learn context. GloVe embeddings are used as the embedding layer of the model, and its performance was verified by comparison with models based on LSTM and RNN algorithms. Such a framework could contribute to various fields, including marketing, where it could connect the emotions of SNS users to their desire to consume.
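The reason a bidirectional model helps with ambiguous words can be shown with a much simpler stand-in. The sketch below uses a plain tanh recurrence instead of LSTM gates and toy two-dimensional vectors instead of GloVe embeddings; the function names and weights are hypothetical. It only illustrates that each position receives a state built from the words before it and a state built from the words after it.

```python
import math


def rnn_pass(embeddings, w_in, w_rec):
    """Run a simple tanh recurrence (not an LSTM) over a sequence of vectors."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in embeddings:
        hidden = [
            math.tanh(sum(w * xi for w, xi in zip(w_in[j], x)) + w_rec[j] * hidden[j])
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def bidirectional_states(embeddings, w_in, w_rec):
    """Concatenate left-to-right and right-to-left states for every position."""
    forward = rnn_pass(embeddings, w_in, w_rec)
    backward = rnn_pass(embeddings[::-1], w_in, w_rec)[::-1]
    return [f + b for f, b in zip(forward, backward)]
```

If two sentences share their first word but end differently, the forward half of the first word's state is identical in both, while the backward half differs. A classifier reading the concatenated state can therefore separate the two senses of that first word.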

Metaverse Platform Customer Review Analysis Using Text Mining Techniques (텍스트 마이닝 기법을 활용한 메타버스 플랫폼 고객 리뷰 분석)

  • Hye Jin Kim;Jung Seung Lee;Soo Kyung Kim
    • Journal of Information Technology Applications and Management / v.31 no.1 / pp.113-122 / 2024
  • This comprehensive study delves into the analysis of user review data across various metaverse platforms, employing advanced text mining techniques such as TF-IDF and Word2Vec to gain insights into user perceptions. The primary objective is to uncover the factors that contribute to user satisfaction and dissatisfaction, thereby providing a nuanced understanding of user experiences in the metaverse. Through TF-IDF analysis, the research identifies key words and phrases frequently mentioned in user reviews, highlighting aspects that resonate positively with users, such as the ability to engage in creative activities and social interactions within these virtual environments. Word2Vec analysis further enriches this understanding by revealing the contextual relationships between words, offering a deeper insight into user sentiments and the specific features that enhance their engagement with the platforms. A significant finding of this study is the identification of common grievances among users, particularly related to the processes of refunds and login, which point to broader issues within payment systems and user interface designs across platforms. These insights are critical for developers and operators of metaverse platforms, suggesting a focused approach towards enhancing user experiences by amplifying positive aspects. The research underscores the importance of continuous improvement in user interface design and the transparency of payment systems to foster a loyal user base. By providing a comprehensive analysis of user reviews, this study offers valuable guidance for the strategic development and optimization of metaverse platforms, ensuring they remain responsive to user needs and continue to evolve as vibrant, engaging virtual environments.
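As a rough illustration of the TF-IDF step, the sketch below scores terms in a few tokenized reviews. The abstract does not say which TF-IDF variant was used, so the formula here (relative term frequency times the log of inverse document frequency) is an assumption, and the `REVIEWS` data is invented for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score each term per document as (count / doc length) * log(N / doc freq)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized reviews, not data from the study.
REVIEWS = [
    ["refund", "slow", "login"],
    ["creative", "building", "fun"],
    ["login", "error", "refund"],
]
SCORES = tf_idf(REVIEWS)
```

Terms that appear in several reviews, such as "refund" and "login" here, receive a lower weight within any single review than terms unique to that review, which is why TF-IDF is usually paired with plain frequency counts when looking for common grievances.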

Working Mechanisms of Organizational Ambidexterity for Creative Performance (창의적 성과를 제고하는 조직 양면성 구현양식에 대한 연구)

  • Kwon, Jung-Eon;Woo, Hyung-Rok
    • Knowledge Management Research / v.17 no.2 / pp.51-73 / 2016
  • Organizational ambidexterity has been emerging as a way to gain competitive advantage in turbulent environments. Ambidexterity means balancing the activities of exploration and exploitation simultaneously while overcoming the tension between them. Its benefits have been investigated in innovation, financial performance, strategic management, and other areas. Our study focuses on the impact of ambidextrous activities on creative performance. Although three ambidextrous modes (structural, contextual, and sequential ambidexterity) are already acknowledged, few studies have suggested specific mechanisms for achieving ambidexterity in practice at the operating level. To address this issue, we performed a semantic network analysis grounded in the previous literature on ambidexterity theory. We interviewed 21 teams to explore team behaviors from the ambidextrous perspective, and then interpreted the relationships among the words that appeared in the interviews. The study found that mechanisms which alleviate the tension between exploitation and exploration exist in practice. We demonstrate how these ambidextrous mechanisms can be used to generate creative performance, and we examine various antecedents. These findings contribute to a more fine-grained understanding of organizational ambidexterity, especially in conjunction with organizational creativity.

The Characteristics of Lesson Planning of Pre-service Elementary Teachers to Develop Scientific Communication Skills for Elementary School Students (초등학생의 과학적 의사소통 능력 함양을 위해 예비 초등교사들이 작성한 수업과정안의 특징)

  • Na, Jiyeon;Jang, Byung-Ghi
    • Journal of Korean Elementary Science Education / v.37 no.1 / pp.54-65 / 2018
  • The purpose of this study was to investigate the characteristics of lesson planning by pre-service elementary teachers aiming to develop the scientific communication skills of elementary school students. For this purpose, lesson plans and lesson planning journals written by 53 pre-service teachers were collected and analyzed. The results were as follows. The pre-service elementary teachers used an implicit and contextual approach to develop scientific communication skills. Teaching and learning activities for enhancing scientific communication were conducted mainly orally or in writing. Many activities in the lesson plans had elementary school students express their thoughts and present the results of experiments. In many cases, the lesson plans did not include an evaluation of scientific communication skills. The lesson plans contained many interactive activities between teachers and students, between teams and the whole class, and among students within teams, as well as individual student activities requested by teachers. We found that the pre-service teachers had various difficulties when planning science lessons to develop scientific communication skills. The pre-service teachers were less likely to refer to specialized materials related to science education when planning their classes.

Lexical Semantic Information and Pitch Accent in English (영어 어휘 의미 정보와 피치 액센트)

  • Jeon, Yoon-Shil;Kim, Kee-Ho;Lee, Yong-Jae
    • Speech Sciences / v.10 no.3 / pp.187-209 / 2003
  • In this paper, we examine whether the lexical information of the verb and its noun object affects the pitch accent patterns of verb phrase focus. Three types of verb-object combinations with different semantic weights are discussed: when the verbs have optional direct objects, when the objects have greater semantic weight than the verbs, and when the verbs and the objects have equal semantic weight. Argument-structure-based works note that the pitch accent location in a focused phrase is closely related to the argument structure and contextual information. For example, it has been argued that contextually new noun objects receive accent while given noun objects do not. Unlike nouns, verbs can be accented or not in verb phrase focus regardless of whether they are given or new information (Selkirk 1984, 1992). However, the production experiment in this paper shows that the accenting of verbs is not fully optional, but is influenced by the lexical semantic information of the verbs. Noun objects carrying given information can be accented, and new noun objects can be deaccented, depending on the lexical information of the noun objects. The results demonstrate that, in addition to argument structure and the information status established by context sentences, the lexical semantic information of words influences the pitch accent location in a focused phrase.


Effects of mobile fashion shopping characteristics, perceived interactivity, and perceived usefulness on purchase intention (모바일 패션 쇼핑 특성과 지각된 상호작용성, 지각된 유용성이 구매의도에 미치는 영향)

  • Kim, Minjung;Shin, Suyun
    • The Research Journal of the Costume Culture / v.23 no.2 / pp.228-241 / 2015
  • The purpose of this study was to verify the effect of mobile fashion shopping characteristics and perceived interactivity on perceived usefulness, and the effect of perceived usefulness on purchase attitude and purchase intention, based on TAM (Technology Acceptance Model). We surveyed smartphone users in their 20s and 30s living in Seoul and the metropolitan area. Of the 483 responses collected, we analyzed 452 after excluding 31 unreliable ones. To analyze the structural equation model, we performed factor analysis, reliability analysis, and structural equation model analysis using SPSS 18.0 and AMOS 16.0. The results were as follows. Confirmatory factor analysis confirmed 5 mobile fashion shopping characteristics (enjoyment, credibility, instant connectivity, security, and personalization) and 3 perceived interactivity factors (control, responsiveness and two-way communication, and contextual offer). Mobile fashion shopping characteristics and perceived interactivity had positive effects on perceived usefulness. Mobile fashion shopping characteristics affected perceived interactivity and also had an indirect effect on perceived usefulness via perceived interactivity. In other words, mobile fashion shopping characteristics had both direct and indirect effects on perceived usefulness. Perceived usefulness influenced purchase attitude, and purchase attitude influenced purchase intention. Perceived usefulness had a direct effect on purchase intention, and its indirect effect through purchase attitude was significant.

Zero-anaphora resolution in Korean based on deep language representation model: BERT

  • Kim, Youngtae;Ra, Dongyul;Lim, Soojong
    • ETRI Journal / v.43 no.2 / pp.299-312 / 2021
  • High performance in zero anaphora resolution (ZAR) is necessary for fully understanding texts in Korean, Japanese, Chinese, and various other languages. Deep-learning-based models are being employed to build ZAR systems, owing to the success of deep learning in recent years. However, the objective of building a high-quality ZAR system is far from being achieved even with these models. To enhance current ZAR techniques, we fine-tuned a pretrained bidirectional encoder representations from transformers (BERT) model. Notably, BERT is a general language representation model that enables systems to utilize deep bidirectional contextual information in natural language text. It extensively exploits the attention mechanism based on the sequence-transduction model Transformer. In our model, classification is performed simultaneously for all the words in the input word sequence to decide whether each word can be an antecedent. We seek end-to-end learning by disallowing any use of hand-crafted or dependency-parsing features. Experimental results show that, compared with other models, our approach can significantly improve the performance of ZAR.

Improving Bidirectional LSTM-CRF model Of Sequence Tagging by using Ontology knowledge based feature (온톨로지 지식 기반 특성치를 활용한 Bidirectional LSTM-CRF 모델의 시퀀스 태깅 성능 향상에 관한 연구)

  • Jin, Seunghee;Jang, Heewon;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.253-266 / 2018
  • This paper proposes applying a sequence tagging methodology to improve the performance of NER (Named Entity Recognition) used in a QA system. To retrieve the correct answers stored in a database, the user's query must be converted into a database language such as SQL (Structured Query Language) so that the computer can interpret it. This involves identifying the class or data name in the database that the query refers to. The existing method, which looks up the words of the query in the database to recognize objects, cannot identify homophones and multi-word phrases because it does not consider the context of the user's query. If there are multiple search results, all of them are returned, so the query admits many interpretations and the time complexity of the computation becomes large. To overcome these problems, this study reflects the contextual meaning of the query using a Bidirectional LSTM-CRF. We also address a weakness of neural network models, their inability to identify untrained words, by using an ontology knowledge based feature. Experiments were conducted on an ontology knowledge base for the music domain, and performance was evaluated. To evaluate the proposed L-Bidirectional LSTM-CRF accurately, we converted words in the learned queries into untrained words and tested whether words that are in the database but were not seen in training are correctly identified. As a result, the model could recognize objects in context and could recognize untrained words without re-training the L-Bidirectional LSTM-CRF model, and the overall performance of object recognition was confirmed to improve.
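One way such an ontology knowledge based feature could be computed is sketched below: each token is marked with the class of the longest knowledge-base entry it belongs to, so that words unseen in training still carry a useful signal into the tagger. The `ONTOLOGY` entries, the BIO-style labels, and the `max_len` limit are assumptions for illustration; the abstract does not describe how the paper encodes the feature or feeds it to the Bidirectional LSTM-CRF.

```python
# Hypothetical music-domain knowledge base: surface form -> ontology class.
ONTOLOGY = {
    "yesterday": "Song",
    "beatles": "Artist",
    "abbey road": "Album",
}


def ontology_features(tokens, max_len=3):
    """Label tokens with B-/I-<class> for the longest matching entry, else O."""
    features = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        # Try the longest span first so multi-word entries win over single words.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            cls = ONTOLOGY.get(phrase)
            if cls:
                features[i] = "B-" + cls
                for j in range(i + 1, i + n):
                    features[j] = "I-" + cls
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return features
```

In a tagger, these labels would typically be embedded or one-hot encoded and concatenated with each token's word embedding before the recurrent layer, which lets the model use knowledge-base membership alongside context.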