• Title/Summary/Keyword: 자연 언어 처리 (natural language processing)

A Usability Evaluation on the Visualization of Information Extraction Output (정보추출결과의 시각화 표현방법에 관한 이용성 평가 연구)

  • Lee Jee-Yeon
    • Journal of the Korean Society for Library and Information Science / v.39 no.2 / pp.287-304 / 2005
  • The goal of this research is to evaluate the usability of visually browsing automatically extracted information. A domain-independent information extraction system was used to extract information from news-type texts to populate a visually browsable knowledge base. The information extraction system automatically generated Concept-Relation-Concept triples by applying various natural language processing techniques to the text portion of the news articles. To visualize the information stored in the knowledge base, we used PersonalBrain to develop the visualization portion of the user interface. PersonalBrain is a hyperbolic information visualization system that enables users to link information into a network of logical associations. To understand the usability of the visually browsable knowledge base, IS test subjects were observed while they used the visual interface and were interviewed afterward. By applying a qualitative test data analysis method, a number of usability problems and further research directions were identified.

Design and Implementation of E-mail Client based on Automatic Feeling Recognition (인간의 감정을 자동 인식하는 전자메일 클라이언트의 설계 및 구현)

  • Kim, Na-young;Lee, Sang-kon
    • The Journal of Korean Association of Computer Education / v.12 no.2 / pp.61-75 / 2009
  • People today can easily use an e-mail client for general communication thanks to the Internet and cellular phones. Mail clients are widely used for private and business affairs, advertisements, news searching, and business letters, and this wide use has side effects. People may send important documents via an e-mail client, so it is important to make the e-mail client intelligent. We think that many kinds of natural language processing techniques must be provided in the client so that it can handle human emotion. We design a new mail client that uses six kinds of emotional information about the sender, such as delight, anger, sadness, the message to express, manner of talking, and a discomfort index. Before an e-mail is sent, we suggest that the user correct bad words, because we do not want the receiver to feel bad. We present a proper sending/receiving process for users of the newly designed e-mail client.

An Insight Study on Keyword of IoT Utilizing Big Data Analysis (빅데이터 분석을 활용한 사물인터넷 키워드에 관한 조망)

  • Nam, Soo-Tai;Kim, Do-Goan;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.146-147 / 2017
  • Big data analysis is a technique for effectively analyzing unstructured data such as the Internet, social network services, web documents generated in the mobile environment, e-mail, and social data, as well as well-formed structured data in a database. Most big data analysis techniques are data mining, machine learning, natural language processing, and pattern recognition, which were already used in statistics and computer science. Global research institutes have identified big data analysis as the most noteworthy new technology since 2011. Therefore, companies in most industries are making efforts to create new value through the application of big data. In this study, we carried out the analysis using Social Matrics, a big data analysis tool of Daum Communications. We analyzed public perceptions of the keyword "Internet of things" for one month as of October 8, 2017. The results of the big data analysis are as follows. First, the top related search keyword for "Internet of things" was found to be technology (995). This study suggests theoretical implications based on the results.

Word Sense Disambiguation of Predicate using Sejong Electronic Dictionary and KorLex (세종 전자사전과 한국어 어휘의미망을 이용한 용언의 어의 중의성 해소)

  • Kang, Sangwook;Kim, Minho;Kwon, Hyuk-chul;Jeon, SungKyu;Oh, Juhyun
    • KIISE Transactions on Computing Practices / v.21 no.7 / pp.500-505 / 2015
  • The Sejong Electronic (machine-readable) Dictionary, which was developed under the 21st Century Sejong Plan, contains systematic immanent information about Korean words. It helps solve the problem of representing a commonly used general text dictionary electronically. Word sense disambiguation problems can also be solved using the specific information available in the Sejong Electronic Dictionary. However, the Sejong Electronic Dictionary is limited in the sentence structures and selection-restricted nouns it suggests. In this paper, we discuss the limitations of word sense disambiguation using the subcategorization information suggested by the Sejong Electronic Dictionary, and we generalize the selection-restricted nouns of arguments using the Korean Lexico-semantic Network (KorLex).

Structural Disambiguation using Mutual Information and the Measure of Confidence (상호 정보를 이용한 구조적 모호성 해소와 결과에 대한 확신도 측정)

  • 심광섭
    • Korean Journal of Cognitive Science / v.4 no.1 / pp.153-176 / 1993
  • Structural ambiguity is one of the problems that arise in the analysis of natural language sentences. It has been considered very difficult to solve. Structural ambiguity, however, should be resolved no matter how difficult it may be; otherwise natural language processing could be virtually impossible. A statistical approach to structural disambiguation is proposed in this dissertation. The information-theoretic concept of mutual information has been employed in resolving structural ambiguity. Mutual information can be acquired automatically from text corpora. If a structural disambiguation subsystem had the capability of self-evaluating whether the results of structural disambiguation are correct or not, it would be possible to develop a more intelligent natural language processing system. In this paper, the concept of a confidence measure is also proposed to endow the disambiguation subsystem with such intelligence. The confidence measure is a numeric value calculated after structural disambiguation. Some experiments were performed in order to show the validity of the approach. Mutual information was automatically acquired from a corpus of 1.6 million words collected from scientific abstracts. The accuracy of structural disambiguation was 80% when performed over 1,639 test sentences. Notice that there was no manual tuning in advance for the experiments. The task of detecting and correcting errors in structural disambiguation will be performed very effectively if the concept of confidence measure is employed in the process.
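
The idea of acquiring mutual information from a corpus and using it to pick an attachment can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses pointwise mutual information over (head, dependent) pairs, and the pair representation, the function names, and the toy data are hypothetical rather than taken from the paper. The confidence measure is not reproduced because the abstract does not give its formula.

```python
import math
from collections import Counter


def pointwise_mutual_information(pairs):
    """Estimate PMI(x, y) = log2(p(x, y) / (p(x) * p(y))) for each observed pair.

    `pairs` is a list of (head, dependent) tuples; this representation is an
    assumption made for the sketch.
    """
    pair_counts = Counter(pairs)
    head_counts = Counter(head for head, _ in pairs)
    dep_counts = Counter(dep for _, dep in pairs)
    total = len(pairs)
    pmi = {}
    for (head, dep), n_pair in pair_counts.items():
        p_pair = n_pair / total
        p_head = head_counts[head] / total
        p_dep = dep_counts[dep] / total
        pmi[(head, dep)] = math.log2(p_pair / (p_head * p_dep))
    return pmi


def choose_attachment(pmi, candidate_heads, dependent):
    """Pick the candidate head with the highest PMI; unseen pairs score -inf."""
    return max(
        candidate_heads,
        key=lambda head: pmi.get((head, dependent), float("-inf")),
    )


# Toy (head, dependent) observations, invented for illustration only.
toy_pairs = [
    ("eat", "fork"), ("eat", "fork"), ("eat", "fork"),
    ("pizza", "cheese"), ("pizza", "cheese"),
    ("eat", "cheese"),
]
toy_pmi = pointwise_mutual_information(toy_pairs)
print(choose_attachment(toy_pmi, ["eat", "pizza"], "cheese"))  # pizza
```

In the toy data, "cheese" co-occurs with "pizza" more often than chance would predict and with "eat" less often, so the ambiguous modifier is attached to "pizza".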

Voice Interactions with A. I. Agent : Analysis of Domestic and Overseas IT Companies (A.I.에이전트와의 보이스 인터랙션 : 국내외 IT회사 사례연구)

  • Lee, Seo-Young
    • Journal of Korea Entertainment Industry Association / v.15 no.4 / pp.15-29 / 2021
  • Many countries and companies are pursuing and developing artificial intelligence as it is the core technology of the 4th industrial revolution. Global IT companies such as Apple, Microsoft, Amazon, Google and Samsung have all released their own AI assistant hardware products, hoping to increase customer loyalty and capture market share. Competition within the industry for AI agents is intense. AI assistant products that command the biggest market shares and customer loyalty have a higher chance of becoming the industry standard. This study analyzed the current status of major overseas and domestic IT companies in the field of artificial intelligence, and suggested future strategic directions for voice UI technology development and user satisfaction. In terms of B2B technology, it is recommended that IT companies use cloud computing to store big data, innovative artificial intelligence technologies and natural language technologies. Offering voice recognition technologies on the cloud enables smaller companies to take advantage of such technologies at considerably less expense. Companies should also consider using GPT-3 (Generative Pre-trained Transformer 3), an open source artificial intelligence language processing software that can generate very natural human-like interactions and high levels of user satisfaction. There is a need to increase usefulness and usability to enhance user satisfaction. This study has practical and theoretical implications for industry and academia.

A study on performance improvement considering the balance between corpus in Neural Machine Translation (인공신경망 기계번역에서 말뭉치 간의 균형성을 고려한 성능 향상 연구)

  • Park, Chanjun;Park, Kinam;Moon, Hyeonseok;Eo, Sugyeong;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.5 / pp.23-29 / 2021
  • Recent deep learning-based natural language processing studies seek to improve performance by training on large amounts of data from various sources together. However, combining data from various sources into one for training may hinder performance improvement. In machine translation, data deviation occurs due to differences in translation (liberal, literal), style (colloquial, written, formal, etc.), domain, etc. Combining these corpora into one for training can adversely affect performance. In this paper, we propose a new Corpus Weight Balance (CWB) method that considers the balance between parallel corpora in machine translation. In the experiment, the model trained with a balanced corpus showed better performance than the existing model. In addition, we propose an additional corpus construction process that enables coexistence with the human translation market and can build a high-quality parallel corpus even from a monolingual corpus.

A Korean Homonym Disambiguation System Using Refined Semantic Information and Thesaurus (정제된 의미정보와 시소러스를 이용한 동형이의어 분별 시스템)

  • Kim Jun-Su;Ock Cheol-Young
    • The KIPS Transactions:PartB / v.12B no.7 s.103 / pp.829-840 / 2005
  • Word Sense Disambiguation (WSD) is one of the most difficult problems in Korean information processing. We propose a WSD model that can filter semantic information using the specific characteristics of dictionary definitions, together with additional information useful for sense determination, such as statistical, distance, and case information. We propose a model that can resolve the issues resulting from the scarcity of semantic information data, based on the word hierarchy system (thesaurus) developed by Ulsan University's UOU Word Intelligent Network, a dictionary-based lexical database. Among the WSD models elaborated in this study, the one using statistical, distance, and case information along with the thesaurus (hereinafter referred to as the 'SDJ-X model') performed best. In an experiment conducted on a sense-tagged corpus of 1,500,000 eojeols provided by the Sejong project, the SDJ-X model recorded improvements over the maximum frequency word sense determination (maximum frequency determination, MFC, accuracy baseline) of $18.87\%$ ($21.73\%$ for nouns and inter-eojeol distance weights by $10.49\%$ ($8.84\%$ for nouns, $11.51\%$ for verbs). Finally, the accuracy of the SDJ-X model was higher than that of the model using only statistical, distance, and case information without the thesaurus, by a margin of $6.12\%$ ($5.29\%$ for nouns, $6.64\%$ for verbs).
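
The maximum frequency baseline that the abstract compares against can be sketched as below. This is a generic most-frequent-sense baseline and not the SDJ-X model; the function names, the romanized homonym "bae", and its sense labels are hypothetical choices for the example.

```python
from collections import Counter, defaultdict


def train_most_frequent_sense(tagged_words):
    """Map each word to the sense it carries most often in the training data."""
    sense_counts = defaultdict(Counter)
    for word, sense in tagged_words:
        sense_counts[word][sense] += 1
    return {
        word: counts.most_common(1)[0][0]
        for word, counts in sense_counts.items()
    }


def baseline_accuracy(model, tagged_words):
    """Share of test items whose gold sense equals the most frequent sense."""
    correct = sum(1 for word, sense in tagged_words if model.get(word) == sense)
    return correct / len(tagged_words)


# Toy sense-tagged data, invented for illustration only.
train = [("bae", "ship"), ("bae", "ship"), ("bae", "pear")]
test = [("bae", "ship"), ("bae", "pear")]
model = train_most_frequent_sense(train)
print(model["bae"], baseline_accuracy(model, test))
```

Any model that uses context, such as the distance and case information mentioned above, is then judged by how far it exceeds this context-free score.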

Detecting Errors in POS-Tagged Corpus on XGBoost and Cross Validation (XGBoost와 교차검증을 이용한 품사부착말뭉치에서의 오류 탐지)

  • Choi, Min-Seok;Kim, Chang-Hyun;Park, Ho-Min;Cheon, Min-Ah;Yoon, Ho;Namgoong, Young;Kim, Jae-Kyun;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.9 no.7 / pp.221-228 / 2020
  • A part-of-speech (POS) tagged corpus is a collection of electronic text in which each word is annotated with its corresponding POS tag, and it is widely used as training data for natural language processing. Training data is generally assumed to be free of errors, but in reality it includes various types of errors, which degrade the performance of systems trained on it. To alleviate this problem, we propose a novel method for detecting errors in an existing POS-tagged corpus using an XGBoost classifier and cross-validation as evaluation techniques. We first train a POS tagger classifier using the POS-tagged corpus with some errors and then detect errors in the POS-tagged corpus using cross-validation; however, the classifier cannot detect errors directly because there is no training data for detecting POS tagging errors. We thus detect errors by comparing the outputs (POS probabilities) of the classifier while adjusting hyperparameters. The hyperparameters are estimated from a small-scale error-tagged corpus, in which the text is sampled from a POS-tagged corpus and POS errors are marked up by experts. In this paper, we use recall and precision, which are widely used in information retrieval, as evaluation metrics. Because not all detected errors can be checked, we show that the proposed method is valid by comparing the two distributions of the sample (the error-tagged corpus) and the population (the POS-tagged corpus). In the near future, we will apply the proposed method to a dependency tree-tagged corpus and a semantic role-tagged corpus.
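
The cross-validation idea described above can be sketched as follows. To keep the example self-contained, a simple per-word tag-frequency model stands in for the XGBoost classifier, and a single probability threshold stands in for the tuned hyperparameters; both substitutions, along with all function names and the toy data, are assumptions for illustration and not the authors' implementation.

```python
from collections import Counter, defaultdict


def train_tag_model(tokens):
    """Count how often each word carries each tag (stand-in for XGBoost)."""
    counts = defaultdict(Counter)
    for word, tag in tokens:
        counts[word][tag] += 1
    return counts


def tag_probability(model, word, tag):
    """Relative frequency of `tag` for `word`, or None for an unseen word."""
    seen = model.get(word)
    if not seen:
        return None
    return seen[tag] / sum(seen.values())


def detect_suspicious_tags(tokens, k=5, threshold=0.2):
    """Flag held-out tokens whose annotated tag gets a low out-of-fold probability.

    `threshold` plays the role of the hyperparameter that the abstract says is
    estimated from a small error-tagged corpus; the value here is arbitrary.
    """
    suspects = []
    for fold in range(k):
        # Train on every token outside the current fold.
        training = [tok for i, tok in enumerate(tokens) if i % k != fold]
        model = train_tag_model(training)
        # Score only the tokens inside the current fold.
        for i, (word, tag) in enumerate(tokens):
            if i % k != fold:
                continue
            prob = tag_probability(model, word, tag)
            if prob is not None and prob < threshold:
                suspects.append((i, word, tag, prob))
    return sorted(suspects)


# Toy corpus: nine consistent annotations and one deviating one.
corpus = [("run", "VERB")] * 9 + [("run", "NOUN")]
print(detect_suspicious_tags(corpus))  # [(9, 'run', 'NOUN', 0.0)]
```

Because each token is scored by a model that never saw it during training, an annotation that disagrees with the rest of the corpus receives a low probability and is flagged for review.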

A Study on Christian Educational Implications for 6 Key Competencies of 2015 Revised National Curriculum (2015 개정 교육과정의 6개 핵심역량에 대한 기독교교육적 함의)

  • Seo, Mikyoung
    • Journal of Christian Education in Korea / v.63 / pp.221-253 / 2020
  • The purpose of this study is to define the key competency as a Christian (in other words, the Christian key competency) and to interpret the six key competencies of the 2015 revised curriculum from a Christian educational perspective. Also, as an alternative to the key competencies model of the 2015 revised curriculum, this study aims to give shape to a Christian key competencies model based on Christian faith. Through an analysis of preceding research, this study derived 'faith' as the key competency as a Christian. 'Faith' as the key competency as a Christian means the ability to know oneself, the world, and God within the knowledge of the Bible (knowledge of God) in a personal relationship with God; it is also the ability to think, judge, and act with biblical values, a Christian worldview, and Christian self-identity. The key competency 'faith' can be the basis (standard) of the motivation, attitude, and value of all competencies in their cultivation and exercise. The model of Christian key competencies has a structure in which each of the six key competencies is cultivated on the basis of the Christian key competency called 'faith.' Based on this structure, the six key competencies of the 2015 revised curriculum were interpreted and explained from the perspective of Christian education. In the self-management competency, self-identity can be correctly formed in relationship with the transcendent God. In the aesthetic-emotional competency, the empathic understanding of human beings comes from understanding the image of God, the supreme beauty and the source of beauty. As for the community competency, the source of human community is God, who created the universe, humans, and all things, because a Christian community is a community within the relationship of the Triune God, nature, and others. Therefore regions, countries, and the world become one community. The communication competency first stems from good attitudes toward oneself and others with a respectful mind, which come from a Christian understanding of human beings. There is also a need for a common language for communication. That common language is the Bible, which was given to us for our communicative companionship. Through the language of the Bible, God made us know about God, human beings, and the created world, and made us continue to communicate with God, others, and the world. The knowledge-information processing competency requires a standard of value for the processing and utilization of knowledge and information. That standard should be the basis of moral and ethical values for human respect. As for the creative thinking competency, the source of creativity is God, who created the world. Human beings, who bear the image of God, have creative potential. In addition, creativity takes different forms of expression depending on individual preferences and interests, and different approaches will be taken depending on what each individual considers important and has achieved. Individual creativity can be discovered through education, and it can be embodied by converging knowledge, skills, and experience.