• Title/Summary/Keyword: Natural Language Process

Search Result 244, Processing Time 0.024 seconds

Development and Evaluation of Ontology for Diagnosis in Oriental Medicine (한의진단 Ontology 구축과 평가)

  • Shin Sang-Woo;Jung Gil-San;Park Kyung-Mo;Kim Seon-Ho;Park Jong-Hyun
    • Journal of Physiology & Pathology in Korean Medicine
    • /
    • v.20 no.1
    • /
    • pp.202-208
    • /
    • 2006
  • The goal of this study is to develop knowledge representation method for the construction and evaluation of ontology for diagnosis in oriental medicine. To develop the expert system for decision making on diagnosis and treatment, the systematic and structural knowledge which can be processible in EMR(Electronic Medical Record) must be precedent, and the Computational Process which control the system as well. This study set up an ontology as a trial model to represent the oriental medical knowledge into the machine processible one. Protege 2.1 has been used to build the ontology, and the serialization format of our ontology is the XML document based on OWL. The components of oriental medical diagnosis was arranged with the combination of symptoms which belong to the certain symptom patterns. Then natural language which expresses the oriental medical diagnosis components were converted into the logical sentence, and individual characteristic symptoms into each values of specific properties. In addition to the study, the diagnosis software for oriental medicine was developed and it used the ontology which we developed. Sequently, we tested the software to confirm the appropriateness of ontology. The result of the test shows that diagnostic questions are automatically formulated according to the diagnosis components of this ontology and that as such diagnostic results are induced. Therefore, the ontology system in this study will be efficient to develop the diagnosis program and useful as a tool for doctors to make decision. But, it is not recommendable to apply the system to the clinical environment until the clear diagnosis standards are introduced, and the more reliable diagnosis program can be developed based on the more appropriate ontology mentioned above.

Filter-mBART Based Neural Machine Translation Using Parallel Corpus Filtering (병렬 말뭉치 필터링을 적용한 Filter-mBART기반 기계번역 연구)

  • Moon, Hyeonseok;Park, Chanjun;Eo, Sugyeong;Park, JeongBae;Lim, Heuiseok
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.5
    • /
    • pp.1-7
    • /
    • 2021
  • In the latest trend of machine translation research, the model is pretrained through a large mono lingual corpus and then finetuned with a parallel corpus. Although many studies tend to increase the amount of data used in the pretraining stage, it is hard to say that the amount of data must be increased to improve machine translation performance. In this study, through an experiment based on the mBART model using parallel corpus filtering, we propose that high quality data can yield better machine translation performance, even utilizing smaller amount of data. We propose that it is important to consider the quality of data rather than the amount of data, and it can be used as a guideline for building a training corpus.

Worker Symptom-based Chemical Substance Estimation System Design Using Knowledge Base (지식베이스를 이용한 작업자 증상 기반 화학물질 추정 시스템 설계)

  • Ju, Yongtaek;Lee, Donghoon;Shin, Eunji;Yoo, Sangwoo;Shin, Dongil
    • Journal of the Korean Institute of Gas
    • /
    • v.25 no.3
    • /
    • pp.9-15
    • /
    • 2021
  • In this paper, a study on the construction of a knowledge base based on natural language processing and the design of a chemical substance estimation system for the development of a knowledge service for a real-time sensor information fusion detection system and symptoms of contact with chemical substances in industrial sites. The information on 499 chemical substances contact symptoms from the Wireless Information System for Emergency Responders(WISER) program provided by the National Institutes of Health(NIH) in the United States was used as a reference. AllegroGraph 7.0.1 was used, input triples are Cas No., Synonyms, Symptom, SMILES, InChl, and Formula. As a result of establishing the knowledge base, it was confirmed that 39 symptoms based on ammonia (CAS No: 7664-41-7) were the same as those of the WISER program. Through this, a method of establishing was proposed knowledge base for the symptom extraction process of the chemical substance estimation system.

Development of Supervised Machine Learning based Catalog Entry Classification and Recommendation System (지도학습 머신러닝 기반 카테고리 목록 분류 및 추천 시스템 구현)

  • Lee, Hyung-Woo
    • Journal of Internet Computing and Services
    • /
    • v.20 no.1
    • /
    • pp.57-65
    • /
    • 2019
  • In the case of Domeggook B2B online shopping malls, it has a market share of over 70% with more than 2 million members and 800,000 items are sold per one day. However, since the same or similar items are stored and registered in different catalog entries, it is difficult for the buyer to search for items, and problems are also encountered in managing B2B large shopping malls. Therefore, in this study, we developed a catalog entry auto classification and recommendation system for products by using semi-supervised machine learning method based on previous huge shopping mall purchase information. Specifically, when the seller enters the item registration information in the form of natural language, KoNLPy morphological analysis process is performed, and the Naïve Bayes classification method is applied to implement a system that automatically recommends the most suitable catalog information for the article. As a result, it was possible to improve both the search speed and total sales of shopping mall by building accuracy in catalog entry efficiently.

Creating Songs Using Note Embedding and Bar Embedding and Quantitatively Evaluating Methods (음표 임베딩과 마디 임베딩을 이용한 곡의 생성 및 정량적 평가 방법)

  • Lee, Young-Bae;Jung, Sung Hoon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.11
    • /
    • pp.483-490
    • /
    • 2021
  • In order to learn an existing song and create a new song using an artificial neural network, it is necessary to convert the song into numerical data that the neural network can recognize as a preprocessing process, and one-hot encoding has been used until now. In this paper, we proposed a note embedding method using notes as a basic unit and a bar embedding method that uses the bar as the basic unit, and compared the performance with the existing one-hot encoding. The performance comparison was conducted based on quantitative evaluation to determine which method produced a song more similar to the song composed by the composer, and quantitative evaluation methods used in the field of natural language processing were used as the evaluation method. As a result of the evaluation, the song created with bar embedding was the best, followed by note embedding. This is significant in that the note embedding and bar embedding proposed in this paper create a song that is more similar to the song composed by the composer than the existing one-hot encoding.

System for Supporting the Decision about the Possibility of Concluding the Civil Law Agreements for Medical, Therapeutic and Dental Services

  • Hnatchuk, Yelyzaveta;Hovorushchenko, Tetiana;Shteinbrekher, Daria;Kysil, Tetiana
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.10
    • /
    • pp.155-164
    • /
    • 2022
  • The review of known decisions showed that currently there are no systems and technologies for supporting the decision about the possibility of concluding the civil law agreements for medical, therapeutic and dental services. The paper models the decision-making support process on the possibility of concluding the civil law agreements for medical, therapeutic and dental services, which is the theoretical basis for the development of rules, methods and system for supporting the decision about the possibility of concluding the civil law agreements for medical, therapeutic and dental services. The paper also developed the system for supporting the decision about the possibility of concluding the civil law agreements for medical, therapeutic and dental services, which automatically and free determines the possibility or impossibility of concluding the corresponding civil law agreement for the provision of a corresponding medical service. In the case of formation of a conclusion about the possibility of concluding the agreement, further conclusion and signing of the corresponding agreement takes place. In the case of forming a conclusion about the impossibility of concluding the agreement, a request is made for finalizing the relevant agreement for the provision of the relevant medical service, indicating the reasons for the impossibility of concluding the agreement - missing essential conditions in the agreement. After finalization, the agreement can be analyzed again by the developed system for supporting the decision.

Technology of Decision-Making Support Regarding the Possibility of Donation and Transplantation Considering Civil Law

  • Hnatchuk, Yelyzaveta;Hovorushchenko, Tetiana;Drapak, Georgii;Kysil, Tetiana
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.9
    • /
    • pp.307-315
    • /
    • 2022
  • The review of known decision-making support systems and technologies regarding the possibility of donation and transplantation showed that currently there are no systems and technologies of decision-making support regarding the possibility of donation and transplantation considering civil law. The paper models the decision-making support process regarding the possibility of donation and transplantation, which is a theoretical basis for the development of rules, methods and technology of decision-making support regarding the possibility of donation and transplantation considering civil law. The paper also developed the technology of decision-making support regarding the possibility of donation and transplantation considering civil law as a component of the Unified State Information System for Organ and Tissue Transplantation, which automatically and free of charge determines the possibility/impossibility of donation and transplantation. In the case of the possibility of donation, the admissible type of donation is also determined - over-life or after-life donation - and data about potential donor is entered in the relevant Donor Register. In the case of the possibility of transplantation, if the recipient needs a transplant of one of the paired organs or a part of the organ/tissue, then data about potential recipient are entered in the Transplantation List from both over-life and after-life donor, otherwise, if the recipient needs a transplant of a non-paired organ or both paired organs, then data about potential recipient are entered only in the Transplantation List from after-life donor.

Identification of User Preference Factor Using Review Information (리뷰 정보를 활용한 이용자의 선호요인 식별에 관한 연구)

  • Song, Sungjeon;Shim, Jiyoung
    • Journal of the Korean Society for information Management
    • /
    • v.39 no.3
    • /
    • pp.311-336
    • /
    • 2022
  • This study analyzed the contents of Goodreads review data, which is a social cataloging service with the participation of book users around the world, to identify the preference factors that affect book users' book recommendations in the library information service environment. To understand user preferences from a more detailed point of view, sub-datasets for each rating group, each book, and each user were constructed in the sample selection process. Stratified sampling was also performed based on the result of topic modeling of review text data to include various topics. As a result, a total of 90 preference factors belonging to 7 categories('Content', 'Character', 'Writing', 'Reading', 'Author', 'Story', 'Form') were identified. Also, the general preference factors revealed according to the ratings, as well as the patterns of preference factors revealed in books and users with clear likes and dislikes were identified. The results of this study are expected to contribute to more sophisticated recommendations in future recommendation systems by identifying specific aspects of user preference factors.

A Study on the Definition of Data Literacy for Elementary and Secondary Artificial Intelligence Education (초·중등 인공지능 교육을 위한 데이터 리터러시 정의 연구)

  • Kim, SeulKi;Kim, Taeyoung
    • 한국정보교육학회:학술대회논문집
    • /
    • 2021.08a
    • /
    • pp.59-67
    • /
    • 2021
  • The development of AI technology has brought about a big change in our lives. As AI's influence grows from life to society to the economy, the importance of education on AI and data is also growing. In particular, the OECD Education Research Report and various domestic information and curriculum studies address data literacy and present it as an essential competency. Looking at domestic and international studies, one can see that the definition of data literacy differs in its specific content and scope from researchers to researchers. Thus, the definition of major research related to data literacy was analyzed from various angles and derived from various angles. In key studies, Word2vec natural language processing methods, along with word frequency analysis used to define data literacy, are used to analyze semantic similarities and nominate them based on content elements of curriculum research to derive the definition of 'understanding and using data to process information'. Based on the definition of data literacy derived from this study, we hope that the contents will be revised and supplemented, and more research will be conducted to provide a good foundation for educational research that develops students' future capabilities.

  • PDF

A Study on Book Recovery Method Depending on Book Damage Levels Using Book Scan (북스캔을 이용한 도서 손상 단계에 따른 딥 러닝 기반 도서 복구 방법에 관한 연구)

  • Kyungho Seok;Johui Lee;Byeongchan Park;Seok-Yoon Kim;Youngmo Kim
    • Journal of the Semiconductor & Display Technology
    • /
    • v.22 no.4
    • /
    • pp.154-160
    • /
    • 2023
  • Recently, with the activation of eBook services, books are being published simultaneously as physical books and digitized eBooks. Paper books are more expensive than e-books due to printing and distribution costs, so demand for relatively inexpensive e-books is increasing. There are cases where previously published physical books cannot be digitized due to the circumstances of the publisher or author, so there is a movement among individual users to digitize books that have been published for a long time. However, existing research has only studied the advancement of the pre-processing process that can improve text recognition before applying OCR technology, and there are limitations to digitization depending on the condition of the book. Therefore, support for book digitization services depending on the condition of the physical book is needed. need. In this paper, we propose a method to support digitalization services according to the status of physical books held by book owners. Create images by scanning books and extract text information from the images through OCR. We propose a method to recover text that cannot be extracted depending on the state of the book using BERT, a natural language processing deep learning model. As a result, it was confirmed that the recovery method using BERT is superior when compared to RNN, which is widely used in recommendation technology.

  • PDF