• Title/Summary/Keyword: Knowledge extraction


A Research on Automatic Data Extract Method for Herbal Formula Combinations Using Herb and Dosage Terminology - Based on 『Euijongsonik』 - (본초 및 용량 용어를 이용한 방제구성 자동추출방법에 대한 연구 -『의종손익』을 중심으로-)

  • Keum, Yujeong;Lee, Byungwook;Eom, Dongmyung;Song, Jichung
    • Journal of Korean Medical classics / v.33 no.4 / pp.67-81 / 2020
  • Objectives : This research aims to suggest an automatic data extraction method for herbal formula combinations from the texts of medical classics. Methods : This research was carried out using Microsoft Access (Office 365) on Microsoft Windows 10. The subject text for extraction was 『Euijongsonik』. Using data sets of herb and dosage terminology, herbal medicinals and their dosages were extracted. Afterwards, using the position values of the character strings, the formula combinations were automatically extracted. Results : The PC environment of this research was an Intel Core i7-1065G7 CPU at 1.30GHz, with 8GB of RAM and a 64-bit Windows 10 operating system. Out of 6,115 verses, 19,277 herb-dosage combinations were extracted. Conclusions : This research demonstrated that, for classical texts that are available as data, knowledge on herbal medicine can be extracted without human or material resources. This suggests that classical text knowledge is applicable to clinical practice.
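
The position-based pairing described in this abstract can be sketched as follows. This is only an illustration in Python, not the authors' Microsoft Access implementation: the function names, the sample verse, and the rule "pair each herb with the first dosage term found before the next herb" are all assumptions made for the example.

```python
def find_terms(text, terms):
    """Return sorted (position, term) hits for every dictionary term in text."""
    hits = []
    for term in terms:
        start = text.find(term)
        while start != -1:
            hits.append((start, term))
            start = text.find(term, start + 1)
    return sorted(hits)


def extract_pairs(text, herbs, dosages):
    """Pair each herb with the first dosage located between it and the next herb."""
    herb_hits = find_terms(text, herbs)
    dose_hits = find_terms(text, dosages)
    pairs = []
    for i, (pos, herb) in enumerate(herb_hits):
        end = pos + len(herb)
        # The search window closes where the next herb begins (or at end of text).
        next_herb = herb_hits[i + 1][0] if i + 1 < len(herb_hits) else len(text)
        dose = next((d for p, d in dose_hits if end <= p < next_herb), None)
        pairs.append((herb, dose))
    return pairs


# Hypothetical verse and term lists, for illustration only.
verse = "人蔘一錢白朮二錢甘草五分"
herb_terms = ["人蔘", "白朮", "甘草"]
dosage_terms = ["一錢", "二錢", "五分"]
pairs = extract_pairs(verse, herb_terms, dosage_terms)
```

A herb with no dosage term before the next herb is paired with `None`, which leaves such cases visible for manual review.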

Rough Entropy-based Knowledge Reduction using Rough Set Theory (러프집합 이론을 이용한 러프 엔트로피 기반 지식감축)

  • Park, In-Kyoo
    • Journal of Digital Convergence / v.12 no.6 / pp.223-229 / 2014
  • To retrieve useful information for efficient decision-making in a large knowledge system, refined feature selection is generally necessary and important. Rough set theory has difficulty generating optimal reducts and classifying boundary objects. In this paper, we propose a quick reduction algorithm that generates optimal features through rough entropy analysis of condition and decision attributes to overcome these restrictions. We define a new conditional information entropy for efficient feature extraction and describe a feature selection procedure that classifies the significance of features. Through simulation on 5 datasets from the UCI repository, we compare our rough-set-based feature selection approach with other selection theories. As a result, our modeling method is more efficient than the previous theories in classification accuracy for feature selection.
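
The abstract does not give the authors' new conditional entropy, so the sketch below substitutes the standard Shannon conditional entropy H(D | B) computed over the equivalence classes induced by an attribute subset B, together with a simple greedy forward selection. The function names and the toy decision table are assumptions for illustration, not the paper's algorithm.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, attrs, decision):
    """Shannon entropy of the decision attribute given the partition induced by attrs."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in attrs)].append(row[decision])
    total = len(rows)
    h = 0.0
    for labels in groups.values():
        weight = len(labels) / total
        for count in Counter(labels).values():
            p = count / len(labels)
            h -= weight * p * log2(p)
    return h


def greedy_reduct(rows, attrs, decision):
    """Add attributes one at a time until H(D | selected) matches H(D | all attrs)."""
    target = conditional_entropy(rows, attrs, decision)
    selected, remaining = [], list(attrs)
    while remaining and conditional_entropy(rows, selected, decision) > target + 1e-12:
        best = min(remaining,
                   key=lambda a: conditional_entropy(rows, selected + [a], decision))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy decision table (hypothetical): the decision "d" depends only on attribute "a".
table = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
reduct = greedy_reduct(table, ["a", "b", "c"], "d")
```

Greedy selection of this kind is fast but is not guaranteed to find a minimal reduct.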

KNOWLEDGE-BASED BOUNDARY EXTRACTION OF MULTI-CLASSES OBJECTS

  • Park, Hae-Chul;Shin, Ho-Chul;Lee, Jin-Sung;Cho, Ju-Hyun;Kim, Seong-Dae
    • Proceedings of the IEEK Conference / 2003.07e / pp.1968-1971 / 2003
  • We propose a knowledge-based algorithm for extracting an object boundary from a low-quality image such as a forward-looking infrared image. With a multi-class training data set, the global shape is modeled by multispace KL (MKL) [1] and a curvature model. The objective function for fitting the deformable boundary template, represented by the shape model, to the true boundary in an input image is formulated by Bayes' rule. Simulation results show that our method is more accurate with a multi-class training set and has a lower computational cost than the point distribution model (PDM) [2]. It works well under distortion caused by noise, pose variation, and some kinds of occlusion.

A Regularity-Based Preprocessing Method for Collaborative Recommender Systems

  • Toledo, Raciel Yera;Mota, Yaile Caballero;Borroto, Milton Garcia
    • Journal of Information Processing Systems / v.9 no.3 / pp.435-460 / 2013
  • Recommender systems are popular applications that help users to identify items that they could be interested in. A recent research area on recommender systems focuses on detecting several kinds of inconsistencies associated with the user preferences. However, the majority of previous works in this direction just process anomalies that are intentionally introduced by users. In contrast, this paper is centered on finding the way to remove non-malicious anomalies, specifically in collaborative filtering systems. A review of the state-of-the-art in this field shows that no previous work has been carried out for recommendation systems and general data mining scenarios, to exactly perform this preprocessing task. More specifically, in this paper we propose a method that is based on the extraction of knowledge from the dataset in the form of rating regularities (similar to frequent patterns), and their use in order to remove anomalous preferences provided by users. Experiments show that the application of the procedure as a preprocessing step improves the performance of a data-mining task associated with the recommendation and also effectively detects the anomalous preferences.

Named Entity Recognition with Structural SVMs and Pegasos algorithm (Structural SVMs 및 Pegasos 알고리즘을 이용한 한국어 개체명 인식)

  • Lee, Chang-Ki;Jang, Myun-Gil
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.655-667 / 2010
  • The named entity recognition task is one of the most important subtasks in Information Extraction. In this paper, we describe a Korean named entity recognition method using structural Support Vector Machines (structural SVMs) and a modified Pegasos algorithm. Using the proposed approach, we achieved an F1 of 85.43% on the TV domain and 86.79% on the sports domain for 15 named entity types. Moreover, we reduced the training time to 4% of that of Conditional Random Fields (CRFs) without loss of performance.
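
For readers unfamiliar with Pegasos, the following is a minimal sketch of the basic stochastic sub-gradient update for a binary linear SVM. It is not the structural SVM variant or the modified algorithm used in the paper, it omits the optional projection step, and the toy data, function names, and hyperparameters are assumptions.

```python
import random


def pegasos_train(X, y, lam=0.1, n_iters=1000, seed=0):
    """Plain Pegasos: one randomly chosen example per step, step size 1 / (lam * t)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))
        eta = 1.0 / (lam * t)
        # Margin is computed with the weights from before this step's update.
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrink the weights (sub-gradient of the regularization term).
        w = [(1.0 - eta * lam) * wj for wj in w]
        if margin < 1.0:
            # Hinge loss is active, so move toward the violated example.
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Linearly separable toy data (hypothetical).
X_toy = [(2.0, 2.0), (3.0, 1.0), (2.0, 3.0), (-2.0, -2.0), (-3.0, -1.0), (-2.0, -3.0)]
y_toy = [1, 1, 1, -1, -1, -1]
w_toy = pegasos_train(X_toy, y_toy)
predictions = [predict(w_toy, x) for x in X_toy]
```

Each step touches a single training example, which is the property that makes this family of solvers fast to train.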

An Expresson of Domain Searching Term Weight using Fuzzy (퍼지를 이용한 도메인 검색용어 중요성의 표시)

  • Jin, Hyun-Soo;Hong, You-Sik
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.4 / pp.139-144 / 2009
  • Stratifying technical internet domain terms, with the aim of accumulating knowledge that machines can comprehend, has been widely used in recent years. If domain term weights are stratified, we believe that machines can manage and analyze information on their own using the ontology. In this paper, we propose an algorithm that allows us to extract properties of ontology weight from structured information already existing in web documents. In particular, by stratifying the domain knowledge that is composed of property information, we were able to improve the algorithm and the quality of the extraction results. In our experiments with 50 thousand targeted documents, we were able to extract property information with 94% confidence.

Restructuring a Feed-forward Neural Network Using Hidden Knowledge Analysis (학습된 지식의 분석을 통한 신경망 재구성 방법)

  • Kim, Hyeon-Cheol
    • Journal of KIISE:Software and Applications / v.29 no.5 / pp.289-294 / 2002
  • It is known that restructuring a feed-forward neural network affects the generalization capability and efficiency of the network. In this paper, we introduce a new approach to restructuring a neural network using abstraction of the hidden knowledge that the network has learned. This method involves extracting local rules from non-input nodes and aggregating the rules into a global rule base. The extracted local rules are used for pruning unnecessary connections of local nodes, and the aggregation eliminates any possible redundancies and inconsistencies among local rule-based structures. The final network is generated from the global rule-based structure. The complexity of the final network is much reduced compared to a fully-connected neural network, and generalization capability is improved. Empirical results are also shown.

Analysis for Compatibility of Daehwangmokdan-tang and Its Pharmacological Activities (대황목란탕(大黃牧丹湯)의 배오(配伍)분석과 그 약리활성)

  • Kim, Do Hoy;Yoon, Michung;Shin, Soon Shik
    • Herbal Formula Science / v.26 no.1 / pp.81-102 / 2018
  • Objectives : I analysed daehwangmokdan-tang's compatibility principle by the system of chief, deputy, assistant, and envoy and investigated its pharmacological activities by categorizing them into chemical components, molecular level, cellular level, animal level and human level based on Korean and Chinese studies of this formula. Methods : Daehwangmokdan-tang's compatibility principle was examined by the system of chief, deputy, assistant, and envoy. I looked into studies about daehwangmokdan-tang that were presented in Korea from 1956 to 2016 through the KOREA INSTITUTE OF ORIENTAL MEDICINE, Korean medicine information system (OASIS) and in Chinese over 20 years through the China National Knowledge Infrastructure (CNKI). The studies were then classified into chemical components, molecular level, cellular level, animal level and human level for analysis. Results : According to the system of chief, deputy, assistant, and envoy, the chief herbs are Rhei Radix et Rhizoma and Moutan Cortex, the deputy herbs are Natrii Sulfas and Persicae Semen, and the assistant and envoy herb is Trichosanthis Semen. The amount of extraction of paeonol, total anthraquinone, and conjugated anthraquinone from daehwangmokdan-tang with the formulation of the system of chief, deputy, assistant, and envoy was the highest, and in the formulation of chief herb and deputy herb, the extraction amount of paeonol and conjugated anthraquinone was the lowest. With other formulations, the amount of extraction of total anthraquinone and conjugated anthraquinone was improved, although the degree was different. In particular, when it is blended with Persicae Semen as a deputy herb, the extraction amount of total anthraquinone and conjugated anthraquinone of Rhei Radix et Rhizoma as a chief herb is greatly increased, whereas the extraction amount of paeonol, by contrast, is lowered. It was found that the amount of Mg, Ca, K, and Na in daehwangmokdan-tang was the highest.
Pharmacological activities were detected in inflammatory mediators and enzymes at the molecular level. At the cellular level, they were determined in the lipopolysaccharide (LPS)-stimulated RAW 264.7 cell line. At the animal level, in mice and rats, and at the human level, in inflammatory diseases (acute appendicitis, acute pancreatitis, acute cholecystitis, acute abdominal disease, etc.), pharmacological activities were observed. Conclusions : From the above results, daehwangmokdan-tang is composed in line with the system of chief, deputy, assistant, and envoy, suggesting that it has a certain rational and scientific basis. The pharmacological activities of daehwangmokdan-tang include anti-inflammation, improvement of sepsis, analgesia, muscle relaxation, and improvement of intestinal flora and its metabolites. Daehwangmokdan-tang is consistent with the actions of diuresis and anti-inflammation in terms of its elemental content, suggesting that it has the actions of purging fire and removing blood stasis, detoxifying through defecation, cooling blood and clearing heat, and activating blood and dispersing stasis.

Automatic Recognition and Normalization System of Korean Time Expression using the individual time units (시간의 단위별 처리를 이용한 자동화된 한국어 시간 표현 인식 및 정규화 시스템)

  • Seon, Choong-Nyoung;Kang, Sang-Woo;Seo, Jung-Yun
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.447-458 / 2010
  • Time expressions are a very important form of information in different types of data. Thus, the recognition of a time expression is an important factor in the field of information extraction. However, most previously designed systems consider only a specific domain, because time expressions do not have a regular form and frequently include different ellipsis phenomena. We present a two-level recognition method consisting of extraction and transformation phases to achieve generality and portability. In the extraction phase, time expressions are extracted by atomic time units for extensibility. Then, in the transformation phase, omitted information is restored using basis time and prior knowledge. Finally, every complete atomic time unit is transformed into a normalized form. The proposed system can be used as a general-purpose system, because it has a language- and domain-independent architecture. In addition, this system performs robustly in noisy data like SMS data, which include various errors. For SMS data, the accuracies of time-expression extraction and time-expression normalization by using the proposed system are 93.8% and 93.2%, respectively. On the basis of these experimental results, we conclude that the proposed system shows high performance in noisy data.
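
The two phases described in this abstract (extract atomic time units, then restore omitted units from a basis time and emit a normalized form) can be sketched as below. This is a toy illustration rather than the authors' system: the regular expression covers only digit-plus-suffix expressions, and the restoration policy (inherit coarser missing units from the basis time, reset finer ones to their minimum) is an assumption.

```python
import re
from datetime import datetime

# Atomic time units, ordered from coarse to fine, with their Korean suffixes.
UNITS = [("year", "년"), ("month", "월"), ("day", "일"), ("hour", "시"), ("minute", "분")]
SUFFIX_TO_FIELD = {suffix: field for field, suffix in UNITS}
PATTERN = re.compile(r"(\d+)\s*(년|월|일|시|분)")
DEFAULTS = {"month": 1, "day": 1, "hour": 0, "minute": 0}


def extract_units(text):
    """Extraction phase: collect each atomic unit found in the text."""
    return {SUFFIX_TO_FIELD[suffix]: int(num) for num, suffix in PATTERN.findall(text)}


def normalize(text, base):
    """Transformation phase: restore omitted units and format as ISO-like text."""
    found = extract_units(text)
    if not found:
        return None
    order = [field for field, _ in UNITS]
    coarsest = min(order.index(field) for field in found)
    fields = {}
    for i, field in enumerate(order):
        if field in found:
            fields[field] = found[field]
        elif i < coarsest:
            fields[field] = getattr(base, field)  # inherit from the basis time
        else:
            fields[field] = DEFAULTS[field]       # reset omitted finer units
    return datetime(**fields).strftime("%Y-%m-%dT%H:%M")


basis_time = datetime(2010, 1, 1, 9, 30)
normalized = normalize("3월 5일 14시", basis_time)
```

Relative expressions and the ellipsis phenomena mentioned in the abstract are outside the scope of this sketch.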

Export Control System based on Case Based Reasoning: Design and Evaluation (사례 기반 지능형 수출통제 시스템 : 설계와 평가)

  • Hong, Woneui;Kim, Uihyun;Cho, Sinhee;Kim, Sansung;Yi, Mun Yong;Shin, Donghoon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.109-131 / 2014
  • As the demand for nuclear power plant equipment is continuously growing worldwide, the importance of handling nuclear strategic materials is also increasing. While the number of cases submitted for the export of nuclear-power commodities and technology is dramatically increasing, preadjudication (or, more simply, prescreening) of strategic materials has so far been done by experts with long experience and extensive field knowledge. However, there is a severe shortage of experts in this domain, not to mention that it takes a long time to develop an expert. Because human experts must manually evaluate all the documents submitted for export permission, the current practice of nuclear material export is neither time-efficient nor cost-effective. To alleviate the problem of relying only on costly human experts, our research proposes a new system designed to help field experts make their decisions more effectively and efficiently. The proposed system is built upon case-based reasoning, which in essence extracts key features from the existing cases, compares the features with the features of a new case, and derives a solution for the new case by referencing similar cases and their solutions. Our research proposes a framework for a case-based reasoning system, designs a case-based reasoning system for the control of nuclear material exports, and evaluates the performance of alternative keyword extraction methods (fully automatic, fully manual, and semi-automatic). A keyword extraction method is an essential component of the case-based reasoning system as it is used to extract key features of the cases. The fully automatic method was conducted using TF-IDF, which is a widely used de facto standard method for representative keyword extraction in text mining.
TF (Term Frequency) is based on the frequency count of the term within a document, showing how important the term is within a document, while IDF (Inverse Document Frequency) is based on the infrequency of the term within a document set, showing how uniquely the term represents the document. The results show that the semi-automatic approach, which is based on the collaboration of machine and human, is the most effective solution regardless of whether the human is a field expert or a student who majors in nuclear engineering. Moreover, we propose a new approach to computing nuclear document similarity along with a new framework for document analysis. The proposed algorithm for nuclear document similarity considers both document-to-document similarity ($\alpha$) and document-to-nuclear system similarity ($\beta$) in order to derive the final score ($\gamma$) for deciding whether the presented case is of strategic material or not. The final score ($\gamma$) represents the document similarity between the past cases and the new case. The score is derived not only from conventional TF-IDF but also from a nuclear system similarity score, which takes the context of the nuclear system domain into account. Finally, the system retrieves the top-3 documents stored in the case base that are considered the most similar to the new case, and provides them with a degree of credibility. With this final score and the credibility score, it becomes easier for a user to see which documents in the case base are more worth looking up, so that the user can make a proper decision at relatively lower cost. The evaluation of the system has been conducted by developing a prototype and testing it with field data. The system workflows and outcomes have been verified by the field experts.
This research is expected to contribute to the growth of the knowledge service industry by proposing a new system that can effectively reduce the burden of relying on costly human experts for the export control of nuclear materials and that can be considered a meaningful example of a knowledge service application.
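
The TF-IDF weighting and top-3 retrieval mentioned in this abstract can be illustrated with a small sketch. It covers only conventional TF-IDF with cosine similarity; the abstract does not specify how the scores α and β are combined into γ, so that step is not attempted here. The tokenized documents, function names, and the particular TF and IDF formulas (relative frequency and natural log of N/df) are assumptions.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector (dict) per non-empty tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k_similar(case_base, new_case, k=3):
    """Return the k (score, index) pairs of past cases most similar to new_case."""
    vectors = tf_idf_vectors(case_base + [new_case])
    query = vectors[-1]
    scored = [(cosine(query, vec), idx) for idx, vec in enumerate(vectors[:-1])]
    return sorted(scored, reverse=True)[:k]


# Hypothetical, already-tokenized case documents.
past_cases = [
    ["reactor", "coolant", "pump"],
    ["zirconium", "tube", "alloy"],
    ["office", "chair"],
    ["coolant", "valve"],
]
ranking = top_k_similar(past_cases, ["reactor", "coolant", "pump", "valve"])
```

Terms that occur in every document receive an IDF of zero under this formula, so they do not contribute to the similarity score.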