• Title/Summary/Keyword: Knowledge extraction

An Analysis of Trends in Natural Language Processing Research in the Field of Science Education (과학교육 분야 자연어 처리 기법의 연구동향 분석)

  • Cheolhong Jeon;Suna Ryu
    • Journal of The Korean Association For Science Education, v.44 no.1, pp.39-55, 2024
  • This study examined research trends related to Natural Language Processing (NLP) in science education by analyzing 37 domestic and international documents that applied NLP techniques in the field from 2011 to September 2023. In particular, the study systematically analyzed the content, focusing on the main application areas of NLP techniques in science education, the role of teachers when utilizing NLP techniques, and a comparison of domestic and international perspectives. The analysis results are as follows. First, NLP techniques are used extensively in formative assessment, automatic scoring, literature review and classification, and pattern extraction in science education. In formative assessment, NLP allows real-time analysis of students' learning processes and comprehension, reducing teachers' workload and providing accurate, effective feedback to students. In automatic scoring, it contributes to the rapid and precise evaluation of students' responses. In literature review and classification, it helps to analyze the topics and trends of research related to science education and student reports effectively, and to set future research directions. In pattern extraction, it allows effective analysis of commonalities and patterns in students' thoughts and responses. Second, the introduction of NLP techniques in science education has expanded the role of teachers from mere transmitters of knowledge to facilitators who support and guide students' learning, requiring teachers to continuously develop their expertise. Third, as domestic research on NLP is focused on literature review and classification, an environment conducive to the easy collection of text data is needed to diversify NLP research in Korea. Based on these results, the study discussed ways to utilize NLP techniques in science education.
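To make the pattern-extraction use case above concrete, here is a minimal, stdlib-only sketch that groups short student responses by cosine similarity of their bag-of-words vectors. The responses and the 0.5 threshold are invented for illustration; no system from the reviewed studies is being reproduced.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_responses(responses, threshold=0.5):
    """Greedily cluster responses whose similarity to a group's first
    member exceeds the threshold."""
    vectors = [Counter(r.lower().split()) for r in responses]
    groups = []  # each group is a list of response indices
    for i, v in enumerate(vectors):
        for g in groups:
            if cosine(v, vectors[g[0]]) >= threshold:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

responses = [
    "plants make food using sunlight",
    "plants use sunlight to make food",
    "the moon orbits the earth",
]
print(group_responses(responses))  # the two photosynthesis answers group together
```

Real systems would use lemmatization and richer representations, but the grouping idea is the same.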

Detection of Protein Subcellular Localization based on Syntactic Dependency Paths (구문 의존 경로에 기반한 단백질의 세포 내 위치 인식)

  • Kim, Mi-Young
    • The KIPS Transactions:PartB, v.15B no.4, pp.375-382, 2008
  • A protein's subcellular localization is considered an essential part of the description of its associated biomolecular phenomena. As the volume of biomolecular reports has increased, there has been a great deal of research on text mining to detect protein subcellular localization information in documents. It has been argued that linguistic information, especially syntactic information, is useful for identifying the subcellular localizations of proteins of interest. However, previous systems for detecting protein subcellular localization information used only shallow syntactic parsers, and showed poor performance. Thus, there remains a need to use a full syntactic parser and to apply deep linguistic knowledge to the analysis of text for protein subcellular localization information. In addition, we have attempted to use semantic information from the WordNet thesaurus. To improve performance in detecting protein subcellular localization information, this paper proposes a three-step method based on a full syntactic dependency parser and WordNet thesaurus. In the first step, we constructed syntactic dependency paths from each protein to its location candidate, and then converted the syntactic dependency paths into dependency trees. In the second step, we retrieved root information of the syntactic dependency trees. In the final step, we extracted syn-semantic patterns of protein subtrees and location subtrees. From the root and subtree nodes, we extracted syntactic category and syntactic direction as syntactic information, and synset offset of the WordNet thesaurus as semantic information. According to the root information and syn-semantic patterns of subtrees from the training data, we extracted (protein, localization) pairs from the test sentences. Even with no biomolecular knowledge, our method showed reasonable performance in experimental results using Medline abstract data. 
Our proposed method gave an F-measure of 74.53% for training data and 58.90% for test data, significantly outperforming previous methods, by 12-25%.
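The first step above, constructing a syntactic dependency path from a protein mention to a location candidate, can be sketched as follows. The sentence and its parse (head indices) are invented toy inputs, not output of the paper's full parser.

```python
def dep_path(heads, src, dst):
    """Return the token indices on the dependency path from src to dst.

    `heads[i]` is the head index of token i (-1 for the root)."""
    def ancestors(i):
        chain = [i]
        while heads[i] != -1:
            i = heads[i]
            chain.append(i)
        return chain

    up_src, up_dst = ancestors(src), ancestors(dst)
    common = next(i for i in up_src if i in up_dst)  # lowest common ancestor
    left = up_src[:up_src.index(common) + 1]         # src up to the LCA
    right = up_dst[:up_dst.index(common)]            # dst up to (not incl.) LCA
    return left + right[::-1]

# Toy sentence: "RAD53 localizes to the nucleus" (invented parse).
tokens = ["RAD53", "localizes", "to", "the", "nucleus"]
heads  = [1, -1, 1, 4, 2]
path = dep_path(heads, 0, 4)
print([tokens[i] for i in path])  # path from the protein to its location
```

In the paper's pipeline, such paths are then converted to trees whose root and subtree nodes supply the syntactic and WordNet-based features.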

A Fast Iris Region Finding Algorithm for Iris Recognition (홍채 인식을 위한 고속 홍채 영역 추출 방법)

  • 송선아;김백섭;송성호
    • Journal of KIISE:Software and Applications, v.30 no.9, pp.876-884, 2003
  • It is essential to identify both the pupil and iris boundaries for iris recognition. The circular edge detector proposed by Daugman is the most common and powerful method for iris region extraction. The method is accurate but requires considerable computation time since it is based on an exhaustive search. Some heuristic methods have been proposed to reduce the computation time, but they are not as accurate as Daugman's. In this paper, we propose a pupil and iris boundary finding algorithm that is faster than, and as accurate as, Daugman's. The proposed algorithm searches the boundaries using Daugman's circular edge detector but reduces the search region using problem domain knowledge. To find the pupil boundary, the search is restricted to the region between the maximum and minimum bounding circles in which the pupil resides; these bounding circles are obtained from the binarized pupil image. Two initial iris boundary points are then obtained from the horizontal line passing through the center of the detected pupil region. These initial boundary points, together with the pupil center, define two bounding circles, and the iris boundary is searched within them. Experiments show that the proposed algorithm is faster than Daugman's and more accurate than the conventional heuristic methods.
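The radial search at the heart of Daugman's circular edge detector can be illustrated with a simplified, fixed-center version: it looks for the radius at which the mean intensity along a circle jumps the most. The synthetic image and search range below are invented; a real implementation also searches over center coordinates and smooths the radial derivative.

```python
import math

def circular_mean(img, cx, cy, r, samples=64):
    """Mean intensity along a circle of radius r centred at (cx, cy)."""
    total = 0.0
    for k in range(samples):
        a = 2 * math.pi * k / samples
        x = min(max(int(round(cx + r * math.cos(a))), 0), len(img[0]) - 1)
        y = min(max(int(round(cy + r * math.sin(a))), 0), len(img) - 1)
        total += img[y][x]
    return total / samples

def find_boundary_radius(img, cx, cy, r_min, r_max):
    """Radius with the largest radial jump in circular mean intensity --
    a 1-D, fixed-center version of Daugman's integro-differential search."""
    means = {r: circular_mean(img, cx, cy, r) for r in range(r_min, r_max + 1)}
    best_r, best_jump = r_min, -1.0
    for r in range(r_min + 1, r_max + 1):
        jump = abs(means[r] - means[r - 1])
        if jump > best_jump:
            best_jump, best_r = jump, r
    return best_r

# Synthetic 40x40 image: dark pupil disk (radius 8) on a bright background.
img = [[0 if (x - 20) ** 2 + (y - 20) ** 2 <= 64 else 255
        for x in range(40)] for y in range(40)]
print(find_boundary_radius(img, 20, 20, 3, 15))
```

The paper's contribution is to shrink `r_min`/`r_max` (and the center search) using the bounding circles, so far fewer candidate circles are evaluated.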

Effect of Rule Identification in Acquiring Rules from Web Pages (웹 페이지의 내재 규칙 습득 과정에서 규칙식별 역할에 대한 효과 분석)

  • Kang, Ju-Young;Lee, Jae-Kyu;Park, Sang-Un
    • Journal of Intelligence and Information Systems, v.11 no.1, pp.123-151, 2005
  • In the world of Web pages, there are oceans of documents in natural language texts and tables. To extract rules from Web pages and maintain consistency between them, we have developed the framework of XRML (eXtensible Rule Markup Language). XRML allows the identification of rules on Web pages and generates the identified rules automatically. For this purpose, we have designed the Rule Identification Markup Language (RIML), which is similar to the formal Rule Structure Markup Language (RSML), both as parts of XRML. RIML is designed to identify rules not only from texts but also from tables on Web pages, and to transform them into formal rules in RSML syntax automatically. While designing RIML, we considered the features of shared variables and values, omitted terms, and synonyms. Using these features, rules can be identified or changed once, automatically generating their corresponding RSML rules. We conducted an experiment to evaluate the effect of the RIML approach with real-world Web pages of Amazon.com, BarnesandNoble.com, and Powells.com. We found that 97.7% of the rules can be detected on the Web pages, and that the completeness of the generated rule components is 88.5%. This is good evidence that XRML can facilitate the extraction and maintenance of rules from Web pages while building expert systems in the Semantic Web environment.
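As a loose illustration of the rule-identification idea (not actual RIML or RSML syntax, which the abstract does not reproduce), the sketch below detects "If ..., then ..." sentences in page text and emits structured (condition, action) pairs. The shipping-policy text is invented.

```python
import re

# Illustrative only: mimics the *idea* of identifying a rule embedded in page
# text and emitting a structured form; it is not RIML/RSML markup.
RULE_PATTERN = re.compile(r"[Ii]f (?P<cond>.+?),? then (?P<action>.+?)\.")

def identify_rules(page_text):
    """Extract (condition, action) pairs from sentences of the form
    'If <condition>, then <action>.'"""
    return [(m.group("cond").strip(), m.group("action").strip())
            for m in RULE_PATTERN.finditer(page_text)]

text = ("Free shipping policy: If the order total exceeds $25, "
        "then shipping is free. Otherwise standard rates apply.")
print(identify_rules(text))
```

XRML's point is that such identified rules stay linked to the page text, so editing the page once can regenerate the formal rules consistently.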

Using the METHONTOLOGY Approach to a Graduation Screen Ontology Development: An Experiential Investigation of the METHONTOLOGY Framework

  • Park, Jin-Soo;Sung, Ki-Moon;Moon, Se-Won
    • Asia Pacific Journal of Information Systems, v.20 no.2, pp.125-155, 2010
  • Ontologies have been adopted in various business and scientific communities as a key component of the Semantic Web. Despite the increasing importance of ontologies, ontology developers still perceive construction tasks as a challenge. A clearly defined and well-structured methodology can reduce the time required to develop an ontology and increase the probability of success of a project. However, no reliable knowledge-engineering methodology for ontology development currently exists; every methodology has been tailored toward the development of a particular ontology. In this study, we developed a Graduation Screen Ontology (GSO). The graduation screen domain was chosen for several reasons. First, the graduation screen process is a complicated task requiring a complex reasoning process. Second, GSO may be reused by other universities because the graduation screen process is similar at most universities. Finally, GSO can be built within a given period because the size of the selected domain is reasonable. No standard ontology development methodology exists; thus, one of the existing ontology development methodologies had to be chosen. The most important considerations for selecting the ontology development methodology for GSO included whether it can be applied to a new domain, whether it covers a broad set of development tasks, and whether it gives a sufficient explanation of each development task. We evaluated various ontology development methodologies based on the evaluation framework proposed by Gómez-Pérez et al. We concluded that METHONTOLOGY was the most applicable to the building of GSO for this study. METHONTOLOGY was derived from the experience of developing the Chemical Ontology at the Polytechnic University of Madrid by Fernández-López et al. and is regarded as the most mature ontology development methodology.
METHONTOLOGY describes a very detailed approach for building an ontology in a centralized development environment at the conceptual level. The methodology consists of three broad processes, each containing specific sub-processes: management (scheduling, control, and quality assurance); development (specification, conceptualization, formalization, implementation, and maintenance); and support (knowledge acquisition, evaluation, documentation, configuration management, and integration). An ontology development language and an ontology development tool for GSO construction also had to be selected. We adopted OWL-DL as the ontology development language; OWL was selected for its computational support for consistency checking and classification, which is crucial in developing coherent and useful ontological models for very complex domains. In addition, Protégé-OWL was chosen as the ontology development tool because it is supported by METHONTOLOGY and is widely used owing to its platform-independent characteristics. Based on the researchers' experience of developing GSO, some issues relating to METHONTOLOGY, OWL-DL, and Protégé-OWL were identified. We focus on presenting the drawbacks of METHONTOLOGY and discussing how each weakness could be addressed. First, METHONTOLOGY insists that domain experts who have no ontology construction experience can easily build ontologies. However, it is still difficult for such domain experts to develop a sophisticated ontology, especially if they have insufficient background knowledge related to the ontology. Second, METHONTOLOGY does not include a pre-development stage called the "feasibility study." This stage helps developers ensure not only that a planned ontology is necessary and sufficiently valuable to begin an ontology building project, but also that the project is likely to be successful.
Third, METHONTOLOGY excludes an explanation of the use and integration of existing ontologies. If an additional stage for considering reuse were introduced, developers could share the benefits of reuse. Fourth, METHONTOLOGY fails to address the importance of collaboration. The methodology needs to explain how to allocate specific tasks to different developer groups, and how to combine these tasks once the given jobs are completed. Fifth, METHONTOLOGY does not sufficiently describe the methods and techniques to be applied in the conceptualization stage. Introducing methods for extracting concepts from multiple informal sources, or for identifying relations, may enhance the quality of ontologies. Sixth, METHONTOLOGY does not provide an evaluation process to confirm whether WebODE perfectly transforms a conceptual ontology into a formal ontology, nor does it guarantee that the outcomes of the conceptualization stage are completely reflected in the implementation stage. Seventh, METHONTOLOGY needs to add criteria for user evaluation of the actual use of the constructed ontology in user environments. Eighth, although METHONTOLOGY allows continual knowledge acquisition during the ontology development process, consistent updates can be difficult for developers. Ninth, METHONTOLOGY demands that developers complete various documents during the conceptualization stage; thus, it can be considered a heavyweight methodology. Adopting an agile methodology would reinforce active communication among developers and reduce the burden of documentation. Finally, this study concludes with contributions and practical implications. No previous research has addressed issues related to METHONTOLOGY from empirical experience; this study is an initial attempt. In addition, several lessons learned from the development experience are discussed.
This study also affords some insights for ontology methodology researchers who want to design a more advanced ontology development methodology.

An Automatic Extraction of English-Korean Bilingual Terms by Using Word-level Presumptive Alignment (단어 단위의 추정 정렬을 통한 영-한 대역어의 자동 추출)

  • Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering, v.2 no.6, pp.433-442, 2013
  • A set of bilingual terms is one of the most important resources for building language-related applications such as machine translation systems and cross-lingual information systems. In this paper, we introduce a new approach that automatically extracts candidate English-Korean bilingual terms by using a bilingual parallel corpus and a basic English-Korean lexicon. The approach can be useful even when the parallel corpus is small. Sentence alignment is performed first on the document-level parallel corpus. Words between a pair of aligned sentences can then be aligned by referencing the basic bilingual lexicon. For the remaining unaligned words, several assumptions are applied to align bilingual term candidates across the two languages; location within a sentence, relations between words, and linguistic information about the two languages are examples of such assumptions. Experimental results show approximately 71.7% accuracy for the English-Korean bilingual term candidates automatically extracted from a parallel corpus of 1,000 sentence pairs.
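A minimal sketch of word-level presumptive alignment in the spirit described above: known pairs come from the basic bilingual lexicon, and each leftover source word is aligned to the free target word at the closest relative sentence position (a strong simplification of the paper's location assumption). The lexicon and sentence pair are invented.

```python
def align_words(src_tokens, tgt_tokens, lexicon):
    """Align words of one aligned sentence pair.

    Lexicon matches are taken first; each remaining source word is
    presumptively aligned to the unused target word whose relative
    position in the sentence is closest."""
    aligned, used, leftover_src = [], set(), []
    for i, s in enumerate(src_tokens):
        for j, t in enumerate(tgt_tokens):
            if j not in used and lexicon.get(s) == t:
                aligned.append((s, t))
                used.add(j)
                break
        else:
            leftover_src.append(i)
    for i in leftover_src:
        free = [j for j in range(len(tgt_tokens)) if j not in used]
        if not free:
            break
        rel = i / max(len(src_tokens) - 1, 1)  # relative source position
        j = min(free, key=lambda j: abs(j / max(len(tgt_tokens) - 1, 1) - rel))
        aligned.append((src_tokens[i], tgt_tokens[j]))
        used.add(j)
    return aligned

lexicon = {"machine": "기계", "translation": "번역"}
pairs = align_words(["machine", "translation", "system"],
                    ["기계", "번역", "시스템"], lexicon)
print(pairs)
```

Here `system`/`시스템` is not in the lexicon, so the positional heuristic supplies the new candidate pair, which is the kind of output the extraction step collects.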

The Background and Current Research Applied to Development of Korean Cosmetics Based on Traditional Asian Medicine (한국 한방화장품 발달 배경 및 연구 현황)

  • Cho, Gayoung;Park, Hyomin;Choi, Sowoong;Kwon, Leekyung;Cho, Sunga;Suh, Byungfhy;Kim, Namil
    • The Journal of Korean Medical History, v.30 no.2, pp.63-71, 2017
  • Traditional Asian medicine has an extensive evidence base, built upon thousands of years of experience within Asia of curing various diseases. Only recently, within the past two centuries, have modern medical scientists developed an interest in traditional Asian medicine. Asian medicine tends to be regarded only as an adjunctive therapy and viewed as a largely unproven alternative medicine that complements Western medicine, used in some cases to establish a new paradigm of "integrative medicine". This article reviews how Korean herbal cosmetics emerged by applying traditional Asian medicine to the science of cosmetics. The characteristics of Korean herbal cosmetics are examined through examples of their history, concepts, and traditions. With advancements in biotechnology, studies are now being conducted on the dermatological effects and processing methods of herbal ingredients, including ginseng. The authors describe current research on the identification of the active ingredients of herbs, extraction methods, and bio-processing of ingredients to improve the biological efficacy of herbs on the skin. A summary of studies focused on modern reinterpretations of ageing theories, such as the 'Seven year aging cycle', is provided. In conclusion, the development of Korean cosmetic products is based on the accumulated knowledge of thousands of years of experience, including: 1) the practical heritage of traditional Asian medicines such as Donguibogam; 2) excellent medicinal plants, such as ginseng, that are native to Korea; and 3) innovative attempts to modernize materials, processes, and principles.

Effect of Bio-Oss grafts on tooth eruption: an experimental study in a canine model (Bio-Oss 골이식이 치아맹출에 미치는 영향에 관한 동물실험 연구)

  • Kim, Ji-Hun;Chang, Chae-Ri;Choi, Byung-Ho
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons, v.36 no.6, pp.528-532, 2010
  • Introduction: There are few reports on tooth eruption through Bio-Oss grafts. To our knowledge, there are no reports on whether teeth can erupt normally through the grafts. The aim of this study was to examine the effect of Bio-Oss grafts on tooth eruption in a canine model. Materials and Methods: In five 10-week-old dogs, the deciduous third mandibular molars in one jaw quadrant of each animal were extracted and the fresh extraction sockets were then filled with Bio-Oss particles (experimental side). No such treatment was performed on the contralateral side (control side). A clinical and radiological evaluation was carried out every other week to evaluate the eruption level of the permanent third mandibular premolars and compare the eruption levels between the two sides. Results: At week 4 after the experiment, the permanent third premolars began to erupt on both sides. At week 12, the crown of the permanent third premolar emerged from the gingiva on both sides. At week 20, the permanent third premolars on both sides had erupted enough to occlude the opposing teeth. No significant differences were found between the control and experimental sides in terms of the eruption speed of the permanent third premolars. Conclusion: These findings demonstrate that the grafting of Bio-Oss particles into alveolar bone defects does not affect tooth eruption.

The World as Seen from Venice (1205-1533) as a Case Study of Scalable Web-Based Automatic Narratives for Interactive Global Histories

  • NANETTI, Andrea;CHEONG, Siew Ann
    • Asian Review of World Histories, v.4 no.1, pp.3-34, 2016
  • This introduction is both a statement of a research problem and an account of the first research results toward its solution. As more historical databases come online and overlap in coverage, we need to discuss the two main issues that have prevented 'big' results from emerging so far. First, historical data are seen by computer scientists as unstructured; that is, historical records cannot easily be decomposed into unambiguous fields, as in population (birth and death records) and taxation data. Second, machine-learning tools developed for structured data cannot be applied as they are to historical research. We propose a complex-network, narrative-driven approach to mining historical databases. In such a time-integrated network, obtained by overlaying records from historical databases, the nodes are actors and the links are actions. In the case study we present (the world as seen from Venice, 1205-1533), the actors are governments, and the actions are limited to war, trade, and treaty to keep the case study tractable. We then identify key periods and key events, and hence key actors and key locations, through a time-resolved examination of the actions. This tool allows historians to deal with historical data issues (e.g., source provenance identification, event validation, and trade-conflict-diplomacy relationships). On a higher level, this automatic extraction of key narratives from a historical database allows historians to formulate hypotheses on the courses of history, and to test these hypotheses against other actions or additional data sets. Our vision is that this narrative-driven analysis of historical data can lead to the development of multiple-scale agent-based models, which can be simulated on a computer to generate ensembles of counterfactual histories that would deepen our understanding of how our actual history developed the way it did.
The generation of such narratives, automatically and in a scalable way, will revolutionize the practice of history as a discipline, because historical knowledge, that is, the treasure of human experiences (i.e., the heritage of the world), will become something that can be inherited by machine-learning algorithms and used in smart cities to highlight and explain present ties and to illustrate potential future scenarios.
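The time-integrated actor-action network described above can be sketched with plain Python: records are (year, actor, actor, action) tuples, and the key actors in a period are ranked by how many actions they participate in within that window. The records below are invented for illustration, not data from the Venice study.

```python
from collections import defaultdict

# Toy dated actions between governments (nodes = actors, links = actions).
records = [
    (1204, "Venice", "Byzantium", "war"),
    (1265, "Venice", "Byzantium", "treaty"),
    (1298, "Venice", "Genoa", "war"),
    (1299, "Venice", "Genoa", "treaty"),
    (1303, "Venice", "Egypt", "trade"),
]

def key_actors(records, start, end):
    """Rank actors by their number of actions within a time window --
    a time-resolved view of activity in the network."""
    counts = defaultdict(int)
    for year, a, b, _action in records:
        if start <= year <= end:
            counts[a] += 1
            counts[b] += 1
    return sorted(counts.items(), key=lambda kv: -kv[1])

print(key_actors(records, 1290, 1310))
```

Sliding such a window across the full record set is one simple way to surface key periods and the actors that dominate them.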

Isolation and Purification of Bioactive Materials Using High-Performance Counter-Current Chromatography (HPCCC) (고속역류크로마토그래피 기술을 이용한 생리활성 물질의 분리 및 정제)

  • Jung, Dong-Su;Shin, Hyun-Jae
    • KSBB Journal, v.25 no.3, pp.205-214, 2010
  • In high-performance counter-current chromatography (HPCCC), crude material is partitioned between two immiscible solvent phases, and purification proceeds through many successive liquid-liquid extraction steps. The stationary phase (SP) is retained by a hydrodynamic force field while the mobile phase (MP) is pumped through the column. Purification occurs because of the different solubilities of the components in the liquid mobile and stationary phases. Liquid stationary phases offer many key benefits, such as high mass and volume injection loadings, total sample recovery, and easy scale-up. Many researchers have shown that predictable scale-up from a simple test run is feasible given knowledge of the stationary-phase retention for the planned process-scale run. In this article we review recent advances in HPCCC research and describe key applications to natural products and synthetics (small and large molecules).
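The partition behaviour described above is captured by the standard counter-current chromatography retention relation V_R = V_M + K·V_S, where K is the partition coefficient of a solute between the stationary and mobile phases. A minimal sketch follows; the column volume and stationary-phase retention ratio are illustrative numbers, not values from the review.

```python
def ccc_retention_volume(v_column, sf, k):
    """Elution volume of a solute in counter-current chromatography.

    V_R = V_M + K * V_S, where Sf is the stationary-phase retention
    ratio, V_S = Sf * V_column and V_M = (1 - Sf) * V_column."""
    v_s = sf * v_column            # stationary-phase volume
    v_m = v_column - v_s           # mobile-phase volume
    return v_m + k * v_s

# Illustrative: 250 mL column with 80 % stationary-phase retention.
for k in (0.5, 1.0, 2.0):
    print(f"K = {k}: V_R = {ccc_retention_volume(250.0, 0.8, k):.0f} mL")
```

The relation also shows why high stationary-phase retention (large Sf) is prized in HPCCC: it spreads the elution volumes of solutes with different K, improving separation.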