• Title/Summary/Keyword: Document Format

Search Result 254, Processing Time 0.033 seconds

A School-tailored High School Integrated Science Q&A Chatbot with Sentence-BERT: Development and One-Year Usage Analysis (인공지능 문장 분류 모델 Sentence-BERT 기반 학교 맞춤형 고등학교 통합과학 질문-답변 챗봇 -개발 및 1년간 사용 분석-)

  • Gyeongmo Min;Junehee Yoo
    • Journal of The Korean Association For Science Education / v.44 no.3 / pp.231-248 / 2024
  • This study developed a chatbot for first-year high school students, employing open-source software and the Korean Sentence-BERT model for AI-powered document classification. The chatbot uses the Sentence-BERT model to find the six Q&A pairs most similar to a student's query and presents them in a carousel format. The initial dataset, built from online resources, was refined and expanded based on student feedback and usability over the operational period. By the end of the 2023 academic year, the chatbot integrated a total of 30,819 datasets and recorded 3,457 student interactions. Analysis showed that students tended to use the chatbot when prompted by teachers during classes and, primarily, during self-study sessions after school, with an average of 2.1 to 2.2 inquiries per session, mostly via mobile phones. Text mining showed that student input terms covered not only science-related queries but also aspects of school life such as assessment scope. Topic modeling using BERTopic, based on Sentence-BERT, categorized 88% of student questions into 35 topics, shedding light on common student interests. A year-end survey confirmed the efficacy of the carousel format and the chatbot's role in addressing curiosities beyond integrated science learning objectives. This study underscores the importance of developing chatbots tailored for student use in public education and highlights their educational potential through long-term usage analysis.
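The retrieval step described in this abstract (finding the six stored Q&A pairs most similar to a query) can be sketched as a top-k cosine-similarity search. This is a minimal illustration, not the authors' implementation: the function names `cosine` and `top_k`, the toy three-dimensional vectors, and the sample Q&A pairs are all assumptions. In a real system the vectors would come from a sentence encoder such as Sentence-BERT.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query_vec, qa_pairs, k=6):
    """Return up to k (score, question, answer) tuples, most similar first."""
    scored = [(cosine(query_vec, vec), question, answer)
              for question, answer, vec in qa_pairs]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

# Hypothetical toy data: (question, answer, embedding).
# Real embeddings would be produced by a sentence encoder, not written by hand.
QA_PAIRS = [
    ("What is a mole?", "A unit for amount of substance.", [0.9, 0.1, 0.0]),
    ("What is on the midterm?", "Units 1 to 3.", [0.0, 0.2, 0.9]),
    ("What is an ion?", "A charged atom or molecule.", [0.8, 0.3, 0.1]),
]

if __name__ == "__main__":
    for score, question, _ in top_k([1.0, 0.2, 0.0], QA_PAIRS, k=2):
        print(f"{score:.3f}  {question}")
```

With `k=6`, the returned list would map directly onto the six carousel cards mentioned in the abstract.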

eXtensible Rule Markup Language (XRML): Design Principles and Application (확장형 규칙 표식 언어(eXtensible Rule Markup Language) : 설계 원리 및 응용)

  • 이재규;손미애;강주영
    • Journal of Intelligence and Information Systems / v.8 no.1 / pp.141-157 / 2002
  • eXtensible Markup Language (XML) is a new markup language for data exchange on the Internet. In this paper, we propose the eXtensible Rule Markup Language (XRML), which is an extension of XML. The implicit rules embedded in Web pages should be identifiable, interchangeable with a structured rule format, and accessible by various applications; XRML makes this possible. In this light, Web-based Knowledge Management Systems (KMS) can be integrated with rule-based expert systems. To this end, we propose six design criteria: Expressional Completeness, Relevance Linkability, Polymorphous Consistency, Applicative Universality, Knowledge Integrability, and Interoperability. Furthermore, we propose three components, RIML (Rule Identification Markup Language), RSML (Rule Structure Markup Language), and RTML (Rule Triggering Markup Language), together with the Document Type Definition (DTD). We designed XRML version 0.5 as described above and developed a prototype named Form/XRML, an automated form processing system for disbursement of research funds at the Korea Advanced Institute of Science and Technology (KAIST). Since XRML allows both humans and software agents to use the rules, it has large application potential. We expect that XRML can contribute to the progress of Semantic Web platforms, making knowledge management and e-commerce more intelligent. Since many emerging research groups and vendors are investigating this issue, it will not take long to see commercial XRML products. Mature XRML applications may change the way information and knowledge systems are designed in the near future.


Preparation of Soil Input Files to a Crop Model Using the Korean Soil Information System (흙토람 데이터베이스를 활용한 작물 모델의 토양입력자료 생성)

  • Yoo, Byoung Hyun;Kim, Kwang Soo
    • Korean Journal of Agricultural and Forest Meteorology / v.19 no.3 / pp.174-179 / 2017
  • Soil parameters are required inputs to crop models, which estimate crop yield under given environmental conditions. The Korean Soil Information System (KSIS), which provides detailed soil profile records of 390 soil series in HTML (HyperText Markup Language) format, would be useful for preparing soil input files. The Korean Soil Information System Processing Tool (KSISPT) was developed to aid generation of soil input data based on the KSIS database. The tool was implemented in Java and consists of a set of modules for parsing the HTML documents of the KSIS, storing the data required for preparing soil input files, calculating additional soil parameters, and writing soil input files to a local disk. Using the automated soil data preparation tool, about 940 soil input files were created for each of the DSSAT and ORYZA 2000 models. In combination with a soil series distribution map at 30 m resolution, crop yield could be projected and spatially analyzed under climate change, which would help the development of adaptation strategies.

A Study on the Food Culture in the Early Joseon Dynasty through Gyemiseo (癸未書) (「계미서(癸未書)」를 통해 본 조선시대 초기의 음식문화에 대한 고찰)

  • Han, Bok-Ryo;Kim, Gwi-Young
    • Journal of the Korean Society of Food Culture / v.33 no.4 / pp.307-321 / 2018
  • This study introduces the foods recorded in Gyemiseo and describes the substantive characteristics of traditional Korean food in the early Joseon Dynasty. Gyemiseo is a cookbook manuscript written in Chinese that was rebound into book format at the end of the Joseon Dynasty in 1911, some 358 years after it was originally written in the 163rd year of the Joseon Dynasty (1554). While the majority of cookbooks begin with recipes for various types of wines and liquors, followed by those for fermented sauces, fermented vegetables (such as kimchi), vinegars, and storage methods, Gyemiseo begins with recipes for fermented sauces, followed by recipes for various kimchis, vinegars, main meals, side dishes, rice cakes, and confectioneries, with recipes for wines and liquors introduced last. Therefore, it can be assumed that the methods of brewing wines and liquors were added at the time of bookbinding. There are a total of 128 recipes recorded in Gyemiseo, including 13 for fermented sauces, 14 for kimchi, 11 for main meals, 26 for side dishes, three storage methods, four for rice cakes and confectioneries, and 44 for wines and liquors. The contents of Gyemiseo are expected to provide a foundation for research on how the cooking methods of traditional Korean cuisine changed during the Joseon Dynasty.

Adaptive Path Index for Efficient XML Query Processing (효율적인 XML 질의 처리를 위한 적응형 경로 인덱스)

  • 민준기;심규석;정진완
    • Journal of KIISE:Databases / v.31 no.1 / pp.61-71 / 2004
  • XML can describe a wide range of data, from regular to irregular and from flat to deeply nested. Thus, XML is rapidly emerging as the de facto standard Web document format, since it supports efficient data exchange and integration. To retrieve data represented in XML, several XML query languages have been proposed. XML query languages such as XPath and XQuery use path expressions to traverse irregularly structured data composed of XML elements. To evaluate path expressions, various path indexes have been proposed. However, traditional path indexes are constructed using only the XML data structure. Therefore, in this paper, we propose an adaptive path index that uses the query workload as well as the XML data structure. To improve query performance, the proposed adaptive path index manages the frequently used paths and the structural summary of the XML data using a hash tree and a graph structure. Experimental results show that the adaptive path index typically improves query performance by 2 to 69 times compared with existing indexes.
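The idea of combining a structural summary with the query workload can be illustrated with a small sketch. This is not the index proposed in the paper: it swaps the hash tree and graph structure for plain Python dictionaries, and the class name `AdaptivePathIndex`, the `hot_threshold` parameter, the suffix-style queries, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

def build_path_index(root):
    """Map each root-to-element label path (e.g. 'lib/book/title') to its elements."""
    index = defaultdict(list)

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        index[path].append(node)
        for child in node:
            walk(child, path)

    walk(root, "")
    return index

class AdaptivePathIndex:
    """Structural summary plus a cache of frequently queried path suffixes."""

    def __init__(self, root, hot_threshold=2):
        self.structure = build_path_index(root)  # built from the data alone
        self.workload = Counter()                # how often each query was asked
        self.hot = {}                            # cached results for frequent queries
        self.hot_threshold = hot_threshold

    def query(self, suffix):
        """Return elements whose label path ends with `suffix` (like //a/b)."""
        self.workload[suffix] += 1
        if suffix in self.hot:
            return self.hot[suffix]
        # Cold query: scan every distinct path in the structural summary.
        result = [node
                  for path, nodes in self.structure.items()
                  if path == suffix or path.endswith("/" + suffix)
                  for node in nodes]
        if self.workload[suffix] >= self.hot_threshold:
            self.hot[suffix] = result
        return result

# Hypothetical sample document.
SAMPLE = "<lib><book><title>A</title></book><journal><title>B</title></journal></lib>"

if __name__ == "__main__":
    idx = AdaptivePathIndex(ET.fromstring(SAMPLE))
    print([e.text for e in idx.query("title")])
```

A suffix that reaches the threshold is answered from the cache on later calls, which is the workload-aware part; the sketch does not handle updates to the underlying document.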

An Approach to Structuralizing Business Information for Internet Shopping Malls (인터넷쇼핑몰의 사업자신원정보 구조화 방안)

  • 장용식
    • Journal of Intelligence and Information Systems / v.10 no.1 / pp.27-45 / 2004
  • As online shopping increases, the "Consumer Protection Law in Electronic Commerce" obliges each internet shopping mall to provide its business information. Although most internet shopping malls provide their business information in a semi-structured format at the bottom of their homepages, the attributes and forms of expression of this information differ from one another. This makes it difficult for consumers to identify the business information and lowers public confidence. Hence, this study proposes three approaches to structuring business information so that it is expressed correctly: an HTML-based structure, an XML-based structure, and an XML data island-based structure. The experimental results showed that the business information extraction time with the XML data island-based structure is independent of the size of the web document, while the time with the HTML-based structure depends on the size. By comparing the business information extraction times, we show that the XML data island-based structure is more efficient and effective than the HTML-based structure.


Analysis of Korean Traditional Records Information System (국내 전통기록물 정보시스템 현황 조사)

  • Yang, Kiduk;Shin, Daye
    • Journal of Korean Library and Information Science Society / v.47 no.4 / pp.191-217 / 2016
  • Traditional records information systems have greatly improved accessibility for users by providing internet access to digitized forms of traditional records, access to which was previously restricted for the purpose of preservation. This study investigated the accessibility and serviceability of Korean traditional records by examining current traditional records information systems in Korea. After compiling a list of traditional records information systems, grouped by operating agency, we analyzed them by coverage period, document type, and content format, and examined their search options and browse categories. We also categorized and examined the information systems by user type. The results showed that, out of 105 traditional records information systems serving various content types and services, only a fraction (16.1%) provide comprehensive information that includes bibliographic information, annotated description, content image, content text, and translated text, and fewer than half (49.5%) provide a detailed search. These findings point to less than optimal conditions for access to traditional records and suggest a strong need for improved traditional records information systems in Korea.

A Research of Anomaly Detection Method in MS Office Document (MS 오피스 문서 파일 내 비정상 요소 탐지 기법 연구)

  • Cho, Sung Hye;Lee, Sang Jin
    • KIPS Transactions on Computer and Communication Systems / v.6 no.2 / pp.87-94 / 2017
  • Microsoft Office is an office suite of applications developed by Microsoft. Recently, users with malicious intent have been crafting Office files as containers for malware because MS Office is the most commonly used word processing program. To attack target systems, many malicious Office files use a variety of techniques, such as macro functions and shellcode hidden inside unused areas. Two techniques are commonly used to detect this kind of malware: signature-based detection and sandboxing. However, these techniques have limits because of the increasing complexity of malware. Therefore, this paper proposes methods to detect malicious MS Office files from a computer forensics perspective. To this end, we examined macros and potentially problematic areas through structural analysis of MS Office files.

Improving Accuracy and Completeness in the Collaborative Staging System for Stomach Cancer in South Korea

  • Lim, Hyun-Sook;Won, Young-Joo;Boo, Yoo-Kyung
    • Asian Pacific Journal of Cancer Prevention / v.15 no.21 / pp.9529-9534 / 2014
  • Background: Cancer staging enables planning for the best treatments, evaluation of prognosis, and predictions for survival. The Collaborative Stage (CS) system makes it possible to significantly reduce the proportion of patients labeled at an "unknown" stage as well as discrepancies among different staging systems. This study aims to analyze the factors that influence the accuracy and validity of CS data. Materials and Methods: Data were randomly selected (233 cases) from stomach cancer cases enrolled for CS survey at the Korea Central Cancer Registry. Two questionnaires were used to assess CS values for each case and to review the cancer registration environment for each hospital. Data were analyzed in terms of the relationships between the time spent for acquisition and registration of CS information, environments relating to cancer registration in the hospitals, and document sources of CS information for each item. Results: The time for extracting and registering data was found to be shorter when the hospitals had prior experience gained from participating in a CS pilot study and when they were equipped with full-time cancer registrars. Evaluation of the CS information according to medical record sources found that the percentage of items missing for Site Specific Factor (SSF) was 30% higher than for other CS variables. Errors in CS coding were found in variables such as "CS Extension," "CS Lymph Nodes," "CS Metastasis at Diagnosis," and "SSF25 Involvement of Cardia and Distance from Esophagogastric Junction (EGJ)." Conclusions: To build CS system data that are reliable for cancer registration and clinical research, the following components are required: 1) training programs for medical records administrators; 2) supporting materials to promote active participation; and 3) format development to improve registration validity.

An Implementation of Mathematics Editor Using SGML Notation (SGML 표기법을 이용하는 수식 편집기의 설계 및 구현)

  • Kim, Tae-Hoon;Hyun, Deuk-Chang;Lee, Soo-Youn
    • The Transactions of the Korea Information Processing Society / v.3 no.5 / pp.1082-1092 / 1996
  • The design of distributed systems is difficult because the execution patterns of distributed systems are typically more complex than those of non-distributed systems. Thus, research toward the development of design methods for distributed systems is much needed. As object-oriented systems and distributed systems share similar properties, combining the two is natural. In this work, a design method for distributed systems is introduced. The goal of the method is to assist the process of deriving a formal object-oriented specification from graphical specification inputs such as data flow diagrams, state transition diagrams, and Petri nets. It addresses the extraction of objects, operations, and relationships from the problem domain, with emphasis on specifying the characteristics of distributed systems. This object identification method is supported by a knowledge base that provides for automated analysis and reasoning about objects and their relationships. The final object model is represented in a format that provides a formal mechanism for representing the object information.
