• Title/Summary/Keyword: Tree Structured Data


A Study on Processing XML Documents (XML 문서 처리에 관한 연구)

  • Kim, Tae Gwon
    • Journal of KIISE / v.43 no.4 / pp.489-496 / 2016
  • XML can effectively express structured or semi-structured data as well as relational databases. XQuery is a query language for retrieving information from such XML documents. In this paper, an XQuery composer is designed and implemented; it provides an API for XQuery processors, and a suitable processor is registered with it. The composer immediately shows the results of queries run by the processor. Because the composer contains an XQuery parser, it can compose XQuery effectively using a variety of dialog boxes designed around the XQuery grammar. Each dialog box is associated with a clause region, which is a region of the parse tree on which the algebra operates. Path expressions for an XML document are easy to compose because the composer displays the element tree from the DTD graphically. Path expressions are composed automatically by marking elements in the structural hierarchy and partially specifying element predicates.
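
The idea of building a path expression from marked elements plus a partial predicate can be sketched roughly as follows. This is not the paper's composer and it does not use XQuery: it relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`, and the helper `compose_path`, the sample document, and all element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
DOC = """
<library>
  <book year="2016"><title>XML Processing</title><author>Kim</author></book>
  <book year="2001"><title>Structured Documents</title><author>Lee</author></book>
</library>
"""


def compose_path(steps, predicates=None):
    """Join the marked element names into a path, attaching a predicate where given."""
    predicates = predicates or {}
    parts = []
    for name in steps:
        pred = predicates.get(name)
        parts.append(f"{name}[{pred}]" if pred else name)
    return "/".join(parts)


# Elements "marked" in the hierarchy below the root, with a predicate on <book> only.
path = compose_path(["book", "title"], {"book": "@year='2016'"})
root = ET.fromstring(DOC)
titles = [node.text for node in root.findall(path)]

print(path)
print(titles)
```

A full XQuery processor would accept far richer expressions; the sketch only shows how the marked hierarchy and a predicate combine into one path string.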

Agriculture Big Data Analysis System Based on Korean Market Information

  • Chuluunsaikhan, Tserenpurev;Song, Jin-Hyun;Yoo, Kwan-Hee;Rah, Hyung-Chul;Nasridinov, Aziz
    • Journal of Multimedia Information System / v.6 no.4 / pp.217-224 / 2019
  • As the world's population grows, maintaining the food supply is becoming a bigger problem. Now and in the future, big data will play a major role in decision making in the agriculture industry. The challenge is how to obtain valuable information that helps us make future decisions. Big data helps us see history more clearly, uncover hidden value, and make the right decisions for the government and farmers. To help address this challenge, we developed the Agriculture Big Data Analysis System. The system consists of agricultural big data collection, big data analysis, and big data visualization. First, we collected structured data such as price, climate, and yield, and unstructured data such as news, blogs, and TV programs. Using the collected data, we implemented prediction algorithms such as ARIMA, Decision Tree, LDA, and LSTM and present the results as data visualizations.

A Case Study on the Web Publishing of Relational DB Via XML (XML을 이용한 관계DB의 웹출판에 관한 사례)

  • 우원택
    • Proceedings of the Korea Association of Information Systems Conference / 2001.12a / pp.64-82 / 2001
  • HTML revolutionized the way we specify the appearance of data on the Internet. Today, XML (the eXtensible Markup Language) is changing the way we specify the meaning of data. XML lets document authors define their own markup tags and attribute names to assign meaning to the data elements in a document. Further, XML elements can be nested and can include references to indicate data relationships, as in Listing One. Unlike HTML, XML markup tags do not describe how to render the data. Rather, they describe the data itself, allowing software to understand its meaning automatically. For publishing, XSL (the eXtensible Stylesheet Language), a separate language, is in charge of specifying the presentation of XML documents. The purpose of this study is to discover how to transform an organization's relational data into potential e-commerce, business-to-business, and web applications with XML and XSL documents. To this end, a literature survey was first undertaken to understand the basic structure of XML documents. Second, a case implementation was performed to understand how to transform Access 2002 XML files into HTML with XSLT and VB script. The results were largely successful, but limitations remain. One immediate limitation is that XML documents are essentially tree-structured, as dictated by the nesting of elements, whereas relational database tables have a two-dimensional matrix structure. In addition, real-world data is often graph-structured: a single data element may be referenced in multiple ways. Nevertheless, this study is useful for understanding how to convert a relational database into XML documents and publish them using XSL or VB script.

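
The contrast between a two-dimensional table and a nested XML tree, as described in the abstract above, can be illustrated with a minimal sketch. It is not the study's Access 2002 / XSLT implementation: the function `rows_to_xml`, the table name, and the column names are all hypothetical, and only Python's standard `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table_name, columns, rows):
    """Map a flat table to a tree: one <row> element per record, one child per column."""
    root = ET.Element(table_name)
    for row in rows:
        record = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return root


# Hypothetical table contents, for illustration only.
columns = ["id", "name", "price"]
rows = [(1, "apple", 1200), (2, "pear", 1500)]

root = rows_to_xml("products", columns, rows)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Each record becomes a subtree, which works for plain tables. Data in which one element is referenced from several places (the graph-structured case the abstract mentions) would need ID-style references instead of pure nesting.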

Machine Learning Approach to Blood Stasis Pattern Identification Based on Self-reported Symptoms (기계학습을 적용한 자기보고 증상 기반의 어혈 변증 모델 구축)

  • Kim, Hyunho;Yang, Seung-Bum;Kang, Yeonseok;Park, Young-Bae;Kim, Jae-Hyo
    • Korean Journal of Acupuncture / v.33 no.3 / pp.102-113 / 2016
  • Objectives: This study aims to develop and discuss a prediction model for the blood stasis pattern of traditional Korean medicine (TKM) using two machine learning algorithms: multiple logistic regression and decision trees. Methods: First, we reviewed the Korean, Chinese, and Japanese versions of the blood stasis (BS) questionnaire to build an integrated BS questionnaire of patient-reported outcomes. Patient-reported BS symptom data were acquired through human subject research. Next, the decisions of five Korean medicine doctors were acquired, and supervised learning models were developed using multiple logistic regression and decision trees. Results: An integrated BS questionnaire with 24 items was developed. Multiple logistic regression models with accuracies of 0.92 (male) and 0.95 (female), validated by 10-fold cross-validation, were constructed. Decision tree modeling produced a male model with 8 decision nodes and a female model with 6 decision nodes. In both models, the symptoms 'recent physical trauma', 'chest pain', 'numbness', and 'menstrual disorder (female only)' were considered important factors. Conclusions: Because machine learning, especially supervised learning, can reveal and suggest the important or essential factors among the many varied symptoms that make up a pattern identification, it can be a very useful tool for research on TKM diagnostics. With proper patient-reported outcomes or a well-structured database, it can also be applied to pre-screening solutions for healthcare systems at the Mibyoung stage.

XML-based Modeling for Semantic Retrieval of Syslog Data (Syslog 데이터의 의미론적 검색을 위한 XML 기반의 모델링)

  • Lee Seok-Joon;Shin Dong-Cheon;Park Sei-Kwon
    • The KIPS Transactions:PartD / v.13D no.2 s.105 / pp.147-156 / 2006
  • Event logging plays an increasingly important role in system and network management, and syslog is a de facto standard for logging system events. However, because Common Log Format data is semi-structured, most studies on log analysis focus on frequent patterns. The Extensible Markup Language (XML) can provide a good representation scheme for structuring and searching the formatted data found in syslog messages. However, previous XML-formatted schemes and applications for system logging are not suitable for semantic approaches such as ranking-based search or similarity measurement over log data. In this paper, building on ranked keyword search techniques over XML documents, we propose an XML tree structure for syslog data through a new data modeling approach. Finally, we show the suitability of the proposed structure for semantic retrieval.

Design of Algorithm for Efficient Retrieve Pure Structure-Based Query Processing and Retrieve in Structured Document (구조적 문서의 효율적인 구조 질의 처리 및 검색을 위한 알고리즘의 설계)

  • 김현주
    • Journal of the Korea Computer Industry Society / v.2 no.8 / pp.1089-1098 / 2001
  • Structure information contained in a structured document supports various access paths to the document. To use this structure information, a structural index over document structures must be constructed. Indexing both content and structure per document incurs high memory overhead. Therefore, processing pure structure queries, which are based on document structure such as relationships between elements or element order, requires indexing with low memory overhead. This paper proposes the GDIT (Global Document Instance Tree) data structure and a structure indexing scheme that supports low memory overhead for indexing and powerful types of user queries. The structure indexing scheme indexes only the lowest-level elements of a document and is not affected by the number of documents containing the element being retrieved. Based on this index structure, we propose a query processing algorithm for pure structure queries and show that the indexing scheme remains efficient in terms of space. The proposed index structure is based on the GDR concept and uses an indexing technique based on GDIT.


Korean Names

  • Kim, Chin-W.
    • Lingua Humanitatis / v.7 / pp.11-30 / 2005
  • The historical origins of both personal names and place names in Korea are reviewed. It is shown that names of native origin have been largely replaced by Sino-Korean names. Some statistics are given on the basis of the 2000 census data in South Korea. A unique method of forming personal names, which contain a generation marker called hangnyol, is reviewed. This marker enables a person to work out their own position and that of others in the family tree, across as many as ten generations, without consulting the book of genealogy. While this practice had a role to play in a vertically structured society where seniority is important, it is less common as society becomes more egalitarian, so that native names, which cannot be written in Chinese characters, are on the rise. In this global age, a person is not just a member of a family or clan but also a member of the international community. The author proposes several things that should be considered in naming to fit the modern global age: euphony of names, ambiguity, possible bad connotations when Romanized, unintended homophones with comic meanings, etc.


THRESHOLD MODELING FOR BIFURCATING AUTOREGRESSION AND LARGE SAMPLE ESTIMATION

  • Hwang, S.Y.;Lee, Sung-Duck
    • Journal of the Korean Statistical Society / v.35 no.4 / pp.409-417 / 2006
  • This article is concerned with threshold modeling of the bifurcating autoregressive (BAR) model originally suggested by Cowan and Staudte (1986) for tree-structured data in cell lineage studies, where each individual $(X_t)$ gives rise to two offspring $(X_{2t},\;X_{2t+1})$ in the next generation. The triplet $(X_t,\;X_{2t},\;X_{2t+1})$ represents the mother-daughter relationship. In this paper we propose a threshold model that incorporates the difference in the 'fertility' of the mother for the first and second offspring, thereby extending BAR to threshold-BAR (TBAR, for short). We derive a sufficient condition for stationarity of the proposed TBAR model. Various inferential methods, such as least squares (LS), maximum likelihood (ML), and quasi-likelihood (QL), are also discussed, and the relevant limiting distributions are obtained.
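
The binary-tree indexing in the abstract above (individual $t$ has daughters $2t$ and $2t+1$, so the mother of $t$ is $\lfloor t/2 \rfloor$) can be made concrete with a small simulation. The sketch assumes a simplified first-order form, $X_{2t} = \theta X_t + \varepsilon_{2t}$ and $X_{2t+1} = \theta X_t + \varepsilon_{2t+1}$, with independent Gaussian errors and no intercept. The abstract states neither the error structure nor the threshold (TBAR) specification, so neither is reproduced here; the function names and parameter values are hypothetical.

```python
import random


def simulate_bar(n_generations, theta, sigma=1.0, x1=0.0, seed=0):
    """Simulate a simplified BAR(1) process on a complete binary tree.

    Individuals are numbered 1..n with n = 2**n_generations - 1; index 0 is unused.
    Assumption: sister errors are independent, a simplification of the usual model.
    """
    rng = random.Random(seed)
    n = 2 ** n_generations - 1
    x = [0.0] * (n + 1)
    x[1] = x1
    for t in range(2, n + 1):
        mother = t // 2  # both 2t and 2t+1 map back to t
        x[t] = theta * x[mother] + rng.gauss(0.0, sigma)
    return x


def triplets(x):
    """Return the (mother, first daughter, second daughter) triplets in the tree."""
    n = len(x) - 1
    return [(x[t], x[2 * t], x[2 * t + 1]) for t in range(1, n // 2 + 1)]


x = simulate_bar(n_generations=3, theta=0.5)
for mother, first, second in triplets(x):
    print(f"{mother:8.3f} -> {first:8.3f}, {second:8.3f}")
```

Storing the tree in a flat list keeps the mother-daughter lookup to integer arithmetic, which is the same convention the subscripts $2t$ and $2t+1$ encode.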

Modality-Based Sentence-Final Intonation Prediction for Korean Conversational-Style Text-to-Speech Systems

  • Oh, Seung-Shin;Kim, Sang-Hun
    • ETRI Journal / v.28 no.6 / pp.807-810 / 2006
  • This letter presents a prediction model for sentence-final intonation in Korean conversational-style text-to-speech systems, in which we introduce the linguistic feature of 'modality' as a new parameter. Based on their function and meaning, we classify tonal forms in speech data into tone types meaningful for speech synthesis and use the result of this classification to build our prediction model with a tree-structured classification algorithm. To show that modality is more effective for the prediction model than features such as sentence type or speech act, an experiment is performed on a test set of 970 utterances with a training set of 3,883 utterances. The results show that modality contributes more to the determination of sentence-final intonation than sentence type or speech act, and that prediction accuracy improves by up to 25% when the modality feature is introduced.


Development of Semantic-Based XML Mining for Intelligent Knowledge Services (지능형 지식서비스를 위한 의미기반 XML 마이닝 시스템 연구)

  • Paik, Juryon;Kim, Jinyeong
    • Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.59-62 / 2018
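  • Research on XML has grown steadily over the past five to six years, but most of it is built on statistical models of the elements that make up XML. In such work, the semantics carried by the text, sentences, and sentence constituents within the tree structure intrinsic to XML are not explicitly analyzed, represented, and used. Instead, occurrences of data are merely counted statistically to produce the result of a user's query, that is, the corresponding information and knowledge. To suit an environment that provides intelligent knowledge services, information extraction should analyze the constituents of text and sentences and represent document content in a form that carries richer meaning than a simple set of words, so that more refined knowledge and information can be extracted. This study investigates methods that can grasp even the meaning of user requests and extract accurate and diverse knowledge from the flood of XML data. Its ultimate goal is to provide diverse user-centered services by advancing efficient mining techniques that can extract meaning from tree-structured data rather than record-structured data.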
