• Title/Summary/Keyword: XML Mining

Search Results: 51

Quick Decision Making Using a Visual Dynamic Mining Tool: Spotfire

  • Kim, Seong-Ki
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2008.04a / pp.89-91 / 2008
  • In today's flood of data, companies and research institutions must make decisions quickly. Accurately identifying trends in the many kinds of data available for the problems at hand, finding their root causes, and acting on them promptly is one of the most important success factors for any company. From initial idea generation and R&D through production, sales, and service, every team member needs to make important decisions very quickly and with a high degree of accuracy; in today's competitive market, a company's success depends on its ability to decide faster than its competitors. To this end, Spotfire provides a variety of features that let users analyze data and reach decisions easily and quickly. Its Information Library lets users import data from diverse data sources without a specialist language such as SQL, and by presenting the collections of numbers stored in databases graphically through a variety of charts and plots, it helps users grasp the data intuitively and respond promptly. The results can also be saved as MS PowerPoint files, Excel sheets, XML, and other formats for reuse elsewhere.

  • PDF

A Recommendation System of Exponentially Weighted Collaborative Filtering for Products in Electronic Commerce

  • Lee, Gyeong-Hui;Han, Jeong-Hye;Im, Chun-Seong
    • The KIPS Transactions: Part B / v.8B no.6 / pp.625-632 / 2001
  • Electronic stores have realized that they need to understand their customers and respond quickly to their wants and needs. To succeed in an increasingly competitive Internet marketplace, recommender systems are adopting data mining techniques. One of the most successful recommender technologies is the collaborative filtering (CF) algorithm, which recommends products to a target customer based on information about other customers, employing statistical techniques to find a set of customers known as neighbors. The approach, however, is not well suited to seasonal products that are sensitive to time or season, such as refrigerators or seasonal clothing. In this paper, we propose an adjusted item-based recommendation algorithm called exponentially weighted collaborative filtering recommendation (EWCFR), which computes item-item similarities for seasonal products (a minimal sketch of this weighting follows this entry). Finally, since collaborative filtering systems must quickly produce high-quality recommendations for very large-scale problems, we present a recommendation system with competitive computing time on a main-memory database (MMDB) using XML.

  • PDF
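
The abstract names the core mechanism: item-item similarity in which each rating's contribution decays exponentially with its age. Below is a minimal Python sketch of that idea only, not the paper's EWCFR formula; the decay rate DECAY, the sample ratings, and the choice of cosine similarity are all assumptions made for illustration.

    import math

    # Invented sample data: (user, item, rating, age_in_days) tuples.
    ratings = [
        ("u1", "fan", 5, 10), ("u1", "heater", 4, 200),
        ("u2", "fan", 4, 15), ("u2", "heater", 5, 190),
        ("u3", "fan", 5, 5),
    ]

    DECAY = 0.01  # assumed decay rate: recent ratings weigh more

    def weighted_vectors(ratings):
        """Build per-item {user: exponentially weighted rating} vectors."""
        vecs = {}
        for user, item, r, age in ratings:
            vecs.setdefault(item, {})[user] = r * math.exp(-DECAY * age)
        return vecs

    def cosine(a, b):
        """Cosine similarity between two sparse user-to-weight vectors."""
        num = sum(a[u] * b[u] for u in set(a) & set(b))
        den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return num / den if den else 0.0

    vecs = weighted_vectors(ratings)
    print(f"sim(fan, heater) = {cosine(vecs['fan'], vecs['heater']):.3f}")

Old seasonal ratings thus contribute less to the similarity, which is the property the abstract exploits for seasonal products.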

An Information Technology Architecture for Event CRM in Wired and Wireless Internet Environments

  • Park, Ju-Seok;Kim, Jae-Gyeong;Lee, U-Gi;Jo, Hyeong-Jin;Byeon, Seong-Uk
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.05a / pp.1819-1824 / 2006
  • CRM (Customer Relationship Management) systems, which maximize corporate profit through sustained customer relationships, have in Korea been built mainly as analytical CRM, which analyzes customer information to formulate marketing strategies. With the advance of Internet and mobile technology, customer contact now takes place over many channels, yet per-channel customer information is not managed systematically and the means to respond to customers immediately are lacking, so customers are not satisfied at the moment they want to be. We therefore aim at real-time CRM that considers operational and collaborative CRM rather than offline-centered analytical CRM, and we derive a CRM model that organizes customer touchpoints spanning diverse organizations and channels so that customers can be satisfied at the moment they want. Based on this model, this paper proposes an information technology architecture from a new perspective. The architecture presents an XML-based data interface for heterogeneous wired and wireless environments and includes a business rule system and a data mining system (a hypothetical sketch of such an interface follows this entry).

  • PDF
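
As a purely hypothetical illustration of the XML-based data interface the abstract describes, the Python sketch below parses one invented customer-event message and applies one invented business rule; the element names, the channel value, and the rule are not taken from the paper.

    import xml.etree.ElementTree as ET

    # Invented example of a channel-neutral customer event.
    event_xml = """
    <customerEvent>
      <channel>mobile</channel>
      <customerId>C-1001</customerId>
      <type>cart_abandoned</type>
      <value>120000</value>
    </customerEvent>
    """

    def handle(xml_text: str) -> str:
        ev = ET.fromstring(xml_text)
        etype = ev.findtext("type")
        value = int(ev.findtext("value") or 0)
        # Invented business rule: high-value abandoned carts trigger an immediate offer.
        if etype == "cart_abandoned" and value >= 100000:
            return f"send_coupon({ev.findtext('customerId')}) via {ev.findtext('channel')}"
        return "log_only"

    print(handle(event_xml))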

A Study of Main Contents Extraction from Web News Pages based on XPath Analysis

  • Sun, Bok-Keun
    • Journal of the Korea Society of Computer and Information / v.20 no.7 / pp.1-7 / 2015
  • Data on the Internet can be used in many fields, for example as source data for information retrieval (IR), data mining, and knowledge-based information services, but it also contains a great deal of unnecessary information. Removing that unnecessary data is a problem that must be solved before knowledge-based information services built on web page data can be studied; in this paper we solve it by implementing XTractor (XPath Extractor). Since XPath is used to navigate the elements and attribute data in an XML document, the XPath analysis is carried out through XTractor, which extracts the main text by parsing the HTML, grouping XPaths, and detecting the XPath that contains the main data (an illustrative sketch follows this entry). As a result, the recognition and precision rates were 97.9% and 93.9%, except for a few cases in a large body of experimental data, confirming that the main text of news pages can be extracted properly.
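
A minimal sketch of the XPath-grouping idea in the spirit of XTractor, assuming the lxml library is available; the heuristic used here (pick the parent XPath whose children hold the most text) and the sample HTML are assumptions for illustration, not the paper's exact algorithm.

    from collections import defaultdict
    import lxml.html

    def extract_main_text(html: str) -> str:
        root = lxml.html.fromstring(html)
        tree = root.getroottree()
        text_by_xpath = defaultdict(list)
        # Group text-bearing elements by the XPath of their parent.
        for el in root.iter('p', 'div', 'span'):
            text = (el.text or '').strip()
            if text:
                text_by_xpath[tree.getpath(el.getparent())].append(text)
        if not text_by_xpath:
            return ''
        # The XPath group holding the most text is taken as the main content.
        main_path = max(text_by_xpath, key=lambda p: sum(len(t) for t in text_by_xpath[p]))
        return '\n'.join(text_by_xpath[main_path])

    html = ("<html><body><div id='ad'>ad</div>"
            "<div id='news'><p>First paragraph.</p><p>Second paragraph.</p></div>"
            "</body></html>")
    print(extract_main_text(html))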

A Methodology for Ontology-based Knowledge Acquisition and Structuring in an Industry-Academic-Government Project "Go Japan!"

  • Mima, Hideki;Yoon, Tae-Sung
    • Proceedings of the CALSEC Conference / 2003.09a / pp.197-203 / 2003
  • The purpose of the study is to develop an integrated knowledge structuring system for the domain of engineering, in which ontology-based literature mining, knowledge acquisition, knowledge integration, and knowledge retrieval are combined using XML-based tag information and ontology management. The system supports combining different types of databases (papers and patents, technologies and innovations) and retrieving different types of knowledge simultaneously. The main objective of the system is to facilitate knowledge acquisition and knowledge retrieval from documents through an ontology-based dynamic similarity calculation and a visualization of automatically structured knowledge. Through experiments conducted on 100,000 words of economic documents reported in the "Go! Japan" project for analyzing the Japanese industrial situation, and on 100,000 words of molecular biology papers, we show the system is practical enough to accelerate knowledge acquisition and knowledge discovery from the information sea.

  • PDF

Semantic-Based Label Lists Represented Information Extraction from Tree Data

  • Paik, Juryon
    • Proceedings of the Korean Society of Computer Information Conference / 2020.07a / pp.27-28 / 2020
  • The flexible tree structure, which enables information transfer and exchange between heterogeneous data, is what makes XML and JSON central to storing, transferring, and exchanging large volumes of data on the Internet and in IoT environments. While easy to use, tree-structured data makes uncovering hidden, valuable information from large volumes far harder than data with a simple flat structure does, raising complex and difficult problems; this is due to the hierarchical structure of trees. This paper presents a method that transforms large volumes of hierarchical tree data into a simpler list structure and then extracts the most frequently occurring useful information from that structure (a minimal sketch follows this entry).

  • PDF
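
A minimal sketch of the flattening idea using only Python's standard library: each XML tree is reduced to its root-to-leaf label paths, and the most frequent paths across documents stand in for the "useful information". The sample documents are invented, and the paper's actual list representation and mining method may differ.

    from collections import Counter
    import xml.etree.ElementTree as ET

    def label_lists(xml_text: str):
        """Flatten a tree into root-to-leaf label paths."""
        def walk(node, path):
            path = path + [node.tag]
            children = list(node)
            if not children:
                yield path
            for child in children:
                yield from walk(child, path)
        return list(walk(ET.fromstring(xml_text), []))

    docs = [
        "<order><item><name/><price/></item></order>",
        "<order><item><name/></item><buyer/></order>",
    ]
    paths = [tuple(p) for doc in docs for p in label_lists(doc)]
    for path, count in Counter(paths).most_common(3):
        print('/'.join(path), count)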

The Conference Management System Architecture for Ontological Knowledge

  • Hong, Hyun-Woo;Koh, Gwang-san;Kim, Chang-Soo;Jeong, Jae-Gil;Jung, Hoe-kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.1115-1118 / 2005
  • With the development of Internet technology, online conference systems have been produced, and they are now being developed to use pattern recognition and voice recognition systems. Compared with offline conferences, online conferences excel in freedom from distance limitations, but online meetings also have unavoidable weak points: as with offline conferences, while a conference is in progress the structuring and consistency of its content are weak, so conference members cannot grasp the flow of the conference. In this paper we therefore introduce the ontology concept and design a new architecture that uses ontology mining techniques to make conference content and conference knowledge ontological. Then, in order to examine the new architecture, we design and implement a new knowledge-based conference management system.

  • PDF

A comparison of three design tree based search algorithms for the detection of engineering parts constructed with CATIA V5 in large databases

  • Roj, Robin
    • Journal of Computational Design and Engineering / v.1 no.3 / pp.161-172 / 2014
  • This paper presents three different search engines for the detection of CAD parts in large databases. The contained information is analyzed by exporting the data stored in the structure trees of the CAD models. A preparation program generates one XML file for every model which, in addition to the data of the structure tree, also holds certain physical properties of each part. The first search engine specializes in the discovery of standard parts, like screws or washers. The second program takes user input as search parameters and can therefore perform personalized queries. The third compares one given reference part with all parts in the database and locates files that are identical or similar to the reference part (an illustrative similarity sketch follows this entry). All approaches run automatically and have the analysis of the structure tree in common. Files constructed with CATIA V5 and search engines written in Python were used for the implementation. The paper also includes a short comparison of the advantages and disadvantages of each program, as well as a performance test.
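
As an illustration of the third engine's reference-part comparison, here is a minimal Python sketch that scores two exported structure-tree XML files by the Jaccard similarity of their tag paths; the XML layout and the similarity measure are assumptions for the example, not the paper's implementation.

    import xml.etree.ElementTree as ET

    def tag_paths(xml_text: str) -> set:
        """Collect every root-to-node tag path in a structure tree."""
        paths = set()
        def walk(node, prefix):
            path = prefix + '/' + node.tag
            paths.add(path)
            for child in node:
                walk(child, path)
        walk(ET.fromstring(xml_text), '')
        return paths

    def similarity(a: str, b: str) -> float:
        pa, pb = tag_paths(a), tag_paths(b)
        return len(pa & pb) / len(pa | pb)

    reference = "<part><pad/><hole/><thread/></part>"
    candidate = "<part><pad/><hole/></part>"
    print(f"similarity = {similarity(reference, candidate):.2f}")  # 0.75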

Facilitating Web Service Taxonomy Generation: An Artificial Neural Network based Framework, A Prototype System, and Evaluation

  • Hwang, You-Sub
    • Journal of Intelligence and Information Systems / v.16 no.2 / pp.33-54 / 2010
  • The World Wide Web is transitioning from being a mere collection of documents that contain useful information toward providing a collection of services that perform useful tasks. The emerging Web service technology has been envisioned as the next technological wave and is expected to play an important role in this recent transformation of the Web. By providing interoperable interface standards for application-to-application communication, Web services can be combined with component-based software development to promote application interaction both within and across enterprises. To make Web services for service-oriented computing operational, it is important that Web service repositories not only be well structured but also provide efficient tools for developers to find reusable Web service components that meet their needs. As the potential of Web services for service-oriented computing is being widely recognized, the demand for effective Web service discovery mechanisms is growing concomitantly. A number of public Web service repositories have been proposed, but Web service taxonomy generation has not been satisfactorily addressed. Unfortunately, most existing Web service taxonomies are either too rudimentary to be useful or too hard to maintain. In this paper, we propose a Web service taxonomy generation framework that combines artificial neural network based clustering techniques with descriptive label generation and leverages the semantics of the XML-based service specifications in WSDL documents. We believe this is one of the first attempts at applying data mining techniques to the Web service discovery domain. We have developed a prototype system based on the proposed framework using an unsupervised artificial neural network and have empirically evaluated the proposed approach and tool using real Web service descriptions drawn from operational Web service repositories. We report some preliminary results demonstrating the efficacy of the proposed approach (an illustrative clustering sketch follows this entry).
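
The abstract does not say which unsupervised artificial neural network is used, so the sketch below stands in with a deliberately tiny one-dimensional self-organizing map over invented bag-of-words vectors from service descriptions; the vocabulary, services, map size, and training schedule are all assumptions, and the neighborhood update of a full SOM is omitted for brevity.

    import numpy as np

    # Invented bag-of-words vectors over the vocabulary below,
    # standing in for term vectors extracted from WSDL documents.
    vocab = ["weather", "forecast", "stock", "quote", "map", "route"]
    services = {
        "WeatherService": [1, 1, 0, 0, 0, 0],
        "ForecastAPI":    [1, 1, 0, 0, 0, 0],
        "StockQuotes":    [0, 0, 1, 1, 0, 0],
        "RoutePlanner":   [0, 0, 0, 0, 1, 1],
    }
    X = np.array(list(services.values()), dtype=float)

    rng = np.random.default_rng(0)
    nodes = rng.random((3, X.shape[1]))  # 3 map nodes = 3 candidate clusters

    for epoch in range(50):
        lr = 0.5 * (1 - epoch / 50)  # decaying learning rate
        for x in X:
            best = np.argmin(((nodes - x) ** 2).sum(axis=1))  # best-matching unit
            nodes[best] += lr * (x - nodes[best])  # pull the winner toward x

    for name, vec in services.items():
        cluster = np.argmin(((nodes - np.array(vec, float)) ** 2).sum(axis=1))
        print(name, "-> cluster", cluster)

Services with similar term vectors end up on the same map node, which is the clustering step the framework pairs with descriptive label generation.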

Development of Intelligent Job Classification System based on Job Posting on Job Sites

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.123-139 / 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the SQF (Sectoral Qualifications Framework) proposed for the SW field. A new job classification system that SW companies, SW job seekers, and job sites can all understand is therefore needed. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing the SQF against the job posting information of major job sites and the NCS (National Competency Standards). For this purpose, association analysis between the occupations of major job sites is conducted, and association rules between the SQF and those occupations are derived. Using these rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF (a minimal Apriori sketch follows this entry). First, major job sites are selected to obtain information on the job classification systems of the SW market. We then identify ways to collect job information from each site and collect the data through open APIs. Focusing on the relationships between the data, only job postings listed on multiple job sites at the same time are kept, and the remaining job information is deleted. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. After completing the mapping between these market classifications, we discuss the result with experts, further map it to the SQF, and finally propose a new job classification system. As a result, more than 30,000 job postings were collected in XML format through the open APIs of WORKNET, JOBKOREA, and Saramin, the main job sites in Korea. After filtering down to about 900 job postings posted on multiple job sites simultaneously, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining method. Based on these 800 rules, the job classification systems of WORKNET, JOBKOREA, and Saramin and the SQF job classification system were mapped and organized into first through fourth classification levels. In the new job taxonomy, the first primary class, covering IT consulting, computer systems, networks, and security related jobs, consisted of three secondary, five tertiary, and five quaternary classifications. The second primary class, covering databases and system operation related jobs, consisted of three secondary, three tertiary, and four quaternary classifications. The third primary class, covering web planning, web programming, web design, and games, consisted of four secondary, nine tertiary, and two quaternary classifications. The last primary class, covering jobs related to ICT management and computer and communication engineering technology, consisted of three secondary and six tertiary classifications. In particular, the new job classification system has a relatively flexible classification depth, unlike existing systems: WORKNET divides jobs into three levels, JOBKOREA into two levels with the subdivided jobs given as keywords, and Saramin likewise into two levels with keyword-form subdivisions. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In this system, some jobs stop at the second classification level while others are subdivided down to the fourth, reflecting the idea that not all jobs can be broken down into the same number of steps. We also combined the rules derived from the collected market data and the association analysis with experts' opinions. The newly proposed job classification system can therefore be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing systems. This study is meaningful in that it suggests a new job classification system reflecting market demand by mapping between occupations based on data through association analysis, rather than relying on the intuition of a few experts. However, it has the limitation that it cannot fully reflect market demand as it changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving the SQF in the SW industry, and the experience of success in the SW industry is expected to transfer to other industries.
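
A minimal sketch of the frequent-itemset step the study describes, using only Python's standard library: each transaction is the set of category labels one posting received across sites, and itemsets meeting a support threshold are counted level by level (Apriori's candidate pruning is omitted for brevity). The labels and thresholds are invented, and deriving association rules from these itemsets would be a further step.

    from collections import Counter
    from itertools import combinations

    # Invented transactions: labels one posting received on different sites.
    transactions = [
        {"worknet:web_dev", "jobkorea:frontend", "saramin:web"},
        {"worknet:web_dev", "jobkorea:frontend", "saramin:web"},
        {"worknet:db_admin", "jobkorea:dba", "saramin:db"},
        {"worknet:web_dev", "jobkorea:frontend"},
    ]

    def frequent_itemsets(transactions, min_support=2, max_size=3):
        """Level-wise counting of itemsets that meet the support threshold."""
        frequent = {}
        for size in range(1, max_size + 1):
            counts = Counter()
            for t in transactions:
                for combo in combinations(sorted(t), size):
                    counts[combo] += 1
            level = {c: n for c, n in counts.items() if n >= min_support}
            if not level:
                break
            frequent.update(level)
        return frequent

    for itemset, support in sorted(frequent_itemsets(transactions).items(),
                                   key=lambda kv: -kv[1]):
        print(itemset, support)

Co-occurring labels such as a WORKNET category and a JOBKOREA category then suggest a mapping between the two sites' classification systems.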