• Title/Abstract/Keywords: Fuzzy Mining

Search results: 120 items, processing time 0.03 s

PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities

  • Kim, Tae-Kyung;Oh, Jeong-Su;Ko, Gun-Hwan;Cho, Wan-Sup;Hou, Bo-Kyeng;Lee, Sang-Hyuk
    • Interdisciplinary Bio Central
    • /
    • Vol. 3, No. 2
    • /
    • pp.7.1-7.6
    • /
    • 2011
  • Background: Published manuscripts are the main source of biological knowledge. Since manual examination is almost impossible due to the huge volume of literature data (approximately 19 million abstracts in PubMed), intelligent text mining systems are of great utility for knowledge discovery. However, most current text mining tools have limited applicability because they i) provide abstract-based rather than sentence-based search, ii) use ontology terms improperly or not at all, iii) are designed for specific subjects, or iv) respond too slowly for web services and real-time applications. Results: We introduce an advanced text mining system called PubMine that supports intelligent knowledge discovery based on diverse bio-ontologies. PubMine improves query accuracy and flexibility with advanced search capabilities: fuzzy search, wildcard search, proximity search, range search, and Boolean combinations of these. Furthermore, PubMine allows users to extract multi-dimensional relationships between genes, diseases, and chemical compounds by using OLAP (On-Line Analytical Processing) techniques. The HUGO gene symbols and the MeSH ontology for diseases, chemical compounds, and anatomy have been included in the current version of PubMine, which is freely available at http://pubmine.kobic.re.kr. Conclusions: PubMine is a unique bio-text mining system that provides flexible searches and analysis of biological entity relationships. We believe that PubMine can serve as a key bioinformatics utility because its rapid response enables web services for the community and its flexible design accommodates general ontologies.

Cluster Merging Using Enhanced Density based Fuzzy C-Means Clustering Algorithm

  • 한진우;전성해;오경환
    • Journal of Korean Institute of Intelligent Systems (한국지능시스템학회논문지)
    • /
    • Vol. 14, No. 5
    • /
    • pp.517-524
    • /
    • 2004
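  • Since fuzzy theory was introduced in the 1960s, it has been widely used for clustering tasks in machine learning, including data mining. The fuzzy C-means algorithm is the most widely used fuzzy clustering algorithm. It allows a single data object to be assigned to every cluster with a different degree of membership. As with general clustering algorithms such as K-means, the quality of the final clustering produced by fuzzy C-means depends on the initial number of clusters and the positions of the cluster centers. These initial settings are subjective and may therefore lead to inappropriate results. To address this problem, this paper proposes an enhanced density-based fuzzy C-means algorithm that determines the initial number of clusters and the cluster centers from the properties of the given training data. The proposed method uses a grid to determine the positions of the initial cluster centers and the number of clusters. The performance of the proposed algorithm was compared using objective machine learning data sets that have been widely used in earlier work.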
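
The abstract above relies on the standard fuzzy C-means iteration, so a minimal sketch of that baseline may help. It is not the paper's enhanced density-based variant: the initial centers here are drawn at random, which is exactly the subjective choice the paper aims to replace with a grid-based procedure. The function name `fuzzy_c_means` and its parameters (`m`, `n_iter`, `tol`, `seed`) are assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means on a list of equal-length numeric tuples.

    Returns (centers, memberships). memberships[i][k] is the degree to which
    point i belongs to cluster k; each row sums to 1.
    """
    rng = random.Random(seed)
    # Random initial centers (assumption): the baseline, not the grid-based
    # initialization proposed in the paper.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    dim = len(points[0])
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point sits exactly on a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((d_k / d_j) ** power for d_j in dists)
                       for d_k in dists]
            memberships.append(row)
        # Center update: mean of all points weighted by u_ik^m.
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Usage with two well-separated hypothetical groups of 2-D points.
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, memberships = fuzzy_c_means(sample, n_clusters=2)
```

Because every point keeps a nonzero membership in every cluster, a poor choice of `n_clusters` or of the starting centers changes the result for all points, which is why the initialization matters as much as the abstract suggests.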

Investigation on ground displacements induced by excavation of overlapping twin shield tunnels

  • Qi, Weiqiang;Yang, Zhiyong;Jiang, Yusheng;Yang, Xing;Shao, Xiaokang;An, Hongbin
    • Geomechanics and Engineering
    • /
    • Vol. 28, No. 5
    • /
    • pp.531-546
    • /
    • 2022
  • Ground displacements caused by the construction of overlapping twin shield tunnels with a small turning radius are complex, especially under special geological conditions. To investigate, for the first time, the ground displacements caused by shield machines in the unique calcareous sand layers in Israel and to determine the main factors affecting them, field monitoring, laboratory geological analysis, theoretical calculations, and parameter studies were adopted. By using rod extensometers, inclinometers, total stations, and automatic segment-displacement monitors, subsurface tunneling-induced displacement, surface settlement, and displacement of the down-track tunnel segments caused by the construction of an up-track tunnel were analyzed. The up-track tunnel and the down-track tunnel pass through different strata, resulting in different construction parameters and ground displacements. The laws of variation of thrust and torque, soil pressure in the chamber, excavated soil quantity, synchronous grouting pressure, and grout volume of the two tunnels from parallel to fully overlapping orientations were compared. The thrust and torque of the shield in the fine sand are larger than those in the Kurkar layer, and the grouting amount in fine sand is unstable. According to fuzzy statistics and Gaussian curve fitting of the shield tunneling speed, the tunneling speed in the Kurkar stratum is twice that in the fine-sand stratum.

A Web Recommendation System using Grid based Support Vector Machines

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 7, No. 2
    • /
    • pp.91-95
    • /
    • 2007
  • The main goal of a web recommendation system is to study how user behavior on a website can be predicted by analyzing web log data, which record the visited web pages. Web recommendation systems have been studied extensively. Constructing a web recommendation system requires web mining; in particular, the web usage analysis part of web mining is a tool for building the recommendation model. In this paper, we propose a web recommendation system that uses grid-based support vector machines to improve on existing web recommendation systems. To verify the performance of our system, we run experiments using the data set from our web server.

Emerging Data Management Tools and Their Implications for Decision Support

  • Eorm, Sean B.;Novikova, Elena;Yoo, Sangjin
    • Journal of the Korea Society of Industrial Information Systems (한국산업정보학회논문지)
    • /
    • Vol. 2, No. 2
    • /
    • pp.189-207
    • /
    • 1997
  • Recently, we have witnessed a host of emerging tools in the management support systems (MSS) area, including the data warehouse/multidimensional databases (MDDB), data mining, on-line analytical processing (OLAP), intelligent agents, World Wide Web (WWW) technologies, the Internet, and corporate intranets. These tools are reshaping MSS developments in organizations. This article reviews a set of emerging data management technologies in the knowledge discovery in databases (KDD) process and analyzes their implications for decision support. Furthermore, today's MSS are equipped with a plethora of AI techniques (artificial neural networks, genetic algorithms, etc.), fuzzy sets, modeling by example, geographical information systems (GIS), logic modeling, and visual interactive modeling (VIM). All these developments suggest that we are shifting the corporate decision-making paradigm from information-driven decision making in the 1980s to knowledge-driven decision making in the 1990s.

The network model for Detection Systems based on data mining and the false errors

  • Lee Se-Yul;Kim Yong-Soo
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 6, No. 2
    • /
    • pp.173-177
    • /
    • 2006
  • This paper investigates the asymmetric costs of false errors in order to enhance the performance of detection systems. The proposed method uses the network model to take the cost ratio of false errors into account. By comparing false positive errors with false negative errors, this scheme achieves better performance with respect to both security and system performance objectives. The results of our empirical experiment show that the network model provides high detection accuracy. In addition, the simulation results show that the effectiveness of probe detection is enhanced by considering the costs of false errors.
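
The idea of weighing false positives against false negatives can be illustrated with a small sketch. This is not the paper's network model; it only shows, under assumed inputs, how a cost ratio shifts the alarm threshold of a generic detector. The function `best_threshold`, the scores, the labels, and the cost values are all hypothetical.

```python
def best_threshold(scores, labels, cost_fp=1.0, cost_fn=5.0):
    """Return (threshold, cost) minimizing cost_fp * FP + cost_fn * FN.

    A sample is flagged as an attack when its score >= threshold.
    labels use 1 for attack and 0 for normal traffic.
    """
    # Every distinct score is a candidate; infinity means "never raise an alarm".
    candidates = sorted(set(scores)) + [float("inf")]
    best = None
    for t in candidates:
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        false_neg = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = cost_fp * false_pos + cost_fn * false_neg
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Hypothetical detector scores and ground-truth labels.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.3]
labels = [0, 0, 1, 1, 1, 0]
missed_attacks_costly = best_threshold(scores, labels, cost_fp=1.0, cost_fn=5.0)
false_alarms_costly = best_threshold(scores, labels, cost_fp=5.0, cost_fn=1.0)
```

When missed attacks are the expensive error, the chosen threshold drops so that more traffic is flagged; when false alarms are the expensive error, it rises.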

Classification Rule Mining from Fuzzy Data based on Fuzzy Decision Tree

  • 이건명
    • Journal of KIISE: Software and Applications (한국정보과학회논문지:소프트웨어및응용)
    • /
    • Vol. 28, No. 1
    • /
    • pp.64-72
    • /
    • 2001
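  • Decision tree induction is a learning method that extracts classification knowledge from cases described by a set of feature values. Cases collected in the field are often vague because of observation errors, subjective judgment, uncertainty, and similar factors. The numeric attributes of such vague data can be expressed easily using fuzzy numbers or interval values. This paper proposes a fuzzy decision tree induction method for mining classification rules from training data in which numeric attributes may take crisp values as well as fuzzy numbers or interval values, non-numeric attributes take crisp values, and the class of each data item carries a certainty factor. It also introduces an inference method that determines the class of new data using the fuzzy decision tree generated by the proposed method. Finally, it presents the results of experiments performed to show the usefulness of the proposed method.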

A XML Schema Matching based on Fuzzy Similarity Measure

  • Kim, Chang-Suk;Sim, Kwee-Bo
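    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • Institute of Control, Robotics and Systems, ICCAS 2005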
    • /
    • pp.1482-1485
    • /
    • 2005
  • Equivalent schema matching among several different source schemas is very important for information integration or mining on the XML-based World Wide Web. Finding the source schema most similar to a given mediated schema is a major bottleneck because of the arbitrary nesting and hierarchical structures of XML DTD schemas. The task is complex, very labor-intensive, and error-prone. In this paper, we present the first complex matching of XML schemas (i.e., XML DTDs), inlining a two-dimensional DTD graph into flat feature values. The proposed method captures not only the schematic information but also the integrity constraint information of a DTD in order to match differently structured DTDs. We show that hierarchical schema matching based on integrity constraints is more semantic than schema matching that uses only schematic information and stored data.

Comparison of measurement uncertainty calculation methods on example of indirect tensile strength measurement

  • Tutmez, Bulent
    • Geomechanics and Engineering
    • /
    • Vol. 12, No. 6
    • /
    • pp.871-882
    • /
    • 2017
  • Indirect measurement of the tensile strength of laboratory samples is an important topic in rock engineering. One of the most important tests, the Brazilian strength test, is performed to obtain the tensile strength of rock, concrete, and other quasi-brittle materials. Because the measurements are obtained indirectly and the inspected rock materials may have heterogeneous properties, uncertainty quantification is required for a reliable test evaluation. In addition to the conventional measurement uncertainty evaluation methods recommended by the Guide to the Expression of Uncertainty in Measurement (GUM), such as Taylor's and Monte Carlo methods, a fuzzy set-based approach is also proposed, and the resulting uncertainties are discussed. The results showed that when tensile strength is measured by a laboratory test, its uncertainty can also be expressed by one of the methods presented.
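
The two conventional methods named in the abstract can be compared in a short sketch. It assumes the usual Brazilian test relation, tensile strength = 2P / (πDt) for failure load P, disc diameter D, and thickness t, with uncorrelated Gaussian inputs. The numerical values and function names are hypothetical and are not taken from the paper, and the fuzzy set-based approach is not shown.

```python
import math
import random
import statistics


def tensile_strength(load, diameter, thickness):
    """Brazilian (indirect) tensile strength: 2P / (pi * D * t)."""
    return 2.0 * load / (math.pi * diameter * thickness)


def taylor_uncertainty(load, diameter, thickness, u_load, u_diameter, u_thickness):
    """First-order (Taylor series) propagation for uncorrelated inputs.

    For a pure product/quotient model the relative variances simply add.
    """
    relative = math.sqrt((u_load / load) ** 2
                         + (u_diameter / diameter) ** 2
                         + (u_thickness / thickness) ** 2)
    return tensile_strength(load, diameter, thickness) * relative


def monte_carlo_uncertainty(load, diameter, thickness, u_load, u_diameter,
                            u_thickness, n_samples=20000, seed=0):
    """Standard deviation of the output over Gaussian draws of each input."""
    rng = random.Random(seed)
    draws = [
        tensile_strength(rng.gauss(load, u_load),
                         rng.gauss(diameter, u_diameter),
                         rng.gauss(thickness, u_thickness))
        for _ in range(n_samples)
    ]
    return statistics.stdev(draws)


# Hypothetical specimen: load in N, lengths in m, standard uncertainties alongside.
inputs = dict(load=20000.0, diameter=0.054, thickness=0.027,
              u_load=200.0, u_diameter=0.0001, u_thickness=0.0001)
u_taylor = taylor_uncertainty(**inputs)
u_monte_carlo = monte_carlo_uncertainty(**inputs)
```

For small relative uncertainties like these, the two estimates should agree closely; they diverge mainly when the input uncertainties are large enough for the nonlinearity of the quotient to matter.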

Similarity Pattern Analysis of Web Log Data using Multidimensional FCM

  • 김미라;조동섭
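    • Korean Institute of Information Scientists and Engineers: Conference Proceedings (한국정보과학회:학술대회논문집)
    • /
    • Korean Institute of Information Scientists and Engineers, Proceedings of the 2002 Fall Conference, Vol. 29, No. 2 (2)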
    • /
    • pp.190-192
    • /
    • 2002
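  • Data mining is the process of discovering diverse, valuable information in large amounts of stored data using statistical and mathematical analysis methods. Data clustering is an important technique for data mining. This paper studies a method for clustering web log data, which record the behavior of web users, using the Fuzzy C-Means algorithm. The Fuzzy C-Means clustering algorithm optimizes an objective function based on a similarity measure that considers the distance between each data point and each cluster center. Among the fields of the web log data, the user IP, time, and web page fields are processed into WLDF (Web Log Data for FCM), and multidimensional Fuzzy C-Means clustering is then performed. The result is used to analyze similar patterns between sample data and arbitrary data.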