• Title/Summary/Keyword: Similarity rule

Rapid Hybrid Recommender System with Web Log for Outbound Leisure Products (웹로그를 활용한 고속 하이브리드 해외여행 상품 추천시스템)

  • Lee, Kyu Shik;Yoon, Ji Won
    • KIISE Transactions on Computing Practices / v.22 no.12 / pp.646-653 / 2016
  • The outbound travel market is a rapidly growing global industry and has grown into an 11 trillion won business. Many recommender systems based on collaborative and content-based filtering rely on existing purchase logs or on studies of product similarity. These approaches are inefficient when data has not been collected in advance, and acquiring the large volume of data they need is relatively slow. An outbound product is characteristically purchased at most twice a year and falls in a higher price category. Because repeat purchases are rare in the outbound market, conventional recommender systems that profile existing customers are of limited use. To cope with this scarcity of data, we propose an improved customer-profiling method that combines web usage mining, an association rule algorithm, and a rule-based algorithm to build a faster recommender system for outbound products.
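
The association rules mentioned in this abstract are usually scored by support and confidence. The following is a minimal Python sketch of those two textbook measures, not the authors' implementation; the session data and the function names `support` and `confidence` are hypothetical.

```python
def support(itemset, sessions):
    """Fraction of sessions that contain every item in itemset."""
    return sum(itemset <= session for session in sessions) / len(sessions)

def confidence(antecedent, consequent, sessions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, sessions)
    if base == 0:
        return 0.0  # the rule never fires, so no confidence can be measured
    return support(antecedent | consequent, sessions) / base

# Hypothetical web-log sessions: product categories viewed by one visitor.
sessions = [
    {"japan", "hot spring"},
    {"japan", "hot spring", "ski"},
    {"japan", "city tour"},
    {"guam", "resort"},
]

rule_support = support({"japan", "hot spring"}, sessions)
rule_confidence = confidence({"japan"}, {"hot spring"}, sessions)
```

A rule such as "viewed japan, so recommend hot spring" would be kept only if both values exceed thresholds chosen for the application.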

TF-IDF Based Association Rule Analysis System for Medical Data (의료 정보 추출을 위한 TF-IDF 기반의 연관규칙 분석 시스템)

  • Park, Hosik;Lee, Minsu;Hwang, Sungjin;Oh, Sangyoon
    • KIPS Transactions on Software and Data Engineering / v.5 no.3 / pp.145-154 / 2016
  • With the recent interest in u-Health and the development of IT, the need to utilize medical information data has increased. Among previous studies that apply data mining algorithms to medical information data, some use association rule analysis. These studies aim to discover associations between symptoms and specific diseases; however, in most cases they do not consider infrequent terms that can be important for diagnosing a disease. In this paper, we propose a new association rule mining system that weights each term by its TF-IDF value so that infrequent but important items are taken into account. In addition, the proposed system can predict candidate diagnoses from medical text records using term similarity analysis based on a medical ontology.
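
To illustrate why TF-IDF favors infrequent terms, here is a minimal Python sketch. It uses the common textbook variant tf × log(N/df), which is not necessarily the weighting used in the paper; the toy records and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy records, for illustration only.
records = [
    ["fever", "cough", "fatigue"],
    ["fever", "cough", "rash"],
    ["fever", "headache"],
]
weights = tf_idf(records)
```

In this toy data "fever" appears in every record and receives weight zero, while the rare term "rash" outweighs the more common "cough".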

Effectiveness of Repeated Examination to Diagnose Enterobiasis in Nursery School Groups

  • Remm, Mare;Remm, Kalle
    • Parasites, Hosts and Diseases / v.47 no.3 / pp.235-241 / 2009
  • The aim of this study was to estimate the benefit from repeated examinations in the diagnosis of enterobiasis in nursery school groups, and to test the effectiveness of individual-based risk predictions using different methods. A total of 604 children were examined using double, and 96 using triple, anal swab examinations. The questionnaires for parents, structured observations, and interviews with supervisors were used to identify factors of possible infection risk. In order to model the risk of enterobiasis at individual level, a similarity-based machine learning and prediction software Constud was compared with data mining methods in the Statistica 8 Data Miner software package. Prevalence according to a single examination was 22.5%; the increase as a result of double examinations was 8.2%. Single swabs resulted in an estimated prevalence of 20.1% among children examined 3 times; double swabs increased this by 10.1%, and triple swabs by 7.3%. Random forest classification, boosting classification trees, and Constud correctly predicted about 2/3 of the results of the second examination. Constud estimated a mean prevalence of 31.5% in groups. Constud was able to yield the highest overall fit of individual-based predictions while boosting classification tree and random forest models were more effective in recognizing Enterobius positive persons. As a rule, the actual prevalence of enterobiasis is higher than indicated by a single examination. We suggest using either the values of the mean increase in prevalence after double examinations compared to single examinations or group estimations deduced from individual-level modelled risk predictions.

Automatic Construction of Syntactic Relation in Lexical Network(U-WIN) (어휘망(U-WIN)의 구문관계 자동구축)

  • Im, Ji-Hui;Choe, Ho-Seop;Ock, Cheol-Young
    • Journal of KIISE: Software and Applications / v.35 no.10 / pp.627-635 / 2008
  • This paper explores an extended form of lexical network by presenting U-WIN, which, unlike existing lexical networks centered on linking structures of semantic relations, applies lexical relations that include not only semantic relations but also conceptual, morphological, and syntactic relations. This study introduces a new methodology for constructing syntactic relations automatically. First, we extract nouns that are likely to be related to a verb, based on the verb's sentence type. Because an extracted noun may have many senses, its sense must then be determined. We propose determining the noun's sense using example matching rules, syntactic patterns, semantic similarity, and frequency information. In addition, the syntactic patterns are expanded using nouns that occur with high frequency in corpora.

A Fuzzy Neural Network Model Solving the Underutilization Problem (Underutilization 문제를 해결한 퍼지 신경회로망 모델)

  • 김용수;함창현;백용선
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.4 / pp.354-358 / 2001
  • This paper presents a fuzzy neural network model that solves the underutilization problem. The network has both stability and flexibility because it uses a control structure similar to that of the ART (Adaptive Resonance Theory)-1 neural network. It does not need to initialize weights and is less sensitive to noise than the ART-1 neural network. Its learning rule is a modified and fuzzified version of the Kohonen learning rule, based on the fuzzification of leaky competitive learning and of conditional probability. The similarity measure of the vigilance test, which is performed after a winner is selected among the output neurons, is the relative distance. This relative distance takes into account the Euclidean distance and the relative location between a datum and the cluster prototypes. The IRIS data and Gaussian-distributed data are used to compare the performance of the proposed fuzzy neural network with that of the Kohonen Self-Organizing Feature Map.
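
For orientation, the plain (non-fuzzy) Kohonen competitive learning step that the paper modifies can be sketched as follows. This is the standard winner-take-all update with Euclidean distance, not the paper's fuzzified rule or its relative distance; the function names, the learning rate, and the toy prototypes are assumptions.

```python
import math

def nearest_prototype(x, prototypes):
    """Index of the prototype closest to x in Euclidean distance."""
    return min(range(len(prototypes)), key=lambda i: math.dist(x, prototypes[i]))

def kohonen_update(x, prototypes, lr=0.5):
    """Move only the winning prototype toward x; return the winner's index."""
    winner = nearest_prototype(x, prototypes)
    prototypes[winner] = [w + lr * (xi - w) for w, xi in zip(prototypes[winner], x)]
    return winner

# Hypothetical two-cluster example.
prototypes = [[0.0, 0.0], [10.0, 10.0]]
winner = kohonen_update([2.0, 2.0], prototypes, lr=0.5)
```

Because only the winner moves, a prototype that never wins is never updated, which is one common way the underutilization problem named in the title is described.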

A Korean Domain Name System for Korean (한국어순을 따르는 인터넷 한글 도메인 네임 시스템)

  • Lee, Jae-Yong;Lee, Kyoon-Ha
    • The Transactions of the Korea Information Processing Society / v.5 no.7 / pp.1855-1862 / 1998
  • An Internet user obtains an IP address from DNS to access a host. Korean users, however, must use domain names in English word order, which is inconvenient for them. All existing systems reverse the Korean word order, and this is an obstacle to the spread of the Internet. The need for a Korean name system well suited to Korean Internet users is therefore increasing. To operate the proposed Korean Domain Name System alongside existing Domain Name Systems, we built a Korean-to-English transformation rule and a Korean domain name composition rule. We also designed a STHOP for Korean word order processing. In a STROP, an SLDF transforms a Korean SLD into an English SLD, and an NFDNG converts it to English word order. With the proposed Korean Domain Name System, users can easily understand the meaning embedded in a domain name, and the reluctance to register domains caused by the similarity of organization names would be resolved. Consequently, the system could contribute to the popularization of the Internet.

Ontology-Based Intelligent Rule Components Extraction (온톨로지 기반 지능형 규칙 구성요소 추출에 관한 연구)

  • Kim U-Ju;Chae Sang-Yong;Park Sang-Eon
    • Proceedings of the Korea Intelligent Information System Society Conference / 2006.06a / pp.237-244 / 2006
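  • As research on the Semantic Web grows, expectations are also rising for intelligent web environments such as rule-based systems, one of its related fields. However, rule acquisition remains a major constraint on the use of rule-based systems. Rule acquisition is the set of methods for obtaining the required rules from the Web, and acquiring such rules requires that the rule components first be identified. At present, this identification is mostly done manually by knowledge managers. The aim of this study is to reduce that burden by automating the identification of rule components from the Web as far as possible and minimizing the knowledge manager's manual work. The methods are ontology-based string comparison with web pages and expansion techniques that overcome the limits of such comparison. The first method automates the knowledge manager's rule identification process as far as possible by comparing an ontology with the web page in which rules are to be identified. If a generalized ontology has been built from the rules of systems similar to the website under study, rule components can be extracted automatically by comparing that ontology with the website. String comparison techniques are used to identify rules on the basis of such an ontology. However, simple string comparison alone runs into the limits of natural language processing when identifying rules. The second method is used to overcome this. The second method expands unstructured information. A stemming algorithm that finds the root forms of the target words, a WordNet Expansion technique that uses WordNet to expand terms to synonyms and near-synonyms, and a Semantic Similarity Measure that measures semantic similarity are applied in stages to achieve automated and accurate rule identification. By combining these methods, the number of candidate words that will not be extracted as rule components is reduced, and we propose and implement a more accurate and intelligent rule component extraction methodology that lessens the knowledge manager's burden in rule acquisition.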

A Direct Approach for Finding Functional Lipolytic Enzymes from the Paenibacillus polymyxa Genome

  • JUNG, YEO-JIN;KIM, HYUNG-KWOUN;KIM, JIHYUN F.;PARK, SEUNG-HWAN;OH, TAE-KWANG;LEE, JUNG-KEE
    • Journal of Microbiology and Biotechnology / v.15 no.1 / pp.155-160 / 2005
  • A direct approach was used to retrieve active lipases from Paenibacillus polymyxa genome databases. Twelve putative lipase genes were tested using a typical lipase sequence rule built on a consensus sequence of the catalytic triad and oxyanion hole. Six of these genes satisfied the sequence rule and showed similarity (about 25%) to known bacterial lipases. To obtain the six lipase proteins, the lipase genes were expressed in E. coli cells, and lipolytic activities were measured using tributyrin plates and p-nitrophenyl caproate. One of them, contig 160-26, was expressed in a soluble and active form in E. coli cells. After purification on a Ni-NTA column, its biochemical properties were characterized in detail. It had maximum hydrolytic activity at $30^{\circ}C$ and pH 7-8, and was stable up to $40^{\circ}C$ and in the range of pH 5-8. Among various pNP esters, it hydrolyzed pNPC$_6$ most rapidly. The other contigs were expressed more or less in soluble forms, although no lipolytic activity was detected. Because they share many conserved regions with lipase 160-26 and with other bacterial lipases throughout their sequences, they are suggested to be true lipase genes.

Sequence and Characterization of the Genomic Clone of the FVFD16 and FVFD30 Gene Isolated from Flammulina velutipes (팽이버섯에서 분리된 FVFD16과 FVFD30 유전자의 게놈클론의 염기서열 및 특성)

  • Kim, Dool-Yi;Azuma, Tomo-Nori
    • The Korean Journal of Mycology / v.28 no.1 / pp.26-31 / 2000
  • We isolated genomic clones of the FVFD16 and FVFD30 genes, which are specifically expressed during fruit body formation of Flammulina velutipes [(Curt: Fr.) Sing], and determined their sequences. The FVFD16 gene contains two introns in its open reading frame, and the FVFD30 gene contains four. The introns conformed to the GT/AG rule. Both genes contained a CAAT box-like sequence and a TATA box, and a CT-rich region was present upstream of the transcription start point. In the FVFD30 gene, a CCACC sequence was expected to have the greatest activity. Analysis of the FVFD16 gene showed 80% homology with a cDNA clone belonging to the same gene family. From the results of genomic Southern blot analysis, we presume that the FVFD16 and FVFD30 genes belong to gene families with more than two copies.
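
The GT/AG rule mentioned in this abstract states that an intron begins with GT and ends with AG on the sense strand. A minimal Python check is sketched below; the function name and the example sequences are hypothetical and are not taken from the FVFD16 or FVFD30 genes.

```python
def follows_gt_ag_rule(intron):
    """True if the intron starts with GT and ends with AG (sense strand, 5' to 3')."""
    seq = intron.upper()
    # Require at least four bases so the two dinucleotides cannot overlap.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

# Hypothetical sequences, for illustration only.
candidates = ["GTAAGTCTTTTCAG", "GCAAGTCTTTTCAG", "gtatgttag"]
results = [follows_gt_ag_rule(s) for s in candidates]
```

The second candidate fails because it starts with GC; the third passes because the check is case-insensitive.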

A Study on the Development of Web-based Expert System for Urban Transit (웹 기반의 도시철도 전문가시스템 개발에 관한 연구)

  • Kim Hyunjun;Bae Chulho;Kim Sungbin;Lee Hoyong;Kim Moonhyun;Suh Myungwon
    • Transactions of the Korean Society of Automotive Engineers / v.13 no.5 / pp.163-170 / 2005
  • Urban transit is a complex system that combines electrical and mechanical components, so a maintenance system is needed to ensure safety during high-speed operation and to enable prompt maintenance. An expert system is a computer program that uses numerical or non-numerical domain-specific knowledge to solve problems. In this research, we develop an expert system that quickly diagnoses failure causes and displays the appropriate measures. To develop the expert system, we first standardized the failure code classification system and created a BOM (Bill of Materials). We then constructed a knowledge base by analyzing failure histories and maintenance manuals. To retrieve failure diagnosis and repair procedures linked to the knowledge base, we built an RBR (Rule-Based Reasoning) engine using a pattern matching technique and a CBR (Case-Based Reasoning) engine using a similarity search method. The system is web-based to maximize accessibility.
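
The two engines described in this abstract can be pictured with a small Python sketch: a rule-based step that matches a failure code against patterns, and a case-based step that retrieves the most similar past case. Everything here is an assumption for illustration, including the failure codes, the cases, the function names, and the use of Jaccard similarity; the paper does not state which similarity measure it uses.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: failure-code pattern -> recommended measure.
RULES = [
    ("BRK-*", "Inspect brake control unit"),
    ("DOOR-01", "Check door limit switch"),
]

# Hypothetical past cases: (observed symptoms, recorded cause).
CASES = [
    ({"noise", "vibration", "overheating"}, "Worn traction motor bearing"),
    ({"door stuck", "alarm"}, "Faulty door actuator"),
]

def match_rule(code):
    """Rule-based step: measure of the first rule whose pattern matches the code."""
    for pattern, measure in RULES:
        if fnmatchcase(code, pattern):
            return measure
    return None

def jaccard(a, b):
    """Set similarity: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def most_similar_case(symptoms):
    """Case-based step: stored case with the highest Jaccard similarity."""
    return max(CASES, key=lambda case: jaccard(symptoms, case[0]))

measure = match_rule("BRK-17")
cause = most_similar_case({"noise", "vibration"})[1]
```

A system of this kind would typically try the rule-based step first and fall back to case retrieval when no rule matches.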