• Title/Summary/Keyword: Pattern Processing


WebPR : A Dynamic Web Page Recommendation Algorithm Based on Mining Frequent Traversal Patterns (WebPR :빈발 순회패턴 탐사에 기반한 동적 웹페이지 추천 알고리즘)

  • Yoon, Sun-Hee;Kim, Sam-Keun;Lee, Chang-Hoon
    • The KIPS Transactions:PartB / v.11B no.2 / pp.187-198 / 2004
  • The World-Wide Web is the largest distributed information space and has grown to encompass diverse information resources. Although the Web is growing exponentially, an individual's capacity to read and digest content remains essentially fixed. Web users are easily overwhelmed by the explosion of information, by constantly changing Web environments, and by sites that fail to understand their needs. In this setting, mining traversal patterns is an important Web-mining problem with many application domains, including site design and information services. Conventional traversal pattern mining systems exploit inter-page associations within sessions using only a restricted, vector- or matrix-based mechanism for generating frequent k-pagesets. We develop a family of novel algorithms, termed WebPR (Web Page Recommend), that mine frequent traversal patterns and derive pagesets to recommend. Our algorithms provide Web users with new page views that include recommended pagesets, so that users can traverse a Web site more effectively. The main distinguishing features are a scheme that consistently applies inter-page associations when mining frequent traversal patterns and an efficient tree model for storing them. Experiments with two real data sets, the Lady Asiana and KBS media server sites, show that our method outperforms conventional methods.
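
As a rough illustration of the kind of session-based mining the abstract describes, the sketch below counts contiguous page subsequences across sessions and recommends pages that frequently follow the user's current path. It is a generic frequent-pattern counter in Python, not the WebPR tree model; the session data and support threshold are invented.

```python
from collections import defaultdict

def frequent_traversal_patterns(sessions, min_support, max_len=3):
    """Count contiguous page subsequences (traversal patterns) across sessions
    and keep those whose support meets the threshold."""
    counts = defaultdict(int)
    for session in sessions:
        seen = set()
        for k in range(2, max_len + 1):
            for i in range(len(session) - k + 1):
                seen.add(tuple(session[i:i + k]))
        for pattern in seen:          # count each pattern once per session
            counts[pattern] += 1
    return {p: c for p, c in counts.items() if c >= min_support}

def recommend(frequent, current_path):
    """Recommend pages that frequently follow the user's current path."""
    suggestions = defaultdict(int)
    for pattern, support in frequent.items():
        prefix, nxt = pattern[:-1], pattern[-1]
        if tuple(current_path[-len(prefix):]) == prefix:
            suggestions[nxt] += support
    return sorted(suggestions, key=suggestions.get, reverse=True)

sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["B", "C", "D"]]
patterns = frequent_traversal_patterns(sessions, min_support=2)
print(recommend(patterns, ["A", "B"]))   # -> ['C']
```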

A Fast String Matching Scheme without using Buffer for Linux Netfilter based Internet Worm Detection (리눅스 넷필터 기반의 인터넷 웜 탐지에서 버퍼를 이용하지 않는 빠른 스트링 매칭 방법)

  • Kwak, Hu-Keun;Chung, Kyu-Sik
    • The KIPS Transactions:PartC / v.13C no.7 s.110 / pp.821-830 / 2006
  • As Internet worms spread worldwide, detecting and filtering them has become one of the hot issues in Internet security. One way to implement worm detection is the Linux Netfilter kernel module. Its basic operation is string matching, in which incoming packets on the network are compared with predefined worm signatures (patterns). A worm may appear within a single packet or span two or more succeeding packets, with part of the worm in the first packet and the remainder in the packets that follow. Assuming the maximum length of a worm pattern is less than 1024 bytes, string matching must cover up to two succeeding packets, i.e., 2048 bytes. To do so, Linux Netfilter keeps the previous packet in a buffer and matches against the combined 2048-byte string formed from the buffered packet and the current packet. As the number of concurrent connections handled by the worm detection system increases, the total buffer (memory) size grows and string matching slows down. In this paper, to reduce the buffer memory and speed up string matching, we propose a string matching scheme that uses no buffer. The proposed scheme keeps only the partial matching result of the previous packet against the signatures and does not buffer the previous packet itself; this partial matching information is used to detect a worm spanning two succeeding packets. We implemented the proposed scheme by modifying Linux Netfilter and compared the modified module with the original Linux Netfilter module. Experimental results show that the proposed scheme uses 25% less memory and is 54% faster than the original scheme.
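
The buffer-free idea can be illustrated with a stateful string matcher: instead of keeping the previous packet, only the partial-match position against a signature is carried between packets. The sketch below uses a plain KMP matcher in Python as a stand-in for the Netfilter kernel-module implementation; the signature and packet payloads are invented.

```python
def build_failure(pattern):
    """KMP failure function for a single signature."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def scan_packet(packet, pattern, fail, state):
    """Scan one packet, resuming from the partial-match state left by the
    previous packet; returns (matched, new_state). No packet is buffered:
    only the integer state survives between packets."""
    k = state
    for byte in packet:
        while k and byte != pattern[k]:
            k = fail[k - 1]
        if byte == pattern[k]:
            k += 1
        if k == len(pattern):
            return True, 0
    return False, k

signature = b"WORM-SIGNATURE"
fail = build_failure(signature)
state = 0
for pkt in (b"...payload...WORM-SIG", b"NATURE...more payload..."):
    hit, state = scan_packet(pkt, signature, fail, state)
    if hit:
        print("worm detected across packet boundary")
```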

Accurate Camera Calibration Method for Multiview Stereoscopic Image Acquisition (다중 입체 영상 획득을 위한 정밀 카메라 캘리브레이션 기법)

  • Kim, Jung Hee;Yun, Yeohun;Kim, Junsu;Yun, Kugjin;Cheong, Won-Sik;Kang, Suk-Ju
    • Journal of Broadcast Engineering / v.24 no.6 / pp.919-927 / 2019
  • In this paper, we propose an accurate camera calibration method for acquiring multiview stereoscopic images. Camera calibration is generally performed using checkerboard patterns. The checkerboard pattern simplifies the feature point extraction process and exploits the known lattice structure, which allows accurate estimation of the relation between points on the 2-dimensional image and points in 3-dimensional space. Since the estimation accuracy of camera parameters depends on feature matching, accurate detection of checkerboard corners is crucial. We therefore propose a method that achieves accurate camera calibration through accurate detection of checkerboard corners. The proposed method detects checkerboard corner candidates using 1-dimensional Gaussian filters and then applies a corner refinement process that removes outliers from the candidates and localizes checkerboard corners at sub-pixel accuracy. To verify the proposed method, we examine reprojection errors and camera location estimation results to confirm the estimation accuracy of camera intrinsic and extrinsic parameters.
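
For reference, a conventional checkerboard calibration pipeline with sub-pixel corner refinement can be assembled from OpenCV's standard functions, as sketched below. This is not the authors' 1-D Gaussian candidate detector; the board geometry, square size, and image folder are assumptions.

```python
import glob
import cv2
import numpy as np

# Checkerboard geometry (inner corners) and square size are assumptions.
cols, rows, square = 9, 6, 0.025  # 9x6 inner corners, 25 mm squares
objp = np.zeros((rows * cols, 3), np.float32)
objp[:, :2] = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2) * square

obj_points, img_points = [], []
image_size = None
for path in glob.glob("calib_images/*.png"):   # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    image_size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, (cols, rows))
    if not found:
        continue
    # Sub-pixel refinement of the detected corner locations.
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    obj_points.append(objp)
    img_points.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, image_size, None, None)
print("reprojection RMS error:", rms)   # reprojection error check mentioned in the abstract
```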

Development of a Gridded Simulation Support System for Rice Growth Based on the ORYZA2000 Model (ORYZA2000 모델에 기반한 격자형 벼 생육 모의 지원 시스템 개발)

  • Hyun, Shinwoo;Yoo, Byoung Hyun;Park, Jinyu;Kim, Kwang Soo
    • Korean Journal of Agricultural and Forest Meteorology / v.19 no.4 / pp.270-279 / 2017
  • Regional assessment of crop productivity using a gridded simulation approach could aid policy making and crop management. Still, little effort has been made to develop systems that allow gridded simulation of crop growth with the ORYZA2000 model, which has been used for predicting rice yield in Korea. The objectives of this study were to develop a series of data processing modules for creating input files, running the crop model, and aggregating output files over a region of interest using gridded data files. These modules were implemented in C++ and R to make the best use of the features provided by each language. In a case study, 13,000 plain-text input files were prepared from daily gridded weather data with spatial resolutions of 1 km and 12.5 km for the period 2001-2010. Using the text files as inputs to the ORYZA2000 model, crop yield simulations were performed for each grid cell under a scenario of crop management practices. After output files were created for the grid cells representing paddy rice fields in South Korea, the outputs were aggregated into a single file in netCDF format. The spatial pattern of simulated crop yield was relatively similar to the actual distribution of yields in Korea, although yield biases occurred in some regions. These differences appeared to result from uncertainties in the input data, e.g., transplanting date and cultivar in an area, as well as in the weather data. Our results indicate that the set of tools developed in this study would be useful for gridded simulation with other crop models. In further work, it would be worthwhile to take into account compatibility with a modeling interface library for integrated simulation of an agricultural ecosystem.
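
A much-simplified view of the per-cell workflow (write a text input file, run the model, collect the yield, aggregate to netCDF) might look like the Python sketch below. The authors' modules are in C++ and R, and the executable name, file formats, and grid here are placeholders, not ORYZA2000's actual interface.

```python
import subprocess
import numpy as np
from netCDF4 import Dataset

def run_cell(lat, lon, weather_file):
    """Write a plain-text input file for one grid cell, run the crop model
    executable, and parse the simulated yield. Paths, the executable name,
    and the output format are placeholders, not ORYZA2000's actual I/O."""
    exp_file = f"exp_{lat:.3f}_{lon:.3f}.txt"
    with open(exp_file, "w") as f:
        f.write(f"WEATHER {weather_file}\nTRANSPLANT 2001-05-25\n")  # placeholder scenario
    subprocess.run(["oryza_model", exp_file], check=True)            # hypothetical executable
    with open("yield.out") as f:                                     # hypothetical output file
        return float(f.read().strip())

lats = np.arange(34.0, 38.0, 0.125)   # roughly a 12.5 km grid (assumption)
lons = np.arange(126.0, 130.0, 0.125)
yields = np.full((lats.size, lons.size), np.nan)
for i, lat in enumerate(lats):
    for j, lon in enumerate(lons):
        yields[i, j] = run_cell(lat, lon, f"wth_{i}_{j}.txt")

# Aggregate the per-cell outputs into a single netCDF file.
with Dataset("rice_yield.nc", "w") as nc:
    nc.createDimension("lat", lats.size)
    nc.createDimension("lon", lons.size)
    nc.createVariable("lat", "f4", ("lat",))[:] = lats
    nc.createVariable("lon", "f4", ("lon",))[:] = lons
    nc.createVariable("yield", "f4", ("lat", "lon"))[:] = yields
```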

An Efficient Clustering Algorithm based on Heuristic Evolution (휴리스틱 진화에 기반한 효율적 클러스터링 알고리즘)

  • Ryu, Joung-Woo;Kang, Myung-Ku;Kim, Myung-Won
    • Journal of KIISE:Software and Applications / v.29 no.1_2 / pp.80-90 / 2002
  • Clustering is a useful technique for grouping data points such that points within a single group/cluster have similar characteristics. Many clustering algorithms have been developed and used in engineering applications, including pattern recognition and image processing, and clustering has recently drawn increasing attention as one of the important techniques in data mining. However, algorithms such as K-means and Fuzzy C-means suffer from two difficulties: the number of clusters must be determined a priori, and the clustering result depends on the initial set of clusters, which can lead to undesirable results. In this paper, we propose a new clustering algorithm that solves these problems. Our method uses an evolutionary algorithm to overcome the local optima problem, in which clustering converges to an undesirable state when started from an inappropriate set of clusters. We also adopt a new measure of how well the data are clustered, defined in terms of both intra-cluster dispersion and inter-cluster separability. Using this measure, the number of clusters is determined automatically as a result of the optimization process. In addition, we combine problem-specific heuristic knowledge with the evolutionary algorithm to speed up its search. We experimented with our algorithm on several sets of multi-dimensional data and showed that it outperforms the existing algorithms.
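
A toy version of the idea, i.e., scoring candidate cluster-center sets by inter-cluster separability over intra-cluster dispersion and letting an evolutionary loop vary both the centers and their number, is sketched below. The fitness measure and mutation operators are illustrative, not the paper's exact formulation.

```python
import numpy as np

def fitness(data, centers):
    """Cluster-validity score: inter-cluster separability divided by
    intra-cluster dispersion (higher is better). A generic measure in the
    spirit of the paper, not its exact definition."""
    labels = np.argmin(np.linalg.norm(data[:, None] - centers[None], axis=2), axis=1)
    intra = np.mean([np.linalg.norm(x - centers[l]) for x, l in zip(data, labels)])
    if len(centers) < 2:
        return 0.0
    inter = np.min([np.linalg.norm(a - b)
                    for i, a in enumerate(centers) for b in centers[i + 1:]])
    return inter / (intra + 1e-9)

def evolve(data, generations=50, pop_size=10, rng=np.random.default_rng(0)):
    """Toy evolutionary search: each individual is a set of cluster centers,
    mutated by jittering, adding, or dropping a center, so the number of
    clusters emerges from the optimization."""
    pop = [data[rng.choice(len(data), rng.integers(2, 6), replace=False)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(data, c), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        for p in parents:
            child = p + rng.normal(0, 0.1, p.shape)       # jitter the centers
            if rng.random() < 0.2 and len(child) > 2:      # occasionally drop a center
                child = np.delete(child, rng.integers(len(child)), axis=0)
            elif rng.random() < 0.2:                       # occasionally add a center
                child = np.vstack([child, data[rng.integers(len(data))]])
            children.append(child)
        pop = parents + children
    return pop[0]

data = np.vstack([np.random.randn(50, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
best = evolve(data)
print("clusters found:", len(best))
```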

The Integration System for International Procurement Information Processing (국제입찰정보 통합시스템의 설계 및 구현)

  • Yoon, Jong-Wan;Lee, Jong-Woo;Park, Chan-Young
    • Journal of KIISE:Computing Practices and Letters / v.8 no.1 / pp.71-81 / 2002
  • The lack of specialization in existing commercial Web search systems stems from the fact that they cannot extract and gather the meaningful information from each information domain they cover. We believe, however, that the need for information integration systems, not just search systems, is likely to grow in the future. In this paper, we propose the design and implementation of an information integration system called TIC (Target Information Collector). TIC extracts meaningful information from a specific information area on the Internet and integrates it for commercial service. We also report evaluation results for our implementation. For the experiments, we applied TIC to the international procurement information area; international procurement information is announced publicly and freely by each government to the world. To automatically extract common properties from the related source sites, we adopt an information pointing technique based on parsing inter-HTML tag patterns. Through the design of an information integration framework, a site-specific information integration engine can be implemented easily. By running TIC for about eight months, we found that it removes a considerable amount of duplicated information and, as a result, yields high-quality international procurement information. The main contribution of this paper is a framework design and its implementation for extracting information from a specific area and integrating it into a meaningful whole.
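
One way to picture the tag-pattern extraction and duplicate removal is the sketch below: a site-specific selector pulls notice fields out of a fixed HTML tag layout, and records are deduplicated by hashing the normalized fields. The selector, field names, and sample page are hypothetical, not TIC's actual rules.

```python
import hashlib
from bs4 import BeautifulSoup

# Hypothetical tag-pattern rule for one source site: each notice is a table
# row whose cells follow a fixed <td> order (title, agency, deadline).
def extract_notices(html):
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.select("table.notices tr"):       # site-specific selector
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) >= 3:
            yield {"title": cells[0], "agency": cells[1], "deadline": cells[2]}

def integrate(pages, seen=None):
    """Merge notices from several source pages, dropping duplicates by a
    hash of the normalized record (one way to remove the duplicated
    information the abstract mentions)."""
    seen = set() if seen is None else seen
    merged = []
    for html in pages:
        for notice in extract_notices(html):
            key = hashlib.sha1(
                "|".join(v.lower() for v in notice.values()).encode()).hexdigest()
            if key not in seen:
                seen.add(key)
                merged.append(notice)
    return merged

page = """<table class="notices">
<tr><td>Road construction</td><td>Ministry X</td><td>2002-03-01</td></tr>
<tr><td>Road construction</td><td>Ministry X</td><td>2002-03-01</td></tr>
</table>"""
print(integrate([page]))   # the duplicated row is removed
```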

Analysis of Network Traffic with Urban Area Characteristics for Mobile Network Traffic Model (이동통신 네트워크 트래픽 모델을 위한 도시 지역 이동통신 트래픽 특성 분석)

  • Yoon, Young-Hyun
    • The KIPS Transactions:PartC / v.10C no.4 / pp.471-478 / 2003
  • Traditionally, analysis, simulation, and measurement have all been used to evaluate the performance of network protocols and the functional entities that support mobile wireless service. Simulation is useful for testing complex systems with intricate interactions between components. A mobile call simulator used to examine, validate, and predict the performance of mobile wireless call procedures must have a teletraffic model that describes the mobile communication environment. A mobile teletraffic model consists of two sub-models: a traffic source model and a network traffic model. In this paper, we analyze network traffic data gathered from selected base stations (BSs) to define the mobile teletraffic model. We define four types of cell location: residential, commercial, industrial, and afforested zones. We selected base stations in Seoul that represent each cell location type and gathered real data from them. We then present the call rate per hour, the daily call distribution pattern, busy hours, off-peak hours, and the maximum and minimum numbers of calls for each defined cell location type. These parameters are important for testing the performance and reliability of mobile communication systems and are useful for defining a mobile network traffic model or as input parameters to existing mobile simulation programs.
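
The per-zone traffic characterization amounts to grouping call records by cell-location type and hour of day; a minimal sketch is shown below. The base-station-to-zone mapping and the records are invented for illustration.

```python
from collections import defaultdict

ZONE = {"BS01": "Residential", "BS02": "Commercial",
        "BS03": "Industrial", "BS04": "Afforested"}   # hypothetical BS-to-zone map

def hourly_call_profile(call_records):
    """Aggregate per-zone call counts by hour of day from (base_station, hour)
    records, then report the busy hour, off-peak hour, and call extremes for
    each zone type."""
    counts = defaultdict(lambda: [0] * 24)
    for bs_id, hour in call_records:          # hour = 0..23 of the call start
        counts[ZONE[bs_id]][hour] += 1
    profile = {}
    for zone, hours in counts.items():
        profile[zone] = {
            "busy_hour": max(range(24), key=hours.__getitem__),
            "off_peak_hour": min(range(24), key=hours.__getitem__),
            "max_calls": max(hours),
            "min_calls": min(hours),
        }
    return profile

records = [("BS01", 20), ("BS01", 20), ("BS01", 3), ("BS02", 11), ("BS02", 14)]
print(hourly_call_profile(records))
```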

The Construction of Multiform User Profiles Based on Transaction for Effective Recommendation and Segmentation (효과적인 추천과 세분화를 위한 트랜잭션 기반 여러 형태 사용자 프로파일의 구축)

  • Koh, Jae-Jin;An, Hyoung-Keun
    • The KIPS Transactions:PartD / v.13D no.5 s.108 / pp.661-670 / 2006
  • With the development of e-Commerce and the proliferation of easily accessible information, information filtering systems such as recommender and SDI systems have become popular for pruning large information spaces so that users are directed toward the items that best meet their needs and preferences. Many information filtering methods have been proposed to support such systems. XML is emerging as a new standard for information, and filtering systems need new approaches for dealing with XML documents. In this paper, our system suggests a method for creating multiform user profiles using XML's ability to represent structure. The system consists of two parts: an administrator profile definition part, in which an administrator defines profiles for analyzing users' purchase patterns before a transaction such as a purchase actually occurs, and a user profile creation module that applies the defined profiles. Administrator profiles are built from DTD information and point to specific parts of a document conforming to that DTD. The proposed system builds users' profiles more accurately, adapts to users' buying behavior, and provides useful product information based on those profiles without inefficient searching.
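
The two-part construction can be sketched as follows: an administrator-defined set of element paths points at the parts of a purchase document worth profiling, and each incoming transaction updates per-user counters at those paths. The XML layout and paths are illustrative, not the paper's DTD-based profiles.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Administrator profile: element paths (relative to the transaction root)
# that point at the parts of the purchase document worth profiling.
# The paths and the XML layout are illustrative only.
ADMIN_PROFILE = {"category": "./item/category", "brand": "./item/brand"}

def update_user_profile(profiles, user_id, transaction_xml):
    """Accumulate a multiform profile (one counter per administrator-defined
    field) from a single purchase transaction document."""
    root = ET.fromstring(transaction_xml)
    for field, path in ADMIN_PROFILE.items():
        for elem in root.findall(path):
            profiles[user_id][field][elem.text] += 1

profiles = defaultdict(lambda: defaultdict(Counter))
tx = """<purchase>
  <item><category>camera</category><brand>Acme</brand></item>
  <item><category>lens</category><brand>Acme</brand></item>
</purchase>"""
update_user_profile(profiles, "u001", tx)
print(profiles["u001"])   # e.g. {'category': Counter(...), 'brand': Counter({'Acme': 2})}
```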

A Study on the Most Frequent Diseases of Health Insurance Program and the Primary Care Physicians in Korea (의료보험 다빈도 상병과 1차진료 의사에 관한 연구)

  • 김철환;문옥륜
    • Health Policy and Management / v.3 no.1 / pp.124-145 / 1993
  • General practitioners, internists, pediatricians, and family physicians are classified as so-called primary care physicians in the United States. We carried out this study to answer the question, "Who are the primary care physicians in Korea?" We analyzed 663,154 claims drawn by systematic random sampling from the health insurance claims processed during the one-month period of April 1992. The 663,154 cases were matched with the physician file registered at the National Federation of Medical Insurance using the individual physician code number and analyzed by specialty. Following Geyman's definition of primary care physician in the United States, this study shows that such physicians can take care of 43.2% of total private clinic claims in Korea. If only general practitioners and family physicians were considered primary care physicians, as in the United Kingdom, they could cover only 8.3% of total claims in Korea. The most frequent diseases are those ranking 1st to 46th in total private clinic claims. The proportion of claims for the most frequent diseases was highest for pediatricians (90.4%), followed by internists (81.4%), otolaryngologists (78.7%), and family physicians (76.5%). The proportion of the most frequent diseases within the most common 46 diseases was highest for radiologists (80.4%), followed by general practitioners (78.3%), family physicians (67.4%), and internists (67.4%). We classified the most common 20 diseases of each specialty into 17 categories of ICD-9 and compared them with those of general practitioners. The specialists whose disease patterns were similar to those of general practitioners were anesthesiologists, family physicians, general surgeons, and internists. Some specialists practicing at private clinics managed diseases that were not quite appropriate for their specialties. After evaluating each specialty by the most common diseases, the most frequent diseases, and the most frequent 20 diseases in terms of the 17 ICD-9 categories, we tentatively conclude that the primary care physicians in the Republic of Korea are general practitioners, anesthesiologists, family physicians, internists, and general surgeons. This study concludes that the categories of primary care physicians are so diverse that their roles and distributions are distorted accordingly. Vigorous health policy efforts to correct this imbalance are needed for the better provision of primary health care in Korea.
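
The core tabulation, matching claims to physician specialty and measuring each specialty's share of claims falling within the most frequent diseases, can be expressed as a small aggregation; a sketch with made-up toy data is shown below (the actual claims and physician files are not public).

```python
import pandas as pd

# Hypothetical columns and values; real claims data are not available here.
claims = pd.DataFrame({
    "physician_id": ["P1", "P1", "P2", "P3", "P3", "P3"],
    "disease_code": ["J06", "K29", "J06", "J06", "A09", "K29"],
})
physicians = pd.DataFrame({
    "physician_id": ["P1", "P2", "P3"],
    "specialty": ["GP", "Pediatrics", "Internal medicine"],
})

merged = claims.merge(physicians, on="physician_id")

# Top diseases overall (the study uses the 46 most frequent), then the share
# of each specialty's claims that falls inside that set.
top = merged["disease_code"].value_counts().head(46).index
share = (merged.assign(in_top=merged["disease_code"].isin(top))
               .groupby("specialty")["in_top"].mean() * 100)
print(share.round(1))
```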


A MapReduce-Based Workflow BIG-Log Clustering Technique (맵리듀스기반 워크플로우 빅-로그 클러스터링 기법)

  • Jin, Min-Hyuck;Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.20 no.1 / pp.87-96 / 2019
  • In this paper, we propose a MapReduce-supported clustering technique for collecting and classifying distributed workflow enactment event logs as a preprocessing tool. We call these distributed workflow enactment event logs Workflow BIG-Logs because they fit the 5V properties of big data: volume, velocity, variety, veracity, and value. The clustering technique developed in this paper is intentionally devised for the preprocessing phase of a specific workflow process mining and analysis algorithm that operates on Workflow BIG-Logs. In other words, it uses the MapReduce framework as the Workflow BIG-Log processing platform, supports the IEEE XES standard data format, and is ultimately dedicated to the preprocessing phase of the ρ-Algorithm, a typical workflow process mining algorithm based on structured information control nets. More precisely, Workflow BIG-Logs can be classified into two types of clustering patterns, activity-based and performer-based, and we implement an activity-based clustering algorithm on the MapReduce framework. Finally, we verify the proposed clustering technique through an experimental study on the workflow enactment event log dataset released by the BPI Challenges.
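
The activity-based clustering pass has the usual map/shuffle/reduce shape: the map keys each enactment event by its activity, and the reduce collects the cases and performers per activity. The sketch below simulates that flow in plain Python over a simplified event schema, not the full IEEE XES format or the ρ-Algorithm preprocessing itself.

```python
from collections import defaultdict

# One workflow enactment event: (case_id, activity, performer).
# The schema is a simplification of an IEEE XES trace, not the full standard.
events = [
    ("case1", "register", "alice"), ("case1", "review", "bob"),
    ("case2", "register", "alice"), ("case2", "approve", "carol"),
]

def map_phase(event):
    """Activity-based clustering: key each event by its activity so the
    shuffle groups all occurrences of the same activity together."""
    case_id, activity, performer = event
    yield activity, (case_id, performer)

def reduce_phase(activity, values):
    """Collect the cases and performers observed for one activity cluster."""
    cases = sorted({c for c, _ in values})
    performers = sorted({p for _, p in values})
    return activity, {"cases": cases, "performers": performers}

# Shuffle/sort step that a MapReduce framework would normally provide.
shuffled = defaultdict(list)
for event in events:
    for key, value in map_phase(event):
        shuffled[key].append(value)

clusters = dict(reduce_phase(k, v) for k, v in shuffled.items())
print(clusters)
```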