• Title/Summary/Keyword: Gap 클러스터링 (Gap clustering)

Clustering of Web Objects with Similar Popularity Trends (유사한 인기도 추세를 갖는 웹 객체들의 클러스터링)

  • Loh, Woong-Kee
    • The KIPS Transactions:PartD / v.15D no.4 / pp.485-494 / 2008
  • Huge amounts of various web items such as keywords, images, and web pages are being made widely available on the Web. The popularities of such web items change continuously over time, and mining temporal patterns in the popularities of web items is an important problem that is useful for several web applications. For example, the temporal patterns in the popularities of search keywords help web search enterprises predict future popular keywords, enabling them to make pricing decisions when marketing search keywords to advertisers. However, the presence of millions of web items makes it difficult to scale up previous techniques for this problem. This paper proposes an efficient method for mining temporal patterns in the popularities of web items. We treat the popularities of web items as time series and propose a gap measure to quantify the similarity between the popularities of two web items. To reduce the computation overhead for this measure, an efficient method using the Fast Fourier Transform (FFT) is presented. We assume that the popularities of web items do not necessarily follow any probability distribution or periodic pattern. For finding clusters of web items with similar popularity trends, we propose to use a density-based clustering algorithm based on the gap measure. Our experiments using the popularity trends of search keywords obtained from the Google Trends web site illustrate the scalability and usefulness of the proposed approach in real-world applications.

New Generation Gap Models for Evolutionary Algorithm in Real Parameter Optimization (실수최적화 진화 알고리즘을 위한 새로운 세대차 모델)

  • Choi, Jun-Seok;Seo, Ki-Sung
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.1 / pp.62-68 / 2009
  • Two new generation gap models with a modified parent-centric recombination (PCX) operator are proposed. First, the self-adaptation generation gap (SGG) model is a control method that keeps the probability of parents being replaced by offspring at a level that yields better performance. Second, the virtual cluster generation gap (VCGG) model uses clustering to extend the distances among parents, which diversifies the individuals. In this model, the distances among parents can be controlled by the size of the clusters. To demonstrate the effectiveness of the two proposed approaches, experiments on three standard test problems are carried out and compared with the most competitive current approaches, CMA-ES and Generalized Generation Gap (G3) with PCX. The two proposed methods are shown to be consistently superior to the other approaches in the study.

A Brief Empirical Investigation of Seaport Clustering by Using Meta-Frontier and Cross-efficiency Models (메타프론티어와 교차효율성 모형을 통한 항만 클러스터링의 실증적 검증소고)

  • Park, Ro-Kyung
    • Korea Trade Review / v.41 no.3 / pp.27-42 / 2016
  • This study investigates seaport clustering by using meta-frontier and cross-efficiency models. The data cover 13 Asian ports in 2009, 2010, and 2013, with 3 inputs (depth, total area, and number of cranes) and 1 output (TEU). Correlation coefficients from the cross-efficiency matrix are used to build the clustering dendrogram. A meta-frontier analysis is then used to investigate whether clustering by the cross-efficiency method increases the meta-efficiency. The main empirical results are as follows. First, the group efficiencies of Busan, Incheon, and Gwangyang ports increase. Second, the meta and group efficiencies of Chinese ports are greater than those of Korean ports. Third, the distortion of the technology gap of Gwangyang is lower than that of Busan and Incheon. Fourth, Gwangyang can increase its efficiency by clustering with Ningbo, Chingtao, Tokyo, and Caosung ports in 2009 and with Dubai port in 2013. Fifth, to enhance efficiency, Busan port should be clustered into group 2 in 2010 and group 1 in 2013, and Incheon port into group 2 in 2010 and 2013. Sixth, it is empirically shown that Busan, Incheon, and Gwangyang ports can increase their efficiency by using the cross-efficiency and meta-frontier models. Port policy planners should promote a clustering policy for Busan with Hong Kong, Shanghai, and Singapore, and for Incheon and Gwangyang with Chingtao, Nagoya, Ningbo, Tokyo, and Kaoshung ports.


An Empirical Comparison and Verification Study on the Seaport Clustering Measurement Using Meta-Frontier DEA and Integer Programming Models (메타프론티어 DEA모형과 정수계획모형을 이용한 항만클러스터링 측정에 대한 실증적 비교 및 검증연구)

  • Park, Ro-Kyung
    • Journal of Korea Port Economic Association / v.33 no.2 / pp.53-82 / 2017
  • The purpose of this study is to show the clustering trend and compare empirical results, as well as to choose the clustering ports for 3 Korean ports (Busan, Incheon, and Gwangyang) by using meta-frontier DEA (Data Envelopment Analysis) and integer models on 38 Asian container ports over the period 2005-2014. The models consider 4 input variables (berth length, depth, total area, and number of cranes) and 1 output variable (container TEU). The main empirical results of the study are as follows. First, the meta-frontier DEA for Chinese seaports identifies Shanghai, Hongkong, Ningbo, Qingdao, and Guangzhou as the most efficient ports (in decreasing order), while the efficient Korean seaports are Busan, Incheon, and Gwangyang. Second, the clustering results of the integer model show that the Busan port should cluster with Dubai, Hongkong, Shanghai, Guangzhou, Ningbo, Qingdao, Singapore, and Kaosiung, while Incheon and Gwangyang should cluster with Shahid Rajaee, Haifa, Khor Fakkan, Tanjung Perak, Osaka, Keelong, and Bangkok ports. Third, clustering through the integer model sharply increases the group efficiency of Incheon (401.84%) and Gwangyang (354.25%), but not that of the Busan port. Fourth, the efficiency ranking comparison between the two models before and after the clustering using the Wilcoxon signed-rank test is matched with the average level of group efficiency (57.88%) and the technology gap ratio (80.93%). The policy implication of this study is that Korean port policy planners should employ meta-frontier DEA as well as integer models when clustering is needed among Asian container ports to enhance efficiency. In addition, Korean seaport managers and port authorities should introduce port development and management plans that account for the reference and clustered seaports after careful analysis.

Decomposition of a Text Block into Words Using Projection Profiles, Gaps and Special Symbols (투영 프로파일, GaP 및 특수 기호를 이용한 텍스트 영역의 어절 단위 분할)

  • Jeong Chang Bu;Kim Soo Hyung
    • Journal of KIISE:Software and Applications / v.31 no.9 / pp.1121-1130 / 2004
  • This paper proposes a method for line and word segmentation of machine-printed text blocks. To separate a text region into lines, it analyses the horizontal projection profile and performs a recursive projection profile cut. For word segmentation, gaps in the text line are first found by connected component analysis, and between-word gaps are then identified by a hierarchical clustering method. In addition, a special symbol detection technique is applied to find two types of special symbols tying between words using their morphologic features. An experiment with 84 text regions from English and Korean documents shows that the proposed method achieves 99.92% word segmentation accuracy, while a commercial OCR software package, Armi 6.0 Pro™, achieves 97.58%.
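
The gap-clustering step in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it finds gaps from a column-wise projection profile rather than by connected component analysis, and it stands in for the hierarchical clustering with a two-cluster cut at the largest jump between sorted gap widths (the same cut that single-linkage clustering of one-dimensional values gives for two clusters). The function names and the binary-image input format are assumptions made for illustration.

```python
def vertical_projection(line_img):
    """Count ink pixels in each column of a binary text-line image (1 = ink)."""
    return [sum(col) for col in zip(*line_img)]


def find_gaps(profile):
    """Return (start, width) for every run of empty columns lying between ink."""
    gaps, start, seen_ink = [], None, False
    for x, count in enumerate(profile):
        if count == 0:
            if seen_ink and start is None:
                start = x              # a gap opens after some ink
        else:
            if start is not None:
                gaps.append((start, x - start))
                start = None           # the gap closes at the next ink column
            seen_ink = True
    return gaps                        # leading and trailing blanks are ignored


def split_threshold(widths):
    """Cut gap widths into two clusters at the largest jump between sorted values."""
    w = sorted(set(widths))
    if len(w) < 2:
        return None                    # all gaps look alike: no between-word class
    _, i = max((w[k + 1] - w[k], k) for k in range(len(w) - 1))
    return (w[i] + w[i + 1]) / 2


def segment_words(line_img):
    """Return half-open column ranges (left, right) of the words in one text line."""
    profile = vertical_projection(line_img)
    ink = [x for x, count in enumerate(profile) if count > 0]
    if not ink:
        return []
    gaps = find_gaps(profile)
    thr = split_threshold([width for _, width in gaps])
    words, left = [], ink[0]
    for start, width in gaps:
        if thr is not None and width > thr:   # wide gap: treat as a word boundary
            words.append((left, start))
            left = start + width
    words.append((left, ink[-1] + 1))
    return words
```

For a one-row image `[1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1]`, the one-column gap is treated as a between-character gap and the three-column gap as a word boundary, so `segment_words` returns two column ranges. When all gaps have the same width, the sketch returns the whole line as a single word.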

Fuzzy Controller Modeling for Electromagnetic Levitation Systems based on Clustering Algorithm (클러스터링에 기초한 자기부상시스템의 퍼지제어기 모델링)

  • Kim, Min-Soo;Byun, Yeun-Sub;Lee, Kwan-Sup
    • Proceedings of the KSR Conference / 2006.11a / pp.145-159 / 2006
  • This paper describes the development of a clustering-based fuzzy controller for an electromagnetic suspension vehicle, using a gain scheduling method and a Kalman filter for a simplified single-magnet system. Electromagnetic suspension vehicle systems are highly nonlinear and essentially unstable. To achieve levitation control of the DC electromagnetic suspension system, we considered a fuzzy system modeling method based on a clustering algorithm, in which a set of input/output data is collected from a well-defined Linear Quadratic Gaussian (LQG) controller. Simulation results show that the proposed clustering-based fuzzy controller methodology robustly yields uniform performance with adequate gap response over the mass variation range.


A Theoretical Study of Designing Thesaurus Browser by Clustering Algorithm (클러스터링을 이용한 시소러스 브라우저의 설계에 대한 이론적 연구)

  • Seo, Hwi
    • Journal of Korean Library and Information Science Society / v.30 no.3 / pp.427-456 / 1999
  • This paper deals with the problems of information retrieval through full-text databases, which arise both from deficiencies in the search strategies or methods of information searchers and from the difficulties of query representation, generation, extension, etc. In order to solve these problems, we should use automatic retrieval instead of the manual retrieval of the past. One way to narrow the gap between the terms used by writers and the queries of searchers is to search with the terms that the writers use. Thus, a precondition for solving these problems is that all areas of information retrieval, such as content analysis, information structure, query formation, and query evaluation, be handled in a coherent way. We need to deal with all the areas of automatic information retrieval for the sake of retrieval efficiency, though this paper focuses on the design of a thesaurus browser. Thus, this paper presents theoretical analyses of the forms of information retrieval, automatic indexing, clustering techniques, the construction and representation of thesauri, and information retrieval techniques. As a result of these analyses, this paper presents a theoretical model, namely a thesaurus browser based on a clustering algorithm. The results of the paper will serve as a theoretical basis for new retrieval algorithms.


A Study On The Optimum Node Deployment In The Wireless Sensor Network System (무선 센서 네트워크의 최적화 노드배치에 관한 연구)

  • Choi, Weon-Gap;Park, Hyung-Moo
    • Journal of IKEEE / v.11 no.3 / pp.100-107 / 2007
  • One of the fundamental problems in wireless sensor networks is the efficient deployment of sensor nodes. The Fuzzy C-Means (FCM) clustering algorithm is proposed to determine the optimum locations and minimum number of sensor nodes for a specific application space. We performed a simulation and an experiment using two rectangular areas and one L-shaped area. We found the minimum number of sensor nodes for complete coverage of the modeled area and discovered the optimum location of each node. The real deployment experiment using sensor nodes shows error-free communication rates of 94.6%, 92.2%, and 95.7%, respectively.

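
For readers unfamiliar with Fuzzy C-Means, the following is a minimal sketch of the standard algorithm in plain Python. The abstract above does not describe how the authors apply FCM to node placement, so treating the resulting cluster centers as candidate sensor locations is only an assumption here, and the function name, parameters, and defaults are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM: return (centers, memberships) for a list of coordinate tuples.

    m > 1 is the fuzzifier; memberships[i][j] is the degree to which
    point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships, normalised per point.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)))
        # Membership update: proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i, p in enumerate(points):
            dists = [math.dist(p, ctr) for ctr in centers]
            if min(dists) == 0.0:
                # The point coincides with a center: give it full membership there.
                new = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                new = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(new)
            new = [v / total for v in new]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:   # stop once memberships have stabilised
            break
    return centers, u
```

A call such as `fuzzy_c_means(sample_points, c=3)` on points sampled from the area to be covered would return three centers; repeating it with increasing `c` until a separate coverage criterion is met is one conceivable way to search for a minimum node count, though that criterion is outside this sketch.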

A Method for Determining the Peak Level of Risk in Root Industry Work Environment using Machine Learning (기계학습을 이용한 뿌리산업 작업 환경 위험도 피크레벨 결정방법)

  • Sang-Min Lee;Jun-Yeong Kim;Suk-Chan Kang;Kyung-Jun Kim
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.19 no.1 / pp.127-136 / 2024
  • Because the hazardous working environments and high labor intensity of the root industry can potentially impact the health of workers, current regulations have focused on measuring and controlling environmental factors on a semi-annual basis. However, there is a lack of quantitative criteria addressing workers' health conditions other than the physical work environment. This gap makes it challenging to prevent occupational diseases resulting from continuous exposure to harmful substances below regulatory thresholds. Therefore, this paper proposes a machine learning-based method for determining the peak level of risk in root industry work environments and uses this approach to enable real-time safety assessment in workplaces.

A Grading Method for Student's Achievements Based on the Clustering Technique (클러스터링에 기반한 학업성적의 등급화 방법)

  • Park, Eun-Jin;Chung, Hong;Jang, Duk-Sung
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.2 / pp.151-156 / 2002
  • There are two methods for evaluating students' achievement: absolute evaluation and relative evaluation. Each has many advantages, but also some limitations, such as being too stereotyped or causing overcompetition among learners. This paper suggests a new evaluation method that evaluates students' achievements by considering the score distribution and the frequency. The proposed method classifies the scores into several clusters according to their goodness. This approach calculates the goodness by applying the RE (relaxation error) and grades the achievement scores based on the goodness. The suggested method can avoid the grading problem caused by narrow gaps between scores because it sets a grading standard from the calculated goodness, considering the score distribution and frequency of occurrence. The method can differentiate the achievements of one school from those of others, and it is useful for selecting advanced and low-achieving students and for evaluating classes based on students' achievement.