• Title/Summary/Keyword: Variable Clustering

An Empirical Comparison and Verification Study on the Seaport Clustering Measurement Using Meta-Frontier DEA and Integer Programming Models (메타프론티어 DEA모형과 정수계획모형을 이용한 항만클러스터링 측정에 대한 실증적 비교 및 검증연구)

  • Park, Ro-Kyung
    • Journal of Korea Port Economic Association
    • /
    • v.33 no.2
    • /
    • pp.53-82
    • /
    • 2017
  • The purpose of this study is to show the clustering trend, compare empirical results, and choose clustering partners for three Korean ports (Busan, Incheon, and Gwangyang) by applying meta-frontier DEA (Data Envelopment Analysis) and integer programming models to 38 Asian container ports over the period 2005-2014. The models consider four input variables (berth length, depth, total area, and number of cranes) and one output variable (container throughput in TEU). The main empirical results are as follows. First, the meta-frontier DEA for Chinese seaports identifies Shanghai, Hong Kong, Ningbo, Qingdao, and Guangzhou as the most efficient ports (in decreasing order), while the efficient Korean seaports are Busan, Incheon, and Gwangyang. Second, the clustering results of the integer model show that Busan should cluster with Dubai, Hong Kong, Shanghai, Guangzhou, Ningbo, Qingdao, Singapore, and Kaohsiung, while Incheon and Gwangyang should cluster with Shahid Rajaee, Haifa, Khor Fakkan, Tanjung Perak, Osaka, Keelung, and Bangkok. Third, clustering through the integer model sharply increases the group efficiency of Incheon (401.84%) and Gwangyang (354.25%), but not that of Busan. Fourth, a Wilcoxon signed-rank comparison of the efficiency rankings from the two models before and after clustering is consistent with the average group efficiency (57.88%) and the technology gap ratio (80.93%). The policy implication is that Korean port policy planners should employ both meta-frontier DEA and integer models when clustering among Asian container ports is needed to enhance efficiency. In addition, Korean seaport managers and port authorities should, after careful analysis, introduce port development and management plans that account for the reference and clustered seaports.
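
The relationship between group efficiency, meta-frontier efficiency, and the technology gap ratio mentioned in the abstract can be illustrated with a minimal sketch. It assumes a deliberately simplified setting (one input, one output, constant returns to scale), in which a unit's DEA efficiency reduces to its output/input ratio divided by the best ratio on the frontier. The study itself uses four inputs and solves full DEA programs; the port names, groups, and figures below are hypothetical.

```python
def crs_efficiency(units, frontier_units):
    """Single-input, single-output CRS efficiency of each unit
    measured against the best output/input ratio in frontier_units."""
    best = max(out / inp for inp, out in frontier_units.values())
    return {name: (out / inp) / best for name, (inp, out) in units.items()}

# Hypothetical data: name -> (input, output), e.g. (cranes, TEU in millions).
groups = {
    "group_1": {"A": (10.0, 20.0), "B": (8.0, 12.0)},
    "group_2": {"C": (5.0, 5.0), "D": (4.0, 3.0)},
}
all_units = {name: io for members in groups.values() for name, io in members.items()}

# Meta-frontier efficiency: every unit is compared with the best unit overall.
meta_eff = crs_efficiency(all_units, all_units)

# Group efficiency: each unit is compared only with units in its own group.
group_eff = {}
for members in groups.values():
    group_eff.update(crs_efficiency(members, members))

# Technology gap ratio: how close the group frontier is to the meta-frontier.
tgr = {name: meta_eff[name] / group_eff[name] for name in all_units}

for name in sorted(all_units):
    print(name, round(group_eff[name], 3), round(meta_eff[name], 3), round(tgr[name], 3))
```

In this toy data set, units C and D are efficient or nearly so within their own group but sit far from the meta-frontier, which is what a technology gap ratio below 1 expresses.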

Social Network Analysis for the Effective Adoption of Recommender Systems (추천시스템의 효과적 도입을 위한 소셜네트워크 분석)

  • Park, Jong-Hak;Cho, Yoon-Ho
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.305-316
    • /
    • 2011
  • A recommender system uses automated information filtering technology to recommend products or services to customers who are likely to be interested in them. Such systems are widely used by Web retailers such as Amazon.com, Netflix.com, and CDNow.com. Among the various recommender systems that have been developed, Collaborative Filtering (CF) is known as the most successful and commonly used approach. CF identifies customers whose tastes are similar to those of a given customer and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. However, the relative performance of CF algorithms is known to be domain and data dependent. Implementing and launching a CF recommender system is time-consuming and expensive, and a system unsuited to the given domain gives customers poor-quality recommendations that easily annoy them. Therefore, predicting in advance whether the performance of a CF recommender system will be acceptable is of practical importance. In this study, we propose a decision-making guideline that helps decide whether CF is adoptable for a given application with certain transaction data characteristics. Several previous studies reported that sparsity, gray sheep, cold-start, coverage, and serendipity could affect the performance of CF, but theoretical and empirical justification of such factors is lacking. Recently, many studies have paid attention to Social Network Analysis (SNA) as a method for analyzing social relationships among people. SNA measures and visualizes linkage structure and status, focusing on interaction among objects within a communication group. CF analyzes the similarity among the previous ratings or purchases of each customer, finds relationships among customers who have similarities, and then uses those relationships for recommendations.
Thus CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. Under the assumption that SNA can facilitate an exploration of the topological properties of the network structure implicit in transaction data for CF recommendations, we focus on density, clustering coefficient, and centralization, which are among the most commonly used measures of the topological properties of a social network. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. We explore how these SNA measures affect the performance of CF and how they interact with each other. Our experiments used sales transaction data from H department store, one of the well-known department stores in Korea. A total of 396 data sets were sampled to construct various types of social networks. The measurement process for the independent variables consists of three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used UCINET 6.0 for SNA. The experiments applied a three-way ANOVA with the three SNA measures as independent variables and recommendation accuracy, measured by the F1-measure, as the dependent variable. The experiments report that 1) each of the three SNA measures affects recommendation accuracy, 2) the effect of density on performance overrides those of clustering coefficient and centralization (i.e., CF adoption is not a good decision if the density is low), and 3) even when the density is low, CF performs comparatively well if the clustering coefficient is low.
We expect these experimental results to help firms decide whether a CF recommender system is adoptable for their business domain given certain transaction data characteristics.
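
The three SNA measures named in the abstract can be sketched for a small undirected network as follows. This is an illustration only: the study computed its measures with UCINET, and the abstract does not state which variants were used. The sketch assumes the average local clustering coefficient and Freeman's degree centralization, and the two example graphs are hypothetical.

```python
from itertools import combinations

def density(adj):
    """Share of possible links that are present (undirected graph)."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    return 2 * m / (n * (n - 1))

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than two
    neighbors contribute 0 (an assumption of this sketch)."""
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def degree_centralization(adj):
    """Freeman degree centralization: 1 for a star, 0 when all degrees are equal."""
    n = len(adj)
    degrees = [len(neighbors) for neighbors in adj.values()]
    return sum(max(degrees) - d for d in degrees) / ((n - 1) * (n - 2))

# Hypothetical customer networks: node -> set of linked customers.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}

for name, graph in (("star", star), ("triangle", triangle)):
    print(name, density(graph), average_clustering(graph), degree_centralization(graph))
```

The star and the triangle sit at opposite extremes: the star is fully centralized with no local clustering, while the triangle is fully dense and clustered with no centralization.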

A Study of Market Segmentation of Optical Shop Based on Customer's Values (고객의 가치관에 따른 안경원의 시장세분화에 관한 연구)

  • Lee, Jung-Kyu;Cha, Jung-Won
    • Journal of Korean Ophthalmic Optics Society
    • /
    • v.20 no.4
    • /
    • pp.405-414
    • /
    • 2015
  • Purpose: We analyse the characteristics of the segmented market of optical shop customers by using clustering analysis, and we expect the results to be a useful indicator for the marketing strategy of optical shops. Methods: A survey was conducted from March 10 to March 31, 2015, among customers who had visited optical shops in Seoul and the northern Gyeonggi-do region, and the responses were analysed with the SPSS v.10.0 statistical package. The methods of analysis were frequency analysis, factor analysis of the value variables, clustering analysis for market segmentation, and crosstabs. Results: The market is segmented based on values. When establishing a marketing strategy, it is useful to classify customers into 3 types of cluster: "middle level value oriented cluster", "high level value oriented cluster", and "high level value oriented and non-religious cluster". For progressive lenses, the most important strategy turned out to be targeting self-employed persons in the "middle level value oriented cluster". Conclusions: Market segmentation by clustering analysis classified customers into 3 types of cluster, and we found that the most important customers for progressive lenses are self-employed persons in the "middle level value oriented cluster" who are more than 41 years old.

Comparing MCMC algorithms for the horseshoe prior (Horseshoe 사전분포에 대한 MCMC 알고리듬 비교 연구)

  • Miru Ma;Mingi Kang;Kyoungjae Lee
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.1
    • /
    • pp.103-118
    • /
    • 2024
  • The horseshoe prior is one of the most popular priors in sparse regression models, where only a small fraction of coefficients are nonzero. The parameter space of the horseshoe prior is much smaller than that of the spike and slab prior, which makes it possible to explore the parameter space efficiently even in high dimensions. On the other hand, the horseshoe prior has a high computational cost for each iteration of the Gibbs sampler. To overcome this issue, various MCMC algorithms for the horseshoe prior have been proposed to reduce the computational burden. In particular, Johndrow et al. (2020) recently proposed an approximate algorithm that can significantly improve the mixing and speed of the MCMC algorithm. In this paper, we compare (1) the traditional MCMC algorithm, (2) the approximate MCMC algorithm proposed by Johndrow et al. (2020), and (3) its variant in terms of computing time, estimation, and variable selection performance. For variable selection, we adopt the sequential clustering-based method suggested by Li and Pati (2017). The practical performance of the MCMC methods is demonstrated via numerical studies.
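
As background for the abstract above, the following minimal sketch draws coefficients from the horseshoe prior in its usual hierarchical form, where each coefficient is normal with scale `tau * lambda_j` and each local scale `lambda_j` follows a standard half-Cauchy distribution. It only samples from the prior and is not any of the MCMC algorithms compared in the paper; the function name, the fixed global scale `tau`, and the seed are assumptions made for illustration.

```python
import math
import random

def sample_horseshoe(p, tau=1.0, seed=0):
    """Draw p coefficients from a horseshoe prior with fixed global scale tau.

    lambda_j ~ half-Cauchy(0, 1), beta_j | lambda_j ~ Normal(0, (tau * lambda_j)^2).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(p):
        # Inverse-CDF draw from a standard Cauchy, folded to the positive half-line.
        lam = abs(math.tan(math.pi * (rng.random() - 0.5)))
        draws.append(rng.gauss(0.0, tau * lam))
    return draws

betas = sample_horseshoe(p=1000, tau=0.1)
near_zero = sum(1 for b in betas if abs(b) < 0.05)
print("share of draws with |beta| < 0.05:", near_zero / len(betas))
print("largest |beta|:", max(abs(b) for b in betas))
```

Most draws are shrunk close to zero while a few are very large, which is the behaviour that makes the prior suitable for sparse regression.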

Comparison of Classification Rate for PD Sources using Different Classification Schemes

  • Park Seong-Hee;Lim Kee-Joe;Kang Seong-Hwa
    • Journal of Electrical Engineering and Technology
    • /
    • v.1 no.2
    • /
    • pp.257-262
    • /
    • 2006
  • Insulation failure in an electrical utility depends on the continuous stress imposed upon it. Monitoring the insulation condition is a significant issue for safe operation of the electrical power system. This paper compares the recognition rates of various classification schemes for PD (partial discharge) sources that occur within an electrical utility. To acquire PD data, five defective models are made: air discharge, void discharge, and three types of treeing discharge. The statistical distributions of the PD data are then used as input data for the classification tools. ANFIS shows the highest rate, 99%, and PCA-LDA and ANFIS are superior to BP in other respects.

Speed Control of BLDC Motor Drive Using an Adaptive Fuzzy P+ID Controller (적응 퍼지 P+ID 제어기를 이용한 BLDC 전동기의 속도제어)

  • Kwon, Chung-Jin;Han, Woo-Yang;Sin, Dong-Yang;Kim, Sung-Joong
    • Proceedings of the KIEE Conference
    • /
    • 2002.07b
    • /
    • pp.1172-1174
    • /
    • 2002
  • An adaptive fuzzy P+ID controller for variable speed operation of BLDC motor drives is presented in this paper. The conventional PID controller is the most widely used in industry because of its simple control structure and ease of design. However, the PID controller suffers from electrical machine parameter variations and disturbances. To improve tracking performance under parameter and load variations, the proposed controller uses an adaptive fuzzy logic controller in place of the proportional term of a conventional PID controller. Implementing this controller requires adjusting only one additional parameter compared with the PID controller. The adaptive fuzzy controller applied to the proportional term to achieve robustness against parameter variations has a simple structure and is computationally simple. The controller, based on an optimal fuzzy logic controller, has a self-tuning characteristic with clustering. Computer simulation results show the usefulness of the proposed controller.

Chaotic Features for Traffic Video Classification

  • Wang, Yong;Hu, Shiqiang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.8
    • /
    • pp.2833-2850
    • /
    • 2014
  • This paper proposes a novel framework for traffic video classification based on chaotic features. First, the intensity series of each pixel in the video is modeled as a time series. Second, chaos theory is employed to generate chaotic features, and each video is then represented by a feature vector matrix. Third, the mean shift clustering algorithm is used to cluster the feature vectors. Finally, the earth mover's distance (EMD) is employed to obtain a distance matrix by comparing the similarity of the segmentation results. The distance matrix is transformed into a matching matrix, which is evaluated in the classification task. Experimental results show good traffic video classification performance, with robustness to environmental conditions such as occlusions and variable lighting.
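
To give a feel for the earth mover's distance used in the final step, here is a minimal sketch for the simplest case: two histograms over the same equally spaced bins with equal total mass, where the EMD equals the sum of absolute differences between the cumulative sums. The paper compares multi-dimensional cluster signatures, which requires solving a transportation problem, so this one-dimensional version and its example histograms are simplifying assumptions.

```python
from itertools import accumulate

def emd_1d(p, q):
    """Earth mover's distance between two 1-D histograms on the same
    unit-spaced bins; both histograms must carry the same total mass."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # Mass that has to cross each bin boundary equals the gap between the CDFs.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

# Hypothetical histograms: all mass moved two bins to the right.
print(emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))
# Half of the mass moved two bins, or equivalently all of it moved one bin.
print(emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5]))
```

Unlike a bin-by-bin difference, the distance grows with how far mass has to travel, which is why it suits comparisons between cluster layouts.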

A Linear Sliding Surface Design Method for a Class of Uncertain Systems with Mismatched Uncertainties (불확실성이 매칭조건을 만족시키지 않는 선형 시스템을 위한 슬라이딩 평면 설계 방법)

  • Choi, Han-Ho
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.9 no.11
    • /
    • pp.861-867
    • /
    • 2003
  • We propose a sliding surface design method for linear systems with mismatched uncertainties in the state space model. In terms of LMIs, we derive a necessary and sufficient condition for the existence of a linear sliding surface such that the reduced-order equivalent sliding mode dynamics restricted to the linear sliding surface is not only stable but completely invariant to mismatched uncertainties. We give an explicit formula of all such linear switching surfaces in terms of solution matrices to the LMI existence condition. We also give a switching feedback control law, together with a design algorithm. Additionally, we give some hints for designing linear switching surfaces guaranteeing pole clustering constraints or linear quadratic performance bound constraints. Finally, we give a design example in order to show the effectiveness of the proposed methodology.

Waste Database Analysis Joined with Local Information Using Decision Tree Techniques

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2005.04a
    • /
    • pp.164-173
    • /
    • 2005
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules from massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. We analyze a waste database combined with local information using decision tree techniques for environmental information. The decision tree outputs can be used for environmental preservation and improvement.
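
The statement that a decision tree divides the search space into rectangular regions can be illustrated with a single axis-aligned split. The sketch below searches every feature and midpoint threshold for the split with the lowest weighted Gini impurity; applying such splits recursively yields the rectangles. The choice of Gini impurity and the tiny data set are assumptions for illustration, since the abstract does not name the algorithm used.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best
    axis-aligned split, or None if no split is possible."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold halfway between neighbors
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical records: (feature 0, feature 1) with a two-class label.
rows = [(1.0, 5.0), (2.0, 1.0), (8.0, 4.0), (9.0, 2.0)]
labels = ["low", "low", "high", "high"]
print(best_split(rows, labels))
```

Here a single threshold on the first feature separates the two classes, so the space is cut into two half-planes; further splits inside each half would produce smaller rectangles.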

A Computer-Aided Statistical Approach to Strategic Information Systems Planning (정보시스템 전략적 계획을 위한 컴퓨터지원 통계적 접근방법)

  • Kim, Jin-Su;Hwang, Cheol-Eon
    • Asia pacific journal of information systems
    • /
    • v.4 no.2
    • /
    • pp.188-213
    • /
    • 1994
  • Strategic information systems planning (SISP) remains a critical issue for many organizations and the top IS concern of chief executives. Researchers have therefore investigated SISP practices and tried to improve the methodology. Among the various issues of SISP, systematically determining subject database groupings and fully automating the process are important aspects. This study presents an alternative methodology that uses a statistical technique, a variable clustering approach, together with systematic rules for determining database groupings, and that can be fully automated. The methodology provides a strong theoretical justification, systematic and simple criteria for database groupings, and enhanced interpretability of the output, and it would be easy to include in a CASE software application.
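
The idea of variable clustering, grouping variables rather than observations, can be sketched as follows. This is not the procedure of the paper, whose details the abstract does not give. The sketch assumes a simple rule: two variables join the same group when the absolute value of their Pearson correlation reaches a threshold, and groups are merged transitively. The variable names, data, and threshold are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_variables(columns, threshold=0.8):
    """Group variables whose absolute correlation is at least threshold,
    merging groups transitively (single linkage)."""
    names = list(columns)
    cluster_of = {name: i for i, name in enumerate(names)}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                old, new = cluster_of[b], cluster_of[a]
                for name in names:  # relabel every member of b's group
                    if cluster_of[name] == old:
                        cluster_of[name] = new
    groups = {}
    for name in names:
        groups.setdefault(cluster_of[name], []).append(name)
    return sorted(groups.values())

# Hypothetical variables measured on four observations.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0],
    "x2": [2.0, 4.0, 6.0, 8.0],
    "x3": [4.0, 1.0, 3.0, 2.0],
    "x4": [8.0, 2.0, 6.0, 4.0],
}
print(cluster_variables(columns))
```

With this data, `x1` and `x2` move together, as do `x3` and `x4`, so two groups of variables result.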
