• Title/Abstract/Keywords: Industrial Clustering

399 search results (processing time: 0.02 seconds)

융합 인공벌군집 데이터 클러스터링 방법 (Combined Artificial Bee Colony for Data Clustering)

  • 강범수;김성수
    • 산업경영시스템학회지 / Vol. 40, No. 4 / pp. 203-210 / 2017
  • Data clustering is one of the most difficult and challenging problems and can be formally considered a particular kind of NP-hard grouping problem. The K-means algorithm is one of the most popular and widely used clustering methods because it is easy to implement and very efficient. However, it is prone to getting trapped in local optima, and on large data sets its solutions vary widely with different initializations. An efficient computational intelligence method is therefore needed to find the global optimal solution to the data clustering problem within limited computational time. This paper proposes a combined artificial bee colony (CABC) method that uses K-means for initialization and finalization to find effective solutions to the data clustering optimization problem. The artificial bee colony (ABC) is an algorithm motivated by the intelligent behavior honeybees exhibit when searching for food. ABC performs as well as or better than other population-based algorithms, with the added advantage of employing fewer control parameters. The proposed CABC method provides near-optimal solutions within reasonable time by balancing converged and diversified searches. Experiments and analysis on clustering problems demonstrate that CABC is competitive with previous partitioning approaches in solution quality. We validate the performance of CABC against previous studies using the Iris, Wine, Glass, Vowel, and Cloud datasets from the UCI machine learning repository. In our simulations, the proposed KABCK (K-means+ABC+K-means) outperforms ABCK (ABC+K-means), KABC (K-means+ABC), ABC, and K-means.
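
The K-means+ABC+K-means pipeline named in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the neighbor rule and the fitness 1/(1+cost) follow the commonly described ABC scheme, and the function names and parameters (`kabck`, `n_sources`, `cycles`, `limit`) are hypothetical choices made for this example.

```python
import random

def sq_dist(p, c):
    return sum((a - b) ** 2 for a, b in zip(p, c))

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(pt, c) for c in centers) for pt in points)

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for pt in points:
            d = [sq_dist(pt, c) for c in centers]
            groups[d.index(min(d))].append(pt)
        # An empty cluster keeps its previous center.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def neighbor(sources, i, rng):
    """Standard ABC move: perturb one coordinate relative to a random partner."""
    cand = [list(c) for c in sources[i]]
    m = rng.choice([x for x in range(len(sources)) if x != i])
    c, j = rng.randrange(len(cand)), rng.randrange(len(cand[0]))
    cand[c][j] += rng.uniform(-1, 1) * (cand[c][j] - sources[m][c][j])
    return [tuple(ctr) for ctr in cand]

def abc(points, sources, rng, cycles=30, limit=5):
    """Employed, onlooker, and scout phases; returns the best center set seen."""
    sources = list(sources)
    costs = [sse(points, s) for s in sources]
    trials = [0] * len(sources)
    b = costs.index(min(costs))
    best_cost, best = costs[b], sources[b]

    def try_improve(i):
        cand = neighbor(sources, i, rng)
        cost = sse(points, cand)
        if cost < costs[i]:          # greedy selection
            sources[i], costs[i], trials[i] = cand, cost, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(len(sources)):                     # employed bees
            try_improve(i)
        weights = [1 / (1 + c) for c in costs]            # onlooker bees
        for i in rng.choices(range(len(sources)), weights=weights, k=len(sources)):
            try_improve(i)
        b = costs.index(min(costs))                       # memorize the best
        if costs[b] < best_cost:
            best_cost, best = costs[b], sources[b]
        for i in range(len(sources)):                     # scout bees
            if trials[i] > limit:
                sources[i] = rng.sample(points, len(sources[i]))
                costs[i], trials[i] = sse(points, sources[i]), 0
    return best

def kabck(points, k, n_sources=6, seed=0):
    """K-means seeds one food source, ABC searches, K-means polishes the result."""
    rng = random.Random(seed)
    seeded = kmeans(points, rng.sample(points, k))
    sources = [seeded] + [rng.sample(points, k) for _ in range(n_sources - 1)]
    return kmeans(points, abc(points, sources, rng)), seeded
```

Because the K-means solution is one of the initial food sources and selection is greedy, the final cost in this sketch cannot exceed that of the initial K-means run; whether it improves depends on the data and on the assumed parameters.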

텍스트마이닝을 활용한 산업공학 학술지의 논문 주제어간 연관관계 연구 (Finding Meaningful Pattern of Key Words in IIE Transactions Using Text Mining)

  • 조수곤;김성범
    • 대한산업공학회지 / Vol. 38, No. 1 / pp. 67-73 / 2012
  • Identification of meaningful patterns and trends in large volumes of text data is an important task in various research areas. In the present study, we crawled the keywords from the abstracts in IIE Transactions, one of the representative journals in the field of Industrial Engineering, from 1969 to 2011. We applied a low-dimensional embedding method, clustering analysis, association rules, and social network analysis to find meaningful associative patterns among the keywords that appeared frequently in the papers.

불균형 이분 데이터 분류분석을 위한 데이터마이닝 절차 (A Data Mining Procedure for Unbalanced Binary Classification)

  • 정한나;이정화;전치혁
    • 대한산업공학회지 / Vol. 36, No. 1 / pp. 13-21 / 2010
  • Predicting customers' contract cancellations is essential for insurance companies, but it is a difficult problem because the customer database is large and the target (cancelled) customers make up a small proportion of it. This paper proposes a new data mining approach to binary classification that handles large-scale unbalanced data. The approach incorporates over-sampling, clustering, regularized logistic regression, and boosting. It was applied to a real data set from the insurance domain, and the results were compared with those of other classification techniques.

서비스 부문의 기술혁신목적별 정부 지원제도의 활용도 분석 연구 (Data Mining for the Effectiveness of Government Support Strategies for Technology Innovation in Service Sectors)

  • 황두현;김우진;손소영
    • 산업공학 / Vol. 21, No. 2 / pp. 237-246 / 2008
  • In today's competitive global environment, technological innovation is an important issue. Many countries are devising national level strategies to further strengthen industrial capacity in support of innovative companies. South Korea is no exception, and multiple strategies are in place to aid innovative development in the private sector. This study postulates that such national level strategies are applied differently depending on the innovation goal pursued by the service sector in Korea. We use data mining methods to test this research hypothesis. Factor analysis is used to cluster various service companies, while association rules are used to find the relationships within each cluster. The results show that national level strategies are underutilized and unequally distributed. This may be attributed to the disparity between the needs of the private sector and the views of the government, which leads to underutilized and indistinguishable strategies.

모델 축소를 위한 그룹 모델 클러스터링 방법에 대한 연구 (Group Model Clustering Method for Model Downsizing)

  • 박미나;하진영
    • 산업기술연구 / Vol. 28, No. A / pp. 185-189 / 2008
  • Practical pattern recognition systems must cope with problems that involve a very large number of classes. Sometimes it is almost impossible to build a model for every class due to memory and time constraints. In such cases, grouping similar models is helpful. In this paper, we propose GMC (Group Model Clustering) to build a large-class Chinese character recognition system. We built hidden Markov models for 10% of the total classes, then classified the remaining classes into the already trained group classes. Finally, group models are trained using the group-model-clustered data. Recognition is performed using only the group models, in order to reduce model size and improve recognition speed.

Industrial Waste Database Analysis Using Data Mining Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 2 / pp. 455-465 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules from massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These outputs can be used for environmental preservation and environmental improvement.

Analyzing Offshore Wind Power Patent Portfolios by Using Data Clustering

  • Chang, Shu-Hao;Fan, Chin-Yuan
    • Industrial Engineering and Management Systems / Vol. 13, No. 1 / pp. 107-115 / 2014
  • Offshore wind power has been extremely popular in recent years, and relevant research in the energy technology field has increased. However, research on patent portfolios is still insufficient. The purpose of this research is to study the status of mainstream offshore wind power technology and patent portfolios and to investigate major assignees and countries to obtain a thorough understanding of the developmental trends of offshore wind power technology. The findings may be used by the government and industry for designing additional strategic development proposals. Data mining methods, such as multiple correspondence analysis and k-means clustering, were implemented to explore the competing technological and strategic-group relationships within the offshore wind power industry. The results indicate that the technological positions and patent portfolios of the countries and manufacturers differ. Additional technological development strategy recommendations are proposed for the offshore wind power industry.

신경망 및 통계적 방법에 의한 클러스터링 성능평가 (A Study on Performance Evaluation of Clustering Algorithms using Neural and Statistical Method)

  • 윤석환;민준영;신용백
    • 산업경영시스템학회지 / Vol. 19, No. 37 / pp. 41-51 / 1996
  • This paper evaluates the clustering performance of a neural network method and a statistical method. The algorithms used are GLVQ (Generalized Learning Vector Quantization) as the neural method and the k-means algorithm as the statistical clustering method. To compare the two methods, we calculate Rand's c statistic. The mean c value obtained with GLVQ is higher than that obtained with the k-means algorithm, while the standard deviation of the c value is lower. The experimental data sets were Fisher's IRIS data and patterns extracted from handwritten numerals.
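
For readers unfamiliar with the comparison measure, here is a small sketch of the Rand statistic, assuming that "Rand's c" refers to the standard Rand index: the fraction of point pairs on which two partitions agree (both place the pair together, or both place it apart). The function name `rand_index` is a hypothetical choice for this example.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs treated the same way by both partitions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]   # pair grouped together in A?
        same_b = labels_b[i] == labels_b[j]   # pair grouped together in B?
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)

# Relabeling the clusters does not change the partition, so agreement is perfect.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

A value closer to 1 means the clustering agrees more closely with the reference labels, which is how a higher mean c can be read in the abstract above.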

Industrial Waste Database Analysis Using Data Mining

  • 조광현;박희창
    • 한국데이터정보과학회:학술대회논문집 / 한국데이터정보과학회 2006 PROCEEDINGS OF JOINT CONFERENCE OF KDISS AND KDAS / pp. 241-251 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules from massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These analysis outputs can be used for environmental preservation and environmental improvement.

Clustering Parts Based on the Design and Manufacturing Similarities Using a Genetic Algorithm

  • Lee, Sung-Youl
    • 한국산업정보학회논문지 / Vol. 16, No. 4 / pp. 119-125 / 2011
  • Part family (PF) formation in cellular manufacturing has been a key issue for the successful implementation of Group Technology (GT). A part has two different kinds of attributes: design and manufacturing. Similarity in one attribute often conflicts with similarity in the other. However, both attributes should be taken into account appropriately for the PF to maximize the benefits of the GT implementation. This paper proposes a clustering algorithm that considers the two attributes simultaneously, based on Pareto optimality theory. The similarity in each attribute is represented as an individual objective function. The two objective functions are then combined into a Pareto fitness function that assigns a single fitness value to each solution. A genetic algorithm (GA) is used to find the Pareto optimal set of solutions based on the fitness function. A set of hypothetical parts is grouped using the proposed system. The results show that the proposed system is very promising for clustering with multiple objectives.
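
The abstract does not say how the two objectives are folded into a single Pareto fitness value. One common scheme, shown here purely as an assumption, scores each solution by the number of other solutions that dominate it, so that the Pareto optimal set is exactly the solutions with a count of zero. The names `dominates`, `domination_counts`, and `pareto_set` are hypothetical, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (both objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def domination_counts(scores):
    """Single fitness value per solution: how many others dominate it (lower is better)."""
    return [sum(dominates(other, s) for other in scores) for s in scores]

def pareto_set(scores):
    """Solutions that no other solution dominates."""
    counts = domination_counts(scores)
    return [s for s, c in zip(scores, counts) if c == 0]

# Each tuple is (design similarity, manufacturing similarity) for one candidate grouping.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
print(domination_counts(candidates))  # [0, 0, 0, 1, 4]
print(pareto_set(candidates))         # [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9)]
```

In a GA, such a count could drive selection so that non-dominated groupings are preferred without fixing a trade-off weight between the two similarities in advance.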