• Title/Summary/Keyword: 2-Step Clustering

Similarity Analysis of Hospitalization using Crowding Distance

  • Jung, Yong Gyu;Choi, Young Jin;Cha, Byeong Heon
    • International journal of advanced smart convergence
    • /
    • v.5 no.2
    • /
    • pp.53-58
    • /
    • 2016
  • With the growing use of big data and data mining, it is useful to understand how such techniques can reveal relationships in the healthcare field. This study uses hierarchical methods of data analysis to explore similarities in hospitalization across several New York State counties. It measures the crowding distance of the data for age-specific hospitalization periods. Crowding distance is defined as the longest distance, or least similarity, between cities. Clinton is expected to have the greatest distance, while Albany and the other cities are closer because they are connected by the shortest distance at each step. Similarities were stronger across hospital stays categorized by age. Hierarchical clustering can be applied to predict the similarity of hospitalization data across the 10 cities by measuring crowding distance. To enhance the performance of hierarchical clustering, congestion distances can be compared when crowding distance is applied after first converting text to an attribute vector. The measured similarity between two objects depends on the measurement method used in clustering, but it is distinct from distance: the smaller the distance value, the more similar the two objects are to one another. With this technique, the crowding distance decreases consistently as the similarity between the data increases, which improves the performance of the experiments. Furthermore, the similarity of hospitalization periods by city is expected to be useful when hospital wards are built in these cities: the experimental results can be used to estimate the required size of hospital facilities and to manage patients efficiently in similar areas.

Detection and Classification of Leaf Diseases for Phenomics System (피노믹스 시스템을 위한 식물 잎의 질병 검출 및 분류)

  • Gwan Ik, Park;Kyu Dong, Sim;Min Su, Kyeon;Sang Hwa, Lee;Jeong Hyun, Baek;Jong-Il, Park
    • Journal of Broadcast Engineering
    • /
    • v.27 no.6
    • /
    • pp.923-935
    • /
    • 2022
  • This paper deals with the detection and classification of leaf diseases for phenomics systems. As smart farm systems for plants become more common, it is important to quickly detect abnormal plant growth without supervisors. This paper considers the color distribution and shape information of leaf diseases, and designs two deep learning networks for learning the leaf diseases. In the first step, the color distribution of the input image is analyzed for possible diseases. In the second step, the image is partitioned into small segments using mean shift clustering, and the color information of each segment is inspected by the proposed Color Network. In the third step, when a segment is determined to be diseased, the shape parameters of the segment are extracted and inspected by the proposed Shape Network to classify the leaf disease type. In experiments with two types of diseases (frogeye/rust and tipburn) for apple leaves and iceberg, the leaf diseases are detected with 92.3% recall for a segment and with 99.3% recall for an input image, where there are usually more than two disease segments. The proposed method is useful for detecting leaf diseases quickly in the smart farm environment, and is extendible to various types of new plants and leaf diseases without additional learning.

Hardware Accelerated Design on Bag of Words Classification Algorithm

  • Lee, Chang-yong;Lee, Ji-yong;Lee, Yong-hwan
    • Journal of Platform Technology
    • /
    • v.6 no.4
    • /
    • pp.26-33
    • /
    • 2018
  • In this paper, we propose an image retrieval algorithm for real-time processing and design it as hardware. The proposed method is based on the Bag of Words (BoWs) classification algorithm and proposes an image search algorithm using bit streams. K-fold cross validation is used to verify the algorithm. The data is classified into seven classes of seven images each, so a total of 49 images are tested. The test comprises two kinds of measurement: accuracy and speed. The image classification accuracy was 86.2% for the BoWs algorithm and 83.7% for the proposed hardware-accelerated software implementation, so the BoWs algorithm was 2.5 percentage points higher. The image retrieval processing time is 7.89 s for BoWs and 1.55 s for our algorithm, which makes our algorithm 5.09 times faster than the BoWs algorithm. The algorithm is largely divided into software and hardware parts. The software part is written in C. The Scale Invariant Feature Transform algorithm is used to extract feature points that are invariant to size and rotation from the image, and bit streams are generated from the extracted feature points. In the hardware architecture, the proposed image retrieval algorithm is written in Verilog HDL and designed and verified with an FPGA and Design Compiler. The generated bit streams are stored, the clustering step is performed, and a search-image database or an input-image database is generated and matched. With the proposed algorithm, searching by a database matching method that represents each object can improve user convenience and satisfaction in terms of speed.

Facial expression recognition based on pleasure and arousal dimensions (쾌 및 각성차원 기반 얼굴 표정인식)

  • 신영숙;최광남
    • Korean Journal of Cognitive Science
    • /
    • v.14 no.4
    • /
    • pp.33-42
    • /
    • 2003
  • This paper presents a new system for facial expression recognition based on a dimension model of internal states. Facial expression information is extracted in three steps. In the first step, a Gabor wavelet representation extracts the edges of face components. In the second step, sparse features of facial expressions are extracted using the fuzzy C-means (FCM) clustering algorithm on neutral faces, and in the third step, they are extracted using the Dynamic Model (DM) on the expression images. Finally, we show facial expression recognition based on the dimension model of internal states using a multi-layer perceptron. The two-dimensional structure of emotion shows that it is possible to recognize not only facial expressions related to basic emotions but also expressions of various other emotions.

A Manufacturing Cell Formation Algorithm Using Neural Networks (신경망을 이용한 제조셀 형성 알고리듬)

  • 이준한;김양렬
    • Korean Management Science Review
    • /
    • v.16 no.1
    • /
    • pp.157-171
    • /
    • 1999
  • In an increasingly competitive marketplace, manufacturing companies have no choice but to look for ways to improve productivity in order to sustain their competitiveness and survive in the industry. Recently, cellular manufacturing has been under discussion as an option that can be implemented easily without burdensome capital investment. The objective of cellular manufacturing is to realize many of the efficiencies associated with mass production in less repetitive job-shop production systems. The very first step in cellular manufacturing is to group the sets of parts having similar processing requirements into part families, and the equipment needed to process a particular part family into machine cells. The underlying problem of determining the part and machine assignments for each manufacturing cell is called cell formation. The purpose of this study is to develop a clustering algorithm based on the neural network approach that overcomes the drawbacks of the ART1 algorithm for cell formation problems. In this paper, a generalized learning vector quantization (GLVQ) algorithm was devised to transform a 0/1 part-machine assignment matrix into a matrix with diagonal blocks in such a way as to increase clustering performance. Furthermore, an assignment problem model and a rearrangement procedure have been embedded to increase efficiency. The performance of the proposed algorithm has been evaluated using data sets adopted by prior studies on cell formation. The proposed algorithm dominates almost all the cell formation results reported so far, based on the grouping index ($\alpha$ = 0.2). Among the 27 cell formation problems investigated, the result of the proposed algorithm was superior in 11, equal in 15, and inferior in only 1.

The similarities analysis of location fishing information through 2 step clustering (2단계 군집분석을 통한 해구별 조업정보의 유사성 분석)

  • Cho, Yong-Jun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.20 no.3
    • /
    • pp.551-562
    • /
    • 2009
  • In this paper, I present a method for using the Fishing Operation Information (FOI) of the National Federation of Fisheries Cooperatives (NFFC) through a usability analysis, and identify similarities among sea sections by classifying the characteristics of fishing patterns by location. Although the FOI catch is only about 33% of that in the National Fishery Production Statistics (NFPS), FOI data are useful for understanding fishing operation patterns by location, because both the patterns and the correlations were very similar when the FOI data were compared with NFPS in the usability analysis. I therefore determined optimal clusters for catch, number of fishing days, and number of fishing vessels through 2-step cluster analysis by large marine zone, and classified the fishing patterns.

Classification of Wind Sector in Pohang Region Using Similarity of Time-Series Wind Vectors (시계열 풍속벡터의 유사성을 이용한 포항지역 바람권역 분류)

  • Kim, Hyun-Goo;Kim, Jinsol;Kang, Yong-Heack;Park, Hyeong-Dong
    • Journal of the Korean Solar Energy Society
    • /
    • v.36 no.1
    • /
    • pp.11-18
    • /
    • 2016
  • The local wind systems in the Pohang region were categorized into wind sectors. Thorough knowledge of wind resource assessment, wind environment analysis, and atmospheric environmental impact assessment was required, since the region has outstanding wind resources, is located on the path of typhoons, and has large-scale atmospheric pollution sources. To overcome the resolution limitation of the meteorological dataset and the problems with the categorization criteria of preceding studies, the high-resolution wind resource map of the Korea Institute of Energy Research was used as time-series meteorological data, and a 2-step method was adopted: the clustering coefficient was determined through hierarchical clustering analysis, and the wind sectors were subsequently categorized through non-hierarchical K-means clustering analysis. The Euclidean distance between normalized time-series wind vectors was proposed as the similarity measure. The meteor-statistical characteristics of the mean vector wind distribution and the meteorological variables of each wind sector were compared. The comparison confirmed significant differences among wind sectors according to terrain elevation, mean wind speed, Weibull shape parameter, etc.
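
The 2-step idea in this abstract (hierarchical clustering first, then K-means) can be illustrated with a minimal pure-Python sketch. Everything below is an assumption made for illustration, not the authors' procedure: the toy 2-D vectors, centroid linkage, reading the "clustering coefficient" as the number of clusters, and the largest-gap rule in the hypothetical helper `choose_k`.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def agglomerate(points):
    """Step 1: centroid-linkage agglomerative clustering.
    Returns one (merge_distance, clusters_after_merge) entry per merge."""
    clusters = [[p] for p in points]
    history = []
    while len(clusters) > 1:
        d, i, j = min(
            (dist(centroid(clusters[i]), centroid(clusters[j])), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
        history.append((d, [list(c) for c in clusters]))
    return history

def choose_k(history, n_points):
    """Assumed rule: stop merging just before the largest jump in merge distance."""
    dists = [d for d, _ in history]
    best_gap, best_k = -1.0, 1
    for t in range(len(dists) - 1):
        gap = dists[t + 1] - dists[t]
        if gap > best_gap:
            # after merge number t (0-based) there are n_points - (t + 1) clusters
            best_gap, best_k = gap, n_points - (t + 1)
    return best_k

def kmeans(points, centers, iters=100):
    """Step 2: plain K-means (Lloyd iterations) started from the given centers."""
    def nearest(p):
        return min(range(len(centers)), key=lambda c: dist(p, centers[c]))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p)].append(p)
        new_centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return [nearest(p) for p in points], centers

def two_step(points):
    """Hierarchical pass picks k and seeds; K-means produces the final partition."""
    history = agglomerate(points)
    k = choose_k(history, len(points))
    seeds = [centroid(c) for c in history[len(points) - k - 1][1]]
    labels, centers = kmeans(points, seeds)
    return k, labels, centers

# Toy "wind vectors" (u, v): three visibly separated groups, invented for the demo.
wind = [(1.0, 0.1), (1.1, 0.0), (0.9, -0.1),
        (-1.0, 0.0), (-1.1, 0.1), (-0.9, -0.1),
        (0.0, 2.0), (0.1, 2.1), (-0.1, 1.9)]
k, labels, centers = two_step(wind)
```

Seeding K-means from the hierarchical result is one common way to combine the two passes. The sketch needs at least two input points.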

Improved LTE Fingerprint Positioning Through Clustering-based Repeater Detection and Outlier Removal

  • Kwon, Jae Uk;Chae, Myeong Seok;Cho, Seong Yun
    • Journal of Positioning, Navigation, and Timing
    • /
    • v.11 no.4
    • /
    • pp.369-379
    • /
    • 2022
  • In the weighted k-nearest neighbor (WkNN)-based fingerprinting positioning step, the requested positioning signal is compared with the signal information stored for each reference point in the fingerprint DB. The higher the number of matched base station identifiers, the more likely the terminal is at the corresponding location, so an additional weight is added to the location in proportion to the number of matching base stations. Conversely, if the number of matching base stations is small, the selected candidate reference point depends heavily on the similarity value of the signal. A problem arises here: the positioning signal can be compared with a repeater signal in the signal information stored in the DB, and the corresponding reference point can be selected as a candidate location. The selected reference point is likely to be an outlier, and if a weight is applied to that location, the error of the estimated location increases. To solve this problem, this paper proposes a WkNN technique that includes an outlier removal function. To this end, it is first determined whether a repeater signal is included in the DB information of the matched base station. If the reference point for the repeater signal is selected as a candidate position, the reference position corresponding to the outlier is removed using a clustering technique. The performance of the proposed technique is verified with data acquired in Seocho 1-dong and 2-dong in Seoul.
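
The weighting described in this abstract can be illustrated with a small sketch. Every detail here is an assumption: the DB layout, the mean-absolute-difference similarity, and the multiplicative use of the matched-station count in the hypothetical function `wknn_position`. The repeater detection and clustering-based outlier removal are not shown because the abstract does not specify them.

```python
def wknn_position(query, db, k=3, eps=1e-6):
    """Weighted k-nearest-neighbor position estimate (illustrative only).

    query: {base_station_id: signal_strength}
    db:    list of {"pos": (x, y), "rss": {base_station_id: signal_strength}}
    """
    candidates = []
    for ref in db:
        shared = set(query) & set(ref["rss"])
        if not shared:
            continue  # no matched base station: not a candidate
        # mean absolute signal difference over the matched base stations
        err = sum(abs(query[c] - ref["rss"][c]) for c in shared) / len(shared)
        # assumed weighting: more matched stations and smaller error give more weight
        weight = len(shared) / (err + eps)
        candidates.append((weight, ref["pos"]))
    top = sorted(candidates, key=lambda t: t[0], reverse=True)[:k]
    if not top:
        return None
    total = sum(w for w, _ in top)
    x = sum(w * p[0] for w, p in top) / total
    y = sum(w * p[1] for w, p in top) / total
    return (x, y)

# Invented fingerprint DB with three reference points.
fingerprint_db = [
    {"pos": (0.0, 0.0), "rss": {"A": -70, "B": -80}},
    {"pos": (10.0, 0.0), "rss": {"A": -60, "B": -75, "C": -90}},
    {"pos": (0.0, 10.0), "rss": {"D": -65}},
]
estimate = wknn_position({"A": -61, "B": -76, "C": -89}, fingerprint_db, k=2)
```

In this toy run the second reference point matches three base stations with small error, so it dominates the weighted average and the estimate lands close to it.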

Machine Loading by Workload Balancing in Flexible Manufacturing Systems (FMS에서의 작업부하균형을 고려한 기계부하결정)

  • Yun, Yeong-Su;Lee, Sang-Yong
    • Journal of Korean Society for Quality Management
    • /
    • v.20 no.2
    • /
    • pp.129-136
    • /
    • 1992
  • This paper aims to develop an algorithm that minimizes the total production time, the sum of group formation times and processing times, under a balanced workload among the machines by grouping parts with machine loading in FMS. The algorithm consists of a four-step procedure: (1) parts grouping by Group Technology (GT); (2) minimizing total processing time in the loading problem; (3) machine workload balancing, including (2); and (4) group formation time, including (3). For parts grouping, the Rank Order Clustering (ROC) algorithm developed by King (1980) is used, programmed with the MACRO functions of QUATTRO Pro, a spreadsheet package. The loading model is solved using Hyper-LINDO. As a case study, numerical examples demonstrate the effectiveness of the proposed machine loading procedure.
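
For readers unfamiliar with Rank Order Clustering, the following is a minimal sketch of the textbook procedure: rows and columns of a 0/1 machine-part matrix are repeatedly sorted by the binary number they spell out until the order stops changing. The sample matrix and the function name are made up for illustration, and the paper's spreadsheet macro implementation is not reproduced here.

```python
def rank_order_clustering(matrix, max_iters=100):
    """Reorder a 0/1 machine-part matrix toward block-diagonal form.

    Returns (row_order, col_order, reordered_matrix).
    """
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    for _ in range(max_iters):
        # A row read left to right in the current column order is a binary number;
        # comparing the bit lists lexicographically ranks rows by that number.
        new_rows = sorted(rows, key=lambda r: [matrix[r][c] for c in cols], reverse=True)
        # Same idea for columns, read top to bottom in the new row order.
        new_cols = sorted(cols, key=lambda c: [matrix[r][c] for r in new_rows], reverse=True)
        if new_rows == rows and new_cols == cols:
            break  # orders are stable: done
        rows, cols = new_rows, new_cols
    reordered = [[matrix[r][c] for c in cols] for r in rows]
    return rows, cols, reordered

# Invented incidence matrix: 4 machines (rows) x 5 parts (columns).
incidence = [
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
]
row_order, col_order, blocks = rank_order_clustering(incidence)
```

On this input the machines group as {0, 2} and {1, 3}, and the parts as {0, 2} and {1, 3, 4}, which shows up as two diagonal blocks in `blocks`.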

ModifiedFAST: A New Optimal Feature Subset Selection Algorithm

  • Nagpal, Arpita;Gaur, Deepti
    • Journal of information and communication convergence engineering
    • /
    • v.13 no.2
    • /
    • pp.113-122
    • /
    • 2015
  • Feature subset selection is a pre-processing step in learning algorithms. In this paper, we propose an efficient algorithm, ModifiedFAST, for feature subset selection. This algorithm is suitable for text datasets, and uses the concept of information gain to remove irrelevant and redundant features. A new optimal value of the threshold for symmetric uncertainty, used to identify relevant features, is found. The thresholds used by previous feature selection algorithms such as FAST, Relief, and CFS were not optimal. It has been proven that the threshold value greatly affects the percentage of selected features and the classification accuracy. A new unified performance metric that combines accuracy and the number of selected features has been proposed and applied in the proposed algorithm. It was experimentally shown that the percentage of selected features obtained by the proposed algorithm was lower than that obtained using existing algorithms on most of the datasets. The effectiveness of our algorithm with the optimal threshold was statistically validated against other algorithms.
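
Symmetric uncertainty, the relevance measure this abstract thresholds, is commonly defined as SU(X, Y) = 2 · IG(X | Y) / (H(X) + H(Y)). The sketch below computes it for discrete features and filters by a threshold. The threshold value 0.1 and the toy columns are placeholders, not the optimal value reported in the paper, and the redundancy-removal stage of ModifiedFAST is not shown.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X | Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    joint = entropy(list(zip(x, y)))
    gain = hx + hy - joint  # information gain, equal to mutual information
    return 2.0 * gain / (hx + hy)

def relevant_features(features, target, threshold):
    """Keep the features whose SU with the class label exceeds the threshold."""
    return [name for name, column in features.items()
            if symmetric_uncertainty(column, target) > threshold]

# Toy data: one feature copies the label, the other is independent of it.
label = [0, 0, 1, 1]
columns = {"copy": [0, 0, 1, 1], "noise": [0, 1, 0, 1]}
kept = relevant_features(columns, label, threshold=0.1)  # 0.1 is a placeholder
```

A feature identical to the label scores 1.0 and an independent one scores 0.0, so only `copy` survives the filter here.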