• Title/Abstract/Keywords: 2-Step Clustering

Similarity Analysis of Hospitalization using Crowding Distance

  • Jung, Yong Gyu;Choi, Young Jin;Cha, Byeong Heon
    • International journal of advanced smart convergence
    • /
    • Vol. 5, No. 2
    • /
    • pp.53-58
    • /
    • 2016
  • With the growing use of big data and data mining, it is useful to understand how such techniques can reveal relationships in the healthcare field. This study uses hierarchical methods of data analysis to explore similarities in hospitalization across several New York state counties. The study measured the crowding distance of age-specific hospitalization-period data. Crowding distance is defined as the longest distance, or least similarity, between cities. Clinton is expected to have the greatest distance, while Albany and the other cities are closer because they are connected by the shortest distance at each step. Similarities were stronger across hospital stays categorized by age. Hierarchical clustering with the crowding-distance measure can be applied to predict the similarity of hospitalization data across the 10 cities. To enhance the performance of hierarchical clustering, crowding distances can be compared after the text is first converted to an attribute vector. The similarity measured between two objects depends on the measurement method used in clustering, and it is distinct from distance: the smaller the distance value, the more similar the two objects are. With this technique, the crowding distance decreased consistently as the similarity between the data increased, which improved experimental performance. Furthermore, the similarity of hospitalization periods by city is expected to be useful when hospital wards are built: the experimental results can inform the sizing of hospital facilities and help manage patients efficiently in similar areas.

Detection and Classification of Leaf Diseases for Phenomics System

  • 박관익;심규동;견민수;이상화;백정현;박종일
    • 방송공학회논문지 (Journal of Broadcast Engineering)
    • /
    • Vol. 27, No. 6
    • /
    • pp.923-935
    • /
    • 2022
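  • This paper proposes a method for detecting diseases on the leaves of plants grown in a smart farm system and classifying the disease type. The color information of plant leaves and the shape information of disease types are learned from images using multilayer perceptron (MLP) models. In step 1, the color distribution of the input image is analyzed to determine whether disease is present. For images that step 1 flags as likely to contain disease, step 2 partitions the image into small regions using mean shift clustering, extracts color information from each region, and uses the proposed Color Network to decide whether the region is diseased. If a color-segmented region is judged diseased by the Color Network, step 3 extracts the region's shape information and classifies the disease type using the proposed Shape Network. For two broad categories of disease occurring on apple tree leaves and iceberg lettuce, the proposed method achieved a leaf disease detection rate of 92.3% per small region and of 99.3% or higher per image, where an image usually contains two or more diseased regions. The proposed method enables early detection of disease in leafy plants in a smart farm environment and can be extended to various plants and disease types without additional training for each target plant.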

Hardware Accelerated Design on Bag of Words Classification Algorithm

  • Lee, Chang-yong;Lee, Ji-yong;Lee, Yong-hwan
    • Journal of Platform Technology
    • /
    • Vol. 6, No. 4
    • /
    • pp.26-33
    • /
    • 2018
  • In this paper, we propose an image retrieval algorithm for real-time processing and design it as hardware. The proposed method is based on the BoWs (Bag of Words) classification algorithm and performs image search using bit streams. K-fold cross validation is used to verify the algorithm. The data is divided into seven classes of seven images each, so 49 images are tested in total. The test covers two measurements: accuracy and speed. The image classification accuracy was 86.2% for the BoWs algorithm and 83.7% for the proposed hardware-accelerated implementation, so the BoWs algorithm was 2.5 percentage points higher. The image retrieval processing time is 7.89 s for BoWs and 1.55 s for our algorithm, which makes our algorithm 5.09 times faster. The algorithm is divided into a software part and a hardware part. The software part is written in C. The Scale Invariant Feature Transform algorithm extracts feature points that are invariant to size and rotation from the image, and bit streams are generated from the extracted feature points. In the hardware architecture, the proposed image retrieval algorithm is written in Verilog HDL and designed and verified with an FPGA and Design Compiler. The generated bit streams are stored, the clustering step is performed, and a search image database or an input image database is generated and matched. With the proposed algorithm, searching by a database matching method that represents each object can improve user convenience and satisfaction in terms of speed.

Facial expression recognition based on pleasure and arousal dimensions

  • 신영숙;최광남
    • 인지과학 (Korean Journal of Cognitive Science)
    • /
    • Vol. 14, No. 4
    • /
    • pp.33-42
    • /
    • 2003
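  • This paper presents a new system for facial expression recognition based on a dimensional model of internal states. Facial expression information is extracted in three steps. In step 1, a Gabor wavelet representation extracts the edges of facial components. In step 2, sparse features of facial expression are extracted from the neutral face using the FCM clustering algorithm. In step 3, sparse features are extracted from the expression image using a dynamic model. Finally, a multilayer perceptron is used to demonstrate facial expression recognition based on the dimensional model of internal states. The results suggest that the two-dimensional structure of emotion supports recognition not only of facial expressions associated with basic emotions but also of expressions of a variety of other emotions.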

A Manufacturing Cell Formation Algorithm Using Neural Networks

  • 이준한;김양렬
    • 경영과학 (Korean Management Science Review)
    • /
    • Vol. 16, No. 1
    • /
    • pp.157-171
    • /
    • 1999
  • In an increasingly competitive marketplace, manufacturing companies have no choice but to look for ways to improve productivity in order to sustain their competitiveness and survive in the industry. Recently, cellular manufacturing has been under discussion as an option that can be implemented easily without burdensome capital investment. The objective of cellular manufacturing is to realize many of the efficiencies associated with mass production in less repetitive job-shop production systems. The very first step in cellular manufacturing is to group sets of parts with similar processing requirements into part families, and the equipment needed to process a particular part family into machine cells. The underlying problem of determining the part and machine assignments for each manufacturing cell is called cell formation. The purpose of this study is to develop a clustering algorithm based on the neural network approach that overcomes the drawbacks of the ART1 algorithm for cell formation problems. In this paper, a generalized learning vector quantization (GLVQ) algorithm was devised to transform a 0/1 part-machine assignment matrix into a matrix with diagonal blocks in a way that increases clustering performance. Furthermore, an assignment problem model and a rearrangement procedure have been embedded to increase efficiency. The performance of the proposed algorithm has been evaluated using data sets adopted by prior studies on cell formation. The proposed algorithm dominates almost all the cell formation results reported so far, based on the grouping index ($\alpha$ = 0.2). Among the 27 cell formation problems investigated, the result of the proposed algorithm was superior in 11, equal in 15, and inferior in only 1.

The similarities analysis of location fishing information through 2 step clustering

  • 조용준
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 20, No. 3
    • /
    • pp.551-562
    • /
    • 2009
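  • The fishing-vessel operation data of Suhyup (the National Federation of Fisheries Cooperatives) has the advantage of containing location-specific fishing information that official national statistics lack. Location-specific fishing information is highly valuable as national statistical data because it can be used to estimate fishery damage compensation and resource value for a given area, but it suffers from low reliability because fishers are reluctant to disclose their own information. This study aimed to propose, through a usefulness analysis, ways to make use of Suhyup's fishing-vessel operation data, and to classify the characteristics of location-specific fishing patterns in order to derive similarity information by fishing zone. The analysis showed that Suhyup's data covers only about 33% of the catch reported in government production statistics, but that its patterns are closely correlated with those statistics, which makes it useful for identifying location-specific patterns. On this basis, two-step cluster analysis by large fishing zone was used to determine the optimal clusters for catch, number of fishing days, and number of fishing vessels separately, and these were combined to classify the patterns into eight clusters.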

Classification of Wind Sector in Pohang Region Using Similarity of Time-Series Wind Vectors

  • 김현구;김진솔;강용혁;박형동
    • 한국태양에너지학회 논문집 (Journal of the Korean Solar Energy Society)
    • /
    • Vol. 36, No. 1
    • /
    • pp.11-18
    • /
    • 2016
  • The local wind systems in the Pohang region were categorized into wind sectors. Thorough knowledge of wind resource assessment, wind environment analysis, and atmospheric environmental impact assessment was required because the region has outstanding wind resources, lies on the path of typhoons, and has large-scale atmospheric pollution sources. To overcome the resolution limitation of the meteorological dataset and the problems with the categorization criteria of preceding studies, the high-resolution wind resource map of the Korea Institute of Energy Research was used as time-series meteorological data. A 2-step method was adopted: the clustering coefficient was determined through hierarchical clustering analysis, and the wind sectors were then categorized through non-hierarchical K-means clustering analysis. The Euclidean distance between normalized time-series wind vectors was proposed as the similarity measure. The meteor-statistical characteristics of the mean vector wind distribution and the meteorological variables of each wind sector were compared. The comparison confirmed significant differences among wind sectors according to terrain elevation, mean wind speed, Weibull shape parameter, etc.
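The 2-step idea in the abstract above (hierarchical clustering to fix the cluster structure, then K-means to assign members) can be sketched in a few lines. This is a minimal illustration and not the authors' procedure: the centroid linkage, the "largest jump in merge distance" rule for choosing the number of clusters, and the function names `agglomerate`, `choose_partition`, `kmeans`, and `two_step` are all assumptions made for the example, and the inputs are generic numeric tuples rather than the wind vectors used in the paper.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def agglomerate(points):
    """Step 1: centroid-linkage agglomeration (linkage choice is an assumption).

    Returns a list of (merge_distance, clusters_before_that_merge).
    """
    clusters = [[p] for p in points]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclid(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((d, [list(c) for c in clusters]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return history


def choose_partition(history):
    """Pick the partition that exists just before the largest jump in merge distance."""
    jumps = [(history[t][0] - history[t - 1][0], t) for t in range(1, len(history))]
    _, t = max(jumps)
    return history[t][1]


def kmeans(points, centers, max_iter=100):
    """Step 2: plain K-means started from the given centers."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: euclid(p, centers[k]))
            groups[nearest].append(p)
        new_centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


def two_step(points):
    """Hierarchical step chooses k and the seeds; K-means refines the assignment."""
    partition = choose_partition(agglomerate(points))
    seeds = [centroid(c) for c in partition]
    return kmeans(points, seeds)


if __name__ == "__main__":
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]  # hypothetical data
    centers, groups = two_step(toy)
    print(len(centers), [len(g) for g in groups])
```

Seeding K-means with the hierarchical centroids removes the dependence on random initialization, which is one common motivation for combining the two methods. The sketch needs at least three input points so that a jump between merge distances exists.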

Improved LTE Fingerprint Positioning Through Clustering-based Repeater Detection and Outlier Removal

  • Kwon, Jae Uk;Chae, Myeong Seok;Cho, Seong Yun
    • Journal of Positioning, Navigation, and Timing
    • /
    • Vol. 11, No. 4
    • /
    • pp.369-379
    • /
    • 2022
  • In the weighted k-nearest neighbor (WkNN)-based fingerprint positioning step, the requested positioning signal is compared with the signal information stored in the fingerprint DB for each reference point. The more base station identifiers match, the more likely it is that the terminal is at the corresponding location, so an additional weight is applied to the location in proportion to the number of matching base stations. Conversely, if few base stations match, the selected candidate reference point depends heavily on the signal similarity value. This raises a problem: the positioning signal can be compared with a repeater signal stored in the DB, and the corresponding reference point can be selected as a candidate location. Such a reference point is likely to be an outlier, and if a weight is applied to it, the error of the estimated location increases. To solve this problem, this paper proposes a WkNN technique that includes an outlier removal function. First, it is determined whether a repeater signal is included in the DB information of the matched base station. If a reference point for the repeater signal is selected as a candidate position, the reference position corresponding to the outlier is removed using a clustering technique. The performance of the proposed technique is verified with data acquired in Seocho 1-dong and 2-dong in Seoul.
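To make the weighting and outlier problem described above concrete, here is a rough sketch of a WkNN position estimate with a simple outlier filter. It is not the paper's method: the weight formula `matches / (1 + distance)`, the median-based filter that stands in for the unspecified clustering technique, the `radius` parameter, and all function names are assumptions chosen for illustration. Fingerprints are modeled as hypothetical dictionaries that map a base station identifier to a signal strength.

```python
import math
import statistics


def signal_distance(request, fingerprint):
    """Euclidean distance over shared base stations, plus the number of matches."""
    shared = set(request) & set(fingerprint)
    if not shared:
        return math.inf, 0
    d = math.sqrt(sum((request[b] - fingerprint[b]) ** 2 for b in shared))
    return d, len(shared)


def drop_outliers(candidates, radius):
    """Drop candidates farther than `radius` from the coordinate-wise median.

    A stand-in for the clustering-based removal mentioned in the abstract.
    """
    mx = statistics.median([x for _, x, _ in candidates])
    my = statistics.median([y for _, _, y in candidates])
    kept = [c for c in candidates if math.hypot(c[1] - mx, c[2] - my) <= radius]
    return kept or candidates  # never return an empty candidate set


def wknn_position(request, database, k=4, radius=10.0):
    """Weighted mean of the k best reference points after outlier removal."""
    scored = []
    for (x, y), fingerprint in database:
        d, matches = signal_distance(request, fingerprint)
        if matches:
            # Assumed weight: grows with matched stations, shrinks with distance.
            scored.append((matches / (1.0 + d), x, y))
    if not scored:
        raise ValueError("no reference point shares a base station with the request")
    top = sorted(scored, reverse=True)[:k]
    kept = drop_outliers(top, radius)
    total = sum(w for w, _, _ in kept)
    return (sum(w * x for w, x, _ in kept) / total,
            sum(w * y for w, _, y in kept) / total)


if __name__ == "__main__":
    request = {"A": -70, "B": -80}
    database = [  # hypothetical reference points: ((x, y), fingerprint)
        ((0, 0), {"A": -70, "B": -80}),
        ((2, 0), {"A": -72, "B": -80}),
        ((0, 2), {"A": -70, "B": -82}),
        ((100, 100), {"A": -71, "B": -80}),  # repeater-like entry far from the rest
    ]
    print(wknn_position(request, database))
```

In this toy database the far-away entry has a strong signal match, so without the filter it would pull the estimate toward (100, 100); with the filter the estimate stays near the three neighboring reference points.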

Machine Loading by Workload Balancing in Flexible Manufacturing Systems

  • 윤영수;이상용
    • 품질경영학회지 (Journal of Korean Society for Quality Management)
    • /
    • Vol. 20, No. 2
    • /
    • pp.129-136
    • /
    • 1992
  • This paper aims to develop an algorithm that minimizes the total production time, the sum of group formation times and processing times, under a balanced workload among machines by grouping parts with machine loading in an FMS. The algorithm consists of four steps: (1) parts grouping by Group Technology (GT); (2) minimizing the total processing time in the loading problem; (3) machine workload balancing, including step (2); and (4) group formation time, including step (3). For parts grouping, the Rank Order Clustering (ROC) algorithm developed by King (1980) is used, programmed with the macro functions of Quattro Pro, a spreadsheet package. The loading model is solved using Hyper-LINDO. Numerical examples are presented as a case study to show the effectiveness of the proposed machine loading procedure.
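Rank Order Clustering, used for the parts-grouping step above, is compact enough to sketch directly. The version below follows the usual textbook description of the method (read each row, then each column, of the 0/1 machine-part matrix as a binary word and sort in descending order until nothing moves); it is written in Python rather than the spreadsheet macros the paper used, and the function name and the toy matrix are illustrative assumptions.

```python
def rank_order_clustering(matrix, max_iter=100):
    """Reorder rows and columns of a 0/1 incidence matrix toward block-diagonal form.

    Returns (row_order, col_order, reordered_matrix).
    """
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    for _ in range(max_iter):
        # Compare rows as binary words read in the current column order.
        # Equal-length 0/1 lists compare lexicographically, which matches
        # comparing the binary numbers they spell.
        new_rows = sorted(rows, key=lambda r: [matrix[r][c] for c in cols], reverse=True)
        # Compare columns as binary words read in the new row order.
        new_cols = sorted(cols, key=lambda c: [matrix[r][c] for r in new_rows], reverse=True)
        if new_rows == rows and new_cols == cols:
            break  # no row or column moved, so the ordering is stable
        rows, cols = new_rows, new_cols
    reordered = [[matrix[r][c] for c in cols] for r in rows]
    return rows, cols, reordered


if __name__ == "__main__":
    # Hypothetical machine (rows) by part (columns) incidence matrix.
    incidence = [
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
    ]
    row_order, col_order, blocks = rank_order_clustering(incidence)
    for line in blocks:
        print(line)
```

For the toy matrix, machines 0 and 2 end up grouped with parts 0 and 2, and machines 1 and 3 with parts 1 and 3, giving two diagonal blocks.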

ModifiedFAST: A New Optimal Feature Subset Selection Algorithm

  • Nagpal, Arpita;Gaur, Deepti
    • Journal of information and communication convergence engineering
    • /
    • Vol. 13, No. 2
    • /
    • pp.113-122
    • /
    • 2015
  • Feature subset selection serves as a pre-processing step in learning algorithms. In this paper, we propose an efficient algorithm, ModifiedFAST, for feature subset selection. This algorithm is suitable for text datasets and uses the concept of information gain to remove irrelevant and redundant features. A new optimal value of the threshold for symmetric uncertainty, which is used to identify relevant features, is found. The thresholds used by previous feature selection algorithms such as FAST, Relief, and CFS were not optimal. It has been shown that the threshold value greatly affects the percentage of selected features and the classification accuracy. A new unified performance metric that combines accuracy and the number of selected features has been proposed and applied in the proposed algorithm. Experiments showed that the percentage of features selected by the proposed algorithm was lower than that of existing algorithms on most of the datasets. The effectiveness of our algorithm at the optimal threshold was statistically validated against other algorithms.
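Symmetric uncertainty, the relevance measure named in the abstract above, is commonly defined as SU(X, Y) = 2 · IG(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and IG is information gain. The sketch below computes it for discrete features and applies a relevance threshold. The function names are illustrative, and the `threshold` argument is a placeholder: the abstract does not state the optimal value the authors found, so none is assumed here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    joint = entropy(list(zip(x, y)))
    info_gain = hx + hy - joint  # mutual information between X and Y
    return 2.0 * info_gain / (hx + hy)


def relevant_features(features, labels, threshold):
    """Names of features whose SU with the class label exceeds the threshold.

    `features` maps a feature name to its column of discrete values.
    """
    return [name for name, column in features.items()
            if symmetric_uncertainty(column, labels) > threshold]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    features = {  # hypothetical toy columns
        "copy_of_label": [0, 0, 1, 1],
        "independent": [0, 1, 0, 1],
    }
    print(relevant_features(features, labels, threshold=0.5))
```

A feature identical to the label scores 1 and an independent feature scores 0, so any threshold strictly between those values keeps the first and drops the second.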