• Title/Summary/Keyword: EM Clustering

Railway Track Extraction from Mobile Laser Scanning Data (모바일 레이저 스캐닝 데이터로부터 철도 선로 추출에 관한 연구)

  • Yoonseok, Jwa;Gunho, Sohn;Jong Un, Won;Wonchoon, Lee;Nakhyeon, Song
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.33 no.2 / pp.111-122 / 2015
  • This study introduces a new automated solution for detecting railway tracks and reconstructing track models from mobile laser scanning data. The proposed solution proceeds as follows. First, a potential railway region, called the Region Of Interest (ROI), is detected and the orientation of the railway track trajectory is approximated from the raw data. Next, knowledge-based detection of railway tracks is performed to localize track candidates in the first strip. Here, a strip, which refers to the local track search region, is generated in the direction orthogonal to the orientation of the track trajectory. Lastly, an initial track model is generated over the candidate points, which are detected by GMM-EM (Gaussian Mixture Model-Expectation Maximization) based clustering; the model grows strip-wise to capture all track points of interest and is then converted into a geometric track model within a tracking-by-detection framework. The proposed track tracking process has the following key features: it reduces the complexity of detecting track points by using a hypothetical track model, and it improves the efficiency of track modeling by capturing track points and modeling tracks simultaneously, which minimizes data processing time and cost. The proposed method was developed in C++ and evaluated on LiDAR data acquired by an MMS over an urban railway area with a complex railway scene.
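
The GMM-EM clustering step mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' C++ implementation: the function names (`gaussian_pdf`, `fit_gmm_1d`), the two-component setup, the initial means, and the sample cross-track offsets are all assumptions chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by EM (hypothetical helper, illustration only).

    Returns (weights, means, variances, responsibilities), where
    responsibilities[i][j] is the posterior probability from the last E-step
    that point i belongs to component j.
    """
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)  # floor avoids variance collapse
    return weights, means, variances, resp


# Assumed cross-track offsets (in metres) of candidate points inside one strip.
offsets = [-0.74, -0.72, -0.70, -0.69, -0.71, 0.70, 0.73, 0.71, 0.69, 0.72]
weights, means, variances, resp = fit_gmm_1d(offsets, init_means=[-1.0, 1.0])
labels = [max(range(len(means)), key=lambda j: r[j]) for r in resp]
print(means, labels)
```

In this sketch each component stands in for one rail, and the hard labels are obtained only at the end by taking the most probable component for each point.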

Segmenting Inpatients by Mixture Model and Analytical Hierarchical Process(AHP) Approach In Medical Service (의료서비스에서 혼합모형(Mixture model) 및 분석적 계층과정(AHP)를 이용한 입원환자의 시장세분화에 관한 연구)

  • 백수경;곽영식
    • Health Policy and Management / v.12 no.2 / pp.1-22 / 2002
  • Since the early 1980s, scholars in various academic fields have applied latent structure and other types of finite mixture models. Although the merits of finite mixture models are well documented, attempts to apply them to medical services have been relatively rare. The researchers aim to fill this gap by introducing the finite mixture model and segmenting the inpatient database of one general hospital. In Section 2, finite mixture models are compared with clustering, chi-square analysis, and discriminant analysis based on the segmentation methodology schema of Wedel and Kamakura (2000). The mixture model yields the optimal number of segments and a fuzzy classification for each observation via the EM (expectation-maximization) algorithm. The finite mixture model serves to unmix the sample, to identify the groups, and to estimate the parameters of the density function underlying the observed data within each group. In Sections 3 and 4, we illustrate the results of segmenting data on 4,510 patients, which include nominal and ratio scales. We then show that AHP can identify the attractiveness of each segment, so that the decision maker can select the best target segment.

IDS Model using Improved Bayesian Network to improve the Intrusion Detection Rate (베이지안 네트워크 개선을 통한 탐지율 향상의 IDS 모델)

  • Choi, Bomin;Lee, Jungsik;Han, Myung-Mook
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.5 / pp.495-503 / 2014
  • Recently, intrusion detection systems that collect and analyze network data such as packets and logs have been actively studied as a response to network threats in the computer security field. In particular, a Bayesian network has the advantage of being able to perform inference from only part of the data, so intrusion detection systems based on Bayesian networks have been studied previously. However, these earlier models were limited in detection performance because they did not consider problems such as the complexity of the relations among network packets or the processing of continuous input data. In this paper, we therefore propose two methodologies based on K-means clustering that improve the detection rate by addressing the problems of prior models. First, the interval ranges of nodes are set more precisely based on K-means clustering. Second, a robust CPT is calculated by applying weighted learning based on K-means clustering. We conducted experiments to evaluate the proposed methodologies by comparing K_WTAN_EM, to which both methodologies are applied, with prior models. In the experiments, the detection rate of the proposed model was about 7.78% higher than that of the existing NBN (Naive Bayesian Network) IDS model and about 5.24% higher than that of the TAN (Tree Augmented Bayesian Network) IDS model, which supports the effectiveness of the proposed ideas.

Sensor Selection Strategies for Activity Recognition in a Smart Environment (스마트 환경에서 행위 인식을 위한 센서 선정 기법)

  • Gu, Sungdo;Sohn, Kyung-Ah
    • Journal of KIISE / v.42 no.8 / pp.1031-1038 / 2015
  • The recent emergence of smartphones, wearable devices, and the IoT concept has made it possible for various objects to interact with one another anytime and anywhere. Among such smart services, a smart home service typically requires a large number of sensors to recognize the residents' activities. For this reason, activity recognition using the data obtained from those sensors is being actively discussed and studied. Many sensors are installed in order to recognize activities and analyze their patterns via data mining techniques. However, installing many sensors for an IoT smart home service raises issues of cost and energy consumption. In this paper, we propose a new method for reducing the number of sensors for activity recognition in a smart environment, which uses principal component analysis and clustering techniques, and we show the resulting improvement in activity recognition.

A Study on Big-5 based Personality Analysis through Analysis and Comparison of Machine Learning Algorithm (머신러닝 알고리즘 분석 및 비교를 통한 Big-5 기반 성격 분석 연구)

  • Kim, Yong-Jun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.4 / pp.169-174 / 2019
  • In this study, I collect survey data, apply data mining, group the data with a clustering method, and use supervised learning to judge similarity. I aim to use feature extraction algorithms and supervised learning to analyze how well the correlations fit personality. After the questionnaire survey is conducted, the collected data are refined based on the questionnaire, the data sets are classified with the clustering techniques of WEKA, an open source data mining tool, and similarity is judged using supervised learning. I then use feature extraction algorithms and supervised learning to determine the suitability of the results for personality. As a result, the highest degree of similarity was obtained with EM classification and supervised learning by Naïve Bayes. The results of feature classification and supervised learning were found to be useful for judging fitness. I found that the accuracy for each Big-5 personality trait changed as items were added or deleted, and analyzed the differences for each trait.

Performance Analysis of User Clustering Algorithms against User Density and Maximum Number of Relays for D2D Advertisement Dissemination (최대 전송횟수 제한 및 사용자 밀집도 변화에 따른 사용자 클러스터링 알고리즘 별 D2D 광고 확산 성능 분석)

  • Han, Seho;Kim, Junseon;Lee, Howon
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.4 / pp.721-727 / 2016
  • In this paper, to address the reduced efficiency of conventional D2D (device-to-device) advertisement dissemination algorithms, we propose advertisement dissemination algorithms based on several clustering algorithms: the modified single linkage (MSL) algorithm, the K-means algorithm, and the expectation maximization algorithm with a Gaussian mixture model (EM). Target areas are clustered into several target groups by the proposed clustering algorithms. D2D advertisements are then distributed consecutively using a routing algorithm based on the geographical distribution of the target areas and a relay selection algorithm based on the distance between the D2D sender and the D2D receiver. Through intensive MATLAB simulations, we analyze the performance of the proposed algorithms with respect to the maximum number of relay transmissions and the D2D user density ratio of a target area and a non-target area.

Experimental Evaluation of Distance-based and Probability-based Clustering

  • Kwon, Na Yeon;Kim, Jang Il;Dollein, Richard;Seo, Weon Joon;Jung, Yong Gyu
    • International journal of advanced smart convergence / v.2 no.1 / pp.36-41 / 2013
  • Decision-making involves extracting information that can be acted on in the future; it refers to the process of discovering a new model induced from the data. In other words, it means uncovering the hidden patterns in data and the relationships among them, much like digging to find a vein of ore. The information is found by applying modeling techniques and sophisticated statistical analysis to the data in order to find relationships among useful patterns. This process is called data mining, and it is a key technology for marketing databases. Accordingly, cluster analysis, which can extract information from large data sets without a clear criterion, is being actively researched. The EM and K-means methods are used particularly often; this paper examines experimentally how their evaluation results vary with data size for distance-based and probability-based analysis.
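
The distinction between distance-based and probability-based clustering named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's experimental setup: the function names (`hard_assign`, `soft_assign`), the fixed centers, and the shared standard deviation `sigma` are hypothetical. K-means assigns each point wholly to its nearest center, whereas the E-step of EM for an equal-weight, equal-variance Gaussian mixture gives each point a probability of membership in every cluster.

```python
import math


def hard_assign(x, centers):
    """Distance-based (K-means style): index of the nearest center."""
    return min(range(len(centers)), key=lambda j: abs(x - centers[j]))


def soft_assign(x, centers, sigma=1.0):
    """Probability-based (EM E-step style): posterior over clusters.

    Assumes equal mixing weights and a shared standard deviation `sigma`.
    """
    logits = [-((x - c) ** 2) / (2.0 * sigma ** 2) for c in centers]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


centers = [0.0, 2.0]
for x in (0.1, 0.9, 1.0):
    print(x, hard_assign(x, centers), soft_assign(x, centers))
```

A point close to the midpoint between two centers receives a single hard label from the distance-based rule, while the probability-based rule reports that its membership is nearly evenly split.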

Verb Clustering for Defining Relations between Ontology Classes of Technical Terms Using EM Algorithm (EM 알고리즘을 이용한 전문용어 온톨로지 클래스간 관계 정의를 위한 동사 클러스터링)

  • Jin, Meixun;Nam, Sang-Hyob;Lee, Yong-Hoon;Lee, Jong-Hyeok
    • Annual Conference on Human and Language Technology / 2007.10a / pp.233-240 / 2007
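  • Defining relations between classes is an important part of ontology construction. In this paper, with the goal of automatically defining relations between classes other than superclass-subclass relations, we extract (subject, predicate) and (object, predicate) pairs from dependency parses and propose a method for clustering the predicates (verbs) using the extracted data. To address the data sparseness of domain-specific corpora, we adopt an approach combined with web search and present a methodology for automatically defining relations between classes in domain ontology construction.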

Decision of Gaussian Function Threshold for Image Segmentation (영상분할을 위한 혼합 가우시안 함수 임계 값 결정)

  • Jung, Yong-Gyu;Choi, Gyoo-Seok;Heo, Go-Eun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.5 / pp.163-168 / 2009
  • Most image segmentation methods represent the feature vector observed at each pixel with an assumed probability model. Such models can be used with statistical estimation or likelihood-based clustering of the feature vectors. EM algorithms face some computational problems when finding the maximum likelihood estimate of unknown parameters from incomplete data or the maximum of the posterior probability distribution: performance depends on the starting positions, and the likelihood function converges to local maxima. To address these problems, we combine a Gaussian mixture function with the histogram over all level values of the image and propose this as a suitable image segmentation method. The proposed algorithm is confirmed to classify most edges clearly and with variety, and is implemented as an MFC program.

Analysis of Characteristics of NPS Runoff and Pollution Contribution Rate in Songya-stream Watershed (송야천 유역의 비점오염물질 유출 특성 및 오염기여율 분석)

  • Kang Taeseong;Yu Nayeong;Shin Minhwan;Lim Kyoungjae;Park Minji;Park Baekyung;Kim Jonggun
    • Journal of Korean Society on Water Environment / v.39 no.4 / pp.316-328 / 2023
  • In this study, the characteristics of nonpoint source (NPS) pollutant runoff and the pollution contribution rates of the Songya-stream mainstream and tributaries were analyzed. Further, water pollution management and improvement measures for pollution-oriented rivers were proposed. An on-site investigation was conducted to determine the inflow of major pollutants into the basin, and it was found that pollutants generated from agricultural land and livestock facilities flowed into the river, resulting in a high concentration of turbid water. Based on the pollution load data calculated from field monitoring (flow and water quality) and the generated and discharged load data calculated from the national pollution source survey data, tributaries S3 and S6 were selected as the tributaries of concern in the Songya-stream basin. Cluster analysis using Pearson correlation coefficients and the density-based spatial clustering of applications with noise (DBSCAN) technique showed that S3 and S6 were most consistent with the C2 cluster, which corresponds to the mainstream of Songya-stream. Analysis of the major pollutants in the tributaries of concern showed that livestock and land pollutants were dominant. Consequently, optimal management techniques such as fertilizer management, water gate management in paddy fields, vegetated filter strips, and public treatment of livestock manure were proposed to reduce livestock and land pollutants.