• Title/Summary/Keyword: K-means algorithm

Health Risk Management using Feature Extraction and Cluster Analysis considering Time Flow (시간흐름을 고려한 특징 추출과 군집 분석을 이용한 헬스 리스크 관리)

  • Kang, Ji-Soo;Chung, Kyungyong;Jung, Hoill
    • Journal of the Korea Convergence Society / v.12 no.1 / pp.99-104 / 2021
  • In this paper, we propose a health risk management method that uses feature extraction and cluster analysis and takes the passage of time into account. The proposed method proceeds in three steps. The first is the pre-processing and feature extraction step: the user's lifelog is collected with a wearable device; incomplete, erroneous, noisy, and contradictory data are removed; and missing values are handled. For feature extraction, important variables are then selected through principal component analysis, and related data are grouped using correlation coefficients and covariance. To analyze the features extracted from the lifelog, dynamic clustering is performed with the K-means algorithm in consideration of the passage of time. New data are assigned to clusters by a similarity distance measure based on the increment of the sum of squared errors. The next step extracts information about the clusters by considering the passage of time. With a health decision-making system built on these feature clusters, risks can be managed through factors such as physical characteristics, lifestyle habits, disease status, risk of health care events, and predictability. The performance evaluation compares the proposed method with fuzzy and kernel-based clustering using precision, recall, and F-measure, and the proposed method is rated highly. The proposed method therefore makes it possible to accurately predict and appropriately manage a user's potential health risk by using similarity with patients.
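
One way to read the "increment of the sum of squared errors" criterion in this abstract: a new point joins the cluster whose SSE would grow the least. The sketch below assumes Euclidean distance and running centroids. The function names and sample values are illustrative and are not taken from the paper.

```python
def sse_increment(point, centroid, size):
    """Increase in a cluster's SSE if `point` joins it.

    For a cluster of `size` points with mean `centroid`, adding x raises
    the SSE by size / (size + 1) * ||x - centroid||^2.
    """
    sq_dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
    return size / (size + 1) * sq_dist


def assign_incremental(point, centroids, sizes):
    """Assign `point` to the cluster with the smallest SSE increment,
    then update that cluster's centroid and size in place."""
    best = min(range(len(centroids)),
               key=lambda k: sse_increment(point, centroids[k], sizes[k]))
    n = sizes[best]
    # Running-mean update of the chosen centroid.
    centroids[best] = [(c * n + p) / (n + 1)
                       for c, p in zip(centroids[best], point)]
    sizes[best] = n + 1
    return best
```

Because only the chosen centroid and its size change, each new lifelog record can be absorbed without re-running K-means over the full data set.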

Influence of Self-driving Data Set Partition on Detection Performance Using YOLOv4 Network (YOLOv4 네트워크를 이용한 자동운전 데이터 분할이 검출성능에 미치는 영향)

  • Wang, Xufei;Chen, Le;Li, Qiutan;Son, Jinku;Ding, Xilong;Song, Jeongyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.6 / pp.157-165 / 2020
  • Given the development of neural networks and self-driving data sets, partitioning the data set is one way to improve a network model's performance in detecting moving objects. The YOLOv4 (You Only Look Once v4) network model was trained and tested on the Udacity data set within the Darknet framework. The Udacity data set was divided into training, validation, and test subsets according to 7 different proportions. The K-means++ algorithm was used to cluster the dimensions of the object boxes in each of the 7 groups. By adjusting the hyperparameters of the YOLOv4 network during training, optimal model parameters were obtained for each of the 7 groups. These model parameters were then used to run detection on the 7 test sets and compare the results. The experimental results showed that YOLOv4 can effectively detect the large, medium, and small moving objects represented by Truck, Car, and Pedestrian in the Udacity data set. When the ratio of training, validation, and test sets is 7:1.5:1.5, the optimal model parameters of YOLOv4 give the highest detection performance: mAP50 reaches 80.89%, mAP75 reaches 47.08%, and the detection speed reaches 10.56 FPS.
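
The 7:1.5:1.5 partition mentioned in this abstract can be sketched as a ratio-based split. The abstract does not say how images were assigned to subsets, so the seeded shuffle, the function name, and the rounding rule here are assumptions made for illustration.

```python
import random


def split_dataset(items, ratios=(7, 1.5, 1.5), seed=0):
    """Shuffle `items` and split them into train/validation/test by `ratios`.

    The shuffle and the fixed seed are assumptions, not details from the paper.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = round(len(shuffled) * ratios[0] / total)
    n_val = round(len(shuffled) * ratios[1] / total)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test
```

Passing a different `ratios` tuple would reproduce any of the other partitions, whose exact proportions the abstract does not list.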

A Study on the Cerber-Type Ransomware Detection Model Using Opcode and API Frequency and Correlation Coefficient (Opcode와 API의 빈도수와 상관계수를 활용한 Cerber형 랜섬웨어 탐지모델에 관한 연구)

  • Lee, Gye-Hyeok;Hwang, Min-Chae;Hyun, Dong-Yeop;Ku, Young-In;Yoo, Dong-Young
    • KIPS Transactions on Computer and Communication Systems / v.11 no.10 / pp.363-372 / 2022
  • Since the recent COVID-19 pandemic, the ransomware threat has intensified along with the expansion of remote work. Anti-virus vendors are currently trying to respond to ransomware, but traditional static analysis based on file signatures can be defeated by diversification, obfuscation, variants, or the emergence of new ransomware. Various studies on ransomware detection are under way, and the main research types at present are signature-based static analysis and behavior-based dynamic analysis. In this paper, the frequencies of opcodes in the ".text" section and of the Native API calls used in practice were extracted, and the association between the feature information selected using the K-means clustering algorithm, cosine similarity, and the Pearson correlation coefficient was analyzed. In addition, experiments that classify and detect worms, among other malware types, and Cerber-type ransomware verified that the selected feature information is specialized for detecting a specific ransomware (Cerber). Combining the finally selected feature information, applying it to machine learning, and performing hyperparameter optimization yielded a detection rate of up to 93.3%.
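
The two similarity measures named in this abstract can be sketched for a pair of frequency vectors, for example opcode counts from two samples. The function names and vector layout are assumptions. The paper's actual feature pipeline is not described in the abstract.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors.

    Undefined (division by zero) if either vector is constant.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)
```

Cosine similarity ignores the overall scale of the counts, while Pearson correlation also removes each vector's mean, so the two can rank feature pairs differently.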

MORPHEUS: A More Scalable Comparison-Shopping Agent (MORPHEUS: 확장성이 있는 비교 쇼핑 에이전트)

  • Yang, Jae-Yeong;Kim, Tae-Hyeong;Choe, Jung-Min
    • Journal of KIISE:Software and Applications / v.28 no.2 / pp.179-191 / 2001
  • Comparison shopping is a merchant brokering process that finds the best price for a desired product from several Web-based online stores. To get a scalable comparison shopper, we need an agent that automatically constructs a simple information extraction procedure, called a wrapper, for each semi-structured store. Automatic construction of wrappers for HTML-based Web stores is difficult because HTML only defines how information is to be displayed, not what it means, and because different stores handle customer queries differently and use different presentation formats for product descriptions. Wrapper induction has been suggested as a promising strategy for overcoming this heterogeneity. However, previous scalable comparison shoppers such as ShopBot rely on a strong bias in the product descriptions, and as a result many stores that do not conform to this bias could not be recognized. This paper proposes a more scalable comparison-shopping agent named MORPHEUS. MORPHEUS presents a simple but robust inductive learning algorithm that automatically constructs wrappers. The main idea of the proposed algorithm is to recognize the position and structure of a product description unit by finding the most frequent pattern in the sequence of logical line information in output HTML pages. MORPHEUS successfully constructs correct wrappers for most stores by weakening a bias assumed in previous systems. It also tolerates some noise that might be present in product descriptions, such as missing attributes. MORPHEUS generates wrappers rapidly by omitting the pre-processing phase that removes redundant fragments of a page such as the header, the trailer, and advertisements. Finally, MORPHEUS provides a framework from which a customized comparison-shopping agent can be organized for a user by facilitating the dynamic addition of new stores.

A Study on the Effects of Airborne LiDAR Data-Based DEM-Generating Techniques on the Quality of the Final Products for Forest Areas - Focusing on GroundFilter and GridsurfaceCreate in FUSION Software - (항공 LiDAR 자료기반 DEM 생성기법의 산림지역 최종산출물 품질에 미치는 영향에 관한 연구 - FUSION Software의 GroundFilter 및 GridsurfaceCreate 알고리즘을 중심으로 -)

  • PARK, Joo-Won;CHOI, Hyung-Tae;CHO, Seung-Wan
    • Journal of the Korean Association of Geographic Information Studies / v.19 no.1 / pp.154-166 / 2016
  • Through comparative analysis, this study aims to contribute to a better understanding of how changes in the parameter values of the GroundFilter algorithm (GF), which performs the filtering process, and of the GridsurfaceCreate algorithm (GC), which creates a regular grid, both provided in FUSION software, affect the elevation accuracy of the final LiDAR-DEM products. To test whether changes in the GF parameter levels (1, 3, 5, 7, 9) and the GC parameter levels (1, 3, 5, 7, 9) significantly affect the accuracy of the final LiDAR-DEM products, a two-way ANOVA is conducted on residuals. The residuals are calculated as the differences between each sample plot's paired field-measured and DEM-derived elevation values at each individual GF and GC level. A Tukey HSD test is then conducted as a post hoc test to group the levels. The two-way ANOVA shows that the change in GF levels significantly affects the accuracy of LiDAR-DEM elevations (F-value: 27.340, p < 0.01), while the change in GC levels does not (F-value: 0.457). It also shows that an interaction effect between GF and GC levels is unlikely to exist (F-value: 0.247). According to the Tukey HSD test on the GF levels, the GF levels can be divided into two groups ('7', '5', '9', '3' vs. '1') by the differences in the means of the residuals. Under the current conditions, LiDAR-DEM achieves the best accuracy when levels '7' and '3' are used as the GF and GC levels, respectively.

Identification of Nonstationary Time Varying EMG Signal in the DCT Domain and a Real Time Implementation Using Parallel Processing Computer (DCT 평면에서의 비정상 시변 근전도 신호의 인식과 병렬처리컴퓨터를 이용한 실시간 구현)

  • Lee, Young-Seock;Lee, Jin;Kim, Sung-Hwan
    • Journal of Biomedical Engineering Research / v.16 no.4 / pp.507-516 / 1995
  • This study proposes a nonstationary identifier in the DCT domain for identifying the AR parameters of above-lesion upper-trunk electromyographic (EMG) signals, as a means of developing a reliable real-time signal to control functional electrical stimulation (FES) that enables primitive walking in paraplegics. As a paraplegic shifts posture from one attitude to another, there is a transition period in which the signal is clearly nonstationary. As the muscle fatigues, nonstationarities also become more prevalent, even during stable postures. An identifier for time-varying nonstationary EMG signals is therefore required. In this paper, time-varying nonstationary EMG signals are transformed into the DCT domain, where they are modeled and analyzed. In the DCT domain, we verified a reduction in the condition number and an increase in the smallest eigenvalue of the input correlation matrix, which influences numerical properties. The mean square error was compared with that of the SLS algorithm, and the proposed algorithm was implemented on an IMS T-805 parallel processing computer for real-time application.
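
As a reminder of what "transformed into the DCT domain" means, here is a direct, unnormalized DCT-II in plain Python. The abstract does not state which DCT variant or normalization the authors used, so DCT-II is an assumption, and this O(N^2) form is for illustration rather than real-time use.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)
    ]
```

A constant signal maps to a single non-zero coefficient at k = 0, which shows how the transform concentrates the energy of a slowly varying signal into a few coefficients.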

The Reactive Power Compensation for a Feeder by Control of the Power Factor of PWM Converter Trains (PWM 컨버터 차량의 역률 제어를 통한 급전선로의 무효전력 보상)

  • Kim, Ronny Yongho;Kim, Baik
    • Journal of the Korean Society for Railway / v.17 no.3 / pp.171-177 / 2014
  • PWM converter trains exhibit excellent load characteristics compared with conventional phase-controlled trains with low power factors, because a voltage vector control method lets them operate at power factors close to unity. However, with high track density or extended feeding, significant line losses and voltage drops can occur. Instead of operating these trains at a fixed unity power factor, this paper suggests a continuous optimal power factor control scheme for each train that minimizes line losses and reduces voltage drops as load conditions vary. The proposed method applies the steepest descent algorithm to each car in the same feeding section to establish the reactive power compensation levels that minimize the reactive power loss of the feeder. Simulation results for a sample system show that voltage drops and line losses can both be reduced.
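
The steepest descent iteration named in this abstract can be sketched generically. The abstract does not give the feeder loss model, so the quadratic loss below is a hypothetical stand-in, and `targets`, the step size, and the tolerance are illustrative assumptions.

```python
def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Repeat x <- x - step * grad(x) until the gradient norm falls below tol."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical stand-in loss: f(q) = sum_i (q_i - target_i)^2,
# where q_i plays the role of one train's reactive power setting.
targets = [0.3, -0.2]
grad = lambda q: [2 * (qi - ti) for qi, ti in zip(q, targets)]
q_opt = steepest_descent(grad, [0.0, 0.0])
```

With a real feeder model, `grad` would be replaced by the gradient of the line loss with respect to each train's reactive power, and the step size would need tuning to that model.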

Unmanned Ground Vehicle Control and Modeling for Lane Tracking and Obstacle Avoidance (충돌회피 및 차선추적을 위한 무인자동차의 제어 및 모델링)

  • Yu, Hwan-Shin;Kim, Sang-Gyum
    • Journal of Advanced Navigation Technology / v.11 no.4 / pp.359-370 / 2007
  • Lane tracking and obstacle avoidance are considered two of the key technologies in an unmanned ground vehicle system. In this paper, we propose a method of lane tracking and obstacle avoidance that covers vehicle control, modeling, and sensor experiments. First, obstacle avoidance consists of two parts: a longitudinal control system for acceleration and deceleration, and a lateral control system for steering. Each system is used for unmanned ground vehicle control, which tracks the vehicle's location, recognizes surrounding obstacles, and decides how fast to proceed according to the circumstances. During operation, the vehicle's control strategy can detect obstacles on the road and avoid them, which involves the vehicle's velocity. Second, we explain a method of lane tracking by means of a vision system, which also consists of two parts: vehicle control, which is included in the road model through lateral and longitudinal control, and image processing, which covers the lane tracking method, the image processing algorithm, and the filtering method. Finally, the proposed method for vehicle control, modeling, lane tracking, and obstacle avoidance is confirmed through vehicle tests.

A Study on Recommendation Technique Using Mining and Clustering of Weighted Preference based on FRAT (마이닝과 FRAT기반 가중치 선호도 군집을 이용한 추천 기법에 관한 연구)

  • Park, Wha-Beum;Cho, Young-Sung;Ko, Hyung-Hwa
    • Journal of Digital Contents Society / v.14 no.4 / pp.419-428 / 2013
  • Real-time accessibility and agility are required in u-commerce under a ubiquitous computing environment. Most existing recommendation techniques evaluate customers based on personal profiles, an approach that has difficulty accurately analyzing customers' level of interest and tendencies and that raises cost problems, consequently leaving customers unsatisfied. Research has been conducted to improve the accuracy of information such as customers' level of interest and tendencies. However, the problem lies not in the preconstructed database but in generating the new and diverse profiles used to evaluate the existing data. It is also difficult for existing recommendation techniques to apply a distinct, hierarchical recommendation method to each customer, since customers have varied characteristics. Accordingly, unlike other evaluation techniques, this paper uses an implicit method based on purchase data, without burdening users with onerous questions and answers. We applied the FRAT technique, which can analyze the tendencies of diverse personalization and of the exact customer.

A Study on Development of Disney Animation's Box-office Prediction AI Model Based on Brain Science (뇌과학 기반의 디즈니 애니메이션 흥행 예측 AI 모형 개발 연구)

  • Lee, Jong-Eun;Yang, Eun-Young
    • Journal of Digital Convergence / v.16 no.9 / pp.405-412 / 2018
  • The appropriate time to predict box office success is when a film company decides whether to invest in a scenario. In response to market demand, an AI-based scenario analysis service has been launched, yet the algorithm is by no means perfect. The purpose of this study is to present a model that predicts the box office success of a movie scenario based on the processing mechanisms of the human brain. To derive patterns of visual, auditory, and cognitive stimuli on the time spectrum of box office animation hits, this study applied Weber's law and brain mechanisms. The results are as follows. First, the frequency of brain stimulation in the biggest box office movies was 1.79 times greater than in the failed movies. Second, in box office successes the cognitive stimulus codes are spread evenly, whereas in failures they are concentrated in a few intervals. Third, in box office successes, cognitive stimuli with a high cognitive load appeared alone, whereas visual and auditory stimuli with a low cognitive load appeared simultaneously.