• Title/Summary/Keyword: K-means clustering

Automatic Detection of Foreign Body through Template Matching in Industrial CT Volume Data (산업용 CT 볼륨데이터에서 템플릿 매칭을 통한 이물질 자동 검출)

  • Ji, Hye-Rim;Hong, Helen
    • Journal of Korea Multimedia Society / v.16 no.12 / pp.1376-1384 / 2013
  • In this paper, we propose an automatic method for detecting foreign bodies through template matching in industrial CT volume data. Our method is composed of three main steps. First, in the down-sampled data, the product region is separated from the background after noise reduction, and initial foreign-body candidates are extracted using the mean and standard deviation of the product region. Foreign-body candidates are then extracted using K-means clustering. Second, foreign bodies whose intensity differs from that of the product region are detected using template matching. The template matching is performed by evaluating SSD or joint entropy, depending on the size of the detected foreign-body candidates. Third, to improve the detection rate of foreign bodies in the original volume data, the final foreign bodies are detected using a percolation method. For performance evaluation, industrial CT volume data and simulation data are used; visual inspection and accuracy assessment are performed, and processing time is measured. For accuracy assessment, a density-based detection method is used as the comparative method and Dice's coefficient is measured.
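
The two measures named in this abstract, SSD for template matching and Dice's coefficient for accuracy assessment, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flattened intensity lists, and the voxel index sets are all hypothetical, and the convention for two empty sets is an assumption.

```python
def ssd(template, patch):
    """Sum of squared differences between two equally sized intensity lists."""
    if len(template) != len(patch):
        raise ValueError("template and patch must have the same length")
    return sum((t - p) ** 2 for t, p in zip(template, patch))


def dice_coefficient(detected, reference):
    """Dice's coefficient 2|A & B| / (|A| + |B|) for two sets of voxel indices."""
    detected, reference = set(detected), set(reference)
    if not detected and not reference:
        return 1.0  # assumed convention: two empty sets agree perfectly
    return 2 * len(detected & reference) / (len(detected) + len(reference))


# Hypothetical flattened intensities and voxel index sets, for illustration only.
template = [10, 12, 11, 50]
patch = [10, 13, 11, 48]
print(ssd(template, patch))  # 0 + 1 + 0 + 4 = 5
print(dice_coefficient({1, 2, 3, 4}, {3, 4, 5}))  # 2*2 / (4+3)
```

A lower SSD means a closer match between template and patch, while a Dice value nearer 1 means a larger overlap between the detected and reference regions.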

Human Action Recognition in Still Image Using Weighted Bag-of-Features and Ensemble Decision Trees (가중치 기반 Bag-of-Feature와 앙상블 결정 트리를 이용한 정지 영상에서의 인간 행동 인식)

  • Hong, June-Hyeok;Ko, Byoung-Chul;Nam, Jae-Yeal
    • The Journal of Korean Institute of Communications and Information Sciences / v.38A no.1 / pp.1-9 / 2013
  • This paper proposes a human action recognition method that uses bag-of-features (BoF) based on CS-LBP (center-symmetric local binary pattern) and a spatial pyramid, in addition to a random forest classifier. To construct the BoF, an image is divided into dense regular grids and features are extracted from each patch. The code words, which form a visual vocabulary, are obtained by k-means clustering of a random subset of patches. For enhanced action discrimination, local BoF histograms are estimated from three subdivided levels of a spatial pyramid, and a weighted BoF histogram is generated by concatenating the local histograms. For action classification, a random forest, which is an ensemble of decision trees, is built to model the distribution of each action class. The random forest combined with the weighted BoF histogram is successfully applied to the Stanford Action 40 dataset, which includes various human action images, and its classification performance is better than that of other methods. Furthermore, the proposed method allows action recognition to be performed in near real-time.
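
The codebook step described in this abstract (k-means over patch descriptors, then a histogram of code-word assignments) can be sketched as follows. This is an illustrative sketch only, not the paper's method: it uses plain Lloyd's k-means, seeds the centroids with the first k points for determinism (an assumption), and uses hypothetical 2-D tuples in place of CS-LBP descriptors. The spatial pyramid weighting is not shown.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def bof_histogram(descriptors, codebook):
    """Normalized count of descriptors assigned to each code word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Hypothetical 2-D descriptors standing in for per-patch features.
descriptors = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]
codebook = kmeans(descriptors, k=2)
print(bof_histogram(descriptors, codebook))  # [0.6, 0.4]
```

In practice the seeding would usually be randomized or use k-means++, and the histogram would be computed per pyramid cell before concatenation.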

Track-Before-Detect Algorithm for Multiple Target Detection (다수 표적 탐지를 위한 Track-Before-Detect 알고리듬 연구)

  • Won, Dae-Yeon;Shim, Sang-Wook;Kim, Keum-Seong;Tahk, Min-Jea;Seong, Kie-Jeong;Kim, Eung-Tai
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.39 no.9 / pp.848-857 / 2011
  • A vision-based collision avoidance system for air traffic management requires an excellent multiple target detection algorithm under low signal-to-noise ratio (SNR) levels. Track-before-detect (TBD) approaches have significant applications such as the detection of small and dim targets in an image sequence. In this paper, two detection algorithms based on the TBD approach are proposed to satisfy the multiple target detection requirements. The first algorithm, based on a dynamic programming approach, is designed to classify multiple targets by using a k-means clustering algorithm. In the second approach, a hidden Markov model (HMM) is slightly modified to detect multiple targets sequentially. Both proposed approaches are evaluated in numerical simulations with variations in target appearance properties and provide satisfactory performance as multiple target detection methods.

Scalable Cluster Overlay Source Routing Protocol (확장성을 갖는 클러스터 기반의 라우팅 프로토콜)

  • Jang, Kwang-Soo;Yang, Hyo-Sik
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.3 / pp.83-89 / 2010
  • Scalable routing is one of the key challenges in designing and operating large-scale MANETs. The performance of routing protocols proposed so far is guaranteed only under various limitations, i.e., it depends on the number of nodes in the network or requires the location information of the destination node. Because of this dependency on the number of nodes, the performance of previous routing protocols degrades dramatically as the number of nodes increases. We propose the Cluster Overlay Dynamic Source Routing (CODSR) protocol. We conduct a performance analysis by means of computer simulation under two scaling conditions: diameter scaling and density scaling. The developed algorithm outperforms the DSR algorithm, e.g., by more than 90% in normalized routing load. The operation of CODSR is very simple, and we show that the message and time complexity of CODSR is independent of the number of nodes in the network, which makes CODSR highly scalable.

Multivariate Outlier Removing for the Risk Prediction of Gas Leakage based Methane Gas (메탄 가스 기반 가스 누출 위험 예측을 위한 다변량 특이치 제거)

  • Dashdondov, Khongorzul;Kim, Mi-Hye
    • Journal of the Korea Convergence Society / v.11 no.12 / pp.23-30 / 2020
  • In this study, the relationship between natural gas (NG) data and gas-related environmental elements was analyzed using machine learning algorithms to predict the level of gas leakage risk without directly measuring gas leakage data. The study was based on open data provided by a server using the IoT-based remote control Picarro gas sensor specification. When natural gas leaks into the air, it is a serious problem for air pollution, the environment, and health. The proposed method is a multivariate outlier removal method based on Random Forest (RF) classification for predicting the risk of NG leaks. After unsupervised k-means clustering, the experimental dataset was imbalanced. Therefore, we focused on how well the proposed models predict the medium and high risk classes. We compared the receiver operating characteristic (ROC) curve, accuracy, area under the ROC curve (AUC), and mean standard error (MSE) for each classification model. In our experiments, MOL_RF achieved an accuracy, AUC, and MSE of 99.71%, 99.57%, and 0.0016, respectively.

Forecasting the Growth of Smartphone Market in Mongolia Using Bass Diffusion Model (Bass Diffusion 모델을 활용한 스마트폰 시장의 성장 규모 예측: 몽골 사례)

  • Anar Bataa;KwangSup Shin
    • The Journal of Bigdata / v.7 no.1 / pp.193-212 / 2022
  • The Bass diffusion model is one of the most successful models in marketing research, and in management science in general. Since its publication in 1969, it has guided marketing research on diffusion. This paper illustrates the usage of the Bass diffusion model, using mobile cellular subscription diffusion as a context. We fit the Bass diffusion model to three large developed markets, South Korea, Japan, and China, and to the emerging markets of Vietnam, Thailand, Kazakhstan, and Mongolia. We estimate the parameters of the Bass diffusion model using the nonlinear least squares method. The diffusion of mobile cellular subscriptions follows an S-curve in every case. After obtaining the m, p, and q parameters, we use k-means cluster analysis to group the countries into three groups. By clustering countries, we suggest that diffusion rates and patterns are similar, so that countries with emerging markets can follow in the footsteps of countries with developed markets. The purpose was to predict the timing and magnitude of market maturity and to determine whether the data follow the typical diffusion curve of innovations from the Bass model.
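
For readers unfamiliar with the m, p, and q parameters, the standard closed form of the Bass model gives cumulative adopters as N(t) = m(1 - e^{-(p+q)t}) / (1 + (q/p)e^{-(p+q)t}). The sketch below evaluates that curve and the squared-error objective that a nonlinear least squares fit would minimize. The parameter values and function names are hypothetical, not taken from the paper, and the optimizer itself is omitted.

```python
import math


def bass_cumulative(t, m, p, q):
    """Cumulative adopters at time t under the Bass model.

    m: market potential, p: coefficient of innovation,
    q: coefficient of imitation.
    """
    decay = math.exp(-(p + q) * t)
    return m * (1 - decay) / (1 + (q / p) * decay)


def sum_squared_error(observations, m, p, q):
    """Least squares objective over (t, cumulative_adopters) pairs."""
    return sum((y - bass_cumulative(t, m, p, q)) ** 2 for t, y in observations)


# Hypothetical parameters, chosen only to show the S-shaped curve.
m, p, q = 1000.0, 0.03, 0.38
curve = [(t, bass_cumulative(t, m, p, q)) for t in range(0, 21, 5)]
for t, n in curve:
    print(t, round(n, 1))
```

Each country's fitted (m, p, q) triple could then serve as a three-dimensional point for k-means grouping, ideally after rescaling, since m is typically orders of magnitude larger than p and q.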

A Study on Clustering of Core Competencies to Deploy in and Develop Courseworks for New Digital Technology (카드소팅을 활용한 디지털 신기술 과정 핵심역량 군집화에 관한 연구)

  • Ji-Woon Lee;Ho Lee;Joung-Huem Kwon
    • Journal of Practical Engineering Education / v.14 no.3 / pp.565-572 / 2022
  • Card sorting is a useful data collection method for understanding users' perceptions of relationships between items. In general, card sorting is an intuitive and cost-effective technique that is very useful for user research and evaluation. In this study, the core competencies of each field were used as the competency cards in the subsequent card-sorting stage for course development, and clustering results were derived by applying the K-means algorithm to the sorting results. Based on the card sorting, the clustering of core competencies for each occupation in each field was verified using Participant-Centric Analysis (PCA). For the core competency cards of each occupation, the number of participants who agreed on the clustering and the degree of card similarity were derived relative to the number of sorting participants.

Prompt engineering to improve the performance of teaching and learning materials Recommendation of Generative Artificial Intelligence

  • Soo-Hwan Lee;Ki-Sang Song
    • Journal of the Korea Society of Computer and Information / v.28 no.8 / pp.195-204 / 2023
  • In this study, prompt engineering was explored to improve the performance of teaching and learning material recommendations made with generative artificial intelligence such as GPT and Stable Diffusion. Picture materials were used as the type of teaching and learning material. To explore the impact of prompt composition, four prompts were designed to collect responses: a zero-shot prompt, a prompt containing the target grade, a prompt containing the learning goals, and a prompt containing both the target grade and the learning goals. The collected responses were embedded using Sentence Transformers, reduced in dimension with t-SNE, and visualized, and the relationship between prompts and responses was then explored. In addition, the responses were clustered using the k-means clustering algorithm, the adjacent value of the widest cluster was selected as a representative value and rendered as an image using Stable Diffusion, and the images were evaluated by 30 elementary school teachers according to the criteria for evaluating teaching and learning materials. The thirty teachers judged that three of the four recommended picture materials had educational value and that two of them could be used in actual classes. The prompt that recommended the most valuable picture material was the one containing both the target grade and the learning goal.

Application of Environmental Planning Considering the Trend of PM10 in Ambient Air (미세먼지(PM10) 추세를 고려한 환경계획 적용 방향 제안)

  • Yoon, Eun Joo
    • Journal of Environmental Impact Assessment / v.29 no.3 / pp.210-218 / 2020
  • Even though PM10 in ambient air has been steadily reduced, the public perception of it has deteriorated. Three reasons can be cited: first, the annual average concentration of PM10 still exceeds WHO standards; second, the number of high-concentration PM10 days has increased; and third, regional differences in the causes and phenomena of PM10 have not been sufficiently considered. Therefore, this study aimed to suggest management types for PM10 in ambient air by clustering 69 cities based on the trends and current levels of PM10. In addition, we proposed complementary measures such as green infrastructure, ventilation corridors, and adaptation measures (limiting exposure) for types III (distributed in the central inland region) and IV (metropolitan cities and the south-east coast region), where the improvement in PM10 was insufficient. Although this study did not consider the causes of PM10, it is significant in that it provides a scientific basis for near-future responses based on past PM10 trends.

Keyword Network Analysis for Technology Forecasting (기술예측을 위한 특허 키워드 네트워크 분석)

  • Choi, Jin-Ho;Kim, Hee-Su;Im, Nam-Gyu
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.227-240 / 2011
  • New concepts and ideas often result from extensive recombination of existing concepts or ideas. Both researchers and developers build on existing concepts and ideas in published papers or registered patents to develop new theories and technologies that in turn serve as a basis for further development. As the importance of patents increases, so does that of patent analysis. Patent analysis is largely divided into network-based and keyword-based analyses. The former lacks the ability to analyze technology information in detail, while the latter is unable to identify the relationships between such technologies. To overcome the limitations of network-based and keyword-based analyses, this study blends the two methods and suggests a keyword network based analysis methodology. In this study, we collected significant technology information from each patent related to Light Emitting Diodes (LED) through text mining, built a keyword network, and then executed a community network analysis on the collected data. The results of the analysis are as follows. First, the patent keyword network showed very low density and an exceptionally high clustering coefficient. Technically, density is obtained by dividing the number of ties in a network by the number of all possible ties. The value ranges between 0 and 1, with higher values indicating denser networks and lower values indicating sparser networks. In real-world networks, the density varies depending on the size of the network; increasing the size of a network generally leads to a decrease in density. The clustering coefficient is a network-level measure that illustrates the tendency of nodes to cluster in densely interconnected modules. This measure reflects the small-world property, in which a network can be highly clustered even though it has a small average distance between nodes in spite of the large number of nodes.
Therefore, the low density of the patent keyword network means that its nodes are connected sporadically, while the high clustering coefficient shows that nodes in the network are closely connected to one another. Second, the cumulative degree distribution of the patent keyword network, like that of other knowledge networks such as citation or collaboration networks, followed a clear power-law distribution. A well-known mechanism behind this pattern is preferential attachment, whereby a node with more links is likely to attain further new links as the network evolves. Unlike normal distributions, the power-law distribution does not have a representative scale. This means that one cannot pick a representative or average value because there is always a considerable probability of finding much larger values. Networks with power-law distributions are therefore often referred to as scale-free networks. The presence of a heavy-tailed scale-free distribution represents the fundamental signature of an emergent collective behavior of the actors who contribute to forming the network. In our context, the more frequently a patent keyword is used, the more often it is selected by researchers and associated with other keywords or concepts to constitute and convey new patents or technologies. The evidence of a power-law distribution suggests that the preferential attachment mechanism is the origin of heavy-tailed distributions in a wide range of growing patent keyword networks. Third, we found that among the keywords that flowed into a particular field, the vast majority of keywords with new links join existing keywords in the associated community in forming the concept of a new patent. This finding held for both the short-term (4-year) and long-term (10-year) analyses.
Furthermore, the keyword combination information derived from the methodology suggested by our study enables one to forecast which concepts will combine to form a new patent dimension and to refer to those concepts when developing a new patent.
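
The two network measures defined in this abstract can be made concrete with a short sketch. Density follows the definition given in the text (ties divided by all possible ties) for an undirected network without self-loops. For the clustering coefficient, the sketch assumes the average local clustering coefficient, which is one common network-level definition; the abstract does not say which variant the authors used. The keyword names and ties are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) ties."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def density(adjacency):
    """Number of ties divided by the number of all possible ties."""
    n = len(adjacency)
    ties = sum(len(nbrs) for nbrs in adjacency.values()) / 2
    return ties / (n * (n - 1) / 2) if n > 1 else 0.0


def average_clustering(adjacency):
    """Mean over nodes of the fraction of neighbor pairs that are themselves tied."""
    scores = []
    for nbrs in adjacency.values():
        pairs = list(combinations(nbrs, 2))
        if not pairs:
            scores.append(0.0)  # nodes with fewer than two neighbors score 0
            continue
        closed = sum(1 for a, b in pairs if b in adjacency[a])
        scores.append(closed / len(pairs))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical keyword co-occurrence ties, for illustration only.
edges = [("led", "chip"), ("led", "phosphor"),
         ("chip", "phosphor"), ("led", "lens")]
network = build_adjacency(edges)
print(density(network))             # 4 ties out of 6 possible
print(average_clustering(network))  # mean of 1/3, 1, 1, 0
```

A large network can combine low density with high clustering when its ties concentrate inside small, tightly knit groups, which is the pattern the abstract reports.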