• Title/Summary/Keyword: image clustering

Search Result 599, Processing Time 0.026 seconds

Defect Extraction of Ceramic Image using Fuzzy Clustering Based Enhanced Fuzzy Binarization (퍼지 클러스터링 기반 개선된 Fuzzy Binarization 기법을 이용한 세라믹 영상에서의 결함 추출)

  • Choi, Cheol Ho;Lee, Jin Yu;Park, Heon Sung;Kim, Kwang Baek
    • Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.23-26 / 2019
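  • This paper proposes a new method for extracting defect regions, such as pores and cracks in welded parts, from X-ray images. The proposed method applies an anisotropic diffusion filter to a ceramic X-ray image to remove noise, applies vertical and horizontal histograms to extract the weld region, applies the least squares method to remove the background brightness, and then applies trapezoidal fuzzy stretching to emphasize the intensity values and strengthen the contrast between defect regions and all other regions. The Fuzzy C-Means algorithm is then applied to segment the defect regions, the center intensity values of the resulting clusters are used to set the $\alpha$-cut, a threshold interval is derived from it, and the image is binarized to extract the final defect regions. To verify the defect extraction performance of the proposed method, experiments were run on ceramic X-ray images, and they confirmed that defect regions were extracted more accurately than with the existing method.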
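The clustering step named in this abstract can be illustrated with a minimal sketch of standard Fuzzy C-Means on 1-D intensity values. This is not the authors' code: the function name `fuzzy_c_means_1d`, the evenly spaced initial centers, and the fuzzifier `m=2.0` are assumptions made for illustration, and the abstract does not say how the $\alpha$-cut is derived from the cluster centers, so that step is left out.

```python
def fuzzy_c_means_1d(values, c=2, m=2.0, iters=100, tol=1e-6):
    """Standard Fuzzy C-Means on scalar intensities (hypothetical helper).

    Returns (centers, memberships), where memberships[i][k] is the degree
    to which values[i] belongs to cluster k. Each row sums to 1.
    """
    lo, hi = min(values), max(values)
    # Assumption: deterministic start with centers spread evenly over the range.
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: closer centers receive higher membership.
        memberships = []
        for x in values:
            dist = [abs(x - v) for v in centers]
            if any(d == 0 for d in dist):
                # A point sitting exactly on a center belongs fully to it.
                row = [1.0 if d == 0 else 0.0 for d in dist]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((dist[k] / dist[j]) ** power for j in range(c))
                       for k in range(c)]
            memberships.append(row)
        # Center update: mean of the values weighted by membership ** m.
        new_centers = []
        for k in range(c):
            weights = [row[k] ** m for row in memberships]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Toy intensities: a dark group and a bright group.
centers, memberships = fuzzy_c_means_1d([10, 12, 11, 200, 205, 198])
```

The returned cluster centers are the quantities the abstract says are used to set the threshold interval.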

Method for Estimating Intramuscular Fat Percentage of Hanwoo(Korean Traditional Cattle) Using Convolutional Neural Networks in Ultrasound Images

  • Kim, Sang Hyun
    • International journal of advanced smart convergence / v.10 no.1 / pp.105-116 / 2021
  • To preserve the genetic stock of superior Hanwoo (Korean traditional cattle) and remain competitive on quality against imported beef, producing high-quality Hanwoo beef is essential. %IMF (intramuscular fat percentage) is one of the most important factors in evaluating the value of high-quality meat, although standards vary by country according to food culture and industrial conditions. A %IMF estimation algorithm suited to Hanwoo is therefore needed. In this study, we propose a method for estimating the %IMF of Hanwoo from ultrasound images using a CNN. First, the chemically measured %IMF values are grouped into 10 classes with k-means clustering so that a CNN can be applied. Next, ROI images are taken at regular intervals from each ultrasound image and used for CNN training and estimation. The proposed CNN model is composed of three stages of convolution layers and a fully connected layer. Experiments confirmed that the %IMF of Hanwoo was estimated with an accuracy of 98.2%. The correlation coefficient between the estimated %IMF and the measured %IMF is 0.97, about 10% better than the 0.88 of the previous method.
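The first step above, turning continuous %IMF measurements into class labels with k-means, can be sketched as follows. This is a minimal 1-D Lloyd's algorithm, not the paper's implementation: the function name `kmeans_1d`, the quantile-based initialization, and the toy values are assumptions, and the example uses 3 classes instead of the paper's 10 to stay readable.

```python
def kmeans_1d(values, k, iters=100):
    """Cluster scalar values into k classes (hypothetical helper).

    Returns (centers, labels) with centers sorted in ascending order,
    so label 0 is the lowest class.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start from k evenly spaced quantiles.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in values:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    centers.sort()
    labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in values]
    return centers, labels


# Toy %IMF measurements (made-up values for illustration).
imf = [2.0, 2.5, 3.0, 10.0, 10.5, 11.0, 20.0, 21.0, 22.0]
centers, labels = kmeans_1d(imf, k=3)
```

The resulting labels would serve as the classification targets for the CNN, and each class center gives a representative %IMF value for that class.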

A Fine Dust Measurement Technique using K-means and Sobel-mask Edge Detection Method (K-means와 Sobel-mask 윤곽선 검출 기법을 이용한 미세먼지 측정 방법)

  • Lee, Won-Hyeung;Seo, Ju-Wan;Kim, Ki-Yeon;Lin, Chi-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.2 / pp.97-101 / 2022
  • In this paper, we propose a method for measuring fine dust from CCTV images using K-means clustering and Sobel-mask edge detection. The proposed algorithm collects images with a CCTV camera and restricts processing to a designated region of interest. Once the K-means algorithm has clustered that region, outlines are detected with the Sobel mask, edge strength is measured, and the fine dust concentration is determined from the measured data. The method extracts the contour of a mountain range using the Sobel mask, which has an advantage in diagonal measurement, and the experimental results show how detection differs with the concentration of fine dust.
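The edge-strength measurement can be illustrated with a plain Sobel gradient magnitude. This is only a sketch: the function names and the use of the mean magnitude as the summary statistic are assumptions, and the abstract does not specify how edge strength is mapped to a dust concentration, so no such mapping is attempted here.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a grayscale image given as a list of rows.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(KX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(KY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


def mean_edge_strength(img):
    """Average gradient magnitude over interior pixels (hypothetical metric)."""
    mag = sobel_magnitude(img)
    interior = [v for row in mag[1:-1] for v in row[1:-1]]
    return sum(interior) / len(interior) if interior else 0.0


# Toy region of interest: the same ridge line at high and low contrast.
clear_day = [[0, 0, 100, 100, 100] for _ in range(5)]
hazy_day = [[40, 40, 60, 60, 60] for _ in range(5)]
```

In this toy pair the lower-contrast image yields weaker edges, which is the kind of difference the abstract relies on when relating edge strength to haze.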

Similarity measurement based on Min-Hash for Preserving Privacy

  • Cha, Hyun-Jong;Yang, Ho-Kyung;Song, You-Jin
    • International Journal of Advanced Culture Technology / v.10 no.2 / pp.240-245 / 2022
  • Because information is so valuable, encryption algorithms are heavily used. Encrypted raw data is secure, but problems arise when the decryption key is exposed. In particular, large-scale Internet sites such as Facebook and Amazon suffer serious damage when user data is exposed. Recently, research into a new fourth-generation encryption technology that can protect user-related data without the key otherwise required for encryption has been attracting attention, as has data clustering technology that uses encryption. In this paper, we try to reduce key exposure by using homomorphic encryption, and to maintain privacy while measuring similarity. Similarity measurement methods studied in the past become time-consuming and expensive as the size and scope of the data increase. Min-Hash has therefore been studied as a way to efficiently estimate the similarity between two sets from their signatures. Min-Hash is widely used for anti-plagiarism, graph and image analysis, and genetic analysis. This paper therefore addresses privacy with homomorphic encryption and presents a model for efficient similarity measurement using Min-Hash.
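As an illustration of the Min-Hash idea only, the sketch below estimates the Jaccard similarity of two sets from fixed-length signatures. It does not model the homomorphic encryption part of the paper. The function names, the use of seeded SHA-1 as the hash family, and the signature length are assumptions.

```python
import hashlib


def _seeded_hash(seed, item):
    """64-bit hash of an item under a given seed (hypothetical hash family)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash over the set."""
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_seeded_hash(seed, x) for x in items)
            for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates |A∩B| / |A∪B|."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(a & b) / len(a | b)


a = set(range(100))
b = set(range(50, 150))          # true Jaccard similarity is 50 / 150
estimate = estimate_jaccard(minhash_signature(a), minhash_signature(b))
```

Comparing two signatures costs time proportional to the signature length rather than to the set sizes, which is where the efficiency gain mentioned in the abstract comes from.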

Automatic Poster Generation System Using Protagonist Face Analysis

  • Yeonhwi You;Sungjung Yong;Hyogyeong Park;Seoyoung Lee;Il-Young Moon
    • Journal of information and communication convergence engineering / v.21 no.4 / pp.287-293 / 2023
  • With the rapid development of domestic and international over-the-top markets, a large amount of video content is being created. As the volume of video content increases, consumers tend to increasingly check data concerning the videos before watching them. To address this demand, video summaries in the form of plot descriptions, thumbnails, posters, and other formats are provided to consumers. This study proposes an approach that automatically generates posters to effectively convey video content while reducing the cost of video summarization. In the automatic generation of posters, face recognition and clustering are used to gather and classify character data, and keyframes from the video are extracted to learn the overall atmosphere of the video. This study used the facial data of the characters and keyframes as training data and employed technologies such as DreamBooth, a text-to-image generation model, to automatically generate video posters. This process significantly reduces the time and cost of video-poster production.

Water resources monitoring technique using multi-source satellite image data fusion (다종 위성영상 자료 융합 기반 수자원 모니터링 기술 개발)

  • Lee, Seulchan;Kim, Wanyub;Cho, Seongkeun;Jeon, Hyunho;Choi, Minhae
    • Journal of Korea Water Resources Association / v.56 no.8 / pp.497-508 / 2023
  • Agricultural reservoirs are crucial structures for water resources monitoring, especially in Korea, where water resources are unevenly distributed across seasons. Optical and Synthetic Aperture Radar (SAR) satellites are used to monitor reservoirs, but each has its own limitations: optical sensors are sensitive to weather conditions, and SAR sensors are sensitive to noise and to multiple scattering over dense vegetation. In this study, we tried to improve water body detection accuracy through optical-SAR data fusion and to quantitatively analyze the complementary effects. We first detected water bodies at the Edong and Cheontae reservoirs by applying K-means clustering to the Normalized Difference Water Index (NDWI) derived from the Compact Advanced Satellite 500 (CAS500), Kompsat-3/3A, and Sentinel-2, and to the SAR backscattering coefficient from Sentinel-1. We then analyzed the improvement in accuracy obtained by applying K-means clustering to the 2-D space consisting of NDWI and SAR. Kompsat-3/3A had the best accuracy (0.98 at both reservoirs), followed by Sentinel-2 (0.83 at Edong, 0.97 at Cheontae), Sentinel-1 (0.93 at both), and CAS500 (0.69, 0.78). Applying K-means clustering to the 2-D space at Cheontae reservoir improved the accuracy of CAS500 by around 22% (resulting accuracy: 0.95), with an improvement in precision (85%) and a degradation in recall (14%). The precision of Kompsat-3A (Sentinel-2) improved by 3% (5%), and recall degraded by 4% (7%). More precise water resources monitoring is expected to become possible with the development of high-resolution SAR satellites, including CAS500-5, and of image fusion and water body detection techniques.
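The fusion step can be sketched as two-cluster K-means in a 2-D feature space of NDWI and SAR backscatter. NDWI is computed with the usual green and near-infrared formulation, (Green - NIR) / (Green + NIR). Everything else here is an assumption for illustration: the function names, the min-max scaling of both features, the initialization from the lowest and highest NDWI pixels, and the toy reflectance and backscatter values. The abstract does not describe these details.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index from green and NIR reflectance."""
    total = green + nir
    return (green - nir) / total if total != 0 else 0.0


def _minmax(column):
    """Scale a feature to [0, 1] so both features weigh equally (assumption)."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]


def _dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def detect_water(pixels, iters=50):
    """pixels: list of (green, nir, sar_backscatter_db). Returns a water mask."""
    index = _minmax([ndwi(g, n) for g, n, _ in pixels])
    sar = _minmax([s for _, _, s in pixels])
    points = list(zip(index, sar))
    # Assumption: start from the pixels with the lowest and highest NDWI.
    centers = [min(points, key=lambda p: p[0]), max(points, key=lambda p: p[0])]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if _dist2(p, centers[0]) <= _dist2(p, centers[1]) else 1
                  for p in points]
        updated = []
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            updated.append(tuple(sum(v) / len(members) for v in zip(*members))
                           if members else centers[k])
        if updated == centers:
            break
        centers = updated
    # The cluster whose center has the higher NDWI is taken to be water.
    water = 0 if centers[0][0] > centers[1][0] else 1
    return [lab == water for lab in labels]


# Toy pixels: two water-like (high NDWI, low backscatter), two land-like.
pixels = [(0.10, 0.02, -22.0), (0.12, 0.03, -20.0),
          (0.08, 0.30, -9.0), (0.10, 0.35, -8.0)]
mask = detect_water(pixels)
```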

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.139-156 / 2021
  • The main reason for using a deep learning model in image classification is that it can take the relationships between regions into account by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lack such regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also been conducted, and they found that different colors induce different emotions. Among studies using deep learning, some have applied color information to image sentiment classification. Using the image's color information in addition to the image itself improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics of the colors in the picture. The two-color combinations most distributed across all training data were found first; at test time, the two-color combination most distributed in each test image was found, and the result values were corrected according to the color combination distribution. This method weights the result value obtained after the model classifies an image's emotion using an expression based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto, classified into eight categories, were used as the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to the CNN model.
Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying the result values based on color when building a model that classifies an image's sentiment. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black, each of which carries its own meaning. Using Scikit-learn's clustering, the seven colors most distributed in the image are identified. The RGB coordinate values of these colors are then compared with the RGB coordinate values of the 16 reference colors above, and each is converted to the closest reference color. If combinations of three or more colors are selected, too many combinations occur and the distribution becomes scattered, so each combination has less influence on the result value. To solve this problem, two-color combinations were found and used to weight the model's output. Before training, the most distributed color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary to be used during testing. During the test, the two-color combination most distributed in each test image is found. We then checked how that color combination was distributed in the training data and corrected the result accordingly. We devised several equations to weight the result value from the model based on the extracted colors as described above. The data set was randomly split 80:20, and the model was verified using 20% of the data as a test set. The remaining 80% of the data was split into five folds for 5-fold cross-validation, and the model was trained five times, each time with a different validation set. Finally, performance was checked using the test set that had been separated earlier.
Adam was used as the optimizer, and the learning rate was set to 0.01. Training ran for up to 20 epochs and was stopped early if the validation loss did not decrease for five epochs; early stopping was configured to load the model with the best validation loss. Classification accuracy was better when the information extracted from color properties was used together with the CNN architecture than when the CNN architecture was used alone.
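The color-matching step described above (convert each dominant color to the closest reference color, then keep the two most prevalent) can be sketched as follows. This is not the authors' code. The palette is a hypothetical six-color subset with assumed RGB values, since the abstract does not list the RGB coordinates of its 16 colors, and the dominant colors with their pixel shares are assumed to come from an earlier clustering step.

```python
from collections import Counter

# Hypothetical subset of the 16 reference colors; RGB values are assumptions.
PALETTE = {
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def nearest_color(rgb):
    """Name of the palette color closest to rgb in Euclidean RGB distance."""
    return min(PALETTE,
               key=lambda name: sum((a - b) ** 2
                                    for a, b in zip(rgb, PALETTE[name])))


def top_two_combination(dominant_colors):
    """dominant_colors: list of (rgb, share) pairs from a clustering step.

    Shares of cluster centers that map to the same palette color are summed,
    and the two largest are returned as a sorted tuple so that the result
    can be used as a dictionary key.
    """
    shares = Counter()
    for rgb, share in dominant_colors:
        shares[nearest_color(rgb)] += share
    return tuple(sorted(name for name, _ in shares.most_common(2)))


# Toy image summary: mostly reddish, some blue, a little near-white.
combo = top_two_combination([((250, 10, 5), 0.5),
                             ((10, 10, 240), 0.3),
                             ((245, 245, 245), 0.2)])
```

Counting such tuples per emotion class over the training images would give the kind of per-class distribution dictionary the abstract mentions. The log- and exponential-based weighting equations are not given in the abstract and are not reproduced here.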

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the demand for analysis and utilization. Because many industries lack skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. Demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has therefore also increased rapidly. However, object detection and tracking suffer from problems that degrade performance, such as occlusion and an object re-appearing after leaving the recording location. Accordingly, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses single-linkage hierarchical clustering based Re-ID and processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, offers near real-time processing performance, and prevents tracking failures caused by object departure and re-emergence, occlusion, and similar events. By continuously linking each object's action and facial emotion detection results to the same object, it makes efficient video analysis possible.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the object tracking model in every frame, and applies single-linkage hierarchical clustering to the feature vectors extracted from past frames to identify an object whose tracking failed. Through this process, an object whose tracking has failed can be re-tracked when it re-appears after leaving the recording location or after occlusion. As a result, the action and facial emotion detection results of an object newly recognized because of a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue methods, which reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotions recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that, thanks to these processing techniques, the two-stage re-identification model can run in real time even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by falling back on simple metrics. The practical implication is that industrial fields that require action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed model. With its high re-tracking accuracy and processing performance, the proposed model can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment should be conducted using the MOT Challenge dataset, which is used by many international conferences. We will investigate the problems that the IoF algorithm cannot solve in order to develop a complementary algorithm. In addition, we plan further research to apply this model to datasets from various fields related to intelligent video analysis.
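The clustering idea behind the Re-ID step can be sketched with single-linkage clustering cut at a distance threshold, which is equivalent to taking connected components of the graph that links every pair of feature vectors closer than the threshold. This is an illustration, not the paper's system: the function names, the cosine distance, the threshold value, and the toy 2-D feature vectors are assumptions.

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def single_linkage_clusters(vectors, threshold):
    """Cluster labels from single linkage cut at the given distance.

    Two vectors share a cluster if a chain of pairwise distances, each at
    most the threshold, connects them. Implemented with union-find.
    """
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_distance(vectors[i], vectors[j]) <= threshold:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(vectors))]


# Toy features: tracks 0 and 1 look alike, tracks 2 and 3 look alike.
features = [(1.0, 0.0), (0.99, 0.1), (0.0, 1.0), (0.1, 0.99)]
labels = single_linkage_clusters(features, threshold=0.05)
```

Tracks that land in the same cluster would be treated as the same object, so a new track ID created after an occlusion could be merged with the earlier one.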

Statistical Analysis of Projection-Based Face Recognition Algorithms (투사에 기초한 얼굴 인식 알고리즘들의 통계적 분석)

  • 문현준;백순화;전병민
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.5A / pp.717-725 / 2000
  • Over the last several years, a large number of algorithms have been developed for face recognition. The majority of these have been view- and projection-based algorithms. Our definition of projection is not restricted to projecting the image onto an orthogonal basis; it is expansive and includes a general class of linear transformations of the image pixel values. The class includes correlation, principal component analysis, clustering, gray scale projection, and matching pursuit filters. In this paper, we perform a detailed analysis of this class of algorithms by evaluating them on the FERET database of facial images. In our experiments, a projection-based algorithm consists of three steps. The first step is done off-line and determines the new basis for the images. The basis is either set by the algorithm designer or learned from a training set. The last two steps are on-line and perform the recognition. The second step projects an image onto the new basis, and the third step recognizes a face in an image with a nearest neighbor classifier. The classification is performed in the projection space. Most evaluation methods report algorithm performance on a single gallery, which does not fully capture algorithm performance. In our study, we construct a set of independent galleries. This allows us to see how individual algorithm performance varies over different galleries. In addition, we report on the relative performance of the algorithms over the different galleries.
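The two on-line steps described above (project onto a basis, then classify with a nearest neighbor rule in the projection space) can be sketched as follows. The basis here is a hand-set toy basis, which corresponds to the "set by the algorithm designer" case. The function names, the Euclidean distance, and the four-pixel "images" are assumptions for illustration.

```python
def project(image, basis):
    """Linear projection: one dot product per basis vector."""
    return [sum(p * b for p, b in zip(image, vec)) for vec in basis]


def recognize(probe, gallery, basis):
    """Identity of the gallery image nearest to the probe in projection space.

    gallery maps an identity to its flattened image vector.
    """
    probe_coords = project(probe, basis)

    def distance(identity):
        coords = project(gallery[identity], basis)
        return sum((a - b) ** 2 for a, b in zip(probe_coords, coords))

    return min(gallery, key=distance)


# Toy 4-pixel images; the basis keeps only the first two pixels.
basis = [(1, 0, 0, 0), (0, 1, 0, 0)]
gallery = {"person_a": (1.0, 0.0, 5.0, 5.0), "person_b": (0.0, 1.0, 5.0, 5.0)}
match = recognize((0.9, 0.1, 0.0, 0.0), gallery, basis)
```

Because distances are measured after projection, pixels that the basis ignores (the last two in this toy case) have no effect on the match.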

Automatic Tumor Segmentation Method using Symmetry Analysis and Level Set Algorithm in MR Brain Image (대칭성 분석과 레벨셋을 이용한 자기공명 뇌영상의 자동 종양 영역 분할 방법)

  • Kim, Bo-Ram;Park, Keun-Hye;Kim, Wook-Hyun
    • Journal of the Institute of Convergence Signal Processing / v.12 no.4 / pp.267-273 / 2011
  • In this paper, we propose a method for detecting brain tumor regions in MR images. The method consists of three parts: tumor slice detection, tumor region detection, and tumor boundary detection. In the tumor slice detection step, a slice containing tumor regions is identified using symmetry analysis of the 3D brain volume. The tumor region detection step segments the tumor region in the slice identified as a tumor slice; the region is finally detected using spatial features and symmetry analysis based on cluster information. By using knowledge of brain tumors and cluster-based symmetry analysis, tumor slice and tumor region detection are robust to noise and require less computation time. We then use the level set method with the fast marching algorithm to detect the tumor boundary. The boundary is found in the other slices using initial seeds derived from the previous or next slice, until the tumor region vanishes. This requires less computation time because not every procedure is performed on every slice.
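The intuition behind symmetry analysis is that a healthy brain slice is roughly mirror-symmetric about its vertical midline, so a tumor shows up as a left-right difference. The sketch below scores each slice by its mean absolute mirror difference. The abstract does not give the actual symmetry measure, so the score, the function names, and the assumption that the midline is the image center are all illustrative.

```python
def asymmetry_score(slice_2d):
    """Mean absolute difference between each pixel and its mirror image.

    Assumes the brain midline coincides with the vertical center of the
    slice. A perfectly mirror-symmetric slice scores 0.0.
    """
    width = len(slice_2d[0])
    total, count = 0.0, 0
    for row in slice_2d:
        for x in range(width // 2):
            total += abs(row[x] - row[width - 1 - x])
            count += 1
    return total / count if count else 0.0


def most_asymmetric_slice(volume):
    """Index of the slice with the highest asymmetry score in a 3-D volume."""
    scores = [asymmetry_score(s) for s in volume]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy volume: a symmetric slice, then one with a bright spot on one side.
symmetric = [[1, 2, 2, 1], [3, 4, 4, 3]]
lesion = [[1, 2, 2, 1], [3, 4, 9, 3]]
candidate = most_asymmetric_slice([symmetric, lesion])
```

A slice flagged this way could then be passed on to the region and boundary detection steps that the abstract describes.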