• Title/Abstract/Keyword: Visual Classification

Search results: 585 items (processing time: 0.023 seconds)

A Study on Visual Emotion Classification using Balanced Data Augmentation

  • Jeong, Chi Yoon; Kim, Mooseop
    • Journal of Korea Multimedia Society / Vol. 24, No. 7 / pp.880-889 / 2021
  • Recognizing people's emotions from images is essential in everyday life and is a popular research topic in computer vision. Visual emotion data suffer from severe class imbalance, with most samples concentrated in a few categories. Existing methods do not account for this imbalance and use accuracy as the performance metric, which is unsuitable for evaluating imbalanced datasets. We therefore propose a visual emotion recognition method that uses balanced data augmentation to address the class imbalance. The proposed method generates a balanced dataset by combining random over-sampling with image transformations. It also uses the focal loss as its loss function, which mitigates class imbalance by down-weighting well-classified samples. EfficientNet, a state-of-the-art image classification model, is used to recognize visual emotion. We compare the performance of the proposed method with that of conventional methods on a public dataset. The experimental results show that the proposed method increases the F1 score by 40% compared with the method without data augmentation, mitigating class imbalance without loss of classification accuracy.
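The focal loss used above has a simple closed form, FL(p_t) = -(1 - p_t)^γ · log(p_t); a minimal plain-Python sketch of how it down-weights well-classified samples (γ = 2 is a common default, not necessarily this paper's setting):

```python
import math

def focal_loss(p_t, gamma=2.0):
    # p_t is the predicted probability of the true class.
    # (1 - p_t)**gamma down-weights well-classified samples;
    # gamma = 0 recovers the ordinary cross-entropy loss.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

easy = focal_loss(0.95)  # well-classified sample: tiny loss
hard = focal_loss(0.10)  # misclassified sample: dominates the loss
```

Because majority-class samples are usually the easy, well-classified ones, scaling their loss down lets the minority classes drive the gradient.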

Web Image Classification using Semantically Related Tags and Image Content

  • Cho, Soo-Sun
    • Journal of Internet Computing and Services / Vol. 11, No. 3 / pp.15-24 / 2010
  • In this paper, we propose an image classification method that combines the semantic relations of tags with image content to improve the quality of image retrieval on large image-sharing sites. For image retrieval or classification algorithms to be useful on huge image-sharing sites such as Flickr, they must be applicable to real tagged Web images. To classify Web images by 'bag of visual words'-based image content, our algorithm trains the category model on images preliminarily retrieved with semantically related tags as training data and classifies the test images using PLSA. In experiments on Flickr Web images, the proposed method produced better precision and recall than the existing method that uses tag information alone.
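The 'bag of visual words' representation used above quantizes each local image descriptor against a visual codebook and counts the hits; a minimal sketch on toy data (the codebook and descriptors below are illustrative, not Flickr data):

```python
def bovw_histogram(descriptors, codebook):
    # Quantize each local descriptor to its nearest visual word
    # (squared Euclidean distance) and count occurrences.
    hist = [0] * len(codebook)
    for d in descriptors:
        nearest = min(
            range(len(codebook)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(d, codebook[i])),
        )
        hist[nearest] += 1
    return hist

# Toy codebook of two 2-D visual words and three descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
```

The resulting histogram is the fixed-length vector that topic models such as PLSA, or classifiers, consume in place of the raw image.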

Coating defect classification method for steel structures with vision-thermography imaging and zero-shot learning

  • Jun Lee; Kiyoung Kim; Hyeonjin Kim; Hoon Sohn
    • Smart Structures and Systems / Vol. 33, No. 1 / pp.55-64 / 2024
  • This paper proposes a fusion imaging-based coating-defect classification method for steel structures that uses zero-shot learning. In the proposed method, a halogen lamp generates heat energy on the coating surface of a steel structure; the resulting heat responses are measured by an infrared (IR) camera while photos of the coating surface are captured by a charge-coupled device (CCD) camera. The measured heat responses and visual images are then analyzed using zero-shot learning to classify the coating defects, and the estimated coating defects are visualized across the inspection surface of the steel structure. In contrast to older approaches to coating-defect classification that relied on visual inspection and were limited to surface defects, and to older artificial neural network (ANN)-based methods that required large amounts of data for training and validation, the proposed method accurately classifies both internal and external defects and can classify coating defects of unobserved classes that were not included in training. Additionally, the proposed model easily learns additional classification conditions, making it simple to add classes for new problems of interest and field applications. Validation via field testing shows that fusing visual and thermal imaging improves defect-type classification accuracy by 22.7% compared with using a visual dataset alone. Furthermore, the classification accuracy of the proposed method on a test dataset containing only trained classes is 100%, and with word-embedding vectors for the labels of untrained classes, its classification accuracy is 86.4%.
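Classifying untrained classes via word-embedding vectors, as described above, amounts to matching a predicted feature vector against the nearest label embedding; a minimal sketch with hypothetical 3-dimensional embeddings (a real system would use pretrained embeddings and learned visual features):

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical word-embedding vectors for defect labels; "corrosion"
# stands in for an untrained class known only through its embedding.
label_embeddings = {
    "blister":   [0.9, 0.1, 0.0],
    "crack":     [0.1, 0.8, 0.2],
    "corrosion": [0.0, 0.2, 0.9],
}

def classify(feature_vec):
    # Pick the label whose embedding is closest in cosine similarity,
    # so classes never seen during training can still be named.
    return max(label_embeddings, key=lambda k: cosine(feature_vec, label_embeddings[k]))
```

Adding a new defect class then requires only adding its embedding to the dictionary, which matches the paper's point about easily extending the set of classes.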

A classification technique of J-lead solder joints using a neural network

  • Yu, Chang-Mok; Lee, Joong-Ho; Cha, Young-Yeup
    • Journal of Institute of Control, Robotics and Systems / Vol. 5, No. 8 / pp.995-1000 / 1999
  • This paper presents an optical system and a visual inspection algorithm for detecting solder joint defects of J-lead chips, which are more highly integrated and smaller than gull-wing chips, on printed circuit boards (PCBs). The visual inspection system is composed of three parts: a host PC, an imaging part, and a driving part. The host PC controls the inspection devices and executes the inspection algorithm, the imaging part acquires and processes image data, and the driving part controls an XY-table for automatic inspection. The five most important features are extracted from input images to categorize four classes of J-lead solder joint defects and are fed to a back-propagation network for classification. Experiments with the proposed inspection algorithm confirm good classification accuracy and the effectiveness of the five chosen features.

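Feeding a handful of extracted features to a back-propagation network, as above, reduces at inference time to a simple forward pass; a minimal sketch with hypothetical weights (the paper's network would have trained weights and its own layer sizes):

```python
import math

def mlp_classify(features, w_hidden, w_out):
    # One hidden tanh layer followed by a linear output layer;
    # the returned index is the winning defect class.
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features))) for row in w_hidden]
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return scores.index(max(scores))

# Hypothetical placeholder weights: in the paper these would be
# learned by back-propagation from the five solder-joint features.
w_hidden = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0]]
w_out = [[1, 0], [0, 1]]
```

Back-propagation only changes how the weight matrices are obtained; once trained, classification is exactly this matrix-multiply-and-argmax pipeline.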

Deep Adversarial Residual Convolutional Neural Network for Image Generation and Classification

  • Haque, Md Foysal; Kang, Dae-Seong
    • Journal of Advanced Information Technology and Convergence / Vol. 10, No. 1 / pp.111-120 / 2020
  • Generative adversarial networks (GANs) have achieved impressive performance in image generation and visual classification applications. However, adversarial networks face difficulties in combining an effective generative model with a stable training process. To overcome this problem, we combined a deep residual network with upsampling convolutional layers to construct the generative network. The study shows that image generation and classification performance become more prominent when residual layers are included in the generator. The proposed network empirically shows that it can generate images with higher visual accuracy at the cost of a certain amount of additional complexity, given proper regularization techniques. Experimental evaluation shows that the proposed method performs well on image generation and classification tasks.

A New Method to Find Bars

  • Lee, Yun Hee; Ann, Hong Bae; Park, Myeong-Gu
    • The Bulletin of The Korean Astronomical Society / Vol. 39, No. 1 / pp.40.1-40.1 / 2014
  • We have classified barred galaxies in a sample of 418 RC3 galaxies within z < 0.01 from SDSS DR7 using visual inspection, the ellipse fitting method, and Fourier analysis. We found bar fractions of ~60%, 43%, and 70% for the three methods, respectively, and that the ellipse fitting method tends to miss the bar when a large bulge hides the transition from bar to disk in early spirals. We also confirmed that Fourier analysis cannot distinguish between a bar and a spiral arm structure. These systematic difficulties may have produced the long-standing controversy over the dependence of bar fraction on Hubble sequence, mass, and color. We designed a new method to find bars by analyzing the ratio map of bar strength in polar coordinates, which yields bar fractions of ~27% and ~32% for SAB and SB, respectively. The consistency with visual inspection reaches around 70%, and roughly 90% of visually identified strong bars are classified as SAB or SB in our classification. Although our method also has the weakness that a large bulge lowers the measured bar strength, the fraction of bars missed in early spirals is reduced to about one quarter of that of the ellipse fitting method. Our method can make up for the shortcomings of previous automatic classifications and provides a quantitative bar classification that agrees with visual classification.


Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics

  • Chen, YongHeng; Zhang, Fuquan; Zuo, WanLi
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 1 / pp.392-412 / 2018
  • Due to the semantic gap across different modalities, automatic retrieval of multimedia information still faces a major challenge. It is desirable to provide an effective joint model that bridges the gap and organizes the relationships between modalities. In this work, we develop a deep image annotation and classification model that fuses multi-modal semantic topics (DAC_mmst). The model finds visual and non-visual topics by jointly modeling an image and its loosely related text for deep image annotation while simultaneously learning and predicting the class label. More specifically, DAC_mmst relies on a non-parametric Bayesian model to estimate the number of visual topics that best explains the image. To evaluate the effectiveness of our proposed algorithm, we collected a real-world dataset and conducted various experiments. The experimental results show that DAC_mmst performs favorably in perplexity, image annotation, and classification accuracy compared to several state-of-the-art methods.

Bag of Visual Words Method based on PLSA and Chi-Square Model for Object Category

  • Zhao, Yongwei; Peng, Tianqiang; Li, Bicheng; Ke, Shengcai
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 7 / pp.2633-2648 / 2015
  • The problems of visual-word synonymy and ambiguity are inherent in conventional bag of visual words (BoVW)-based object categorization methods. In addition, noisy visual words, so-called "visual stop-words", degrade the semantic resolution of the visual dictionary. In view of this, a novel bag of visual words method based on PLSA and a chi-square model is proposed for object categorization. First, Probabilistic Latent Semantic Analysis (PLSA) is used to analyze the semantic co-occurrence probability of visual words, infer the latent semantic topics in images, and obtain the latent topic distributions induced by the words. Second, the KL divergence is adopted to measure the semantic distance between visual words, yielding semantically related near-synonyms. An adaptive soft-assignment strategy is then applied to realize a soft mapping between SIFT features and these near-synonyms. Finally, the chi-square model is introduced to eliminate the "visual stop-words" and reconstruct the visual vocabulary histograms, and an SVM (Support Vector Machine) performs the object classification. Experimental results indicate that the synonymy and ambiguity problems of visual words are overcome effectively, and that both the semantic resolution of the visual dictionary and the object classification performance are substantially improved compared with traditional methods.
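Measuring the semantic distance between visual words with KL divergence, as described above, can be sketched on toy topic distributions (the distributions below are illustrative, not actual PLSA output; symmetrizing KL is one common choice when it is used as a distance):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    # KL divergence between two topic distributions; eps guards log(0).
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def semantic_distance(p, q):
    # Symmetrized KL, since plain KL is not symmetric in p and q.
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

# Illustrative per-word topic distributions P(topic | word):
word_a = [0.70, 0.20, 0.10]
word_b = [0.65, 0.25, 0.10]  # near-synonym of word_a
word_c = [0.10, 0.10, 0.80]  # semantically distant word
```

Words whose topic distributions lie close under this distance are the near-synonym candidates that the soft-assignment step then maps features onto.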

Classification of Respiratory States based on Visual Information using Deep Learning

  • Song, Joohyun; Lee, Deokwoo
    • Journal of the Korea Academia-Industrial cooperation Society / Vol. 22, No. 5 / pp.296-302 / 2021
  • This paper proposes an approach to classifying human respiratory states based on visual information. An ultra-wide-band radar sensor acquires the respiration signals, and the respiratory states are classified from two-dimensional (2D) images instead of one-dimensional (1D) vectors, since 1D vector-based classification has limitations when normal respiration takes various forms. A deep neural network model learns the 2D images of the respiration signals for classification. Conventional methods classify quantified respiration values, or variations of them, using regression or deep learning techniques; using 2D images of the respiration signals instead improved classification accuracy by 10% compared with a 1D vector representation. In the classification experiment, the respiratory states were categorized into three classes: normal-1, normal-2, and abnormal respiration.
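One simple way to turn a 1D respiration window into a 2D image that a CNN can learn from is to rasterize each sample into an amplitude bin; this is an illustrative construction, not necessarily the paper's exact mapping:

```python
def signal_to_image(signal, height=8):
    # Rasterize a 1-D window into a height x len(signal) binary image:
    # each column marks the amplitude bin of one sample.
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for flat signals
    image = [[0] * len(signal) for _ in range(height)]
    for t, x in enumerate(signal):
        row = min(int((x - lo) / span * height), height - 1)
        image[row][t] = 1
    return image
```

The 2D form exposes the waveform's shape to convolutional filters, which is the kind of structure a 1D vector of quantified values flattens away.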

Model Development and Appraisal by Visual Simulation about Soundproof Grove Types of Street Side

  • Kim, Sung-Kyun; Jeong, Tae-Young
    • Journal of Korean Society of Rural Planning / Vol. 11, No. 2 / pp.59-69 / 2005
  • Because of the increasing number of cars, many highways are being actively constructed, and the noise of passing cars affects the surrounding areas. In response, some alternatives and studies on soundproofing facilities are in progress, but the aesthetic aspect has not been considered. Therefore, this research focuses on the soundproofing effect of each facility type, effective simulation methods, and visual assessment comparing the landscape before and after simulation. Soundproofing facilities are divided broadly into the soundproof barrier, the soundproof mound, and the soundproof grove. The soundproof grove has three main functions: first, leaves and branches absorb sound vibrations; second, leaves absorb sound while branches obstruct it; third, the sound of rustling leaves can mask noise. This research proceeded by classifying soundproof grove types and investigating visual simulation methods. We produced a visual simulation for each type and evaluated the resulting landscapes.