• Title/Summary/Keyword: Video Data Classification


Surficial Sediment Classification using Backscattered Amplitude Imagery of Multibeam Echo Sounder(300 kHz) (다중빔 음향 탐사시스템(300 kHz)의 후방산란 자료를 이용한 해저면 퇴적상 분류에 관한 연구)

  • Park, Yo-Sup;Lee, Sin-Je;Seo, Won-Jin;Gong, Gee-Soo;Han, Hyuk-Soo;Park, Soo-Chul
    • Economic and Environmental Geology
    • /
    • v.41 no.6
    • /
    • pp.747-761
    • /
    • 2008
  • To test acoustic remote classification of seabed sediments, we acquired ground-truth data (video and grab samples) and developed a post-processing procedure for automatic classification based on 300 kHz MultiBeam Echo Sounder (MBES) backscattering data, collected with a Kongsberg Simrad EM3000 at Sokcho Port, East Sea of South Korea. The sonar signal and its classification performance were verified against geo-referenced video imagery with the aid of GIS (Geographic Information System). Water depth at the research site ranged from 5 m to 22.7 m, and backscattering amplitude ranged from -36 dB to -15 dB. Mean grain sizes of sediments from equidistant sampling sites (50 m interval) varied from 2.86$\phi$ to 0.88$\phi$. To extract the main features for seabed classification from the MBES backscattering amplitude, we evaluated the correlation factors between backscattering amplitude and the properties of the sediment samples. The performance of the proposed remote seabed classification was evaluated by comparing the automatic algorithm results against human expert segmentation. The cross-model perception error ratio of the automatic classification algorithm was 8.95% over rocky bottoms and 2.06% in areas of low mean grain size.
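The correlation analysis described above, relating backscatter amplitude to sediment grain size, can be sketched as follows. The paired values below are illustrative, chosen only to fall within the reported ranges (-36 to -15 dB, 0.88 to 2.86 phi); they are not the study's data.

```python
import numpy as np

# Hypothetical paired measurements: mean backscatter amplitude (dB) at each
# grab-sample station, and the measured mean grain size (phi units).
amplitude_db = np.array([-34.0, -30.5, -27.2, -24.8, -21.0, -17.5])
grain_size_phi = np.array([2.86, 2.40, 2.05, 1.70, 1.20, 0.88])

# Pearson correlation factor between backscatter amplitude and grain size:
# coarser sediments (lower phi) tend to return stronger backscatter.
r = np.corrcoef(amplitude_db, grain_size_phi)[0, 1]
print(f"correlation factor: {r:.3f}")
```

A strongly negative factor here would support using backscatter amplitude as the main feature for sediment classification.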

Recognition of Occupants' Cold Discomfort-Related Actions for Energy-Efficient Buildings

  • Song, Kwonsik;Kang, Kyubyung;Min, Byung-Cheol
    • International conference on construction engineering and project management
    • /
    • 2022.06a
    • /
    • pp.426-432
    • /
    • 2022
  • HVAC systems play a critical role in reducing energy consumption in buildings. Integrating occupants' thermal comfort evaluation into HVAC control strategies is believed to reduce building energy consumption while minimizing their thermal discomfort. Advanced technologies, such as visual sensors and deep learning, enable the recognition of occupants' discomfort-related actions, thus making it possible to estimate their thermal discomfort. Unfortunately, it remains unclear how accurately a deep learning-based classifier can recognize occupants' discomfort-related actions in a working environment. Therefore, this research evaluates the classification performance of occupants' discomfort-related actions while sitting at a computer desk. To achieve this objective, this study collected RGB video data on nine college students' cold discomfort-related actions and then trained a deep learning-based classifier using the collected data. The classification results are threefold. First, the trained classifier has an average accuracy of 93.9% for classifying six cold discomfort-related actions. Second, each discomfort-related action is recognized with more than 85% accuracy. Third, classification errors are mostly observed among similar discomfort-related actions. These results indicate that using human action data will enable facility managers to estimate occupants' thermal discomfort and, in turn, adjust the operational settings of HVAC systems to improve the energy efficiency of buildings in conjunction with their thermal comfort levels.
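The three-part evaluation above (average accuracy, per-class accuracy, and errors concentrated among similar actions) can be reproduced from a confusion matrix. The 6x6 matrix below is illustrative only; it is not the paper's data, and the six action labels are left anonymous.

```python
import numpy as np

# Hypothetical confusion matrix for six cold discomfort-related actions
# (rows: true action, columns: predicted action), 50 test clips per action.
confusion = np.array([
    [47, 1, 1, 1, 0, 0],
    [ 2, 45, 2, 0, 1, 0],
    [ 1, 2, 46, 1, 0, 0],
    [ 0, 1, 1, 47, 1, 0],
    [ 0, 0, 1, 2, 46, 1],
    [ 0, 0, 0, 1, 2, 47],
])

# Per-class accuracy: correct predictions divided by the true count per row.
per_class_acc = confusion.diagonal() / confusion.sum(axis=1)
# Overall accuracy: trace over total.
overall_acc = confusion.diagonal().sum() / confusion.sum()
print("per-class accuracy:", np.round(per_class_acc, 3))
print(f"overall accuracy: {overall_acc:.3f}")
```

Off-diagonal mass clustered near the diagonal would correspond to the paper's observation that errors occur mostly between similar actions.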


Video and Film Rating Algorithm using EEG Response Measurement to Content: Focus on Sexuality

  • Kwon, Mahnwoo
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.7
    • /
    • pp.862-869
    • /
    • 2020
  • This study attempted to analyze human brain responses to visual content through EEG signals and measured the brain wave reactions of different age groups to determine the sexuality level of media. The experimental stimuli consist of three video clips (rated ages 12, 15, and 18) used to analyze how subjects react while actually watching sexual content. For measuring and analyzing brain wave reactions, EEG equipment recorded alpha, beta, and gamma wave responses of the subjects' left and right frontal, temporal, and occipital lobes. There were 28 subjects in total, divided into two groups. The experiment configured a sexual content classification scale with age or gender as the discriminating variable and brain region-specific response frequencies (left/right; frontal/temporal/occipital; alpha/beta/gamma waves) as independent variables. The experimental results showed the possibility of distinguishing gender and age differences: clear differences in brain wave response areas and bands were found among high school girls, high school boys, and college students. Using these brain wave response data, this study explored the potential of developing an algorithm that measures age-specific responses to sexual content and applying it to film rating.
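The alpha/beta/gamma band responses used as independent variables above can be extracted from a raw EEG trace with a simple FFT-based band-power computation. The sketch below uses a synthetic signal and the conventional band limits (alpha 8-13 Hz, beta 13-30 Hz, gamma 30-100 Hz); the study's exact preprocessing is not reproduced here.

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Mean spectral power of `signal` within the [lo, hi] Hz band."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

# Synthetic 2-second EEG trace at 256 Hz: a dominant 10 Hz (alpha) component
# plus a weaker 40 Hz (gamma) component.
fs = 256
t = np.arange(0, 2, 1.0 / fs)
eeg = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

for name, (lo, hi) in {"alpha": (8, 13), "beta": (13, 30), "gamma": (30, 100)}.items():
    print(f"{name}: {band_power(eeg, fs, lo, hi):.2f}")
```

Band powers computed per electrode site (frontal/temporal/occipital, left/right) would form the feature set the study feeds into its classification scale.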

Multi-Modal based ViT Model for Video Data Emotion Classification (영상 데이터 감정 분류를 위한 멀티 모달 기반의 ViT 모델)

  • Yerim Kim;Dong-Gyu Lee;Seo-Yeong Ahn;Jee-Hyun Kim
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.01a
    • /
    • pp.9-12
    • /
    • 2023
  • Recently, not only the message of video content but also the emotion conveyed through its form has been shown to affect the psychological state of viewers. Accordingly, research on classifying the emotions of video content is being actively conducted. This paper proposes a multi-modal video emotion classification model that classifies videos from YouTube, one of the most popular video streaming platforms, into seven emotion categories, extracting audio and image data separately from each video and using both for training. Pre-trained VGG (Visual Geometry Group) and ViT (Vision Transformer) models were used as the audio and image classification models, respectively, and their outputs were merged using the fusion method proposed in this paper and then compared. Unlike existing video emotion classification approaches, the proposed method classifies emotions without recognizing the speaker in the video, achieving a best accuracy of 48%.
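The merging of the audio and image branches above can be sketched as a simple late fusion of per-modality probabilities. The emotion labels, logits, and averaging rule below are illustrative assumptions; the paper's own merge method may differ.

```python
import numpy as np

EMOTIONS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Hypothetical logits for one clip from the two uni-modal branches:
# an audio classifier (VGG-style) and an image classifier (ViT-style).
audio_logits = np.array([0.2, 0.1, 0.4, 2.0, 0.8, 0.3, 0.1])
image_logits = np.array([0.1, 0.3, 0.2, 1.5, 1.9, 0.2, 0.4])

# Late fusion: average the per-modality softmax probabilities, then take
# the argmax over the seven emotion categories.
fused = (softmax(audio_logits) + softmax(image_logits)) / 2
print("predicted emotion:", EMOTIONS[int(fused.argmax())])
```

Averaging probabilities lets a confident modality dominate an uncertain one without either branch needing access to the other's features.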


Affective Computing in Education: Platform Analysis and Academic Emotion Classification

  • So, Hyo-Jeong;Lee, Ji-Hyang;Park, Hyun-Jin
    • International journal of advanced smart convergence
    • /
    • v.8 no.2
    • /
    • pp.8-17
    • /
    • 2019
  • The main purpose of this study is to explore the potential of affective computing (AC) platforms in education through two phases of research: Phase I - platform analysis, and Phase II - classification of academic emotions. In Phase I, the results indicate that existing affective analysis platforms can be largely classified into four types according to their emotion detection methods: (a) facial expression-based platforms, (b) biometric-based platforms, (c) text/verbal tone-based platforms, and (d) mixed-methods platforms. In Phase II, we conducted an in-depth analysis of the emotional experiences that a learner encounters in online video-based learning, in order to establish the basis for a new classification system of online learners' emotions. Overall, positive emotions appeared more frequently and for longer than negative emotions. We categorized positive emotions into three groups based on the facial expression data: (a) confidence; (b) excitement, enjoyment, and pleasure; and (c) aspiration, enthusiasm, and expectation. The same method was used to categorize negative emotions into four groups: (a) fear and anxiety, (b) embarrassment and shame, (c) frustration and alienation, and (d) boredom. Drawing on these results, we propose a new classification scheme that can be used to measure and analyze how learners in online learning environments experience various positive and negative emotions, with facial expressions as the indicators.

Arousal and Valence Classification Model Based on Long Short-Term Memory and DEAP Data for Mental Healthcare Management

  • Choi, Eun Jeong;Kim, Dong Keun
    • Healthcare Informatics Research
    • /
    • v.24 no.4
    • /
    • pp.309-316
    • /
    • 2018
  • Objectives: Both the valence and arousal components of affect are important considerations when managing mental healthcare because they are associated with affective and physiological responses. Research on arousal and valence analysis using images, texts, and physiological signals with deep learning is actively underway, and research on improving the recognition rate is needed. The goal of this research was to design a deep learning framework and model to classify arousal and valence, indicating positive and negative degrees of emotion, as high or low. Methods: The proposed arousal and valence classification model for analyzing the affective state was tested using data from 40 channels provided by a dataset for emotion analysis using electroencephalography (EEG), physiological, and video signals (the DEAP dataset). Experiments were based on 10 selected central and peripheral nervous system features, using long short-term memory (LSTM) as the deep learning method. Results: Arousal and valence were classified and visualized on a two-dimensional coordinate plane. Profiles were designed by varying the number of hidden layers, nodes, and hyperparameters according to the error rate. The experimental results show arousal and valence classification accuracies of 74.65% and 78%, respectively. The proposed model performed better than previously reported models. Conclusions: The proposed model appears to be effective in analyzing arousal and valence; specifically, affective analysis of physiological signals based on LSTM is expected to be possible without manual feature extraction. In a future study, the classification model will be adopted in mental healthcare management systems.
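The high/low labels for arousal and valence above are commonly obtained by binarizing the DEAP self-assessment ratings, which are on a continuous 1-9 scale, at the scale midpoint. The ratings below are illustrative, and the midpoint threshold is an assumption about standard DEAP preprocessing rather than the paper's exact choice.

```python
import numpy as np

# Hypothetical (valence, arousal) self-assessment ratings on the 1-9 scale.
ratings = np.array([
    [7.2, 3.1],
    [2.4, 8.0],
    [5.5, 5.1],
    [4.9, 2.2],
])

# Binarize each affect dimension at the scale midpoint (5.0) into low/high,
# producing the two binary targets the LSTM classifier is trained on.
valence_high = ratings[:, 0] > 5.0
arousal_high = ratings[:, 1] > 5.0
for (v, a), vh, ah in zip(ratings, valence_high, arousal_high):
    print(f"({v}, {a}) -> {'high' if vh else 'low'} valence / "
          f"{'high' if ah else 'low'} arousal")
```

The two binary labels together place each trial in one quadrant of the two-dimensional valence-arousal plane used for visualization.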

Egocentric Vision for Human Activity Recognition Using Deep Learning

  • Malika Douache;Badra Nawal Benmoussat
    • Journal of Information Processing Systems
    • /
    • v.19 no.6
    • /
    • pp.730-744
    • /
    • 2023
  • The topic of this paper is the recognition of human activities using egocentric vision, in particular video captured by body-worn cameras, which can be helpful for video surveillance, automatic search, and video indexing. It can also assist elderly and frail persons, potentially improving their lives. Human activity recognition remains problematic because of the large variations in how actions are executed, especially when recognition is realized through an external device, such as a robot acting as a personal assistant. The inferred information is used both online, to assist the person, and offline, to support the personal assistant. The major purpose of this paper is to provide an efficient and simple recognition method, robust to the variability of action executions, that uses only egocentric camera data with a convolutional neural network and deep learning. In terms of accuracy, simulation results outperform the current state of the art by a significant margin: 61% when using egocentric camera data only, more than 44% when using egocentric and several stationary cameras, and more than 12% when using both inertial measurement unit (IMU) and egocentric camera data.

Ontology Modeling and Rule-based Reasoning for Automatic Classification of Personal Media (미디어 영상 자동 분류를 위한 온톨로지 모델링 및 규칙 기반 추론)

  • Park, Hyun-Kyu;So, Chi-Seung;Park, Young-Tack
    • Journal of KIISE
    • /
    • v.43 no.3
    • /
    • pp.370-379
    • /
    • 2016
  • Recently, personal media have been produced in a variety of ways as smart devices have spread, and services using these data are in demand. Research on media analysis and recognition technology has therefore been actively conducted, making it possible to recognize meaningful objects in media. Systems that rely on a media ontology built from video titles, tags, and script information have the disadvantage that they cannot classify media based on what actually appears in the video. In this paper, we propose a system that automatically classifies video using the objects shown in the media data. To do this, we combine description logic-based reasoning with rule-based inference for event processing, where the order of activities may vary. The description logic-based reasoning system represents the relations among objects in the media as an activity ontology. We describe how a separate rule-based reasoning system defines an event according to the order of inferred activities, and how this order-based reasoning automatically classifies each event into the appropriate category. To evaluate the efficiency of the proposed approach, we conducted an experiment using media data classified into valid categories through analysis of YouTube videos.
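The order-based, rule-driven event classification described above can be sketched as matching an ordered pattern of activities against the activities inferred from a video. The activity names and rules below are hypothetical; the paper's ontology and rule set are not reproduced here.

```python
# Hypothetical event rules: each rule fires only when its activities
# appear in the inferred activity sequence in the given order.
RULES = {
    "cooking": ["pick_up_ingredient", "use_stove", "plate_food"],
    "birthday_party": ["light_candles", "sing", "blow_candles"],
}

def appears_in_order(sequence, pattern):
    """True if all items of `pattern` occur in `sequence` in order."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def classify(activities):
    """Return the first event whose ordered rule matches, else 'unclassified'."""
    for event, pattern in RULES.items():
        if appears_in_order(activities, pattern):
            return event
    return "unclassified"

# Activities inferred (e.g. by ontology reasoning) from one video, in order.
detected = ["enter_kitchen", "pick_up_ingredient", "use_stove", "plate_food"]
print(classify(detected))
```

Because the match is an ordered-subsequence test, extra intervening activities do not block a rule, but a rule whose activities occur out of order does not fire.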

Sea Ice Extents and global warming in Okhotsk Sea and surrounding Ocean - sea ice concentration using airborne microwave radiometer -

  • Nishio, Fumihiko
    • Proceedings of the KSRS Conference
    • /
    • 1998.09a
    • /
    • pp.76-82
    • /
    • 1998
  • Increases in greenhouse gases such as $CO_2$ and $CH_4$ would cause global warming in the atmosphere. According to global circulation models, a large increase in atmospheric temperature might occur in the Okhotsk Sea region due to the doubling of greenhouse gases. It is therefore very important to monitor sea ice extents in the Okhotsk Sea. To determine sea ice extents and concentration with higher accuracy, field experiments were begun comparing an Airborne Microwave Radiometer (AMR) with video images recorded from the aircraft (Beech-200). Sea ice concentration is generally proportional to brightness temperature, and accurate retrieval of sea ice concentration from brightness temperature is important because of the sensitivity of the multi-channel data to the amount of open water in the sea ice pack. During the airborne AMR field experiments, the multi-frequency data suggested that sea ice concentration depends slightly on sea ice type, since the brightness temperature differs between thin, small sea ice floes and large floes with different surface signatures. Based on a classification into these two sea ice types, thin ice and large ice floes are clearly distinguished in the scatter plot of 36.5 and 89.0 GHz, but not in the scatter plot of 18.7 and 36.5 GHz. Two algorithms used for deriving sea ice concentrations from airborne multi-channel data are compared: the NASA Team Algorithm and the Bootstrap Algorithm. Intercomparison of both algorithms against the airborne data and the sea ice concentration derived from video images has shown that the Bootstrap Algorithm is more consistent with the binary maps of the video images.
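The proportionality between brightness temperature and sea ice concentration noted above can be sketched as a linear tie-point interpolation, the basic idea underlying retrievals such as the Bootstrap Algorithm. The tie-point values below are illustrative assumptions, not the calibrated values used in the study.

```python
# Assumed single-channel tie points: brightness temperature (K) of pure
# open water and of consolidated sea ice. Real tie points depend on the
# frequency, polarization, and region.
TB_WATER = 160.0
TB_ICE = 250.0

def ice_concentration(tb):
    """Fractional sea ice concentration from brightness temperature,
    linearly interpolated between tie points and clipped to [0, 1]."""
    c = (tb - TB_WATER) / (TB_ICE - TB_WATER)
    return min(max(c, 0.0), 1.0)

for tb in (160.0, 205.0, 250.0, 270.0):
    print(f"TB = {tb:.0f} K -> concentration = {ice_concentration(tb):.2f}")
```

A retrieval like this, applied per pixel, is what would be compared against the binary ice/water maps derived from the aircraft video imagery.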


Wild Bird Sound Classification Scheme using Focal Loss and Ensemble Learning (Focal Loss와 앙상블 학습을 이용한 야생조류 소리 분류 기법)

  • Jaeseung Lee;Jehyeok Rew
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.29 no.2
    • /
    • pp.15-25
    • /
    • 2024
  • For effective analysis of animal ecosystems, technology that can automatically identify the current status of animal habitats is crucial. Specifically, animal sound classification, which identifies species based on their sounds, is gaining great attention where video-based discrimination is impractical. Traditional studies have relied on a single deep learning model to classify animal sounds. However, sounds collected in outdoor settings often include substantial background noise, complicating the task for a single model. In addition, data imbalance among species may lead to biased model training. To address these challenges, in this paper, we propose an animal sound classification scheme that combines predictions from multiple models using Focal Loss, which adjusts penalties based on class data volume. Experiments on public datasets have demonstrated that our scheme can improve recall by up to 22.6% compared to an average of single models.
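The focal loss mentioned above, which down-weights easy examples so that rare, hard-to-classify species contribute more to training, can be sketched for the binary/true-class case as FL(p) = -(1 - p)^gamma * log(p). The gamma value and probabilities below are illustrative.

```python
import numpy as np

def focal_loss(p_true, gamma=2.0):
    """Focal loss on the probability assigned to the true class:
    FL(p) = -(1 - p)**gamma * log(p). With gamma = 0 this reduces to
    ordinary cross-entropy; larger gamma shrinks the penalty on
    well-classified (easy) examples."""
    p = np.clip(p_true, 1e-7, 1.0)
    return -((1.0 - p) ** gamma) * np.log(p)

# An easy example (p = 0.9) is penalized far less than a hard one (p = 0.1),
# shifting the training signal toward under-represented, noisy classes.
easy, hard = focal_loss(0.9), focal_loss(0.1)
print(f"easy: {easy:.4f}, hard: {hard:.4f}")
```

In the ensemble scheme, each member model would be trained with this loss and their predictions combined, so the class-imbalance correction and the noise-robustness of ensembling work together.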