• Title/Summary/Keyword: image feature extraction (영상 특징추출)

Event Cognition-based Daily Activity Prediction Using Wearable Sensors (웨어러블 센서를 이용한 사건인지 기반 일상 활동 예측)

  • Lee, Chung-Yeon;Kwak, Dong Hyun;Lee, Beom-Jin;Zhang, Byoung-Tak
    • Journal of KIISE / v.43 no.7 / pp.781-785 / 2016
  • Learning from human behavior in the real world is essential for human-aware intelligent systems such as smart assistants and autonomous robots. Most research focuses on correlations between sensory patterns and a label for each activity. However, human activity is a combination of several event contexts and is a narrative in and of itself. We propose a novel approach to human activity prediction based on event cognition. Egocentric multi-sensor data are collected from an individual's daily life using a wearable device and a smartphone. Event contexts for location, scene, and activity are then recognized, and finally the user's daily activities are predicted by a decision rule based on the event contexts. The proposed method was evaluated on wearable sensor data collected in the real world over 2 weeks from 2 people. Experimental results showed improved recognition accuracy with the proposed method compared with using sensory features directly.

Real-Time License Plate Detection Based on Faster R-CNN (Faster R-CNN 기반의 실시간 번호판 검출)

  • Lee, Dongsuk;Yoon, Sook;Lee, Jaehwan;Park, Dong Sun
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.511-520 / 2016
  • Automatic License Plate Detection (ALPD) is a key technology for efficient traffic control. It is used to improve work efficiency in many applications such as toll payment systems and parking and traffic management. Until recently, most studies detected license plates with hand-crafted image-processing features. This approach has an advantage in speed, but its detection rate can degrade under varying environmental conditions. In this paper, we propose a method that combines a Faster Region-based Convolutional Neural Network (Faster R-CNN) with a conventional Convolutional Neural Network (CNN), which improves computational speed and is robust to environmental changes. The Faster R-CNN module detects license plate candidate regions in an image, and the CNN module that follows removes false positives from the candidates. As a result, we achieved a detection rate of 99.94% on images captured in various environments, with an average operating speed of 80 ms/image. We thus implemented a fast and robust real-time license plate detection system.
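
The two-stage structure described in this abstract (a proposal stage followed by a verification stage that discards false positives) can be illustrated with a minimal sketch. Everything here is hypothetical: `two_stage_detect`, the thresholds, and `toy_verifier` (an aspect-ratio check standing in for the CNN) are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def two_stage_detect(
    candidates: List[Tuple[Box, float]],
    verifier: Callable[[Box], float],
    proposal_threshold: float = 0.5,
    verify_threshold: float = 0.5,
) -> List[Box]:
    """Keep candidate boxes that pass both the proposal score and the verifier."""
    kept = []
    for box, score in candidates:
        # Stage 1: drop low-confidence proposals from the region detector.
        if score < proposal_threshold:
            continue
        # Stage 2: a second model re-scores the region to reject false positives.
        if verifier(box) >= verify_threshold:
            kept.append(box)
    return kept


def toy_verifier(box: Box) -> float:
    """Placeholder for a CNN classifier: accepts plate-like aspect ratios only."""
    _, _, width, height = box
    ratio = width / height
    return 1.0 if 2.0 <= ratio <= 6.0 else 0.0


# Hypothetical proposals: (box, proposal score).
proposals = [
    ((10, 10, 120, 40), 0.9),  # plate-like, confident
    ((50, 60, 40, 40), 0.8),   # confident but square, rejected in stage 2
    ((5, 5, 200, 50), 0.3),    # plate-like but low score, rejected in stage 1
]
plates = two_stage_detect(proposals, toy_verifier)
```

In a real system both stages would be trained networks; the sketch only shows how the second stage filters the output of the first.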

Voice Assistant for Visually Impaired People (시각장애인을 위한 음성 도우미 장치)

  • Chae, Jun-Gy;Jang, Ji-Woo;Kim, Dong-Wan;Jung, Su-Jin;Lee, Ik Hyun
    • The Journal of Korean Institute of Information Technology / v.17 no.4 / pp.131-136 / 2019
  • People with impaired vision face many inconveniences in daily life, such as distinguishing colors, identifying currency notes, and sensing the ambient temperature. To assist visually impaired people, we propose a system that uses optical and infrared cameras. In the proposed system, the optical camera collects features related to colors and currency notes, while the infrared camera obtains temperature information. The user selects the desired service by pushing a button, and the appropriate voice information is provided through a speaker. The device can distinguish 16 colors, four currency notes, and temperature in four steps, and its current accuracy is around 90%. Accuracy can be improved further through block-wise input images, machine learning, and an upgraded infrared camera. In addition, the device will be attached to a cane so that it is easier to carry and use.

Deep Learning-based system for plant disease detection and classification (딥러닝 기반 작물 질병 탐지 및 분류 시스템)

  • YuJin Ko;HyunJun Lee;HeeJa Jeong;Li Yu;NamHo Kim
    • Smart Media Journal / v.12 no.7 / pp.9-17 / 2023
  • Plant diseases and pests affect the growth of many plants, so identifying them at an early stage is very important. Although many machine learning (ML) models have already been used to inspect and classify plant pests, advances in deep learning (DL), a subset of machine learning, have driven considerable progress in this field. In this study, a YOLOX detector and a MobileNet classifier were used to inspect abnormal crops for diseases and pests and to classify the maturity of normal crops. This method extracts various plant pest features effectively. For the experiment, image datasets of various resolutions covering strawberries, peppers, and tomatoes were prepared and used for plant pest classification. The experimental results showed an average test accuracy of 84% and a maturity classification accuracy of 83.91% on images with complex backgrounds. The model effectively detected 6 diseases across 3 plants and classified the maturity of each plant in natural conditions.

A Study on the River Zone Determination Method by River Type Based on 3D DSM Data (3차원 DSM 자료 기반 하천유형별 정밀 하천구역 결정기법 개발)

  • Lim, Dong Hwa;Lee, Choon Ho;Lee, Tae Geun;Sim, Gyoo Seong
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.472-472 / 2021
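  • In Korea, Article 10 of the River Act and Article 3 of the Small River Maintenance Act require that the river zone be determined when a basic river plan is established or when a river is designated or its designation is changed by public notice. When a river zone is set, the site where the levee is located and the land boundary on the channel side of the levee are generally designated as the river zone. For reaches that have no levee plan or are used as unleveed reaches, however, the zone is classified according to three criteria in paragraphs 3 to 5 of Article 10 of the River Act: land corresponding to the planned channel width; the land boundary lying below the design flood level of dams, estuary barrages, flood control reservoirs, and detention basins; and the channel-side land boundary of linear structures, such as railways and roads, that function as levees. Because river cross-section survey data are discontinuous, it is difficult to identify an accurate planimetric boundary when delineating the river zone. When the boundary of a linear structure such as a railway or road is set as the river zone, compensation for the incorporated land varies, and the ambiguous criteria give rise to many civil complaints. This study selected Daecheoncheon, a local river in Busan, as the study site. A predicted flood inundation map was prepared from the design flood level to derive a precise design flood level line, which was overlaid on topographic data to extract the design flood level boundary. In addition, drone surveys were carried out in the unleveed reach to build precise 3D topographic data from drone imagery of the site, and these data were overlaid on the previously computed design flood level boundary to set a precise river zone. Based on the resulting precise river zone, rivers were divided into inner-city rivers and suburban rivers. Inner-city rivers were further divided into culvert (covered) and open-channel reaches, and suburban rivers into leveed and unleveed reaches, and criteria for determining the precise river zone were established for each. Building on the river zone determination process carried out for the unleveed reach of the Daecheoncheon basin, this study established a 3D river zone estimation method for each river type. Applied in practice, the method is expected to serve as base data where river boundaries are ambiguous during river zone estimation, or when landowners and the responsible agency discuss the river zone. Applying it when basic plans are established is also expected to yield more accurate river zone boundaries.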

Transfer Learning-Based Vibration Fault Diagnosis for Ball Bearing (전이학습을 이용한 볼베어링의 진동진단)

  • Subin Hong;Youngdae Lee;Chanwoo Moon
    • The Journal of the Convergence on Culture Technology / v.9 no.3 / pp.845-850 / 2023
  • In this paper, we propose a method for diagnosing ball bearing faults from vibration using transfer learning. The short-time Fourier transform (STFT), which analyzes vibration signals in the time-frequency domain, was used to produce the input to a CNN for fault diagnosis. To train CNN-based deep neural networks quickly and improve diagnostic performance, we propose a transfer learning-based training technique. For transfer learning, the feature extractor and classifier of a VGG-based image classification model were selectively trained. The training data set was the publicly available ball bearing vibration data provided by Case Western Reserve University, and performance was evaluated by comparing the proposed method with an existing CNN model. The experimental results show that transfer learning is useful for condition diagnosis on ball bearing vibration data and suggest that other industries can also use transfer learning to improve condition diagnosis.
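
To make the STFT step concrete, the following is a minimal pure-Python sketch of how a one-dimensional vibration signal becomes a time-frequency magnitude image of the kind a CNN could consume. The frame length, hop size, sampling rate, and synthetic 100 Hz tone are all assumptions chosen for illustration; the abstract does not state the parameters the authors used.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one row per frame, each holding magnitudes for bins 0..frame_len//2."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Direct DFT for clarity; an FFT would be used in practice.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Synthetic stand-in for a vibration signal: a 100 Hz tone sampled at 1600 Hz.
fs = 1600
tone_hz = 100
signal = [math.sin(2 * math.pi * tone_hz * t / fs) for t in range(512)]

spec = stft_magnitude(signal)
# Frequency of the strongest bin in the first frame (bin spacing is fs / frame_len).
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 64
```

The resulting `spec` is a small two-dimensional array (frames by frequency bins) that could be rescaled and fed to an image classifier.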

Lip and Voice Synchronization Using Visual Attention (시각적 어텐션을 활용한 입술과 목소리의 동기화 연구)

  • Dongryun Yoon;Hyeonjoong Cho
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.166-173 / 2024
  • This study explores lip-sync detection, focusing on the synchronization between lip movements and voices in videos. Typically, lip-sync detection techniques crop the facial area of a given video and use the lower half of the cropped box as input to a visual encoder that extracts visual features. To place more emphasis on the articulatory region of the lips for more accurate lip-sync detection, we propose using a pre-trained visual attention-based encoder. The Visual Transformer Pooling (VTP) module, originally designed for the lip-reading task of predicting the script from visual information alone, without audio, is employed as the visual encoder. Our experimental results demonstrate that, despite having fewer learnable parameters, the proposed method outperforms the latest model, VocaList, on the LRS2 dataset, achieving a lip-sync detection accuracy of 94.5% with five context frames. Moreover, our approach outperforms VocaList by approximately 8% in lip-sync detection accuracy even on Acappella, a dataset not used for training.

Physical Offset of UAVs Calibration Method for Multi-sensor Fusion (다중 센서 융합을 위한 무인항공기 물리 오프셋 검보정 방법)

  • Kim, Cheolwook;Lim, Pyeong-chae;Chi, Junhwa;Kim, Taejung;Rhee, Sooahm
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1125-1139 / 2022
  • In an unmanned aerial vehicle (UAV) system, a physical offset can exist between the global positioning system/inertial measurement unit (GPS/IMU) sensor and an observation sensor such as a hyperspectral sensor or a lidar sensor. This physical offset can cause misalignment between images along the flight direction. In a multi-sensor system in particular, the observation sensor has to be swapped regularly for another one, and acquiring calibration parameters each time is costly. In this study, we establish a precise sensor model equation that applies to multiple sensors in common and propose an independent physical offset estimation method. The proposed method consists of 3 steps. First, we define a rotation matrix appropriate for our system and an initial sensor model equation for direct georeferencing. Next, an observation equation for physical offset estimation is established by extracting corresponding points between ground control points and the data observed by a sensor. Finally, the physical offset is estimated from the observed data, and the precise sensor model equation is established by applying the estimated parameters to the initial sensor model equation. Datasets from 4 regions (Jeon-ju, Incheon, Alaska, Norway) at different latitudes and longitudes were compared to analyze the effect of the calibration parameters. We confirmed that misalignment between images was corrected after the physical offset was applied in the sensor model equation. Absolute position accuracy was analyzed on the Incheon dataset against ground control points. For the hyperspectral image, the root mean square error (RMSE) in the X and Y directions was 0.12 m, and for the point cloud, the RMSE was 0.03 m. Furthermore, the relative position accuracy at specific points between the adjusted point cloud and the hyperspectral images was 0.07 m. We therefore confirmed that the proposed estimation method enables precise data mapping for observations without ground control points, and that multi-sensor fusion is feasible. From this study, we expect that a flexible multi-sensor platform can be operated cost-effectively through the independent parameter estimation method.
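
As a worked illustration of the accuracy metric quoted in this abstract, the sketch below computes per-axis and combined horizontal RMSE between mapped points and ground control points. The coordinates are invented for the example and are not the paper's data, and combining the two axes as the root sum of squares is an assumption, since the abstract does not say how the X and Y errors were combined.

```python
import math


def rmse(errors):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical (x, y) positions in metres: mapped points vs. surveyed control points.
mapped = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0), (70.0, 80.0)]
control = [(10.1, 20.0), (29.9, 40.1), (50.1, 59.9), (70.0, 80.1)]

# Residuals along each axis.
dx = [m[0] - c[0] for m, c in zip(mapped, control)]
dy = [m[1] - c[1] for m, c in zip(mapped, control)]

rmse_x = rmse(dx)
rmse_y = rmse(dy)
# Assumed combination: horizontal RMSE as the root sum of squares of the axes.
rmse_xy = math.sqrt(rmse_x ** 2 + rmse_y ** 2)
```

With these made-up residuals of at most 0.1 m per axis, both per-axis values come out just under 0.09 m.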

Clustering-based Hierarchical Scene Structure Construction for Movie Videos (영화 비디오를 위한 클러스터링 기반의 계층적 장면 구조 구축)

  • Choi, Ick-Won;Byun, Hye-Ran
    • Journal of KIISE:Software and Applications / v.27 no.5 / pp.529-542 / 2000
  • In recent years, the use of multimedia information has increased rapidly, and video is growing faster than any other medium because it integrates all media into a single data stream. Although digital video has become much more widely available, its length and unstructured format make effective access difficult for users. Minimal user interaction and an explicit definition of video structure are therefore key requirements in recently developed image and video management systems. This paper defines the terms and the hierarchical video structure, and presents a system that constructs a clustering-based video hierarchy, which lets users browse a summary and randomly access the video content. Instead of a single feature and domain-specific thresholds, we use multiple features that complement each other, together with clustering-based methods that use normalization, so that user interaction is kept to a minimum. The shot boundary detection stage extracts multiple features, applies adaptive filtering to each feature to improve performance by eliminating false factors, and performs k-means clustering with two classes. The resulting shot list is then organized into a video hierarchy by an intelligent unsupervised clustering technique. We ran experiments on static and dynamic movie videos that represent the characteristics of various video types. Shot boundary detection achieved performance of about 95% or higher, and the video hierarchy results were also good.
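
The idea of replacing a hand-tuned threshold with two-class k-means can be sketched as follows. This is a simplified illustration, not the paper's pipeline: it uses a single hypothetical frame-difference feature rather than multiple filtered features, and the function name `kmeans_two_classes` and the sample values are assumptions.

```python
def kmeans_two_classes(values, iterations=20):
    """1-D k-means with k=2; returns (labels, centroids), True = high cluster."""
    # Initialise the two centroids at the extremes of the data.
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_group = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(low_group) / len(low_group)
        # Keep the old centroid if the high cluster is empty (all values equal).
        new_hi = sum(high_group) / len(high_group) if high_group else hi
        if new_lo == lo and new_hi == hi:
            break  # assignments are stable
        lo, hi = new_lo, new_hi
    labels = [abs(v - lo) > abs(v - hi) for v in values]
    return labels, (lo, hi)


# Hypothetical normalised differences between consecutive frames.
frame_diffs = [0.02, 0.03, 0.01, 0.85, 0.02, 0.04, 0.90, 0.03, 0.02]

labels, centroids = kmeans_two_classes(frame_diffs)
# Frames assigned to the high-difference cluster are treated as shot boundaries.
boundaries = [i for i, is_boundary in enumerate(labels) if is_boundary]
```

Because the split between the two clusters is learned from the data, no domain-specific threshold has to be chosen by the user.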

Sea Cucumber (Stichopus japonicus) Grading System Based on Morphological Features during Rehydration Process (수화 시의 형태학적 특징에 따른 건해삼의 등급 분류 시스템 개발)

  • Lee, Choong Uk;Yoon, Won Byong
    • Journal of the Korean Society of Food Science and Nutrition / v.46 no.3 / pp.374-380 / 2017
  • Image analysis and k-means clustering were used to develop a grading system for dried sea cucumber (SC) based on rehydration rate. The SC images were obtained by taking pictures in a box under controlled light conditions. The region of interest was extracted to depict the shape of the SC in a 2D graph, and those 2D shapes were rendered to build a 3D model. The image analysis provided the morphological features of the SC, including length, width, surface area, and volume, which were used to obtain the weight parameters for k-means clustering. The k-means clustering classified the SC samples into three grades. Each SC sample was rehydrated at $30^{\circ}C$ for 40 h, and the flux of each grade was analyzed during rehydration. Our study demonstrates that the mass transfer rate of SC increased as the surface area increased, and that the grade of SC could be classified based on rehydration rate. This study suggests that the optimal rehydration process for SC can be achieved by applying a suitable grading system.