• Title/Summary/Keyword: 학습영상 (training images)

Multi-View 3D Human Pose Estimation Based on Transformer (트랜스포머 기반의 다중 시점 3차원 인체자세추정)

  • Seoung Wook Choi;Jin Young Lee;Gye Young Kim
    • Smart Media Journal / v.12 no.11 / pp.48-56 / 2023
  • Three-dimensional human pose estimation is used in sports, motion recognition, and special effects for video media. Among the available methods, multi-view 3D human pose estimation is essential for precise estimation in complex real-world environments. However, existing multi-view models have high time complexity because they use 3D feature maps. This paper proposes a method that extends an existing Transformer-based monocular multi-frame model, which has lower time complexity, to multi-view 3D human pose estimation. To extend the model to multiple viewpoints, the proposed method first generates 8-dimensional joint coordinates by concatenating the 2D coordinates of each of 17 joints from 4 viewpoints, obtained with the 2D human pose detector CPN (Cascaded Pyramid Network). It then converts them into 17×32 data through patch embedding and finally feeds the data into a transformer model. As a result, the MLP (Multi-Layer Perceptron) block that outputs the 3D human pose updates the 3D pose estimates for all 4 viewpoints simultaneously at every iteration. Compared to Zheng[5]'s method, the proposed method has 48.9% of the model parameters, reduces MPJPE (Mean Per Joint Position Error) by 20.6 mm (43.8%), and is more than 20 times faster in average training time per epoch.
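The input construction and the error metric named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random joint coordinates, the plain linear projection standing in for the patch embedding, and all function names are assumptions made for the example.

```python
import math
import random

NUM_JOINTS, NUM_VIEWS, EMBED_DIM = 17, 4, 32

def concat_views(views):
    """Join the (x, y) of each joint across all views: 17 rows of 8 values."""
    return [[c for view in views for c in view[j]] for j in range(NUM_JOINTS)]

def embed(tokens, weight):
    """Assumed linear embedding: each 8-value row times an 8x32 matrix."""
    return [[sum(row[k] * weight[k][d] for k in range(len(row)))
             for d in range(len(weight[0]))] for row in tokens]

def mpjpe(pred, target):
    """Mean Per Joint Position Error: mean Euclidean distance over joints."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)

rng = random.Random(0)
# Hypothetical 2D detections: 4 views, 17 joints, one (x, y) pair per joint.
views = [[(rng.random(), rng.random()) for _ in range(NUM_JOINTS)]
         for _ in range(NUM_VIEWS)]
tokens = concat_views(views)                      # 17 x 8
weight = [[rng.gauss(0, 0.02) for _ in range(EMBED_DIM)]
          for _ in range(2 * NUM_VIEWS)]          # 8 x 32, untrained
embedded = embed(tokens, weight)                  # 17 x 32
print(len(tokens), len(tokens[0]), len(embedded), len(embedded[0]))
```

Treating each joint as one token keeps the sequence length at 17 regardless of the number of views, which is consistent with the abstract's point about avoiding 3D feature maps.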

Segmentation Foundation Model-based Automated Yard Management Algorithm (의미론적 분할 기반 모델을 이용한 조선소 사외 적치장 객체 자동 관리 기술)

  • Mingyu Jeong;Jeonghyun Noh;Janghyun Kim;Seongheon Ha;Taeseon Kang;Byounghak Lee;Kiryong Kang;Junhyeon Kim;Jinsun Park
    • Smart Media Journal / v.13 no.2 / pp.52-61 / 2024
  • In shipyards, aerial images are acquired at regular intervals using Unmanned Aerial Vehicles (UAVs) to manage external storage yards. These images are then inspected by humans to track the status of the yards, which requires significant time and manpower, especially for large areas. In this paper, we propose an automated management technology based on a semantic segmentation foundation model to address these challenges and accurately assess the status of external storage yards. Because publicly available datasets for external storage yards are insufficient, we collected a small-scale dataset of storage yard objects and equipment. Using this dataset, we fine-tune an object detector and extract initial object candidates, which are used as prompts for the Segment Anything Model (SAM) to obtain precise semantic segmentation results. Furthermore, to facilitate continuous collection of storage yard data, we propose a training data generation pipeline using SAM. On average, our method achieves 4.00%p higher performance than previous semantic segmentation methods; specifically, it achieves 5.08% higher performance than SegFormer.

Research on APC Verification for Disaster Victims and Vulnerable Facilities (재난약자 및 취약시설에 대한 APC실증에 관한 연구)

  • Seungyong Kim;Incheol Hwang;Dongsik Kim;Jungjae Shin;Seunggap Yong
    • Journal of the Society of Disaster Information / v.20 no.1 / pp.199-205 / 2024
  • Purpose: This study aims to improve the recognition rate of Auto People Counting (APC) so that, in the event of a disaster, the number of evacuees remaining in disaster-vulnerable facilities such as nursing homes can be accurately identified and reported to firefighting and other response agencies. Methods: A baseline model was established using CNN (Convolutional Neural Network) models to improve the algorithm that recognizes images of people entering and leaving, captured by cameras installed in actual disaster-vulnerable facilities operating APC systems. Various algorithms were analyzed and the top seven candidates were selected. Transfer learning models were then used to select the best-performing algorithm. Results: The experiments confirmed the precision and recall of the Densenet201 and Resnet152v2 models, which performed best in terms of time and accuracy. Both models showed 100% accuracy for all labels, with Densenet201 showing superior performance. Conclusion: The optimal algorithm applicable to APC was selected from among various artificial intelligence algorithms. Further research on algorithm analysis and training is required to accurately identify people entering and leaving disaster-vulnerable facilities across various disaster situations.

Real-Time Traffic Information and Road Sign Recognitions of Circumstance on Expressway for Vehicles in C-ITS Environments (C-ITS 환경에서 차량의 고속도로 주행 시 주변 환경 인지를 위한 실시간 교통정보 및 안내 표지판 인식)

  • Im, Changjae;Kim, Daewon
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.1 / pp.55-69 / 2017
  • Recently, the IoT (Internet of Things) environment has been developing rapidly through networks that link intelligent objects. The IoT enables communication between humans and objects and between objects themselves, and it provides artificial intelligence services combined with situational awareness. One of the industries based on the IoT is the automotive industry. Self-driving vehicles, which are fuel-efficient, keep traffic flowing smoothly, and put top priority on human safety, have become one of the most prominent topics. For several years, research on recognizing the surrounding environment of self-driving vehicles using sensors, lidar, cameras, and radar has been actively pursued. Currently, this research is being accelerated by networking between vehicles, and between vehicles and infrastructure, based on WAVE (Wireless Access in Vehicular Environment). In this paper, we studied the recognition of traffic signs on highways as part of surrounding-environment awareness for self-driving vehicles. Because traffic signs follow fixed standards and installation locations, we present a learning approach and the corresponding experimental results for how a vehicle recognizes traffic signs and the additional information on them.

A Benchmark of Open Source Data Mining Package for Thermal Environment Modeling in Smart Farm(R, OpenCV, OpenNN and Orange) (스마트팜 열환경 모델링을 위한 Open source 기반 Data mining 기법 분석)

  • Lee, Jun-Yeob;Oh, Jong-wo;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.168-168 / 2017
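  • Despite the growing number of environmental sensors, imaging devices, and feeding management systems in ICT-converged smart farms, technology for making effective use of the data these devices produce remains lacking. For pig houses, data analysis and modeling technology is needed that can monitor and predict livestock welfare levels and growth changes in real time. This requires detecting physiological and behavioral changes in livestock early, and monitoring, analyzing, and predicting welfare levels in real time; one of the representative information and communication engineering approaches for this is data mining. Among the various software available for data mining research, we compared and analyzed four tools provided as open source. For data analysis aimed at thermal environment modeling in a smart pig house, we focused on the time needed to derive a data analysis algorithm, visualization capability, and interoperability with other libraries. The four selected tools are 1) R (https://cran.r-project.org), 2) OpenCV (http://opencv.org), 3) OpenNN (http://www.opennn.net), and 4) Orange (http://orange.biolab.si). The comparison was run on Linux-Ubuntu 16.04.4 LTS (x64) with a 3.6 GHz CPU and 64 GB of memory. In terms of development languages, the tools support 1) R script, 2) C/C++, Python, Java, 3) C++, and 4) C/C++, Python, Cython, so C/C++ and Python were relatively advantageous. For data analysis algorithms, where a library is provided at the source-code level, cross-platform development is possible, so results developed on different operating systems could be used without a separate porting step. In terms of built-in libraries, R provides the largest number of data mining algorithms. We judged this to be because the R environment itself is open, so new libraries added online are shared through the cloud. OpenCV was strong in image processing, and OpenNN's strength is that it publishes its neural network training libraries at the source-code level. Orange's strength, unlike the other packages that focus on providing a set of libraries, is that it integrates user-interface features such as visualization and network construction. We will conduct research on auxiliary information processing techniques to cope with the time complexity required for thermal environment modeling, with the aim of implementing smart farm thermal environment modeling in real time.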

Estimating the Stand Level Vegetation Structure Map Using Drone Optical Imageries and LiDAR Data based on an Artificial Neural Networks (ANNs) (인공신경망 기반 드론 광학영상 및 LiDAR 자료를 활용한 임분단위 식생층위구조 추정)

  • Cha, Sungeun;Jo, Hyun-Woo;Lim, Chul-Hee;Song, Cholho;Lee, Sle-Gee;Kim, Jiwon;Park, Chiyoung;Jeon, Seong-Woo;Lee, Woo-Kyun
    • Korean Journal of Remote Sensing / v.36 no.5_1 / pp.653-666 / 2020
  • Understanding vegetation structure is important for managing forest resources for sustainable forest development. Recent technological advances make it possible to apply new technologies such as drones and deep learning to forests and use them to estimate vegetation structure. In this study, the vegetation structure of the Gongju, Samchuk, and Seoguipo areas was identified by fusing drone optical images and LiDAR data using Artificial Neural Networks (ANNs), with accuracies of 92.62% (Kappa value: 0.59), 91.57% (Kappa value: 0.53), and 86.00% (Kappa value: 0.63), respectively. The performance of deep learning-based vegetation structure analysis is expected to increase as the amount of information in the optical and LiDAR data increases. In the future, if a higher-complexity model that can reflect the various characteristics of vegetation is developed with sufficient sampling, a country-level vegetation structure map could be constructed and used as reference data for Korea's policies and regulations.
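The abstract reports overall accuracy alongside a Kappa value, and the two can differ widely when one class dominates, because Kappa subtracts the agreement expected by chance. The sketch below computes both from a confusion matrix. The matrix is a made-up two-class example, not data from the paper, and the function name is an assumption.

```python
def accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] counts samples whose reference class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals for each class.
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / total ** 2
    return observed, (observed - expected) / (1 - expected)

# Hypothetical, heavily imbalanced two-class result.
cm = [[90, 4],
      [3, 3]]
acc, kappa = accuracy_and_kappa(cm)
print(f"accuracy={acc:.2%}, kappa={kappa:.2f}")
```

With this imbalanced example the accuracy is high while Kappa stays moderate, which is the same pattern as the figures quoted in the abstract.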

A Study on Abalone Young Shells Counting System using Machine Vision (머신비전을 이용한 전복 치패 계수에 관한 연구)

  • Park, Kyung-min;Ahn, Byeong-Won;Park, Young-San;Bae, Cherl-O
    • Journal of the Korean Society of Marine Environment & Safety / v.23 no.4 / pp.415-420 / 2017
  • In this paper, an algorithm for counting objects on a conveyor system using machine vision is proposed. Object counting systems based on image processing have been applied in a variety of industries, for purposes such as measuring floating populations and traffic volume. The main object counting methods use template matching and machine learning for detection and tracking. However, the processing time must be short to detect objects on quickly moving conveyor belts. To meet this requirement, the proposed image processing algorithm is region-based. In the experiment, we counted young abalone shells that are similar in shape, size, and color, on a conveyor system that operates in one direction. The algorithm obtains information on objects in the region of interest by comparing a second frame, which changes continuously, against the object information obtained in the first region. Objects are counted if the information in the first and second images matches. The count was exact when young shells were evenly spaced without overlap; when objects moved without gaps between them, missed objects were calculated using size information. The proposed algorithm can be applied to various object counting controls on conveyor systems.

An Adaptive Multi-Level Thresholding and Dynamic Matching Unit Selection for IC Package Marking Inspection (IC 패키지 마킹검사를 위한 적응적 다단계 이진화와 정합단위의 동적 선택)

  • Kim, Min-Ki
    • The KIPS Transactions:PartB / v.9B no.2 / pp.245-254 / 2002
  • An IC package marking inspection system using machine vision locates and identifies the target elements in an input image and decides the quality of the marking by comparing the extracted target elements with standard patterns. This paper proposes an adaptive multi-level thresholding (AMLT) method suitable for a series of operations such as locating the target IC package, extracting the characters, and detecting the Pin1 dimple. It also proposes a dynamic matching unit selection (DMUS) method that is robust to noise and effective at catching local marking errors. The main idea of the AMLT method is to restrict the inputs of Otsu's thresholding algorithm to a specified area and a partial range of gray values, so that it can adapt to the specific domain. The DMUS method dynamically selects the matching unit according to the results of character extraction and layout analysis. Therefore, despite the various errors that can occur during character extraction and layout analysis, it can select a minimal matching unit in any environment. In an experiment with 280 IC package images of eight types, the correct extraction rate for the IC package and Pin1 dimple was 100% and the correct decision rate for marking quality was 98.8%. This result shows that the proposed methods are effective for IC package marking inspection.
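The core idea stated for AMLT, restricting the input of Otsu's algorithm to a specified area and a partial gray range, can be illustrated with a small sketch. This is not the paper's AMLT procedure: the toy image, the region bounds, the gray range, and the function name are all assumptions, and only a single Otsu pass is shown.

```python
def otsu_threshold(values, lo=0, hi=255):
    """Otsu threshold computed only from gray values inside [lo, hi].

    Returns t such that values <= t form one class and values > t the other.
    """
    hist = [0] * 256
    for v in values:
        if lo <= v <= hi:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = lo, -1.0
    w0 = sum0 = 0
    for t in range(lo, hi):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2   # between-class variance (unnormalized)
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Toy image: a bright border (250) around dark marks (~40) on a gray body (~120).
image = [
    [250, 250, 250, 250, 250, 250],
    [250,  40,  42, 120, 122, 250],
    [250,  41,  43, 121, 123, 250],
    [250, 250, 250, 250, 250, 250],
]
all_values = [v for row in image for v in row]
roi_values = [v for row in image[1:3] for v in row[1:5]]  # assumed package area

print("global:", otsu_threshold(all_values))
print("restricted:", otsu_threshold(roi_values, lo=0, hi=200))
```

In this toy case the global threshold separates the bright border from everything else, while the restricted call separates the dark marks from the gray body, which is the kind of domain adaptation the abstract describes.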

Images of Decomposition of Hydrogen Peroxide Demonstration Represented in New Media Contents: Focusing on Simulacra and Simulation (뉴미디어 콘텐츠에서 재현되는 과산화수소 분해 실험의 이미지 -시뮬라크르와 시뮬라시옹을 중심으로-)

  • Shin, Sein;Ha, Minsu;Lee, Jun-Ki
    • Journal of The Korean Association For Science Education / v.40 no.1 / pp.13-28 / 2020
  • This study attempted to understand the characteristics of images of scientific experiments represented and consumed on YouTube, a representative of today's new media. In particular, this paper analyzes cases of hydrogen peroxide decomposition experiments on YouTube based on Baudrillard's theory of Simulacra and Simulation, which discusses the powerful status of images and the blurring of the boundary between the virtual and the real. A total of 14 YouTube videos related to hydrogen peroxide decomposition experiments were analyzed. In those videos, the experiments were typically conducted with several signs representing scientific experiments, but the most important sign was the bubbles produced by the experiments. To attract wider public consumption, the bubbles produced by hydrogen peroxide decomposition have been transformed on YouTube into a more spectacular image of 'super-huge' and 'explosive' bubbles. Considering the influence of new media, which students can access anytime and anywhere, it is positive that science experiments in new media enhance students' familiarity with and access to science. At the same time, however, it is important to note the danger that the purpose of scientific experiments will be reduced to merely 'showing spectacular images', given the nature of new media, which mainly deals with immediate and superficial images. Furthermore, this study argues that students' science media literacy needs to be improved so that they can critically examine the science-related images represented in new media, based on an understanding of the characteristics and limitations of new media that deeply affect daily life.

Design and Implementation of MPEG-21 Testbed (MPEG-21 Testbed의 설계 및 구현)

  • 손정화;권혁민;손현식;조영란;김만배
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2002.11a / pp.139-143 / 2002
  • Since the late 1990s, multimedia content services have become possible over a variety of digital communication networks. However, because the infrastructures for delivering and using multimedia content have developed independently, and because of the variety of integrated management systems, potential problems arise, such as incompatibility among multimedia content representation formats and incompatibility among coexisting network delivery methods and terminal types. As an alternative, MPEG-21, a large framework based on interworking among existing technologies and infrastructures, is under way. MPEG-21 begins by making its standardization goals concrete, and its ultimate aim is "to enable transparent and integrated use of multimedia resources across diverse network environments and terminals." This paper proposes a testbed based on MPEG-21, whose standardization is currently in progress. The testbed consists of three modules: server, client, and DIA (Digital Item Adaptation). The server creates multimedia content as Digital Items (DIs) and, when a client requests a DI, provides the client with a DI adapted by the DIA module. The DIA module runs on the server; its main function is to analyze the DI requested by the client and, using the environment information sent by the client, generate an adapted DI suited to the client environment. The client selects a DI stored on the server and sends the necessary information, such as user preference and terminal capability, to the server. The testbed uses DIs such as sports game video, still images, and files recording game history. The representation language is XML, and the testbed is designed to run in an HTTP-based Web environment.
