• Title/Summary/Keyword: AI Video


Neural Network-Based Intra Prediction Considering Multiple Transform Selection in Versatile Video Coding (VVC 의 다중 변환 선택을 고려한 신경망 기반 화면내 예측)

  • Dohyeon Park;Gihwa Moon;Sung-Chang Lim;Jae-Gon Kim
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.11a / pp.8-9 / 2022
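  • Since the finalization of the VVC (Versatile Video Coding) standard, JVET (Joint Video Experts Team) has been exploring and verifying neural network-based coding technologies, including intra prediction, through the NNVC (Neural Network-based Video Coding) EE (Exploration Experiment). This paper proposes a TDIP (Transform-Dependent Intra Prediction) model that can select an appropriate prediction block according to the Multiple Transform Selection (MTS) adopted in VVC. Experimental results show that, in the VVC AI (All Intra) coding configuration, the proposed method achieves BD-rate reductions of 0.87%, 0.87%, and 0.99% for Y, U, and V, respectively, compared with VTM (VVC Test Model).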
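Several abstracts in this listing report results as BD-rate. As a rough illustration of how such a figure is commonly computed, the sketch below follows the classic Bjøntegaard approach: a cubic through four (bitrate, PSNR) points per codec, taken in the log-rate domain and averaged over the shared PSNR range. It is a minimal sketch and not the JVET reference tool; the function names and the sample numbers are hypothetical, and each curve is assumed to have exactly four points with distinct PSNR values.

```python
import math

def _lagrange(xs, ys, x):
    """Evaluate the polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def _simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))

def bd_rate(anchor, test):
    """Return the BD-rate in percent for two lists of four (bitrate, psnr) pairs.

    A negative result means the test codec needs less bitrate than the
    anchor at the same quality.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per curve")
    psnr_a = [q for _, q in anchor]
    psnr_t = [q for _, q in test]
    log_a = [math.log10(r) for r, _ in anchor]
    log_t = [math.log10(r) for r, _ in test]
    # Compare only over the PSNR interval covered by both curves.
    lo = max(min(psnr_a), min(psnr_t))
    hi = min(max(psnr_a), max(psnr_t))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")
    area_a = _simpson(lambda q: _lagrange(psnr_a, log_a, q), lo, hi)
    area_t = _simpson(lambda q: _lagrange(psnr_t, log_t, q), lo, hi)
    avg_log_diff = (area_t - area_a) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0

# Hypothetical data: the test codec uses 10% less bitrate at every PSNR.
anchor_points = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 41.0)]
test_points = [(rate * 0.9, psnr) for rate, psnr in anchor_points]
example_bd_rate = bd_rate(anchor_points, test_points)
```

With the hypothetical data above, `example_bd_rate` should come out close to -10, that is, a 10% bitrate saving at equal quality.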


A Fast Decision Method of Quadtree plus Binary Tree (QTBT) Depth in JEM (차세대 비디오 코덱(JEM)의 고속 QTBT 분할 깊이 결정 기법)

  • Yoon, Yong-Uk;Park, Do-Hyun;Kim, Jae-Gon
    • Journal of Broadcast Engineering / v.22 no.5 / pp.541-547 / 2017
  • The Joint Exploration Model (JEM), the reference software codec of the Joint Video Exploration Team (JVET), which is exploring future video standard technology, provides a recursive Quadtree plus Binary Tree (QTBT) block structure. QTBT achieves enhanced coding efficiency by adding new block structures, at the expense of a large increase in computational complexity. In this paper, we propose a fast decision algorithm for the QTBT block partitioning depth that uses the rate-distortion (RD) costs of the upper and current depths to reduce the complexity of the JEM encoder. Experimental results show that the computational complexity of JEM 5.0 can be reduced by up to 21.6% and 11.0%, with BD-rate increases of 0.7% and 1.2%, in AI (All Intra) and RA (Random Access), respectively.

Adaptive TBC in Intra Prediction on Versatile Video Coding (VVC의 화면 내 예측에서 적응적 TBC를 사용하는 방법)

  • Lee, Won Jun;Park, Gwang Hoon
    • Journal of Broadcast Engineering / v.25 no.1 / pp.109-112 / 2020
  • VVC uses 67 intra prediction modes. The most probable mode (MPM) list is used to reduce the data needed to represent the intra prediction mode. If the mode to be signaled is among the MPM candidates, its index in the MPM list is transmitted; otherwise, TBC encoding is applied. When TBC is applied in intra prediction, the three lowest-numbered modes are coded with 5 bits, and the remaining modes are coded with 6 bits. In this paper, we examine the limitations of the TBC used in VVC intra prediction and propose an adaptive method that encodes more efficiently than the conventional method when TBC is used in intra prediction. As a result, the overall coding efficiency relative to the conventional coding method is 0.01% and 0.04% in AI and RA, respectively.
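The 5-bit and 6-bit split described in this abstract is what truncated binary coding produces for the non-MPM modes. The following is a minimal sketch under two assumptions that the abstract does not state: that TBC stands for truncated binary coding, and that 61 modes remain once a 6-entry MPM list is excluded from the 67 modes. The function name is hypothetical, and the code shows only the conventional scheme, not the paper's adaptive method.

```python
def truncated_binary(index: int, alphabet_size: int) -> str:
    """Return the truncated binary codeword for index in [0, alphabet_size)."""
    if not 0 <= index < alphabet_size:
        raise ValueError("index out of range")
    k = alphabet_size.bit_length() - 1      # floor(log2(alphabet_size))
    short_count = (1 << (k + 1)) - alphabet_size  # symbols that get k bits
    if index < short_count:
        # The lowest-numbered symbols use the shorter k-bit codeword.
        return format(index, f"0{k}b") if k > 0 else ""
    # All other symbols are offset and written with k + 1 bits.
    return format(index + short_count, f"0{k + 1}b")

# Assumed setup: 67 intra modes minus 6 MPM candidates leaves 61 symbols.
REMAINING_MODES = 61
code_lengths = [len(truncated_binary(i, REMAINING_MODES))
                for i in range(REMAINING_MODES)]
five_bit_modes = code_lengths.count(5)
six_bit_modes = code_lengths.count(6)
```

Under these assumptions, `five_bit_modes` is 3 and `six_bit_modes` is 58, which agrees with the abstract's statement that three low-numbered modes take 5 bits and the rest take 6 bits.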

What Concerns Does ChatGPT Raise for Us?: An Analysis Centered on CTM (Correlated Topic Modeling) of YouTube Video News Comments (ChatGPT는 우리에게 어떤 우려를 초래하는가?: 유튜브 영상 뉴스 댓글의 CTM(Correlated Topic Modeling) 분석을 중심으로)

  • Song, Minho;Lee, Soobum
    • Informatization Policy / v.31 no.1 / pp.3-31 / 2024
  • This study aimed to examine the public concerns in South Korea, considering the country's unique context, that were triggered by the advent of generative artificial intelligence such as ChatGPT. To achieve this, comments on 102 YouTube news videos related to ethical issues were collected using a Python scraper, and morphological analysis and preprocessing were carried out on 15,735 comments using Textom. These comments were then analyzed using a Correlated Topic Model (CTM). The analysis identified six primary topics within the comments: "Legal and Ethical Considerations"; "Intellectual Property and Technology"; "Technological Advancement and the Future of Humanity"; "Potential of AI in Information Processing"; "Emotional Intelligence and Ethical Regulations in AI"; and "Human Imitation." Structuring these topics based on a correlation coefficient value of over 10% revealed three main categories: "Legal and Ethical Considerations"; "Issues Related to Data Generation by ChatGPT (Intellectual Property and Technology, Potential of AI in Information Processing, and Human Imitation)"; and "Fear for the Future of Humanity (Technological Advancement and the Future of Humanity, Emotional Intelligence, and Ethical Regulations in AI)." The study confirmed that various concerns coexist with the growing interest in generative AI like ChatGPT, including worries specific to the historical and social context of South Korea. These findings suggest the need for national-level efforts to ensure data fairness.

Object Detection Network Feature Map Compression using CompressAI (CompressAI 를 활용한 객체 검출 네트워크 피쳐 맵 압축)

  • Do, Jihoon;Lee, Jooyoung;Kim, Younhee;Choi, Jin Soo;Jeong, Se Yoon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.7-9 / 2021
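  • This paper proposes a neural network-based method for compressing feature maps extracted partway through an object detection task network supported by Detectron2 [1]. To this end, the compression network of bmshj2018-hyperprior, one of the models in CompressAI [2], an open-source software package for neural network-based image compression, was trained to compress the feature maps extracted from the stem layer of the task network. We also propose a method of adjusting the input image interpolation value of the object detection network so that the width and height of the feature map fed to the compression network are multiples of 64. The proposed neural network-based feature map compression method shows a large performance improvement over compressing the feature maps with VVC (Versatile Video Coding, [3]), the recently finalized next-generation compression standard, and shows performance similar to the VCM anchor.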


A Study on Deep learning algorithm comparison for Block AI virus using thermal video and IoT (열영상과 IoT를 이용한 AI 바이러스 차단을 위한 딥러닝 알고리즘 비교에 대한 연구)

  • No, Seunghyun;Seo, Hojun;Kim, Hyein;Kim, Jeong-Min
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.1097-1100 / 2021
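  • This study compares deep learning algorithms applied to thermal images in order to improve the temperature measurement accuracy of a thermal-imaging body temperature scanner and to shorten face recognition time, both of which are needed to develop an AI virus-blocking system using thermal video and IoT. It aims to identify efficient algorithms and to find ways of refining thermal-image algorithms so that they suit a thermal video-based virus-blocking system.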

Visual Verb and ActionNet Database for Semantic Visual Understanding (동영상 시맨틱 이해를 위한 시각 동사 도출 및 액션넷 데이터베이스 구축)

  • Bae, Changseok;Kim, Bo Kyeong
    • The Journal of Korean Institute of Next Generation Computing / v.14 no.5 / pp.19-30 / 2018
  • Visual information understanding is known as one of the most difficult and challenging problems in the realization of machine intelligence. This paper proposes the derivation of visual verbs and the construction of the ActionNet database as a video database for video semantic understanding. Although advances in AI (artificial intelligence) algorithms account for a large part of modern progress in AI technologies, large databases for algorithm development and testing play a great role as well. As the performance of object recognition algorithms on still images surpasses human ability, research interest is shifting to the semantic understanding of video content. This paper proposes the candidate visual verbs required to construct ActionNet as a training and test database for video understanding. To this end, we first investigate verb taxonomy in linguistics, and then propose visual verb candidates derived from a video description database and from verb frequencies. Based on the derived visual verb candidates, we define and construct the ActionNet schema and database. By expanding the usability of the ActionNet database in an open environment, we expect to contribute to the development of video understanding technologies.

Real-time Camera and Video Streaming Through Optimized Settings of Ethernet AVB in Vehicle Network System

  • An, Byoungman;Kim, Youngseop
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.8 / pp.3025-3047 / 2021
  • This paper presents the latest Ethernet standardization for in-vehicle networks and the future trends of automotive Ethernet technology. The proposed system provides design and optimization algorithms for automotive networking based on AVB (Audio Video Bridge) technology. We present a design of an in-vehicle network system as well as the optimization of AVB for automotive use. The proposed Reduced Latency of Machine to Machine (RLMM) scheme plays a key role in reducing the latency among devices. Applying RLMM to real-world experimental cases yields a latency reduction of around 41.2%. The setup optimized for the automotive network environment is expected to significantly reduce the time spent in the development and design process. The image transmission latency results obtained in the study are trustworthy because average values were collected over a long period of time. Analyzing the latency between multimedia devices within a limited time is necessary and will be of considerable benefit to the industry. Furthermore, the proposed reliable camera and video streaming through optimized AVB device settings would provide a high level of support for the real-time comprehension and analysis of images with AI (Artificial Intelligence) algorithms in autonomous driving.

Exploring Service Improvement Opportunities through Analysis of OTT App Reviews (OTT 앱 리뷰 분석을 통한 서비스 개선 기회 발굴 방안 연구)

  • Joongmin Lee;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence / v.27 no.2_2 / pp.445-456 / 2024
  • This study aims to suggest service improvement opportunities by analyzing user review data for the top three OTT service apps (Netflix, Coupang Play, and TVING) on the Google Play Store. To achieve this objective, we propose a framework for uncovering service opportunities through the analysis of negative user reviews of OTT service providers. The framework automates the labeling of identified topics and generates service improvement opportunities using topic modeling and prompt engineering, leveraging GPT-4, a generative AI model. As a result, we pinpointed five dissatisfaction topics each for Netflix and TVING, and nine for Coupang Play. Common issues include "video playback errors", "app installation and update errors", "subscription and payment" problems, and concerns regarding "content quality". The commonly identified service enhancement opportunities include "enhancing and diversifying content quality", "optimizing video quality and data usage", "ensuring compatibility with external devices", and "streamlining payment and cancellation processes". In contrast to prior research, this study introduces a novel research framework that leverages generative AI to label topics and propose improvement strategies based on the derived topics. This is noteworthy because it identifies actionable service opportunities aimed at enhancing service competitiveness and satisfaction, instead of merely outlining topics.

Separate Scale for Position Dependent Intra Prediction Combination of VVC

  • Yoon, Yong-Uk;Park, Dohyeon;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.20-21 / 2019
  • The Joint Video Experts Team (JVET) has been working on the development of the next-generation video coding standard called Versatile Video Coding (VVC). Position Dependent Intra Prediction Combination (PDPC), one of the major tools for intra prediction, refines the prediction through a linear combination of the reconstructed samples and the predicted samples according to the sample position. In VVC WD6, nScale, the shift value that adjusts the weight, is determined jointly by the width and height of the current block. This may cause PDPC to be applied to regions that do not fit the characteristics of the current intra prediction mode. In this paper, we define nScale separately for width and height so that the weight can be applied independently to the left and top reference samples, respectively. Experimental results show that, compared to VTM 6.0, the proposed method gives -0.01%, -0.04%, and 0.01% Bjøntegaard-Delta (BD)-rate performance for the Y, Cb, and Cr components, respectively, in the All-Intra (AI) configuration.
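To make the role of nScale concrete, the sketch below computes PDPC-style position weights for a non-square block. The joint derivation and the weight formula follow the form commonly given for the planar and DC cases of VVC PDPC, and should be treated here as an assumption rather than a quotation of WD6. The separate per-dimension derivation is purely hypothetical, chosen only so that it matches the joint value for square blocks; it is not the definition used in the paper, and all function names are illustrative.

```python
def log2_int(n: int) -> int:
    """Integer log2 for power-of-two block sizes."""
    return n.bit_length() - 1

def n_scale_joint(width: int, height: int) -> int:
    """Single shift value from both dimensions (assumed WD6-style form)."""
    return (log2_int(width) + log2_int(height) - 2) >> 2

def n_scale_separate(width: int, height: int) -> tuple:
    """HYPOTHETICAL per-dimension shifts; not the paper's definition."""
    scale_w = (2 * log2_int(width) - 2) >> 2
    scale_h = (2 * log2_int(height) - 2) >> 2
    return scale_w, scale_h

def pdpc_weight(pos: int, n_scale: int) -> int:
    """Weight of a reference sample at distance pos from the block edge."""
    return 32 >> ((pos << 1) >> n_scale)

# Example: a wide 32x4 block.
W, H = 32, 4
joint = n_scale_joint(W, H)
scale_w, scale_h = n_scale_separate(W, H)

# Left-reference weights decay along x, top-reference weights along y.
left_joint = [pdpc_weight(x, joint) for x in range(8)]
left_separate = [pdpc_weight(x, scale_w) for x in range(8)]
top_joint = [pdpc_weight(y, joint) for y in range(H)]
top_separate = [pdpc_weight(y, scale_h) for y in range(H)]
```

In this sketch the joint shift gives the same decay rate in both directions, whereas the hypothetical separate shifts let the left-reference weight reach further across the wide dimension and make the top-reference weight fall off faster along the short one, which is the kind of independence the abstract describes.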
