• Title/Summary/Keyword: Video translation

Search Result 38

A Study on Multi-stage Management and Spatio-Temporal Search of Video Features for a Surveillance System (감시 시스템을 위한 동영상 데이터의 다단계 관리 및 시공간 검색 기법 연구)

  • 이희정;이원석
    • Proceedings of the Korean Information Science Society Conference / 1999.10a / pp.12-14 / 1999
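  • With the marked growth of multimedia and Internet services, the use of video data has surged across a wide range of applications, making content-based retrieval techniques essential for finding the video data a user wants quickly and accurately. This paper differs clearly from existing content-based retrieval research in that it automatically generalizes low-level features, which belong to the intrinsic content attributes of video, alongside high-level features, manages them in multiple stages, and supports weighted queries over features. In addition, by enabling automatic translation between low-level and high-level features, it raises the efficiency of user access to video databases and supports more semantically structured video management and content-based retrieval.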


Enhanced Stereo Matching Algorithm based on 3-Dimensional Convolutional Neural Network (3차원 합성곱 신경망 기반 향상된 스테레오 매칭 알고리즘)

  • Wang, Jian;Noh, Jackyou
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.5 / pp.179-186 / 2021
  • For stereo matching based on deep learning, the design of the network structure is crucial to the calculation of the matching cost, and the long running time of convolutional neural networks in image processing also urgently needs to be addressed. In this paper, a stereo matching method that uses a sparse loss volume in the parallax dimension is proposed. A sparse 3D loss volume is constructed by translating the right-view feature map with a wide step length, which reduces the video memory and computing resources required by the 3D convolution module severalfold. To improve the accuracy of the algorithm, nonlinear up-sampling of the matching loss in the parallax dimension is carried out using multi-category output, and the model is trained with a combination of two loss functions. Compared with the benchmark algorithm, the proposed algorithm not only improves the accuracy but also shortens the running time by about 30%.
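
The idea of a sparse volume built by wide-step translation of the right view can be illustrated with a minimal sketch. It is not the authors' network: it works on a single 1-D row of values instead of learned feature maps, uses absolute difference as a stand-in matching cost, and the names `shift_right`, `sparse_cost_volume`, `max_disp`, and `stride` are hypothetical.

```python
from typing import List, Tuple


def shift_right(row: List[float], d: int) -> List[float]:
    """Translate a 1-D row d pixels to the right, zero-padding the left edge."""
    n = len(row)
    k = min(d, n)
    return [0.0] * k + row[: n - k]


def sparse_cost_volume(
    left: List[float], right: List[float], max_disp: int, stride: int
) -> Tuple[List[int], List[List[float]]]:
    """Build one cost slice per sampled disparity 0, stride, 2*stride, ...

    Assumption: absolute difference stands in for the learned matching cost.
    A dense volume would hold max_disp slices; sampling every `stride`
    disparities keeps roughly max_disp / stride of them.
    """
    disparities = list(range(0, max_disp, stride))
    volume = []
    for d in disparities:
        # Left pixel x is compared with right pixel x - d.
        shifted = shift_right(right, d)
        volume.append([abs(l - r) for l, r in zip(left, shifted)])
    return disparities, volume


# Toy rows: the left row is the right row displaced by a disparity of 2.
right_row = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
left_row = shift_right(right_row, 2)
disps, vol = sparse_cost_volume(left_row, right_row, max_disp=8, stride=2)
best = disps[min(range(len(vol)), key=lambda i: sum(vol[i]))]
print(disps, best)
```

With a stride of 2 the volume holds four slices instead of eight, which is the kind of memory saving the abstract describes. Recovering disparities that fall between the sampled ones is what the up-sampling step in the abstract addresses and is not shown here.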

A Study on Artificial Intelligence Based Business Models of Media Firms

  • Song, Minzheong
    • International journal of advanced smart convergence / v.8 no.2 / pp.56-67 / 2019
  • The aim of this study is to develop Artificial Intelligence (AI) based business models of media firms. We define AI and discuss the 'AI activity model'. The practices of the efficiency model are home equipment-based personalization and media content recommendation. The practices of the expert model are media content commissioning, content rights negotiation, copyright infringement, and promotion. The practices of the effectiveness model are photo & video auto-tagging and auto subtitling & simultaneous translation. The practices of the innovation model are content script creation and metadata management. Related use cases from 2012 to 2017 are introduced along the four activity models of AI. In conclusion, we propose that media companies fully utilize AI to transform from traditional into successful digital media firms.

Camera Motion Estimation using Geometrically Symmetric Points in Subsequent Video Frames (인접 영상 프레임에서 기하학적 대칭점을 이용한 카메라 움직임 추정)

  • Jeon, Dae-Seong;Mun, Seong-Heon;Park, Jun-Ho;Yun, Yeong-U
    • Journal of the Institute of Electronics Engineers of Korea CI / v.39 no.2 / pp.35-44 / 2002
  • Translation and rotation of the camera cause global motion that affects the entire frame of a video sequence. For video sequences containing global motion, it is practically impossible to extract exact video objects and to calculate genuine object motions. Therefore, a high compression ratio cannot be achieved because of the large motion vectors. This problem can be solved by using global motion compensated frames. The existing camera motion estimation methods for global motion compensation share a high computational cost. In this paper, we propose a simple global motion estimation algorithm that consists of linear equations without any iteration. The algorithm uses information about symmetric points in the frames of the video sequence. Discriminant conditions that distinguish regions belonging to the distant view from the foreground in a frame are presented. The linear equations for the panning, tilting, and zooming parameters are applied only to the distant view that satisfies the discriminant conditions. Experimental results on the MPEG test sequences confirm that the proposed algorithm estimates correct global motion parameters. Moreover, the real-time capability of the proposed technique makes it applicable to many MPEG-4 and MPEG-7 related areas.
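
The abstract does not give the linear equations themselves, so the following is only a hedged illustration of how symmetric points can yield pan, tilt, and zoom in closed form. It assumes a simple motion model, x' = zoom * x + pan and y' = zoom * y + tilt, with coordinates measured from the image center; the function name `estimate_pan_tilt_zoom` and the model are assumptions, not the authors' formulation, and the discriminant conditions for the distant view are omitted.

```python
from typing import Tuple

Point = Tuple[float, float]


def estimate_pan_tilt_zoom(
    point: Point, moved_a: Point, moved_b: Point
) -> Tuple[float, float, float]:
    """Estimate (pan, tilt, zoom) from one pair of centrally symmetric points.

    point   : (x, y) in the previous frame, relative to the image center
    moved_a : position of (x, y) in the next frame
    moved_b : position of the symmetric point (-x, -y) in the next frame

    Under the assumed model, adding the two moved positions cancels the zoom
    term, and subtracting them cancels the pan and tilt terms.
    """
    x, y = point
    (xa, ya), (xb, yb) = moved_a, moved_b
    if x == 0 and y == 0:
        raise ValueError("the image center has no distinct symmetric partner")
    pan = (xa + xb) / 2.0
    tilt = (ya + yb) / 2.0
    # Use the coordinate with the larger magnitude to avoid dividing by ~0.
    if abs(x) >= abs(y):
        zoom = (xa - xb) / (2.0 * x)
    else:
        zoom = (ya - yb) / (2.0 * y)
    return pan, tilt, zoom


# Synthetic check: zoom 1.5, pan 3, tilt -2 applied to (10, 4) and (-10, -4).
print(estimate_pan_tilt_zoom((10.0, 4.0), (18.0, 4.0), (-12.0, -8.0)))
```

Each parameter comes from a single linear expression, so no iterative search is needed, which matches the non-iterative character the abstract claims. A practical estimator would presumably combine many symmetric pairs and reject those in the foreground.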

Considerations for Applying Korean Natural Language Processing Technology in Records Management (기록관리 분야에서 한국어 자연어 처리 기술을 적용하기 위한 고려사항)

  • Haklae, Kim
    • Journal of Korean Society of Archives and Records Management / v.22 no.4 / pp.129-149 / 2022
  • Records have temporal characteristics, including the past and present; linguistic characteristics not limited to a specific language; and various types categorized in a complex way. Processing records such as text, video, and audio across the life cycle of record creation, preservation, and utilization entails exhaustive effort and cost. Primary natural language processing (NLP) technologies, such as machine translation, document summarization, named-entity recognition, and image recognition, can be widely applied to electronic records and to the digitization of analog records. In particular, Korean deep learning-based NLP technologies effectively recognize various record types and generate record management metadata. This paper provides an overview of Korean NLP technologies and discusses considerations for applying NLP technology in records management. The process of using NLP technologies, such as machine translation and optical character recognition, for the digital conversion of records is introduced through an example implemented in the Python environment. In addition, a plan to improve environmental factors and record digitization guidelines is proposed to support the use of NLP technology in the records management field.

Reliable Camera Pose Estimation from a Single Frame with Applications for Virtual Object Insertion (가상 객체 합성을 위한 단일 프레임에서의 안정된 카메라 자세 추정)

  • Park, Jong-Seung;Lee, Bum-Jong
    • The KIPS Transactions:PartB / v.13B no.5 s.108 / pp.499-506 / 2006
  • This paper describes a fast and stable camera pose estimation method for real-time augmented reality systems. From the feature tracking results of a marker in a single frame, we estimate the camera rotation matrix and the translation vector. For the camera pose estimation, we use the shape factorization method based on the scaled orthographic projection model. In the scaled orthographic factorization method, all feature points of an object are assumed to be at roughly the same distance from the camera, which means that the selected reference point and the object shape affect the accuracy of the estimation. This paper proposes a flexible and stable selection method for the reference point. Based on the proposed method, we implemented a video augmentation system that inserts virtual 3D objects into the input video frames. Experimental results showed that the proposed camera pose estimation method is fast and robust relative to previous methods and that it is applicable to various augmented reality applications.

Study on the meaning and delivery of caption recording in mass media - On the function of caption recording TV mass media and video art - (미디어에 있어서의 자막기록의 의미와 전달성 - 공중파방송과 비디오 아트에서의 자막기록을 중심으로 -)

  • Rhee, Ji-Young
    • Journal of Korean Society of Archives and Records Management / v.3 no.2 / pp.78-96 / 2003
  • Nowadays, mass media keeps innovating and has great power to transform our lives. Marshall McLuhan says that new media is a new form of language and that it can connect to the real world. Lettering makes a big difference in the media world. Since the end of the silent era, captions have not only delivered the meaning of the content but also contributed to the composition of the screen itself. These compositional elements carry aesthetic, entertainment, and revival aspects. Captions, once used mainly for translation, are changing into a new means of exploration. To deliver the meaning of content, on-screen lettering has undergone very large changes and taken on new significance. Lettering design is a new aesthetic method in the media world, and the elements of lettering are becoming part of a new way of life. Therefore, this study examines the aspects of lettering in mass media.

An integrated visual-inertial technique for structural displacement and velocity measurement

  • Chang, C.C.;Xiao, X.H.
    • Smart Structures and Systems / v.6 no.9 / pp.1025-1039 / 2010
  • Measuring the displacement response of civil structures is very important for assessing their performance, safety, and integrity. Recently, video-based techniques that utilize low-cost, high-resolution digital cameras have been developed for this application. These techniques, however, have relatively low sampling frequencies, and their results are usually contaminated with noise. In this study, an integrated visual-inertial measurement method that combines a monocular videogrammetric displacement measurement technique with a collocated accelerometer is proposed for displacement and velocity measurement of civil engineering structures. The monocular videogrammetric technique extracts the three-dimensional translation and rotation of a planar target from an image sequence recorded by one camera. The obtained displacement is then fused with the acceleration measured by the collocated accelerometer using a multi-rate Kalman filter with a smoothing technique. This data fusion not only improves the accuracy and the frequency bandwidth of the displacement measurement but also provides an estimate of velocity. The proposed measurement technique is illustrated by a shake table test and a pedestrian bridge test. Results show that the fusion of displacement and acceleration can mitigate their respective limitations and produce more accurate displacement and velocity responses with a broader frequency bandwidth.
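
The fusion step can be sketched as a one-axis multi-rate Kalman filter in which acceleration drives the prediction at every sample and displacement corrects the state only when a camera measurement is available. This is a minimal sketch under stated assumptions, not the paper's implementation: the smoothing pass is omitted, the state is limited to displacement and velocity, and the function name `fuse` and the noise parameters `q` and `r` are hypothetical.

```python
from typing import List, Optional, Tuple


def fuse(
    accels: List[float],
    disps: List[Optional[float]],
    dt: float,
    q: float = 1e-2,
    r: float = 1e-4,
) -> List[Tuple[float, float]]:
    """Fuse high-rate acceleration with low-rate displacement.

    accels[i] drives the state over the i-th interval of length dt.
    disps[i] is the displacement measured at the end of that interval,
    or None when the slower camera delivered no sample.
    q is the assumed acceleration noise variance, r the assumed
    displacement noise variance. Returns (displacement, velocity) per step.
    """
    d, v = 0.0, 0.0                    # state estimate
    p00, p01, p11 = 1.0, 0.0, 1.0      # symmetric 2x2 covariance
    out = []
    for a, z in zip(accels, disps):
        # Prediction at the accelerometer rate (constant acceleration over dt).
        d, v = d + v * dt + 0.5 * a * dt * dt, v + a * dt
        p00, p01, p11 = (
            p00 + 2 * dt * p01 + dt * dt * p11 + q * dt ** 4 / 4,
            p01 + dt * p11 + q * dt ** 3 / 2,
            p11 + q * dt * dt,
        )
        # Correction only when a displacement sample exists (camera rate).
        if z is not None:
            s = p00 + r
            k0, k1 = p00 / s, p01 / s
            y = z - d
            d, v = d + k0 * y, v + k1 * y
            p00, p01, p11 = (1 - k0) * p00, (1 - k0) * p01, p11 - k1 * p01
        out.append((d, v))
    return out


# Toy run: constant 2 m/s^2 acceleration at 100 Hz, displacement at 10 Hz.
dt = 0.01
acc = [2.0] * 100
cam = [((i + 1) * dt) ** 2 if (i + 1) % 10 == 0 else None for i in range(100)]
final_d, final_v = fuse(acc, cam, dt)[-1]
print(final_d, final_v)
```

Velocity is never measured directly in this sketch; it is obtained from the filter state, which is how fusion can provide the velocity estimate mentioned in the abstract.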

Changes of the Kinetic Energy of Putter Head and Ball Movements during the Process of Impact (퍼팅 스트로크의 충돌과정에서 나타난 퍼터헤드와 볼의 운동에너지 변화 분석)

  • Park, Jin
    • Korean Journal of Applied Biomechanics / v.13 no.2 / pp.175-183 / 2003
  • The purpose of this study was to analyze the kinetic energy of putter head and ball movements during the process of impact. Five highly skilled golfers (handicap below 1) participated in this study, and the target distance was 3 m. Movements of the ball and putter head were recorded with two VHS video cameras (60 Hz, 1/500 s shutter speed). A small control object ($18.5 \times 18.5 \times 78.5$ cm) was used in this study. To analyze the process of impact, the putter was digitized for 0.0835 s before and 0.0835 s after impact, and the ball for 0.1336 s after impact. The results showed that the maximum speed occurred at impact and was maintained for a short time. The contact point on the club head was within 0.7 cm of the z axis. After contact with the club head, the ball traveled above ground level (slide) and then returned to the ground, sliding and rolling. After the ball contacted the ground, its speed depended on the ground surface. During impact, 70% of the club head's kinetic energy was transferred to the ball.

Meta's Metaverse Platform Design in the Pre-launch and Ignition Life Stage

  • Song, Minzheong
    • International Journal of Internet, Broadcasting and Communication / v.14 no.4 / pp.121-131 / 2022
  • We look at the initial stage of the new metaverse platform of Meta (previously Facebook) and investigate its platform design in the pre-launch and ignition life stages. Following the theoretical logic of the Rocket Model (RM), the results reveal that Meta first focuses on investing in key content developers by acquiring virtual reality (VR), video, and music content firms and by offering 'Spark AR', a production support platform for augmented reality (AR) content, over the last three years (2019-2021) to attract high-potential developers and users. In terms of the three matching criteria, Meta develops Artificial Intelligence (AI) powered translation software, partners with Microsoft (MS) for cloud computing and AI, and develops MyoSuite, an AI platform for realistic avatars. In the 'connect' function, Meta curates the game concepts submitted by game developers, welcomes other game and SNS based metaverse apps, and expands Horizon Worlds (HW) from VR devices to PCs and mobile devices. In the 'transact' function, Meta offers the 'HW Creator Funding' program for the metaverse and launches the first commercialized Meta Avatar Store on its conventional SNS and messaging apps, inviting all fashion creators to design and sell clothing in the store. Meta also launches an initial test of non-fungible token (NFT) display on Instagram and expands it to Facebook in the US. Lastly, regarding optimization, especially in the face of recent data privacy issues that have adversely affected corporate key performance indicators (KPIs), Meta pledges not to collect any new data, to make its privacy policy easier to understand, and to make its terms of service more user friendly.