• Title/Summary/Keyword: Frame Extraction

Automatic melody extraction algorithm using a convolutional neural network

  • Lee, Jongseol;Jang, Dalwon;Yoon, Kyoungro
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 11, No. 12 / pp. 6038-6053 / 2017
  • In this study, we propose an automatic melody extraction algorithm using deep learning. In this algorithm, feature images generated from the energy of frequency bands are extracted from polyphonic audio files, and a deep learning technique, a convolutional neural network (CNN), is applied to the feature images. In the training data, each short frame of polyphonic music is labeled with a musical note, and a CNN-based classifier is trained to determine the pitch value of a short frame of the audio signal. We aim to build a novel melody-extraction structure: the proposed algorithm is simple and, instead of combining various signal processing techniques, uses only a CNN to find the melody in polyphonic audio. Despite its simple structure, promising results were obtained in the experiments. Compared with state-of-the-art algorithms, the proposed algorithm did not give the best result, but comparable results were obtained, and we believe they could be improved with appropriate training data. In this paper, melody extraction and the proposed algorithm are introduced first, and the proposed algorithm is then explained in detail. Finally, we present our experiments, followed by a comparison of results.
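
The abstract does not specify the network architecture. As a rough illustration only, below is a minimal PyTorch sketch of a frame-level pitch classifier of the kind described: a CNN mapping a frequency-band-energy feature image for one short frame to a pitch class. The layer sizes, feature-image shape, and number of pitch classes are all assumptions, not the authors' configuration.

```python
# Minimal sketch (PyTorch) of a frame-level pitch classifier: a CNN that maps
# a frequency-band energy "feature image" for one short audio frame to a pitch
# class. All shapes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

N_PITCH_CLASSES = 61           # assumed: e.g. semitone range + "no melody"
FEATURE_H, FEATURE_W = 64, 16  # assumed feature-image size (bands x context)

class FramePitchCNN(nn.Module):
    def __init__(self, n_classes=N_PITCH_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(
            32 * (FEATURE_H // 4) * (FEATURE_W // 4), n_classes)

    def forward(self, x):  # x: (batch, 1, FEATURE_H, FEATURE_W)
        z = self.features(x)
        return self.classifier(z.flatten(1))  # per-frame pitch logits

# One frame's feature image -> pitch class (dummy input for shape checking).
logits = FramePitchCNN()(torch.randn(1, 1, FEATURE_H, FEATURE_W))
pitch = logits.argmax(dim=1)
```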

Design and Evaluation of the Key-Frame Extraction Algorithm for Constructing the Virtual Storyboard Surrogates (영상 초록 구현을 위한 키프레임 추출 알고리즘의 설계와 성능 평가)

  • 김현희
    • 정보관리학회지 / Vol. 25, No. 4 / pp. 131-148 / 2008
  • This study designed and evaluated an algorithm for extracting key frames that effectively represent the meaning of a video. Specifically, to establish a theoretical framework for selecting the key frames of a video abstract, we surveyed prior research and analyzed users' key-frame recognition patterns. Based on this framework, we then designed a hybrid algorithm for extracting key frames from video and evaluated its effectiveness through experiments. Finally, we suggested ways to apply these experimental results to video retrieval and browsing in digital libraries and on the Internet.

An Efficient Implementation of Key Frame Extraction and Sharing in Android for Wireless Video Sensor Network

  • Kim, Kang-Wook
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 9 / pp. 3357-3376 / 2015
  • Wireless sensor networks are an important research topic that has attracted considerable attention in recent years. However, most of this interest has focused on networks that gather scalar data such as temperature, humidity, and vibration. Scalar data are insufficient for diverse applications such as video surveillance, target recognition, and traffic monitoring. If camera sensors are instead used to collect video data, which are rich in information, the network can provide important visual information. Video sensor networks have continued to gain interest in the past few years due to their ability to collect video information for a wide range of applications. However, how to efficiently store the massive data that reflect the environmental state at different times, and how to quickly search them for information of interest, are challenging issues in current research, especially when the sensor network environment is complicated. Therefore, in this paper, we propose a fast algorithm for extracting key frames from video and describe the design and implementation of key-frame extraction and sharing in Android for a wireless video sensor network.
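
The paper's own fast algorithm is not detailed in the abstract. As a generic illustration of frame-difference key-frame extraction, here is a minimal OpenCV sketch; the mean-absolute-difference criterion and the threshold value are assumptions.

```python
# Minimal sketch of frame-difference key-frame extraction: keep a frame
# whenever it differs enough from the last kept frame. The criterion and
# threshold are illustrative assumptions, not the paper's algorithm.
import cv2
import numpy as np

def extract_key_frames(path, diff_threshold=30.0):
    cap = cv2.VideoCapture(path)
    key_frames, last_kept = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if last_kept is None or \
           np.mean(cv2.absdiff(gray, last_kept)) > diff_threshold:
            key_frames.append(frame)  # candidate key frame
            last_kept = gray
    cap.release()
    return key_frames
```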

Frame Arguments Role Labeling for Event Extraction in Dialogue (대화문에서의 이벤트 추출을 위한 프레임 논항 역할 분류기)

  • 허철훈;노영빈;함영균;최기선
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회 2020년도 제32회 한글 및 한국어 정보처리 학술대회 / pp. 119-123 / 2020
  • Event extraction is the analysis of structured events in text. To handle the diverse kinds of events that occur in dialogue, this paper adopts FrameNet as the event schema. In dialogue, an event's arguments can appear not only in the sentence where the event occurs but also in other sentences or with other speakers participating in the conversation. Due to the absence of annotated dialogue data, frame parsing on dialogue has not previously been studied. The model proposed in this paper identifies the role of an event argument span in dialogue, given that span. The model learns the relations among the event-evoking lexical unit, the argument span, and the argument role. To overcome the scarcity of annotated dialogue data, transfer learning is performed using Korean FrameNet, a written-language annotated dataset. With this approach, the model achieves an accuracy of 51.21%.
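
As a loose illustration of the task setup (given a trigger and an argument span, predict the role), here is a minimal sketch using a pretrained Korean encoder; the marker scheme, the `klue/bert-base` checkpoint, the toy role set, and the untrained classification head are all assumptions, not the paper's model.

```python
# Minimal sketch of frame-argument role classification: mark the trigger and
# the argument span in the sentence, encode, and classify the pooled vector.
# Checkpoint, markers, and role set are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

ROLE_LABELS = ["Agent", "Theme", "Place", "Time"]  # assumed toy role set
enc_name = "klue/bert-base"                        # assumed Korean encoder
tokenizer = AutoTokenizer.from_pretrained(enc_name)
encoder = AutoModel.from_pretrained(enc_name)
role_head = nn.Linear(encoder.config.hidden_size, len(ROLE_LABELS))

def classify_role(sentence, trigger, argument):
    # Mark the trigger and argument so the encoder sees which span is which.
    text = sentence.replace(trigger, f"<t> {trigger} </t>") \
                   .replace(argument, f"<a> {argument} </a>")
    batch = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        pooled = encoder(**batch).last_hidden_state[:, 0]  # [CLS] vector
    return ROLE_LABELS[role_head(pooled).argmax(dim=1).item()]
```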

Feature Extraction Based on Speech Attractors in the Reconstructed Phase Space for Automatic Speech Recognition Systems

  • Shekofteh, Yasser;Almasganj, Farshad
    • ETRI Journal / Vol. 35, No. 1 / pp. 100-108 / 2013
  • In this paper, a feature extraction (FE) method is proposed that is comparable to the traditional FE methods used in automatic speech recognition systems. Unlike conventional spectral-based FE methods, the proposed method evaluates the similarities between an embedded speech signal and a set of predefined speech attractor models in the reconstructed phase space (RPS) domain. In the first step, a set of Gaussian mixture models is trained to represent the speech attractors in the RPS. Next, for a new input speech frame, a posterior-probability-based feature vector is evaluated, which represents the similarity between the embedded frame and the learned speech attractors. We conduct experiments for a speech recognition task using a toolkit based on hidden Markov models, over FARSDAT, a well-known Persian speech corpus. With the proposed FE method, we gain a 3.11% absolute phoneme error rate improvement in comparison to the baseline system, which uses the mel-frequency cepstral coefficient FE method.
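
A minimal sketch of the two steps the abstract names, delay-embedding a frame into the RPS and scoring it against attractor GMMs, might look as follows; the embedding delay and dimension, the number of attractor models, and the randomly "trained" GMMs are placeholders, not the paper's settings.

```python
# Minimal sketch: delay-embed a speech frame in a reconstructed phase space
# (RPS), then build a posterior-like feature by scoring the embedded points
# against pre-trained attractor GMMs. Delay, dimension, and model count are
# illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

TAU, DIM = 7, 3  # assumed embedding delay and dimension

def rps_embed(frame, tau=TAU, dim=DIM):
    """Delay-embed a 1-D frame: rows are [x(n), x(n+tau), ..., x(n+(dim-1)tau)]."""
    n = len(frame) - (dim - 1) * tau
    return np.stack([frame[i * tau : i * tau + n] for i in range(dim)], axis=1)

# Stand-ins for GMMs trained on RPS points of known speech attractors.
attractor_gmms = [GaussianMixture(n_components=4).fit(np.random.randn(500, DIM))
                  for _ in range(8)]

def posterior_feature(frame):
    """Softmax over per-model average log-likelihoods: a similarity vector."""
    pts = rps_embed(frame)
    scores = np.array([g.score(pts) for g in attractor_gmms])
    e = np.exp(scores - scores.max())
    return e / e.sum()

feat = posterior_feature(np.random.randn(400))  # dummy 400-sample frame
```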

A Study on Searching, Extraction, and Approximation-Synthesis of Transition Segments in Continuous Speech (연속음성에서 천이구간의 탐색, 추출, 근사합성에 관한 연구)

  • 이시우
    • 한국정보처리학회논문지 / Vol. 7, No. 4 / pp. 1299-1304 / 2000
  • In a speech coding system using separate voiced and unvoiced excitation sources, speech quality is degraded when voiced and unvoiced consonants coexist within a single frame. I therefore propose a method for searching, extracting, and approximation-synthesizing the TSIUVC (Transition Segment Including UnVoiced Consonant), so that voiced and unvoiced consonants do not coexist in a frame. The method is based on the zero-crossing rate and a pitch detector using a FIR-STREAK digital filter. As a result, the TSIUVC extraction rates are 84.8% (plosives), 94.9% (fricatives), and 92.3% (affricates) for female voices, and 88% (plosives), 94.9% (fricatives), and 92.3% (affricates) for male voices. High-quality approximation-synthesis waveforms within the TSIUVC are also obtained by using frequency information below 0.547 kHz and above 2.813 kHz. This method can be applied to low-bit-rate speech coding, speech analysis, and speech synthesis.
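
The FIR-STREAK pitch detector cannot be reconstructed from the abstract, but the zero-crossing-rate cue the method builds on is standard. A minimal sketch, with illustrative frame length and thresholds:

```python
# Minimal sketch of the voiced/unvoiced cues: per-frame zero-crossing rate
# plus short-time energy (to exclude silence) flag frames likely to contain
# unvoiced consonants. Thresholds are illustrative assumptions.
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    return np.mean(np.signbit(frame[:-1]) != np.signbit(frame[1:]))

def unvoiced_flags(signal, frame_len=160, zcr_thresh=0.25, energy_thresh=1e-4):
    """True for non-silent frames whose high ZCR suggests unvoiced speech."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start : start + frame_len]
        zcr = zero_crossing_rate(frame)
        energy = np.mean(frame ** 2)
        flags.append(zcr > zcr_thresh and energy > energy_thresh)
    return np.array(flags)

# Transition segments can then be searched for where the voiced/unvoiced
# decision flips from one frame to the next.
```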

Adaptive Background Generation for Vehicle Tracking System (차량 추적 시스템을 위한 적응적 배경 영상 생성)

  • 장승호;정정훈;신정호;박주용;백준기
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2003년도 신호처리소사이어티 추계학술대회 논문집 / pp. 413-416 / 2003
  • This paper proposes an adaptive background image generation method based on frame differencing for traffic monitoring. The performance of conventional methods degrades when many vehicles are present due to traffic jams. To improve on this, we use frame differencing to separate vehicles from the background, and we adopt a selective approach that uses only the parts of the image not considered to be vehicles for background extraction. The proposed method generates the background more efficiently than conventional methods, even in the presence of heavy traffic.
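
A minimal sketch of the selective-update idea: blend a new frame into the background only where the frame difference shows no motion. The motion threshold and learning rate are assumptions.

```python
# Minimal sketch of selective background updating: pixels the frame
# difference marks as moving (vehicles) are excluded from the running
# background average. Threshold and learning rate are assumptions.
import cv2
import numpy as np

def update_background(background, frame, prev_frame,
                      motion_thresh=25, alpha=0.05):
    """Blend the new frame into the background only where no motion is seen."""
    diff = cv2.absdiff(frame, prev_frame)
    still = diff < motion_thresh                 # likely non-vehicle pixels
    blended = (1 - alpha) * background + alpha * frame
    return np.where(still, blended, background)  # keep old value elsewhere

# Usage: feed grayscale float32 frames; initialize background with the first
# frame and call once per new frame.
```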

The Scene Analysis and Keyframe Extraction for Content-Based Indexing on Compressed Image Sequence (압축된 영상 시퀀스에서 내용 기반 색인을 위한 장면 분석 및 키 프레임 추출)

  • 오상헌;김상렬;김주도;이근영
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 1999년도 추계종합학술대회 논문집 / pp. 605-608 / 1999
  • In this paper, we propose several scene analysis algorithms. These algorithms, based on image differences and histograms, operate on the sequence of DC coefficients extracted from Motion JPEG or MPEG without full-frame decompression; the DC sequence retains most of the information of the full frame while greatly reducing the amount of data. Experimental results show less than 1/64 of the full-frame analysis complexity, with scene changes detected accurately and key frames extracted correctly.
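
A minimal sketch of the idea: each DC coefficient is the average of an 8x8 block, so the DC image is a 1/64-size proxy for the full frame, and scene changes can be detected from histogram differences on it. Actual DC extraction from the compressed bitstream is not shown; DC images are simulated by block averaging, and the threshold is an assumption.

```python
# Minimal sketch: scene-change detection on DC images. DC images are
# simulated here by 8x8 block averaging of decoded frames; the histogram
# difference threshold is an illustrative assumption.
import numpy as np

def dc_image(frame):
    """Approximate the DC image: mean of each 8x8 block of a grayscale frame."""
    h, w = frame.shape[0] // 8 * 8, frame.shape[1] // 8 * 8
    blocks = frame[:h, :w].reshape(h // 8, 8, w // 8, 8)
    return blocks.mean(axis=(1, 3))

def is_scene_change(dc_prev, dc_cur, thresh=0.3):
    """Compare normalized grey-level histograms of consecutive DC images."""
    h1, _ = np.histogram(dc_prev, bins=64, range=(0, 256))
    h2, _ = np.histogram(dc_cur, bins=64, range=(0, 256))
    h1 = h1 / max(h1.sum(), 1)
    h2 = h2 / max(h2.sum(), 1)
    return np.abs(h1 - h2).sum() > thresh

# On a detected cut, the first frame of the new shot can serve as a key frame.
```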

Fast key-frame extraction for 3D reconstruction from a handheld video

  • Choi, Jongho;Kwon, Soonchul;Son, Kwangchul;Yoo, Jisang
    • International journal of advanced smart convergence / Vol. 5, No. 4 / pp. 1-9 / 2016
  • To reconstruct a 3D model from a video sequence, it is essential to select key frames from which a geometric model can be reliably estimated. This paper proposes a method to easily extract informative frames from a handheld video. The method combines selection criteria based on determining an appropriate baseline between frames, frame jumping for fast searching in the video, geometric robust information criterion (GRIC) scores for the frame-to-frame homography and fundamental matrix, and blurry-frame removal. In experiments with videos taken in indoor spaces, the proposed method creates a more robust 3D point cloud than existing methods, even in the presence of motion blur and degenerate motions.
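
Of the listed criteria, blurry-frame removal is the easiest to illustrate. A common sharpness proxy is the variance of the Laplacian; the threshold below is an assumption, and the GRIC-based model selection step is not reproduced.

```python
# Minimal sketch of blurry-frame removal: frames whose Laplacian variance
# (a standard sharpness proxy) falls below a threshold are rejected before
# key-frame selection. The threshold is an illustrative assumption.
import cv2

def is_sharp(frame_bgr, thresh=100.0):
    """Reject motion-blurred frames before key-frame selection."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() > thresh
```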