• Title/Summary/Keyword: Frame Extraction

Search Results: 320

Automatic melody extraction algorithm using a convolutional neural network

  • Lee, Jongseol; Jang, Dalwon; Yoon, Kyoungro
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.6038-6053 / 2017
  • In this study, we propose an automatic melody extraction algorithm using deep learning. Feature images, generated from the energy of frequency bands, are extracted from polyphonic audio files, and a convolutional neural network (CNN) is applied to these feature images. In the training data, each short frame of polyphonic music is labeled with a musical note, and a CNN-based classifier is trained to determine the pitch value of a short frame of the audio signal. Because we aim to build a novel structure for melody extraction, the proposed algorithm has a simple design: instead of combining various signal processing techniques, it uses only a CNN to find the melody in polyphonic audio. Despite this simple structure, promising results are obtained in the experiments. Compared with state-of-the-art algorithms, the proposed algorithm does not give the best result, but comparable results are obtained, and we believe they could be improved with appropriate training data. In this paper, melody extraction and the proposed algorithm are introduced first, the algorithm is then explained in detail, and finally we present our experiments and a comparison of results.
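
A minimal sketch of the idea described in this abstract, not the authors' implementation: a small CNN (built here with PyTorch, an assumption) that classifies a short-frame feature image into one of several pitch classes. The feature-image shape, layer sizes, and number of note classes are illustrative assumptions.

```python
# Hypothetical sketch: a small CNN that maps a frequency-band-energy
# "feature image" of a short audio frame to one of N_PITCH note classes.
# Layer sizes and the feature-image shape are illustrative assumptions,
# not the configuration used in the paper.
import torch
import torch.nn as nn

N_PITCH = 61             # assumed number of note classes (incl. "no melody")
FEAT_H, FEAT_W = 64, 16  # assumed feature-image size: bands x time steps

class PitchCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * (FEAT_H // 4) * (FEAT_W // 4), N_PITCH),
        )

    def forward(self, x):          # x: (batch, 1, FEAT_H, FEAT_W)
        return self.net(x)         # unnormalized class scores per frame

if __name__ == "__main__":
    model = PitchCNN()
    frames = torch.randn(8, 1, FEAT_H, FEAT_W)   # stand-in feature images
    logits = model(frames)
    print(logits.shape)            # torch.Size([8, 61])
```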

Design and Evaluation of the Key-Frame Extraction Algorithm for Constructing the Virtual Storyboard Surrogates (영상 초록 구현을 위한 키프레임 추출 알고리즘의 설계와 성능 평가)

  • Kim, Hyun-Hee
    • Journal of the Korean Society for Information Management / v.25 no.4 / pp.131-148 / 2008
  • The purposes of this study are to design a key-frame extraction algorithm for constructing virtual storyboard surrogates and to evaluate the efficiency of the proposed algorithm. To do this, a theoretical framework was first built through two tasks: a review of previous studies on relevance and on image recognition and classification, and an experiment to identify the frame recognition patterns of 20 participants. Based on these results, the key-frame extraction algorithm was constructed. The efficiency of the proposed algorithm (a hybrid method) was then evaluated in an experiment with 42 participants, in which it was compared, in terms of accuracy in summarizing or indexing a video, to a random method that extracts key frames simply at intervals of a few seconds or minutes. Finally, ways to utilize the proposed algorithm in digital libraries and Internet environments are suggested.

An Efficient Implementation of Key Frame Extraction and Sharing in Android for Wireless Video Sensor Network

  • Kim, Kang-Wook
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.9 / pp.3357-3376 / 2015
  • Wireless sensor networks are an important research topic that has attracted a lot of attention in recent years. However, most of this interest has focused on networks that gather scalar data such as temperature, humidity, and vibration, and scalar data are insufficient for applications such as video surveillance, target recognition, and traffic monitoring. If camera sensors are used in a wireless sensor network to collect video data, which are rich in information, they can provide important visual information, and video sensor networks have gained increasing interest in the past few years for a wide range of applications. However, how to efficiently store the massive data that reflect the environmental state at different times, and how to quickly search them for information of interest, are challenging issues in current research, especially when the sensor network environment is complicated. Therefore, in this paper, we propose a fast algorithm for extracting key frames from video and describe the design and implementation of key frame extraction and sharing in Android for a wireless video sensor network.

Frame Arguments Role Labeling for Event extraction in Dialogue (대화문에서의 이벤트 추출을 위한 프레임 논항 역할 분류기)

  • Heo, Cheolhun; Noh, Youngbin; Hahm, Younggyun; Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 2020.10a / pp.119-123 / 2020
  • Event extraction is the analysis of structured events in text. In this paper, we adopt FrameNet as the event schema in order to handle the various kinds of events that occur in dialogue. In dialogue, an event argument can appear not only in the sentence in which the event occurs but also in other sentences or with other speakers participating in the conversation. Because of the absence of annotated dialogue data, frame parsing for dialogue has not been studied. The model proposed in this paper identifies the role of each argument span when the event argument spans in a dialogue are given. The model learns the relations among the lexical unit that triggers the event, the argument span, and the argument role. To overcome the shortage of annotated dialogue data, we perform transfer learning using the Korean FrameNet, a written-language annotated corpus. With this approach, we achieve an accuracy of 51.21%.
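
A minimal sketch of the classification step described in this abstract, under stated assumptions: this is not the authors' model, and the toy encoder, vocabulary and layer sizes, and role inventory are placeholders. It only illustrates scoring a (trigger, argument-span) pair against a set of frame-element roles.

```python
# Hypothetical sketch of frame-argument role classification: pool the
# encoder states of the trigger word and of the argument span, then
# classify their concatenation into one of N_ROLES frame-element roles.
# The encoder here is a toy embedding + BiLSTM; the paper's model and
# its transfer learning from Korean FrameNet are not reproduced.
import torch
import torch.nn as nn

VOCAB, DIM, N_ROLES = 10000, 128, 20   # illustrative sizes

class RoleClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.enc = nn.LSTM(DIM, DIM // 2, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * DIM, N_ROLES)

    def forward(self, tokens, trig_span, arg_span):
        h, _ = self.enc(self.emb(tokens))                 # (1, seq, DIM)
        trig = h[0, trig_span[0]:trig_span[1]].mean(0)    # pooled trigger states
        arg = h[0, arg_span[0]:arg_span[1]].mean(0)       # pooled argument states
        return self.out(torch.cat([trig, arg]))           # role scores

if __name__ == "__main__":
    model = RoleClassifier()
    tokens = torch.randint(0, VOCAB, (1, 12))             # one tokenized utterance
    scores = model(tokens, trig_span=(3, 4), arg_span=(5, 8))
    print(scores.shape)                                   # torch.Size([20])
```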

Feature Extraction Based on Speech Attractors in the Reconstructed Phase Space for Automatic Speech Recognition Systems

  • Shekofteh, Yasser; Almasganj, Farshad
    • ETRI Journal / v.35 no.1 / pp.100-108 / 2013
  • In this paper, a feature extraction (FE) method is proposed that is comparable to the traditional FE methods used in automatic speech recognition systems. Unlike conventional spectral-based FE methods, the proposed method evaluates the similarities between an embedded speech signal and a set of predefined speech attractor models in the reconstructed phase space (RPS) domain. In the first step, a set of Gaussian mixture models is trained to represent the speech attractors in the RPS. Next, for a new input speech frame, a posterior-probability-based feature vector is evaluated, which represents the similarity between the embedded frame and the learned speech attractors. We conduct experiments for a speech recognition task using a toolkit based on hidden Markov models, over FARSDAT, a well-known Persian speech corpus. With the proposed FE method, we gain a 3.11% absolute improvement in phoneme error rate compared to the baseline system, which uses the mel-frequency cepstral coefficient FE method.
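
A minimal sketch of the RPS-based feature extraction described in this abstract, assuming one GMM per attractor class and a simple normalized-likelihood feature; the embedding dimension, time delay, and normalization are illustrative assumptions, not the paper's settings.

```python
# Hypothetical sketch: embed each speech frame by time-delay embedding
# (reconstructed phase space), fit one GMM per assumed attractor class,
# and use normalized average log-likelihoods as a posterior-style feature.
import numpy as np
from sklearn.mixture import GaussianMixture

DIM, TAU = 3, 2          # assumed embedding dimension and time delay

def rps_embed(frame, dim=DIM, tau=TAU):
    """Time-delay embedding: each row is one point in the RPS."""
    n = len(frame) - (dim - 1) * tau
    return np.stack([frame[i:i + n] for i in range(0, dim * tau, tau)], axis=1)

def train_attractor_models(frames_per_class, n_components=8):
    """Fit one GMM per (assumed) attractor class on its embedded frames."""
    models = []
    for frames in frames_per_class:
        pts = np.vstack([rps_embed(f) for f in frames])
        models.append(GaussianMixture(n_components=n_components).fit(pts))
    return models

def posterior_feature(frame, models):
    """Per-frame feature: normalized average log-likelihood under each model."""
    pts = rps_embed(frame)
    loglik = np.array([m.score_samples(pts).mean() for m in models])
    loglik -= loglik.max()                      # for numerical stability
    post = np.exp(loglik)
    return post / post.sum()                    # sums to 1, like a posterior

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_classes = [[rng.standard_normal(200) for _ in range(5)] for _ in range(4)]
    models = train_attractor_models(fake_classes)
    feat = posterior_feature(rng.standard_normal(200), models)
    print(feat.shape, feat.sum())               # (4,) 1.0
```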

A Study on a Searching, Extraction and Approximation-Synthesis of Transition Segment in Continuous Speech (연속음성에서 천이구간의 탐색, 추출, 근사합성에 관한 연구)

  • Lee, Si-U
    • The Transactions of the Korea Information Processing Society / v.7 no.4 / pp.1299-1304 / 2000
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a single frame. Therefore, I propose a method for searching, extracting, and approximation-synthesizing the TSIUVC (Transition Segment Including UnVoiced Consonant) so that voiced and unvoiced consonants do not coexist in a frame. The method is based on the zero-crossing rate and a pitch detector using an FIR-STREAK digital filter. As a result, the extraction rates of the TSIUVC are 84.8% (plosive), 94.9% (fricative), and 92.3% (affricate) for female voices, and 88% (plosive), 94.9% (fricative), and 92.3% (affricate) for male voices. I also obtain high-quality approximation-synthesis waveforms within the TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. This method can be applied to low-bit-rate speech coding, speech analysis, and speech synthesis.
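
A minimal sketch of the zero-crossing-rate cue used in the search for transition segments, under stated assumptions: the paper's FIR-STREAK pitch detector is not reproduced, and the frame length and ZCR threshold are illustrative values.

```python
# Hypothetical sketch: flag frames whose zero-crossing rate suggests
# unvoiced (consonant-like) content, a first step toward locating
# transition segments. Frame length and threshold are assumptions.
import numpy as np

FRAME_LEN = 240          # assumed frame length in samples (e.g. 30 ms at 8 kHz)
ZCR_THRESHOLD = 0.25     # assumed ratio above which a frame is "unvoiced-like"

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return np.mean(signs[1:] != signs[:-1])

def unvoiced_like_frames(signal, frame_len=FRAME_LEN):
    """Indices of frames whose ZCR suggests unvoiced content."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        zcr = zero_crossing_rate(signal[start:start + frame_len])
        flags.append(zcr > ZCR_THRESHOLD)
    return [i for i, f in enumerate(flags) if f]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(8000) / 8000.0
    voiced = np.sin(2 * np.pi * 120 * t[:4000])      # low-ZCR, tone-like part
    noisy = rng.standard_normal(4000)                # high-ZCR, noise-like part
    print(unvoiced_like_frames(np.concatenate([voiced, noisy])))
```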

Adaptive Background Generation for Vehicle Tracking System (차량 추적 시스템을 위한 적응적 배경 영상 생성)

  • 장승호; 정정훈; 신정호; 박주용; 백준기
    • Proceedings of the IEEK Conference / 2003.11a / pp.413-416 / 2003
  • This paper proposes an adaptive background image generation method based on frame differencing for traffic monitoring. The performance of conventional methods is limited when there are many vehicles due to traffic jams. To improve on this, we use frame differencing to separate vehicles from the background, and within the frame differencing we adopt a selective approach that uses only the parts of the image not considered to be vehicles for background extraction. The proposed method generates the background more effectively than conventional methods, even in the presence of heavy traffic.
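
A minimal sketch of the selective background update described in this abstract, under stated assumptions: pixels with a large frame difference are treated as possible vehicle pixels and excluded from a running-average background update. The learning rate and difference threshold are illustrative values, not the paper's.

```python
# Hypothetical sketch: blend the current frame into the background only
# at pixels whose frame difference is small (assumed to be non-vehicle).
import numpy as np

ALPHA = 0.05        # assumed background learning rate
DIFF_THRESH = 15.0  # assumed gray-level difference threshold

def update_background(background, prev_frame, curr_frame):
    """Update the background estimate only at 'static' pixels."""
    moving = np.abs(curr_frame.astype(float) - prev_frame.astype(float)) > DIFF_THRESH
    blended = (1 - ALPHA) * background + ALPHA * curr_frame
    return np.where(moving, background, blended)   # keep old value under vehicles

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    h, w = 120, 160
    background = np.full((h, w), 100.0)             # initial background estimate
    prev = background + rng.normal(0, 2, (h, w))    # static scene with noise
    curr = prev.copy()
    curr[40:60, 50:90] += 80                        # a bright "vehicle" region
    background = update_background(background, prev, curr)
    print(background[50, 60], background[10, 10])   # vehicle pixel vs. static pixel
```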

The Scene Analysis and Keyframe Extraction for Content-Based Indexing on Compressed Image Sequence (압축된 영상 시퀀스에서 내용 기반 색인을 위한 장면 분석 및 키 프레임 추출)

  • 오상헌; 김상렬; 김주도; 이근영
    • Proceedings of the IEEK Conference / 1999.11a / pp.605-608 / 1999
  • In this paper, we propose several scene analysis algorithms. These algorithms, based on image differences and histograms, operate on the sequence of DC coefficients extracted from Motion JPEG or MPEG without full-frame decompression, since the DC sequence retains most of the information of the full frame while containing much less data. Experimental results show less than 1/64 of the full-frame analysis complexity while accurately detecting scene changes and extracting key frames.
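
A minimal sketch of the histogram-based scene analysis on DC-coefficient thumbnails described in this abstract: a shot boundary is declared when the histogram difference between consecutive DC images exceeds a threshold, and the first frame of each shot is kept as a key frame. The bin count, threshold, and key-frame choice are illustrative assumptions.

```python
# Hypothetical sketch: shot-change detection on a sequence of DC-coefficient
# thumbnails (1/64-size images) using a normalized histogram difference.
import numpy as np

N_BINS = 32
HIST_THRESH = 0.4   # assumed normalized histogram-difference threshold

def histogram(dc_image):
    h, _ = np.histogram(dc_image, bins=N_BINS, range=(0, 255))
    return h / h.sum()

def key_frame_indices(dc_sequence):
    """Return indices of the first frame of each detected shot."""
    keys = [0]
    prev_hist = histogram(dc_sequence[0])
    for i, dc in enumerate(dc_sequence[1:], start=1):
        hist = histogram(dc)
        if 0.5 * np.abs(hist - prev_hist).sum() > HIST_THRESH:  # total variation
            keys.append(i)
        prev_hist = hist
    return keys

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    dark = rng.uniform(0, 80, (10, 30, 40))       # ten dark DC thumbnails
    bright = rng.uniform(150, 255, (10, 30, 40))  # ten bright DC thumbnails
    print(key_frame_indices(np.concatenate([dark, bright])))   # expect [0, 10]
```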

Fast key-frame extraction for 3D reconstruction from a handheld video

  • Choi, Jongho; Kwon, Soonchul; Son, Kwangchul; Yoo, Jisang
    • International Journal of Advanced Smart Convergence / v.5 no.4 / pp.1-9 / 2016
  • In order to reconstruct a 3D model from video sequences, it is essential to select key frames from which a geometric model can be estimated easily. This paper proposes a method to easily extract informative frames from a handheld video. The method combines selection criteria based on determining an appropriate baseline between frames, frame jumping for fast searching in the video, geometric robust information criterion (GRIC) scores for the frame-to-frame homography and fundamental matrix, and blurry-frame removal. Experiments with videos taken in indoor spaces show that the proposed method creates a more robust 3D point cloud than existing methods, even in the presence of motion blur and degenerate motions.
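
A minimal sketch of two of the selection cues mentioned in this abstract, under stated assumptions: a blur measure based on the variance of a Laplacian response, and a GRIC-style score comparing a homography fit against a fundamental-matrix fit from their residuals. The constants follow a common form of Torr's criterion and are assumptions; the paper's exact GRIC, baseline determination, and frame-jumping logic are not reproduced.

```python
# Hypothetical sketch: (1) variance-of-Laplacian blur measure, and
# (2) a GRIC-style comparison of a homography (d=2, k=8) vs. a
# fundamental matrix (d=3, k=7) from per-correspondence residuals.
import numpy as np

def laplacian_variance(gray):
    """Blur measure: low variance of the Laplacian response suggests blur."""
    lap = (-4 * gray[1:-1, 1:-1] + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return lap.var()

def gric(residuals, sigma, d, k, r=4):
    """GRIC-style score: lower is better; residuals are per-correspondence errors."""
    rho = np.minimum(residuals ** 2 / sigma ** 2, 2.0 * (r - d))
    n = len(residuals)
    return rho.sum() + np.log(r) * d * n + np.log(r * n) * k

def prefer_fundamental(res_h, res_f, sigma=1.0):
    """A frame pair is 'informative' if the fundamental matrix explains it better."""
    return gric(res_f, sigma, d=3, k=7) < gric(res_h, sigma, d=2, k=8)

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    sharp = rng.uniform(0, 255, (100, 100))
    flat = np.full((100, 100), 128.0)
    print(laplacian_variance(sharp) > laplacian_variance(flat))   # expect True
    res_h = np.abs(rng.normal(0, 3.0, 200))   # large homography residuals (parallax)
    res_f = np.abs(rng.normal(0, 0.5, 200))   # small epipolar residuals
    print(prefer_fundamental(res_h, res_f))   # expect True
```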