• Title/Summary/Keyword: Content-based approach

Speech Enhancement Using Phase-Dependent A Priori SNR Estimator in Log-Mel Spectral Domain

  • Lee, Yun-Kyung;Park, Jeon Gue;Lee, Yun Keun;Kwon, Oh-Wook
    • ETRI Journal / v.36 no.5 / pp.721-729 / 2014
  • We propose a novel phase-based method for single-channel speech enhancement to extract and enhance the desired signals in noisy environments by utilizing the phase information. In the method, a phase-dependent a priori signal-to-noise ratio (SNR) is estimated in the log-mel spectral domain to utilize both the magnitude and phase information of input speech signals. The phase-dependent estimator is incorporated into the conventional magnitude-based decision-directed approach that recursively computes the a priori SNR from noisy speech. Additionally, we reduce the performance degradation owing to the one-frame delay of the estimated phase-dependent a priori SNR by using a minimum mean square error (MMSE)-based and maximum a posteriori (MAP)-based estimator. In our speech enhancement experiments, the proposed phase-dependent a priori SNR estimator is shown to improve the output SNR by 2.6 dB for both the MMSE-based and MAP-based estimator cases as compared to a conventional magnitude-based estimator.
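
For orientation, the conventional magnitude-based decision-directed recursion that the abstract builds on can be sketched as follows. This is a minimal illustration for a single frequency bin, not the paper's phase-dependent log-mel estimator. The function name, the smoothing factor `alpha`, the floor `xi_min`, the constant noise power, and the Wiener gain (used in place of an MMSE or MAP gain) are all assumptions made for the example.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Estimate the a priori SNR per frame for one frequency bin.

    noisy_power: list of noisy-speech power values |Y(l)|^2, one per frame
    noise_power: noise power estimate (assumed known and constant here)
    alpha:       smoothing factor of the recursion (assumed value)
    xi_min:      lower bound on the a priori SNR (assumed value)
    """
    xi_track = []
    prev_clean_power = 0.0  # clean-speech power estimate from the previous frame
    for y_pow in noisy_power:
        gamma = y_pow / noise_power  # a posteriori SNR of the current frame
        # Decision-directed rule: mix the previous frame's estimate with the
        # half-wave-rectified instantaneous estimate (gamma - 1).
        xi = alpha * prev_clean_power / noise_power \
            + (1.0 - alpha) * max(gamma - 1.0, 0.0)
        xi = max(xi, xi_min)
        gain = xi / (1.0 + xi)  # Wiener gain, an assumption for this sketch
        prev_clean_power = (gain ** 2) * y_pow
        xi_track.append(xi)
    return xi_track
```

Because each estimate depends on the clean-speech estimate of the previous frame, the recursion carries the one-frame delay that the abstract refers to.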

Fast Random-Forest-Based Human Pose Estimation Using a Multi-scale and Cascade Approach

  • Chang, Ju Yong;Nam, Seung Woo
    • ETRI Journal / v.35 no.6 / pp.949-959 / 2013
  • Since the recent launch of Microsoft Xbox Kinect, research on 3D human pose estimation has attracted a lot of attention in the computer vision community. Kinect shows impressive estimation accuracy and real-time performance on massive graphics processing unit hardware. In this paper, we focus on further reducing the computational complexity of the existing state-of-the-art method to make real-time 3D human pose estimation applicable to devices with lower computing power. To this end, we propose two simple approaches to speed up the random-forest-based human pose estimation method. In the original algorithm, the random forest classifier is applied to all pixels of the segmented human depth image. We first use a multi-scale approach to reduce the number of such calculations. Second, the complexity of the random forest classification itself is decreased by the proposed cascade approach. Experimental results on real data show that our method is effective and works in real time (30 fps) without any parallelization efforts.

Text-driven Speech Animation with Emotion Control

  • Chae, Wonseok;Kim, Yejin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3473-3487 / 2020
  • In this paper, we present a new approach to creating speech animation with emotional expressions using a small set of example models. To generate realistic facial animation, two example models called key visemes and expressions are used for lip-synchronization and facial expressions, respectively. The key visemes represent lip shapes of phonemes such as vowels and consonants while the key expressions represent basic emotions of a face. Our approach utilizes a text-to-speech (TTS) system to create a phonetic transcript for the speech animation. Based on a phonetic transcript, a sequence of speech animation is synthesized by interpolating the corresponding sequence of key visemes. Using an input parameter vector, the key expressions are blended by a method of scattered data interpolation. During the synthesizing process, an importance-based scheme is introduced to combine both lip-synchronization and facial expressions into one animation sequence in real time (over 120Hz). The proposed approach can be applied to diverse types of digital content and applications that use facial animation with high accuracy (over 90%) in speech recognition.
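
The idea of synthesizing animation by interpolating a timed sequence of key visemes can be sketched as below. The abstract does not say which interpolation scheme is used, so plain linear interpolation is an assumption here, as are the function names and the representation of a viseme as a flat list of coordinates. Keyframe times are assumed to be strictly increasing.

```python
def lerp_pose(a, b, t):
    """Linearly blend two poses (equal-length lists of floats) with weight t in [0, 1]."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def sample_viseme_track(keyframes, time):
    """Return the interpolated pose at `time`.

    keyframes: list of (time, pose) pairs sorted by strictly increasing time,
               e.g. one pair per phoneme in a phonetic transcript (hypothetical layout).
    """
    if time <= keyframes[0][0]:
        return list(keyframes[0][1])  # before the first key: hold the first pose
    for (t0, p0), (t1, p1) in zip(keyframes, keyframes[1:]):
        if t0 <= time <= t1:
            t = (time - t0) / (t1 - t0)  # normalized position between the two keys
            return lerp_pose(p0, p1, t)
    return list(keyframes[-1][1])  # after the last key: hold the last pose
```

Sampling this function once per rendered frame yields the animation sequence. Blending with facial expressions, as described in the abstract, would be layered on top and is not shown.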

Model Adaptation Using Discriminative Noise Adaptive Training Approach for New Environments

  • Jung, Ho-Young;Kang, Byung-Ok;Lee, Yun-Keun
    • ETRI Journal / v.30 no.6 / pp.865-867 / 2008
  • A conventional environment adaptation for robust speech recognition is usually conducted using transform-based techniques. Here, we present a discriminative adaptation strategy based on a multi-condition-trained model, and propose a new method to provide universal application to a new environment using the environment's specific conditions. Experimental results show that a speech recognition system adapted using the proposed method works successfully for other conditions as well as for those of the new environment.

An approach for improving the performance of the Content-Based Image Retrieval (CBIR)

  • Jeong, Inseong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.30 no.6_2 / pp.665-672 / 2012
  • Amid the rapidly increasing number and volume of images in remote sensing imagery databases, Content-Based Image Retrieval (CBIR) is an effective tool for searching for an image feature or image content that a user wants to retrieve. It seeks to capture salient features from a 'query' image and then to locate other image regions with similar features elsewhere in the image database. For a CBIR approach that uses texture as its primary feature primitive, designing a texture descriptor that better represents image content is key to improving CBIR results. For this purpose, an extended feature vector combining the Gabor filter and the co-occurrence histogram method is suggested and evaluated against quantitative and qualitative retrieval performance criteria. For better CBIR performance, assessing similarity between high-dimensional feature vectors is also a challenging issue. Therefore, several distance metrics (i.e. the L1 and L2 norms) are tried to measure the closeness between two feature vectors, and their impact on the retrieval result is analyzed. In this paper, experimental results are presented with several CBIR samples. The current results show that 1) the overall retrieval quantity and quality are improved by combining two types of feature vectors, 2) some features are better retrieved by a specific feature vector, and 3) retrieval result quality (i.e. the ranking of retrieved image tiles) is sensitive to the adopted similarity metric when the extended feature vector is employed.
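
The sensitivity of tile ranking to the chosen distance metric can be illustrated with a small sketch. The feature vectors below are made-up two-dimensional values rather than Gabor or co-occurrence features, and the function and tile names are hypothetical.

```python
import math


def l1_distance(a, b):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """L2 (Euclidean) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_tiles(query, tiles, distance):
    """Return tile names ordered from most to least similar to the query."""
    return sorted(tiles, key=lambda name: distance(query, tiles[name]))


# Toy feature vectors (illustrative values only).
query = [0.0, 0.0]
tiles = {"tile_a": [3.0, 0.0], "tile_b": [2.0, 2.0]}

ranking_l1 = rank_tiles(query, tiles, l1_distance)  # tile_a first (3 < 4)
ranking_l2 = rank_tiles(query, tiles, l2_distance)  # tile_b first (about 2.83 < 3)
```

The same two tiles swap places depending on the norm, which is the kind of metric dependence the abstract reports for the extended feature vector.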

Designing the Content-Based Korean Instructional Model Using the Flipped Learning

  • Mun, Jung-Hyun
    • Journal of the Korea Society of Computer and Information / v.23 no.6 / pp.15-21 / 2018
  • The purpose of this study is to design a content-based Korean class model using flipped learning for foreign students. The class model presented in this paper leads language learning through content learning, and it enables students to be more active and to take the initiative in class. Prior to designing the model, the concepts, educational significance, and characteristics of flipped learning were reviewed through previous studies. The paper then emphasizes the necessity of adapting flipped learning to the content-based teaching method in Korean language education. It also suggests standards and principles of composition for the content-based teaching method using flipped learning. After designing the instructional model based on the suggested standards and principles, it presents a course of instruction showing how learning methods, contents, and activities should be carried out step by step. The content-based Korean class model using flipped learning will be an alternative approach for overcoming the limitations of the teacher-centered and lecture-based teaching methods that dominate the present classroom environment.

Human-Content Interface : A Friction-Based Interface Model for Efficient Interaction with Android App and Web-Based Contents

  • Kim, Jong-Hyun
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.55-62 / 2021
  • In this paper, we propose a human-content interface that allows users to search data quickly and efficiently through friction-based scrolling with regions of interest (ROI). Our approach, inspired by how users look for information or content of interest, efficiently calculates the ROI for a given content. Using a kernel inspired by the Gaussian mixture model (GMM), information is searched by moving the screen smoothly and quickly to the location of the information of interest to the user. Linear interpolation is applied to produce a softer inertia, which is then applied to scrolling. As a result, unlike the existing approach in which information is searched solely according to the user's input, our method can find information or content that the user is interested in more easily and intuitively through friction-based scrolling. For this reason, the user can save search time.

Combining Collaborative, Diversity and Content Based Filtering for Recommendation System

  • Shrestha, Jenu;Uddin, Mohammed Nazim;Jo, Geun-Sik
    • Proceedings of the Korea Intelligent Information System Society Conference / 2007.11a / pp.602-609 / 2007
  • Combining collaborative filtering with some other technique is most common in hybrid recommender systems. As many items recommended by collaborative filtering tend to be similar with respect to content, the collaborative-content hybrid system suffers both in recommendation quality and in recommending new items. To alleviate this problem, we have developed a novel method that uses a diversity metric to select dissimilar items among those recommended by collaborative filtering; these items, together with the input, are fed into the content space to improve the recommendation and include new items. We present experimental results on the MovieLens dataset that show how our approach performs better than a simple content-based system and a naive hybrid system.
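
One way to picture the diversity-selection step is the greedy sketch below. The abstract does not specify the diversity metric, so the Jaccard distance on genre sets and the max-min selection rule are assumptions chosen for illustration, as are all names and the toy data.

```python
def jaccard_distance(a, b):
    """Dissimilarity between two feature sets: 0.0 for identical, 1.0 for disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def select_diverse(candidates, features, k):
    """Greedily pick up to k mutually dissimilar items.

    candidates: item ids in collaborative-filtering rank order (hypothetical input)
    features:   dict mapping item id to a set of content features, e.g. genres
    """
    if not candidates or k <= 0:
        return []
    selected = [candidates[0]]  # always keep the top-ranked item
    remaining = list(candidates[1:])
    while remaining and len(selected) < k:
        # Pick the item whose closest already-selected item is farthest away.
        best = max(
            remaining,
            key=lambda c: min(jaccard_distance(features[c], features[s]) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy catalogue (illustrative only).
features = {
    "m1": {"action", "scifi"},
    "m2": {"action", "scifi"},
    "m3": {"romance"},
    "m4": {"action"},
}
diverse = select_diverse(["m1", "m2", "m3", "m4"], features, 3)
```

In this toy run the near-duplicate `m2` is skipped in favor of `m3` and `m4`. In a hybrid system along the lines of the abstract, the selected items would then be passed on to the content-based stage.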

The Study of the Content-oriented Spatial Design Trends - Based on the characteristic of "matter" in space - (콘텐츠 중심의 공간디자인 경향에 관한 연구 - 공간의 질료적 특성을 중심으로 -)

  • Park, Young-Tae
    • Korean Institute of Interior Design Journal / v.18 no.2 / pp.3-15 / 2009
  • Amid the swirl of change in the world, media, the development of information technology, and enterprise-oriented capitalism have tied the world together as one unit and have entered a period of adaptation. Reflecting on these concepts, this paper is grounded in the theory of "matter as the content" and explores the design of essence and a scientific approach. The design process is a process of condensation developed through the exchange of reason and sensibility. This process, characterized by the objective manipulation of a given condition and the intervention of extrapolation, is a new way of thinking and a new approach rather than a concept of ways and means. The research in this paper is based on paintings and the analysis of essential experiments in design, the definition of the term "content" from the viewpoints of philosophy and design, and the analysis of cases in architecture and spatial design. As a result, this paper shows that these designs are not just "Simulacre" but the essential eidos, and that "content" is the core of these designs, which can produce a prototype as a machine. These designs can also serve as a relatively persuasive methodology for reflexive modernization.

A Content-Based Synchronization Approach using Scene Keywords in Enhanced TV based on MPEG-4 (MPEG-4 기반 연동형 방송에서 장면 키워드를 이용한 내용 기반 동기화 기법)

  • Yim, Hyun-Jeong;Lim, Soon-Bum
    • Journal of KIISE:Computing Practices and Letters / v.16 no.6 / pp.737-741 / 2010
  • When implementing Enhanced TV services, the time synchronization between the video stream that forms the background and the data content overlaid on the audio/video is an important issue. Currently, however, the basic method of synchronizing the data in the MPEG-4 environment is based on absolute time values. For more efficient synchronization when developing Enhanced TV content, this paper proposes a content-based synchronization in which the data content varies depending on the video content. The proposed content-based synchronization method is implemented by defining BIFS nodes more broadly, based on scene keywords, and then using MPEG-7 metadata.