• Title/Summary/Keyword: spatial cues

Search results: 34

An efficient method of spatial cues and compensation method of spectrums on multichannel spatial audio coding (멀티채널 Spatial Audio Coding에서의 효율적인 Spatial Cues 사용과 그에 따른 Spectrum 보상방법)

  • Lee, Byong-Hwa; Beack, Seung-Kwon; Seo, Jeong-Gil; Han, Min-Soo
    • MALSORI, no.53, pp.157-169, 2005
  • This paper proposes an efficient method of representing spatial cues in multichannel spatial audio coding. The recently introduced Binaural Cue Coding (BCC) method represents multichannel audio signals by means of the Inter-Channel Level Difference (ICLD) or the Source Index (SI). In this paper, we express ICLD and SI information more efficiently based on the Inter-Channel Correlation (ICC): different spatial cues are adopted according to the ICC, and a compensation method is proposed for the empty spectral regions created when SI is used. We evaluated the method with a MOS test and by measuring spectral distortion. The results show that the proposed method reduces the bitrate of the side information without large degradation of audio quality.

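A minimal NumPy sketch of the two cues named in this abstract, ICLD (inter-channel level difference in dB) and ICC (normalized inter-channel correlation), computed on a pair of time-aligned channel frames. This illustrates the standard cue definitions only, not the authors' implementation; the frame data below are synthetic.

```python
import numpy as np

def icld_db(ch1: np.ndarray, ch2: np.ndarray, eps: float = 1e-12) -> float:
    """Inter-Channel Level Difference: ratio of frame energies in dB."""
    e1 = np.sum(ch1.astype(np.float64) ** 2)
    e2 = np.sum(ch2.astype(np.float64) ** 2)
    return 10.0 * np.log10((e1 + eps) / (e2 + eps))

def icc(ch1: np.ndarray, ch2: np.ndarray, eps: float = 1e-12) -> float:
    """Inter-Channel Correlation: normalized zero-lag cross-correlation in [-1, 1]."""
    num = np.sum(ch1 * ch2)
    den = np.sqrt(np.sum(ch1 ** 2) * np.sum(ch2 ** 2)) + eps
    return float(num / den)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal(1024)
    right = 0.5 * left + 0.1 * rng.standard_normal(1024)  # correlated, quieter channel
    print(f"ICLD = {icld_db(left, right):.2f} dB, ICC = {icc(left, right):.3f}")
```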

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa; Park, Ji-Hun; Kim, Hong-Kook; Lee, Yeon-Woo; Lee, Seong-Ro
    • Phonetics and Speech Sciences, v.2 no.3, pp.141-148, 2010
  • In this paper, voice activity detection (VAD) employing spatial cues is proposed for dual-channel noisy speech recognition. In the proposed method, a probability model for speech presence/absence is constructed using spatial cues obtained from the dual-channel input signal, and speech activity intervals are detected through this model. In particular, the spatial cues consist of the interaural time differences and interaural level differences of the dual-channel speech signals, and the probability model for speech presence/absence is based on a Gaussian kernel density. To evaluate the performance of the proposed VAD method, speech recognition is performed on segments that include only the speech intervals detected by the proposed method, and its performance is compared with that of an SNR-based method, a direction of arrival (DOA) based method, and a phase vector based method. The speech recognition experiments show that the proposed method outperforms the conventional methods, providing relative word error rate reductions of 11.68%, 41.92%, and 10.15% compared with the SNR-based, DOA-based, and phase vector based methods, respectively.

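The abstract describes frame-level spatial cues (interaural time and level differences) scored against Gaussian kernel density models of speech presence and absence. The sketch below, assuming GCC-PHAT for the time difference and scipy's gaussian_kde for the density models, shows one plausible way to wire these pieces together; the training cues, example signal, and likelihood-ratio threshold are hypothetical, not taken from the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde

def frame_cues(x_left, x_right):
    """Return (ITD in samples via GCC-PHAT, ILD in dB) for one dual-channel frame."""
    x_left = np.asarray(x_left, dtype=float)
    x_right = np.asarray(x_right, dtype=float)
    n = int(2 ** np.ceil(np.log2(2 * len(x_left))))
    X1, X2 = np.fft.rfft(x_left, n), np.fft.rfft(x_right, n)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12                 # PHAT weighting
    cc = np.fft.irfft(cross, n)
    cc = np.concatenate((cc[-(n // 2):], cc[:n // 2]))
    itd = int(np.argmax(cc)) - n // 2              # lag (in samples) of the correlation peak
    ild = 10.0 * np.log10((np.sum(x_left ** 2) + 1e-12) /
                          (np.sum(x_right ** 2) + 1e-12))
    return float(itd), float(ild)

# Hypothetical training cues: rows are (ITD, ILD) pairs from labeled frames.
rng = np.random.default_rng(0)
speech_cues = rng.normal([3.0, 4.0], [1.0, 1.5], size=(500, 2))
noise_cues = rng.normal([0.0, 0.0], [4.0, 4.0], size=(500, 2))
kde_speech = gaussian_kde(speech_cues.T)           # gaussian_kde expects (dims, samples)
kde_noise = gaussian_kde(noise_cues.T)

def is_speech(itd, ild, threshold=1.0):
    """Likelihood-ratio decision between the speech and non-speech density models."""
    point = np.array([[itd], [ild]])
    ratio = kde_speech(point)[0] / (kde_noise(point)[0] + 1e-12)
    return ratio > threshold

if __name__ == "__main__":
    t = np.arange(512) / 16000.0
    left = np.sin(2 * np.pi * 440 * t)
    right = 0.7 * np.roll(left, 3)                 # delayed, attenuated copy of the left channel
    itd, ild = frame_cues(left, right)
    print(itd, ild, is_speech(itd, ild))
```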

The Effects of Variety and Visual Cue on Perceived Quantity and Consumer Attitude toward Participation into Sales Promotion Events

  • Lee, Changhyun; Kim, Youngchan
    • Asia Marketing Journal, v.21 no.1, pp.65-87, 2019
  • Most studies on how people perceive a given quantity of items were conducted exclusively with visual cues and offered only spatial-area-based explanations, such as spatial estimation and perceptual grouping theories. This article establishes how people perceive a given quantity when only a written description is provided without any visual cue. Across two studies we show that variety decreases perceived quantity when a visual cue is given, while variety increases perceived quantity when a visual cue is not given. This is because people rely heavily on spatial area when a visual cue is present, and are prone to confirmation bias when they are given only written descriptions. Furthermore, we show that quantity perception mediates consumers' attitude, namely the intention to participate in sales promotion events. Lastly, we summarize the article and discuss its contributions, implications, limitations, and suggestions for future research.

Improving visual relationship detection using linguistic and spatial cues

  • Jung, Jaewon; Park, Jongyoul
    • ETRI Journal, v.42 no.3, pp.399-410, 2020
  • Detecting visual relationships in an image is important for image understanding. It enables higher-level image understanding tasks such as predicting the next scene and understanding what occurs in an image. A visual relationship comprises a subject, a predicate, and an object, and is related to visual, language, and spatial cues. The predicate explains the relationship between the subject and the object and can be categorized into classes such as prepositions and verbs. A large visual gap can exist even among relationships that share the same predicate. This study improves upon a previous study, which uses language cues with two losses and a spatial cue that includes only individual object information, by adding relative information about the subject and object. The architectural limitation of the extant study is demonstrated and overcome so that all zero-shot visual relationships can be detected. A new problem is discovered, and an explanation of how it decreases performance is provided. Experiments on the VRD and VG datasets show a significant improvement over previous results.
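
The relative spatial cue mentioned in this abstract can be illustrated by encoding the object bounding box relative to the subject bounding box instead of describing each box individually. The feature set in the sketch below (normalized offsets, relative size, relative area) is a common choice and an assumption, not necessarily the one used in the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

def relative_spatial_feature(subj: Box, obj: Box) -> Tuple[float, ...]:
    """Encode the object box relative to the subject box (offsets, scale, area)."""
    sx, sy, sw, sh = subj[0], subj[1], subj[2] - subj[0], subj[3] - subj[1]
    ox, oy, ow, oh = obj[0], obj[1], obj[2] - obj[0], obj[3] - obj[1]
    return (
        (ox - sx) / sw,          # horizontal offset, normalized by subject width
        (oy - sy) / sh,          # vertical offset, normalized by subject height
        ow / sw,                 # relative width
        oh / sh,                 # relative height
        (ow * oh) / (sw * sh),   # relative area
    )

# Hypothetical boxes for a subject-object pair in a single image.
print(relative_spatial_feature((100, 50, 180, 200), (80, 150, 260, 300)))
```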

Detection of Forest Areas using Airborne LIDAR Data (항공 라이다데이터를 이용한 산림영역 탐지)

  • Hwang, Se-Ran; Kim, Seong-Joon; Lee, Im-Pyeong
    • Spatial Information Research, v.18 no.3, pp.23-32, 2010
  • LIDAR data are useful for forest applications such as bare-earth DEM generation for forested areas and estimation of tree height and forest biomass. As a core preprocessing step for most forest applications, this study develops an efficient method to detect forest areas from LIDAR data. First, we suggest three perceptual cues, based on multiple-return characteristics, height deviation, and spatial distribution, which are expected to be reliable for forest area detection from LIDAR data. We then classify potential forest areas based on each individual cue and refine them with a bi-morphological process to eliminate falsely detected areas and smooth the boundaries. The refined forest areas were compared with reference data manually generated from an aerial image. The methods based on all three cue types achieve an accuracy of more than 90%. In particular, the method based on multiple returns is slightly better than the other two in terms of simplicity and accuracy. It is also shown that combining the individual results from each cue can enhance the classification accuracy.
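
As a rough illustration of the multiple-return cue described above, the sketch below grids the LIDAR points, computes the fraction of multi-return points per cell, thresholds it, and cleans the binary mask with morphological opening and closing in the spirit of the refinement step. The cell size, threshold, and synthetic point cloud are assumptions, not values from the paper.

```python
import numpy as np
from scipy import ndimage

def forest_mask_from_returns(x, y, num_returns, cell_size=5.0, ratio_threshold=0.3):
    """Mark grid cells as forest when the fraction of multi-return points is high,
    then clean the binary mask with morphological opening and closing."""
    ix = ((x - x.min()) / cell_size).astype(int)
    iy = ((y - y.min()) / cell_size).astype(int)
    shape = (ix.max() + 1, iy.max() + 1)

    total = np.zeros(shape)
    multi = np.zeros(shape)
    np.add.at(total, (ix, iy), 1)
    np.add.at(multi, (ix, iy), (num_returns > 1).astype(float))

    ratio = np.divide(multi, total, out=np.zeros_like(multi), where=total > 0)
    mask = ratio > ratio_threshold

    # Remove small false detections, then smooth the boundaries.
    mask = ndimage.binary_opening(mask, structure=np.ones((3, 3)))
    mask = ndimage.binary_closing(mask, structure=np.ones((3, 3)))
    return mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 100, 2000)
    y = rng.uniform(0, 100, 2000)
    # Hypothetical scene: points in the left half behave like forest (multiple returns).
    num_returns = np.where(x < 50, rng.integers(1, 4, 2000), 1)
    mask = forest_mask_from_returns(x, y, num_returns)
    print(f"forest cells: {mask.sum()} of {mask.size}")
```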

Memory-for-Object Location in Toddlers (유아의 물체위치 기억에 관한 연구)

  • Kim, Mee Hae
    • Korean Journal of Child Studies, v.7 no.1, pp.85-95, 1986
  • The purpose of the present research was to study the effects of experimental conditions and developmental tendencies in the use of external cues in memory-for-object location in toddlers. The study consisted of two experiments. In Study 1, the subjects were 12 toddlers, 18 to 23 months old; in Study 2, 30 toddlers, 24 to 41 months old. The findings showed that memory-for-object location in toddlers differed according to experimental condition; that is, memory-for-object location in the natural condition was significantly better than in the artificial condition. Effects of external cues were also found: memory-for-object location was best in the spatial-cue condition, next best in the picture-cue condition, and poorest in the no-cue condition.


The Effect of Spatial Attention in Hangul Word Recognition: Depending on Visual Factors (한글 단어 재인에서 시각적 요인에 따른 공간주의의 영향)

  • Ko Eun Lee; Hye-Won Lee
    • Korean Journal of Cognitive Science, v.34 no.1, pp.1-20, 2023
  • In this study, we examined the effects of spatial attention in Hangul word recognition depending on visual factors. The visual complexity of words (Experiment 1) and contrast (Experiment 2) were manipulated to examine whether the effect of spatial attention differs depending on visual quality. Participants responded to words with and without codas in Experiment 1 and to words in high-contrast and low-contrast conditions in Experiment 2. The effects of spatial attention were measured as cuing effects, calculated as the difference in performance between the condition in which a spatial cue was given at the target location (valid trials) and the condition in which it was not (invalid trials). The cuing effects were similar across the word-complexity conditions, indicating that the effect of spatial attention did not differ with visual complexity. The cuing effects were greater in the low-contrast condition than in the high-contrast condition. This greater effect of spatial attention under low contrast was explained by a signal enhancement mechanism.
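
The cuing effect used in this study is simply the performance difference between valid-cue and invalid-cue trials. The tiny sketch below shows that calculation with hypothetical reaction times; the actual dependent measures and values in the paper differ.

```python
import numpy as np

def cuing_effect(rt_valid, rt_invalid):
    """Cuing effect: mean RT on invalid-cue trials minus mean RT on valid-cue trials.
    A larger positive value indicates a stronger benefit of spatial attention."""
    return float(np.mean(rt_invalid) - np.mean(rt_valid))

# Hypothetical reaction times (ms) for the two contrast conditions.
high_contrast = cuing_effect(rt_valid=[520, 540, 515], rt_invalid=[545, 560, 550])
low_contrast = cuing_effect(rt_valid=[640, 655, 630], rt_invalid=[710, 725, 715])
print(f"cuing effect: high contrast = {high_contrast:.0f} ms, "
      f"low contrast = {low_contrast:.0f} ms")
```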

The Type of e-book's Visualization by the Narrative Space (내러티브 공간에 의한 이북(e-book)의 시각화 유형)

  • Shin, Seung-Yun; Jung, Hyun-Sun
    • The Journal of the Korea Contents Association, v.14 no.7, pp.103-114, 2014
  • This study proposes a classification of directions for developing an independent study of e-book visualization. To this end, we examine Disney animation e-books that have been recognized for their literary value and entertainment. First, we clarify the concept of the e-book's aerial image and its perceptual principles. Next, we identify the subject that initiates movement and observe the presentation factors that make an actual spatial experience possible through motion-produced cues. Through this analysis, we classify the appearance elements, media, camera, and the readers' motion-produced cues into 13 parts and define them as codes. By analyzing their frequency of use in the analysis objects, we identify 46 combination exercises, which are divided into four groups according to their combinations with the independent exercises, including the actual spatial experience, the narrative spatial experience, and the experience of characters. On this basis, we analyze the characteristics of the motion-produced cues. This study is meaningful in that it extends the e-book into the film language system by classifying the types of e-book narrative visualization.

Human Performance Evaluation of Virtual Object Moving Task in the Different Temporal, Spatial and Pictorial Resolution of a Stereoscopic Display (가상현실 표시장치에서의 시간적, 공간적, 회화적 해상도에 따른 가상물체 이동작업의 인간성능 평가)

  • Park, Jae-Hee
    • IE interfaces, v.18 no.1, pp.82-87, 2005
  • Most virtual reality systems ask users to control 3D objects or to navigate a 3D world using 3D controllers. To maximize human performance in such control tasks, the design of the virtual reality system and its input and output devices should be optimized. In this study, an experiment was designed to investigate the effects of three resolution factors of a virtual reality system on human performance. Six subjects performed the experiment with two frame rates, three spatial resolutions, and three pictorial contents. The results showed that the greater the spatial resolution, the higher the human performance. For temporal resolution, a fixed frame rate of 18 Hz was better than a variable, maximized frame rate. For pictorial content, the virtual space with orientation cues produced the best performance among the three conditions, the other two being the virtual space without any orientation cue and the virtual space resembling the real world. These results can be applied to the design of virtual reality systems.

Automatic Person Identification using Multiple Cues

  • Swangpol, Danuwat; Chalidabhongse, Thanarat
    • Institute of Control, Robotics and Systems Conference Proceedings (제어로봇시스템학회 학술대회논문집), 2005.06a, pp.1202-1205, 2005
  • This paper describes a method for vision-based person identification that can detect, track, and recognize a person from video using multiple cues: height and dressing colors. The method does not require a constrained target pose or a fully frontal face image to identify the person. First, the system, which is connected to a pan-tilt-zoom camera, detects the target using motion detection and a human cardboard model. The system keeps tracking the moving target while trying to determine whether it is a human and to identify who it is among the persons registered in the database. To segment the moving target from the background scene, we employ a version of background subtraction together with spatial filtering. Once the target is segmented, we align it with the generic human cardboard model to verify whether the detected target is a human. If so, the cardboard model is also used to segment the body parts and obtain salient features such as the head, torso, and legs. The whole-body silhouette is also analyzed to obtain shape information such as height and slimness. We then use these multiple cues (at present, shirt color, trousers color, and body height) to recognize the target using a supervised self-organization process. We preliminarily tested the system on a set of 5 subjects with multiple sets of clothes. The recognition rate is 100% when a person wears clothes that were learned beforehand; when a person wears new clothes, the system fails to identify them, which means that height alone is not enough to classify persons. We plan to extend the work by adding more cues such as skin color and face recognition, utilizing the zoom capability of the camera to obtain a high-resolution view of the face, and then to evaluate the system with more subjects.

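A simplified sketch of the multi-cue matching idea in this abstract: body height and mean shirt and trouser colors are packed into one feature vector and matched against registered persons. The paper uses a supervised self-organization process; the nearest-neighbor matcher, distance threshold, and example data below are simplifications and assumptions.

```python
import numpy as np

def person_feature(height_m, shirt_rgb, trousers_rgb):
    """Concatenate body height with mean torso and leg colors into one cue vector."""
    return np.concatenate(([height_m],
                           np.asarray(shirt_rgb, dtype=float) / 255.0,
                           np.asarray(trousers_rgb, dtype=float) / 255.0))

def identify(query, database, max_distance=0.5):
    """Nearest-neighbor match against registered persons; None if nothing is close enough."""
    best_name, best_dist = None, np.inf
    for name, feat in database.items():
        dist = np.linalg.norm(query - feat)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None

# Hypothetical registered persons (height in metres, mean RGB of shirt and trousers).
registered = {
    "alice": person_feature(1.65, (200, 30, 30), (40, 40, 120)),
    "bob": person_feature(1.82, (30, 160, 60), (20, 20, 20)),
}
print(identify(person_feature(1.66, (195, 35, 28), (45, 42, 118)), registered))
```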