• Title/Summary/Keyword: Pitch detection


A study on pitch detection for RUI emotion classification based on voice (RUI용 음성신호기반의 감정분류를 위한 피치검출기에 관한 연구)

  • Byun, Sung-Woo;Lee, Seok-Pil
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2015.07a / pp.421-424 / 2015
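  • As computer technology has advanced and computer use has become widespread, much research has been conducted on human interfaces. Emotion recognition is an important technology for interaction between computers and people in human interfaces. To raise classification accuracy in emotion recognition, it is important to extract feature vectors accurately. In this paper, for accurate pitch detection, speech and non-speech segments were extracted from the speech signal, and pitch detection was performed and compared using techniques from the speech processing field: low-pass filtering and voiced-sound extraction as preprocessing, and smoothing as post-processing. As a result, voiced-sound extraction and smoothing improved the accuracy of pitch detection, whereas low-pass filtering lowered it.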

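The abstract above does not say which smoothing technique was used. As a minimal sketch of pitch-contour post-processing, assuming a running median filter (a common choice, but an assumption here), the hypothetical helper `median_smooth` below removes isolated octave-type errors from a list of per-frame pitch estimates:

```python
from statistics import median


def median_smooth(pitch_hz, window=3):
    """Running median over a pitch contour (hypothetical helper, not from the paper).

    Each frame is replaced by the median of the frames inside a window
    centred on it; the window is truncated at the edges of the contour.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(pitch_hz)):
        lo = max(0, i - half)
        hi = min(len(pitch_hz), i + half + 1)
        smoothed.append(median(pitch_hz[lo:hi]))
    return smoothed


# Made-up contour in Hz with one pitch-doubling and one pitch-halving error.
contour = [120.0, 121.0, 240.0, 122.0, 123.0, 61.0, 124.0, 125.0]
smoothed = median_smooth(contour, window=3)
```

With a window of 3, an isolated outlier can never be the median of its neighbourhood, so the 240 Hz and 61 Hz frames are replaced by values close to their neighbours.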

Video Segmentation Using Audio and Image Information (오디오와 영상 정보를 이용한 비디오 세그먼테이션)

  • 정해준;정성환
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.470-472 / 2000
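  • This paper studies video segmentation that uses audio information together with image information. When scene break detection is performed on video, which carries a large amount of information, camera pans and the presence of several different shots within a scene make effective detection difficult with image information alone. To address this problem, the audio information in the video is used as well. Experiments on a video database of about 4,000 image frames and about 30,000 audio frames, drawn from TV programs in three genres (news, commercials, and sports), confirmed better performance than using image information alone. Color histograms and DC coefficients were used as image features, and SR (silence ratio), VSTD (volume standard deviation), and NPR (non-pitch ratio) were used as audio features.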


Massive Music Resources Retrieval Method Based on Ant Colony Algorithm

  • Yun Meng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.5 / pp.1208-1222 / 2024
  • Music resources are characterized by quantization, diversification, and complication. With the rapid increase in demand for music resources, the volume of stored music resources has become very large. To improve retrieval performance and make effective use of music resources, a massive music resource retrieval method based on the ant colony algorithm is proposed. The paper constructs an autocorrelation function to extract the pitch features of music resources and classifies music resource information by calculating feature similarity. The ant colony algorithm is then used to correlate the features of the music resources, obtain the correlation result, locate the detection result, and obtain the multi-module result. Simulation results show that the proposed method achieves high precision and recall and short retrieval time, and can effectively retrieve massive music resources.
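The abstract above mentions extracting pitch with an autocorrelation function but gives no details. A minimal sketch of autocorrelation-based pitch estimation, assuming a single mono frame of samples and a hypothetical search range of 80 to 1000 Hz (the function name and parameters are illustrative, not the paper's), might look like this:

```python
import math


def autocorr_pitch(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    Illustrative sketch only: fmin and fmax are assumed search bounds.
    Returns None when no positive autocorrelation peak is found.
    """
    n = len(samples)
    if n < 2:
        return None
    mean = sum(samples) / n
    x = [s - mean for s in samples]          # remove the DC offset
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), n - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Biased autocorrelation: sum of products of the frame with a shifted copy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    if best_lag == 0:
        return None
    return sample_rate / best_lag


# Synthetic 200 Hz tone sampled at 8 kHz (period of exactly 40 samples).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(800)]
pitch = autocorr_pitch(tone, sample_rate)    # 200.0 for this synthetic tone
```

Because the biased autocorrelation sums fewer products at larger lags, the peak at the first period outweighs the peaks at its multiples, which helps avoid picking a sub-harmonic on clean signals. Real music would typically need framing and some voicing or confidence check on top of this.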

Structural damage detection through longitudinal wave propagation using spectral finite element method

  • Kumar, K. Varun;Saravanan, T. Jothi;Sreekala, R.;Gopalakrishnan, N.;Mini, K.M.
    • Geomechanics and Engineering / v.12 no.1 / pp.161-183 / 2017
  • This paper investigates damage identification in a concrete pile element by the axial wave propagation technique, using computational and experimental studies. Concrete pile foundations are now common in engineering structures, and their safety is important for preventing failure. Damage detection and estimation in a sub-structure is challenging because the sub-structure cannot readily be inspected visually, so the state of the structure or foundation can be inferred only through its static and dynamic response. Wave propagation involves dynamic impedance: whenever a wave encounters a change in impedance (due to loss of stiffness), a reflected wave is generated, and the total strain energy is split into reflected and refracted portions. Among the many frequency-domain methods, the spectral finite element method (SFEM) has been found suitable for analyzing wave propagation in real engineering structures because its formulation is based on dynamic equilibrium under harmonic steady-state excitation. The feasibility of the axial wave propagation technique is studied through numerical simulations using elementary rod theory and higher-order Love rod theory under SFEM, and through ABAQUS dynamic explicit analysis, with experimental validation. To simulate damage in a pile element, a discontinuity (impedance mismatch) is introduced by varying the cross-sectional area along its length. Both experimental and computational investigations are performed in pulse-echo and pitch-catch configurations. Analytical and experimental results are in good agreement.

Development of Automatic Fault Detection System for Chip-On-Film (칩 온 필름을 위한 자동 결함 검출 시스템 개발)

  • Ryu, Jee-Youl;Noh, Seok-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.2 / pp.313-318 / 2012
  • This paper presents an automatic system for detecting a variety of faults in fine-pitch COF (chip-on-film) with a pitch of less than $30{\mu}m$. The developed system contains circuits and a technique for quickly detecting various faults, such as hard opens, hard shorts, soft opens, and soft shorts, in fine patterns. The basic principle of fault detection is to monitor the fine differential voltage that arises from pattern resistance differences between fault-free and faulty cases. The technique also uses radio-frequency resonator arrays to amplify the fine differential voltage for easier detection. We anticipate that the proposed system can serve as an alternative to conventional COF test systems, since it detects a variety of faults quickly and accurately in the fine-pattern COF test process.

Usefullness of the Vibration Pick-Up in Detection of Pitch for Synchronization of Laryngeal Stroboscopy (후두 스트로보스코프 검사의 신호 동기화를 위한 진동 검출기의 유용성)

  • Lee, Jin-Choon;Lee, Byung-Joo;Wang, Soo-Geun;Roh, Jung-Hoon;Kwon, Sun-Bok;Jo, Cheol-Woo
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.1 / pp.26-32 / 2007
  • Objective and Background: The laryngeal stroboscope is a useful instrument for evaluating vocal cord vibration and for early detection of mucosal lesions, including invasive cancer of the vocal cord. Recently, Lee et al. (2006) developed a portable stroboscope that uses the voice as its synchronization signal. However, its ability to synchronize the flashes was frequently impaired, even in normal female subjects. The authors investigated various methods, including a vibration pick-up, a microphone, a laryngeal microphone, and a contact microphone, in order to develop a simple method as accurate as the electroglottograph (EGG) signal. The purpose of this study was to assess whether the vibration pick-up is usable and consistent with the EGG signal. Subjects and Methods: The authors compared the EGG signal with signals from a noncontact method (voice) and from contact methods (vibration pick-up, laryngeal microphone, and contact microphone) in twenty normal adults (10 male and 10 female). The number of peaks in one cycle was compared with the number of peaks in the EGG, and the percentage phase difference of the peaks was compared with the EGG. The authors also investigated which vibration pick-up site was most effective for synchronizing the strobe flashes. Three sites were analyzed: the anterior neck below the cricoid cartilage, the thyroid ala, and the suprahyoid region. Results: Among the methods for synchronizing strobe flashes, the vibration pick-up was the most effective in peak detection, and the anterior neck below the cricoid cartilage was the most useful site for the vibration pick-up. Conclusion: The authors suggest that the vibration pick-up is the most useful and effective method for synchronizing strobe flashes.


Watermarking Algorithm using Power of Subbands Decomposed by Wavelet Packet and QIM (웨이블릿 패킷 변환한 후의 대역별 에너지와 QIM을 이용한 워터마킹 알고리즘)

  • Seo, Ye-Jin;Cho, Sang-Jin;Chong, Ui-Pil
    • Journal of Korea Multimedia Society / v.14 no.11 / pp.1431-1437 / 2011
  • This paper proposes a novel watermarking algorithm that protects digital copyrights and is robust to attacks. Watermarks are embedded in the subband that contains a significant part of the signal, such as the pitch. Generally, the subband containing the pitch has the greatest energy. To find this subband, the wavelet packet transform is used to decompose the signal into subbands, and the energy of each subband is calculated. The signal in the selected subbands is transformed into the frequency domain using the FFT. The watermarks are embedded using QIM in samples higher than a certain threshold. Blind detection uses the Euclidean distance. The proposed method shows a BER of less than 5% in audio watermark benchmarking.
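The abstract names QIM (quantization index modulation) and Euclidean-distance blind detection without giving parameters. The following is a minimal scalar sketch of that general idea, assuming a hypothetical step size `step` and applying it directly to a list of coefficient values; the wavelet packet subband selection, the FFT, and the thresholding described in the abstract are omitted:

```python
def qim_embed(value, bit, step):
    """Quantize value onto the lattice for `bit` (0 or 1).

    The two lattices are offset from each other by step / 2.
    `step` is an assumed parameter, not a value from the paper.
    """
    offset = bit * step / 2.0
    return step * round((value - offset) / step) + offset


def qim_detect(value, step):
    """Blind detection: pick the bit whose lattice point is nearest.

    For a scalar, the Euclidean distance is the absolute difference.
    """
    d0 = abs(value - qim_embed(value, 0, step))
    d1 = abs(value - qim_embed(value, 1, step))
    return 0 if d0 <= d1 else 1


# Made-up coefficients and watermark bits for illustration.
coefficients = [3.7, -1.2, 8.4, 5.05]
bits = [1, 0, 1, 1]
step = 1.0

watermarked = [qim_embed(c, b, step) for c, b in zip(coefficients, bits)]
# Simulated distortion smaller than step / 4, so detection still succeeds.
noise = [0.1, -0.2, 0.15, -0.1]
received = [w + e for w, e in zip(watermarked, noise)]
recovered = [qim_detect(r, step) for r in received]
```

The step size trades robustness against distortion: a larger step tolerates more noise before a bit flips, but changes the host signal more.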

Analysis of Lower-Limb Motion during Walking on Various Types of Terrain in Daily Life

  • Kim, Myeongkyu;Lee, Donghun
    • Journal of the Ergonomics Society of Korea / v.35 no.5 / pp.319-341 / 2016
  • Objective: This research analyzed the kinetics and kinematics of lower-limb motion during walking on various terrains in order to develop a Foot-Ground Contact Detection (FGCD) algorithm using an Inertial Measurement Unit (IMU). Background: For estimating a person's location in GPS-denied environments, lower-limb kinematics based on IMU sensors and pressure insoles are known to be very useful. The IMU is mainly used to solve the lower-limb kinematics, and pressure insoles are mainly used to detect foot-ground contacts in the stance phase. However, using multiple sensors is undesirable in most cases, so FGCD based on the IMU alone can be an efficient method. Method: The orientation and acceleration of the lower limbs of 10 participants were measured with an IMU while they walked on flat ground, ascended and descended a slope, and ascended and descended stairs. The inertial information showing significant changes at heel strike (HS), full contact (FC), heel off (HO), and toe off (TO) was analyzed. Results: The results confirm that the pitch angle and pitch angle rate of the foot and shank, and the acceleration of the foot in the x and z directions, are useful for detecting the four contact events on the five walking terrains. Conclusion: An IMU-based FGCD algorithm covering all walking terrains encountered in daily life was successfully developed, based on the IMU output signals that show significant changes at the four steps of the stance phase. Application: Information on the contact between foot and ground can be used in solving lower-limb kinematics to estimate an individual's location and walking speed.

Measurement of Micro-displacement of an Object by Laser Speckle using Linear Array CCD Detection System (레이저 스펙클과 1차원 CCD소자를 이용한 물체의 미소변위측정에 관한 연구)

  • 우창헌;민동현;김수용
    • Korean Journal of Optics and Photonics / v.5 no.1 / pp.138-143 / 1994
  • A speckle correlation method was applied to measure the in-plane translation of a diffuse object with a rough surface using a linear CCD sensor and a personal computer. The displacement of the speckle pattern produced by the object under laser illumination was measured from the cross-correlation functions between the 1-D speckle profiles before and after the object translation, which were recorded by the linear CCD array sensor and sent to an IBM 386 personal computer. The sensitivity of the measurement depended on the radius of wavefront curvature of the incident beam as well as on the spatial resolution of the linear CCD array. The linear CCD array had a $15\mu\textrm{m}$ pitch and 1728 pixels. The ratio of speckle displacement to object translation varied from 1.03 to 5.20. An object translation of $3\mu\textrm{m}$ can be measured by the linear CCD sensor, whose pitch was $15\mu\textrm{m}$, when the ratio of speckle displacement to object translation was 5.20.

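As a rough check of the figures quoted in the abstract above (an illustrative calculation, not taken from the paper itself): with a displacement ratio of 5.20, an object translation of $3\mu\textrm{m}$ shifts the speckle pattern by about one CCD pixel pitch, which is consistent with $3\mu\textrm{m}$ being the smallest translation the sensor resolves.

$$
\Delta x_{\text{speckle}} = 5.20 \times 3\,\mu\textrm{m} = 15.6\,\mu\textrm{m} \approx 1.04 \times 15\,\mu\textrm{m}
$$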

A Study on A Multi-Pulse Linear Predictive Filtering And Likelihood Ratio Test with Adaptive Threshold (멀티 펄스에 의한 선형 예측 필터링과 적응 임계값을 갖는 LRT의 연구)

  • Lee, Ki-Yong;Lee, Joo-Hun;Song, Iick-Ho;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.10 no.1 / pp.20-29 / 1991
  • A fundamental assumption in the conventional linear predictive coding (LPC) analysis procedure is that the input to an all-pole vocal tract filter is a white process. In the case of periodic inputs, however, a pitch bias error is introduced into the conventional LP coefficients. Multi-pulse (MP) LP analysis can reduce this bias, provided that an estimate of the excitation is available. Since the prediction error of conventional LP analysis can be modeled as the sum of an MP excitation sequence and a random noise sequence, extracting MP sequences from the prediction error can be viewed as a classical detection and estimation problem. In this paper, we propose an algorithm in which the locations and amplitudes of the MP sequences are first obtained by applying a likelihood ratio test (LRT) to the prediction error, and LP coefficients free of pitch bias are then obtained from the MP sequences. To verify the performance enhancement, we iterate the above procedure with an adaptive threshold at each step.
