Search | Korea Science

Implementation of Quad Variable Rates ADPCM Speech CODEC on C6000 DSP considering the Environmental Noise (배경잡음을 고려한 4배 가변 압축률을 갖는 ADPCM의 C6000 DSP 실시간 구현)

Kim Dae-Sung;Han Kyong-ho
- Proceedings of the KIPE Conference / 2002.07a / pp.727-729 / 2002
In this paper, we propose a quad variable rate ADPCM coding method and its implementation on a C6000 DSP. The method modifies the standard ITU G.726 ADPCM to improve speech quality in the presence of environmental noise. Four coding rates (16 Kbps, 24 Kbps, 32 Kbps, and 40 Kbps) are used for windows of speech samples, and the rate decision threshold is set by the environmental noise level. The goal of the proposed method is to reduce the coding rate while retaining speech quality: under environmental noise, speech quality is close to that of a 40 Kbps single-rate coder while the coding rate is close to that of a 16 Kbps single-rate coder. The environmental noise level affects the coding rate and is calculated for every window of speech samples. At high noise levels, more samples are coded at higher rates to enhance quality; at low noise levels, only large speech signals are coded at higher rates and more samples are coded at lower rates to reduce the overall coding rate. Noise affects small signals considerably more, and small signals have a higher ZCR (zero crossing rate). The method is simulated on a PC and is to be implemented on a C6000 floating-point DSP board for real-time operation.
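The noise-dependent rate decision described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual decision rule, so everything below is an assumption: the function names, the base thresholds, the `noise_gain` factor, and the normalization of samples to [-1, 1] are all hypothetical. The only properties taken from the abstract are the four rates and the idea that a higher noise level lowers the thresholds, so more windows are coded at higher rates.

```python
import math

RATES_KBPS = (16, 24, 32, 40)  # the four rates named in the abstract


def rms(window):
    """Root-mean-square level of one window of samples (assumed in [-1, 1])."""
    return math.sqrt(sum(s * s for s in window) / len(window))


def choose_rate(window_rms, noise_rms,
                base_thresholds=(0.05, 0.15, 0.30), noise_gain=10.0):
    """Pick a coding rate for one window.

    Hypothetical rule, not the paper's: the three level thresholds shrink
    as the noise level grows, so noisy conditions push more windows to
    higher rates, while in quiet conditions only large signals get them.
    """
    scale = 1.0 + noise_gain * noise_rms
    thresholds = [t / scale for t in base_thresholds]
    # Count how many thresholds the window level reaches (0 to 3).
    steps = sum(1 for t in thresholds if window_rms >= t)
    return RATES_KBPS[steps]
```

With these illustrative defaults, a window at level 0.1 is coded at 24 Kbps when the noise level is zero and at 40 Kbps when the noise level is 0.5.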

Real-Time Implementation of Speaker Dependent Speech Recognition Hardware Module Using the TMS320C32 DSP : VR32 (TMS320C32 DSP를 이용한 실시간 화자종속 음성인식 하드웨어 모듈(VR32) 구현)

Chung, Ik-Joo;Chung, Hoon
- The Journal of the Acoustical Society of Korea / v.17 no.4 / pp.14-22 / 1998
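In this study, we developed a real-time speaker-dependent speech recognition hardware module (VR32) using the TMS320C32, a low-cost floating-point digital signal processor (DSP) from Texas Instruments. The hardware module consists of a 40 MHz TMS320C32 DSP, a 14-bit TLC32044 codec (or an 8-bit μ-law PCM codec), memory such as EPROM and SRAM, and logic circuits for the host interface. We also developed a PC interface board and software for evaluating the hardware module on a PC. The speech recognition algorithm performs energy- and ZCR-based endpoint detection and 10th-order weighted LPC cepstrum analysis in real time, then determines the most similar word through Dynamic Time Warping (DTW) and performs final recognition after a verification step. Endpoint detection uses an adaptive threshold, which makes it robust to noise, and the DTW algorithm was optimized in C and assembly, greatly improving computation speed. The current recognition rate is very high, above 95% for a 30-word vocabulary of the kind typically used for speed dialing in an ordinary office environment, and the module also works well in noisy environments such as background music or car noise.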
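The DTW matching step mentioned in this abstract can be sketched as follows. This is the textbook dynamic-programming form of DTW with a Euclidean local distance, not the optimized C and assembly version the authors describe. The function names, the template dictionary, and the use of short tuples in place of cepstral feature vectors are assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in seq_a only
                                     cost[i][j - 1],      # step in seq_b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

A speaker-dependent recognizer of this kind stores one or more templates per word and picks the closest one. The verification step mentioned in the abstract is not modeled here.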

Fault Detection for Ceramic Heater in CVD Equipment using Zero-Crossing Rate and Gaussian Mixture Model (영교차율과 가우시안 혼합모델을 이용한 박막증착장비의 세라믹 히터 결함 검출)

Ko, JinSeok;Mu, XiangBin;Rheem, JaeYeol
- Journal of the Semiconductor & Display Technology / v.12 no.2 / pp.67-72 / 2013
Temperature is a critical parameter for yield improvement in wafer manufacturing. In chemical vapor deposition (CVD) equipment, a crack defect in the ceramic heater reduces yield; however, there is no suitable ceramic heater fault detection system for conventional CVD equipment. This paper proposes a fault detection method for the ceramic heater in CVD equipment based on the short-time zero-crossing rate. The proposed method measures the output signal ($V_{pp}$) of the RF filter and extracts the zero-crossing rate (ZCR) as a feature vector. The extracted feature vectors have discriminative power, and a Gaussian mixture model (GMM) based fault detection method can detect faults in the ceramic heater. Experimental results on measured signals provided by a CVD equipment manufacturer indicate that the proposed method detects faults effectively under various process conditions.

Emotion Recognition in Arabic Speech from Saudi Dialect Corpus Using Machine Learning and Deep Learning Algorithms

Hanaa Alamri;Hanan S. Alshanbari
- International Journal of Computer Science & Network Security / v.23 no.8 / pp.9-16 / 2023
Speech can actively elicit feelings and attitudes through words. It is important for researchers to identify the emotional content of speech signals as well as the kind of emotion that results from the speech. In this study, we examined an emotion recognition system using an Arabic database, specifically in the Saudi dialect, drawn from a YouTube channel called Telfaz11. The four emotions examined were anger, happiness, sadness, and neutral. In our experiments, we extracted features from the audio signals, such as Mel Frequency Cepstral Coefficients (MFCC) and Zero-Crossing Rate (ZCR), and then classified emotions using machine learning algorithms (Support Vector Machine (SVM) and K-Nearest Neighbor (KNN)) and deep learning algorithms (Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM)). Our experiments showed that MFCC feature extraction with the CNN model obtained the best accuracy, 95%, demonstrating the effectiveness of this classification system in recognizing emotions in spoken Arabic.
https://doi.org/10.22937/IJCSNS.2023.23.8.2

Intelligent Adaptive Active Noise Control in Non-stationary Noise Environments (비정상 잡음환경에서의 지능형 적응 능동소음제어)

Mu, Xiangbin;Ko, JinSeok;Rheem, JaeYeol
- The Journal of the Acoustical Society of Korea / v.32 no.5 / pp.408-414 / 2013
The well-known filtered-x least mean square (FxLMS) algorithm for active noise control (ANC) systems may become unstable in non-stationary noise environments. To address this problem, Sun's algorithm and Akhtar's algorithm were developed by modifying the reference signal in the FxLMS update, but both have unsatisfactory stability when dealing with sustained impulsive noise. The proposed algorithm uses probability estimation and zero-crossing rate (ZCR) control to improve stability and performance, together with optimal parameter selection based on a fuzzy system. Computer simulation results show that the proposed algorithm converges faster and is more stable in non-stationary noise environments.
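For orientation, the baseline FxLMS update that this abstract builds on can be written out as below. This is the standard textbook form, not the proposed algorithm, and the notation is an assumption: $\mathbf{x}(n)$ is the reference signal vector, $\mathbf{w}(n)$ the adaptive filter weights, $d(n)$ the primary noise at the error sensor, $s(n)$ the secondary-path impulse response, $\hat{s}(n)$ its estimate, $\mu$ the step size, and $*$ denotes convolution.

```latex
\begin{aligned}
y(n) &= \mathbf{w}^{T}(n)\,\mathbf{x}(n) \\
e(n) &= d(n) - s(n) * y(n) \\
\mathbf{x}'(n) &= \hat{s}(n) * \mathbf{x}(n) \\
\mathbf{w}(n+1) &= \mathbf{w}(n) + \mu\, e(n)\, \mathbf{x}'(n)
\end{aligned}
```

Because the weight update scales with $e(n)\,\mathbf{x}'(n)$, a large impulsive sample in the reference or error signal produces a large weight jump, which is the source of the instability described in the abstract.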
https://doi.org/10.7776/ASK.2013.32.5.408

A Comparative Study of the Sensitivity of Parameters Used in Muscle Fatigue Analysis (근육 피로도 분석시 사용되는 매개변수들간의 민감도 비교 연구)

정명철;김정룡
- Proceedings of the ESK Conference / 1997.10a / pp.406-413 / 1997
To analyze localized muscle fatigue quantitatively using electromyography (EMG), we selected five widely used parameters: the first-order coefficient of the autoregressive (AR) model, root mean square (RMS), zero crossing rate (ZCR), mean power frequency (MPF), and median frequency (MF). We then identified which parameters sensitively reflect the degree of muscle fatigue as a function of exerted force and elapsed time. Ten subjects performed trunk extension under sustained isometric contraction, with measurements taken from the left and right erector spinae muscles. The required force levels were 15%, 30%, 45%, 60%, and 75% MVC, and EMG was recorded for 20 seconds at each level. For analysis, each 20-second EMG recording was divided into 0.5-second intervals and the parameters were computed for each interval. In the fatigue analysis over time, the first-order AR coefficient and MPF showed significant differences; the AR coefficient was more sensitive at low %MVC levels and MPF at high levels. In addition, the AR coefficient showed clearer differences at all levels than RMS, which is commonly used to analyze the level of force exerted by a muscle. The first-order AR coefficient was therefore shown to distinguish both the degree of muscle fatigue and the force level more sensitively than the other parameters. These results are expected to be useful for measuring muscle fatigue quantitatively in other fields as well, and more accurate EMG analysis is expected if probabilistic methods that account for individual variability are used.
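Two of the spectral parameters named in this abstract, MPF and MF, can be computed for one EMG segment with a short sketch. It uses their usual definitions: MPF is the power-weighted mean frequency, and MF is the frequency that splits total spectral power in half. The sampling rate `fs`, the function names, and the naive DFT used to keep the example dependency-free are assumptions; the abstract does not describe the authors' spectral estimator.

```python
import cmath
import math


def power_spectrum(segment, fs):
    """One-sided power spectrum of a segment via a naive DFT (DC bin excluded)."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [s - mean for s in segment]  # remove DC offset first
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        acc = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(acc) ** 2)
    return freqs, power


def mean_power_frequency(freqs, power):
    """MPF: power-weighted average frequency (assumes nonzero total power)."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


def median_frequency(freqs, power):
    """MF: lowest frequency at which cumulative power reaches half the total."""
    half = sum(power) / 2.0
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]
```

In an analysis like the one described, these functions would be applied to each 0.5-second interval in turn. A downward drift of MPF or MF over successive intervals is the usual spectral indicator of fatigue.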

A Study on the Improvement of DTW with Speech Silence Detection (음성의 묵음구간 검출을 통한 DTW의 성능개선에 관한 연구)

Kim, Jong-Kuk;Jo, Wang-Rae;Bae, Myung-Jin
- Speech Sciences / v.10 no.4 / pp.117-124 / 2003
Speaker recognition is a technology that confirms a speaker's identity from the characteristics of their speech. It is classified into speaker identification and speaker verification: the former discriminates the speaker from a preregistered group and recognizes the word, while the latter verifies a speaker who claims an identity. This approach, which extracts speaker information from speech and confirms individual identity, has become one of the most efficient technologies as services over the telephone network have become popular. Some problems, however, must be solved for real applications. First, a reliable method is needed to reject impostors, because recognition is not performed only for preregistered customers. Second, speech characteristics change over time, which severely degrades the recognition rate and inconveniences users as the number of required utterances increases. Third, characteristics common among speakers cause incorrect recognition results. Silence segments in the middle of speech also reduce the identification rate. In this paper, we propose improving the identification rate by removing silence segments before running the identification algorithm. Zero crossing rate and signal energy are used to detect the start point and end point of the speech, and the DTW algorithm is then applied. As a result, the proposed method improves the recognition rate by about 3% compared with conventional methods.
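The silence-removal step described in this abstract can be sketched with frame-level energy and zero crossing rate. The abstract names only the two features, so the decision rule is an assumption: the frame length, both thresholds, the rule that a lower-energy frame with a high ZCR counts as unvoiced speech, and all function names are hypothetical. Samples are assumed to be normalized to [-1, 1].

```python
def frame_signal(samples, frame_len):
    """Split samples into non-overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (frame_len >= 2)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def remove_silence(samples, frame_len=160, energy_thresh=1e-3, zcr_thresh=0.25):
    """Return only the samples from frames classified as speech.

    Hypothetical rule: keep a frame if its energy is high (voiced), or if
    its energy is moderate and its ZCR is high (unvoiced consonants).
    """
    kept = []
    for frame in frame_signal(samples, frame_len):
        energy = short_time_energy(frame)
        voiced = energy >= energy_thresh
        unvoiced = (energy >= 0.1 * energy_thresh
                    and zero_crossing_rate(frame) >= zcr_thresh)
        if voiced or unvoiced:
            kept.extend(frame)
    return kept
```

The trimmed sample sequence would then go through feature extraction and DTW matching as usual. Because silent frames in the middle of an utterance are dropped as well, pauses no longer contribute to the alignment cost.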

The Study on Workload Reducing Effects of Multi-Elastic Insoles (다탄성 Insole의 Workload 감소 효과에 관한 연구)

Lee, Chang-Min;Lee, Kyun-Deuk;Oh, Yeon-Ju;Kim, Jin-Hoon
- Journal of the Ergonomics Society of Korea / v.26 no.2 / pp.157-165 / 2007
Work-Related Musculoskeletal Disorders (WMSDs) can be caused by various factors such as repetition, forceful exertion, and awkward postures. In particular, WMSDs of the waist and lower limbs are reported in service and manufacturing workplaces that demand prolonged standing. Static standing postures held for a long time without movement increase the workload on the lower limbs and the waist. Accordingly, anti-fatigue mats or anti-fatigue insoles are used to prevent WMSDs. However, anti-fatigue mats are limited in space and movement. In this study, multi-elastic insoles are designed and their workload-reducing effects during prolonged standing work are shown. Foot pressure and EMG (electromyography) were measured at 0 hours and after 2 hours for 6 healthy students in their twenties. Six prototype insoles were designed with three elasticities (low, medium, and high). These insoles were compared with no insole (insole type 7) as the control. EMG was measured at the waist (erector spinae muscle), thigh (vastus lateralis muscle), and calf (gastrocnemius muscle). Foot pressure was analyzed by mean pressure value, and EMG was analyzed through MF (Median Frequency), MPF (Mean Power Frequency), and ZCR (Zero Crossing Rate). The foot pressure results show that the multi-elastic insoles produced lower foot pressure than no insole; moreover, Insole 2 and Insole 3 had the smallest rate of increase in foot pressure. The EMG results show that the multi-elastic insoles had a smaller EMG shift than no insole after 2 hours, with Insole 2 showing the smallest shift. Therefore, this study shows that multi-elastic insoles reduce the workload of prolonged standing work in terms of both foot pressure and EMG.
https://doi.org/10.5143/JESK.2007.26.2.157

Auto Frame Extraction Method for Video Cartooning System (동영상 카투닝 시스템을 위한 자동 프레임 추출 기법)

Kim, Dae-Jin;Koo, Ddeo-Ol-Ra
- The Journal of the Korea Contents Association / v.11 no.12 / pp.28-39 / 2011
As broadband multimedia technologies have developed, the commercial market for digital content has also spread widely. In particular, the digital cartoon market, such as internet cartoons, has grown rapidly, so video cartooning has been continuously researched because of the shortage of cartoons and the need for variety. Until now, video cartooning systems have focused on non-photorealistic rendering and word balloons. However, meaningful frame extraction must take priority when a cartooning system is applied in a service. In this paper, we propose a new automatic frame extraction method for video cartooning systems. First, we separate the video and audio from a movie and extract feature parameters such as MFCC and ZCR from the audio data. The audio signal is classified as speech, music, or speech+music by comparison with previously trained audio data using a GMM-based classifier, which lets us set the speech regions. For the video, we extract frames using a general scene change detection method such as the histogram method, and then extract meaningful cartoon frames from those by face detection. After that, scene-transition frames that contain a face within a speech region are extracted automatically. Frames suitable for movie cartooning are then extracted automatically from these transition frames over continuous periods in the time domain.
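The histogram-based scene change detection mentioned in this abstract can be illustrated with a minimal sketch. The abstract names the method only in general terms, so the details are assumptions: frames are represented as flat lists of 8-bit grayscale values, the bin count and threshold are illustrative, and the function names are hypothetical.

```python
def gray_histogram(frame, bins=16):
    """Normalized intensity histogram of one frame (flat list of 0-255 values)."""
    hist = [0] * bins
    for pixel in frame:
        hist[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [count / total for count in hist]


def scene_changes(frames, threshold=0.5):
    """Indices of frames whose histogram differs sharply from the previous frame.

    Assumes a non-empty list of frames. The difference is the L1 distance
    between normalized histograms, which lies in the range [0, 2].
    """
    cuts = []
    previous = gray_histogram(frames[0])
    for index in range(1, len(frames)):
        current = gray_histogram(frames[index])
        difference = sum(abs(a - b) for a, b in zip(previous, current))
        if difference > threshold:
            cuts.append(index)
        previous = current
    return cuts
```

In the pipeline the abstract describes, the candidate frames found this way would be filtered further by face detection and by whether they fall inside a speech region.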
https://doi.org/10.5392/JKCA.2011.11.12.028

Search Result 59, Processing Time 0.031 seconds
