• Title/Summary/Keyword: Perceptual evaluation

A study of the prosodic patterns of autism and normal children in the imitating declarative and interrogative sentences (따라말하기 과제를 통한 자폐범주성 장애 아동과 일반 아동의 평서문과 의문문의 음향학적 특성 비교)

  • Lee, Jinhyung;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.12 no.2 / pp.39-49 / 2020
  • The prosody of children with autism spectrum disorders (ASD) has several abnormal features, including monotonous speech. The purpose of this study was to compare acoustic features between an ASD group and a typically developing (TD) group and within the ASD group. The study also examined listeners' perceptions of the lengthening effect of increasing the number of syllables. Fifty participants were divided into two groups (20 with ASD and 30 TD) and were asked to imitate a total of 28 sentences. In the auditory-perceptual evaluation, seven participants identified the sentence type of each of 115 sentences. Pitch, intensity, speech rate, and pitch slope were analyzed for significant differences. The ASD group showed higher pitch and intensity and a lower overall speaking rate than the TD group. Moreover, there were significant differences in the s2 slope of interrogative sentences. Finally, in the auditory-perceptual evaluation, only 4.3% of interrogative sentences produced by participants with ASD were perceived as declarative sentences. The cause of this abnormal prosody has not been clearly identified; however, pragmatic ability and other characteristics of autism are related to ASD prosody. This study identified prosodic patterns in ASD and suggested the need to develop treatments to improve prosody.

A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

  • Jung, Jaehee;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.38-44 / 2022
  • Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance was compared using different loss functions in the time and frequency domains. This study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. In our study, Scale Invariant-Source to Noise Ratio (SI-SNR) is used as the time-domain loss function, and Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. Speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). To confirm the speech enhancement results, the resulting spectrograms are also compared. The experimental results on the TIMIT database show the highest performance when a combination of the SI-SNR and magnitude loss functions is used.
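
The loss combination described in this entry can be illustrated with a minimal pure-Python sketch. It assumes the common projection-based definition of SI-SNR; the function names, the weight `alpha`, and the simple additive combination are illustrative assumptions and are not taken from the paper.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length waveforms."""
    n = len(target)
    # Remove the mean so that a DC offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to isolate the "signal" part.
    dot = sum(a * b for a, b in zip(e, t))
    t_energy = sum(b * b for b in t) + eps
    s_target = [dot / t_energy * b for b in t]
    # Whatever is left after the projection is treated as noise.
    e_noise = [a - s for a, s in zip(e, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(x * x for x in e_noise) + eps
    return 10.0 * math.log10(num / den + eps)


def combined_loss(estimate, target, est_mag, tgt_mag, alpha=0.5):
    """Negative SI-SNR (time domain) plus weighted magnitude MSE (frequency domain).

    ASSUMPTION: `alpha` and the additive form are hypothetical; the abstract
    does not state how the two losses are weighted.
    """
    mse = sum((a - b) ** 2 for a, b in zip(est_mag, tgt_mag)) / len(tgt_mag)
    return -si_snr(estimate, target) + alpha * mse
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is the property that makes SI-SNR attractive as a waveform-level loss.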

A study on deep neural speech enhancement in drone noise environment (드론 소음 환경에서 심층 신경망 기반 음성 향상 기법 적용에 관한 연구)

  • Kim, Jimin;Jung, Jaehee;Yeo, Chaneun;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.342-350 / 2022
  • In this paper, actual drone noise samples are collected for speech processing in disaster environments to build a noise-corrupted speech database, and speech enhancement performance is evaluated by applying spectral subtraction and mask-based speech enhancement techniques. To improve the performance of VoiceFilter (VF), an existing deep neural network-based speech enhancement model, we apply the Self-Attention operation and use the estimated noise information as input to the Attention model. Compared to the existing VF model, the experimental results show improvements of 3.77%, 1.66%, and 0.32% in Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), respectively. When the model is trained with a 75% mix of speech data with drone sounds collected from the Internet, the relative performance drops for SDR, PESQ, and STOI are 3.18%, 2.79%, and 0.96%, respectively, compared to using only actual drone noise. This confirms that data similar to real data can be collected and used effectively to train speech enhancement models for environments where real data is difficult to obtain.

A Novel Approach to a Robust A Priori SNR Estimator in Speech Enhancement (음성 향상에서 강인한 새로운 선행 SNR 추정 기법에 관한 연구)

  • Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.25 no.8 / pp.383-388 / 2006
  • This paper presents a novel approach to single-channel microphone speech enhancement in noisy environments. Widely used noise reduction techniques based on spectral subtraction are generally expressed as a spectral gain that depends on the signal-to-noise ratio (SNR). The well-known decision-directed (DD) estimator of Ephraim and Malah efficiently reduces musical noise under background noise conditions, but it delays the a priori SNR estimate because the DD weights the speech spectrum component of the previous frame. As a result, the noise suppression gain, which depends on the delayed a priori SNR estimated by the DD, matches the previous frame rather than the current one, and this degrades noise reduction performance during speech transient periods. We propose a computationally simple but effective speech enhancement technique based on a sigmoid-type function for the weight parameter of the DD. The proposed approach solves the delay problem in the main parameter, the a priori SNR of the DD, while maintaining the benefits of the DD. The performance of the proposed enhancement algorithm is evaluated using ITU-T P.862 Perceptual Evaluation of Speech Quality (PESQ), the Mean Opinion Score (MOS), and speech spectrograms under various noise environments, and it yields better results than the fixed weight parameter of the DD.
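
The decision-directed idea in this entry can be sketched per frequency bin as follows. The first function is the classic fixed-weight DD update; the sigmoid mapping for the weight is purely hypothetical, since the abstract does not give the paper's actual function or parameters, and the Wiener gain is an assumed choice of spectral gain.

```python
import math


def dd_a_priori_snr(prev_clean_power, noise_power, noisy_power, alpha=0.98):
    """Classic decision-directed a priori SNR estimate for one frequency bin."""
    gamma = noisy_power / noise_power      # a posteriori SNR of the current frame
    ml_term = max(gamma - 1.0, 0.0)        # instantaneous (maximum-likelihood) estimate
    # Weighted sum of the previous frame's estimate and the current ML estimate.
    return alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term


def sigmoid_alpha(gamma, gamma_prev, a_min=0.5, a_max=0.98, slope=1.0, offset=3.0):
    """HYPOTHETICAL sigmoid-type weight, not the paper's formula.

    The weight stays near a_max when the a posteriori SNR is steady and falls
    toward a_min on abrupt changes, so transients rely less on the previous frame.
    """
    change = abs(gamma - gamma_prev)
    exponent = min(slope * (change - offset), 50.0)   # clamp to avoid overflow
    return a_min + (a_max - a_min) / (1.0 + math.exp(exponent))


def wiener_gain(xi):
    """Spectral gain as a function of the a priori SNR (assumed Wiener form)."""
    return xi / (1.0 + xi)
```

With a fixed `alpha` close to 1, the estimate at a speech onset is dominated by the previous (noise-only) frame, which is the delay the abstract describes; lowering the weight during rapid SNR changes shifts the estimate toward the current frame.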

The Effect of Voice Therapy in Unilateral Vocal Fold Paralysis (일측성 성대마비 환자의 음성치료 효과)

  • Lee, Chang-Yoon;An, Soo-Youn;Chang, Hyun;Son, Hee Young
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.27 no.1 / pp.45-50 / 2016
  • Background and Objectives: This study aimed to provide voice therapy to patients with unilateral vocal fold paralysis in order to improve voice and recover motility, and to analyze the results. Materials and Methods: Voice therapy was provided to 13 patients who had responded to voice therapy, out of 98 patients diagnosed with unilateral vocal fold paralysis. To enable a before-and-after comparison, perceptual, acoustic, and aerodynamic evaluations were conducted after voice therapy. In addition, a dysphagia checklist was used to verify whether each patient had dysphagia prior to voice therapy. The therapy aimed to improve laryngeal movement and glottal contact while relieving excessive supraglottic tension. Results: All 13 patients who underwent voice therapy showed statistically significant improvements on four scales of the auditory-perceptual evaluation, all except the S scale (p<0.05), with enhanced glottal contact. In the acoustic evaluation, jitter, shimmer, and NHR showed significant improvement after voice therapy. In the aerodynamic evaluation, MPT also improved notably (p<0.001). All 11 patients who had dysphagia prior to voice therapy reported improved swallowing function. Conclusion: Adequate voice therapy for patients with unilateral vocal fold paralysis is an effective method that may be employed in the initial phase. In particular, the voice therapy proposed in this study is expected to be useful for patients in a state of excessive tension due to secondary compensation after the initial paralysis, since it focuses on improving vocal symptoms in a calm state with the supraglottis sufficiently relaxed. The therapy is also expected to be effective for improving swallowing function.

Evaluation Strategy of Consumer Perception According to the Game Genre Positioning (게임장르별 포지셔닝에 대한 소비자 지각도 평가 전략)

  • Lee, Ji-Hun
    • Journal of Korea Game Society / v.5 no.3 / pp.31-38 / 2005
  • Consumer perception evaluation by game genre affects many parts of corporate management, including market share, acquiring new consumers, retaining consumers, and competition. If consumers perceive a company and its product negatively, acquiring new consumers becomes impossible, and an enormous amount of time must be spent recovering from that bad image. However, game companies tend to rely on ad hoc, aggressively pushed marketing strategies. The result is short-term success at the expense of long-term marketing opportunity. To increase sales and market share, consumer perception evaluation is necessary in addition to evaluation of the game product and the corporate image. This article emphasizes general game analysis and strategy formulation across game genres rather than a particular corporation or product. Analysis of a particular product, company, platform, and nation is necessary and will follow in subsequent work.

Human Thermal Sensation and Comfort of Beach Areas in Summer - Woljeong-ri Beach, Gujwa-eup, Jeju-si, Jeju Special Self-Governing Province - (여름철 해변지역의 인간 열환경지수 및 열쾌적성 - 제주특별자치도 제주시 구좌읍 월정리 해변 -)

  • Park, Sookuk;Sin, Jihwan;Jo, Sangman;Hyun, Cheolji;Kang, Hoon
    • Journal of the Korean Institute of Landscape Architecture / v.44 no.4 / pp.100-108 / 2016
  • The climatic index for tourism (CIT) has recently been advanced to include complete human energy balance models such as physiological equivalent temperature (PET) and the universal thermal climate index (UTCI). This study investigated human thermal sensation and comfort at Woljeong-ri Beach, Jeju, Republic of Korea, in spring and summer 2015 for landscape planning and design in beach areas. Microclimatic data measurements and human thermal sensation/comfort surveys based on ISO 10551 were conducted together. A total of 869 adults participated. The perceptual and thermal preference scales, which consider only physiological aspects, had high coefficients of determination ($r^2$) with PET in linear regression analyses: 92.8% and 87.6%, respectively. In contrast, affective evaluation, personal acceptability, and personal tolerance, which consider both physiological and psychological aspects, had low $r^2$ values: 60.0%, 21.1%, and 46.4%, respectively. Nevertheless, the correlations between these scales and PET were all significant at the 0.01 level. The neutral PET range on the perceptual scale for human thermal sensation was $25{\sim}27^{\circ}C$, but a PET range with 20% or less dissatisfaction, as recommended by ASHRAE Standard 55, could not be achieved on the perceptual scale. Only the PET ranges for affective evaluation and personal tolerance, which are affected by both aspects, met the recommendation, at $21{\sim}32^{\circ}C$ and $17{\sim}37^{\circ}C$, respectively. Therefore, the PET range of $21{\sim}32^{\circ}C$ is recommended as the human thermal comfort zone of beach areas in landscape planning and design as well as in tourism and recreational planning. PET heat stress level ranges on the beach were $2{\sim}5^{\circ}C$ higher than those in inland urban areas of the Republic of Korea. They were also similar to the high values reported for tropical areas such as Taiwan and Nigeria, and higher than those of western and middle Europe and Tel Aviv, Israel.

Constructing Strategic Management Plan for University Foodservice Using Conjoint Analysis and Multidimensional Scaling (컨조인트 분석과 다차원척도법을 이용한 대학급식소의 전략적 운영 방안 모색)

  • Yang, Il-Sun;Shin, Seo-Young;Lee, Hae-Young;Lee, So-Jung;Chae, In-Sook
    • Journal of the Korean Society of Food Culture / v.15 no.1 / pp.51-58 / 2000
  • This study was designed to 1) understand customers' choice behavior and preferences regarding campus foodservices and 2) provide recommendations on management strategies for university foodservice managers. Individual interviews and focus group interviews were used to identify important selection attributes. A questionnaire was developed and distributed to 480 Yonsei University students, and statistical analysis was completed using SPSS WIN/7.5 for descriptive analysis, multidimensional scaling, and conjoint analysis. The results are summarized as follows. Students evaluated the four foodservices in different ways, and strengths and weaknesses could be identified from the evaluation patterns. Most students (51.1%) frequently used foodservice 'A' even though they preferred other foodservices, and cost was the main cause of this difference. The perceptual map from multidimensional scaling showed that preference and patronage were located near different attributes. Conjoint analysis showed that cost was the attribute with the highest relative importance in selecting a campus foodservice. Therefore, the relative importance of attributes should be considered in customer preference surveys when constructing a management plan.

Two-Microphone Binary Mask Speech Enhancement in Diffuse and Directional Noise Fields

  • Abdipour, Roohollah;Akbari, Ahmad;Rahmani, Mohsen
    • ETRI Journal / v.36 no.5 / pp.772-782 / 2014
  • Two-microphone binary mask speech enhancement (2mBMSE) has been of particular interest in recent literature and has shown promising results. Current 2mBMSE systems rely on spatial cues of speech and noise sources. Although these cues are helpful for directional noise sources, they lose their efficiency in diffuse noise fields. We propose a new system that is effective in both directional and diffuse noise conditions. The system exploits two features. The first determines whether a given time-frequency (T-F) unit of the input spectrum is dominated by a diffuse or directional source. A diffuse signal is certainly a noise signal, but a directional signal could correspond to a noise or speech source. The second feature discriminates between T-F units dominated by speech or directional noise signals. Speech enhancement is performed using a binary mask, calculated based on the proposed features. In both directional and diffuse noise fields, the proposed system segregates speech T-F units with hit rates above 85%. It outperforms previous solutions in terms of signal-to-noise ratio and perceptual evaluation of speech quality improvement, especially in diffuse noise conditions.
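
The two-stage decision described in this entry can be sketched as below. The abstract does not define the two features, so `diffuseness` and `speech_score` are hypothetical per-unit values in [0, 1], and both thresholds are placeholders; the hit rate is computed here as the fraction of speech-dominated units that the mask retains.

```python
def binary_mask(diffuseness, speech_score, diffuse_thr=0.5, speech_thr=0.5):
    """Two-stage decision per T-F unit: 1 keeps the unit, 0 discards it.

    ASSUMPTION: both feature lists and both thresholds are hypothetical
    stand-ins for the features proposed in the paper.
    """
    mask = []
    for d, s in zip(diffuseness, speech_score):
        if d > diffuse_thr:
            # Stage 1: a diffuse-dominated unit is treated as noise.
            mask.append(0)
        else:
            # Stage 2: a directional unit is kept only if it looks like speech.
            mask.append(1 if s > speech_thr else 0)
    return mask


def hit_rate(mask, ideal_mask):
    """Fraction of speech-dominated units (ideal mask = 1) kept by the mask."""
    kept = [m for m, ideal in zip(mask, ideal_mask) if ideal == 1]
    return sum(kept) / len(kept) if kept else 0.0
```

For example, `binary_mask([0.9, 0.1, 0.2], [0.9, 0.8, 0.1])` discards the first unit as diffuse, keeps the second as directional speech, and discards the third as directional noise.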

A Study on the Effect of Pre-cue in Simple Reactions on Control-on-Display Interfaces

  • Lim, Ji-Hyoun;Choi, Jun-Young;Kim, Young-Su
    • Journal of the Ergonomics Society of Korea / v.30 no.4 / pp.563-569 / 2011
  • Objective: This study focuses on the effects of pre-cues that indicate the location of an upcoming visual stimulus on finger movement response in the context of control-on-display interfaces. Background: Previous research on pre-cues focused on attention allocation, and motion studies were limited to indirect control conditions. This study was designed to collect data on the exact landing point of finger-tap responses to a given visual stimulus. Method: Controlled visual stimuli and tasks were presented on a UI evaluation system built using mobile web standards; response accuracy and response time were measured and collected as appropriate. Among the 16 recruited participants, 11 completed the experiment. Results: Providing a pre-cue on the stimulus location affected response time and response accuracy. The response bias, which is the distance from the center of the stimulus to the finger-tap location, was larger when the pre-cue was given during one-handed operation. Conclusion: Given a pre-cue, response time decreases, but at the cost of accuracy. Application: In designing touch-screen UIs (more strictly, visual components that also act as controllers), designers would do well to balance human perceptual and cognitive characteristics strategically.