Search | Korea Science

Speaker Recognition Performance Improvement by Voiced/Unvoiced Classification and Heterogeneous Feature Combination (유/무성음 구분 및 이종적 특징 파라미터 결합을 이용한 화자인식 성능 개선)

Kang, Jihoon;Jeong, Sangbae
- Journal of the Korea Institute of Information and Communication Engineering, v.18 no.6, pp.1294-1301, 2014
In this paper, separate probabilistic distribution models for voiced and unvoiced speech are estimated and used to improve speaker recognition performance. In addition to the conventional mel-frequency cepstral coefficients, skewness, kurtosis, and harmonic-to-noise ratio are extracted and used for voiced speech intervals. The two scores, for voiced and unvoiced speech, are linearly fused with the optimal weight found by exhaustive search. The performance of the proposed speaker recognizer is compared with that of a conventional recognizer that uses mel-frequency cepstral coefficients and a unified probabilistic distribution function based on the Gaussian mixture model. Experimental results show that the fewer the Gaussian mixture components, the greater the performance improvement from the proposed algorithm.
https://doi.org/10.6109/jkiice.2014.18.6.1294
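The linear score fusion with an exhaustively searched weight mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the search grid, the selection criterion, or the score format, so the 0.01 grid step, the identification-accuracy criterion, the function names, and the `TRIALS` data below are all assumptions made for illustration, not the authors' implementation.

```python
def fuse(voiced, unvoiced, w):
    """Linearly combine two per-speaker score lists, with weight w on the voiced score."""
    return [w * v + (1.0 - w) * u for v, u in zip(voiced, unvoiced)]


def accuracy(trials, w):
    """Fraction of trials whose highest fused score belongs to the true speaker."""
    correct = 0
    for voiced, unvoiced, true_idx in trials:
        fused = fuse(voiced, unvoiced, w)
        predicted = max(range(len(fused)), key=fused.__getitem__)
        if predicted == true_idx:
            correct += 1
    return correct / len(trials)


def search_weight(trials, steps=100):
    """Try w = 0, 1/steps, ..., 1 and return the first (w, accuracy) with the best accuracy."""
    best_w, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        acc = accuracy(trials, w)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Hypothetical development trials:
# (voiced scores per speaker, unvoiced scores per speaker, index of the true speaker)
TRIALS = [
    ([-10.0, -12.0], [-20.0, -15.0], 0),
    ([-11.0, -10.5], [-14.0, -18.0], 0),
    ([-13.0, -9.0], [-16.0, -17.0], 1),
]

best_w, best_acc = search_weight(TRIALS)
```

In this toy data neither score alone identifies every speaker, but an intermediate weight does, which is the situation in which fusion pays off. In practice the weight would be chosen on held-out development data rather than on the test set.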

Convolutional Neural Network and Data Mutation for Time Series Pattern Recognition (컨벌루션 신경망과 변종데이터를 이용한 시계열 패턴 인식)

Ahn, Myong-ho;Ryoo, Mi-hyeon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2016.05a, pp.727-730, 2016
Time series classification (TSC) means classifying time series data based on their patterns. Time series data is a common data type with high potential in many fields, so it has long received attention in data mining and machine learning. Among traditional approaches, distance-based and dictionary-based methods are popular, but they have clear limitations due to time-scale and random-noise problems. In this paper, we propose a novel approach that addresses these problems with a convolutional neural network (CNN) and data mutation. The CNN is regarded as a proven neural network model in image recognition and can be applied to time series pattern recognition by extracting patterns. Data mutation generates mutated data by different methods to make the CNN more robust. The proposed method performs better than the traditional approaches.

Classification of Infant Crying Audio based on 3D Feature-Vector through Audio Data Augmentation

JeongHyeon Park;JunHyeok Go;SiUng Kim;Nammee Moon
- Journal of the Korea Society of Computer and Information, v.28 no.9, pp.47-54, 2023
Infants use crying as a non-verbal means of communication [1]. However, deciphering infant cries is challenging, and extensive research has been conducted on interpreting infant cry audio [2,3]. This paper proposes classifying infant cries using 3D feature vectors combined with several audio data augmentation techniques. The study dataset contains 5 classes (belly pain, burping, discomfort, hungry, tired). The data is augmented using 5 techniques (Pitch, Tempo, Shift, Mixup-noise, CutMix), of which Tempo, Shift, and CutMix improved performance. Ultimately, applying the effective augmentation techniques together yielded a 17.75% performance improvement over models using single feature vectors and the original data.
https://doi.org/10.9708/jksci.2023.28.09.047
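Two of the augmentation names in this abstract, Shift and Mixup-noise, can be illustrated on raw sample lists. The abstract does not define either technique, so the circular shift, the blending formula, and the function names `time_shift` and `mix_with_noise` are assumptions chosen for a minimal sketch and may differ from what the paper does.

```python
def time_shift(samples, offset):
    """Circularly shift a list of samples by `offset` positions (positive = delay)."""
    if not samples:
        return []
    k = offset % len(samples)
    if k == 0:
        return list(samples)
    return samples[-k:] + samples[:-k]


def mix_with_noise(samples, noise, alpha):
    """Blend signal and noise sample by sample: alpha * signal + (1 - alpha) * noise."""
    return [alpha * s + (1.0 - alpha) * n for s, n in zip(samples, noise)]


# Hypothetical four-sample clip, delayed by one sample and then blended with silence.
shifted = time_shift([1.0, 2.0, 3.0, 4.0], 1)
mixed = mix_with_noise(shifted, [0.0, 0.0, 0.0, 0.0], 0.5)
```

Both functions return new lists and leave the input untouched, so each original clip can yield several augmented copies.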

Tonality Design for Sound Quality Evaluation in Printer (프린터 음질평가를 위한 순음도 설계)

Kim, Eui-Youl;Lee, Young-Jun;Lee, Sang-Kwon
- Transactions of the Korean Society for Noise and Vibration Engineering, v.22 no.4, pp.318-327, 2012
The operating sound radiated from a laser printer includes tonal noise components caused by rotating mechanical parts such as gears, shafts, motors, and fans. The negative effects of these tonal components need to be considered when developing a sound quality index for the quantitative, psychoacoustic evaluation of emotional satisfaction. However, a previous paper confirmed that the Aures tonality did not correlate well enough with the results of jury evaluation. A sound quality index based on loudness, articulation index, and fluctuation strength does not adequately account for the effects of rotating mechanical parts on sound quality. In this paper, to solve the tonality evaluation problem, the calculation algorithm of Aures tonality was investigated in detail to find the cause of the reduced correlation. A new tonality evaluation model was proposed by modifying and optimizing the masking effect, loudness ratio, and shape of the weighting curve of the basic Aures tonality algorithm, and it was applied to two groups of operating sounds to verify its usefulness. As a result, it is confirmed that the proposed tonality evaluation model correlates well enough and is useful for expressing the tonalness of the operating sounds of laser printers. In a follow-up paper, these results will be used as input data for modeling the sound quality index with a classification algorithm.
https://doi.org/10.5050/KSNVE.2012.22.4.318

Noise-Robust Porcine Respiratory Diseases Classification Using Texture Analysis and CNN (질감 분석과 CNN을 이용한 잡음에 강인한 돼지 호흡기 질병 식별)

Choi, Yongju;Lee, Jonguk;Park, Daihee;Chung, Yongwha
- KIPS Transactions on Software and Data Engineering, v.7 no.3, pp.91-98, 2018
Automatic detection of pig wasting diseases is an important issue in the management of group-housed pigs. In particular, porcine respiratory diseases are one of the main causes of mortality among pigs and of productivity loss in intensive pig farming. In this paper, we propose a noise-robust system for the early detection and recognition of pig wasting diseases using sound data. In this method, we first convert one-dimensional sound signals to two-dimensional gray-level images by normalization and extract texture images using the dominant neighborhood structure technique. The texture features are then used as inputs to convolutional neural networks that serve as an early anomaly detector and a respiratory disease classifier. Our experimental results show that this new method can detect pig wasting diseases both economically (low-cost sound sensor) and accurately (over 96% accuracy), even under noisy environmental conditions, either as a standalone solution or as a complement to known methods to obtain a more accurate solution.
https://doi.org/10.3745/KTSDE.2018.7.3.91
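The first step in this abstract, turning a one-dimensional sound signal into a two-dimensional gray-level image by normalization, might look like the sketch below. The abstract does not specify the normalization or the image layout, so min-max scaling to 0..255, row-wise folding at a fixed `width`, and dropping leftover samples are assumptions made for illustration.

```python
def signal_to_gray_image(signal, width):
    """Min-max normalize a 1-D signal to 0..255 and fold it into rows of `width` pixels.

    Samples that do not fill a complete final row are dropped.
    """
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant signal
    pixels = [round(255 * (x - lo) / span) for x in signal]
    usable = len(pixels) - len(pixels) % width
    return [pixels[i:i + width] for i in range(0, usable, width)]


# Hypothetical five-sample signal folded into a 2x2 image (the fifth sample is dropped).
image = signal_to_gray_image([0.0, 2.0, 10.0, 4.0, 7.0], 2)
```

The resulting list of rows can be handed to any texture-analysis step that expects an 8-bit gray-level image.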

Comparison of the Effect of Music and Noise Blocking on Postoperative Pain, Length of Stay at Post Anesthetic Care Unit and Satisfaction after a Laparoscopic Colectomy (음악요법과 소음차단요법이 수술 후 통증, 진통제 투여량, 회복실 체류시간 및 만족도에 미치는 효과 비교)

Seo, Eunju;Yoon, Haesang
- Journal of Korean Biological Nursing Science, v.17 no.4, pp.315-323, 2015
Purpose: This study compared the effects of music and noise blocking on vital signs, postoperative pain, analgesic use, length of stay in the Post Anesthesia Care Unit (PACU), and satisfaction after a laparoscopic colectomy. Methods: This randomized controlled trial was performed in a 555-bed National Cancer Center from February 13 through May 31, 2012. Subjects were 69 patients who underwent a laparoscopic colectomy under general anesthesia and were recruited through notices. The inclusion criteria were an age of 35-75 and an American Society of Anesthesiologists physical status classification of I or II. The subjects were randomly allocated to three groups: a music therapy group (MTG), a noise blocking group (NBG), and a control group (CG). Collected data were analyzed using repeated measures ANOVA, one-way ANOVA, and the Kruskal-Wallis test in IBM SPSS (Version 19.0). Results: There were no significant differences in vital signs among the three groups. Postoperative pain in MTG (p<.05) and NBG (p<.05) was significantly lower than in CG. The amount of analgesics (p=.030) and length of stay in the PACU (p=.021) were significantly lower in MTG than in NBG or CG; satisfaction in MTG and NBG was significantly higher than in CG. Conclusion: Music seems to reduce postoperative pain, the amount of analgesics, and the length of stay in the PACU. Therefore, music therapy should be considered for inclusion in nursing interventions for postoperative patients in the PACU.
https://doi.org/10.7586/jkbns.2015.17.4.315

Low-dose CT Image Denoising Using Classification Densely Connected Residual Network

Ming, Jun;Yi, Benshun;Zhang, Yungang;Li, Huixin
- KSII Transactions on Internet and Information Systems (TIIS), v.14 no.6, pp.2480-2496, 2020
Considering that high-dose X-ray radiation during CT scans may pose risks to patients, the medical imaging industry has placed increasing emphasis on low-dose CT. Because the noise in low-dose CT images has complex statistical characteristics, many traditional methods struggle to preserve structural details while suppressing noise and artifacts. Inspired by deep learning techniques, we propose a densely connected residual network (DCRN) for low-dose CT image noise cancellation, which combines dense connection with residual learning. On the one hand, dense connection maximizes information flow between layers in the network, which helps maintain structural details when denoising images. On the other hand, residual learning paired with batch normalization allows for faster training and better noise reduction performance. The experiments are performed on 100 CT images selected from a public medical dataset, TCIA (The Cancer Imaging Archive). Compared with three other competitive denoising algorithms, both the subjective visual effect and the objective evaluation indexes, which include PSNR, RMSE, MAE, and SSIM, show that the proposed network improves LDCT image quality more effectively while maintaining a low computational cost. On the objective evaluation indexes, the proposed method achieves the best results: PSNR 33.67, RMSE 5.659, MAE 1.965, and SSIM 0.9434. For RMSE in particular, the proposed network improves on the best-performing comparison algorithm by 7 percentage points.
https://doi.org/10.3837/tiis.2020.06.009
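Three of the objective indexes named in this abstract have standard definitions that are easy to state in code. The sketch below computes MAE, RMSE, and PSNR for images given as flat lists of pixel values; the flat-list representation, the function names, and the default peak value of 255 are assumptions (CT data often uses a different dynamic range, and the abstract does not say which peak value the authors used). SSIM is omitted because it requires windowed local statistics.

```python
import math


def mae(reference, test):
    """Mean absolute error between two equally sized pixel lists."""
    return sum(abs(r - t) for r, t in zip(reference, test)) / len(reference)


def rmse(reference, test):
    """Root mean squared error between two equally sized pixel lists."""
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference))


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = rmse(reference, test)
    if err == 0:
        return math.inf
    return 20.0 * math.log10(max_value / err)


# Hypothetical 4-pixel reference image and a denoised estimate with one wrong pixel.
REFERENCE = [0.0, 0.0, 0.0, 0.0]
ESTIMATE = [0.0, 0.0, 0.0, 10.0]
```

Lower is better for MAE and RMSE, while higher is better for PSNR, which is why a single denoiser can report the "best" value on all three without the numbers moving in the same direction.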

A Study on Indoor Environment Performances of Power Yacht in Summer Season (여름철 파워요트 실내환경 성능에 관한 연구)

Lee, Han-Seok;Doe, Guen-Young;Lim, Duck-Min;Kim, Hak-Chul
- Journal of Navigation and Port Research, v.33 no.3, pp.175-180, 2009
In this study, basic data were collected to improve the comfort of the indoor environment of a super yacht, and the indoor environment performance was analyzed using data measured during the summer period. The following conclusions were drawn. 1) When the door of the saloon that connects to the outside is closed, there is estimated to be little inflow of exhaust gas, but when the door is open, the indoor air may be polluted quickly. Therefore, a plan for the ventilation method and ventilation rate is needed to keep the indoor air environment comfortable. 2) Countermeasures against engine noise are urgently needed because, during sailing, the noise level in all rooms exceeds 60 dB, the noise limit for crew protection set by the ship classification rules. 3) The state cabin and guest cabin are overcooled beyond the comfort range by the air conditioner, and this needs to be prevented.
https://doi.org/10.5394/KINPR.2009.33.3.175

Types of Hazardous Factors and Time-trend of Exposure Levels from the Working Environment at a Shock Absorber Manufacturing Facility (자동차 쇼크업소바 제조사업장의 작업자 노출 유해인자의 종류 및 노출수준의 경시적 변화)

Na, Gyu-Chae;Moon, Chan-Seok
- Journal of Korean Society of Occupational and Environmental Hygiene, v.28 no.4, pp.393-405, 2018
Objective: This study examines the types of hazardous factors in the working environment and the time trend of their exposure levels over 10 years (2007 to 2016). Study Design and Method: The types of hazardous factors and their exposure levels were drawn from 19 working environment measurement reports covering 10 years at a shock absorber manufacturing facility. The risk associated with each type of factor and the time trend of exposure levels were evaluated from these data. Results: A total of 34 hazardous factors were evaluated: noise, 15 organic compounds, seven kinds of acids and alkalis, eight kinds of heavy metals, and three other compounds. The special management materials used were nickel, hexavalent chrome, and sulfuric acid. The human carcinogens (1A) used were trichloroethylene, nickel, and sulfuric acid. Six substances belonged to IARC Group 2B (possibly carcinogenic to humans) or higher, including methyl isobutyl ketone, ethyl benzene, and trichloroethylene. Of the 2065 total measurements in the 19 exposure survey reports, 627 (30.4%) were non-detects. Organic solvents, acid and alkali products, and heavy metals showed continuously low exposure concentrations. Noise, welding fumes, and the mixed-solvent evaluation showed a gradual decrease in geometric mean and maximum over the 10-year period. Conclusions: At the shock absorber manufacturing facility, noise and the mixed-solvent evaluation still show high levels exceeding the exposure limits and call for reduction studies. These two factors and welding fumes showed a continuous decrease over the ten years. Organic compounds, acids/alkalis, and heavy metals were well managed, with continuously low concentrations in the work environment.
https://doi.org/10.15269/JKSOEH.2018.28.4.393

Switching Filter Algorithm using Fuzzy Weights based on Gaussian Distribution in AWGN Environment (AWGN 환경에서 가우시안 분포 기반의 퍼지 가중치를 사용한 스위칭 필터 알고리즘)

Cheon, Bong-Won;Kim, Nam-Ho
- Journal of the Korea Institute of Information and Communication Engineering, v.26 no.2, pp.207-213, 2022
Recently, as IoT technology and AI have improved, automation and unmanned operation have advanced in a wide range of fields, and interest is growing in image processing, which underlies automation tasks such as object recognition and object classification. Image noise removal is an important preprocessing step in an image processing system, and it has been studied extensively. However, in most cases it is difficult to preserve detail because of the smoothing effect on high-frequency components such as edges. In this paper, we propose an algorithm that restores images damaged by AWGN (additive white Gaussian noise) using fuzzy weights based on the Gaussian distribution. The proposed algorithm switches the filtering process by comparing the filtering mask with the noise estimate, and it reconstructs the image by calculating the fuzzy weights according to the low-frequency and high-frequency components of the image.
https://doi.org/10.6109/jkiice.2022.26.2.207
