• Title/Summary/Keyword: computer science


A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

  • Jung, Jaehee;Kim, Wooil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.1
    • /
    • pp.38-44
    • /
    • 2022
  • Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance is compared using different loss functions in the time and frequency domains. The study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. The Scale-Invariant Source-to-Noise Ratio (SI-SNR) is used as the time-domain loss function, and Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. Speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). To confirm the enhancement results, the resulting spectrograms are also compared. Experimental results on the TIMIT database show the highest performance when a combination of the SI-SNR and magnitude loss functions is used.
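
A minimal sketch of how a time-domain SI-SNR term and a frequency-domain magnitude MSE term might be combined into one loss. The function names, the weight `alpha`, and the use of a plain sum are assumptions for illustration; the abstract does not state how the authors weight or implement the terms.

```python
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D waveforms."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target; the remainder is treated as noise.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return 10.0 * np.log10(ratio)

def magnitude_mse(est_spec, ref_spec):
    """MSE between the magnitudes of two complex spectra."""
    return float(np.mean((np.abs(est_spec) - np.abs(ref_spec)) ** 2))

def combined_loss(estimate, target, est_spec, ref_spec, alpha=1.0):
    # alpha is a hypothetical weight; higher SI-SNR is better, so it is negated.
    return -si_snr(estimate, target) + alpha * magnitude_mse(est_spec, ref_spec)
```

Because SI-SNR ignores the overall gain of the estimate, the magnitude term is what penalizes level mismatches in this sketch.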

Design and Implementation of a Systemic Learner-centered Teaching Method Model - Focusing on H University - (체계적인 학습자 중심의 교수법 모델 개발 및 구현 - H 대학을 중심으로 -)

  • Kim, Sun-Hee;Cho, Young-Sik;Kim, Bo-Young;Han, Yong-Su
    • Journal of Korea Entertainment Industry Association
    • /
    • v.15 no.5
    • /
    • pp.163-173
    • /
    • 2021
  • This study developed and implemented a class model that applies learner-centered teaching methods to class operation across an entire university rather than only at the individual level. The final instructional model was derived through analysis of prior research, expert review, derivation of the instructional model and design principles, pilot operation, primary questionnaire analysis, revision of the model and design strategy, and secondary questionnaire analysis. The Shift_N+1 class consists of six models, each divided into three parts: preliminary learning using video, a face-to-face class for question-and-answer and in-depth learning of the core content, and feedback and process evaluation for individual students. We built our own computer system so that this could be implemented every week. The result is a standardized teaching method model that can apply a learner-centered curriculum to all classes at the university. The Shift_N+1 teaching method seeks to maximize the learner-centered learning effect by reflecting the characteristics of each subject, and to improve the quality of education by identifying students' achievements week by week.

Integrated receptive field diversification method for improving speaker verification performance for variable-length utterances (가변 길이 입력 발성에서의 화자 인증 성능 향상을 위한 통합된 수용 영역 다양화 기법)

  • Shin, Hyun-seo;Kim, Ju-ho;Heo, Jungwoo;Shim, Hye-jin;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.3
    • /
    • pp.319-325
    • /
    • 2022
  • Variation in utterance length is a representative factor that can degrade the performance of speaker verification systems. To handle this issue, previous studies have attempted either to extract speaker features from various branches or to use convolution layers with different receptive fields. Combining the advantages of these two approaches for variable-length input, this paper proposes integrated receptive field diversification, which extracts speaker features through more diverse receptive fields. The proposed method processes the input features with convolutional layers that have different receptive fields at multiple time-axis branches, and extracts the speaker embedding by dynamically aggregating the processed features according to the length of the input utterance. The deep neural networks in this study were trained on the VoxCeleb2 dataset and tested on the VoxCeleb1 evaluation dataset, which was divided into 1 s, 2 s, 5 s, and full-length utterances. Experimental results demonstrate that the proposed method reduces the equal error rate by 19.7 % compared to the baseline.
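
For readers unfamiliar with the reported metric, the following is a rough sketch of how an equal error rate (EER) can be estimated from verification scores. It is a generic threshold sweep, not the authors' evaluation code; the function name and the convention that higher scores mean "same speaker" are assumptions.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Estimate the EER from trial scores and labels (True = target trial)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    # Sweep every observed score as a candidate decision threshold.
    for threshold in np.unique(scores):
        accept = scores >= threshold
        far = np.mean(accept[~labels])   # impostor trials wrongly accepted
        frr = np.mean(~accept[labels])   # target trials wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```

The EER is the operating point at which the false acceptance and false rejection rates coincide, so a lower value indicates better separation of target and impostor trials.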

A Cooperative Security Gateway cooperating with 5G+ network for next generation mBcN (차세대 mBcN을 위한 5G+ 연동보안게이트웨이)

  • Nam, Gu-Min;Kim, Hyoungshick;Lee, Hyun-Jin;Cho, Hark-Su
    • Journal of Internet Computing and Services
    • /
    • v.22 no.6
    • /
    • pp.129-140
    • /
    • 2021
  • The next-generation mBcN should be built to cooperate with the wireless network in order to support hyper-speed and hyper-connectivity. In this paper, we propose a network architecture for cooperation between the mBcN and the 5G commercial network, together with the architecture of the cooperative security gateway that this cooperation requires. The proposed cooperative security gateway sits between the gNB and the UPF to support LBO, SFC, and security. Our analysis shows that the proposed architecture has several advantages. First, user equipment connected with the mBcN can easily reach the mBcN through the 5G commercial radio network. Second, military application traffic can be transmitted to the mBcN without going through the 5G core network, which reduces the end-to-end transmission delay and avoids adding traffic load to the 5G core network. In addition, the security level of military applications can be maintained effectively, because the user equipment connects to the cooperative security gateway and the traffic it generates is transmitted to the mBcN without going through the 5G core network. Finally, we demonstrate in a 5G test-bed environment that the LBO, SFC, and security modules are essential functions of the proposed gateway.

Comparison of Cognitive Response Time according to Ageing and Cognitive Ability (노화 및 인지 능력에 따른 인지반응시간 비교)

  • Kim, Eun-Mi;Kim, Jung-Wan
    • Therapeutic Science for Rehabilitation
    • /
    • v.10 no.4
    • /
    • pp.81-94
    • /
    • 2021
  • Objective: Response time plays a prominent part in research on cognitive ability and the effect of aging. This study aimed to identify the impact of cognitive ability on information processing by conducting a cognitive response time (CRT) test using a computer program. Methods: The study was conducted with 30 normal elderly (NE) participants and 30 elderly participants with amnestic MCI (aMCI), aged 65-79 years and living in Daegu and Gyeongbuk. The results were analyzed using the statistical analysis program R 4.0.2 (University of Auckland, New Zealand). Results: In the three sub-areas of the CRT, the total response time differed significantly by group or age, and the error rate differed significantly by age or group in some sub-areas. In the aMCI group, CRT performance correlated significantly with performance on the overall cognition and memory test. Conclusion: Differences in information processing and processing speed that depend on aging or cognitive ability could be observed through the CRT, and performance on this test correlated significantly with performance on the overall cognition and memory test. The CRT could therefore serve as a simplified tool for predicting early cognitive disorder among elderly people in the community.

Spatiotemporal Analysis of Ship Floating Object Accidents (선박 부유물 감김사고의 시·공간적 분석)

  • Yoo, Sang-Lok;Kim, Deug-Bong;Jang, Da-Un
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.27 no.7
    • /
    • pp.1004-1010
    • /
    • 2021
  • Ship-floating object accidents can lead not only to a delay in ship's operations, but also to large scale casualties. Hence, preventive measures are required to avoid them. This study analyzed the spatiotemporal aspects of such collisions based on the data on ship-floating object accidents in sea areas in the last five years, including the collisions in South Korea's territorial seas and exclusive economic zones. We also provide basic data for related research fields. To understand the distribution of the relative density of accidents involving floating objects, the sea area under analysis was visualized as a grid and a two-dimensional histogram was generated. A multinomial logistic regression model was used to analyze the effect of variables such as time of day and season on the collisions. The spatial analysis revealed that the collision density was highest for the areas extending from Geoje Island to Tongyeong, including Jinhae Bay, and that it was high near Jeongok Port in the West Sea and the northern part of Jeju Island. The temporal analysis revealed that the collisions occurred most frequently during the day (71.4%) and in autumn. Furthermore, the likelihood of collision with floating objects was much higher for professional fishing vessels, leisure vessels, and recreational fishing vessels than for cargo vessels during the day and in autumn. The results of this analysis can be used as primary data for the arrangement of Coast Guard vessels, rigid enforcement of regulations, removal of floating objects, and preparation of countermeasures involving preliminary removal of floating objects to prevent accidents by time and season.
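
The gridded relative-density step described in this abstract can be sketched with a two-dimensional histogram. The function name, the grid size, and the coordinate ranges in the usage note are made-up illustrations, not values from the study.

```python
import numpy as np

def accident_density(lons, lats, lon_range, lat_range, bins=(4, 4)):
    """Relative density of accident positions on a regular lon/lat grid."""
    counts, lon_edges, lat_edges = np.histogram2d(
        lons, lats, bins=bins, range=[lon_range, lat_range]
    )
    total = counts.sum()
    # Normalize counts so that the cells sum to 1 (relative density).
    density = counts / total if total > 0 else counts
    return density, lon_edges, lat_edges
```

For example, calling `accident_density(lons, lats, (126, 130), (33, 37))` on a list of hypothetical accident coordinates returns a 4 x 4 array whose largest cells mark the grid squares with the most recorded accidents.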

A study on deep neural speech enhancement in drone noise environment (드론 소음 환경에서 심층 신경망 기반 음성 향상 기법 적용에 관한 연구)

  • Kim, Jimin;Jung, Jaehee;Yeo, Chaneun;Kim, Wooil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.3
    • /
    • pp.342-350
    • /
    • 2022
  • In this paper, actual drone noise samples are collected for speech processing in disaster environments to build a noise-corrupted speech database, and speech enhancement performance is evaluated by applying spectral subtraction and mask-based speech enhancement techniques. To improve the performance of VoiceFilter (VF), an existing deep neural network-based speech enhancement model, we apply the Self-Attention operation and use the estimated noise information as input to the Attention model. Compared to the existing VF model, the experimental results show improvements of 3.77%, 1.66% and 0.32% in Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), respectively. When the model is trained with a 75% mix of speech data with drone sounds collected from the Internet, the relative performance drops for SDR, PESQ, and STOI are 3.18%, 2.79% and 0.96%, respectively, compared to using only actual drone noise. This confirms that data similar to real data can be collected and used effectively to train speech enhancement models for environments where real data is difficult to obtain.

Dynamic Channel Management Scheme for Device-to-device Communication in Next Generation Downlink Cellular Networks (차세대 하향링크 셀룰러 네트워크에서 단말 간 직접 통신을 위한 유동적 채널관리 방법)

  • Se-Jin Kim
    • Journal of Internet Computing and Services
    • /
    • v.24 no.1
    • /
    • pp.1-7
    • /
    • 2023
  • Recently, device-to-device (D2D) communication has received considerable attention as a way to improve system performance, since the amount of high-quality, large-capacity data traffic from smartphones and various Internet of Things devices is increasing rapidly in 5G/6G-based next generation cellular networks. However, even though the system performance of macro cells increases through frequency reuse, the performance of macro user equipments (MUEs) decreases because of strong interference from D2D user equipments (DUEs). Therefore, this paper proposes a dynamic channel management (DCM) scheme for DUEs that guarantees the performance of MUEs as the number of DUEs increases in next generation downlink cellular networks. In the proposed D2D DCM scheme, macro base stations dynamically assign subchannels to DUEs based on the interference information and the signal to interference and noise ratio (SINR) of MUEs. Simulation results show that the proposed D2D DCM scheme outperforms other schemes in terms of mean MUE capacity as the SINR threshold of MUEs increases.

Deep Learning Acoustic Non-line-of-Sight Object Detection (음향신호를 활용한 딥러닝 기반 비가시 영역 객체 탐지)

  • Ui-Hyeon Shin;Kwangsu Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.233-247
    • /
    • 2023
  • Recently, research on detecting objects in hidden spaces beyond the direct line of sight of observers has received attention. Most studies use optical equipment that exploits the directionality of light, but sound, which exhibits both diffraction and directionality, is also suitable for non-line-of-sight (NLOS) research. In this paper, we propose a novel method of detecting objects in NLOS areas using acoustic signals in the audible frequency range. We developed a deep learning model that takes only acoustic signals as input, extracts information from the NLOS area, and predicts the properties and location of hidden objects. For training and evaluating the model, we collected data for a total of 11 objects while varying the signal transmission and reception locations. We show that the deep learning model performs well in detecting objects in the NLOS area using acoustic signals. We observed that performance decreases as the distance between the signal collection location and the reflecting wall increases, and that performance improves when signals collected from multiple locations are combined. Finally, we propose the optimal conditions for detecting objects in the NLOS area using acoustic signals.

Enhancement of durability of tall buildings by using deep-learning-based predictions of wind-induced pressure

  • K.R. Sri Preethaa;N. Yuvaraj;Gitanjali Wadhwa;Sujeen Song;Se-Woon Choi;Bubryur Kim
    • Wind and Structures
    • /
    • v.36 no.4
    • /
    • pp.237-247
    • /
    • 2023
  • The emergence of high-rise buildings has necessitated frequent structural health monitoring and maintenance for safety reasons. Wind causes damage and structural changes in tall structures, so structures must be designed to withstand it. Previous research has used the pressure developed on tall buildings to assess the impact of wind on structures. The wind tunnel test is a primary research method commonly used to quantify the aerodynamic characteristics of high-rise buildings. Wind pressure is measured by placing pressure sensor taps at different locations on tall buildings, and the collected data are used for analysis. However, sensors may malfunction and produce erroneous data, and the resulting data losses make it difficult to analyze aerodynamic properties. It is therefore essential to generate the missing data from the original data obtained from neighboring pressure sensor taps at various intervals. This study proposes a deep convolutional generative adversarial network (DCGAN) to restore missing data associated with faulty pressure sensors installed on high-rise buildings. The performance of the proposed DCGAN is validated against a standard imputation model, the generative adversarial imputation network (GAIN). The average mean-square error (AMSE) and average R-squared (ARSE) are used as performance metrics. The ARSE values obtained by DCGAN on the front, back, left, and right sides of the building model are 0.970, 0.972, 0.984 and 0.978, respectively. The AMSE values produced by DCGAN on the four sides of the building model are 0.008, 0.010, 0.015 and 0.014. The average standard deviations of the actual pressure sensor measurements on the four sides of the model were 0.1738, 0.1758, 0.2234 and 0.2278. The average standard deviations of the pressure values generated by the proposed DCGAN imputation model were closer to the measured values, at 0.1736, 0.1746, 0.2191, and 0.2239 on the four sides, respectively. In comparison, the standard deviations of the values predicted by GAIN are 0.1726, 0.1735, 0.2161, and 0.2209, which are farther from the actual values. The results demonstrate that the DCGAN model is better suited to data imputation than the GAIN model, with improved accuracy and lower error. Additionally, the DCGAN is used to estimate the wind pressure in regions of buildings where no pressure sensor taps are available; the model yielded greater prediction accuracy than GAIN.
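
As a rough illustration of the two reported metrics, the sketch below computes a mean-square error and an R-squared value per sensor tap and then averages them. Averaging across taps is an assumption about what "average" means here, and the function names and array layout (taps by time steps) are hypothetical.

```python
import numpy as np

def mse(actual, predicted):
    """Mean-square error between two 1-D series."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean((actual - predicted) ** 2))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def average_metrics(actual, predicted):
    """Average MSE and R-squared over rows, one row per pressure tap (assumed)."""
    amse = float(np.mean([mse(a, p) for a, p in zip(actual, predicted)]))
    arse = float(np.mean([r_squared(a, p) for a, p in zip(actual, predicted)]))
    return amse, arse
```

An AMSE near 0 together with an average R-squared near 1, as reported for DCGAN, indicates that the imputed series track the measured ones closely.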