Search | Korea Science

Development of Korean Audio Caption System (한국어 오디오 캡션 시스템 개발)

Kang, Taeho;Kim, Juhee;Lee, Joonha
- Proceedings of the Korean Society of Broadcast Engineers Conference
- /
- 2020.11a
- /
- pp.364-367
- /
- 2020
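Audio captioning is an intermodal translation task in which a system takes an audio signal as input and outputs a text description of that signal. This paper presents a model that uses deep learning algorithms, namely a convolutional neural network (CNN) and a Transformer, to automatically caption ambient environmental sounds and produce its output in Korean. In this study, the model achieved a SPIDEr score of 0.1977, the metric used to evaluate its performance.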
PDF
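For context, SPIDEr is conventionally defined as the arithmetic mean of the SPICE and CIDEr scores. The sketch below only illustrates that combination; the component values are made-up placeholders and are not taken from the paper.

```python
def spider(spice: float, cider: float) -> float:
    """Return the SPIDEr score as the mean of SPICE and CIDEr."""
    return 0.5 * (spice + cider)


# Hypothetical component scores, for illustration only.
example_score = spider(spice=0.10, cider=0.30)
```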

Source Estimation of Digital Filter System using Inverse Problem (역문제를 이용한 디지털 필터 시스템의 소스 추정)

Kim, Tae Yong;Lee, Hoon-Jae
- Proceedings of the Korean Institute of Information and Communication Sciences Conference
- /
- 2014.05a
- /
- pp.57-58
- /
- 2014
Digital filters play a very important role in signal processing systems. In general, the input signal can be determined from the transfer function of the digital filter. However, if the input signal has been exposed to various sound environments, it is difficult to verify its original source. This paper considers an inverse problem for extracting the original input signal from a noisy environment.
PDF
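As a rough illustration of the inverse problem described in this abstract, the sketch below recovers the input of a known FIR filter from its output by inverting the filter recursion. It is not the authors' method: the impulse response `h` and the test signal are hypothetical, and the sketch assumes a noise-free observation and a minimum-phase filter. With measurement noise, some form of regularization would normally be needed.

```python
def fir_filter(x, h):
    """Forward problem: y[n] = sum_k h[k] * x[n - k], with x[n] = 0 for n < 0."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def inverse_filter(y, h):
    """Inverse problem: recover x from y by solving the recursion for x[n].

    Requires h[0] != 0. The recursion is only numerically stable when the
    filter is minimum-phase (all zeros inside the unit circle).
    """
    x = []
    for n in range(len(y)):
        # Contribution of already-recovered past inputs to y[n].
        past = sum(h[k] * x[n - k] for k in range(1, len(h)) if n - k >= 0)
        x.append((y[n] - past) / h[0])
    return x


h = [1.0, 0.5, 0.25]                 # hypothetical impulse response
source = [0.0, 1.0, -2.0, 3.0, 0.5]  # hypothetical input signal
observed = fir_filter(source, h)
recovered = inverse_filter(observed, h)
```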

Directions for Fostering the Digital Content Industry That Will Lead the Digital Economy (디지털 경제를 주도할 디지털 컨텐츠 산업의 육성방향)

박영일
- Proceedings of the Korea Database Society Conference
- /
- 1999.10a
- /
- pp.1-11
- /
- 1999
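o What is digital content (multimedia content)? Multimedia: the integrated media that emerged as digital technology brought together the media domains of text, voice, photography, video, and animation, which had each developed separately under analog technology. Digitization is the conversion of all kinds of information, such as text, sound, pictures, video, and numbers, into signals (binary code) that a computer can recognize. (remainder omitted)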
PDF

Research of real-time image which is responding to the strings sound in art performance (무대 공연에서 현악기 소리에 반응하는 실시간 영상에 관한 연구)

Jang, Eun-Sun;Hong, Sung-Dae;Park, Jin-Wan
- Proceedings of the Korea Contents Association Conference
- /
- 2009.05a
- /
- pp.185-190
- /
- 2009
Recent performing arts tend toward a new style of cultural content that mixes various genres rather than following only traditional forms. In stage performance in particular, distinctive productions are being staged using advanced technology and imagery. In sound performance, one such technology, new experiments re-analyze the sound and combine the result with images. In public performance, however, visualizing sound in real time is technically difficult. Because the instantaneous sound from performers and the audience cannot be visualized, smooth interaction between performer and audience is hard to achieve. To overcome this limitation, this paper proposes real-time sound visualization, using string instruments as the sound source. We build a MIDI-based image control system with MaxMSP/Jitter, then test and control the images with a Korg Nano Kontrol. Through this experiment we confirm that the visuals can reflect the performer's various emotions, feelings, and rhythms according to the performance environment, and that the real-time interactive images change from moment to moment with the performer's actions.
PDF

An Inter-floor Noise Prevention System using an Open-source Controller (오픈소스 컨트롤러를 사용한 층간 소음 방지 시스템)

Kim, Tae-Hoon;Jang, Hyuk-Jae;Lee, Won-Young
- The Journal of the Korea institute of electronic communication sciences
- /
- v.12 no.5
- /
- pp.899-906
- /
- 2017
This paper proposes an inter-floor noise prevention system using an open-source controller. In the proposed system, an Arduino, a widely used open-source controller, analyzes sound and vibration signals with the fast Fourier transform. When the magnitude of the band-passed signal exceeds the noise reference, taking into account the transmission loss of a panel or wall, the system displays warning messages on an LCD module and a mobile device so that users are aware of the noise condition. In the experiment, the system successfully extracted and processed band-passed signals between 130 Hz and 1040 Hz. When the magnitude of the extracted signal, after subtracting the transmission loss, exceeded 45 dB, the system displayed the warning message on the LCD module and the mobile device to prompt noise reduction.
https://doi.org/10.13067/JKIECS.2017.12.5.899 Cite PDF KSCI
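The detection logic in this abstract can be sketched roughly as follows. This is a Python illustration rather than the Arduino firmware from the paper: the function names, the sample rate, the test tone, and the calibration offset that maps digital amplitude to dB are all assumptions. Only the 130 Hz to 1040 Hz band and the 45 dB threshold come from the abstract.

```python
import cmath
import math


def band_level_db(samples, sample_rate, f_lo=130.0, f_hi=1040.0):
    """Level in dB (relative to unit amplitude) of the strongest DFT bin in [f_lo, f_hi]."""
    n = len(samples)
    peak = 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            # Plain DFT of bin k; a real system would use an FFT routine.
            coeff = sum(
                s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(samples)
            )
            peak = max(peak, 2.0 * abs(coeff) / n)  # single-sided amplitude
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


def should_warn(level_db, transmission_loss_db, threshold_db=45.0):
    """Warn when the level minus the transmission loss exceeds the threshold."""
    return level_db - transmission_loss_db > threshold_db


SAMPLE_RATE = 2560            # hypothetical sampling rate in Hz
CALIBRATION_OFFSET_DB = 94.0  # hypothetical mapping from unit amplitude to dB
tone = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(256)]
level = band_level_db(tone, SAMPLE_RATE) + CALIBRATION_OFFSET_DB
warn = should_warn(level, transmission_loss_db=40.0)
```

With these assumed values the 440 Hz tone falls inside the band, so the warning condition is met. A tone outside the band contributes almost nothing to the measured level.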

The Analysis of EEG Signal Responding to the Pure Tone Auditory Stimulus (청각자극의 반송 주파수에 따른 뇌전위 신호의 해석)

Choe, Jeong-Mi;Bae, Byeong-Hun;Kim, Su-Yong
- Journal of Biomedical Engineering Research
- /
- v.15 no.4
- /
- pp.383-388
- /
- 1994
This paper presents a chaotic analysis of EEG signals responding to auditory stimuli with various carrier frequencies and a constant triggering frequency. The EEG signals were obtained from a digital 12-channel EEG system built in our laboratory. The carrier frequency was varied from 1 kHz to 3 kHz in 0.5 kHz steps. Chaos analyses such as the pseudo phase space portrait and the Lyapunov exponent were performed on the auditory evoked potentials. The results are quite consistent with well-known results from psychological perception theory.
PDF
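A pseudo phase space portrait is commonly built by time-delay embedding of a single measured channel. The sketch below shows that generic construction only; the abstract does not state the embedding dimension or lag the authors used, so `dim`, `lag`, and the sample series are hypothetical.

```python
def delay_embed(series, dim=2, lag=1):
    """Return delay vectors (x[t], x[t + lag], ..., x[t + (dim - 1) * lag])."""
    span = (dim - 1) * lag
    return [
        tuple(series[t + j * lag] for j in range(dim))
        for t in range(len(series) - span)
    ]


# Hypothetical samples standing in for one EEG channel.
points = delay_embed([1, 2, 3, 4, 5], dim=2, lag=2)
```

Plotting the first coordinate of each vector against the second gives the two-dimensional portrait.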

A Study on the Acoustic Characteristics of the Pansori by Voice Signals Analysis (음성신호 분석에 의한 판소리의 음성학적 특징 연구)

Kim, HyunSook
- Journal of the Korea Academia-Industrial cooperation Society
- /
- v.14 no.7
- /
- pp.3218-3222
- /
- 2013
Pansori is a traditional Korean vocal art combining narration and gesture, whose originality and excellence have earned it global recognition as world intangible heritage. In particular, Pansori is regarded as having high artistic value for its satirical and humorous expression and its audience participation, and as an art enjoyed across all social classes that serves a function of social integration. This paper therefore applies speech signal analysis techniques to the five Pansori madang (works) to analyze their acoustic features and to study how these features correlate with the society and era they represent. The experiment analyzes the spectrogram, pitch, stability, and intensity of the five madang. The results indicate that, to tell comical stories while keeping the audience focused and interested, Pansori is delivered in a stable, loud voice with high energy and a wide range of vocal cord tremor.
https://doi.org/10.5762/KAIS.2013.14.7.3218 Cite PDF KSCI

Classification of bearded seals signal based on convolutional neural network (Convolutional neural network 기법을 이용한 턱수염물범 신호 판별)

Kim, Ji Seop;Yoon, Young Geul;Han, Dong-Gyun;La, Hyoung Sul;Choi, Jee Woong
- The Journal of the Acoustical Society of Korea
- /
- v.41 no.2
- /
- pp.235-241
- /
- 2022
Several studies have used convolutional neural networks (CNNs) to detect and classify marine mammal sounds in underwater acoustic data collected through passive acoustic monitoring. This study confirmed the feasibility of automatically classifying bearded seal sounds with a CNN model based on spectrogram images of underwater acoustic data collected in the East Siberian Sea from August 2017 to August 2018. When only clear seal sounds were used as the training dataset, overfitting due to memorization occurred. When some of the training data were replaced with data containing noise and the model was evaluated on the entire training data, overfitting was prevented and the model generalized better than before, with accuracy of 0.9743, precision of 0.9783, and recall of 0.9520. In short, the performance of the bearded seal signal classifier improved when noise was included in the training data.
https://doi.org/10.7776/ASK.2022.41.2.235 Cite PDF KSCI
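The three reported metrics follow the standard definitions for binary classification. The sketch below computes them from confusion-matrix counts; the counts in the example are hypothetical and do not reproduce the paper's figures.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return accuracy, precision, recall


# Hypothetical counts: 8 true positives, 2 false positives,
# 9 true negatives, 1 false negative.
accuracy, precision, recall = classification_metrics(tp=8, fp=2, tn=9, fn=1)
```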

Development of Auditory Icon in Ship Bridge Alarm Management System (선교알람관리시스템의 청각아이콘 개발을 위한 연구)

Oh, Seung-Bin;Jang, Jun-Hyuk;Kim, Hong-Tae
- Proceedings of the Korean Institute of Navigation and Port Research Conference
- /
- 2012.10a
- /
- pp.5-7
- /
- 2012
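A ship's bridge has a variety of signals for conveying information to the navigator. Many auditory signals exist, including acoustic signals from navigation and communication equipment, but research on human cognitive ability with respect to these auditory signals and auditory warnings remains insufficient. Auditory warnings can be broadly divided into speech, abstract sounds, and auditory icons. In this study, auditory icons suited to each of five alarm situations (engine, fire, steering, electrical, and collision) were selected through an affective evaluation of candidate auditory icons. The results are expected to serve as basic data for auditory displays on the bridge and for integrated bridge alarm management systems.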
PDF

Design of Wide Input Range Multiple Filter-Banks for Analog Cochlear Chip (입력 신호범위가 넓은 아날로그 다중필터의 설계)

Choi, B.K.;Lee, K.;Ryu, S.T.;Cho, G.H.
- Proceedings of the KIEE Conference
- /
- 2001.07d
- /
- pp.2613-2615
- /
- 2001
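To achieve low power consumption and low cost in auditory systems, the basilar membrane (BM) model of the cochlea is being implemented in an analog VLSI micropower process. In this paper, a 16-channel analog bandpass filter circuit with a tree-structured cascaded bandpass filter (TSBF) architecture, which extracts the frequency information of sound, is designed in a CMOS VLSI process. In particular, the filters are implemented with a transconductor that remains linear, without waveform distortion, even for large input signals. The overall system is designed with low-pass filters together with bandpass filters, the latter used to reduce attenuation of the output gain. By replacing the conventional transconductor, whose input range is 150 mVp-p, with a transconductor that has a substrate input, the input signal range is extended to 1 Vp-p.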
PDF

Search Result 198, Processing Time 0.024 seconds
