• Title/Abstract/Keywords: Speech Engineering

User Detection and Main Body Parts Estimation using Inaccurate Depth Information and 2D Motion Information (정밀하지 않은 깊이정보와 2D움직임 정보를 이용한 사용자 검출과 주요 신체부위 추정)

  • Lee, Jae-Won;Hong, Sung-Hoon
    • Journal of Broadcast Engineering / v.17 no.4 / pp.611-624 / 2012
  • Gesture is the most intuitive means of communication other than voice. Consequently, much research has addressed controlling computers with gesture input in place of the keyboard or mouse. In such research, user detection and estimation of the main body parts are among the most important steps. In this paper, we propose a method for detecting user objects and estimating the main body parts from inaccurate depth information, for use in pose estimation. The proposed user detection method uses both 2D and 3D depth information, which makes it robust to lighting changes and noise; it processes 2D signals as 1D signals, which makes it suitable for real-time operation; and it uses information about previously detected objects, which makes it more accurate and robust. We also present a method for estimating the main body parts using 2D contour information, 3D depth information, and tracking. Experimental results show that the proposed user detection method is more robust than methods using only 2D information and detects objects accurately even with inaccurate depth information. The proposed body-part estimation method also overcomes two drawbacks of existing approaches: methods using only 2D contour information cannot detect main body parts in occluded areas, and methods using color information are sensitive to changes in illumination or environment.

Design of a Low Power Digital Filter Using Variable Canonic Signed Digit Coefficients (가변 CSD 계수를 이용한 저전력 디지털 필터의 설계)

  • Kim, Yeong-U;Yu, Jae-Taek;Kim, Su-Won
    • Journal of the Institute of Electronics Engineers of Korea SD / v.38 no.7 / pp.455-463 / 2001
  • In this paper, an approximate processing method is proposed and tested. The proposed method uses variable CSD (VCSD) coefficients, which approximate the filter stopband attenuation by controlling the precision of the CSD coefficient sets. A decimation filter for the Audio Codec '97 specification has been designed with a processor architecture consisting of program/data memory, arithmetic unit, energy/level decision, and sinc filter blocks, and fabricated in 0.6 $\mu$m CMOS sea-of-gate technology. For the two combined halfband FIR filters in the decimation filter, the number of addition operations was reduced to 63.5%, 35.7%, and 13.9% relative to the non-adaptive worst case. Experimental results show that the total power reduction of the filter varies from 3.8% to 9.0% with respect to the worst case. The proposed approximate processing method using variable CSD coefficients is readily applicable to various kinds of filters and is especially suitable for speech and audio applications such as oversampling ADCs and DACs, filter banks, and voice/audio codecs.
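
The idea of trading coefficient precision for fewer additions can be illustrated with a minimal sketch. It converts an integer coefficient to canonic signed digit (CSD) form and then keeps only the most significant nonzero digits. The function names (`to_csd`, `truncate_csd`, `from_digits`) and the truncation rule are assumptions made for illustration; the abstract does not say how the paper selects its variable CSD coefficient sets.

```python
def to_csd(value: int) -> list[int]:
    """Return CSD digits (-1, 0, or +1), least significant digit first."""
    digits = []
    while value != 0:
        if value % 2 == 0:
            digits.append(0)
        else:
            # Pick +1 or -1 so the remainder is divisible by 4; the next
            # digit is then 0, so no two nonzero digits are adjacent.
            d = 2 - (value % 4)  # value % 4 == 1 -> +1, value % 4 == 3 -> -1
            digits.append(d)
            value -= d
        value //= 2
    return digits


def truncate_csd(digits: list[int], max_nonzero: int) -> list[int]:
    """Keep only the max_nonzero most significant nonzero digits.

    Illustrative precision control (an assumption, not the paper's rule).
    """
    out = [0] * len(digits)
    kept = 0
    for i in range(len(digits) - 1, -1, -1):
        if digits[i] != 0 and kept < max_nonzero:
            out[i] = digits[i]
            kept += 1
    return out


def from_digits(digits: list[int]) -> int:
    """Evaluate a signed-digit list back to an integer."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


if __name__ == "__main__":
    full = to_csd(23)                 # 23 = 32 - 8 - 1
    coarse = truncate_csd(full, 2)    # keeps 32 - 8
    print(full, from_digits(full))
    print(coarse, from_digits(coarse))
```

Each nonzero digit costs one shifted add or subtract in a multiplierless filter, so dropping low-order nonzero digits reduces additions at the price of a coarser coefficient.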

Design and Implementation of Receiver Algorithms for VDL Mode-2 Systems (VDL Mode-2 시스템을 위한 수신 알고리듬 설계 및 구현)

  • Lee, Hui-Soo;Kang, Dong-Hoon;Park, Hyo-Bae;Oh, Wang-Rock
    • Journal of the Institute of Electronics Engineers of Korea TC / v.46 no.10 / pp.28-33 / 2009
  • In this paper, we propose receiver algorithms suitable for the VHF (Very High Frequency) Digital Link Mode-2 (VDL Mode-2) system. Unlike conventional digital communication systems, which use a root raised cosine filter as both the transmit and the receive filter, the VDL Mode-2 system uses a raised cosine filter as the transmit filter. Hence, it is crucial to design and implement an optimum lowpass receive filter that accounts for both inter-symbol interference and noise performance. In addition, because the preamble pattern is short, an efficient packet detection algorithm is needed for a reliable VDL Mode-2 communication link. The frequency offset caused by the carrier frequency difference between transmitter and receiver and by Doppler shift must also be estimated and compensated for reliable communication. In this paper, the optimum receive filter, packet detection, and frequency offset compensation algorithms are proposed, and the performance of a VDL system employing the proposed algorithms is evaluated.
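
The filter issue described above can be made concrete with a small sketch of the raised cosine impulse response. A raised cosine pulse is zero at every nonzero multiple of the symbol period, so a transmit filter of this shape is already free of inter-symbol interference at the sampling instants, and the receive filter then has to limit noise without destroying that property. The function name `raised_cosine` and the default roll-off `beta=0.6` are illustrative assumptions; the abstract does not state the roll-off, and this is not the paper's receive filter design.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def raised_cosine(t: float, T: float = 1.0, beta: float = 0.6) -> float:
    """Raised cosine impulse response with symbol period T and roll-off beta."""
    x = t / T
    denom = 1.0 - (2.0 * beta * x) ** 2
    if abs(denom) < 1e-12:
        # Removable singularity at t = +/- T / (2 beta).
        return (math.pi / 4.0) * sinc(1.0 / (2.0 * beta))
    return sinc(x) * math.cos(math.pi * beta * x) / denom


if __name__ == "__main__":
    # Samples at integer multiples of T: 1 at k = 0, ~0 elsewhere.
    for k in range(-3, 4):
        print(k, round(raised_cosine(float(k)), 12))
```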

Performance Evaluation of Scheduling Algorithm for VoIP under Data Traffic in LTE Networks (데이터 트래픽 중심의 LTE망에서 VoIP를 위한 스케줄링 알고리즘 성능 분석)

  • Kim, Sung-Ju;Lee, Jae Yong;Kim, Byung Chul
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.12 / pp.20-29 / 2014
  • LTE is now preparing for a leap forward to LTE-A around the world. As LTE provides high-speed service, the role of mobile phones appears to be shifting from voice to data service. According to Cisco, global mobile data traffic will increase nearly 11-fold between 2013 and 2018. In Korea, mobile video traffic will reach 75% by 2018, up from 66% in 2013. Nevertheless, voice service remains the most important role of mobile phones. Controllable throughput and a low BLER are therefore indispensable for high-quality VoIP service among various types of traffic. Although the maximum AMR-WB rate of 23.85 kbps is sufficient for a VoIP call, LTE, which can provide tens to hundreds of MB/s, may fail to maintain a given level of VoIP QoS, especially in the cell-edge area. After analyzing various scheduling algorithms, this paper proposes a new scheduling algorithm to improve VoIP performance. Based on a two-tier scheduling algorithm, the proposal gives VoIP higher processing priority than other applications in the cell-edge area. Simulation results show improved VoIP performance in terms of throughput and BLER.

Design and Implementation of a Real-time Bio-signal Obtaining, Transmitting, Compressing and Storing System for Telemedicine (원격 진료를 위한 실시간 생체 신호 취득, 전송 및 압축, 저장 시스템의 설계 및 구현)

  • Jung, In-Kyo;Kim, Young-Joon;Park, In-Su;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SC / v.45 no.4 / pp.42-50 / 2008
  • A real-time bio-signal monitoring system based on ZigBee and SIP/RTP has previously been proposed and implemented for telemedicine, but it has stability problems when transmitting bio-signals from the sensors to the remote side. In this paper, we design and implement a real-time bio-signal monitoring system focused on reliable and efficient real-time transmission of bio-signals. The system has an enhanced architecture and improved performance in the ubiquitous sensor network, in SIP/RTP real-time transmission, and in database management. A Bluetooth network is combined with the ZigBee network to distribute the traffic of the ECG and the other bio-signals. Modified, multiple RTP sessions are used to ensure real-time transmission of the ECG, other bio-signals, and speech over the Internet. A modified ECG compression method based on DWLT and MSVQ reduces the data rate for storing the ECG in the database. The implemented system shows improved performance in transmitting bio-signals from the sensors to the monitoring console and database, and it enables a variety of applications for U-health care services.

Development of Driver's Emotion and Attention Recognition System using Multi-modal Sensor Fusion Algorithm (다중 센서 융합 알고리즘을 이용한 운전자의 감정 및 주의력 인식 기술 개발)

  • Han, Cheol-Hun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.6 / pp.754-761 / 2008
  • As the automobile industry and its technologies develop, drivers tend to be more concerned with service than with mechanical matters. For this reason, interest is growing in recognizing human knowledge and emotion in order to make the driving environment safe and convenient. Recognition of human knowledge and emotion belongs to emotion engineering, a technology that has been studied since the late 1980s to provide people with human-friendly services. Emotion engineering analyzes people's emotions through their faces, voices, and gestures, so applying it to automobiles makes it possible to supply drivers with services suited to each driver's situation and to help them drive safely. Furthermore, recognizing the driver's gestures can prevent accidents caused by careless driving or dozing off at the wheel. The purpose of this paper is to develop a system that recognizes the driver's emotional state and attention for safe driving. First, we detect signals of the driver's emotion, sleepiness, and attention using bio-motion signals, and then build several types of databases. By analyzing these databases, we find distinctive features of drivers' emotion, sleepiness, and attention, and fuse the results through a multi-modal method to develop the system.

3D Pose Estimation of a Human Arm for Human-Computer Interaction - Application of Mechanical Modeling Techniques to Computer Vision (인간-컴퓨터 상호 작용을 위한 인간 팔의 3차원 자세 추정 - 기계요소 모델링 기법을 컴퓨터 비전에 적용)

  • Han Young-Mo
    • Journal of the Institute of Electronics Engineers of Korea SC / v.42 no.4 s.304 / pp.11-18 / 2005
  • To express intention, humans often use body language as well as spoken language, and gestures made with the arms and hands are the most representative form of body language. It is therefore very important to understand human arm motion in human-computer interaction. In this respect, we present a way to estimate the 3D pose of human arms using computer vision systems. We first focus on the idea that human arm motion consists mostly of revolute joint motions, and present an algorithm for understanding the 3D motion of a revolute joint using vision systems. We then apply it to estimating the 3D pose of human arms. The fundamental idea behind this extension is that the algorithm for a single revolute joint can be applied to each of the revolute joints of the human arm, one after another. In designing the algorithms, we focus on seeking closed-form solutions with high accuracy, because we aim to apply them to human-computer interaction for ubiquitous computing and virtual reality.
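
The revolute joint model that the abstract builds on can be sketched as a forward motion model: a point on a link rotates about a fixed joint axis, which can be written in closed form with Rodrigues' rotation formula. This is only the geometric model, not the paper's vision-based estimation algorithm, and the function name `rotate_about_axis` and the example geometry are assumptions for illustration.

```python
import math


def rotate_about_axis(p, axis, theta, origin=(0.0, 0.0, 0.0)):
    """Rotate point p by angle theta (radians) about a joint axis through origin.

    Uses Rodrigues' formula: v' = v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t)).
    """
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                      # unit joint axis
    v = [p[i] - origin[i] for i in range(3)]          # point relative to joint
    kxv = [k[1] * v[2] - k[2] * v[1],
           k[2] * v[0] - k[0] * v[2],
           k[0] * v[1] - k[1] * v[0]]
    kdv = sum(k[i] * v[i] for i in range(3))
    c, s = math.cos(theta), math.sin(theta)
    r = [v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c) for i in range(3)]
    return [r[i] + origin[i] for i in range(3)]


if __name__ == "__main__":
    # Hypothetical elbow at (0.3, 0, 0) with a z-directed axis; the wrist
    # starts 0.25 further along x and the elbow flexes by 90 degrees.
    elbow = (0.3, 0.0, 0.0)
    wrist = (0.55, 0.0, 0.0)
    print(rotate_about_axis(wrist, (0.0, 0.0, 1.0), math.pi / 2, origin=elbow))
```

Chaining this rotation joint by joint, from the shoulder outward, mirrors the abstract's idea of handling the arm's revolute joints one after another.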

Classification standard of Communication Tool (플랫폼 분류 기준 고찰 : 감각의 입·출력)

  • Kim, Hyo-Yeun
    • Proceedings of the Korea Contents Association Conference / 2018.05a / pp.189-190 / 2018
  • Digital content requires concepts and structures that give insight into the languages between computers and humans and into how human experience manifests in the flow of characters, images, and voice. Communicology, Vilém Flusser's original study, allows us to reconsider and reconstruct the boundary of human awareness. This paper begins to approach digital content, which consists of numerical codes, by reviewing communicology. Communicology helps to break up pre-existing categories and to think about new standards. With the help of information technology, content planning can be actualized by classifying and reconstructing content as the input/output of senses. The standards of classification are 'boundary' and 'direction,' communication elements that cannot be broken down any further. There is no need to communicate if there is no boundary. The operation of communication is comprised of 'direction.' Taking humankind as the reference, the boundary that takes in stimulation from outside can be seen as the senses. Direction can be expressed as input/output. Output assumes that technical pictures receive information. The coordinates for various pre-existing platforms and content, and for uncovered platforms, can be set with a consistent standard. This allows us to escape the standard of flat content activated by sight and rationality under the ideology of characters, to seek a three-dimensional standard that can be vitalized by various senses and irrationality, and to reconstruct the input/output of senses to show the possibility of planning a new platform.

Surface Modification of Recycled Plastic Film-Based Aggregates for Use in Concrete (폐플라스틱 복합필름 기반 콘크리트용 골재의 표면 개질)

  • Kim, Tae Hun;Lee, Jea Uk;Hong, Jin-Yong
    • Journal of the Korean Recycled Construction Resources Institute / v.9 no.3 / pp.295-302 / 2021
  • Surface modification of recycled plastic film-based aggregates is demonstrated to enhance the interaction between the aggregates and cement paste. Oxygen (O2) atmospheric pressure plasma (APP) treatment is shown to lead to a drastic increase in hydrophilicity. With plasma treatment at 100 W of RF power, an O2/Ar flow rate of 15/4 sccm, and a discharge time of 30 s, the water contact angle on the aggregate surface decreased from 104.5° to 44.0°. In addition, the contact angle of surface-modified aggregates kept in air increased over time. The improvement in hydrophilicity can be explained by the formation of new hydrophilic oxygen functional groups, identified as C-OH, C-O-C, C=O, and -COOH by X-ray photoelectron spectroscopy (XPS) and Fourier-transform infrared spectroscopy (FT-IR). It can therefore be concluded that the plasma treatment process is an effective method for improving adhesion between the recycled plastic film-based aggregates and cement paste.

CRNN-Based Korean Phoneme Recognition Model with CTC Algorithm (CTC를 적용한 CRNN 기반 한국어 음소인식 모델 연구)

  • Hong, Yoonseok;Ki, Kyungseo;Gweon, Gahgene
    • KIPS Transactions on Software and Data Engineering / v.8 no.3 / pp.115-122 / 2019
  • For Korean phoneme recognition, hidden Markov model-Gaussian mixture model (HMM-GMM) systems or hybrid models that combine an artificial neural network with an HMM have mainly been used. However, these approaches are limited in that they require force-aligned corpus training data manually annotated by experts. Recently, researchers have used neural phoneme recognition models that combine a recurrent neural network (RNN)-based structure with the connectionist temporal classification (CTC) algorithm to overcome the problem of obtaining manually annotated training data. Yet, in terms of implementation, these RNN-based models have another difficulty: the amount of data required grows as the structure becomes more sophisticated. This is particularly problematic for Korean, which lacks refined corpora. In this study, we use the CTC algorithm, which does not require forced alignment, to create a Korean phoneme recognition model. Specifically, the model is based on a convolutional neural network (CNN), which requires a relatively small amount of data and can be trained faster than RNN-based models. We present the results of two experiments and the resulting best-performing phoneme recognition model, which distinguishes 49 Korean phonemes. The best-performing model combines a CNN with a 3hop bidirectional LSTM and reaches a final phoneme error rate (PER) of 3.26. This PER is a considerable improvement over existing Korean phoneme recognition models, which report PERs ranging from 10 to 12.
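
Two pieces of the evaluation pipeline mentioned above can be sketched briefly: greedy CTC decoding, which merges repeated frame labels and then drops the blank symbol, and the phoneme error rate, computed here as edit distance divided by reference length. The blank symbol `"_"`, the function names, and reporting PER as a percentage are assumptions for illustration; the abstract does not say how the paper decodes or scales its PER.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol for this sketch


def ctc_greedy_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blanks."""
    return [label for label, _ in groupby(frame_labels) if label != blank]


def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """PER in percent: 100 * edit distance / reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    frames = list("__aa_a_bb__")        # per-frame argmax labels
    hyp = ctc_greedy_collapse(frames)   # blank between the two 'a's keeps both
    print(hyp, phoneme_error_rate(["a", "a", "b", "c"], hyp))
```

The blank between the two runs of `a` is what lets CTC output a doubled phoneme without frame-level alignment, which is why no force-aligned corpus is needed.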