• Title/Summary/Keyword: SpeechWeb


The Analysis for the Distinctive Directing of Speech Balloons in Webtoon (웹툰에 나타난 특징적 말칸 연출에 대한 분석)

  • Jeung, Kiu-Ha;Yoon, Ki-Heon
    • Cartoon and Animation Studies / s.36 / pp.393-416 / 2014
  • Comics have three components: cuts, the gaps between cuts, and speech balloons. Yet speech balloons are rarely the subject of comics research. A few earlier studies identify the morphological features and functions of speech balloons precisely, and these features and functions carry over unchanged into webtoons now that webtoons have become widespread. Speech balloons are nonetheless affected, because the web environment changes how comics are directed overall. The features of speech balloons in webtoons can be sorted from two perspectives. The first is placement: the unlimited expanse of web space lets comics make the gaps between cuts as wide as they like, which gives rise to several placement types: general placement, exterior placement, upper and lower placement, and the scroll-use type. The second is morphology: as webtoon directing techniques become digital, the speech balloon itself gains a wider range of expressive methods. Analyzing, classifying, and recording these newly emerged conditions on the basis of the preceding studies is worthwhile and will serve as a cornerstone for follow-up research.

Development of Automatic Creating Web-Site Tool for the Blind (시각장애인용 웹사이트 자동생성 툴 개발)

  • Baek, Hyeun-Ki;Ha, Tai-Hyun
    • Journal of Digital Contents Society / v.8 no.4 / pp.467-474 / 2007
  • This paper documents the design and implementation of a tool that automatically creates websites for the blind, letting them build their own homepage as easily as the non-disabled by using both voice recognition and speech synthesis technology. With the tool, blind users can create voice mails, schedules, address lists, and bookmarks. Its information management system also facilitates communication with the non-disabled. The tool accepts basic commands through voice recognition and provides text-to-speech for voice output. Ultimately, the tool is meant to remove the social isolation of the blind, allowing them to enjoy the information age as the non-disabled do.


Study of Speech Recognition System Using the Java (자바를 이용한 음성인식 시스템에 관한 연구)

  • Choi, Kwang-Kook;Kim, Cheol;Choi, Seung-Ho;Kim, Jin-Young
    • The Journal of the Acoustical Society of Korea / v.19 no.6 / pp.41-46 / 2000
  • In this paper, we implement a speech recognition system in Java, based on a continuous-distribution HMM and a browser-embedded model, for speech analysis, processing, and recognition on the Web. Using a Java applet, the client performs end-point detection, extracts MFCC, energy, and delta coefficients, and sends this speech information to the server through a socket. The server consists of an HMM recognizer and a trained database; it recognizes the speech and returns the recognized text to the client for display. Although the Java-based speech recognition system has a high error rate, it is platform-independent across the network. The significance of the implemented system is that it can be merged into multimedia applications and shows the possibility of new information and communication services in the future.
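As an illustrative sketch only, and not the paper's Java applet, the client-side front end described in this abstract might compute frame energy, an energy-threshold end-point decision, and delta coefficients roughly as follows. The function names, the non-overlapping framing, and the threshold value are assumptions made for illustration; MFCC extraction is omitted.

```python
import math


def frame_energies(samples, frame_len):
    """Natural-log energy of each non-overlapping frame (framing is an assumption)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(math.log(power + 1e-10))  # small epsilon avoids log(0)
    return energies


def detect_endpoints(energies, threshold):
    """Return (first, last) indices of frames above the threshold, or None."""
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0], active[-1]


def delta(features, window=2):
    """Regression-based delta coefficients over +/- window frames.

    Frames beyond the edges are clamped to the first or last frame.
    """
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(features) - 1
    out = []
    for t in range(len(features)):
        num = 0.0
        for n in range(1, window + 1):
            nxt = features[min(t + n, last)]
            prv = features[max(t - n, 0)]
            num += n * (nxt - prv)
        out.append(num / denom)
    return out


# Hypothetical usage: silence, a burst of signal, then silence again.
samples = [0.0] * 4 + [1.0] * 4 + [0.0] * 4
energies = frame_energies(samples, frame_len=4)
speech_span = detect_endpoints(energies, threshold=-5.0)
```

In a client/server split like the one described, only the frames inside the detected span would need to be sent to the recognizer, which is one plausible reason to run end-point detection on the client.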


Speech Database for 3-5 years old Korean Children (만 3-5세 유아의 한국어 음성 데이터베이스 구축)

  • Yoo, Jae-Kwon;Lee, Kyung-Ok;Lee, Kyoung-Mi
    • The Journal of the Korea Contents Association / v.12 no.4 / pp.52-59 / 2012
  • Children's language skills develop rapidly between the ages of 3 and 5. Supporting this development through a variety of experiences requires age-appropriate content, including content that uses speech interfaces for children, but no speech database of Korean children exists. In this paper, we build a Korean speech database of children aged 3 to 5. To collect children's speech accurately, child education experts took part in the database development process. The words for the database were selected from MCDI-K in two stages, and each child spoke each word three times. The collected speech is split by child and by word and stored in the database. The database will be made available over the web and, we hope, will become a foundation for developing child-oriented content.

Combining deep learning-based online beamforming with spectral subtraction for speech recognition in noisy environments (잡음 환경에서의 음성인식을 위한 온라인 빔포밍과 스펙트럼 감산의 결합)

  • Yoon, Sung-Wook;Kwon, Oh-Wook
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.439-451 / 2021
  • We propose a deep learning-based beamformer combined with spectral subtraction for continuous speech recognition in noisy environments. Conventional beamforming systems were mostly evaluated on pre-segmented audio signals, typically generated by mixing speech and noise continuously on a computer. However, since speech utterances occur sparsely along the time axis in real environments, conventional beamforming systems degrade when noise-only signals without speech are input. To alleviate this drawback, we combine an online beamforming algorithm with spectral subtraction. We construct a Continuous Speech Enhancement (CSE) evaluation set to evaluate the online beamforming algorithm in noisy environments. The evaluation set is built by mixing sparsely occurring speech utterances from the CHiME3 evaluation set with continuously played CHiME3 background noise and MUSDB background music. Using a Kaldi-based toolkit and the Google web speech recognizer as speech recognition back-ends, we confirm that the proposed online beamforming algorithm with spectral subtraction performs better than the baseline online algorithm.
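The spectral subtraction step named in this abstract can be sketched in its textbook magnitude-domain form. This is a minimal illustration, not the authors' implementation: the function names, the over-subtraction factor `alpha`, and the spectral floor are assumptions, and the beamformer is not modeled.

```python
def estimate_noise(noise_only_frames):
    """Average magnitude per frequency bin over frames assumed to contain no speech."""
    count = len(noise_only_frames)
    return [sum(bins) / count for bins in zip(*noise_only_frames)]


def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.05):
    """Subtract a noise magnitude estimate from one frame of magnitude spectrum.

    alpha scales the noise estimate (over-subtraction when > 1); the floor
    keeps each bin at a small fraction of its noisy value instead of letting
    it go negative.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    return [max(y - alpha * n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


# Hypothetical two-bin example.
noise = estimate_noise([[1.0, 2.0], [3.0, 2.0]])   # per-bin mean of the two frames
cleaned = spectral_subtract([5.0, 2.0], noise)     # second bin falls back to the floor
```

Applied to a noise-only stretch, the subtraction drives the frame toward the floor, which matches the abstract's motivation of handling input that contains noise but no speech.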

MAS: Real-time Meeting Scripting and Summarization Service using BART and WebRTC library (MAS: BART 와 WebRTC 라이브러리를 이용한 실시간 회의 스크립트화 및 요약 서비스)

  • Kwon, Ki-Jun;Ko, Geon-Jun;Joo, Yeong-Hwan;Chi, Jeong-hee
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.619-621 / 2022
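  • As the prolonged COVID-19 situation increases demand for remote work and video classes, demand for video conferencing services is also growing. This paper aims to provide a more efficient video conferencing service through research on transcribing meeting content and generating summarized minutes. The service provides video conferencing based on WebRTC and uses the WebSpeech API to turn meeting content into a script. The meeting script is regenerated as a summary by BART, and both the script and the summary can be viewed and downloaded at any time. This paper proposes MAS (Meeting Auto Summarization), a video conferencing service with a meeting summarization function, and describes its design and implementation.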

Voice Portal based on SMS Authentication at CTI Module Implementation by Speech Recognition (SMS 인증 기반의 보이스포탈에서의 음성인식을 위한 CTI 모듈 구현)

  • Oh, Se-Il;Kim, Bong-Hyun;Koh, Jin-Hwan;Park, Won-Tea
    • Proceedings of the Korea Information Processing Society Conference / 2001.04b / pp.1177-1180 / 2001
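  • Voice portal services, which let users listen to Internet information over the telephone, are gaining popularity. In a voice portal service, the user gives a spoken command to a speech recognition system for the desired information and hears that information as speech over the telephone. Such a service requires an SMS (Short Message Service) server module that performs authentication, a CTI (Computer Telephony Integration) module that provides the interface between the PSTN and the database server, a VoiceXML module between the CTI server and the WWW (World Wide Web), and a searching module for retrieving information. This paper designs and implements a CTI module based on speech recognition technology. It also adopts SMS authentication based on a random one-time password as its authentication method, with the aim of providing a more stable service.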
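A random one-time password of the kind this abstract uses for SMS authentication could be issued and checked along the following lines. This is a hedged sketch, not the paper's module: the class name, the six-digit length, the expiry time, and the in-memory store are all assumptions, and delivery through an SMS gateway is left out.

```python
import hmac
import secrets
import time


class OneTimePasswordStore:
    """Issues short-lived numeric codes and accepts each code at most once."""

    def __init__(self, ttl_seconds=180, digits=6):
        self.ttl_seconds = ttl_seconds
        self.digits = digits
        self._pending = {}  # phone number -> (code, expiry timestamp)

    def issue(self, phone, now=None):
        """Create a fresh random code for a phone number and remember it."""
        now = time.time() if now is None else now
        code = "".join(secrets.choice("0123456789") for _ in range(self.digits))
        self._pending[phone] = (code, now + self.ttl_seconds)
        return code  # a real service would pass this to its SMS server module

    def verify(self, phone, submitted, now=None):
        """Check a submitted code; the stored code is discarded on any attempt."""
        now = time.time() if now is None else now
        entry = self._pending.pop(phone, None)
        if entry is None:
            return False
        code, expires_at = entry
        if now > expires_at:
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        return hmac.compare_digest(code, submitted)


# Hypothetical usage with an illustrative phone number.
store = OneTimePasswordStore(ttl_seconds=60)
code = store.issue("010-0000-0000", now=0)
accepted = store.verify("010-0000-0000", code, now=10)
```

Discarding the code after the first attempt is a deliberately strict choice for the sketch; a deployed service might instead allow a small number of retries before invalidating it.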


Improvement of Shop Music Broadcasting Services Using Music Lists and User Experience (방송목록과 사용자 경험 정보를 이용한 매장 음원 방송 서비스의 개선)

  • Kang, Sun-Mee;Kim, Hyun-Deuc;Chang, Moon-Soo
    • Speech Sciences / v.15 no.4 / pp.121-130 / 2008
  • This paper proposes improvements to, and a system design for, shop music broadcasting services delivered over the Internet. By comparing shop music broadcasting services with personal music broadcasting services, we propose a form of shop music broadcasting that customers prefer: the user can adjust the broadcast music list supplied by a specialist according to the current circumstances of the shop. The paper presents the whole system that makes such a service possible and verifies its efficiency through experiments.


A Study on Korean Intonation Using Momel (Momel을 이용한 한국어의 억양 연구)

  • Kim, Sun-Hee;Yoo, Hyun-Ji;Hong, Hye-Jin;Lee, Ho-Young
    • MALSORI / no.63 / pp.85-100 / 2007
  • This paper proposes a method for extracting intonation patterns using Momel, a pitch stylization algorithm, and presents the results of analyzing speech corpora in comparison with those of earlier research. Two speech corpora are used: one is the set of sound files obtained from the K-ToBI web site, and the other consists of 80 passages pronounced by 4 speakers (2 male and 2 female). The results show that Momel provides significant pitch targets that can be labeled as H and L tones within prosodic units such as the Accentual Phrase (AP) and the Intonation Phrase (IP). The resulting AP patterns and IP boundary tone patterns correspond to those in earlier research. Thus, this study will contribute to the study of intonation as well as to the development of automatic intonation labeling systems.


Implementation of interactive Stock Trading System Using VoiceXML

  • Shin Jeong-Hoon;Cho Chang-Su;Hong Kwang-Seok
    • Proceedings of the IEEK Conference / summer / pp.387-390 / 2004
  • In this paper, we design and implement a practical application service using VoiceXML and, on that basis, suggest solutions to problems that can occur when implementing new systems with VoiceXML. Until now, speech-related services were developed with APIs (Application Program Interfaces) and programming languages, methods that depend on the system architecture, so reusing content and resources was very difficult. To solve these problems, companies now develop their applications using VoiceXML. Developing services with VoiceXML has the following advantages. First, web development technologies and technologies for transmitting web content can be used. Second, it saves the labor of low-level programming in languages such as C or assembly. Third, it saves the labor of managing resources. As a result, application development time is reduced and compatibility problems between systems can be solved. However, the actual problems that can occur when implementing services with VoiceXML are poorly understood. To address this, we implemented an interactive stock trading system using VoiceXML and concentrated on finding the problems that arise when using it. We then proposed solutions to these problems and analyzed the strengths and weaknesses of the proposed system.
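To illustrate the point that VoiceXML lets a voice service reuse ordinary web tooling, the sketch below generates a minimal VoiceXML 2.0 dialog from server-side code. It is not taken from the paper: the dialog content, the field name `stock_code`, and the submit URL are hypothetical, and only standard VoiceXML elements (`vxml`, `form`, `field`, `prompt`, `filled`, `submit`) are used.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_quote_dialog(submit_url):
    """Return a VoiceXML document that asks for a stock code and submits it."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "quote"})

    # The field collects spoken or keyed digits into the variable stock_code.
    field = ET.SubElement(form, "field", {"name": "stock_code", "type": "digits"})
    prompt = ET.SubElement(field, "prompt")
    prompt.text = "Please say the stock code."

    # Once the field is filled, send its value to the web server.
    filled = ET.SubElement(form, "filled")
    ET.SubElement(filled, "submit", {"next": submit_url, "namelist": "stock_code"})

    return ET.tostring(vxml, encoding="unicode")


# Hypothetical endpoint; a real service would point at its own server.
document = build_quote_dialog("http://example.com/quote")
```

Because the dialog is plain XML served over HTTP, the same server that renders web pages can produce it, which is the reuse advantage the abstract attributes to VoiceXML.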
