• Title/Summary/Keyword: Speech-Recognition

Search Result 2,051, Processing Time 0.028 seconds

The impact of Digital Video Effects and subtitles on evaluation and agenda recognition in TV News (TV뉴스의 어깨걸이와 자막이 뉴스에 대한 평가와 의제 인식에 미치는 영향)

  • Bae, Jin-Ah
    • Journal of Digital Contents Society
    • /
    • v.18 no.3
    • /
    • pp.465-473
    • /
    • 2017
  • An experiment was conducted to investigate the relationship between the DVEs and subtitles provided with anchor speech in TV news and the news evaluation, trust and agenda recognition. 120 university students were asked to watch four types of news that differed in the contents of their DVEs and subtitles, and then they evaluated the fairness, sensibility, and irritability of the news. The content of DVEs and subtitles were related to the irritability evaluation and trust of the news, and it was not related with the fairness and sensibility evaluation. When the DVEs and subtitles emphasizing the stimulating aspects of the issue were given, the irritability was highly evaluated and the trust was low. The evaluation on news fairness affected the news trust. The more they rated news as fair, the greater the trust in news. Unlike the assumption in this study, DVEs and subtitles were not the main factors influencing the news agenda perception, and the viewers tended to perceive the agenda based on the news content itself.

Study on Gesture and Voice-based Interaction in Perspective of a Presentation Support Tool

  • Ha, Sang-Ho;Park, So-Young;Hong, Hye-Soo;Kim, Nam-Hun
    • Journal of the Ergonomics Society of Korea
    • /
    • v.31 no.4
    • /
    • pp.593-599
    • /
    • 2012
  • Objective: This study aims to implement a non-contact gesture-based interface for presentation purposes and to analyze the effect of the proposed interface as information transfer assisted device. Background: Recently, research on control device using gesture recognition or speech recognition is being conducted with rapid technological growth in UI/UX area and appearance of smart service products which requires a new human-machine interface. However, few quantitative researches on practical effects of the new interface type have been done relatively, while activities on system implementation are very popular. Method: The system presented in this study is implemented with KINECT$^{(R)}$ sensor offered by Microsoft Corporation. To investigate whether the proposed system is effective as a presentation support tool or not, we conduct experiments by giving several lectures to 40 participants in both a traditional lecture room(keyboard-based presentation control) and a non-contact gesture-based lecture room(KINECT-based presentation control), evaluating their interests and immersion based on contents of the lecture and lecturing methods, and analyzing their understanding about contents of the lecture. Result: We check that whether the gesture-based presentation system can play effective role as presentation supporting tools or not depending on the level of difficulty of contents using ANOVA. Conclusion: We check that a non-contact gesture-based interface is a meaningful tool as a sportive device when delivering easy and simple information. However, the effect can vary with the contents and the level of difficulty of information provided. Application: The results presented in this paper might help to design a new human-machine(computer) interface for communication support tools.

Part-Of-Speech Tagging and the Recognition of the Korean Unknown-words Based on Machine Learning (기계학습에 기반한 한국어 미등록 형태소 인식 및 품사 태깅)

  • Choi, Maeng-Sik;Kim, Hark-Soo
    • The KIPS Transactions:PartB
    • /
    • v.18B no.1
    • /
    • pp.45-50
    • /
    • 2011
  • Unknown morpheme errors in Korean morphological analysis are divided into two types: The one is the errors that a morphological analyzer entirely fails to return any morpheme sequences, and the other is the errors that a morphological analyzer returns incorrect combinations of known morphemes. Most previous unknown morpheme estimation techniques have been focused on only the former errors. This paper proposes a unknown morpheme estimation method which can handle both of the unknown morpheme errors. The proposed method detects Eojeols (Korean spacing units) that may include unknown morpheme errors using SVM (Support Vector Machine). Then, using CRFs (Conditional Random Fields), it segments morphemes from the detected Eojeols and annotates the segmented morphemes with new POS tags. In the experiments, the proposed method outperformed the conventional method based on the longest matching of functional words. Based on the experimental results, we knew that the second type errors should be dealt with in order to increase the performance of Korean morphological analysis.

An Enhancement of Learning Speed of the Error - Backpropagation Algorithm (오류 역전도 알고리즘의 학습속도 향상기법)

  • Shim, Bum-Sik;Jung, Eui-Yong;Yoon, Chung-Hwa;Kang, Kyung-Sik
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.7
    • /
    • pp.1759-1769
    • /
    • 1997
  • The Error BackPropagation (EBP) algorithm for multi-layered neural networks is widely used in various areas such as associative memory, speech recognition, pattern recognition and robotics, etc. Nevertheless, many researchers have continuously published papers about improvements over the original EBP algorithm. The main reason for this research activity is that EBP is exceeding slow when the number of neurons and the size of training set is large. In this study, we developed new learning speed acceleration methods using variable learning rate, variable momentum rate and variable slope for the sigmoid function. During the learning process, these parameters should be adjusted continuously according to the total error of network, and it has been shown that these methods significantly reduced learning time over the original EBP. In order to show the efficiency of the proposed methods, first we have used binary data which are made by random number generator and showed the vast improvements in terms of epoch. Also, we have applied our methods to the binary-valued Monk's data, 4, 5, 6, 7-bit parity checker and real-valued Iris data which are famous benchmark training sets for machine learning.

  • PDF

Research Trends for the Deep Learning-based Metabolic Rate Calculation (재실자 활동량 산출을 위한 딥러닝 기반 선행연구 동향)

  • Park, Bo-Rang;Choi, Eun-Ji;Lee, Hyo Eun;Kim, Tae-Won;Moon, Jin Woo
    • KIEAE Journal
    • /
    • v.17 no.5
    • /
    • pp.95-100
    • /
    • 2017
  • Purpose: The purpose of this study is to investigate the prior art based on deep learning to objectively calculate the metabolic rate which is the subjective factor for the PMV optimum control and to make a plan for future research based on this study. Methods: For this purpose, the theoretical and technical review and applicability analysis were conducted through various documents and data both in domestic and foreign. Results: As a result of the prior art research, the machine learning model of artificial neural network and deep learning has been used in various fields such as speech recognition, scene recognition, and image restoration. As a representative case, OpenCV Background Subtraction is a technique to separate backgrounds from objects or people. PASCAL VOC and ILSVRC are surveyed as representative technologies that can recognize people, objects, and backgrounds. Based on the results of previous researches on deep learning based on metabolic rate for occupational metabolic rate, it was found out that basic technology applicable to occupational metabolic rate calculation technology to be developed in future researches. It is considered that the study on the development of the activity quantity calculation model with high accuracy will be done.

Analysis of IT Technology through the Trends in Home Video Game Console (가정용 게임기 동향을 통해 본 IT 기술 분석)

  • Bae, Jung-Min;Bae, Yu-Mi;Jung, Sung-Jae;Jang, Rea-Young;Sung, Kyung
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2014.05a
    • /
    • pp.675-678
    • /
    • 2014
  • One time, Home video game console's penetration was as comparable to the personal computer's penetration, growth has slowed since the advent of smartphones, tablets and moblie devices. But game console actively introducing new IT technologies not available in the pc games and mobile games, still keeping a firm position in the relevent market. In this paper Home video game console's history, contemporary trends, and learn about trends in the company, New IT technologies applied to gaming was analyzed. Home video console market become the arena of New IT technologies according to the introduction of New IT technologies such as gesture recognition technology, speech recognition technology, media facade technology, virtual reality technology.

  • PDF

Expiration Date Notification System Based on YOLO and OCR algorithms for Visually Impaired Person (YOLO와 OCR 알고리즘에 기반한 시각 장애우를 위한 유통기한 알림 시스템)

  • Kim, Min-Soo;Moon, Mi-Kyung;Han, Chang-Hee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.6
    • /
    • pp.1329-1338
    • /
    • 2021
  • There are rarely effective methods to help visually impaired people when they want to know the expiration date of products excepted to only Braille. In this study, we developed an expiration date notification system based on YOLO and OCR for visually impaired people. The handicapped people can automatically know the expiration date of a specific product by using our system without the help of a caregiver, fast and accurately. The proposed system is worked by four different steps: (1) identification of a target product by scanning its barcode; (2) segmentation of an image area with the expiration date using YOLO; (3) classification of the expiration date by OCR: (4) notification of the expiration date by TTS. Our system showed an average classification accuracy of about 86.00% when blindfolded subjects used the proposed system in real-time. This result validates that the proposed system can be potentially used for visually impaired people.

A Review on Advanced Methodologies to Identify the Breast Cancer Classification using the Deep Learning Techniques

  • Bandaru, Satish Babu;Babu, G. Rama Mohan
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.4
    • /
    • pp.420-426
    • /
    • 2022
  • Breast cancer is among the cancers that may be healed as the disease diagnosed at early times before it is distributed through all the areas of the body. The Automatic Analysis of Diagnostic Tests (AAT) is an automated assistance for physicians that can deliver reliable findings to analyze the critically endangered diseases. Deep learning, a family of machine learning methods, has grown at an astonishing pace in recent years. It is used to search and render diagnoses in fields from banking to medicine to machine learning. We attempt to create a deep learning algorithm that can reliably diagnose the breast cancer in the mammogram. We want the algorithm to identify it as cancer, or this image is not cancer, allowing use of a full testing dataset of either strong clinical annotations in training data or the cancer status only, in which a few images of either cancers or noncancer were annotated. Even with this technique, the photographs would be annotated with the condition; an optional portion of the annotated image will then act as the mark. The final stage of the suggested system doesn't need any based labels to be accessible during model training. Furthermore, the results of the review process suggest that deep learning approaches have surpassed the extent of the level of state-of-of-the-the-the-art in tumor identification, feature extraction, and classification. in these three ways, the paper explains why learning algorithms were applied: train the network from scratch, transplanting certain deep learning concepts and constraints into a network, and (another way) reducing the amount of parameters in the trained nets, are two functions that help expand the scope of the networks. Researchers in economically developing countries have applied deep learning imaging devices to cancer detection; on the other hand, cancer chances have gone through the roof in Africa. Convolutional Neural Network (CNN) is a sort of deep learning that can aid you with a variety of other activities, such as speech recognition, image recognition, and classification. To accomplish this goal in this article, we will use CNN to categorize and identify breast cancer photographs from the available databases from the US Centers for Disease Control and Prevention.

Method of Automatically Generating Metadata through Audio Analysis of Video Content (영상 콘텐츠의 오디오 분석을 통한 메타데이터 자동 생성 방법)

  • Sung-Jung Young;Hyo-Gyeong Park;Yeon-Hwi You;Il-Young Moon
    • Journal of Advanced Navigation Technology
    • /
    • v.25 no.6
    • /
    • pp.557-561
    • /
    • 2021
  • A meatadata has become an essential element in order to recommend video content to users. However, it is passively generated by video content providers. In the paper, a method for automatically generating metadata was studied in the existing manual metadata input method. In addition to the method of extracting emotion tags in the previous study, a study was conducted on a method for automatically generating metadata for genre and country of production through movie audio. The genre was extracted from the audio spectrogram using the ResNet34 artificial neural network model, a transfer learning model, and the language of the speaker in the movie was detected through speech recognition. Through this, it was possible to confirm the possibility of automatically generating metadata through artificial intelligence.

Volume Control using Gesture Recognition System

  • Shreyansh Gupta;Samyak Barnwal
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.6
    • /
    • pp.161-170
    • /
    • 2024
  • With the technological advances, the humans have made so much progress in the ease of living and now incorporating the use of sight, motion, sound, speech etc. for various application and software controls. In this paper, we have explored the project in which gestures plays a very significant role in the project. The topic of gesture control which has been researched a lot and is just getting evolved every day. We see the usage of computer vision in this project. The main objective that we achieved in this project is controlling the computer settings with hand gestures using computer vision. In this project we are creating a module which acts a volume controlling program in which we use hand gestures to control the computer system volume. We have included the use of OpenCV. This module is used in the implementation of hand gestures in computer controls. The module in execution uses the web camera of the computer to record the images or videos and then processes them to find the needed information and then based on the input, performs the action on the volume settings if that computer. The program has the functionality of increasing and decreasing the volume of the computer. The setup needed for the program execution is a web camera to record the input images and videos which will be given by the user. The program will perform gesture recognition with the help of OpenCV and python and its libraries and them it will recognize or identify the specified human gestures and use them to perform or carry out the changes in the device setting. The objective is to adjust the volume of a computer device without the need for physical interaction using a mouse or keyboard. OpenCV, a widely utilized tool for image processing and computer vision applications in this domain, enjoys extensive popularity. The OpenCV community consists of over 47,000 individuals, and as of a survey conducted in 2020, the estimated number of downloads exceeds 18 million.