Search | Korea Science

Synthesis of Expressive Talking Heads from Speech with Recurrent Neural Network (RNN을 이용한 Expressive Talking Head from Speech의 합성)

Sakurai, Ryuhei;Shimba, Taiki;Yamazoe, Hirotake;Lee, Joo-Ho
- The Journal of Korea Robotics Society / v.13 no.1 / pp.16-25 / 2018
A talking head (TH) is an animation of a speaking face generated from text and voice input. In this paper, we propose a method that generates a TH with facial expression and intonation from speech input alone. Generating a TH from speech can be regarded as a regression problem from an acoustic feature sequence to a facial code sequence, where a facial code is a low-dimensional vector representation that can efficiently encode and decode a face image. This regression was modeled with a bidirectional RNN and trained on the SAVEE database, a database of frontal utterance face animations. The proposed method generates a TH with facial expression and intonation by using acoustic features such as MFCC, dynamic elements of MFCC, energy, and F0. In the experiments, the configuration that used BLSTM layers for the first and second layers of the bidirectional RNN predicted the facial code best. For the evaluation, a questionnaire survey was conducted with 62 persons who watched TH animations generated by the proposed method and the previous method. As a result, 77% of the respondents answered that the proposed method generated a TH that matches the speech well.
https://doi.org/10.7746/jkros.2018.13.1.016 KSCI

Attitude control in spacecraft orbit-raising using a reduced quaternion model

Yang, Yaguang
- Advances in aircraft and spacecraft science / v.1 no.4 / pp.427-441 / 2014
Orbit-raising is an important step in moving spacecraft from parking orbits into working orbits. Attitude control system design is crucial to the success of orbit-raising. Several textbooks have discussed this design, focusing mainly on traditional methods based on single-input single-output (SISO) transfer function models. These models are not good representations of many orbit-raising control systems, which have multiple thrusters, each of which affects the attitude defined by all outputs. Only one published article is known to use a more suitable multi-input multi-output (MIMO) Euler angle model in spacecraft orbit-raising attitude control system design. In this paper, a quaternion-based MIMO model for orbit-raising attitude control system design is proposed. The advantages of using a quaternion-based model for orbit-raising control system design are that (a) there is no need for mathematical transformations because attitude measurements are normally given as quaternions, (b) a quaternion-based model does not depend on rotational sequences, which reduces the chance of human error, and (c) the singular point of the reduced quaternion model is the farthest from the operational point where linearization is performed. We show that the performance of the quaternion model-based design is as good as that of the Euler angle model-based design for the orbit-raising problem.
https://doi.org/10.12989/aas.2014.1.4.427
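Point (c) of the abstract above can be illustrated numerically. The sketch below assumes that the "reduced quaternion" is the three-component vector part of a unit quaternion, with the scalar part recovered as q0 = sqrt(1 - q1^2 - q2^2 - q3^2); the abstract does not spell this out, and the function names are hypothetical. Under that assumption q0 reaches zero only at a rotation angle of pi, the largest possible rotation away from the zero-rotation point.

```python
import math


def reduced_quaternion(axis, angle):
    """Vector part (q1, q2, q3) of the unit quaternion for a rotation
    of `angle` radians about `axis` (assumed definition, see lead-in)."""
    norm = math.sqrt(sum(c * c for c in axis))
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    s = math.sin(angle / 2.0)
    return tuple(c / norm * s for c in axis)


def scalar_part(q):
    """Recover q0 = sqrt(1 - q1^2 - q2^2 - q3^2) from the vector part."""
    remainder = 1.0 - sum(c * c for c in q)
    if remainder < -1e-12:
        raise ValueError("vector part is longer than a unit quaternion allows")
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(remainder, 0.0))


# q0 is 1 at zero rotation and shrinks to 0 as the angle approaches pi.
for angle in (0.0, math.pi / 2.0, math.pi):
    q = reduced_quaternion((0.0, 0.0, 1.0), angle)
    print(f"angle={angle:.3f}  q0={scalar_part(q):.4f}")
```

Because q0 vanishes only at a half-turn, a linearization around zero rotation sits as far from that point as a rotation can be.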

A Prototype Development of Personal Low-frequency Stimulator with Characteristic Analysis (개인용 저주파 자극기의 특성분석 및 Prototype개발)

Lee, Gi-Song;Lee, Dong-Ha;Yu, Jae-Taek
- Proceedings of the KIEE Conference / 2003.11c / pp.349-352 / 2003
A personal low-frequency stimulator is a portable device for relieving a person's muscle pain. The stimulator generates combined low-frequency pulses that are applied to pads attached to the painful muscles. This paper reports the development of such a device together with its characteristic analyses. The major components of our stimulator are an MCU, a high-voltage generating circuit, a high-voltage switching circuit, an input switch part, and a display unit. The high-voltage generating circuit is designed around a boost converter circuit and allows the user to control the output voltage. The high-voltage switching circuit, controlled by the MCU, generates the output voltage applied to the pads. The input switch part is composed of power supply, intensity selection, mode selection, and memory switches. The display unit adopts a text LCD module to display the mode, intensity, output frequency, and user set-up time. Our safety circuit, designed to protect the human body from possible electric shock, slowly increases the output voltage to the selected output intensity. It continuously checks the output pulse shape and disables the output when dangerous pulses are detected. This paper also shows some experimental results.

A Study on the Correlation between Atypical Form Factor-based Smartphones and Display-dependent Authentication Methods (비정형 폼 팩터 기반 스마트폰과 디스플레이 의존형 사용자 인증기법의 상관관계 연구)

Choi, Dongmin
- Journal of Korea Multimedia Society / v.24 no.8 / pp.1076-1089 / 2021
Among the knowledge-based authentication methods currently used on smartphones, text- and graphic-based methods, such as PIN and pattern methods, use the display unit and its touch function for input/output of secret information. Recently released smartphone form factors are moving away from the conventional bar and slate types toward various other forms, because of changes in the material of the display unit used in existing smartphones and the increased flexibility of the display unit. However, as mentioned in the study of D. Choi [1], the structural change of the display unit may directly or indirectly affect authentication methods that use the display unit as the main input/output device for confidential information, resulting in unexpected security vulnerabilities. In this paper, we analyze the security vulnerabilities of current mobile user authentication methods when they are applied to atypical form factors. According to the analysis results, the existing display-dependent mobile user authentication methods do not appear to consider emerging security threats at all. Furthermore, they are easily affected by changes in the form factor of smartphones. Finally, we propose countermeasures for the security vulnerabilities expected when applying conventional authentication methods to atypical form factor-based smartphones.
https://doi.org/10.9717/kmms.2021.24.8.1076 KSCI

KI-HABS: Key Information Guided Hierarchical Abstractive Summarization

Zhang, Mengli;Zhou, Gang;Yu, Wanting;Liu, Wenfen
- KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4275-4291 / 2021
With the unprecedented growth of textual information on the Internet, an efficient automatic summarization system has become an urgent need. Recently, neural network models based on the encoder-decoder with an attention mechanism have demonstrated powerful capabilities in the sentence summarization task. However, for paragraph or longer document summarization, these models fail to mine the core information in the input text, which leads to information loss and repetitions. In this paper, we propose an abstractive document summarization method, denoted as KI-HABS, that applies guidance signals from key sentences to the encoder of a hierarchical encoder-decoder architecture. Specifically, we first train an extractor based on a hierarchical bidirectional GRU to extract key sentences from the input document. Then, we encode the key sentences into a key information representation at the sentence level. Finally, we adopt selective encoding strategies guided by the key information representation to filter source information, which establishes a connection between the key sentences and the document. We use the CNN/Daily Mail and Gigaword datasets to evaluate our model. The experimental results demonstrate that our method generates more informative and concise summaries, achieving better performance than the competitive models.
https://doi.org/10.3837/tiis.2021.12.001 KSCI

Fake News Detection on Social Media using Video Information: Focused on YouTube (영상정보를 활용한 소셜 미디어상에서의 가짜 뉴스 탐지: 유튜브를 중심으로)

Chang, Yoon Ho;Choi, Byoung Gu
- The Journal of Information Systems / v.32 no.2 / pp.87-108 / 2023
Purpose: The main purpose of this study is to improve fake news detection performance by using video information to overcome the limitations of extant text- and image-oriented studies that do not reflect the latest news consumption trend. Design/methodology/approach: This study collected video clips and related information, including news scripts, speakers' facial expressions, and video metadata, from YouTube to develop a fake news detection model. Based on the collected data, seven combinations of related information (i.e., scripts; video metadata; facial expression; scripts and video metadata; scripts and facial expression; video metadata and facial expression; and scripts, video metadata, and facial expression) were used as input for training and evaluation. The input data was analyzed using six models, such as a support vector machine and a deep neural network. The area under the curve (AUC) was used to evaluate the performance of the classification models. Findings: The results showed that the AUC and accuracy values of the three-feature combination (scripts, video metadata, and facial expression) were the highest in the logistic regression, naïve Bayes, and deep neural network models. This result implies that fake news detection can be improved by using video information (video metadata and facial expression). The sample size of this study was relatively small. The generalizability of the results would be enhanced with a larger sample size.
https://doi.org/10.5859/KAIS.2023.32.2.87
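Two pieces of arithmetic in the abstract above can be checked with a short sketch: three information types yield exactly seven non-empty combinations, and AUC can be computed directly from labels and scores. The pairwise (Mann-Whitney) formulation below is one standard way to compute AUC and is not necessarily the procedure the authors used; the labels and scores are made-up illustration data, not results from the study.

```python
from itertools import combinations

FEATURES = ("scripts", "video metadata", "facial expression")


def feature_combinations(features=FEATURES):
    """All non-empty subsets of the feature groups (2**3 - 1 = 7 for three)."""
    return [
        combo
        for size in range(1, len(features) + 1)
        for combo in combinations(features, size)
    ]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = fake, 0 = real; scores are classifier outputs.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.1, 0.4, 0.6]
print(len(feature_combinations()), roc_auc(example_labels, example_scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small sample like the one the abstract describes; a rank-based computation would scale better.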

Development of Dental Consultation Chatbot using Retrieval Augmented LLM (검색 증강 LLM을 이용한 치과 상담용 챗봇 개발)

Jongjin Park
- The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.2 / pp.87-92 / 2024
In this paper, a retrieval-augmented generation (RAG) system was implemented using an existing large language model (LLM) and the LangChain library to develop a dental consultation chatbot. For this purpose, we collected content from the webpage bulletin boards of domestic dental university hospitals and constructed consultation data with the advice and supervision of dental specialists. To divide the input consultation data into appropriately sized pieces, the chunk size and the size of the overlapping text in each chunk were set to 1001 and 100, respectively. In the simulation, the retrieval-augmented LLM searched for and output the consultation content most similar to the user input. It was confirmed that the built chatbot could improve the accessibility of dental consultation and the accuracy of consultation content.
https://doi.org/10.7236/JIIBC.2024.24.2.87
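The chunking step described in the abstract above can be sketched as a plain sliding window. This is a simplified stand-in and not the LangChain splitter the authors used: it assumes that both sizes are counted in characters (the abstract does not state the unit) and it ignores separators such as sentence or paragraph boundaries. The function name is hypothetical.

```python
def split_with_overlap(text, chunk_size=1001, overlap=100):
    """Split `text` into windows of at most `chunk_size` characters.

    Each window starts `chunk_size - overlap` characters after the
    previous one, so consecutive full windows share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


# Synthetic 2500-character input, used only to show the window layout.
sample = "".join(chr(65 + i % 26) for i in range(2500))
for index, chunk in enumerate(split_with_overlap(sample)):
    print(index, len(chunk))
```

The overlap keeps text that straddles a boundary available in both neighbouring chunks, at the cost of indexing some characters twice.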

Bankruptcy Prediction Modeling Using Qualitative Information Based on Big Data Analytics (빅데이터 기반의 정성 정보를 활용한 부도 예측 모형 구축)

Jo, Nam-ok;Shin, Kyung-shik
- Journal of Intelligence and Information Systems / v.22 no.2 / pp.33-56 / 2016
Many researchers have focused on developing bankruptcy prediction models using modeling techniques, such as statistical methods including multiple discriminant analysis (MDA) and logit analysis or artificial intelligence techniques including artificial neural networks (ANN), decision trees, and support vector machines (SVM), to secure enhanced performance. Most of the bankruptcy prediction models in academic studies have used financial ratios as main input variables. The bankruptcy of a firm is associated with the firm's financial state and the external economic situation. However, the inclusion of qualitative information, such as the economic atmosphere, has not been actively discussed despite the fact that exploiting only financial ratios has some drawbacks. Accounting information, such as financial ratios, is based on past data, and it is usually determined one year before bankruptcy. Thus, a time lag exists between the point of closing financial statements and the point of credit evaluation. In addition, financial ratios do not contain environmental factors, such as external economic situations. Therefore, using only financial ratios may be insufficient in constructing a bankruptcy prediction model, because they essentially reflect past corporate internal accounting information while neglecting recent information. Thus, qualitative information must be added to the conventional bankruptcy prediction model to supplement accounting information. Due to the lack of an analytic mechanism for obtaining and processing qualitative information from various information sources, previous studies have made only limited use of qualitative information. However, recently, big data analytics, such as text mining techniques, have been drawing much attention in academia and industry, with an increasing amount of unstructured text data available on the web. A few previous studies have sought to adopt big data analytics in business prediction modeling.
Nevertheless, the use of qualitative information on the web for business prediction modeling is still deemed to be in its early stages, restricted to limited applications, such as stock prediction and movie revenue prediction. Thus, it is necessary to apply big data analytics techniques, such as text mining, to various business prediction problems, including credit risk evaluation. Analytic methods are required for processing qualitative information represented in unstructured text form due to the complexity of managing and processing unstructured text data. This study proposes a bankruptcy prediction model for Korean small- and medium-sized construction firms using both quantitative information, such as financial ratios, and qualitative information acquired from economic news articles. The performance of the proposed method depends on how well qualitative information is transformed into quantitative information that is suitable for incorporation into the bankruptcy prediction model. We employ big data analytics techniques, especially text mining, as a mechanism for processing qualitative information. The sentiment index is provided at the industry level by extracting sentiment from a large amount of text data to quantify the external economic atmosphere represented in the media. The proposed method involves keyword-based sentiment analysis using a domain-specific sentiment lexicon to extract sentiment from economic news articles. The generated sentiment lexicon is designed to represent sentiment for the construction business by considering the relationship between the occurring term and the actual situation with respect to the economic condition of the industry rather than the inherent semantics of the term. The experimental results showed that incorporating qualitative information based on big data analytics into the traditional bankruptcy prediction model based on accounting information is effective for enhancing the predictive performance.
The sentiment variable extracted from economic news articles had an impact on corporate bankruptcy. In particular, a negative sentiment variable improved the accuracy of corporate bankruptcy prediction because the corporate bankruptcy of construction firms is sensitive to poor economic conditions. The bankruptcy prediction model using qualitative information based on big data analytics contributes to the field, in that it reflects not only relatively recent information but also environmental factors, such as external economic conditions.
https://doi.org/10.13088/jiis.2016.22.2.033 KSCI
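The keyword-based sentiment step in the abstract above can be sketched as a lexicon lookup over news text. Everything specific here is an assumption made for illustration: the lexicon entries, the English tokenization, and the index formula (positive hits minus negative hits, divided by all hits) are hypothetical and are not taken from the paper, which builds its own domain-specific lexicon for the construction industry.

```python
import re

# Hypothetical domain lexicon: term -> polarity (+1 positive, -1 negative).
EXAMPLE_LEXICON = {
    "recovery": 1,
    "growth": 1,
    "slump": -1,
    "default": -1,
    "unsold": -1,
}


def sentiment_index(articles, lexicon=EXAMPLE_LEXICON):
    """Aggregate lexicon hits over a batch of articles into one score.

    Returns a value in [-1, 1]; 0.0 when no lexicon term occurs.
    """
    positive = 0
    negative = 0
    for text in articles:
        for token in re.findall(r"[a-z]+", text.lower()):
            polarity = lexicon.get(token, 0)
            if polarity > 0:
                positive += 1
            elif polarity < 0:
                negative += 1
    hits = positive + negative
    if hits == 0:
        return 0.0
    return (positive - negative) / hits


# Made-up headlines, used only to exercise the function.
headlines = [
    "Housing slump deepens as unsold units rise",
    "Signs of recovery in construction orders",
]
print(sentiment_index(headlines))
```

Scoring one batch of articles per period would give an industry-level series that can sit next to financial ratios as an extra model input, which is the role the abstract assigns to its sentiment variable.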

A Hangul Script Matching Algorithm for PDA (PDA상에서의 한글 필기체 매칭 알고리즘)

Cho, Mi-Gyung;Cho, Hwan-Gue
- Journal of KIISE:Software and Applications / v.29 no.10 / pp.684-693 / 2002
Electronic ink is data stored in the form of handwritten text or script, without conversion into ASCII by handwriting recognition, on pen-based computers and personal digital assistants (PDAs) to support natural and convenient data input. One of the most important issues in using electronic ink is searching it. We proposed and implemented a script matching algorithm for electronic ink. The proposed matching algorithm separates the input stroke into a set of primitive strokes using the curvature of the stroke curve. After determining the type of each separated stroke, it produces a stroke feature vector. It then calculates the distance between the stroke feature vector of the input strokes and that of the strokes in the database using the dynamic programming technique. We conducted various experiments, and our algorithm showed a high matching rate of over 97.7% for Korean script only and 94% for data mixing Korean with Chinese characters.
KSCI
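The dynamic programming comparison mentioned in the abstract above can be illustrated with a classic edit distance over sequences of primitive-stroke types. This is a generic stand-in, not the authors' distance function: the stroke type labels and the unit costs are hypothetical, and the paper's feature vectors may carry more than a type label.

```python
def stroke_edit_distance(query, stored, substitution_cost=1, gap_cost=1):
    """Edit distance between two sequences of primitive-stroke types.

    dist[i][j] is the cheapest way to turn the first i items of `query`
    into the first j items of `stored`.
    """
    rows, cols = len(query) + 1, len(stored) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i * gap_cost  # delete every remaining query stroke
    for j in range(cols):
        dist[0][j] = j * gap_cost  # insert every remaining stored stroke
    for i in range(1, rows):
        for j in range(1, cols):
            same = query[i - 1] == stored[j - 1]
            dist[i][j] = min(
                dist[i - 1][j - 1] + (0 if same else substitution_cost),
                dist[i - 1][j] + gap_cost,
                dist[i][j - 1] + gap_cost,
            )
    return dist[-1][-1]


# Hypothetical primitive-stroke labels for a query and a stored script.
query_strokes = ["line", "arc", "line"]
stored_strokes = ["line", "line"]
print(stroke_edit_distance(query_strokes, stored_strokes))
```

A search over stored ink would compute this distance against each database entry and rank the entries by ascending distance.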

Development of Gesture-allowed Electronic Ink Editor (제스쳐 허용 전자 잉크 에디터의 개발)

조미경;오암석
- Journal of Korea Multimedia Society / v.6 no.6 / pp.1054-1061 / 2003
Electronic ink is multimedia data that has emerged from the development of pen-based computers, such as PDAs, whose major input device is a stylus pen. With the recent development and spread of pen-based mobile computers, the need for data processing techniques for electronic ink has increased. This paper studies techniques for developing a gesture-allowed text editor in the electronic ink domain. Gestures and electronic ink data are promising features of pen-based user interfaces, but they have not yet been fully exploited. A new gesture recognition algorithm to identify pen gestures and a segmentation method for electronic ink to execute gesture commands are proposed. An electronic ink editor called GesEdit was developed using the proposed algorithms. The gesture recognition algorithm is based on eight features of input strokes. The convex hull and input time are used to segment electronic ink data into Gesture Component (GC) units. A variety of experiments with ten people showed that the average recognition rate reached 99.6% for nine gestures.

Search Result 358, Processing Time 0.024 seconds
