• Title/Summary/Keyword: voice image

An Efficient 2-dimensional Addressing Mode for Image Processor (영상처리용 프로세서를 위한 효율적인 이차원 어드레스 지정 기법)

  • Go, Yun-Ho;Yun, Byeong-Ju;Kim, Seong-Dae
    • Journal of the Institute of Electronics Engineers of Korea SP, v.38 no.5, pp.486-497, 2001
  • This paper proposes a new addressing mode that lets a programmable image processor execute image-processing algorithms efficiently. Conventional addressing modes suit one-dimensional data such as voice, whereas the proposed mode takes the two-dimensional nature of image data into account. The proposed two-dimensional addressing instruction takes two operands to specify a pixel and requires no change to the memory architecture. It has the following advantages. First, it combines the several instructions otherwise needed to load pixel data from external memory into a register, which reduces code size and helps meet the high-performance and low-power requirements of an image processor. Second, it exploits the inherent two-dimensional structure of image data and gives assembly programmers a user-friendly instruction. The addressing mode is applicable to DSPs, media processors, graphics devices, and similar hardware. The paper presents both the concept of the two-dimensional addressing mode and an efficient hardware implementation of it.

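
To make the idea of a two-operand pixel load concrete, the following is a minimal sketch of the address arithmetic that such an instruction would hide from the programmer. It assumes a row-major image layout; the names `linear_address`, `load_pixel`, `base`, and `stride` are hypothetical and are not taken from the paper, which describes a hardware implementation rather than software.

```python
def linear_address(base, stride, x, y, bytes_per_pixel=1):
    """Map a pixel coordinate (x, y) to an address in row-major memory.

    `stride` is the number of pixels per image row (an assumed layout).
    """
    return base + (y * stride + x) * bytes_per_pixel


def load_pixel(memory, base, stride, x, y):
    """Emulate a single two-operand load: the caller supplies only x and y."""
    return memory[linear_address(base, stride, x, y)]


# A 4x3 image stored row by row, starting at offset 2 in a flat memory.
width, height, base = 4, 3, 2
memory = [0, 0] + [10 * row + col for row in range(height) for col in range(width)]

pixel = load_pixel(memory, base, width, x=3, y=1)  # row 1, column 3
```

With one-dimensional addressing the multiply and add would appear as separate instructions before every load; folding them into the addressing mode is what shortens the code.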

Hand Biometric Information Recognition System of Mobile Phone Image for Mobile Security (모바일 보안을 위한 모바일 폰 영상의 손 생체 정보 인식 시스템)

  • Hong, Kyungho;Jung, Eunhwa
    • Journal of Digital Convergence, v.12 no.4, pp.319-326, 2014
  • A growing number of mobile security users have experienced authentication failures after forgetting a password, a user name, or the answer to a knowledge-based question, and they prefer biometric information such as hand geometry, fingerprints, and voice for personal identification and authentication. Biometric verification for mobile security therefore provides assurance to both the customer and the seller on the Internet. This study focuses on a hand biometric recognition system for personal identification and authentication that uses the shape of the hand, palm features, and the lengths and widths of the fingers, taken from photographs captured with mobile phones such as the iPhone 4 and Galaxy S2. The system processes input images in six steps: image acquisition, preprocessing, noise removal, standard hand feature extraction, individual feature pattern extraction, and hand biometric recognition for personal identification and authentication. The validity of the proposed system is demonstrated by a successful recognition rate of 93.5% on 250 hand shape and palm images from 50 subjects.

A Study on Removing Impulse Noise using Modified Adaptive Switching Median Filter (변형된 적응 스위칭 메디안 필터를 이용한 임펄스 잡음제거에 관한 연구)

  • Gao, Yinyu;Kim, Nam-Ho
    • Journal of the Korea Institute of Information and Communication Engineering, v.15 no.11, pp.2474-2479, 2011
  • As society moves rapidly toward an advanced digital information age, multimedia communication services for acquiring, transmitting, and storing image data as well as voice are being commercialized. However, image data is frequently corrupted by various kinds of noise during processing, so research on noise removal is ongoing. To remove impulse noise, this paper proposes a modified adaptive switching median filter that consists of two stages: noise detection and noise removal. The algorithm processes only the pixels detected as noise and replaces them with the filter output, so it both removes noise and preserves edge information. The proposed method is compared with conventional algorithms using PSNR (peak signal-to-noise ratio) as the measure of improvement.
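
The two-stage structure described above (detect, then replace only the detected pixels) can be sketched as follows. This is a generic switching median filter, not the authors' modified algorithm: the detection rule (pixels at the extreme values 0 or 255 are treated as impulses), the window-growing rule, and the names `switching_median` and `psnr` are all assumptions made for illustration.

```python
import math
from statistics import median


def switching_median(img, low=0, high=255, max_radius=3):
    """Replace only pixels flagged as impulses with a local median."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # Stage 1, detection (assumed rule): impulses sit at the extremes.
            if img[y][x] not in (low, high):
                continue  # uncorrupted pixels are left untouched
            # Stage 2, removal: grow the window until it holds clean pixels.
            for radius in range(1, max_radius + 1):
                window = [
                    img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))
                    if img[j][i] not in (low, high)
                ]
                if window:
                    out[y][x] = median(window)
                    break
    return out


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-sized images."""
    n = len(reference) * len(reference[0])
    mse = sum(
        (a - b) ** 2
        for row_a, row_b in zip(reference, test)
        for a, b in zip(row_a, row_b)
    ) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


clean = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
noisy = [[100, 255, 100], [100, 100, 0], [100, 100, 100]]
restored = switching_median(noisy)
```

Because uncorrupted pixels pass through unchanged, edges are not blurred the way they would be by a plain median filter applied to every pixel; comparing `psnr(clean, noisy)` with `psnr(clean, restored)` gives the kind of improvement measure the abstract refers to.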

Design of a User authentication Protocol Using Face Information (얼굴정보를 이용한 사용자 인증 프로토콜 설계)

  • 지은미
    • Journal of the Korea Computer Industry Society, v.5 no.1, pp.157-166, 2004
  • Substantial research has been done on biometric recognition methods as well as on authentication technology. Biometric recognition uses personal, unique information such as fingerprints, voice, face, iris, hand geometry, and vein patterns. A face image system reduces user resistance because it is contactless, and because it operates through a PC camera attached to a computer it is economical as well as user-friendly. On the other hand, a face image system is very sensitive to illumination, hairstyle, and appearance, and so easily produces recognition errors; a stable authentication system that is less sensitive to changes in appearance and lighting is therefore needed. This study proposes a user authentication protocol that provides confidentiality and integrity and that aims for the lowest Equal Error Rate in order to minimize incorrect authentication of the user.

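
The Equal Error Rate mentioned in the abstract is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below shows one simple way to estimate it from match scores; the score lists are made-up illustrative values, and the function names are hypothetical rather than part of the proposed protocol.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Scan the observed scores for the threshold where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


# Illustrative similarity scores (higher means a better match).
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.7, 0.5, 0.3, 0.2, 0.1]
threshold, eer = equal_error_rate(genuine_scores, impostor_scores)
```

A lower EER means the two error curves cross at a lower error level, which is why it is a common single-number summary for comparing authentication systems.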

Multiple LCD System Development of daisy-chain Method using LVDS (LVDS를 이용한 daisy-chain 방식의 다중 LCD 시스템 개발)

  • Kim, Jae-Chul
    • Journal of the Korea Institute of Information and Communication Engineering, v.16 no.12, pp.2747-2754, 2012
  • This paper describes the development of a multiple-LCD system with additional functions that maximize the use of PC content. The system consists of a host LCD and slave LCDs. The host LCD decodes and outputs the image and voice of NTSC, PAL, and SECAM signals, and converts the decoded signals into LVDS signals before transmitting them to the slave LCD stage. It also includes CF memory and USB memory for displaying multimedia data. The slave LCD, which has no tuner or memory section, cannot receive TV signals or play video on its own; it only receives LVDS image signals and displays them on an LCD panel. The system is competitive in several respects: because the control section is simplified, its failure rate, price, and display power are relatively low, and the host LCD can control every slave LCD in terms of channel, volume, and video output.

Efficiency Analysis of Schisandra Tea Using Image & Acoustic Signal Processing (영상과 음성 처리를 이용한 오미자차의 효능 분석)

  • Kim, Bong-Hyun;Cho, Dong-Uk;Han, Kil-Sung;J.Bae, Young-Lae
    • Journal of the Korea Academia-Industrial cooperation Society, v.12 no.7, pp.2975-2981, 2011
  • We propose an efficacy analysis method to test whether Schisandra tea benefits the health of internal organs and, if so, which organ benefits most. First, the color change in the right cheek region of face images taken before and after drinking Schisandra tea is analyzed using image processing techniques. In addition, acoustic analysis experiments on both vocal cord vibration and the intensity of the acoustic signal are performed to investigate the tea's effectiveness further. The results suggest that lung function improves after drinking Schisandra tea (omija-cha): vocal cord vibration and voice energy intensity changed, and a steady increase in the b value of the right cheek region was observed.

Improved Transformer Model for Multimodal Fashion Recommendation Conversation System (멀티모달 패션 추천 대화 시스템을 위한 개선된 트랜스포머 모델)

  • Park, Yeong Joon;Jo, Byeong Cheol;Lee, Kyoung Uk;Kim, Kyung Sun
    • The Journal of the Korea Contents Association, v.22 no.1, pp.138-147, 2022
  • Chatbots have recently been applied in various fields with good results, and e-commerce platforms are making many attempts to use them in shopping mall product recommendation services. This paper addresses a conversation system that recommends the fashion a user wants, based on the conversation between the user and the system and on fashion image information, using a transformer model, which currently performs well in various AI fields such as natural language processing, voice recognition, and image recognition. We propose an improved multimodal transformer model that raises recommendation accuracy by using dialogue (text) and fashion (image) information together for data preprocessing and data representation. We also propose a method that improves accuracy by analyzing and refining the data. The proposed system achieves a recommendation accuracy of 0.6563 WKT (Weighted Kendall's tau), an improvement of 0.3191 WKT over the existing system's 0.3372 WKT.

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
    • Journal of Service Research and Studies, v.14 no.1, pp.13-26, 2024
  • Multi-modal generation is the process of generating results based on a variety of information, such as text, images, and audio. With the rapid development of AI technology, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition to take a description of a person and generate a montage image. While existing montage generation technology is based on the appearance of Westerners, the system developed in this paper trains a model on Korean facial features. It can therefore create more accurate and effective Korean montage images from multi-modal voice and text input specific to Korean. Since the developed montage generation app can be used to produce a draft montage, it can dramatically reduce the manual labor of existing montage production personnel. For this purpose, we used persona-based virtual person montage data provided by the AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform that aims to provide a one-stop service by building the training data needed to develop AI technology and services. The image generation system was implemented using VQGAN, a deep learning model for generating high-resolution images, and KoDALLE, a Korean-based image generation model. The trained model creates a montage image of a face that is very similar to the one described by voice and text. To verify the practicality of the app, 10 testers used it, and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal detection, to describe and visualize facial features.

An Advanced User-friendly Wireless Smart System for Vehicle Safety Monitoring and Accident Prevention (차량 안전 모니터링 및 사고 예방을 위한 친사용자 환경의 첨단 무선 스마트 시스템)

  • Oh, Se-Bin;Chung, Yeon-Ho;Kim, Jong-Jin
    • Journal of the Korea Institute of Information and Communication Engineering, v.16 no.9, pp.1898-1905, 2012
  • This paper presents an On-board Smart Device (OSD) for moving vehicles, based on the integration of Android-based devices and a Micro-control Unit (MCU). The MCU is used for the acquisition and transmission of various vehicle-borne data. The OSD has three functions: Record, Report, and Alarm (RRA). Based on these RRA functions, the OSD is a safety- and convenience-oriented smart device that provides alert services such as accident reporting and rescue, as well as alarms for the status of the vehicle. In addition, a voice-activated interface is provided for the convenience of users. Vehicle data can also be uploaded to a remote server for further access and data manipulation. Unlike conventional black boxes, therefore, the OSD serves as a user-friendly smart device for vehicle safety: it stores monitoring images captured while driving, collects vehicle data, reports accidents, and enables subsequent rescue operations. The OSD can thus be considered an essential safety device equipped with comprehensive wireless data service, image transfer, and a voice-activated interface.

Two Flow Control Techniques for Teleconferencing over the Internet (인터넷상에서 원격회의를 위한 두 가지 흐름 제어 기법)

  • Na, Seung-Gu;Go, Min-Su;An, Jong-Seok
    • Journal of KIISE:Computer Systems and Theory, v.26 no.8, pp.975-983, 1999
  • As networks get faster and techniques for handling multimedia data mature, many multimedia applications have appeared on the Internet. These applications, however, have not spread as quickly as expected because the image and voice quality delivered to receivers is poor. The poor quality is mainly due to the fact that the current Internet cannot carry data as quickly and reliably as real-time applications require. Much research aims to improve quality without modifying the internal structure of the Internet; one approach is end-to-end flow control, which adapts the sending rate of multicast traffic to the dynamically varying state of the Internet. This paper proposes two flow-control techniques that improve the performance of two conventional techniques, IVS and RLM. In IVS, the sender adjusts its sending rate statically, based on the periodically estimated network state. Unlike IVS, in which the sender produces a single data stream, an RLM sender transmits several data streams generated by layered coding, and each receiver selects data streams according to its own network state. The more data streams a receiver receives, the better the image or voice quality it can produce. Both techniques can degrade performance, however: IVS increases its sending rate statically, and RLM adds one more data stream at an arbitrary time regardless of the network state. We introduce two new techniques, TCP-like IVS and Adaptive RLM: TCP-like IVS determines the sending rate dynamically, and Adaptive RLM selects an appropriate time to add one more data stream. Simulation experiments over various network topologies show that the two techniques achieve higher bandwidth utilization and 10-20% less packet loss than the conventional techniques.
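
The abstract does not spell out the update rule behind TCP-like IVS. As a rough illustration of what "TCP-like" rate adaptation usually means, the sketch below uses additive increase and multiplicative decrease (AIMD) driven by reported packet loss. The function name `next_rate`, every parameter value, and the loss threshold are assumptions for illustration, not figures from the paper.

```python
def next_rate(rate, loss_ratio, *, loss_threshold=0.05, increase=10.0,
              decrease=0.5, min_rate=10.0, max_rate=1000.0):
    """Return the next sending rate (kbit/s) from the latest loss report.

    AIMD as used by TCP congestion control (assumed here, not confirmed
    by the abstract): back off sharply on congestion, probe gently otherwise.
    """
    if loss_ratio > loss_threshold:
        rate *= decrease   # multiplicative decrease when receivers report loss
    else:
        rate += increase   # additive increase while the path looks clear
    return max(min_rate, min(max_rate, rate))


rate = 100.0
history = []
for loss in [0.0, 0.0, 0.10, 0.0]:  # loss ratios from successive reports
    rate = next_rate(rate, loss)
    history.append(rate)
```

The contrast with a static scheme is that the step taken depends on the observed network state at each report, so the sender backs off quickly under congestion and recovers gradually afterwards.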