• Title/Summary/Keyword: Voice and Image Recognition

Search results: 74 (processing time: 0.041 seconds)

A study on Iris Recognition using Wavelet Transformation and Nonlinear Function

  • Hur, Jung-Youn;Truong, Le Xuan
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2004.10a
    • /
    • pp.553-559
    • /
    • 2004
  • In today's security industry, personal identification is increasingly based on biometrics. Biometric identification is performed by measuring and comparing physiological and behavioral characteristics; biometrics used for recognition include voice dynamics, signature dynamics, hand geometry, fingerprint, iris, etc. The iris can serve as a kind of living passport or living password, and iris recognition is one of the most reliable biometric recognition systems. It can be applied to client/server systems such as electronic commerce and electronic banking, as well as to stand-alone systems, networks, ATMs, etc. This paper proposes a new algorithm that uses a nonlinear function in the recognition process. An algorithm is proposed to localize the iris in the image received from the iris input camera on the client. First, the algorithm determines the center of the pupil; second, it determines the outer boundary of the iris and the pupillary boundary. The localized iris area is transformed into polar coordinates. After performing the wavelet transformation three times, normalization is done using a sigmoid function. The binarization process converts the normalized pixel values (0 to 255) into binary values by comparing pairs of adjacent pixels. The binary code of the iris is transmitted to the server over the network. On the server, the comparison process compares the binary value of the presented iris to the reference values in the database. Recognition or rejection depends on the value of the Hamming distance. After matching the binary value of the presented iris against the database stored on the server, the result is transmitted back to the client.

  • PDF
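The recognition-or-rejection step described in the abstract hinges on the Hamming distance between the presented and stored binary iris codes. A minimal sketch in Python, assuming a normalized distance and an illustrative acceptance threshold of 0.32 (the paper does not state its threshold):

```python
def hamming_distance(code_a, code_b):
    """Fraction of differing bits between two equal-length binary iris codes."""
    if len(code_a) != len(code_b):
        raise ValueError("iris codes must have equal length")
    diff = sum(a != b for a, b in zip(code_a, code_b))
    return diff / len(code_a)

def match_iris(presented, reference, threshold=0.32):
    """Accept the presented iris if its normalized Hamming distance
    to the stored reference code falls below the (assumed) threshold."""
    return hamming_distance(presented, reference) < threshold

# Example: codes differing in 1 of 8 bits -> distance 0.125, accepted.
ref = [1, 0, 1, 1, 0, 0, 1, 0]
probe = [1, 0, 1, 1, 0, 1, 1, 0]
```

In practice the codes are much longer than eight bits; the normalized distance makes the threshold independent of code length.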

An Intelligence Embedding Quadruped Pet Robot with Sensor Fusion (센서 퓨전을 통한 인공지능 4족 보행 애완용 로봇)

  • Lee Lae-Kyoung;Park Soo-Min;Kim Hyung-Chul;Kwon Yong-Kwan;Kang Suk-Hee;Choi Byoung-Wook
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.11 no.4
    • /
    • pp.314-321
    • /
    • 2005
  • In this paper, an intelligence-embedding quadruped pet robot is described. It has 15 degrees of freedom and carries various sensors, such as a CMOS image sensor, voice recognition and sound localization, an inclinometer, a thermistor, a real-time clock, tactile touch sensors, and PIR and IR sensors, to allow owners to interact with the pet robot according to human intention as well as through the original features of pet animals. The architecture is flexible and adopts various embedded processors for handling sensors to provide a modular structure. The pet robot can also be used for additional purposes such as security, gaming, visual tracking, and as a research platform. It can generate various actions and behaviors, and voice or music files can be downloaded to maintain a close relationship with users. With cost-effective sensors, the pet robot is able to find its recharging station and recharge itself when its battery runs low. To facilitate programming of the robot, several development environments are supported. The developed system is therefore a low-cost programmable entertainment robot platform.

The Character Recognition System of Mobile Camera Based Image (모바일 이미지 기반의 문자인식 시스템)

  • Park, Young-Hyun;Lee, Hyung-Jin;Baek, Joong-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.5
    • /
    • pp.1677-1684
    • /
    • 2010
  • Recently, due to the development of mobile phones and the spread of smartphones, many contents have been developed. In particular, since small cameras are equipped in mobile devices, people are interested in image-based content development, and it has become an important part of practical use. Among such applications, a character recognition system can be widely used in blind-people guidance systems, automatic robot navigation systems, automatic video retrieval and indexing systems, and automatic text translation systems. Therefore, this paper proposes a system that extracts text areas from natural images captured by a smartphone camera; the individual characters are then recognized and the result is output as voice. Text areas are extracted using the Adaboost algorithm, and individual characters are recognized using an error back-propagation neural network.
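The recognition stage pairs Adaboost-based text detection with an error back-propagation neural network. The following is a toy sketch of one such network trained by classic back-propagation on squared error; the layer sizes, learning rate, and sample features are all illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy network: 4 input features per character region, 3 hidden units,
# 2 output classes (all sizes chosen for illustration only).
W1 = rng.normal(size=(4, 3)) * 0.5
W2 = rng.normal(size=(3, 2)) * 0.5

def forward(x):
    h = sigmoid(x @ W1)   # hidden activations
    y = sigmoid(h @ W2)   # class scores
    return h, y

def backprop_step(x, target, lr=0.5):
    """One gradient-descent update of the squared error, as in classic
    error back-propagation."""
    global W1, W2
    h, y = forward(x)
    err_out = (y - target) * y * (1 - y)      # output-layer delta
    err_hid = (err_out @ W2.T) * h * (1 - h)  # hidden-layer delta
    W2 -= lr * np.outer(h, err_out)
    W1 -= lr * np.outer(x, err_hid)
    return float(((y - target) ** 2).sum())

x = np.array([0.2, 0.9, 0.1, 0.7])  # hypothetical features of one glyph
t = np.array([1.0, 0.0])            # one-hot target class
losses = [backprop_step(x, t) for _ in range(200)]
```

Repeated updates drive the squared error down on the training sample, which is the behavior the back-propagation rule guarantees for a small enough learning rate.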

Improved Transformer Model for Multimodal Fashion Recommendation Conversation System (멀티모달 패션 추천 대화 시스템을 위한 개선된 트랜스포머 모델)

  • Park, Yeong Joon;Jo, Byeong Cheol;Lee, Kyoung Uk;Kim, Kyung Sun
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.1
    • /
    • pp.138-147
    • /
    • 2022
  • Recently, chatbots have been applied in various fields with good results, and many attempts to use chatbots in shopping-mall product recommendation services are being made on e-commerce platforms. In this paper, for a conversation system that recommends the fashion a user wants based on the dialogue between the user and the system together with fashion image information, we build on the transformer model, which currently performs well in various AI fields such as natural language processing, voice recognition, and image recognition. We propose a multimodal improved transformer model that increases recommendation accuracy by using dialogue (text) and fashion (image) information together in data preprocessing and data representation, and we also propose a method to improve accuracy through data improvement based on data analysis. The proposed system achieves a recommendation accuracy of 0.6563 WKT (weighted Kendall's tau), a significant improvement of 0.3191 WKT over the existing system's 0.3372 WKT.
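The WKT scores above measure how well the system's recommendation ranking agrees with the ground-truth ranking. As a rough illustration, here is plain (unweighted) Kendall's tau computed from concordant and discordant pairs; note the paper scores with a weighted variant (WKT) that emphasizes agreement at the top ranks:

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Plain Kendall's tau: (concordant - discordant) / total pairs.
    This unweighted sketch only illustrates rank-agreement scoring;
    the paper's WKT additionally weights pairs by rank importance."""
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must have equal length")
    conc = disc = 0
    for i, j in combinations(range(len(rank_a)), 2):
        s = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    n = len(rank_a)
    return (conc - disc) / (n * (n - 1) / 2)

truth = [1, 2, 3, 4, 5]
close = [1, 2, 3, 5, 4]   # one swapped pair -> tau 0.8
```

A perfect ranking scores 1.0, a fully reversed one -1.0, so higher values mean the recommendations order items more like the ground truth.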

Implementation of a Refusable Human-Robot Interaction Task with Humanoid Robot by Connecting Soar and ROS (Soar (State Operator and Result)와 ROS 연계를 통해 거절가능 HRI 태스크의 휴머노이드로봇 구현)

  • Dang, Chien Van;Tran, Tin Trung;Pham, Trung Xuan;Gil, Ki-Jong;Shin, Yong-Bin;Kim, Jong-Wook
    • The Journal of Korea Robotics Society
    • /
    • v.12 no.1
    • /
    • pp.55-64
    • /
    • 2017
  • This paper proposes a combination of the cognitive agent architecture Soar (State, operator, and result) and ROS (Robot Operating System), which can serve as a basic framework for a robot agent to interact with and cope with its environment more intelligently and appropriately. The proposed Soar-ROS human-robot interaction (HRI) agent, implemented on a humanoid robot, understands a set of human commands by voice recognition and chooses how to react to each command according to the symbol detected by image recognition. The robotic agent is allowed to refuse to follow an inappropriate command such as "go" after it has seen the symbol 'X', which represents that an abnormal or immoral situation has occurred. This simple but meaningful HRI task is successfully demonstrated on the proposed Soar-ROS platform with a small humanoid robot, which implies that extending the present hybrid platform to an artificial moral agent is possible.
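The refusal behavior described above reduces, at its core, to a rule that blocks an otherwise valid command once a forbidding symbol has been perceived. A hypothetical Python sketch of that rule (in the actual system this logic is encoded as Soar productions operating over ROS topics, not plain Python):

```python
def decide_action(command, seen_symbols):
    """Toy version of the refusable-HRI rule from the abstract: refuse a
    'go' command once the forbidding symbol 'X' has been perceived;
    otherwise execute the command. Names and return values are
    illustrative assumptions."""
    if command == "go" and "X" in seen_symbols:
        return "refuse"
    return f"execute:{command}"

# The robot obeys 'go' normally, but refuses after seeing 'X'.
before = decide_action("go", set())
after = decide_action("go", {"X"})
```

Keeping the perceived-symbol set as explicit state mirrors how a Soar agent would hold the perception in working memory before its decision cycle fires a refusal rule.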

The Similarity of the Image Comparison System utilizing OpenCV (OpenCV를 활용한 이미지 유사성 비교 시스템)

  • Ban, Tae-Hak;Bang, Jin-Suk;Yuk, Jung-Soo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2016.05a
    • /
    • pp.834-835
    • /
    • 2016
  • In recent years, IT technology has been growing rapidly with advances in technology. Accordingly, research on OpenCV image processing technology, which provides real-time image processing and multi-platform compatibility, is actively in progress. At present, systems that compare different images to determine their similarity mostly rely on people judging matching rates from analogue figures, and their accuracy is low. This paper proposes a system that uses OpenCV Template Matching and Feature Matching to determine the similarity between different images as digital values. Features are extracted at specific points of an image, the same features are extracted from images of different sizes, and the features of the target image are compared and verified. This can be applied to voice and image recognition and analysis to check matching rates. In the future, further OpenCV studies on forensics and other image processing technologies will be needed.

  • PDF
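Template Matching in OpenCV scores how well a small template correlates with each window of a larger image. Below is a pure-NumPy sketch of the normalized cross-correlation that cv2.matchTemplate computes with the TM_CCOEFF_NORMED method; it is a brute-force version for illustration, whereas OpenCV's implementation is heavily optimized:

```python
import numpy as np

def match_template_ncc(image, template):
    """Slide the template over the image and compute the zero-mean
    normalized cross-correlation at every position. Returns the
    correlation map and the (row, col) of the best match."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.full((ih - th + 1, iw - tw + 1), -1.0)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            w = image[y:y + th, x:x + tw]
            wz = w - w.mean()
            denom = np.sqrt((wz ** 2).sum()) * t_norm
            if denom > 0:  # skip flat windows (zero variance)
                out[y, x] = (wz * t).sum() / denom
    best = np.unravel_index(np.argmax(out), out.shape)
    return out, tuple(int(v) for v in best)

# Plant a distinctive 3x3 patch at (3, 4) and search for it.
img = np.zeros((10, 10))
img[3:6, 4:7] = np.arange(1, 10).reshape(3, 3)
tmpl = img[3:6, 4:7].copy()
corr_map, best = match_template_ncc(img, tmpl)
```

The zero-mean normalization makes the score invariant to brightness and contrast shifts, which is why the correlation peaks at exactly 1.0 where the template truly occurs.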

Intelligent Countenance Robot, Humanoid ICHR (지능형 표정로봇, 휴머노이드 ICHR)

  • Byun, Sang-Zoon
    • Proceedings of the KIEE Conference
    • /
    • 2006.10b
    • /
    • pp.175-180
    • /
    • 2006
  • In this paper, we develop a humanoid robot that can express its emotion in response to human actions. To interact with humans, the developed robot has several abilities for expressing emotion: verbal communication with humans through voice/image recognition, motion tracking, and facial expression using fourteen servo motors. The proposed humanoid robot system consists of a control board designed with an AVR90S8535 to control the servo motors, a framework equipped with fourteen servo motors and two CCD cameras, and a personal computer to monitor its operation. The results of this research illustrate that our intelligent emotional humanoid robot is intuitive and friendly, so humans can interact with it very easily.

  • PDF

Multidimensional Affective model-based Multimodal Complex Emotion Recognition System using Image, Voice and Brainwave (다차원 정서모델 기반 영상, 음성, 뇌파를 이용한 멀티모달 복합 감정인식 시스템)

  • Oh, Byung-Hun;Hong, Kwang-Seok
    • Annual Conference of KIPS
    • /
    • 2016.04a
    • /
    • pp.821-823
    • /
    • 2016
  • This paper proposes a multimodal complex emotion recognition system using image, voice, and brainwave signals based on a multidimensional affective model. Features extracted from the user's facial image, voice, and EEG are each mapped, as explicit response-level data, onto the multidimensional affective model (Arousal, Valence, Dominance), known in psychology and cognitive science as the affective components that constitute human emotion, and scoring is performed. The resulting scores are then mapped onto the three-dimensional emotion model to recognize not only the human emotion (single and complex emotions) but also its intensity.
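The scoring-and-mapping pipeline in the abstract can be illustrated with a small sketch: per-modality (Arousal, Valence, Dominance) scores are fused by weighted averaging, and the fused point is matched to the nearest emotion in the 3D space. The emotion coordinates and fusion weights below are assumed for illustration, not taken from the paper:

```python
import math

# Illustrative coordinates of basic emotions in (Arousal, Valence,
# Dominance) space -- assumed values, not from the paper.
EMOTIONS = {
    "joy":     (0.8,  0.8,  0.6),
    "anger":   (0.9, -0.7,  0.7),
    "sadness": (0.2, -0.8, -0.5),
    "calm":    (0.1,  0.5,  0.3),
}

def fuse_scores(face, voice, eeg, weights=(0.4, 0.3, 0.3)):
    """Weighted fusion of per-modality (A, V, D) scores into one point."""
    wf, wv, we = weights
    return tuple(wf * f + wv * v + we * e for f, v, e in zip(face, voice, eeg))

def recognize(avd):
    """Nearest emotion in AVD space; the residual distance serves as a
    crude proxy for the certainty/intensity of the match."""
    name, ref = min(EMOTIONS.items(), key=lambda kv: math.dist(avd, kv[1]))
    return name, math.dist(avd, ref)

fused = fuse_scores((0.9, 0.9, 0.7), (0.7, 0.7, 0.5), (0.8, 0.8, 0.6))
```

Mapping all three modalities into one shared coordinate system is what lets the system express complex (mixed) emotions as points between the basic-emotion anchors.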

Lipreading using The Fuzzy Degree of Similarity

  • Kurosu, Kenji;Furuya, Tadayoshi;Takeuchi, Shigeru;Soeda, Mitsuru
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 1993.06a
    • /
    • pp.903-906
    • /
    • 1993
  • Lipreading through visual processing techniques helps provide useful communication-assistance systems for the hearing impaired. This paper proposes a method to understand spoken words using visual images taken by a camera with a video digitizer. The image is processed to obtain the contour of the lips, which is approximated by a hexagon. The pattern lists, consisting of the lengths and angles of the hexagon, are compared and computed to obtain the fuzzy similarity between two lists. By similarity matching, the mouth shape is recognized as the one corresponding to the pronounced sound. Some experiments, exemplified by recognition of the Japanese vowels, are given to show the feasibility of this method.

  • PDF
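The fuzzy similarity between two pattern lists (the side lengths and vertex angles of the lip hexagon) can be sketched as a mean of per-element membership grades; the membership function and the reference patterns below are illustrative assumptions, not the paper's definitions:

```python
def fuzzy_similarity(pattern_a, pattern_b):
    """Fuzzy degree of similarity between two pattern lists (e.g. the
    six side lengths of the hexagonal lip contour). Each pair of
    elements contributes a membership grade in [0, 1] that decays with
    their relative difference; the overall degree is the mean grade."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("pattern lists must have equal length")
    grades = []
    for a, b in zip(pattern_a, pattern_b):
        m = max(abs(a), abs(b))
        grades.append(1.0 if m == 0 else 1.0 - abs(a - b) / m)
    return sum(grades) / len(grades)

def classify_mouth_shape(pattern, references):
    """Pick the vowel whose stored pattern is most similar."""
    return max(references, key=lambda v: fuzzy_similarity(pattern, references[v]))

# Hypothetical side-length patterns for two vowel mouth shapes.
refs = {"a": [10, 10, 6, 6, 10, 10], "i": [14, 14, 2, 2, 14, 14]}
```

Because every element contributes a graded (rather than binary) match, small measurement noise in the contour lowers the similarity smoothly instead of breaking the match outright.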

Color-Based Real-Time Hand Region Detection with Robust Performance in Various Environments (다양한 환경에 강인한 컬러기반 실시간 손 영역 검출)

  • Hong, Dong-Gyun;Lee, Donghwa
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.14 no.6
    • /
    • pp.295-311
    • /
    • 2019
  • The smart-product market is growing year by year, and smart products are used in many areas. There are various ways for users to interact with smart products, such as voice recognition, touch, and finger movements. Detecting an accurate hand region is the most important prerequisite step for recognizing hand movements. In this paper, we propose a method to detect an accurate hand region in real time in various environments. Conventional methods of detecting a hand region include using depth information from a multi-sensor camera, detecting the hand through machine learning, and detecting the hand region using a color model. Among these, the multi-sensor camera and machine learning methods require a large amount of computation, so a high-performance PC is essential; such heavy computation is not suitable for embedded systems, and high-end PCs raise the price of smart products. The algorithm proposed in this paper detects the hand region using a color model, corrects the problems of existing hand detection algorithms, and detects an accurate hand region across various experimental environments.
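A color-model hand/skin detector of the kind the abstract builds on typically thresholds the chrominance channels after an RGB-to-YCbCr conversion. The sketch below uses the standard ITU-R BT.601 conversion with commonly cited skin bounds on Cb and Cr; the paper derives its own corrected color model, so these particular ranges are assumptions:

```python
def is_skin_pixel(r, g, b):
    """Classify one RGB pixel as skin via fixed Cb/Cr thresholds.
    Conversion follows ITU-R BT.601; the bounds 77<=Cb<=127 and
    133<=Cr<=173 are widely used skin ranges, assumed here for
    illustration rather than taken from the paper."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return 77 <= cb <= 127 and 133 <= cr <= 173

def hand_mask(image):
    """Binary mask over an image given as rows of (R, G, B) tuples."""
    return [[is_skin_pixel(*px) for px in row] for row in image]
```

Thresholding only the Cb/Cr plane discards the luminance channel, which is what gives color-model detectors some robustness to lighting changes at a fraction of the cost of depth cameras or learned models.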