• Title/Summary/Keyword: Vision-and-Language Navigation

Hybrid Learning for Vision-and-Language Navigation Agents (시각-언어 이동 에이전트를 위한 복합 학습)

  • Oh, Suntaek;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.9 no.9 / pp.281-290 / 2020
  • The Vision-and-Language Navigation (VLN) task is a complex intelligence problem that requires both visual and language comprehension skills. In this paper, we propose a new learning model for vision-and-language navigation agents. The model adopts hybrid learning, combining imitation learning based on demonstration data with reinforcement learning based on action rewards. It can therefore mitigate both the bias toward demonstration data that affects imitation learning and the relatively low data efficiency of reinforcement learning. In addition, the proposed model uses a novel path-based reward function designed to overcome the shortcomings of existing goal-based reward functions. We demonstrate the high performance of the proposed model through various experiments using the Matterport3D simulation environment and the R2R benchmark dataset.
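
A minimal PyTorch-style sketch of how such a hybrid objective can be formed is given below. It is an illustration under assumed names only (policy `logits`, `expert_action`, `sampled_action`, `path_reward`, mixing weight `alpha`), not the authors' implementation; the path-based reward is treated as an already-computed scalar per example.

```python
# Minimal sketch (not the authors' code): mixing an imitation-learning loss
# with a policy-gradient reinforcement-learning loss for a VLN agent.
import torch
import torch.nn.functional as F

def hybrid_loss(logits, expert_action, sampled_action, path_reward, alpha=0.5):
    """logits: (batch, num_actions) action scores from the agent's policy.
    expert_action: (batch,) ground-truth action from the demonstration.
    sampled_action: (batch,) action sampled from the policy during rollout.
    path_reward: (batch,) reward based on fidelity to the reference path
                 (a stand-in for the paper's path-based reward function).
    alpha: illustrative weight balancing the two learning signals."""
    # Imitation term: cross-entropy against the demonstrated action.
    il_loss = F.cross_entropy(logits, expert_action)

    # Reinforcement term: REINFORCE-style loss weighted by the reward.
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, sampled_action.unsqueeze(1)).squeeze(1)
    rl_loss = -(path_reward * chosen).mean()

    # Hybrid objective: trade off demo bias against sample efficiency.
    return alpha * il_loss + (1.0 - alpha) * rl_loss

# Hypothetical usage with random tensors (batch of 4, 6 candidate actions).
logits = torch.randn(4, 6)
loss = hybrid_loss(logits,
                   expert_action=torch.randint(0, 6, (4,)),
                   sampled_action=torch.randint(0, 6, (4,)),
                   path_reward=torch.rand(4))
```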

LVLN : A Landmark-Based Deep Neural Network Model for Vision-and-Language Navigation (LVLN: 시각-언어 이동을 위한 랜드마크 기반의 심층 신경망 모델)

  • Hwang, Jisu;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.8 no.9 / pp.379-390 / 2019
  • In this paper, we propose a novel deep neural network model for Vision-and-Language Navigation (VLN) named LVLN (Landmark-based VLN). In addition to visual features extracted from input images and linguistic features extracted from natural language instructions, the model makes use of information about places and landmark objects detected in the images. It also applies a context-based attention mechanism in order to associate each entity mentioned in the instruction with the corresponding region of interest (ROI) in the image and the corresponding place and landmark object detected in the image. Moreover, in order to improve the rate of successfully arriving at the target goal, the model adopts a progress monitor module that checks how closely the agent is approaching the target. Conducting experiments with the Matterport3D simulator and the Room-to-Room (R2R) benchmark dataset, we demonstrate the high performance of the proposed model.
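
A minimal sketch of a context-based attention layer of the kind the abstract describes is shown below, assuming instruction, ROI, and landmark features are already extracted; all module names, dimensions, and shapes are illustrative assumptions rather than the LVLN architecture itself.

```python
# Minimal sketch (assumed shapes, not the LVLN implementation): attention that
# associates an instruction context with image ROI features and detected
# landmark/place features, then pools a grounded visual context vector.
import torch
import torch.nn as nn

class ContextAttention(nn.Module):
    def __init__(self, text_dim, vis_dim, lm_dim, hidden=256):
        super().__init__()
        self.q = nn.Linear(text_dim, hidden)           # instruction query
        self.k = nn.Linear(vis_dim + lm_dim, hidden)   # ROI + landmark key
        self.v = nn.Linear(vis_dim + lm_dim, hidden)   # ROI + landmark value

    def forward(self, text_feat, roi_feat, landmark_feat):
        """text_feat: (B, text_dim) encoded instruction context.
        roi_feat: (B, R, vis_dim) region-of-interest features.
        landmark_feat: (B, R, lm_dim) place/landmark features per region."""
        keys = self.k(torch.cat([roi_feat, landmark_feat], dim=-1))
        vals = self.v(torch.cat([roi_feat, landmark_feat], dim=-1))
        query = self.q(text_feat).unsqueeze(1)                   # (B, 1, H)
        scores = (query * keys).sum(-1) / keys.size(-1) ** 0.5   # (B, R)
        attn = torch.softmax(scores, dim=-1).unsqueeze(-1)
        return (attn * vals).sum(dim=1)                          # (B, H)

# Hypothetical usage: 2 samples, 36 image regions, assumed feature sizes.
layer = ContextAttention(text_dim=512, vis_dim=2048, lm_dim=300)
ctx = layer(torch.randn(2, 512), torch.randn(2, 36, 2048), torch.randn(2, 36, 300))
```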

Path finding via VRML and VISION overlay for Autonomous Robotic (로봇의 위치보정을 통한 경로계획)

  • Sohn, Eun-Ho;Park, Jong-Ho;Kim, Young-Chul;Chong, Kil-To
    • Proceedings of the KIEE Conference / 2006.10c / pp.527-529 / 2006
  • In this paper, we find a robot's path using the Virtual Reality Modeling Language (VRML) and a vision overlay. To obtain a correct path, we describe a method for localizing a mobile robot in its working environment using a vision system and VRML. The robot identifies landmarks in the environment using image processing and neural network pattern matching techniques, and then performs self-positioning with the vision system based on a well-known localization algorithm. After the self-positioning procedure, the 2-D scene from the camera is overlaid with the VRML scene. This paper describes how to realize the self-positioning and shows the overlap between the 2-D and VRML scenes. The method successfully defines the robot's path.

A Path tracking algorithm and a VRML image overlay method (VRML과 영상오버레이를 이용한 로봇의 경로추적)

  • Sohn, Eun-Ho;Zhang, Yuanliang;Kim, Young-Chul;Chong, Kil-To
    • Proceedings of the IEEK Conference / 2006.06a / pp.907-908 / 2006
  • We describe a method for localizing a mobile robot in its working environment using a vision system and the Virtual Reality Modeling Language (VRML). The robot identifies landmarks in the environment using image processing and neural network pattern matching techniques, and then performs self-positioning with the vision system based on a well-known localization algorithm. After the self-positioning procedure, the 2-D scene from the camera is overlaid with the VRML scene. This paper describes how to realize the self-positioning and shows the overlap between the 2-D and VRML scenes. The method successfully defines the robot's path.

Implementation of Path Finding Method using 3D Mapping for Autonomous Robotic (3차원 공간 맵핑을 통한 로봇의 경로 구현)

  • Son, Eun-Ho;Kim, Young-Chul;Chong, Kil-To
    • Journal of Institute of Control, Robotics and Systems / v.14 no.2 / pp.168-177 / 2008
  • Path finding is a key element in the navigation of a mobile robot. To find a path, the robot should know its position exactly, since position error exposes the robot to many dangerous conditions: it could make the robot move in a wrong direction and suffer damage from collisions with surrounding obstacles. We propose a method for obtaining an accurate robot position. The localization of a mobile robot in its working environment is performed using a vision system and the Virtual Reality Modeling Language (VRML). The robot identifies landmarks located in the environment; image processing and neural network pattern matching techniques are applied to find the location of the robot. After the self-positioning procedure, the 2-D scene from the camera is overlaid onto the VRML scene. This paper describes how to realize the self-positioning and shows the overlay between the 2-D and VRML scenes. The suggested method defines the robot's path successfully. An experiment applying the suggested algorithm to a mobile robot has been performed, and the result shows good path tracking.
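
The three VRML/vision entries above all rely on self-positioning from recognized landmarks with known map coordinates via a "well-known localization algorithm". As one plausible reading, the sketch below estimates a 2-D position from ranges to known landmarks by linearized least squares (trilateration); the landmark coordinates, the choice of algorithm, and the range measurements are assumptions for illustration, not taken from the papers.

```python
# Minimal sketch (illustrative only): estimating a robot's 2-D position from
# ranges to landmarks with known map coordinates.
import numpy as np

def localize(landmarks, ranges):
    """landmarks: (N, 2) known x,y map coordinates of recognized landmarks.
    ranges: (N,) measured distances from the robot to each landmark (N >= 3).
    Returns the least-squares estimate of the robot's (x, y) position."""
    x, y = landmarks[:, 0], landmarks[:, 1]
    # Subtract the last circle equation from the others to linearize the system.
    A = np.column_stack([2 * (x[:-1] - x[-1]), 2 * (y[:-1] - y[-1])])
    b = (ranges[-1] ** 2 - ranges[:-1] ** 2
         + x[:-1] ** 2 - x[-1] ** 2 + y[:-1] ** 2 - y[-1] ** 2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

# Hypothetical example: three landmarks and noiseless ranges from (1.0, 2.0).
lm = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
true_pos = np.array([1.0, 2.0])
r = np.linalg.norm(lm - true_pos, axis=1)
print(localize(lm, r))   # approximately [1.0, 2.0]
```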

Image Caption Generation using Recurrent Neural Network (Recurrent Neural Network를 이용한 이미지 캡션 생성)

  • Lee, Changki
    • Journal of KIISE / v.43 no.8 / pp.878-882 / 2016
  • Automatic generation of captions for an image is a very difficult task because it requires both computer vision and natural language processing technologies. However, the task has many important applications, such as early childhood education, image retrieval, and navigation for the blind. In this paper, we describe a Recurrent Neural Network (RNN) model for generating image captions, which takes as input image features extracted by a Convolutional Neural Network (CNN). We demonstrate that our model produces state-of-the-art results in image caption generation experiments on the Flickr 8K, Flickr 30K, and MS COCO datasets.
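
The CNN-feature-to-RNN pipeline the abstract describes can be sketched as below in PyTorch, with teacher forcing during training; the feature dimension, vocabulary size, GRU cell, and layer names are assumptions, not the paper's configuration.

```python
# Minimal sketch (assumed dimensions, not the paper's model): an RNN caption
# generator conditioned on CNN image features, trained with teacher forcing.
import torch
import torch.nn as nn

class CaptionRNN(nn.Module):
    def __init__(self, vocab_size, feat_dim=2048, embed_dim=256, hidden=512):
        super().__init__()
        self.img_proj = nn.Linear(feat_dim, embed_dim)   # project CNN features
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, img_feat, captions):
        """img_feat: (B, feat_dim) features from a pretrained CNN.
        captions: (B, T) token ids of the target caption (teacher forcing)."""
        img_token = self.img_proj(img_feat).unsqueeze(1)     # (B, 1, E)
        word_emb = self.embed(captions[:, :-1])              # shift right
        inputs = torch.cat([img_token, word_emb], dim=1)     # image token first
        hidden, _ = self.rnn(inputs)
        return self.out(hidden)                              # (B, T, vocab)

# Hypothetical usage: 2 images, caption length 12, vocabulary of 10,000 words.
model = CaptionRNN(vocab_size=10000)
caps = torch.randint(0, 10000, (2, 12))
logits = model(torch.randn(2, 2048), caps)
loss = nn.functional.cross_entropy(logits.reshape(-1, 10000), caps.reshape(-1))
```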