Search | Korea Science

Real-Time Visual Grounding for Natural Language Instructions with Deep Neural Network (심층 신경망을 이용한 자연어 지시의 실시간 시각적 접지)

Hwang, Jisu;Kim, Incheol
- Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.487-490 / 2019
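Vision-and-language navigation (VLN) is an artificial intelligence problem in which an agent must travel to a destination on its own in a 3D indoor environment by understanding real-time input images and natural language instructions. It is a complex intelligence problem that requires not only image and natural language understanding but also situational reasoning and action planning. This paper proposes a new deep neural network model for the VLN task. In addition to the visual features extracted from input images by a convolutional neural network and the linguistic features extracted from natural language instructions by a recurrent neural network, the proposed model separately detects in the images the places and landmark objects mentioned in the instructions and uses them as additional features for action selection. Experiments using the Matterport3D simulator, which provides a variety of 3D indoor environments, and the Room-to-Room (R2R) benchmark dataset confirmed the high performance and effectiveness of the proposed model.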
https://doi.org/10.3745/PKIPS.y2019m05a.487

Combining Imitation Learning and Reinforcement Learning for Visual-Language Navigation Agents (시각-언어 이동 에이전트를 위한 모방 학습과 강화 학습의 결합)

Oh, Suntaek;Kim, Incheol
- Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.559-562 / 2020
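Vision-language navigation is a complex intelligence problem that requires both visual and language understanding. This paper proposes a new learning model for vision-language navigation agents. The model adopts hybrid learning that combines imitation learning based on demonstration data with reinforcement learning based on action rewards. It can therefore offset, in a complementary way, the tendency of imitation learning to become biased toward the demonstration data and the relatively low data efficiency of reinforcement learning. The proposed model also includes loss normalization to account for the learning imbalance that can arise between the two kinds of learning. In addition, the paper identifies a problem with the goal-based reward functions used in previous studies and uses a new optimal-path-based reward function designed to resolve it. Various experiments using the Matterport3D simulation environment and the R2R benchmark dataset demonstrate the high performance of the proposed model.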
https://doi.org/10.3745/PKIPS.y2020m05a.559
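The loss normalization mentioned in this abstract can be pictured with a minimal sketch. It is not the authors' formulation: the running-average normalizer, the names `RunningScale` and `combined_loss`, and the `il_weight` parameter are all assumptions made for illustration. The idea shown is to divide each loss by its own running magnitude before mixing, so that neither the imitation term nor the reinforcement term dominates merely because of its scale.

```python
class RunningScale:
    """Exponential moving average of a loss magnitude (hypothetical normalizer)."""

    def __init__(self, momentum=0.9, eps=1e-8):
        self.momentum = momentum
        self.eps = eps
        self.value = None  # no estimate until the first loss is seen

    def update(self, loss):
        """Fold |loss| into the running average and return a safe divisor."""
        magnitude = abs(loss)
        if self.value is None:
            self.value = magnitude
        else:
            self.value = (self.momentum * self.value
                          + (1.0 - self.momentum) * magnitude)
        return self.value + self.eps


def combined_loss(il_loss, rl_loss, il_scale, rl_scale, il_weight=0.5):
    """Mix an imitation loss and a reinforcement loss after normalizing each.

    il_weight is an assumed mixing coefficient in [0, 1].
    """
    il_term = il_loss / il_scale.update(il_loss)
    rl_term = rl_loss / rl_scale.update(rl_loss)
    return il_weight * il_term + (1.0 - il_weight) * rl_term
```

For example, with a fresh pair of `RunningScale` objects, `combined_loss(2.0, 200.0, il_scale, rl_scale)` gives both terms a magnitude of about 1 even though the raw losses differ by a factor of 100.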

Multiresolution 4-8 Tile Hierarchy Construction for Realtime Visualization of Planetary Scale Geological Information (행성 규모 지리 정보의 실시간 시각화를 위한 다계층 4-8 타일 구조의 구축)

Jin, Jong-Wook;Wohn, Kwang-Yun
- Journal of the Korean Association of Geographic Information Studies / v.9 no.4 / pp.12-21 / 2006
Very large, high-resolution geological data from aerial and satellite imagery have recently become available. Many studies and applications require real-time visualization of a geological area of interest or of an entire planet. A key operation in widely used real-time terrain visualization techniques is selecting the appropriate model resolution from a pre-processed multiresolution model hierarchy according to the participant's view. Building such a real-time rendering system for large geometric data requires preprocessing a multiresolution hierarchy from the large-scale geological information of the area of interest. In this research, the recent cubic multiresolution 4-8 tile hierarchy is selected for global planetary applications. Based on this tile hierarchy, the method constructs a selective terminal-level tile mesh for the area of the original geological information and samples the generated terminal-level tiles individually. It then completes the hierarchy by constructing intermediate tiles with low-pass filtering in the bottom-up direction. This research implements a series of efficient cubic 4-8 tile hierarchy construction mechanisms with out-of-core storage. Planetary-scale altitude and image data of Mars were selected for the experiment.
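The bottom-up step, in which coarser tiles are derived from finer ones by low-pass filtering, can be illustrated with a deliberately simplified sketch. This is not the 4-8 refinement scheme of the paper: it swaps in a plain 2x2 box filter on a square, power-of-two height grid, and the function names `downsample` and `build_pyramid` are hypothetical.

```python
def downsample(grid):
    """Average each 2x2 block of a square grid (a simple box low-pass filter).

    Assumes the grid side length is even.
    """
    size = len(grid)
    return [
        [
            (grid[r][c] + grid[r][c + 1]
             + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, size, 2)
        ]
        for r in range(0, size, 2)
    ]


def build_pyramid(finest):
    """Return the levels from finest to coarsest, built bottom-up.

    Assumes the finest grid is square with a power-of-two side length.
    """
    levels = [finest]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1]))
    return levels
```

A renderer would then pick a level per tile according to the viewer's distance; the sketch covers only the filtering pass and leaves out tiling and out-of-core storage.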

A Study on the Efficiency Improvement of Information Materials for Diet of Inpatient Hospitalization (입원환자를 위한 효율적 정보전달의 식단안내문 현황분석 및 사용자 인식 조사)

Chun, Eunyoung;Paik, Jinkyung
- Design Convergence Study / v.14 no.4 / pp.169-180 / 2015
This research analyzes the current state of the menu guides provided to inpatients in Korean hospitals, identifies problems in how they deliver information, and proposes improvements through information visualization as part of improving inpatient services. The nutrition departments of hospitals were visited in person to collect data, and the types and contents of the menu guides were analyzed. Problems were then identified from an information visualization perspective, focusing on whether information is delivered effectively. The menus provided to inpatients in Korean hospitals are divided into general, optional, and therapeutic menus depending on the patient's condition, and menu guides are broadly divided into general guides and optional guides. The guides cover the menu, the origin of ingredients, guidance, withdrawal time, and so on. However, the information is not categorized by content or importance, which makes it inefficient for patients to find what they need. The main problem with the menu guides is that disorganized information prevents them from delivering information properly. To improve this, information should be categorized and visualized according to its content and importance to enhance readability, so that menu guides can deliver information to patients more efficiently.

Abstract Visualization for Effective Debugging of Parallel Programs Based on Multi-threading (멀티 스레딩 기반 병렬 프로그램의 효과적인 디버깅을 위한 추상적 시각화)

Kim, Young-Joo
- Journal of the Korea Institute of Information and Communication Engineering / v.20 no.3 / pp.549-557 / 2016
Effective visualization must summarize not only a large amount of debugging information but also the mental models of abstract ideas. This paper presents an abstract visualization tool that provides effective visualization of thread structure and race information for OpenMP programs with critical sections and nested parallelism, using a partial order execution graph that captures logical concurrency among threads. The tool is supported by an on-the-fly trace-filtering technique that reduces the space complexity of the visualization information, and by a graph abstraction technique that reduces the visual complexity of nested parallelism and critical sections in the filtered trace. The graph abstraction of the partial-order relation and race information is effective for understanding program execution and for detecting and eliminating races, because the user can examine the control flow of the program and the locations of races in a structured fashion.
https://doi.org/10.6109/jkiice.2016.20.3.549

Method of Master Receiver Selection Using DOP for Time Synchronization in TDOA-Based Localization (TDOA 기반 위치탐지를 위한 DOP을 이용한 시각동기화 주수신기 선택 기법)

Kim, Sanhae;Song, Kyuha;Kwak, Hyungyu
- The Journal of Korean Institute of Communications and Information Sciences / v.41 no.9 / pp.1069-1080 / 2016
A TDOA (Time Difference Of Arrival)-based localization system, such as a passive surveillance system, installs multiple receivers at separate locations and then performs time synchronization among them so that all receivers share the same clock. It estimates the 2D (or 3D) location of the target by solving for the intersection of multiple hyperbolas (or hyperboloids) using TDOA. To perform time synchronization, one receiver must be set as the master, and it provides the reference data used to compensate the clocks of the remaining slave receivers. The positioning accuracy of a TDOA-based localization system varies with which receiver is selected as the master. The optimum receiver must therefore be set as the master to obtain the best performance for a given receiver deployment. In this paper, we propose a master receiver selection scheme for time synchronization using DOP (Dilution Of Precision), which is based on the locations of the target and the multiple receivers. The proposed scheme has low complexity and a short processing time, and it is easy to automate in TDOA-based localization systems.
https://doi.org/10.7840/kics.2016.41.9.1069
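How a DOP value can rank candidate masters is sketched below for the 2D case. This is an illustrative simplification, not the scheme from the paper: it assumes equal, independent measurement noise, a known approximate target position, and the textbook TDOA geometry matrix whose rows are differences of unit vectors. The function names `unit_vector`, `tdoa_dop`, and `select_master` are hypothetical.

```python
import math


def unit_vector(src, dst):
    """Unit vector pointing from src to dst in 2D (src and dst must differ)."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    norm = math.hypot(dx, dy)
    return dx / norm, dy / norm


def tdoa_dop(receivers, target, master):
    """DOP = sqrt(trace((H^T H)^-1)) with rows h_i = u_i - u_master.

    u_i is the unit vector from receiver i to the target, so each row is the
    gradient of the range difference (r_i - r_master) with respect to the
    target position.
    """
    u_master = unit_vector(receivers[master], target)
    a = b = d = 0.0  # entries of the symmetric 2x2 matrix H^T H
    for i, receiver in enumerate(receivers):
        if i == master:
            continue
        u_i = unit_vector(receiver, target)
        hx, hy = u_i[0] - u_master[0], u_i[1] - u_master[1]
        a += hx * hx
        b += hx * hy
        d += hy * hy
    det = a * d - b * b
    if det <= 1e-12:  # degenerate geometry: position is not observable
        return math.inf
    return math.sqrt((a + d) / det)


def select_master(receivers, target):
    """Return the index of the receiver with the lowest DOP, plus all DOPs."""
    dops = [tdoa_dop(receivers, target, m) for m in range(len(receivers))]
    best = min(range(len(receivers)), key=dops.__getitem__)
    return best, dops
```

Calling `select_master([(0, 0), (10, 0), (0, 10), (10, 10)], (3, 4))` returns the index of the candidate master with the smallest DOP for that geometry, together with the DOP of every candidate.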

Development of a lipsync algorithm based on A/V corpus (코퍼스 기반의 립싱크 알고리즘 개발)

하영민;김진영;정수경
- Proceedings of the IEEK Conference / 2000.09a / pp.145-148 / 2000
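This paper proposes an audio-visual synchronization algorithm for synthesizing 2D facial coordinate data. The visual parameters were obtained by tracking markers attached to the speaker's face, and the correlation of the visual data with prosodic information as well as phonemic information was analyzed. A viseme-based corpus was chosen as the synthesis unit, and coarticulation was modeled by also taking the surrounding phonological context into account. Patterns corresponding to the input corpus are selected from a lookup table, the phonological distance from the reference pattern is computed for the neighboring phonemes, prosodic information is extracted from the speech file to compute the prosodic distance, and the two are weighted to obtain the distance to each pattern. A Viterbi search is then performed over the junctions of the five closest patterns to select the optimal path, after which the visual information reduced by principal component analysis is restored and the timing is adjusted.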
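The path-selection step can be sketched as a generic Viterbi search over candidate patterns. The cost definitions here are assumptions for illustration rather than the paper's: `target_costs[t][k]` stands in for the weighted phonological and prosodic distance of candidate `k` at position `t`, and `join_cost` is a hypothetical function that scores the junction between two consecutive candidates.

```python
def viterbi_select(target_costs, join_cost):
    """Pick one candidate per position so that the total cost is minimal.

    target_costs: non-empty list of lists; target_costs[t][k] is the distance
        of candidate k at position t.
    join_cost(t, j, k): cost of joining candidate j at position t - 1 to
        candidate k at position t.
    Returns (path, total), where path[t] is the chosen candidate index.
    """
    best = list(target_costs[0])  # cheapest cost of any path ending at each k
    back = []                     # back[t - 1][k] = best predecessor of k at t
    for t in range(1, len(target_costs)):
        new_best, pointers = [], []
        for k, cost in enumerate(target_costs[t]):
            j_best = min(range(len(best)),
                         key=lambda j: best[j] + join_cost(t, j, k))
            new_best.append(best[j_best] + join_cost(t, j_best, k) + cost)
            pointers.append(j_best)
        best = new_best
        back.append(pointers)

    # Trace the cheapest final state back to the first position.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path, total
```

With five candidates per position, as in the abstract, each inner list of `target_costs` would have five entries. The search is linear in the number of positions and quadratic in the number of candidates per position.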

Memory in visual search: Evidence from search efficiency (시각 탐색에서의 기억: 탐색 효율성에 근거한 증거)

Baek Jongsoo;Kim Min-Shik
- Korean Journal of Cognitive Science / v.16 no.1 / pp.1-15 / 2005
Since the human visual system has limited capacity for visual information processing, it must select goal-relevant information for further processing. Several studies have emphasized the possible involvement of memory in spatial shifts of selective attention (Chun & Jiang, 1998, 1999; Klein, 1988; Klein & MacInnes, 1999). However, other studies have suggested that human visual memory is poor in change detection (Rensink, O'Regan, & Clark, 1997; Simons & Levin, 1997) and in visual search (Horowitz & Wolfe, 1998). The present study examined the involvement of memory in visual search, that is, whether memory for previously searched items guides shifts of selective attention. We investigated how search works by comparing visual search performance in three conditions: a full exposure condition, a partial exposure condition, and a partial-to-full exposure condition. Revisiting searched items was allowed only in the full exposure condition and not in the partial or partial-to-full exposure conditions. The results showed that the efficiency of attentional shifts was nearly identical across all conditions. This finding implies that even in the full exposure condition the participants scarcely re-examined previously searched items. The results suggest that instant memory can be formed and used in the visual search process. These results disagree with earlier studies claiming that visual search has no memory. We discuss the problems of previous research paradigms and suggest some alternative accounts.

Building a Stereoscopic Display System for 3-D Spatial Data Analysis (3차원 공간 자료 분석을 위한 입체형 시각화 시스템 구축)

Lee, Doo-Sung
- Geophysics and Geophysical Exploration / v.7 no.2 / pp.105-108 / 2004
Immersive virtual reality has been used in oil and gas exploration for the visualization and analysis of various spatial data, such as wireline logs, 3-dimensional seismic data volumes, formation boundaries, faults, and other reservoir characteristics. Although virtual reality is a valuable tool in this area, in most cases it requires a large budget. This paper describes the construction of a single-screen, passive stereo virtual reality display system based on commodity or otherwise low-cost components. The core elements of the system are a PC with two-channel 3-D graphics, two projectors, and a polarized stereo. There are many options available for the major elements of such a system, and the basic system can be modified or adapted to many different styles of use.

Visual-Attention Using Corner Feature Based SLAM in Indoor Environment (실내 환경에서 모서리 특징을 이용한 시각 집중 기반의 SLAM)

Shin, Yong-Min;Yi, Chu-Ho;Suh, Il-Hong;Choi, Byung-Uk
- Journal of the Institute of Electronics Engineers of Korea SC / v.49 no.4 / pp.90-101 / 2012
Landmark selection is crucial to successful SLAM (Simultaneous Localization and Mapping) with a mono camera. In an unknown environment in particular, automatic landmark selection is needed because no advance information about landmarks is available. In this paper, a visual attention system modeled on the human visual system is used to select landmarks automatically. The edge feature is one of the most important elements for attention in previous visual attention systems. However, when the edge feature is used in a complicated indoor area, the response of the complicated area disappears while the response between flat surfaces becomes higher. In addition, the computational cost increases with the growth in dimensionality, since responses for four directions are used. This paper proposes using a corner feature to solve or prevent these problems. Using a corner feature can also increase the accuracy of data association by concentrating on areas that are more complicated and informative in indoor environments. Finally, this paper shows by experiment that a visual attention system based on the corner feature can be more effective in SLAM than the previous method.

