• Title/Summary/Keyword: Computer Application Engineering (컴퓨터 응용공학)

Search Result 704

Object Detection Based on Hellinger Distance IoU and Objectron Application (Hellinger 거리 IoU와 Objectron 적용을 기반으로 하는 객체 감지)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.2 / pp.63-70 / 2022
  • Although 2D object detection has improved greatly in recent years with the advance of deep learning methods and the use of large labeled image datasets, 3D object detection from 2D imagery remains a challenging problem in a variety of applications such as robotics, due to the lack of data and the diversity of appearances and shapes of objects within a category. Google has recently announced the launch of Objectron, which has a novel data pipeline using mobile augmented reality session data. However, it too is a 2D-driven 3D object detection technique. This study explores a more mature 2D object detection method and applies its 2D projection to the Objectron 3D lifting system. Most object detection methods use bounding boxes to encode and represent object shape and location. In this work, we explore a stochastic representation of object regions using Gaussian distributions. We also present a similarity measure for Gaussian distributions based on the Hellinger distance, which can be viewed as a stochastic Intersection-over-Union. Our experimental results show that the proposed Gaussian representations are closer to the annotated segmentation masks in available datasets. Thus, the limited accuracy that is one of several limitations of Objectron can be alleviated.
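A minimal sketch of the stochastic IoU described in the abstract above, assuming a bounding box is modeled as a 2D Gaussian (box center as mean, diagonal covariance from the half-extents) and using the closed-form Hellinger distance between two Gaussians; the box-to-Gaussian parameterization is an illustrative assumption, not necessarily the paper's exact choice.

```python
import numpy as np

def box_to_gaussian(box):
    """Model a box (x1, y1, x2, y2) as a 2D Gaussian: mean at the box
    center, diagonal covariance from the half-width/half-height.
    (Illustrative assumption, not necessarily the paper's choice.)"""
    x1, y1, x2, y2 = box
    mu = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
    cov = np.diag([((x2 - x1) / 2.0) ** 2, ((y2 - y1) / 2.0) ** 2])
    return mu, cov

def hellinger_iou(box_a, box_b):
    """Closed-form Hellinger distance between two Gaussians, returned
    as a similarity (1 - H), usable like a stochastic IoU."""
    mu1, s1 = box_to_gaussian(box_a)
    mu2, s2 = box_to_gaussian(box_b)
    s = (s1 + s2) / 2.0
    diff = mu1 - mu2
    # Bhattacharyya coefficient for two Gaussians
    coef = (np.linalg.det(s1) ** 0.25 * np.linalg.det(s2) ** 0.25
            / np.sqrt(np.linalg.det(s)))
    expo = np.exp(-0.125 * diff @ np.linalg.inv(s) @ diff)
    h = np.sqrt(max(0.0, 1.0 - coef * expo))  # Hellinger distance in [0, 1]
    return 1.0 - h

print(hellinger_iou((0, 0, 10, 10), (2, 2, 12, 12)))  # partial overlap, ~0.8
print(hellinger_iou((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes -> 1.0
```

Because the Hellinger distance is bounded in [0, 1], the complementary similarity behaves like an IoU score: 1.0 for identical boxes, decaying smoothly as the boxes drift apart.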

Study on the Direction of Universal Big Data and Big Data Education - Based on a Survey of Big Data Experts (보편적 빅데이터와 빅데이터 교육의 방향성 연구 - 빅데이터 전문가의 인식 조사를 기반으로)

  • Park, Youn-Soo;Lee, Su-Jin
    • Journal of The Korean Association of Information Education / v.24 no.2 / pp.201-214 / 2020
  • Big Data is gradually expanding into diverse fields as data-related legislation changes, and interest in Big Data education is growing accordingly. However, utilizing Big Data requires a high level of knowledge and skill, and training takes a long time and costs a great deal. In this study, we define Universal Big Data as it is used across a wide range of industrial fields and, based on this, propose a paradigm for Big Data education for college students. We surveyed Big Data professionals on how they define and perceive Big Data. According to the survey, these professionals understand Big Data more broadly than the computer science definition. They also recognize that Big Data processing does not necessarily require Big Data processing frameworks or high-performance computers. This means that Big Data education should focus on the analysis and application methods of Universal Big Data rather than on computer science (engineering) knowledge and skills. Based on our research, we propose Universal Big Data education under this new paradigm.

Gaze Detection by Computing Facial and Eye Movement (얼굴 및 눈동자 움직임에 의한 시선 위치 추적)

  • Park, Kang-Ryoung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.2 / pp.79-88 / 2004
  • Gaze detection is the task of locating the position on a monitor screen where a user is looking, using computer vision. Gaze detection systems have numerous fields of application: they are applicable to man-machine interfaces that help the handicapped use computers, and to view control in three-dimensional simulation programs. In our work, we implement gaze detection with a computer vision system using a single IR-LED-based camera. To detect the gaze position, we locate facial features, which is performed effectively with the IR-LED-based camera and an SVM (Support Vector Machine). When a user gazes at a position on the monitor, we compute the 3D positions of those features based on 3D rotation and translation estimation and an affine transform. The gaze position due to facial movement is then computed from the normal vector of the plane determined by the computed 3D feature positions. In addition, we use a trained neural network to detect the gaze position from eye movement. Experimental results show that we can obtain the facial and eye gaze position on a monitor, with an accuracy between the computed and actual gaze positions of about 4.8 cm RMS error.
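A minimal geometric sketch of the facial-gaze computation summarized above, assuming three facial features are already available as 3D points in a camera-centered frame and that the monitor lies in the plane z = 0; the gaze point is where the facial-plane normal ray meets that plane. The coordinates and plane choice are illustrative assumptions.

```python
import numpy as np

def facial_gaze_point(p1, p2, p3):
    """Gaze position on the monitor plane z = 0, taken as the intersection
    of the facial-plane normal (through the feature centroid) with z = 0."""
    p1, p2, p3 = map(np.asarray, (p1, p2, p3))
    normal = np.cross(p2 - p1, p3 - p1)
    normal = normal / np.linalg.norm(normal)
    origin = (p1 + p2 + p3) / 3.0          # ray starts at the facial centroid
    t = -origin[2] / normal[2]             # solve origin_z + t * n_z = 0
    return (origin + t * normal)[:2]       # (x, y) on the screen plane

# Illustrative 3D feature positions (e.g., two eye corners and nose tip), in cm
print(facial_gaze_point((-3, 0, 50), (3, 0, 50), (0, -4, 47)))
```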

Positive Random Forest based Robust Object Tracking (Positive Random Forest 기반의 강건한 객체 추적)

  • Cho, Yunsub;Jeong, Soowoong;Lee, Sangkeun
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.6 / pp.107-116 / 2015
  • With the growth of digital devices, the proliferation of high-performance computers, and the availability of high-quality, inexpensive video cameras, the demand for automated video analysis is increasing, especially in the fields of intelligent monitoring systems, video compression, and robot vision. This is why object tracking in computer vision has come into the spotlight. Tracking is the process of locating a moving object over time using a camera. Handling an object's scale, rotation, and shape deformation is the most important issue in robust object tracking. In this paper, we propose a robust object tracking scheme using a Random Forest. Specifically, an object detection scheme based on region covariance and ZNCC (zero-mean normalized cross-correlation) is adopted for estimating the accurate object location. Next, the detected region is divided into five regions for Random Forest-based learning. The five regions are verified by the Random Forest, and the verified regions are put into the model pool. Finally, the input model is updated to correct the object location when the region does not contain the object. Experiments show that the proposed method locates objects more accurately than existing methods.
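A minimal sketch of the ZNCC matching step named above, assuming grayscale NumPy patches of equal size; mean-centering and normalization make the score invariant to brightness offset and contrast scaling, which is why ZNCC is a common choice for template matching in tracking.

```python
import numpy as np

def zncc(patch, template):
    """Zero-mean normalized cross-correlation between two equally
    sized grayscale patches; returns a score in [-1, 1]."""
    p = patch.astype(np.float64) - patch.mean()
    t = template.astype(np.float64) - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    if denom == 0:
        return 0.0  # flat patch: correlation undefined, treat as no match
    return float((p * t).sum() / denom)

rng = np.random.default_rng(0)
t = rng.random((16, 16))
print(zncc(t, t))              # identical patches -> 1.0
print(zncc(t, 0.5 * t + 0.2))  # brightness/contrast change -> still ~1.0
```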

The Design And Implementation of Robot Training Kit for Java Programming Learning (Java 프로그래밍 학습을 위한 로봇 트레이닝키트의 설계 및 구현)

  • Baek, Jeong-Hyun
    • Journal of the Korea Society of Computer and Information / v.18 no.10 / pp.97-107 / 2013
  • Recent programming paradigms have largely shifted toward object-oriented programming and visual programming built on top of it. However, object-oriented programming involves more difficult and complicated concepts than existing structured programming techniques, so it has been very hard to teach in IT-related departments. This study designed and implemented a Java robot training kit with a built-in Java virtual machine, to which various input and output devices can be attached and which can control a robot, in order to enhance students' desire and motivation for learning object-oriented programming. The developed Java robot training kit communicates with a computer through a USB interface, and its general-purpose input/output port lets learners build an educational robot and practice applied programming by driving diverse input and output devices, DC motors, and servo motors. In the era of IT convergence, the walls between academic disciplines and majors are getting lower, and the need for creative, engineering-oriented education in object-oriented programming languages is emerging; the Java robot training kit developed in this study is expected to make a great contribution in this regard.

Analysis of Research Trends in Deep Learning-Based Video Captioning (딥러닝 기반 비디오 캡셔닝의 연구동향 분석)

  • Lyu Zhi;Eunju Lee;Youngsoo Kim
    • KIPS Transactions on Software and Data Engineering / v.13 no.1 / pp.35-49 / 2024
  • Video captioning technology, as a significant outcome of the integration between computer vision and natural language processing, has emerged as a key research direction in the field of artificial intelligence. This technology aims to achieve automatic understanding and language expression of video content, enabling computers to transform the visual information in videos into textual form. This paper provides an initial analysis of the research trends in deep learning-based video captioning, categorizes them into four main groups (CNN-RNN-based, RNN-RNN-based, Multimodal-based, and Transformer-based models), and explains the concept of each video captioning model, discussing its features, advantages, and disadvantages. The paper also lists the datasets and performance evaluation methods commonly used in the video captioning field. The datasets encompass diverse domains and scenarios, offering extensive resources for the training and validation of video captioning models. The discussion of evaluation methods covers the major evaluation indicators and provides practical references for researchers to evaluate model performance from various angles. Finally, as future research tasks for video captioning, major challenges that need continued improvement are identified, such as maintaining temporal consistency and accurately describing dynamic scenes, which increase the complexity of real-world applications, and new tasks to be studied are presented, such as temporal relationship modeling and multimodal data integration.
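A minimal sketch of the CNN-RNN-based family described above, assuming PyTorch: a small CNN encodes each frame, the frame features are mean-pooled into a single video feature, and an LSTM decodes the caption tokens. The network sizes and vocabulary are illustrative assumptions, not a reference implementation from any surveyed paper.

```python
import torch
import torch.nn as nn

class CnnRnnCaptioner(nn.Module):
    """Toy CNN-RNN video captioner: per-frame CNN features are mean-pooled,
    then an LSTM decodes a caption conditioned on the video feature."""
    def __init__(self, vocab_size=1000, feat_dim=256, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(                      # tiny stand-in CNN encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim))
        self.embed = nn.Embedding(vocab_size, feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, frames, captions):
        # frames: (B, T, 3, H, W); captions: (B, L) token ids
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1).mean(1)  # (B, D)
        tokens = self.embed(captions)
        # prepend the pooled video feature as the first decoder input
        seq = torch.cat([feats.unsqueeze(1), tokens], dim=1)
        hidden, _ = self.lstm(seq)
        return self.out(hidden)  # (B, L+1, vocab) next-token logits

model = CnnRnnCaptioner()
logits = model(torch.randn(2, 8, 3, 64, 64), torch.randint(0, 1000, (2, 5)))
print(logits.shape)  # torch.Size([2, 6, 1000])
```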

Subimage Detection of Window Image Using AdaBoost (AdaBoost를 이용한 윈도우 영상의 하위 영상 검출)

  • Gil, Jong In;Kim, Manbae
    • Journal of Broadcast Engineering / v.19 no.5 / pp.578-589 / 2014
  • A window image is what is displayed on the monitor screen when we execute application programs on a computer; examples include webpages, video players, and many other applications. Compared with other applications, a webpage delivers a variety of information in various forms. Unlike a natural image captured with a camera, a window image such as a webpage contains diverse components such as text, logos, icons, and subimages, and each component delivers a different type of information to the user. Since text and images are presented in various forms, components with different characteristics need to be separated locally. In this paper, we divide window images into many sub-blocks and classify each divided region into background, text, or subimage. The detected subimages can be applied to 2D-to-3D conversion, image retrieval, image browsing, and so forth. Among the many possible subimage classification methods, we utilize AdaBoost in this paper to verify that a machine learning-based algorithm can be effective for subimage detection. In the experiment, we show that the subimage detection ratio is 93.4% with a false alarm rate of 13%.
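A minimal sketch of block-wise classification with AdaBoost, assuming scikit-learn and simple per-block statistics (intensity mean, variance, gradient energy) as features on synthetic data; the features, block size, and three-class labels are illustrative assumptions rather than the paper's actual feature set.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

BACKGROUND, TEXT, SUBIMAGE = 0, 1, 2  # illustrative class labels

def block_features(block):
    """Simple per-block statistics as a stand-in feature vector:
    intensity mean, variance, and horizontal/vertical gradient energy."""
    gx = np.abs(np.diff(block, axis=1)).mean()
    gy = np.abs(np.diff(block, axis=0)).mean()
    return [block.mean(), block.var(), gx, gy]

rng = np.random.default_rng(0)
# Synthetic training blocks: flat background, noisy "text", smooth "subimage"
blocks, labels = [], []
for _ in range(200):
    blocks.append(np.full((16, 16), rng.random()) + rng.normal(0, .01, (16, 16)))
    labels.append(BACKGROUND)
    blocks.append(rng.random((16, 16)))               # high-frequency: text-like
    labels.append(TEXT)
    smooth = np.outer(np.linspace(0, 1, 16), np.linspace(0, 1, 16))
    blocks.append(smooth + rng.normal(0, .05, (16, 16)))  # gradient: image-like
    labels.append(SUBIMAGE)

X = np.array([block_features(b) for b in blocks])
clf = AdaBoostClassifier(n_estimators=50).fit(X, labels)
print(clf.predict([block_features(rng.random((16, 16)))]))  # likely [1] (TEXT)
```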

A design and implementation of the video conferencing system on the WWW (웹 기반의 화상회의 시스템의 설계 및 구현)

  • Kim, Sung-Jin;Park, Yong-Jin
    • Journal of the Korean Institute of Telematics and Electronics T / v.36T no.4 / pp.123-132 / 1999
  • A video conferencing system allows geographically dispersed computer users to share a conference environment using audio and video information. However, conventional video conferencing systems have some problems: they depend on specific software and/or hardware and are bound to certain platform and network environments. Furthermore, the participants must know the information about the other participants before joining the conference session, and they all have to use the same video conferencing system. This paper describes the design and implementation of a video conferencing system on the WWW that solves these problems. The conference applications are transmitted from a WWW server and executed in the participants' Web browsers, so a participant can carry out conference services using only a Web browser. The WWW server takes charge of conference management, including the information related to the participants, and provides supporting conference tools such as a whiteboard, chatting, and multimedia controls. Therefore, participants can easily join conference sessions and conduct conference work regardless of the network connection situation. We used Java to implement the seamless session connections and the interaction between conference participants, which are the most important aspects when implementing a video conferencing system on the WWW, and used ActiveX technology for the audio and video controls to simplify hardware control.


Development of an X3D Python Language Binding Viewer Providing a 3D Data Interface (3D 데이터 인터페이스를 제공하는 X3D Python 언어 바인딩 뷰어 개발)

  • Kim, Ha Seong;Lee, Myeong Won
    • KIPS Transactions on Software and Data Engineering / v.10 no.6 / pp.243-250 / 2021
  • With the increasing development of 3D VR applications driven by recent VR/AR/MR technologies and advances in 3D devices, interchangeability and portability of 3D data have become essential. 3D files should be processed in a standard data format for common usage between applications. Providing standardized libraries and data structures along with the standard file format enables a more efficient system organization and avoids the unnecessary processing caused by using different file formats and data structures in different applications. To provide a common data file and data structure, this research provides a programming binding tool for generating and storing standardized data so that various services can be developed by accessing common 3D files. To achieve this, this paper defines a common data structure, including classes and functions, for accessing X3D files in a standardized scheme using the Python programming language. It describes the implementation of a Python language binding viewer, an X3D VR viewer that renders standard X3D data files based on the language binding interface. The VR viewer includes Python-based 3D scene libraries and a data structure for the creation, modification, exchange, and transfer of X3D objects. In addition, the viewer displays X3D objects and processes events using these libraries and data structure.
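A minimal sketch of what a Python scene-graph data structure mirroring X3D nodes might look like; the class and field names (Scene, Transform, Shape) are hypothetical illustrations of the binding idea, not the actual classes defined by the paper or by the X3D standard.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical, simplified X3D-like node classes for illustration only.
@dataclass
class Shape:
    geometry: str = "Box"                   # e.g. "Box", "Sphere"
    diffuse_color: tuple = (0.8, 0.8, 0.8)

@dataclass
class Transform:
    translation: tuple = (0.0, 0.0, 0.0)
    rotation: tuple = (0.0, 0.0, 1.0, 0.0)  # axis + angle, X3D style
    children: List["Transform | Shape"] = field(default_factory=list)

@dataclass
class Scene:
    root: Transform = field(default_factory=Transform)

    def to_x3d(self) -> str:
        """Serialize to a (very reduced) X3D-like XML fragment."""
        def emit(node, depth=1):
            pad = "  " * depth
            if isinstance(node, Shape):
                return f"{pad}<Shape><{node.geometry}/></Shape>\n"
            inner = "".join(emit(c, depth + 1) for c in node.children)
            t = " ".join(map(str, node.translation))
            return f"{pad}<Transform translation='{t}'>\n{inner}{pad}</Transform>\n"
        return f"<Scene>\n{emit(self.root)}</Scene>"

scene = Scene(Transform(translation=(0, 1, 0), children=[Shape("Sphere")]))
print(scene.to_x3d())
```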

Line Tracer Modeling for Educational Virtual Experiment (교육용 가상실험 라인 트레이서 모델링)

  • Ki, Jang-Geun;Kwon, Kee-Young
    • Journal of Software Assessment and Valuation / v.17 no.2 / pp.109-116 / 2021
  • Traditionally, the engineering field has been dominated by face-to-face education focused on experimental practice, but demand for online learning has soared due to the rapid development of IT technology and Internet communication networks, as well as recent changes in the social environment such as COVID-19. For efficient online education in the engineering field, where the proportion of experimental practice is relatively high compared to other fields, virtual laboratory content that can replace actual experimental practice is essential. In this study, we developed a line tracer model and virtual experiment software to simulate it, for efficient online learning of microprocessor applications, which are essential not only in the electric and electronic field but also in engineering overall, where IT convergence takes place. In the developed line tracer model, the user can set various hardware parameter values as desired and write software in assembly language or C to test its operation on the computer. The developed line tracer virtual experiment software has been used in actual classes to verify its operation and is expected to be an efficient virtual laboratory tool for online, non-face-to-face classes.
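A minimal sketch of a line tracer simulation loop, assuming a differential-drive robot with two line sensors following the line y = 0 under bang-bang control; the sensor geometry, speeds, and control rule are illustrative assumptions, not the hardware parameters of the kit described above.

```python
import math

class LineTracer:
    """Toy differential-drive line tracer following the line y = 0."""
    def __init__(self, x=0.0, y=0.4, heading=0.0):
        self.x, self.y, self.heading = x, y, heading
        self.sensor_offset = 0.3   # lateral distance of each line sensor (cm)

    def sensors(self):
        """True if the left/right sensor is over the dark line (|y| < 0.2)."""
        left_y = self.y + self.sensor_offset * math.cos(self.heading)
        right_y = self.y - self.sensor_offset * math.cos(self.heading)
        return abs(left_y) < 0.2, abs(right_y) < 0.2

    def step(self, dt=0.05, speed=2.0, turn=1.5):
        left_on, right_on = self.sensors()
        # Bang-bang control: steer toward whichever sensor sees the line
        if left_on and not right_on:
            self.heading += turn * dt
        elif right_on and not left_on:
            self.heading -= turn * dt
        self.x += speed * dt * math.cos(self.heading)
        self.y += speed * dt * math.sin(self.heading)

tracer = LineTracer()
for _ in range(100):
    tracer.step()
print(f"final offset from line: {tracer.y:.3f} cm")
```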