• Title/Abstract/Keywords: AI Video

178 search results, processing time 0.028 s

Cooperative Robot for Table Balancing Using Q-learning

  • 김예원;강보영
    • The Journal of Korea Robotics Society
    • /
    • Vol. 15, No. 4
    • /
    • pp.404-412
    • /
    • 2020
  • Everyday tasks often involve at least two people moving an object such as a table or a bed, and the balance of the object changes with each person's actions. However, many previous studies had robots perform such tasks alone, without factoring in human cooperation. Therefore, in this paper, we propose a cooperative robot for table balancing using Q-learning that enables cooperative work between a human and a robot. The robot's camera captures an image of the table's state, from which the human's action is recognized, and the robot performs the table-balancing action according to the recognized human action without high-performance equipment. Human actions are classified with a deep learning model, specifically AlexNet, with an accuracy of 96.9% under 10-fold cross-validation. The Q-learning experiment was carried out over 2,000 episodes with 200 trials. The overall results show that the Q function converged stably at this number of episodes. This stable convergence determined the Q-learning policies for the robot's actions. A video of the robot cooperating with a human on the table-balancing task using the proposed Q-learning can be found at http://ibot.knu.ac.kr/videocooperation.html.
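
The Q-learning step described in this abstract can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the five-level tilt state, the three actions, the reward values, and the hyperparameters (`alpha`, `gamma`, `epsilon`, `max_steps`) are all assumptions chosen for illustration. Only the 2,000-episode count comes from the abstract.

```python
import random

def q_learning(n_states, n_actions, step, episodes=2000, max_steps=50,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning. `step(state, action)` returns (next_state, reward, done)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)  # random initial tilt
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Standard Q-learning target: r + gamma * max_a' Q(s', a').
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def toy_table_step(state, action):
    """Hypothetical environment: tilt levels 0..4, where 2 means level.
    Action 0 lowers the tilt index, 1 holds, 2 raises it."""
    next_state = min(4, max(0, state + action - 1))
    done = next_state == 2
    reward = 1.0 if done else -1.0
    return next_state, reward, done

q_table = q_learning(n_states=5, n_actions=3, step=toy_table_step)
# Greedy policy: the best action for each tilt level.
policy = [max(range(3), key=lambda a: row[a]) for row in q_table]
```

In this toy setting the greedy policy moves the tilt index toward the level state from either side, which is the kind of stable policy the abstract attributes to a converged Q function.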

A Study on the 3-D Information Abstraction of object using Triangulation System

  • 김국세;이정기;조애리;배일호;이준
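    • Korea Information Processing Society: Conference Proceedings
    • /
    • Proceedings of the Korea Information Processing Society 2003 Spring Conference (Vol. 1)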
    • /
    • pp.409-412
    • /
    • 2003
  • 3-D shapes are used in movie effects, animation, industrial design, medical services, education, engineering, and other fields. However, it is not easy to reconstruct a 3-D shape from 2-D image information. There are two common methods for restoring a 3-D image from 2-D images: the first uses a laser, and the second acquires the 3-D image through stereo vision. Because both methods involve many difficulties, this paper studies a simpler way to obtain a 3-D image. We present a simple and efficient method, called direct calibration, which does not require any equations at all. The direct calibration procedure builds a lookup table (LUT) linking image and 3-D coordinates using a real 3-D triangulation system. The LUT is built by measuring the image coordinates of a grid of known 3-D points and recording both image and world coordinates for each point; the depth values of all other visible points are obtained by interpolation.
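
The LUT-plus-interpolation idea in this abstract can be sketched as follows. The abstract does not say which interpolation scheme is used, so bilinear interpolation over a regular calibration grid is an assumption here, and the function names and sample values are hypothetical.

```python
from bisect import bisect_right

def build_lut(grid_us, grid_vs, depths):
    """Store calibration samples: depths[i][j] is the measured depth of the
    known 3-D point observed at image coordinates (grid_us[j], grid_vs[i])."""
    return {"us": list(grid_us), "vs": list(grid_vs),
            "z": [list(row) for row in depths]}

def _bracket(axis, x):
    """Index i of the grid cell [axis[i], axis[i + 1]] used for x, clamped to the grid."""
    i = bisect_right(axis, x) - 1
    return min(max(i, 0), len(axis) - 2)

def lookup_depth(lut, u, v):
    """Bilinearly interpolate depth at image point (u, v) from the four
    surrounding calibration samples. Points outside the grid are linearly
    extrapolated from the nearest cell."""
    us, vs, z = lut["us"], lut["vs"], lut["z"]
    j = _bracket(us, u)
    i = _bracket(vs, v)
    tu = (u - us[j]) / (us[j + 1] - us[j])
    tv = (v - vs[i]) / (vs[i + 1] - vs[i])
    top = z[i][j] * (1 - tu) + z[i][j + 1] * tu
    bottom = z[i + 1][j] * (1 - tu) + z[i + 1][j + 1] * tu
    return top * (1 - tv) + bottom * tv

# Hypothetical calibration grid: 3 columns x 2 rows of known points.
lut = build_lut(grid_us=[0, 10, 20], grid_vs=[0, 10],
                depths=[[100, 110, 120], [200, 210, 220]])
depth_mid = lookup_depth(lut, 5, 5)   # a point between calibration samples
```

No camera model or projection equations appear anywhere in the sketch, which is the point of the direct calibration approach: the table of measurements stands in for them.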

A Study on Smart Device for Open Platform Ontology Construction of Autonomous Vehicles

  • 최병관
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • Vol. 15, No. 3
    • /
    • pp.1-14
    • /
    • 2019
  • In the era of the 4th Industrial Revolution, intelligent automobile application technology is evolving beyond the limits of the mobile device into a variety of application software and multimedia technologies combined with big data-based AI (artificial intelligence) technology. With the recent commercialization of 5G mobile communication services, artificial intelligent automobile technology, a fusion of automobile and IT technology, is evolving into more intelligent automobile service technology, and multimedia platform services and applications are being developed for such distributed environments. Accordingly, application software technology developed on the single-system SoC of a portable terminal device and delivered through various service technologies is absolutely required. In this paper, a smart device designed for the ontology of an intelligent automobile open platform makes it possible to design, as SoC-based applications, intelligent automobile middleware software such as an Android-based SVC codec and real-time video and graphics processing, which cannot be expressed in application software for a single ASIC. We experimented in a smart device environment, newly designed the service functions of various terminal devices provided as open platforms and application solutions in an SoC environment, applied a standardized interface analysis technique, and verified the design through this experiment.

Analysis of Google's success factors and direction

  • LEE, Sang-Youn;KIM, Se-Jin
    • Korean Journal of Artificial Intelligence
    • /
    • Vol. 8, No. 2
    • /
    • pp.11-16
    • /
    • 2020
  • Among the innovative companies leading the era of the 4th industrial revolution, Google is the world's largest Internet company. Google has grown by providing convenient services such as Internet search, the Android smartphone operating system, and video. Now, Google is leading the global IT industry by continuing to expand into various new business fields based on open service platforms, artificial intelligence, and big data. In this study, an exploratory discussion was conducted on Google's success factors and future directions. The purpose of the research is to understand the development process of the IT field from Google's success factors and to analyze the development direction of the future IT industry. Google's success factors were its open platform policy and its successful acquisitions of external companies. In fact, most of the services Google offers come from companies that it has acquired. In addition, Google has a corporate culture that values and supports the spirit of challenge and the autonomy of members who are not afraid of failure. Building on this study's analysis of Google's direction, a follow-up study will infer the direction of the IT industry in depth and examine the future technologies that IT majors need to prepare for.

Real-Time CCTV Based Garbage Detection for Modern Societies using Deep Convolutional Neural Network with Person-Identification

  • Syed Muhammad Raza;Syed Ghazi Hassan;Syed Ali Hassan;Soo Young Shin
    • Journal of information and communication convergence engineering
    • /
    • Vol. 22, No. 2
    • /
    • pp.109-120
    • /
    • 2024
  • Trash, or garbage, is one of the most dangerous health and environmental problems contributing to pollution. Pollution affects nature, human life, and wildlife. In this paper, we propose a modern solution for cleaning the environment of trash pollution by enabling strict action against people who dump trash inappropriately on streets, outside homes, and in other unsuitable places. Artificial Intelligence (AI), especially Deep Learning (DL), has been used to automate tasks and solve real-world problems. We took this as an excellent opportunity to develop a system that identifies trash using a deep convolutional neural network (CNN). This paper proposes a real-time garbage identification system based on a deep CNN architecture with eight distinct classes in the training dataset. After the garbage is identified, the CCTV camera captures a video of the individual placing the trash in the incorrect location and sends an alert notice to the relevant authority.

Korean Lip-Reading: Data Construction and Sentence-Level Lip-Reading

  • 조선영;윤수성
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • Vol. 27, No. 2
    • /
    • pp.167-176
    • /
    • 2024
  • Lip-reading is the task of inferring a speaker's utterance from silent video by learning lip movements. It is very challenging because of the ambiguities inherent in lip movement, such as different characters that produce the same lip appearance. Recent advances in deep learning models such as the Transformer and the Temporal Convolutional Network have improved lip-reading performance. However, most previous work deals with English lip-reading, which cannot be applied directly to Korean lip-reading, and moreover, there is no large-scale Korean lip-reading dataset. In this paper, we introduce the first large-scale Korean lip-reading dataset, with more than 120 k utterances collected from TV broadcasts including news, documentaries, and dramas. We also present a preprocessing method that uniformly extracts a facial region of interest, and we propose a transformer-based model that uses grapheme units for sentence-level Korean lip-reading. We demonstrate that our dataset and model are appropriate for Korean lip-reading through dataset statistics and experimental results.

A Study on Makeup Characteristics According to Emotional Images Appeared in a Domestic YouTube Video

  • Na-Hyun, An
    • International Journal of Advanced Culture Technology
    • /
    • Vol. 12, No. 1
    • /
    • pp.1-10
    • /
    • 2024
  • While technologies such as the 4th industrial revolution and artificial intelligence (AI), which create new value through the convergence of intelligent information technology, are becoming hot topics, the beauty industry is rapidly developing and combining information and communication technology to produce smartphone-based beauty items. As this area expands, YouTube is forming a network through various means of information. In particular, beauty-related YouTube videos attract great interest and popularity among the public. This study classifies the makeup characteristics shown in domestic YouTube videos by emotional image and thereby identifies viewers' needs when watching YouTube makeup videos. We aim to build trust in the delivery of information about makeup. The emotional images were divided into four types: 'modern', 'natural', 'gorgeous', and 'cute'. Among domestic makeup YouTubers, Pony, Isabe, Shinnim, and Lamuque were selected. By organizing more diverse makeup-related content systematically and creatively, we expect to have a positive influence on K-makeup not only domestically but also overseas. We aim to provide basic data for follow-up research on makeup YouTuber videos in the field of cosmetology and to contribute to marketing plans and promotional strategies for the development of the beauty content industry.

A Study on the Current Situation and Improved Method for the Smombie through Field Survey and ICT Trend Analysis

  • 이동훈;오혜수;장재민;정종운;양상운
    • Journal of the Korean Society of Safety
    • /
    • Vol. 35, No. 5
    • /
    • pp.74-85
    • /
    • 2020
  • A smartphone zombie, or Smombie, is a pedestrian who walks without paying attention to the surroundings because of being focused on a smartphone. Because traffic accidents and injuries caused by Smombies have increased rapidly in recent years, social attention and policies are needed to prevent them. This study analyzes the current status of Smombies and the solutions used so far, and proposes an improved method based on the latest ICT trends. We conducted a field survey, counting people at several places in Seoul, and found that many pedestrians still use their smartphones while walking. We also analyzed case studies of earlier solutions for preventing Smombies, including legal regulations, government policies, smartphone app services, and facilities. We studied them through Internet searches and literature reviews and checked their current operation by visiting several places where the solutions are actually in use. As a result, we found that previous solutions have limitations in effectiveness and management. To identify a new solution that can overcome these limitations, we analyzed the latest ICT trends, focusing on features useful for Smombie prevention, especially video recognition and digital signage. Video recognition has recently developed rapidly with the help of AI technology and can recognize specific pedestrian characteristics, such as holding a smartphone, as well as hairstyle, clothes, backpacks, and so on. Digital signage, on the other hand, is a convergence device that combines a large display, a network connection, and various IoT sensors. It can be used as public media in many places for public services as well as advertising. Based on these analysis results, we present the requirements and a user scenario for an improved method to prevent Smombies. Finally, we propose developing technology through R&D that accurately recognizes a Smombie from pedestrian attributes, and spreading creative content through digital signage to increase pedestrians' interest and engagement in Smombie prevention.

Computer Vision-based Continuous Large-scale Site Monitoring System through Edge Computing and Small-Object Detection

  • Kim, Yeonjoo;Kim, Siyeon;Hwang, Sungjoo;Hong, Seok Hwan
    • International Conference Proceedings
    • /
    • The 9th International Conference on Construction Engineering and Project Management
    • /
    • pp.1243-1244
    • /
    • 2022
  • In recent years, the growing interest in off-site construction has led to factories scaling up their manufacturing and production processes in the construction sector. Consequently, continuous large-scale site monitoring in low-variability environments, such as prefabricated components production plants (precast concrete production), has gained increasing importance. Although many studies on computer vision-based site monitoring have been conducted, challenges in deploying this technology for large-scale field applications still remain. One of the issues is collecting and transmitting vast amounts of video data. Continuous site monitoring systems are based on real-time video data collection and analysis, which requires excessive computational resources and network traffic. In addition, it is difficult to integrate information on objects of different sizes and scales into a single scene. Various sizes and types of objects (e.g., workers, heavy equipment, and materials) exist in a plant production environment, and these objects should be detected simultaneously for effective site monitoring. However, with the existing object detection algorithms, it is difficult to simultaneously detect objects with significant differences in size because collecting and training on massive amounts of object image data at various scales is necessary. This study thus developed a large-scale site monitoring system using edge computing and a small-object detection system to solve these problems. Edge computing is a distributed information technology architecture wherein the image or video data is processed near the originating source, not on a centralized server or cloud. By inferring information on the AI computing module attached to the CCTVs and communicating only the processed information to the server, it is possible to reduce excessive network traffic. Small-object detection is an innovative method to detect different-sized objects by cropping the raw image and setting the appropriate number of rows and columns for image splitting based on the target object size. This enables the detection of small objects from cropped and magnified images. The detected small objects can then be expressed in the original image. In the inference process, this study used the YOLO-v5 algorithm, known for its fast processing speed and widely used for real-time object detection. This method could effectively detect large and even small objects that were difficult to detect with the existing object detection algorithms. When the large-scale site monitoring system was tested, it performed well in detecting small objects, such as workers in a large-scale view of construction sites, which were inaccurately detected by the existing algorithms. Our next goal is to incorporate various safety monitoring and risk analysis algorithms into this system, such as collision risk estimation based on the time-to-collision concept, enabling the optimization of safety routes by accumulating workers' paths and inferring risky areas from workers' trajectory patterns. Through such developments, this continuous large-scale site monitoring system can guide a construction plant's safety management system more effectively.
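
The image-splitting step described in this abstract, and the mapping of detections back to the original image, can be sketched as below. This is an illustrative sketch, not the authors' code: the function names, the `scale` magnification parameter, and the example numbers are assumptions. The detector itself (YOLO-v5 in the abstract) is left out, and the sketch uses non-overlapping tiles, so an object straddling a tile border would need extra handling.

```python
def split_into_tiles(width, height, rows, cols):
    """Return (x0, y0, x1, y1) crop boxes covering a width x height image
    in a rows x cols grid, listed row by row."""
    tiles = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            tiles.append((x0, y0, x1, y1))
    return tiles

def to_original(box, tile, scale=1.0):
    """Map a box (x0, y0, x1, y1) detected inside a tile that was magnified
    by `scale` back to the coordinates of the original image."""
    tx, ty = tile[0], tile[1]
    x0, y0, x1, y1 = box
    return (tx + x0 / scale, ty + y0 / scale,
            tx + x1 / scale, ty + y1 / scale)

# Hypothetical example: a 1920x1080 frame split into 2 rows and 3 columns.
tiles = split_into_tiles(1920, 1080, rows=2, cols=3)
# A box found in the fifth tile after 2x magnification, in tile pixel units.
mapped = to_original((20, 40, 60, 100), tiles[4], scale=2.0)
```

Choosing `rows` and `cols` from the expected target size, as the abstract describes, amounts to picking a grid fine enough that a worker occupies a reasonable fraction of each magnified tile.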

Design and Implementation of a Distributed Multimedia Streaming System

  • 김상국;신화종;김세영;신동규;신동일
    • Korea Multimedia Society: Conference Proceedings
    • /
    • Proceedings of the Korea Multimedia Society 2000 Fall Conference
    • /
    • pp.66-69
    • /
    • 2000
  • Since the advent of the Web, expressing and delivering information on the Internet through text and images has been the most widely used approach. However, with the rapid development of Web-related technology, increasing network speeds, and the rapid spread of the Internet, information is increasingly expressed and delivered through multimedia data rather than through simple text- and image-centered HTML documents. Accordingly, streaming protocols for transmitting multimedia data have also emerged. Recently, improved computer performance and faster networks (the spread of high-speed communication services) have made the transmission of multimedia data feasible, and Internet broadcasting stations that take the form of conventional terrestrial or CATV broadcasters and offer real-time live broadcasting and VOD (Video On Demand) services over the Internet are rapidly appearing. Internet broadcasting is developing on the basis of multimedia streaming technology, which enables real-time delivery of video and audio, and real-time transport protocols that can carry multimedia in real time. Most Internet broadcasters providing multimedia streaming services on the Internet use RealNetworks' RealSystem and Microsoft's WMT (Windows Media Technologies) as their streaming servers. Based on a comparative analysis of Real Server and WMT, this paper designs and implements a Java-based streaming server with a distributed server architecture that supports real-time transport protocols and multimedia streaming technology, middleware that controls the load among servers, and a client that can play multimedia streams.
