• Title/Summary/Keyword: Deep learning Convergence image processing

Scientometrics-based R&D Topography Analysis to Identify Research Trends Related to Image Segmentation (이미지 분할(image segmentation) 관련 연구 동향 파악을 위한 과학계량학 기반 연구개발지형도 분석)

  • Young-Chan Kim;Byoung-Sam Jin;Young-Chul Bae
    • Journal of the Korean Society of Industry Convergence / v.27 no.3 / pp.563-572 / 2024
  • Image processing and computer vision technologies are becoming increasingly important in application fields that require sophisticated image analysis techniques and tools. Image segmentation in particular plays an important role in image analysis. To identify recent research trends in image segmentation techniques, this study used the Web of Science (WoS) database to analyze the R&D topography based on the network structure of the author keyword co-occurrence matrix. The analysis of the R&D map of research articles on image segmentation published from 2015 to 2023 shows that R&D in this field is largely focused on four areas: (1) research on collecting and preprocessing image data to build higher-performance image segmentation models, (2) research on image segmentation using statistics-based models or machine learning algorithms, (3) research on image segmentation for medical image analysis, and (4) deep learning-based image segmentation R&D. The scientometrics-based analysis performed in this study can not only map the trajectory of R&D related to image segmentation, but can also serve as a marker for future exploration in this dynamic field.
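
The author keyword co-occurrence matrix mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `cooccurrence_counts` and the sample keyword lists are hypothetical, and a real analysis would read keywords from WoS export records.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same article."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize, deduplicate, and sort so each pair has one canonical order.
        unique = sorted(set(k.strip().lower() for k in keywords))
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts


# Hypothetical author keywords for three articles (illustrative only).
articles = [
    ["image segmentation", "deep learning", "U-Net"],
    ["image segmentation", "deep learning", "MRI"],
    ["image segmentation", "clustering"],
]
pairs = cooccurrence_counts(articles)
```

Each counted pair corresponds to one weighted edge of the keyword network; clustering that network is what would group keywords into research areas.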

Saliency Attention Method for Salient Object Detection Based on Deep Learning (딥러닝 기반의 돌출 객체 검출을 위한 Saliency Attention 방법)

  • Kim, Hoi-Jun;Lee, Sang-Hun;Han, Hyun Ho;Kim, Jin-Soo
    • Journal of the Korea Convergence Society / v.11 no.12 / pp.39-47 / 2020
  • In this paper, we propose a deep learning-based method that uses Saliency Attention to detect salient objects in images. Salient object detection separates the object on which the human eye focuses from the background and identifies the most relevant part of the image. It is useful in various fields such as object tracking, detection, and recognition. Most existing deep learning-based methods use an autoencoder structure, and considerable feature loss occurs both in the encoder, which compresses and extracts features, and in the decoder, which decompresses and expands the extracted features. This loss causes parts of the salient object region to be missed or the background to be detected as an object. The proposed Saliency Attention reduces feature loss and suppresses the background region within the autoencoder structure. The influence of the feature values was determined using the ELU activation function, and attention was applied separately to the feature values in the normalized negative and positive regions. This attention method suppressed the background region and emphasized the salient object region. Experimental results showed improved detection compared to existing deep learning methods.
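
The ELU step described in this abstract, with separately normalized positive and negative regions, can be sketched as below. This is an assumption-laden illustration rather than the paper's method: the abstract does not say how the two regions are normalized or combined into attention weights, so the peak scaling and the helper name `split_and_normalize` are hypothetical.

```python
import math


def elu(x, alpha=1.0):
    """ELU activation: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def split_and_normalize(features, alpha=1.0):
    """Apply ELU, then scale the positive and negative parts to [0, 1] separately."""
    activated = [elu(v, alpha) for v in features]
    positive = [max(v, 0.0) for v in activated]
    negative = [max(-v, 0.0) for v in activated]  # magnitudes of the negative part

    def scale(values):
        # Assumed normalization: divide by the largest magnitude in the region.
        peak = max(values)
        return [v / peak if peak > 0 else 0.0 for v in values]

    return scale(positive), scale(negative)


# Hypothetical feature values (illustrative only).
pos_map, neg_map = split_and_normalize([-2.0, -0.5, 0.0, 1.0, 3.0])
```

In such a scheme the positive map could be used to emphasize object responses and the negative map to suppress background, but that pairing is an interpretation of the abstract, not a stated detail.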

Modified Pyramid Scene Parsing Network with Deep Learning based Multi Scale Attention (딥러닝 기반의 Multi Scale Attention을 적용한 개선된 Pyramid Scene Parsing Network)

  • Kim, Jun-Hyeok;Lee, Sang-Hun;Han, Hyun-Ho
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.45-51 / 2021
  • With the development of deep learning, semantic segmentation methods are being studied in various fields. However, segmentation accuracy drops in fields that demand high accuracy, such as medical image analysis. In this paper, we improved PSPNet, a deep learning-based semantic segmentation model, to minimize the loss of features during semantic segmentation. Conventional deep learning-based segmentation methods lower the resolution and lose object features during feature extraction and compression. As a result, the edge and internal information of objects is lost and object segmentation accuracy decreases. To solve these problems, the proposed multi-scale attention was added to the conventional PSPNet to prevent the loss of object features. Features were refined by applying the attention method to the conventional PPM module. By suppressing unnecessary feature information, edge and texture information was improved. The proposed method was trained on the Cityscapes dataset and evaluated quantitatively with the segmentation metric MIoU. In the experiments, segmentation accuracy improved by about 1.5% compared to the conventional PSPNet.
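
For readers unfamiliar with the MIoU metric named in this abstract, a minimal sketch of mean intersection-over-union on flattened label maps follows. The function name `mean_iou` and the toy labels are hypothetical, and skipping classes that appear in neither map is one common convention, assumed here rather than taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical per-pixel class labels for a six-pixel image (illustrative only).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoU values are 1/3, 2/3, and 1/2, so their mean is 0.5.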

Automated Story Generation with Image Captions and Recursive Calls (이미지 캡션 및 재귀호출을 통한 스토리 생성 방법)

  • Isle Jeon;Dongha Jo;Mikyeong Moon
    • Journal of the Institute of Convergence Signal Processing / v.24 no.1 / pp.42-50 / 2023
  • Advances in technology have brought digital innovation to the entire media industry, including production techniques and editing technologies, and the era of OTT services and streaming has diversified how consumers watch content. The convergence of big data and deep learning networks has made it possible to automatically generate text in formats such as news articles, novels, and scripts, but few studies have reflected the author's intention or generated stories that flow smoothly in context. In this paper, we describe the flow of pictures in a storyboard using image caption generation techniques and automatically generate story-tailored scenarios through language models. Using image captioning based on a CNN and an attention mechanism, we generate sentences describing the pictures on the storyboard and input the generated sentences into the natural language processing model KoGPT-2 to automatically generate scenarios that meet the planning intention. With this approach, scenarios customized to the author's intention and story can be created in large quantities to ease the burden of content creation, and artificial intelligence can take part in the overall process of digital content production to advance media intelligence.

Construction of CT Image data Automatic Recognition System for Diagnosis of Urinary Stone Based on AI Platform (인공지능 플랫폼기반 요로결석진단을 위한 CT 영상 데이터 자동판독 시스템 구축)

  • Noh, Si-Hyeong;Lee, Chungsub;Kim, Tae-Hoon;Lee, Yun Oh;Park, Sung Bin;Yoon, Kwon-Ha;Jeong, Chang-Won
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.928-930 / 2020
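  • This paper describes an automatic CT image data reading system for diagnosing urinary stones, built on an artificial intelligence platform. The proposed system runs on a web-based platform and is equipped with an AI-based diagnostic algorithm, with the aim of rapidly screening urinary stone patients. We developed an automatic urinary stone reading system that is linked with the PACS and EMR of the hospital information system and applies a deep learning diagnostic algorithm. In particular, we present the method used to develop the diagnostic algorithm and its results, based on a dataset extracted through a previously built AI platform. The proposed system is expected to be used in clinical practice as a decision support system for diagnosing urinary stones and deciding whether surgery is needed.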

Machine Tool State Monitoring Using Hierarchical Convolution Neural Network (계층적 컨볼루션 신경망을 이용한 공작기계의 공구 상태 진단)

  • Kyeong-Min Lee
    • Journal of the Institute of Convergence Signal Processing / v.23 no.2 / pp.84-90 / 2022
  • Machine tool state monitoring is a process that automatically detects the state of a machine. In manufacturing, the efficiency of machining and the quality of the product are affected by the condition of the tool. Worn and broken tools can cause serious problems in process performance and lower product quality. It is therefore necessary to develop a system that detects tool wear and damage during the process so that the tool can be replaced in a timely manner. This paper proposes a method for diagnosing five tool states using a deep learning-based hierarchical convolutional neural network so that tools can be changed at the right time. The one-dimensional acoustic signal generated when the machine cuts the workpiece is converted into a frequency-based, two-dimensional power spectral density image and used as input to a convolutional neural network. The learning model diagnoses the five tool states through three hierarchical steps. The proposed method showed higher accuracy than the conventional method. In addition, it could be used in a smart factory fault diagnosis system that monitors various machine tools through real-time connections.
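
The conversion of a one-dimensional acoustic signal into a two-dimensional frequency image can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's preprocessing: it uses a naive DFT and an unnormalized periodogram per frame, and the names `power_spectrum` and `psd_image`, the frame length, and the hop size are all hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Unnormalized periodogram of one frame, computed with a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):  # keep non-negative frequency bins only
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def psd_image(signal, frame_len, hop):
    """Stack per-frame spectra into a 2D list: rows are time frames, columns are bins."""
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        rows.append(power_spectrum(signal[start:start + frame_len]))
    return rows


# Hypothetical test tone: four cycles per 16-sample frame (illustrative only).
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]
image = psd_image(signal, frame_len=16, hop=16)
```

In practice a windowed FFT from a numerical library would replace the naive DFT; the point here is only that each row of the image is the spectrum of one short segment of the signal.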

Automatic Generation of Video Metadata for the Super-personalized Recommendation of Media

  • Yong, Sung Jung;Park, Hyo Gyeong;You, Yeon Hwi;Moon, Il-Young
    • Journal of information and communication convergence engineering / v.20 no.4 / pp.288-294 / 2022
  • The media content market has been growing, as various types of content are being mass-produced owing to the recent proliferation of the Internet and digital media. In addition, platforms that provide personalized services for content consumption are emerging and competing with each other to recommend personalized content. Existing platforms use a method in which a user directly inputs video metadata. Consequently, significant amounts of time and cost are consumed in processing large amounts of data. In this study, keyframes and audio spectra based on the YCbCr color model of a movie trailer were extracted for the automatic generation of metadata. The extracted audio spectra and image keyframes were used as learning data for genre recognition in deep learning. Deep learning was implemented to determine genres among the video metadata, and suggestions for utilization were proposed. A system that can automatically generate metadata established through the results of this study will be helpful for studying recommendation systems for media super-personalization.
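
The YCbCr color model referred to in this abstract can be illustrated with a per-pixel conversion. The abstract does not state which YCbCr variant was used, so this sketch assumes the full-range BT.601 coefficients used by JPEG/JFIF; the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (JPEG/JFIF convention)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Gray pixels carry no chroma, so Cb and Cr sit at the 128 midpoint.
white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
```

Separating luma (Y) from chroma (Cb, Cr) in this way lets keyframe selection work on brightness changes independently of color.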

Web Server based Hologram Image Production Pipeline System Implementation (웹 서버 기반의 홀로그램 영상 제작 파이프라인 시스템 구현)

  • Kim, Yongjung;Park, Chansoo;Shin, Seokyong;Kim, Jungho;Gentet, Philippe;Lee, Jiyoon;Kwon, Soonchul;Lee, Seunghyun
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.751-757 / 2021
  • In this paper, we propose a pipeline system for holographic image production in a web server-based environment. Existing holographic image production is subject to time and space constraints. The purpose of the proposed system is to make it easier for users to obtain high-quality holographic images by improving accessibility. In this system, a video captured by a user in a web environment is transmitted to a server and converted through post-production into frames for holographic image production. To obtain high-quality holographic images, post-processing uses a deep learning-based algorithm. The proposed system provides various service tools in the web environment for user convenience. With this method, accessibility is improved when producing holographic images because images are captured in a web environment rather than in a limited space.

Autonomous Vehicles as Safety and Security Agents in Real-Life Environments

  • Al-Absi, Ahmed Abdulhakim
    • International journal of advanced smart convergence / v.11 no.2 / pp.7-12 / 2022
  • Safety and security are the top priority in every environment. With the aid of Artificial Intelligence (AI), many objects are becoming more intelligent and more aware of their surroundings. Recent scientific breakthroughs in autonomous vehicle design and development, powered by AI, sensor networks, and the rapid growth of the Internet of Things (IoT), could be used to maintain safety and security in our environments. AI based on deep learning architectures and models, such as Deep Neural Networks (DNNs), is being applied worldwide in automotive design fields such as computer vision, natural language processing, sensor fusion, object recognition, and autonomous driving projects. These features are well known for their identification, detection, and tracking abilities. With sensors, cameras, GPS, RADAR, LIDAR, and on-board computers embedded in many of the autonomous vehicles being developed, these vehicles can accurately map their positions and their proximity to everything around them. In this paper, we explore in detail several ways in which the features embedded in these autonomous vehicles, such as sensor network fusion, computer vision and image processing, natural language processing, and activity-aware capabilities, could be used to safeguard our lives and environment.

Non-face-to-face online home training application study using deep learning-based image processing technique and standard exercise program (딥러닝 기반 영상처리 기법 및 표준 운동 프로그램을 활용한 비대면 온라인 홈트레이닝 어플리케이션 연구)

  • Shin, Youn-ji;Lee, Hyun-ju;Kim, Jun-hee;Kwon, Da-young;Lee, Seon-ae;Choo, Yun-jin;Park, Ji-hye;Jung, Ja-hyun;Lee, Hyoung-suk;Kim, Joon-ho
    • The Journal of the Convergence on Culture Technology / v.7 no.3 / pp.577-582 / 2021
  • Recently, with the development of AR, VR, and smart device technologies, demand for services based on non-face-to-face environments is also increasing in the fitness industry. A non-face-to-face online home training service has the advantage of not being limited by time and place, unlike existing offline services. However, it also has disadvantages, including the absence of exercise equipment and the difficulty of measuring the amount of exercise and checking whether the user maintains an accurate exercise posture. In this study, we develop a standard exercise program that compensates for these shortcomings and propose a new non-face-to-face home training application that uses a deep learning-based body posture estimation image processing algorithm. This application allows the user to watch and follow the trainer in the standard exercise program video, correct their own posture, and exercise accurately. Furthermore, if the results of this study are customized for a given purpose, they could be applied to performances, films, club activities, and conferences.