• Title/Summary/Keyword: learning through the image

Search Result 925, Processing Time 0.03 seconds

Evaluation of Deep Learning Model for Scoliosis Pre-Screening Using Preprocessed Chest X-ray Images

  • Min Gu Jang;Jin Woong Yi;Hyun Ju Lee;Ki Sik Tae
    • Journal of Biomedical Engineering Research
    • /
    • v.44 no.4
    • /
    • pp.293-301
    • /
    • 2023
  • Scoliosis is a three-dimensional deformation of the spine that is a deformity induced by physical or disease-related causes as the spine is rotated abnormally. Early detection has a significant influence on the possibility of nonsurgical treatment. To train a deep learning model with preprocessed images and to evaluate the results with and without data augmentation to enable the diagnosis of scoliosis based only on a chest X-ray image. The preprocessed images in which only the spine, rib contours, and some hard tissues were left from the original chest image, were used for learning along with the original images, and three CNN(Convolutional Neural Networks) models (VGG16, ResNet152, and EfficientNet) were selected to proceed with training. The results obtained by training with the preprocessed images showed a superior accuracy to those obtained by training with the original image. When the scoliosis image was added through data augmentation, the accuracy was further improved, ultimately achieving a classification accuracy of 93.56% with the ResNet152 model using test data. Through supplementation with future research, the method proposed herein is expected to allow the early diagnosis of scoliosis as well as cost reduction by reducing the burden of additional radiographic imaging for disease detection.

Spam Image Detection Model based on Deep Learning for Improving Spam Filter

  • Seong-Guk Nam;Dong-Gun Lee;Yeong-Seok Seo
    • Journal of Information Processing Systems
    • /
    • v.19 no.3
    • /
    • pp.289-301
    • /
    • 2023
  • Due to the development and dissemination of modern technology, anyone can easily communicate using services such as social network service (SNS) through a personal computer (PC) or smartphone. The development of these technologies has caused many beneficial effects. At the same time, bad effects also occurred, one of which was the spam problem. Spam refers to unwanted or rejected information received by unspecified users. The continuous exposure of such information to service users creates inconvenience in the user's use of the service, and if filtering is not performed correctly, the quality of service deteriorates. Recently, spammers are creating more malicious spam by distorting the image of spam text so that optical character recognition (OCR)-based spam filters cannot easily detect it. Fortunately, the level of transformation of image spam circulated on social media is not serious yet. However, in the mail system, spammers (the person who sends spam) showed various modifications to the spam image for neutralizing OCR, and therefore, the same situation can happen with spam images on social media. Spammers have been shown to interfere with OCR reading through geometric transformations such as image distortion, noise addition, and blurring. Various techniques have been studied to filter image spam, but at the same time, methods of interfering with image spam identification using obfuscated images are also continuously developing. In this paper, we propose a deep learning-based spam image detection model to improve the existing OCR-based spam image detection performance and compensate for vulnerabilities. The proposed model extracts text features and image features from the image using four sub-models. First, the OCR-based text model extracts the text-related features, whether the image contains spam words, and the word embedding vector from the input image. Then, the convolution neural network-based image model extracts image obfuscation and image feature vectors from the input image. The extracted feature is determined whether it is a spam image by the final spam image classifier. As a result of evaluating the F1-score of the proposed model, the performance was about 14 points higher than the OCR-based spam image detection performance.

Depth Map Estimation Model Using 3D Feature Volume (3차원 특징볼륨을 이용한 깊이영상 생성 모델)

  • Shin, Soo-Yeon;Kim, Dong-Myung;Suh, Jae-Won
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.11
    • /
    • pp.447-454
    • /
    • 2018
  • This paper proposes a depth image generation algorithm of stereo images using a deep learning model composed of a CNN (convolutional neural network). The proposed algorithm consists of a feature extraction unit which extracts the main features of each parallax image and a depth learning unit which learns the parallax information using extracted features. First, the feature extraction unit extracts a feature map for each parallax image through the Xception module and the ASPP(Atrous spatial pyramid pooling) module, which are composed of 2D CNN layers. Then, the feature map for each parallax is accumulated in 3D form according to the time difference and the depth image is estimated after passing through the depth learning unit for learning the depth estimation weight through 3D CNN. The proposed algorithm estimates the depth of object region more accurately than other algorithms.

Development of Retina Healthcare Service System Using Smart Phone

  • Park, Gi Hun;Han, Ju Hyuck;Kim, Yong Suk
    • International Journal of Advanced Culture Technology
    • /
    • v.7 no.2
    • /
    • pp.227-237
    • /
    • 2019
  • In this paper, we have developed a Retina Healthcare Service System through which the patient himself/herself can manage his/her retina health. In the case of conventional portable ophthalmic cameras, patients cannot check their eye health on their own because most of them are used by doctor in environments where ophthalmography cannot be performed properly. This system consists of web, app and camera modules, and when a patient mounts a camera module for fundus photography on his / her smart phone and then photographs his / her fundus through the app, the image is transmitted to a server, and the transmitted image reads the fundus the patient's fundus image status in the fundus image reading model learned using deep learning. When the doctor expresses his/her opinions about the patient 's eye condition based on the reading result and the fundus photograph, the patient can check through the app and judge whether to receive ophthalmologic treatment.

Deep Learning-based Professional Image Interpretation Using Expertise Transplant (전문성 이식을 통한 딥러닝 기반 전문 이미지 해석 방법론)

  • Kim, Taejin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.79-104
    • /
    • 2020
  • Recently, as deep learning has attracted attention, the use of deep learning is being considered as a method for solving problems in various fields. In particular, deep learning is known to have excellent performance when applied to applying unstructured data such as text, sound and images, and many studies have proven its effectiveness. Owing to the remarkable development of text and image deep learning technology, interests in image captioning technology and its application is rapidly increasing. Image captioning is a technique that automatically generates relevant captions for a given image by handling both image comprehension and text generation simultaneously. In spite of the high entry barrier of image captioning that analysts should be able to process both image and text data, image captioning has established itself as one of the key fields in the A.I. research owing to its various applicability. In addition, many researches have been conducted to improve the performance of image captioning in various aspects. Recent researches attempt to create advanced captions that can not only describe an image accurately, but also convey the information contained in the image more sophisticatedly. Despite many recent efforts to improve the performance of image captioning, it is difficult to find any researches to interpret images from the perspective of domain experts in each field not from the perspective of the general public. Even for the same image, the part of interests may differ according to the professional field of the person who has encountered the image. Moreover, the way of interpreting and expressing the image also differs according to the level of expertise. The public tends to recognize the image from a holistic and general perspective, that is, from the perspective of identifying the image's constituent objects and their relationships. On the contrary, the domain experts tend to recognize the image by focusing on some specific elements necessary to interpret the given image based on their expertise. It implies that meaningful parts of an image are mutually different depending on viewers' perspective even for the same image. So, image captioning needs to implement this phenomenon. Therefore, in this study, we propose a method to generate captions specialized in each domain for the image by utilizing the expertise of experts in the corresponding domain. Specifically, after performing pre-training on a large amount of general data, the expertise in the field is transplanted through transfer-learning with a small amount of expertise data. However, simple adaption of transfer learning using expertise data may invoke another type of problems. Simultaneous learning with captions of various characteristics may invoke so-called 'inter-observation interference' problem, which make it difficult to perform pure learning of each characteristic point of view. For learning with vast amount of data, most of this interference is self-purified and has little impact on learning results. On the contrary, in the case of fine-tuning where learning is performed on a small amount of data, the impact of such interference on learning can be relatively large. To solve this problem, therefore, we propose a novel 'Character-Independent Transfer-learning' that performs transfer learning independently for each character. In order to confirm the feasibility of the proposed methodology, we performed experiments utilizing the results of pre-training on MSCOCO dataset which is comprised of 120,000 images and about 600,000 general captions. Additionally, according to the advice of an art therapist, about 300 pairs of 'image / expertise captions' were created, and the data was used for the experiments of expertise transplantation. As a result of the experiment, it was confirmed that the caption generated according to the proposed methodology generates captions from the perspective of implanted expertise whereas the caption generated through learning on general data contains a number of contents irrelevant to expertise interpretation. In this paper, we propose a novel approach of specialized image interpretation. To achieve this goal, we present a method to use transfer learning and generate captions specialized in the specific domain. In the future, by applying the proposed methodology to expertise transplant in various fields, we expected that many researches will be actively conducted to solve the problem of lack of expertise data and to improve performance of image captioning.

Single Image Super Resolution Reconstruction Based on Recursive Residual Convolutional Neural Network

  • Cao, Shuyi;Wee, Seungwoo;Jeong, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2019.06a
    • /
    • pp.98-101
    • /
    • 2019
  • At present, deep convolutional neural networks have made a very important contribution in single-image super-resolution. Through the learning of the neural networks, the features of input images are transformed and combined to establish a nonlinear mapping of low-resolution images to high-resolution images. Some previous methods are difficult to train and take up a lot of memory. In this paper, we proposed a simple and compact deep recursive residual network learning the features for single image super resolution. Global residual learning and local residual learning are used to reduce the problems of training deep neural networks. And the recursive structure controls the number of parameters to save memory. Experimental results show that the proposed method improved image qualities that occur in previous methods.

  • PDF

Fashion Image Searching Website based on Deep Learning Image Classification (딥러닝 기반의 이미지 분류를 이용한 패션 이미지 검색 웹사이트)

  • Lee, Hak-Jae;Lee, Seok-Jun;Choi, Moon-Hyuk;Kim, So-Yeong;Moon, Il-Young
    • Journal of Practical Engineering Education
    • /
    • v.11 no.2
    • /
    • pp.175-180
    • /
    • 2019
  • Existing fashion web sites show only the search results for one type of clothes in items such as tops and bottoms. As the fashion market grows, consumers are demanding a platform to find a variety of fashion information. To solve this problem, we devised the idea of linking image classification through deep learning with a website and integrating SNS functions. User uploads their own image to the web site and uses the deep learning server to identify, classify and store the image's characteristics. Users can use the stored information to search for the images in various combinations. In addition, communication between users can be actively performed through the SNS function. Through this, the plan to solve the problem of existing fashion-related sites was prepared.

Methods of Classification and Character Recognition for Table Items through Deep Learning (딥러닝을 통한 문서 내 표 항목 분류 및 인식 방법)

  • Lee, Dong-Seok;Kwon, Soon-Kak
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.5
    • /
    • pp.651-658
    • /
    • 2021
  • In this paper, we propose methods for character recognition and classification for table items through deep learning. First, table areas are detected in a document image through CNN. After that, table areas are separated by separators such as vertical lines. The text in document is recognized through a neural network combined with CNN and RNN. To correct errors in the character recognition, multiple candidates for the recognized result are provided for a sentence which has low recognition accuracy.

A Study on the Evaluation of Optimal Program Applicability for Face Recognition Using Machine Learning (기계학습을 이용한 얼굴 인식을 위한 최적 프로그램 적용성 평가에 대한 연구)

  • Kim, Min-Ho;Jo, Ki-Yong;You, Hee-Won;Lee, Jung-Yeal;Baek, Un-Bae
    • Korean Journal of Artificial Intelligence
    • /
    • v.5 no.1
    • /
    • pp.10-17
    • /
    • 2017
  • This study is the first attempt to raise face recognition ability through machine learning algorithm and apply to CRM's information gathering, analysis and application. In other words, through face recognition of VIP customer in distribution field, we can proceed more prompt and subdivided customized services. The interest in machine learning, which is used to implement artificial intelligence, has increased, and it has become an age to automate it by using machine learning beyond the way that a person directly models an object recognition process. Among them, Deep Learning is evaluated as an advanced technology that shows amazing performance in various fields, and is applied to various fields of image recognition. Face recognition, which is widely used in real life, has been developed to recognize criminals' faces and catch criminals. In this study, two image analysis models, TF-SLIM and Inception-V3, which are likely to be used for criminal face recognition, were selected, analyzed, and implemented. As an evaluation criterion, the image recognition model was evaluated based on the accuracy of the face recognition program which is already being commercialized. In this experiment, it was evaluated that the recognition accuracy was good when the accuracy of the image classification was more than 90%. A limit of our study which is a way to raise face recognition is left as a further research subjects.

CNN-Based Fake Image Identification with Improved Generalization (일반화 능력이 향상된 CNN 기반 위조 영상 식별)

  • Lee, Jeonghan;Park, Hanhoon
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.12
    • /
    • pp.1624-1631
    • /
    • 2021
  • With the continued development of image processing technology, we live in a time when it is difficult to visually discriminate processed (or tampered) images from real images. However, as the risk of fake images being misused for crime increases, the importance of image forensic science for identifying fake images is emerging. Currently, various deep learning-based identifiers have been studied, but there are still many problems to be used in real situations. Due to the inherent characteristics of deep learning that strongly relies on given training data, it is very vulnerable to evaluating data that has never been viewed. Therefore, we try to find a way to improve generalization ability of deep learning-based fake image identifiers. First, images with various contents were added to the training dataset to resolve the over-fitting problem that the identifier can only classify real and fake images with specific contents but fails for those with other contents. Next, color spaces other than RGB were exploited. That is, fake image identification was attempted on color spaces not considered when creating fake images, such as HSV and YCbCr. Finally, dropout, which is commonly used for generalization of neural networks, was used. Through experimental results, it has been confirmed that the color space conversion to HSV is the best solution and its combination with the approach of increasing the training dataset significantly can greatly improve the accuracy and generalization ability of deep learning-based identifiers in identifying fake images that have never been seen before.