• Title/Summary/Keyword: 지능영상처리 (intelligent image processing)

Classification of Diabetic Retinopathy using Mask R-CNN and Random Forest Method

  • Jung, Younghoon;Kim, Daewon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.12
    • /
    • pp.29-40
    • /
    • 2022
  • In this paper, we study a system that detects and analyzes the pathological features of diabetic retinopathy using Mask R-CNN, a deep learning technique, together with a Random Forest classifier, and that diagnoses diabetic retinopathy automatically. Diabetic retinopathy can be diagnosed from fundus images taken with special equipment, although brightness, color tone, and contrast may vary depending on the device. It is possible to research and develop an automatic diagnosis system that uses artificial intelligence to help ophthalmologists make medical judgments. The system detects pathological features such as microvascular perfusion and retinal hemorrhage using the Mask R-CNN technique, and then, after pre-processing, classifies the eye as normal or abnormal using a Random Forest classifier. To improve the detection performance of the Mask R-CNN algorithm, image augmentation was performed before training. Dice similarity coefficients and mean accuracy were used as evaluation metrics for detection accuracy. With the Faster R-CNN method as a control, the Mask R-CNN method in this study achieved an average Dice coefficient of 90% and a mean accuracy of 91%. When a Random Forest classifier was trained on the detected pathological symptoms to diagnose diabetic retinopathy, the accuracy was 99%.
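The Dice similarity coefficient named in the abstract above can be illustrated with a minimal sketch. The function name `dice_coefficient` and the toy masks are assumptions made for illustration; this is not the authors' evaluation code.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as nested lists."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x3 masks: a predicted lesion mask and a reference annotation.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)
print(f"Dice = {score:.3f}")
```

Here each mask has three foreground pixels and two of them overlap, so the score is 2·2 / (3 + 3).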

Efficient Poisoning Attack Defense Techniques Based on Data Augmentation (데이터 증강 기반의 효율적인 포이즈닝 공격 방어 기법)

  • So-Eun Jeon;Ji-Won Ock;Min-Jeong Kim;Sa-Ra Hong;Sae-Rom Park;Il-Gu Lee
    • Convergence Security Journal
    • /
    • v.22 no.3
    • /
    • pp.25-32
    • /
    • 2022
  • Recently, the image processing industry has grown as deep learning-based technology has been introduced in the image recognition and detection field. With the development of deep learning technology, vulnerabilities of learning models to adversarial attacks continue to be reported. However, studies on countermeasures against poisoning attacks, which inject malicious data during learning, are insufficient. The conventional countermeasure against poisoning attacks is limited in that a separate detection and removal operation must be performed by examining the training data each time. Therefore, in this paper, we propose a technique that reduces the attack success rate by applying modifications to the training data and inference data, without a separate detection and removal process for the poison data. The One-shot kill poison attack, a clean-label poison attack proposed in previous studies, was used as the attack model. Attack performance was evaluated separately for a general attacker and an intelligent attacker, according to the attacker's strategy. According to the experimental results, when the proposed defense mechanism is applied, the attack success rate can be reduced by up to 65% compared to the conventional method.

Implementation of a walking-aid light with machine vision-based pedestrian signal detection (머신비전 기반 보행신호등 검출 기능을 갖는 보행등 구현)

  • Jihun Koo;Juseong Lee;Hongrae Cho;Ho-Myoung An
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.17 no.1
    • /
    • pp.31-37
    • /
    • 2024
  • In this study, we propose a machine vision-based pedestrian signal detection algorithm that operates efficiently even in computing resource-constrained environments. To address issues such as light glare, the algorithm minimizes the impact of ambient lighting by sequentially applying HSV color space-based image processing, binarization, morphological operations, labeling, and other steps. The algorithm is kept relatively simple so that it runs smoothly and reliably on embedded systems with limited computing resources. Moreover, the proposed pedestrian signal system not only includes pedestrian signal detection but also incorporates IoT functionality, allowing wireless integration with a web server. This integration enables users to conveniently monitor and control the status of the signal system through the web server. We also successfully implemented control of 50 W LED pedestrian signals. The proposed system aims to provide rapid and efficient pedestrian signal detection and control within resource-constrained environments, with a view to its potential applicability in real-world road scenarios. Anticipated contributions include fostering the establishment of safer and more intelligent traffic systems.
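The step sequence named in the abstract above (HSV thresholding, binarization, morphological operations, labeling) can be sketched in plain Python. Everything specific here is an assumption for illustration: the red target color, the threshold values, the 3x3 structuring element, and all function names. It is not the authors' embedded implementation.

```python
import colorsys
from collections import deque


def red_mask(image, hue_tol=0.05, min_sat=0.5, min_val=0.5):
    """Binarize rows of (r, g, b) pixels (0..255) by HSV thresholds around red."""
    mask = []
    for row in image:
        out = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_red = (h <= hue_tol or h >= 1 - hue_tol) and s >= min_sat and v >= min_val
            out.append(1 if is_red else 0)
        mask.append(out)
    return mask


def _window(mask, y, x):
    """3x3 neighborhood values; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0
            for ny in (y - 1, y, y + 1) for nx in (x - 1, x, x + 1)]


def erode(mask):
    return [[1 if all(_window(mask, y, x)) else 0 for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    return [[1 if any(_window(mask, y, x)) else 0 for x in range(len(mask[0]))]
            for y in range(len(mask))]


def label_components(mask):
    """4-connected labeling; returns area and bounding box of each blob."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys, xs = [p[0] for p in pixels], [p[1] for p in pixels]
            components.append({"area": len(pixels),
                               "bbox": (min(ys), min(xs), max(ys), max(xs))})
    return components


# Toy 6x6 frame: a 3x3 red "lamp" plus one isolated red pixel standing in for glare.
RED, BLACK = (255, 0, 0), (0, 0, 0)
frame = [[BLACK] * 6 for _ in range(6)]
for yy in range(1, 4):
    for xx in range(1, 4):
        frame[yy][xx] = RED
frame[5][5] = RED

opened = dilate(erode(red_mask(frame)))  # morphological opening removes the speck
components = label_components(opened)
print(components)
```

The opening step (erosion followed by dilation) is one common way to suppress small bright specks before labeling; the abstract does not state which morphological operations the authors chose.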

A Study on the Effect of Using Sentiment Lexicon in Opinion Classification (오피니언 분류의 감성사전 활용효과에 대한 연구)

  • Kim, Seungwoo;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.133-148
    • /
    • 2014
  • Recently, with the advent of various information channels, the volume of data has continued to grow. The main cause of this phenomenon can be found in the significant increase of unstructured data, as the use of smart devices enables users to create data in the form of text, audio, images, and video. Among the various types of unstructured data, users' opinions and a variety of information are clearly expressed in text data such as news, reports, papers, and various articles. Thus, active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share certain important characteristics; for example, they not only use text documents as input data, but also use many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case. However, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification. Thus, in order to distinguish between the two, a precise definition of each method is needed. In this paper, we found that it is very difficult to distinguish between the two methods clearly with respect to the purpose of analysis and the type of results. We conclude that the most definitive criterion to distinguish text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon.
We first established two prediction models, one based on opinion mining and the other on text mining. Next, we compared the main processes used by the two prediction models. Finally, we compared their prediction accuracy. We then analyzed 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy than the text mining model. Moreover, in the lift chart generated by the opinion mining based model, the prediction accuracy for the documents with strong certainty was higher than that for the documents with weak certainty. Most of all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain. Additionally, the classification results can be clearly explained by using a sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Second, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. However, this research contributes a performance comparison of text mining analysis and opinion mining analysis for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.
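The criterion the abstract above settles on, whether an analysis uses a sentiment lexicon, can be made concrete with a minimal lexicon-based classifier. The tiny lexicon, the scoring rule, and the names `SENTIMENT_LEXICON` and `classify_opinion` are all hypothetical; the paper's actual lexicon and models are not described here.

```python
import re

# Hypothetical hand-made lexicon: word -> polarity (+1 positive, -1 negative).
SENTIMENT_LEXICON = {
    "great": 1, "excellent": 1, "enjoyable": 1,
    "boring": -1, "terrible": -1, "dull": -1,
}


def classify_opinion(text, lexicon=SENTIMENT_LEXICON):
    """Sum the polarities of known words; the sign of the sum decides the label."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return label, score


for review in ("A great cast and an excellent script.",
               "Boring plot, terrible pacing, but great music."):
    print(review, "->", classify_opinion(review))
```

Because each decision traces back to specific lexicon entries, the result is easy to explain, and the same lexicon can be reused on similar review text without retraining, which matches the two advantages the abstract attributes to opinion mining. The magnitude of the score could serve as a rough certainty signal, though that is only one possible reading of the abstract's "strong" and "weak" certainty.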

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
    • Journal of Service Research and Studies
    • /
    • v.14 no.1
    • /
    • pp.13-26
    • /
    • 2024
  • Multi-modal generation is the process of generating results based on a variety of information, such as text, images, and audio. With the rapid development of AI technology, there is a growing number of multi-modal based systems that synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition to describe a person and generate a montage image. While the existing montage generation technology is based on the appearance of Westerners, the montage generation system developed in this paper learns a model based on Korean facial features. Therefore, it is possible to create more accurate and effective Korean montage images based on multi-modal voice and text specific to Korean. Since the developed montage generation app can be utilized as a draft montage, it can dramatically reduce the manual labor of existing montage production personnel. For this purpose, we utilized persona-based virtual person montage data provided by the AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform aimed at providing a one-stop service by building artificial intelligence learning data necessary for the development of AI technology and services. The image generation system was implemented using VQGAN, a deep learning model used to generate high-resolution images, and the KoDALLE model, a Korean-based image generation model. It can be confirmed that the learned AI model creates a montage image of a face that is very similar to what was described using voice and text. To verify the practicality of the developed montage generation app, 10 testers used it and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal detection, to describe and image facial features.

System Development for Measuring Group Engagement in the Art Center (공연장에서 다중 몰입도 측정을 위한 시스템 개발)

  • Ryu, Joon Mo;Choi, Il Young;Choi, Lee Kwon;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.45-58
    • /
    • 2014
  • Korean cultural content has spread worldwide as the Korean Wave sweeps the world, and this content stands at the center of the Korean Wave. Each country is working to develop its culture industry in order to improve its national brand and create high added value. Performance content is an important factor in generating arousal in the enterprise industry. For advertisers, raising arousal, confidence in the product, and a positive attitude among the public is an important factor. The same holds for cultural content: if content is trusted by everyone, audiences will pass information on to those around them and spread word of mouth. Many researchers have therefore studied how to measure a person's arousal through statistical surveys, physiological responses, body movement, and facial expressions. First, statistical surveys cannot measure each person's arousal in real time, and good survey results are hard to obtain after the audience has watched the content. Second, physiological responses are constrained by the surroundings, because the experimenter has to set up sensors at each person's chair or space; it is also difficult to handle the amount of information the sensors provide in real time. Third, body movement is easy to capture with a camera, but it is difficult to set up the experimental conditions, measure body language, and interpret its meaning. Lastly, many researchers study facial expression, measuring facial expression, eye tracking, and face pose. Most previous studies of arousal and interest are limited to the reaction of a single person and are difficult to apply to multiple audiences. Their methods have particular requirements, for example room lighting, and are restricted to one person under special laboratory conditions. In addition, arousal with respect to the content needs to be measured, but it is difficult to define, and audience reactions are not easy to collect immediately. In a theater, many audience members watch a performance together.
We propose a system that measures the reactions of multiple audience members in real time during a performance. We use a difference image analysis method for multiple audiences, but it is weak in dark environments. To overcome the dark environment during recording, we use an IR camera, which can capture images in dark areas. In addition, we present the Multi-Audience Engagement Index (MAEI), computed by an algorithm that takes sound, audience movement, and eye tracking values as its sources. The algorithm calculates audience arousal from the mobile survey, sound values, audience reactions, and audience eye tracking. To improve the accuracy of the Multi-Audience Engagement Index, we compare it with the mobile survey. The result is then sent to the reporting system and proposed to interested persons. Mobile surveys are easy and fast, and they minimize visitors' discomfort. Additional information can also be provided through the mobile channel. The mobile application communicates with the database, which stores real-time information on visitors' attitudes toward the content. The database can provide a different survey each time based on the information provided. Example survey items include: Impressive scene, Satisfied, Touched, Interested, Didn't pay attention, and so on. The system consists of three parts: External Device, Server, and Internal Device. The External Device records multiple audience members in the dark with an IR camera and captures the sound signal. We also conduct surveys with the mobile application and send the data to the ERD Server DB. The Server holds content data such as each scene's weight value and the group audience weight index, along with the camera control program and the algorithm, and calculates the Multi-Audience Engagement Index. The Internal Device presents the Multi-Audience Engagement Index through a Web UI, printouts, and an on-site display monitor. Our system is being test-operated by Mogencelab in the DMC display exhibition hall in Sangam-dong, Mapo-gu, Seoul. We are still collecting data from visitors daily.
If this system identifies the factors behind audience arousal, it will be very useful for creating content.

Dynamic Traffic Assignment Using Genetic Algorithm (유전자 알고리즘을 이용한 동적통행배정에 관한 연구)

  • Park, Kyung-Chul;Park, Chang-Ho;Chon, Kyung-Soo;Rhee, Sung-Mo
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.8 no.1 s.15
    • /
    • pp.51-63
    • /
    • 2000
  • Dynamic traffic assignment (DTA) has been a topic of substantial research during the past decade. While DTA is gradually maturing, many aspects of DTA still need improvement, especially regarding its formulation and solution algorithm. Recently, with its promise for ITS (Intelligent Transportation System) and GIS (Geographic Information System) applications, DTA has received increasing attention. This potential also implies higher requirements for DTA modeling, especially regarding its solution efficiency for real-time implementation. However, DTA poses many mathematical difficulties in the search process due to the complexity of its spatial and temporal variables. Although many solution algorithms have been studied, conventional methods cannot find the solution when the objective function or the constraints are not convex. In this paper, a genetic algorithm is applied to find the solution of DTA, and the Merchant-Nemhauser model is used as the DTA model because it has a nonconvex constraint set. To handle the nonconvex constraint set, the GENOCOP III system, which is a kind of genetic algorithm, is used in this study. Results for the sample network are compared with the results of the conventional method.
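To give a feel for why a genetic algorithm suits a nonconvex constraint set, here is a minimal sketch on a toy problem. It is neither GENOCOP III nor the Merchant-Nemhauser model: the objective, the "outside the unit disc" constraint, the operators, and all names and parameter values are assumptions chosen only for illustration.

```python
import random


def objective(p):
    x, y = p
    return (x - 0.2) ** 2 + y ** 2


def feasible(p):
    x, y = p
    in_box = -2.0 <= x <= 2.0 and -2.0 <= y <= 2.0
    outside_disc = x * x + y * y >= 1.0  # nonconvex: the feasible region has a hole
    return in_box and outside_disc


def random_feasible(rng):
    """Rejection sampling: draw points in the box until one is feasible."""
    while True:
        p = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
        if feasible(p):
            return p


def tournament(pop, rng, k=3):
    return min(rng.sample(pop, k), key=objective)


def evolve(pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [random_feasible(rng) for _ in range(pop_size)]
    history = [min(map(objective, pop))]
    for _ in range(generations):
        children = [min(pop, key=objective)]  # elitism: keep the current best
        while len(children) < pop_size:
            a, b = tournament(pop, rng), tournament(pop, rng)
            w = rng.random()
            # Arithmetic crossover plus Gaussian mutation.
            child = tuple(w * ai + (1 - w) * bi + rng.gauss(0.0, 0.1)
                          for ai, bi in zip(a, b))
            # A blend of two feasible points can fall inside the hole;
            # such offspring are discarded and a parent is kept instead.
            children.append(child if feasible(child) else a)
        pop = children
        history.append(min(map(objective, pop)))
    return min(pop, key=objective), history


best, history = evolve()
print("best point:", best, "objective:", objective(best))
```

In this toy problem no feasible point can score below 0.64 (attained at (1, 0)), so the printed objective shows how close the search came. Discarding infeasible offspring is the simplest possible constraint handling and was chosen here only for brevity; the abstract does not detail how the authors' system treats infeasible candidates.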
