• Title/Summary/Keyword: CNN algorithm

Development of an AI-based Waterside Environment and Suspended Solids Detection Algorithm for the Use of Water Resource Satellite (수자원위성 활용을 위한 AI기반 수변환경 및 부유물 탐지 알고리즘 개발)

  • Jung Ho Im;Kyung Hwa Cho;Seon Young Park;Jae Se Lee;Duk Won Bae;Do Hyuck Kwon;Seok Min Hong;Byeong Cheol Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.4-4 / 2023
  • The water resources satellite, equipped with a C-band SAR sensor, was developed for monitoring water resources on the Korean Peninsula and is scheduled for launch in 2025; it is expected to support waterside environment and suspended solids detection, among various applications. The waterside environment plays a role in maintaining the stability of riparian ecosystems, so monitoring it is important. Compared with detection methods based on in-situ observation, satellite remote sensing can observe wide areas repeatedly and thus provide continuous information on the waterside environment and suspended solids. Based on these characteristics, detection studies of the waterside environment and suspended solids have been conducted with various multispectral and SAR (Synthetic Aperture Radar) satellite remote sensing data. In particular, fusing multispectral and SAR imagery has shown higher accuracy than techniques using a single image alone. Early studies mostly used simple algorithms, such as thresholding or analyzing the linear relationship between in-situ suspended solids concentrations and satellite data, but recently more complex and diverse AI algorithms, such as RF and CNN, have been applied to solve these problems with high accuracy. In this study, we develop an AI-based waterside environment and suspended solids detection algorithm for use with the water resources satellite. As substitute data for the water resources satellite, C-band SAR imagery from the European Space Agency's Sentinel-1 A/B satellites was used, with Sentinel-2 multispectral imagery as auxiliary data. The developed algorithm is expected to provide useful information for detecting environmental changes for water resources management.
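
The SAR-multispectral fusion the abstract describes can be illustrated with a small patch classifier. The following is a minimal sketch assuming co-registered, preprocessed Sentinel-1 (VV/VH) and Sentinel-2 band patches stacked channel-wise; channel counts, patch size, and class names are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical sketch: pixel-patch classifier fusing Sentinel-1 SAR and
# Sentinel-2 multispectral channels via early fusion (channel stacking).
# Shapes, channel counts, and class names are assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

N_SAR = 2      # e.g., Sentinel-1 VV and VH backscatter
N_MSI = 4      # e.g., a subset of Sentinel-2 bands (B2, B3, B4, B8)
PATCH = 32     # patch size in pixels
CLASSES = 3    # e.g., water / waterside vegetation / suspended solids

model = models.Sequential([
    layers.Input(shape=(PATCH, PATCH, N_SAR + N_MSI)),  # stacked bands
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.GlobalAveragePooling2D(),
    layers.Dense(CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy co-registered patches stand in for real preprocessed imagery.
x = np.random.rand(16, PATCH, PATCH, N_SAR + N_MSI).astype("float32")
y = np.random.randint(0, CLASSES, size=16)
model.fit(x, y, epochs=1, verbose=0)
```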

Development of Radar Super Resolution Algorithm based on Deep Learning (딥러닝 기술 기반의 레이더 초해상화 알고리즘 기술 개발)

  • Ho-Jun Kim;Sumiya Uranchimeg;Hemie Cho;Hyun-Han Kwon
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.417-417 / 2023
  • Urban flooding is a water-related disaster that can paralyze the main functions of a city, and the risk of flooding and inundation has recently been increasing due to localized heavy rainfall. Localized heavy rainfall refers to intense rain concentrated over a limited area for a short time, and the use of radar for rainfall estimation and forecasting in urban areas is increasing. Radar is an instrument that measures rainfall by analyzing the signals reflected from hydrometeors or clouds. The primary purpose of the Korea Meteorological Administration's S-band weather radar is to detect meteorological phenomena over South Korea and to prepare for severe weather. Its wide observation radius makes it less suitable for urban areas, whereas X-band dual-polarization radar provides observations with high spatio-temporal resolution, so its rainfall estimation and forecasting accuracy for urban areas is relatively high. Therefore, in this study, we developed a technique that generates high-resolution (HR) imagery from low-resolution (LR) S-band radar data using deep learning-based super-resolution. Super-resolution research began with simple interpolation methods such as Nearest Neighbor and Bicubic; recently, deep learning-based super-resolution algorithms have been studied mainly with convolutional neural networks (CNN), the most widely adopted approach. A super-resolution model was constructed using X-band radar reflectivity data as the high-resolution (HR) target and S-band radar reflectivity data as the low-resolution (LR) input. The data set was built around heavy rainfall events in Seoul during 2018-2020. The results of converting low-resolution images to high resolution with the super-resolution deep neural network trained on this data set were evaluated with metrics such as PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity). Through this study, it is expected that radar data with higher spatial resolution than existing methods can be produced.
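
As a sketch of the described pipeline, the block below trains an SRCNN-style network on paired low- and high-resolution reflectivity fields and evaluates with PSNR and SSIM; the layer sizes, grid shape, and dummy data are assumptions, not the study's actual configuration.

```python
# Minimal SRCNN-style super-resolution sketch, assuming S-band reflectivity
# upsampled to the X-band grid as input and X-band reflectivity as target.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(None, None, 1)),                      # one channel
    layers.Conv2D(64, 9, activation="relu", padding="same"),  # features
    layers.Conv2D(32, 1, activation="relu", padding="same"),  # mapping
    layers.Conv2D(1, 5, padding="same"),                      # reconstruction
])
model.compile(optimizer="adam", loss="mse")

# Dummy pairs: bicubic-upsampled LR field -> HR field.
lr_up = np.random.rand(8, 128, 128, 1).astype("float32")
hr = np.random.rand(8, 128, 128, 1).astype("float32")
model.fit(lr_up, hr, epochs=1, verbose=0)

# Evaluate with PSNR and SSIM, the metrics used in the study.
pred = model.predict(lr_up, verbose=0)
psnr = tf.reduce_mean(tf.image.psnr(hr, pred, max_val=1.0))
ssim = tf.reduce_mean(tf.image.ssim(hr, pred, max_val=1.0))
print(float(psnr), float(ssim))
```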

Implementation of an alarm system with AI image processing to detect whether a helmet is worn or not and a fall accident (헬멧 착용 여부 및 쓰러짐 사고 감지를 위한 AI 영상처리와 알람 시스템의 구현)

  • Yong-Hwa Jo;Hyuek-Jae Lee
    • Journal of the Institute of Convergence Signal Processing / v.23 no.3 / pp.150-159 / 2022
  • This paper presents a system that extracts the image objects of individual workers active in an industrial site and, by analyzing each in real time, detects whether a helmet is worn and whether a fall accident has occurred. To detect the image objects of workers, YOLO, a deep learning-based computer vision model, was used, and helmet detection was trained on the extracted images using 5,000 helmet training images. To detect fall accidents, the position of the head was tracked using MediaPipe's real-time Pose body-tracking algorithm, and its movement speed was calculated to determine whether the person fell. In addition, to improve the reliability of fall detection, a method that infers the posture of an object from the size of YOLO's bounding box was proposed and implemented. Finally, a Telegram API bot and a Firebase DB server were implemented to provide a notification service to administrators.
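
The fall heuristic the abstract describes, combining the head's movement speed with the shape of the detector's bounding box, can be sketched as follows; the thresholds, field names, and aspect-ratio test are illustrative assumptions, not the authors' exact values.

```python
# Illustrative fall heuristic: fast downward head motion plus a bounding
# box that has become wider than tall. Thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class Frame:
    t: float       # timestamp in seconds
    head_y: float  # head landmark y (pixels; larger = lower in image)
    box_w: float   # person bounding box width (pixels)
    box_h: float   # person bounding box height (pixels)

def is_fall(prev: Frame, curr: Frame,
            speed_thresh: float = 250.0,    # px/s, hypothetical
            aspect_thresh: float = 1.2) -> bool:
    dt = curr.t - prev.t
    if dt <= 0:
        return False
    head_speed = (curr.head_y - prev.head_y) / dt  # positive = moving down
    lying = (curr.box_w / curr.box_h) > aspect_thresh  # wide box = lying
    return head_speed > speed_thresh and lying

# Example: head drops quickly and the box becomes wider than tall.
print(is_fall(Frame(0.0, 200, 80, 180), Frame(0.2, 320, 190, 120)))  # True
```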

Identification of Multiple Cancer Cell Lines from Microscopic Images via Deep Learning (심층 학습을 통한 암세포 광학영상 식별기법)

  • Park, Jinhyung;Choe, Se-woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.374-376 / 2021
  • For the diagnosis of cancer-related diseases in clinical practice, pathological examination using biopsy is essential after basic diagnosis with imaging equipment. Such a biopsy requires the assistance of specialists with expert knowledge, such as oncologists and clinical pathologists, and a minimum amount of time for confirmation. In recent years, research on building systems capable of automatically classifying cancer cells using artificial intelligence has been actively conducted. However, previous studies show limitations in the types of cells covered and in accuracy, owing to limited algorithms. In this study, we propose a method to identify a total of four cancer cell lines through a convolutional neural network, a kind of deep learning. Optical images obtained through cell culture were trained with EfficientNet after pre-processing such as locating cells and segmenting images using OpenCV. Various hyperparameters were explored for the EfficientNet-based model, and InceptionV3 was trained for performance comparison and analysis. As a result, cells were classified with a high accuracy of 96.8%, and this analysis method is expected to be helpful in confirming cancer.
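
A minimal sketch of such a setup appears below: OpenCV-based pre-processing feeding an EfficientNet classifier over four classes. The threshold-based segmentation, input size, and layer choices are assumptions standing in for the paper's actual pipeline.

```python
# Hedged sketch: OpenCV pre-processing plus EfficientNet transfer learning
# for a four-class cell-line classifier. Paths and sizes are assumptions.
import cv2
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB0

NUM_CLASSES = 4  # four cancer cell lines, as in the study

def preprocess(path: str) -> np.ndarray:
    """Load a microscopy image, apply a simple Otsu segmentation, and
    resize to the network input size (illustrative pre-processing)."""
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    segmented = cv2.bitwise_and(img, img, mask=mask)
    return cv2.resize(segmented, (224, 224)).astype("float32")

base = EfficientNetB0(include_top=False, weights="imagenet",
                      input_shape=(224, 224, 3), pooling="avg")
model = models.Sequential([
    base,
    layers.Dropout(0.2),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```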

Correlation Extraction from KOSHA to enable the Development of Computer Vision based Risks Recognition System

  • Khan, Numan;Kim, Youjin;Lee, Doyeop;Tran, Si Van-Tien;Park, Chansik
    • International Conference on Construction Engineering and Project Management / 2020.12a / pp.87-95 / 2020
  • Generally, occupational safety, and particularly construction safety, is an intricate phenomenon. Industry professionals have devoted vital attention to enforcing Occupational Safety and Health (OHS) over the last three decades to enhance safety management in construction. Despite the efforts of safety professionals and government agencies, current safety management still relies on manual inspections, which are infrequent, time-consuming, and prone to error. Extensive research has been carried out to deal with the high fatality rates confronting the construction industry. Sensor systems, visualization-based technologies, and tracking techniques have been deployed by researchers in the last decade. Recently, computer vision has attracted significant attention in the construction industry worldwide. However, the literature reveals the narrow scope of computer vision technology for safety management; hence, broader-scope research on safety monitoring is needed to attain fully automatic job-site monitoring. In this regard, the development of a broader-scope computer vision-based risk recognition system for detecting correlations between construction entities is inevitable. For this purpose, a detailed analysis was conducted, and related rules depicting the correlations (positive and negative) between construction entities were extracted. The deep learning-supported Mask R-CNN algorithm was applied to train the model. As proof of concept, a prototype was developed based on real scenarios. The proposed approach is expected to enhance the effectiveness of safety inspection and reduce the burden on safety managers. It is anticipated that this approach may reduce injuries and fatalities by implementing the exact relevant safety rules and will contribute to enhancing overall safety management and monitoring performance.
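
To make the idea concrete, here is a hedged sketch that runs an off-the-shelf Mask R-CNN and applies a simple proximity rule between detected entities. The rule, score threshold, and margin merely stand in for the KOSHA-derived correlation rules, and the authors trained on construction-specific classes rather than the pretrained COCO classes used here.

```python
# Sketch: pre-trained Mask R-CNN detections checked against a simple
# proximity rule, as a stand-in for correlation rules between entities.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

def boxes_too_close(box_a, box_b, margin: float = 20.0) -> bool:
    """True if two xyxy boxes overlap or lie within `margin` pixels."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    return not (ax2 + margin < bx1 or bx2 + margin < ax1 or
                ay2 + margin < by1 or by2 + margin < ay1)

image = torch.rand(3, 480, 640)   # dummy frame; replace with a real one
with torch.no_grad():
    out = model([image])[0]
boxes = out["boxes"][out["scores"] > 0.8].tolist()

# Flag a (hypothetical) negative correlation: two entities too close.
for i in range(len(boxes)):
    for j in range(i + 1, len(boxes)):
        if boxes_too_close(boxes[i], boxes[j]):
            print(f"rule violation candidate: entities {i} and {j}")
```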

Artificial Neural Network-based Thermal Environment Prediction Model for Energy Saving of Data Center Cooling Systems (데이터센터 냉각 시스템의 에너지 절약을 위한 인공신경망 기반 열환경 예측 모델)

  • Chae-Young Lim;Chae-Eun Yeo;Seong-Yool Ahn;Sang-Hyun Lee
    • The Journal of the Convergence on Culture Technology / v.9 no.6 / pp.883-888 / 2023
  • Since data centers provide IT services 24 hours a day, 365 days a year, their power consumption is expected to rise to approximately 10% by 2030, and the introduction of high-density IT equipment will gradually increase. To ensure the stable operation of IT equipment, various kinds of research are required to save cooling energy and improve energy management. This study proposes the following process for energy saving in data centers: we conducted CFD modeling of the data center, proposed an artificial neural network-based thermal environment prediction model, compared actual measured data, the prediction model, and the CFD results, and finally evaluated the data center's thermal management performance. The predicted values of RCI, RTI, and PUE were also similar, depending on the normalization method used. Therefore, the algorithm proposed in this study is judged to be applicable as a thermal environment prediction model for data centers.
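
A minimal sketch of such an ANN predictor follows: a small MLP mapping cooling-system settings to rack inlet temperatures after min-max normalization. The feature names, shapes, and dummy data are assumptions, not the study's measured variables.

```python
# Minimal MLP sketch for thermal-environment prediction. Features and
# targets are assumed (e.g., CRAC settings -> rack sensor temperatures).
import numpy as np
from tensorflow.keras import layers, models

# Features: e.g., CRAC supply temperature, airflow rate, IT load (assumed).
x = np.random.rand(256, 3).astype("float32")
# Targets: e.g., inlet temperatures at several rack sensors (assumed).
y = np.random.rand(256, 4).astype("float32")

# Min-max normalization; the abstract notes results depend on this choice.
x_min, x_max = x.min(axis=0), x.max(axis=0)
x_norm = (x - x_min) / (x_max - x_min)

model = models.Sequential([
    layers.Input(shape=(3,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(4),                      # predicted sensor temperatures
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(x_norm, y, epochs=5, batch_size=32, verbose=0)
```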

Object Pose Estimation and Motion Planning for Service Automation System (서비스 자동화 시스템을 위한 물체 자세 인식 및 동작 계획)

  • Youngwoo Kwon;Dongyoung Lee;Hosun Kang;Jiwook Choi;Inho Lee
    • The Journal of Korea Robotics Society / v.19 no.2 / pp.176-187 / 2024
  • Recently, automated solutions using collaborative robots have been emerging in various industries. Their primary functions include pick-and-place, peg-in-hole, fastening and assembly, welding, and more, and they are being utilized and researched in various fields. How these robots are applied varies depending on the characteristics of the grippers attached to their ends. To grasp a variety of objects, a gripper with a high degree of freedom is required. In this paper, we propose a service automation system using a multi-degree-of-freedom gripper, collaborative robots, and vision sensors. Assuming various products are placed at a checkout counter, we use three cameras to recognize the objects, estimate their poses, and generate grasping points. The objects are grasped at these points by the multi-degree-of-freedom gripper, and experiments are conducted on barcode recognition, a key task in service automation. To recognize objects, we used a CNN (Convolutional Neural Network)-based algorithm, and a point cloud to estimate each object's 6D pose. Using the recognized object's 6D pose information, we generate grasping points for the multi-degree-of-freedom gripper and perform re-grasping in a direction that facilitates barcode scanning. The experiment was conducted with four selected objects, progressing through identification, 6D pose estimation, and grasping, and recording the success and failure of barcode recognition to demonstrate the effectiveness of the proposed system.
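
As a simplified stand-in for the CNN plus point cloud pose step, the sketch below derives a grasp pose from an object point cloud using its centroid and PCA axes; the function name, frame conventions, and dummy cloud are illustrative assumptions rather than the paper's method.

```python
# Simplified grasp-pose sketch: centroid gives position, PCA axes give an
# orientation frame. A stand-in for a full 6D pose estimation pipeline.
import numpy as np

def grasp_pose_from_cloud(points: np.ndarray):
    """points: (N, 3) object point cloud in the camera frame (assumed).
    Returns (position, rotation) with rotation columns = PCA axes."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # Principal axes via SVD; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    rotation = vt.T                       # 3x3, columns = principal axes
    if np.linalg.det(rotation) < 0:       # enforce a right-handed frame
        rotation[:, -1] *= -1
    return centroid, rotation

# Dummy elongated object: a noisy box-like cloud.
pts = np.random.randn(500, 3) * np.array([0.10, 0.03, 0.02])
pos, rot = grasp_pose_from_cloud(pts)
print(pos, rot[:, 0])   # grasp position and the object's major axis
```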

Recommender system using BERT sentiment analysis (BERT 기반 감성분석을 이용한 추천시스템)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.1-15 / 2021
  • If it is difficult for us to make decisions, we ask for advice from friends or the people around us. When we decide to buy a product online, we read anonymous reviews before buying. With the advent of the data-driven era, the development of IT technology is pouring out data from individuals and objects alike. Companies and individuals have accumulated, processed, and analyzed such large amounts of data that they can now make or execute decisions directly from data where they once depended on experts. Nowadays, recommender systems play a vital role in determining users' purchase preferences, and web services (Facebook, Amazon, Netflix, YouTube) use them to induce clicks. For example, YouTube's recommender system, used by a billion people worldwide every month, draws on the videos users liked and watched. Recommender system research is deeply linked to practical business, so many researchers are interested in building better solutions. Recommender systems generate recommendations from the information obtained about their users, because their development requires information on the items a user is likely to prefer. Through recommender systems, we have come to trust patterns and rules derived from data rather than empirical intuition, and the growing capacity of data has carried machine learning onward to deep learning. However, recommender systems are not a universal solution: they require data that is sufficient in quantity and free of scarcity, along with detailed information about the individual, and they work correctly only when these conditions hold. When the interaction log is insufficient, recommendation becomes a complex problem for both consumers and sellers, because the seller needs to make recommendations at a personal level, while the consumer needs reliable data to receive appropriate recommendations. In this paper, to improve accuracy in making "appropriate recommendations" to consumers, a recommender system combined with context-based deep learning is proposed. This research combines user-based data to create a hybrid recommender system: not a purely collaborative recommender system, but a collaborative extension that integrates user data with deep learning. Customer review data were used for the data set. Consumers buy products in online shopping malls and then write product reviews; ratings from buyers who have already purchased give users confidence before purchasing a product. However, recommendation systems mainly use scores or ratings, rather than reviews, to suggest items purchased by many users. In fact, consumer reviews contain product opinions and user sentiment that go into the evaluation, and by incorporating these elements, this paper aims to improve the recommendation system. The proposed algorithm addresses situations where individuals have difficulty selecting an item; consumer reviews and record patterns make it possible to rely on recommendations appropriately. The algorithm implements a recommendation system through collaborative filtering, and predictive accuracy is measured by Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Netflix has strategically used its recommender system through annual competitions to reduce RMSE, making practical use of predictive accuracy. Research on hybrid recommender systems combining NLP approaches, deep learning, and more for personalized recommendation has been increasing. Among NLP studies, sentiment analysis began to take shape in the mid-2000s as user review data increased. Sentiment analysis is a text classification task based on machine learning; machine learning-based sentiment analysis has the disadvantage that it struggles to capture the information expressed in a review, because the characteristics of the text are hard to take into account. In this study, we propose a deep learning recommender system that utilizes BERT's sentiment analysis to minimize the disadvantages of machine learning. The comparison models were recommender systems based on Naive-CF (collaborative filtering), SVD (singular value decomposition)-CF, MF (matrix factorization)-CF, BPR-MF (Bayesian personalized ranking matrix factorization)-CF, LSTM, CNN-LSTM, and GRU (Gated Recurrent Units). As a result of the experiment, the BERT-based recommender system performed best.
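
The core combination the abstract describes, review sentiment feeding a rating predictor evaluated by RMSE and MAE, can be sketched as below; the pretrained sentiment model, the linear blend with collaborative-filtering scores, and the toy data are assumptions, not the paper's architecture.

```python
# Hedged sketch: score review text with a pre-trained BERT-family sentiment
# model, blend with CF-style predicted ratings, evaluate with RMSE and MAE.
import numpy as np
from transformers import pipeline
from sklearn.metrics import mean_squared_error, mean_absolute_error

sentiment = pipeline("sentiment-analysis")   # default BERT-family model

reviews = ["Great quality, would buy again.", "Broke after two days."]
cf_pred = np.array([4.2, 3.9])               # hypothetical CF predictions (1-5)
true_rating = np.array([5.0, 1.0])

# Map sentiment to a 1-5 scale: POSITIVE -> high, NEGATIVE -> low.
scores = []
for out in sentiment(reviews):
    s = out["score"]
    scores.append(1.0 + 4.0 * (s if out["label"] == "POSITIVE" else 1.0 - s))
sent_pred = np.array(scores)

alpha = 0.5                                   # blend weight (assumed)
blended = alpha * cf_pred + (1 - alpha) * sent_pred

rmse = np.sqrt(mean_squared_error(true_rating, blended))
mae = mean_absolute_error(true_rating, blended)
print(f"RMSE={rmse:.3f} MAE={mae:.3f}")
```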

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.163-177 / 2019
  • As smartphones become widely used, human activity recognition (HAR) tasks for recognizing the personal activities of smartphone users from multimodal data have been actively studied. The research area is expanding from recognizing an individual user's simple body movements to recognizing low-level and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. In contrast, physical sensors, including accelerometer, magnetic field, and gyroscope sensors, are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a deep learning-based method for detecting accompanying status using only multimodal physical sensor data, such as the accelerometer, magnetic field, and gyroscope, is proposed. Accompanying status was defined by redefining part of user interaction behavior: whether the user is accompanying an acquaintance at close distance and whether the user is actively communicating with that acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation is proposed. First, a data preprocessing method was introduced, consisting of time synchronization of multimodal data from different physical sensors, data normalization, and sequence data generation. Nearest-neighbor interpolation was applied to synchronize the timestamps of data collected from different sensors. Normalization was performed on each x, y, z axis value of the sensor data, and sequence data were generated with the sliding window method. The sequence data then became the input to the CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of three convolutional layers and had no pooling layer, to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them, and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by a softmax classifier. The loss function was cross entropy, and the weights were randomly initialized from a normal distribution with mean 0 and standard deviation 0.1. The model was trained using the adaptive moment estimation (Adam) optimization algorithm with a mini-batch size of 128. Dropout was applied to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001 and decreased exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data from a total of 18 subjects. Using these data, the model classified accompanying and conversation with 98.74% and 98.83% accuracy, respectively. Both the F1 score and the accuracy of the model were higher than those of a majority-vote classifier, a support vector machine, and a deep recurrent neural network.
In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, we will further study transfer learning methods that enable models tailored to the training data to transfer to evaluation data that follows a different distribution. We expect to obtain a model with robust recognition performance against changes in data not considered at the model learning stage.
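
Since the abstract specifies the architecture and training setup in detail, the sketch below reconstructs it in Keras; the window length, channel count, and kernel sizes are assumptions, while the layer counts, LSTM width, dropout placement, optimizer, batch size, and per-epoch 0.99 learning-rate decay follow the description above.

```python
# Sketch of the described CNN + LSTM classifier: three convolutional layers
# with no pooling, dropout on the LSTM inputs, two LSTM layers of 128 cells,
# and a softmax output, trained with Adam (lr 0.001, x0.99 per epoch) and
# mini-batches of 128. WINDOW, CHANNELS, and kernel size are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

WINDOW = 128   # samples per sliding window (assumed)
CHANNELS = 9   # 3 axes x {accelerometer, magnetic field, gyroscope}
CLASSES = 2    # e.g., accompanying vs. not accompanying

model = models.Sequential([
    layers.Input(shape=(WINDOW, CHANNELS)),
    layers.Conv1D(64, 5, activation="relu", padding="same"),
    layers.Conv1D(64, 5, activation="relu", padding="same"),
    layers.Conv1D(64, 5, activation="relu", padding="same"),  # no pooling,
    # preserving the temporal resolution of the sequence, per the paper
    layers.Dropout(0.5),                   # dropout on the LSTM inputs
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(128),
    layers.Dense(CLASSES, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Decay the learning rate by 0.99 at the end of each epoch, as described.
decay = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 0.001 * (0.99 ** epoch))
# model.fit(x_train, y_train, batch_size=128, epochs=50, callbacks=[decay])
```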

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. They have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have produced state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models, and in recent years these supervised models have gained more popularity than unsupervised models such as deep belief networks, because they have shown successful applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation, an abbreviation for "backward propagation of errors," is a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network; the gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. This architecture makes convolutional networks fast to train, which in turn helps us train deep, multi-layer networks that are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks rest on three basic ideas: local receptive fields, shared weights, and pooling. By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each of the local receptive fields, so all the neurons in a hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers, usually placed immediately after convolutional layers; what pooling layers do is simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep networks took weeks several years ago, but thanks to progress in GPUs and algorithmic enhancements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state that allows the network to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, including vanishing and exploding gradients: the gradient can get smaller and smaller as it is propagated back through layers, which makes learning in early layers extremely slow. The problem gets worse in RNNs, since gradients are propagated backward not only through layers but also through time; if the network runs for a long time, the gradient can become extremely unstable and hard to learn from. It has proved possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
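
The three CNN ideas described above map directly onto standard library layers; the following minimal sketch is illustrative, with the input size and layer widths chosen arbitrarily: Conv2D implements local receptive fields with shared weights, and MaxPooling2D condenses the resulting feature maps.

```python
# Minimal illustration of the three CNN ideas: local receptive fields and
# shared weights (Conv2D) plus pooling (MaxPooling2D). Sizes are arbitrary.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    # 5x5 local receptive field; one shared weight set per feature map.
    layers.Conv2D(20, 5, activation="relu"),
    layers.MaxPooling2D(2),          # pooling condenses the feature maps
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```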