• Title/Summary/Keyword: Deep Learning based System

Search Result 1,194, Processing Time 0.024 seconds

Deep Learning-based Approach for Visitor Detection and Path Tracking to Enhance Safety in Indoor Cultural Facilities (실내 문화시설 안전을 위한 딥러닝 기반 방문객 검출 및 동선 추적에 관한 연구)

  • Wonseop Shin;Seungmin, Rho
    • Journal of Platform Technology
    • /
    • v.11 no.4
    • /
    • pp.3-12
    • /
    • 2023
  • In the post-COVID era, the importance of quarantine measures is greatly emphasized, and accordingly, research related to the detection of mask wearing conditions and prevention of other infectious diseases using deep learning is being conducted. However, research on the detection and tracking of visitors to cultural facilities to prevent the spread of diseases is equally important, so research on this should be conducted. In this paper, a convolutional neural network-based object detection model is trained through transfer learning using a pre-collected dataset. The weights of the trained detection model are then applied to a multi-object tracking model to monitor visitors. The visitor detection model demonstrates results with a precision of 96.3%, recall of 85.2%, and an F1-score of 90.4%. Quantitative results of the tracking model include a MOTA (Multiple Object Tracking Accuracy) of 65.6%, IDF1 (ID F1 Score) of 68.3%, and HOTA (Higher Order Tracking Accuracy) of 57.2%. Furthermore, a qualitative comparison with other multi-object tracking models showcased superior results for the model proposed in this paper. The research of this paper can be applied to the hygiene systems within cultural facilities in the post-COVID era.

  • PDF

Estimation of Heading Date of Paddy Rice from Slanted View Images Using Deep Learning Classification Model

  • Hyeokjin Bak;Hoyoung Ban;SeongryulChang;Dongwon Gwon;Jae-Kyeong Baek;Jeong-Il Cho;Wan-Gyu Sang
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2022.10a
    • /
    • pp.80-80
    • /
    • 2022
  • Estimation of heading date of paddy rice is laborious and time consuming. Therefore, automatic estimation of heading date of paddy rice is highly essential. In this experiment, deep learning classification models were used to classify two difference categories of rice (vegetative and reproductive stage) based on the panicle initiation of paddy field. Specifically, the dataset includes 444 slanted view images belonging to two categories and was then expanded to include 1,497 images via IMGAUG data augmentation technique. We adopt two transfer learning strategies: (First, used transferring model weights already trained on ImageNet to six classification network models: VGGNet, ResNet, DenseNet, InceptionV3, Xception and MobileNet, Second, fine-tuned some layers of the network according to our dataset). After training the CNN model, we used several evaluation metrics commonly used for classification tasks, including Accuracy, Precision, Recall, and F1-score. In addition, GradCAM was used to generate visual explanations for each image patch. Experimental results showed that the InceptionV3 is the best performing model in terms of the accuracy, average recall, precision, and F1-score. The fine-tuned InceptionV3 model achieved an overall classification accuracy of 0.95 with a high F1-score of 0.95. Our CNN model also represented the change of rice heading date under different date of transplanting. This study demonstrated that image based deep learning model can reliably be used as an automatic monitoring system to detect the heading date of rice crops using CCTV camera.

  • PDF

Deep Learning-based system for plant disease detection and classification (딥러닝 기반 작물 질병 탐지 및 분류 시스템)

  • YuJin Ko;HyunJun Lee;HeeJa Jeong;Li Yu;NamHo Kim
    • Smart Media Journal
    • /
    • v.12 no.7
    • /
    • pp.9-17
    • /
    • 2023
  • Plant diseases and pests affect the growth of various plants, so it is very important to identify pests at an early stage. Although many machine learning (ML) models have already been used for the inspection and classification of plant pests, advances in deep learning (DL), a subset of machine learning, have led to many advances in this field of research. In this study, disease and pest inspection of abnormal crops and maturity classification were performed for normal crops using YOLOX detector and MobileNet classifier. Through this method, various plant pest features can be effectively extracted. For the experiment, image datasets of various resolutions related to strawberries, peppers, and tomatoes were prepared and used for plant pest classification. According to the experimental results, it was confirmed that the average test accuracy was 84% and the maturity classification accuracy was 83.91% in images with complex background conditions. This model was able to effectively detect 6 diseases of 3 plants and classify the maturity of each plant in natural conditions.

Deep Video Stabilization via Optical Flow in Unstable Scenes (동영상 안정화를 위한 옵티컬 플로우의 비지도 학습 방법)

  • Bohee Lee;Kwangsu Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.2
    • /
    • pp.115-127
    • /
    • 2023
  • Video stabilization is one of the camera technologies that the importance is gradually increasing as the personal media market has recently become huge. For deep learning-based video stabilization, existing methods collect pairs of video datas before and after stabilization, but it takes a lot of time and effort to create synchronized datas. Recently, to solve this problem, unsupervised learning method using only unstable video data has been proposed. In this paper, we propose a network structure that learns the stabilized trajectory only with the unstable video image without the pair of unstable and stable video pair using the Convolutional Auto Encoder structure, one of the unsupervised learning methods. Optical flow data is used as network input and output, and optical flow data was mapped into grid units to simplify the network and minimize noise. In addition, to generate a stabilized trajectory with an unsupervised learning method, we define the loss function that smoothing the input optical flow data. And through comparison of the results, we confirmed that the network is learned as intended by the loss function.

Multi-channel Video Analysis Based on Deep Learning for Video Surveillance (보안 감시를 위한 심층학습 기반 다채널 영상 분석)

  • Park, Jang-Sik;Wiranegara, Marshall;Son, Geum-Young
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.13 no.6
    • /
    • pp.1263-1268
    • /
    • 2018
  • In this paper, a video analysis is proposed to implement video surveillance system with deep learning object detection and probabilistic data association filter for tracking multiple objects, and suggests its implementation using GPU. The proposed video analysis technique involves object detection and object tracking sequentially. The deep learning network architecture uses ResNet for object detection and applies probabilistic data association filter for multiple objects tracking. The proposed video analysis technique can be used to detect intruders illegally trespassing any restricted area or to count the number of people entering a specified area. As a results of simulations and experiments, 48 channels of videos can be analyzed at a speed of about 27 fps and real-time video analysis is possible through RTSP protocol.

Deep Learning Research on Vessel Trajectory Prediction Based on AIS Data with Interpolation Techniques

  • Won-Hee Lee;Seung-Won Yoon;Da-Hyun Jang;Kyu-Chul Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.3
    • /
    • pp.1-10
    • /
    • 2024
  • The research on predicting the routes of ships, which constitute the majority of maritime transportation, can detect potential hazards at sea in advance and prevent accidents. Unlike roads, there is no distinct signal system at sea, and traffic management is challenging, making ship route prediction essential for maritime safety. However, the time intervals of the ship route datasets are irregular due to communication disruptions. This study presents a method to adjust the time intervals of data using appropriate interpolation techniques for ship route prediction. Additionally, a deep learning model for predicting ship routes has been developed. This model is an LSTM model that predicts the future GPS coordinates of ships by understanding their movement patterns through real-time route information contained in AIS data. This paper presents a data preprocessing method using linear interpolation and a suitable deep learning model for ship route prediction. The experimental results demonstrate the effectiveness of the proposed method with an MSE of 0.0131 and an Accuracy of 0.9467.

Speaker verification with ECAPA-TDNN trained on new dataset combined with Voxceleb and Korean (Voxceleb과 한국어를 결합한 새로운 데이터셋으로 학습된 ECAPA-TDNN을 활용한 화자 검증)

  • Keumjae Yoon;Soyoung Park
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.2
    • /
    • pp.209-224
    • /
    • 2024
  • Speaker verification is becoming popular as a method of non-face-to-face identity authentication. It involves determining whether two voice data belong to the same speaker. In cases where the criminal's voice remains at the crime scene, it is vital to establish a speaker verification system that can accurately compare the two voice evidence. In this study, to achieve this, a new speaker verification system was built using a deep learning model for Korean language. High-dimensional voice data with a high variability like background noise made it necessary to use deep learning-based methods for speaker matching. To construct the matching algorithm, the ECAPA-TDNN model, known as the most famous deep learning system for speaker verification, was selected. A large dataset of the voice data, Voxceleb, collected from people of various nationalities without Korean. To study the appropriate form of datasets necessary for learning the Korean language, experiments were carried out to find out how Korean voice data affects the matching performance. The results showed that when comparing models learned only with Voxceleb and models learned with datasets combining Voxceleb and Korean datasets to maximize language and speaker diversity, the performance of learning data, including Korean, is improved for all test sets.

Physical-Layer Technology Trend and Prospect for AI-based Mobile Communication (AI 기반 이동통신 물리계층 기술 동향과 전망)

  • Chang, K.;Ko, Y.J.;Kim, I.G.
    • Electronics and Telecommunications Trends
    • /
    • v.35 no.5
    • /
    • pp.14-29
    • /
    • 2020
  • The 6G mobile communication system will become a backbone infrastructure around 2030 for the future digital world by providing distinctive services such as five-sense holograms, ultra-high reliability/low-latency, ultra-high-precision positioning, ultra-massive connectivity, and gigabit-per-second data rate for aerial and maritime terminals. The recent remarkable advances in machine learning (ML) technology have recognized its efficiency in wireless networking fields such as resource management and cell-configuration optimization. Further innovation in ML is expected to play an important role in solving new problems arising from 6G network management and service delivery. In contrast, an approach to apply ML to a physical-layer (PHY) target tackles the basic problems in radio links, such as overcoming signal distortion and interference. This paper reviews the methodologies of ML-based PHY, relevant industrial trends, and candiate technologies, including future research directions and standardization impacts.

Multi-modal Sensor System and Database for Human Detection and Activity Learning of Robot in Outdoor (실외에서 로봇의 인간 탐지 및 행위 학습을 위한 멀티모달센서 시스템 및 데이터베이스 구축)

  • Uhm, Taeyoung;Park, Jeong-Woo;Lee, Jong-Deuk;Bae, Gi-Deok;Choi, Young-Ho
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.12
    • /
    • pp.1459-1466
    • /
    • 2018
  • Robots which detect human and recognize action are important factors for human interaction, and many researches have been conducted. Recently, deep learning technology has developed and learning based robot's technology is a major research area. These studies require a database to learn and evaluate for intelligent human perception. In this paper, we propose a multi-modal sensor-based image database condition considering the security task by analyzing the image database to detect the person in the outdoor environment and to recognize the behavior during the running of the robot.

Multiple Fusion-based Deep Cross-domain Recommendation (다중 융합 기반 심층 교차 도메인 추천)

  • Hong, Minsung;Lee, WonJin
    • Journal of Korea Multimedia Society
    • /
    • v.25 no.6
    • /
    • pp.819-832
    • /
    • 2022
  • Cross-domain recommender system transfers knowledge across different domains to improve the recommendation performance in a target domain that has a relatively sparse model. However, they suffer from the "negative transfer" in which transferred knowledge operates as noise. This paper proposes a novel Multiple Fusion-based Deep Cross-Domain Recommendation named MFDCR. We exploit Doc2Vec, one of the famous word embedding techniques, to fuse data user-wise and transfer knowledge across multi-domains. It alleviates the "negative transfer" problem. Additionally, we introduce a simple multi-layer perception to learn the user-item interactions and predict the possibility of preferring items by users. Extensive experiments with three domain datasets from one of the most famous services Amazon demonstrate that MFDCR outperforms recent single and cross-domain recommendation algorithms. Furthermore, experimental results show that MFDCR can address the problem of "negative transfer" and improve recommendation performance for multiple domains simultaneously. In addition, we show that our approach is efficient in extending toward more domains.