• Title/Summary/Keyword: data preprocessing


Computer vision-based remote displacement monitoring system for in-situ bridge bearings robust to large displacement induced by temperature change

  • Kim, Byunghyun;Lee, Junhwa;Sim, Sung-Han;Cho, Soojin;Park, Byung Ho
    • Smart Structures and Systems / v.30 no.5 / pp.521-535 / 2022
  • Efficient management of deteriorating civil infrastructure is one of the most important research topics in many developed countries. In particular, remote displacement measurement of bridges using linear variable differential transformers, global positioning systems, laser Doppler vibrometers, and computer vision technologies has been attempted extensively. This paper proposes a remote displacement measurement system using closed-circuit televisions (CCTVs) and a computer-vision-based method for in-situ bridge bearings that undergo relatively large displacement due to long-term temperature change. The hardware of the system is composed of a reference target for displacement measurement, a CCTV to capture target images, a gateway to transmit images via a mobile network, and a central server to store and process the transmitted images. The use of CCTVs capable of night-vision capture and wireless data communication enables long-term, 24-hour monitoring over a wide area of the bridge. The computer vision algorithm that estimates displacement from the images involves image preprocessing to enhance the circular features of the target, a circular Hough transform to detect circles on the target in the whole field of view (FOV), and a homography transformation to convert the movement of the target in the images into an actual expansion displacement. The simple target design and robust circle detection algorithm help to measure displacement using target images where the targets are far apart from each other. The proposed system is installed at the Tancheon Overpass in Seoul, and field experiments are performed to evaluate the accuracy of circle detection and displacement measurement. The circle detection accuracy is evaluated using 28,542 images captured from 71 CCTVs installed at the testbed; circle detection on the target fails in only 48 images (0.168%), because of subpar imaging conditions. The accuracy of displacement measurement is evaluated using images captured over 17 days from three CCTVs; the average and root-mean-square errors are 0.10 and 0.131 mm, respectively, compared with a similar displacement measurement. The long-term operation of the system, evaluated using 8 months of data, shows the high accuracy and stability of the proposed system.
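
The homography step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the matrix `H`, the pixel coordinates, and the function names are hypothetical, and the circle centers are assumed to have been detected already (for example by a circular Hough transform).

```python
def apply_homography(H, point):
    """Map an image point (u, v) to plane coordinates (x, y) with a 3x3 homography H."""
    u, v = point
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)  # divide by the homogeneous coordinate


def displacement(H, ref_center, cur_center):
    """Displacement of a circle center on the target plane, in the units encoded by H."""
    x0, y0 = apply_homography(H, ref_center)
    x1, y1 = apply_homography(H, cur_center)
    return (x1 - x0, y1 - y0)


# Hypothetical homography: a pure scale of 0.5 mm per pixel (assumption for illustration).
H = [[0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0],
     [0.0, 0.0, 1.0]]

# Hypothetical circle centers (pixels) in a reference image and a later image.
dx, dy = displacement(H, (100.0, 200.0), (130.0, 200.0))
print(dx, dy)
```

In practice the homography would be estimated from the known geometry of the reference target rather than written by hand.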

K-Means Clustering Algorithm and CPA based Collinear Multiple Static Obstacle Collision Avoidance for UAVs (K-평균 군집화 알고리즘 및 최근접점 기반 무인항공기용 공선상의 다중 정적 장애물 충돌 회피)

  • Hyeji Kim;Hyeok Kang;Seongbong Lee;Hyeongseok Kim;Dongjin Lee
    • Journal of Advanced Navigation Technology / v.26 no.6 / pp.427-433 / 2022
  • Collision avoidance for UAVs requires obstacle detection, collision recognition, and avoidance technologies. In this paper, considering multiple collinear static obstacles, we propose an obstacle detection algorithm using LiDAR and a collision recognition and avoidance algorithm based on the closest point of approach (CPA). Before obstacle detection, preprocessing removes the ground from the LiDAR measurement data. We then detect and classify obstacles in the preprocessed data using the K-means clustering algorithm. We also estimate the absolute positions of the detected obstacles using relative navigation and correct the estimated positions with a low-pass filter. To avoid collisions with the detected static obstacles, we use the CPA-based collision recognition and avoidance algorithm. The information on the obstacles to be avoided is updated using the distance between obstacles, and collision recognition and avoidance are performed using the updated obstacle information. Finally, by analyzing the obstacle position estimation, collision recognition, and collision avoidance results in the Gazebo simulation environment, we verified that collision avoidance is performed successfully.
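
As a rough illustration of CPA-based collision recognition for a static obstacle, the sketch below computes the time and distance of closest approach in 2D under a constant-velocity assumption. The function names, the safety radius, and the example values are hypothetical and are not taken from the paper.

```python
import math


def closest_point_of_approach(uav_pos, uav_vel, obstacle_pos):
    """Return (t_cpa, d_cpa) for a UAV moving at constant velocity past a static obstacle."""
    rx = obstacle_pos[0] - uav_pos[0]
    ry = obstacle_pos[1] - uav_pos[1]
    speed_sq = uav_vel[0] ** 2 + uav_vel[1] ** 2
    if speed_sq == 0.0:
        # A hovering UAV never gets closer than its current distance.
        return 0.0, math.hypot(rx, ry)
    # Time at which the distance is minimal; clamp to 0 if the obstacle is behind.
    t_cpa = max(0.0, (rx * uav_vel[0] + ry * uav_vel[1]) / speed_sq)
    cx = uav_pos[0] + uav_vel[0] * t_cpa
    cy = uav_pos[1] + uav_vel[1] * t_cpa
    return t_cpa, math.hypot(obstacle_pos[0] - cx, obstacle_pos[1] - cy)


def collision_predicted(uav_pos, uav_vel, obstacle_pos, safety_radius):
    """Flag a collision when the CPA distance falls inside a (hypothetical) safety radius."""
    _, d_cpa = closest_point_of_approach(uav_pos, uav_vel, obstacle_pos)
    return d_cpa < safety_radius


# Hypothetical scenario: UAV at the origin flying along +x at 2 m/s.
t, d = closest_point_of_approach((0.0, 0.0), (2.0, 0.0), (10.0, 1.0))
print(t, d)
```

With several obstacles, the same check could be run per obstacle and the nearest predicted conflict handled first; the abstract does not specify how the authors order them.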

Study on Applicability of Cloth Simulation Filtering Algorithm for Segmentation of Ground Points from Drone LiDAR Point Clouds in Mountainous Areas (산악지형 드론 라이다 데이터 점군 분리를 위한 CSF 알고리즘 적용에 관한 연구)

  • Seul Koo;Eon Taek Lim;Yong Han Jung;Jae Wook Suk;Seong Sam Kim
    • Korean Journal of Remote Sensing / v.39 no.5_2 / pp.827-835 / 2023
  • Drone light detection and ranging (LiDAR) is a state-of-the-art surveying technology that enables close investigation of the tops of mountain slopes and of inaccessible slopes, and it is being used for field surveys in mountainous terrain. To build topographic information using drone LiDAR, a preprocessing step is required to effectively separate ground and non-ground points in the acquired point cloud. In this study, point cloud data of mountainous terrain were acquired using an aerial LiDAR mounted on a commercial drone, and the applicability and accuracy of the cloth simulation filtering (CSF) algorithm, one of the ground separation techniques, were verified. Applying the algorithm gave a ground/non-ground separation accuracy of 84.3% and a kappa coefficient of 0.71, indicating that drone LiDAR data can be used effectively for landslide field surveys in mountainous terrain.
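
The two reported metrics, overall accuracy and Cohen's kappa, can both be derived from a ground/non-ground confusion matrix. The sketch below shows the standard formulas; the counts in `cm` are invented for illustration and are not the study's data.

```python
def accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    cm[i][j] = number of points with reference class i that were classified as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / n  # observed agreement (accuracy)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (ground, non-ground), columns = classified.
cm = [[45, 5],
      [10, 40]]
acc, kappa = accuracy_and_kappa(cm)
print(acc, kappa)
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than the raw accuracy for the same classification.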

Analysis of Research Trends Related to Drug Repositioning Based on Machine Learning (머신러닝 기반의 신약 재창출 관련 연구 동향 분석)

  • So Yeon Yoo;Gyoo Gun Lim
    • Information Systems Review / v.24 no.1 / pp.21-37 / 2022
  • Drug repositioning, one approach to developing new drugs, is a useful way to discover new indications by allowing drugs that have already been approved for use in people to be used for other purposes. Recently, with the development of machine learning technology, vast amounts of biological information are increasingly being analyzed and used to develop new drugs. Applying machine learning to drug repositioning will help find effective treatments quickly. Currently, the world is having a difficult time due to a new disease caused by a coronavirus (COVID-19), a severe acute respiratory syndrome. Drug repositioning, which repurposes drugs that have already been clinically approved, could be an alternative source of therapeutics for COVID-19 patients. This study examines research trends in the field of drug repositioning using machine learning techniques. A total of 4,821 papers were collected from PubMed with the keyword 'Drug Repositioning' using web scraping. After data preprocessing, frequency analysis, LDA-based topic modeling, random forest classification analysis, and prediction performance evaluation were performed on 4,419 papers. Associated words were analyzed based on the Word2vec model; after dimensionality reduction with PCA, K-means clustering was applied to generate labels, and the structure of the literature was then visualized using the t-SNE algorithm. Hierarchical clustering was applied to the LDA results and visualized as a heat map. This study identified the research topics related to drug repositioning and presented a method to derive and visualize meaningful topics from a large body of literature using machine learning algorithms. The results are expected to serve as basic data for establishing research and development strategies in the field of drug repositioning.

Deep Learning-based UWB Distance Measurement for Wireless Power Transfer of Autonomous Vehicles in Indoor Environment (실내환경에서의 자율주행차 무선 전력 전송을 위한 딥러닝 기반 UWB 거리 측정)

  • Hye-Jung Kim;Yong-ju Park;Seung-Jae Han
    • KIPS Transactions on Computer and Communication Systems / v.13 no.1 / pp.21-30 / 2024
  • As the self-driving car market continues to grow, the need for charging infrastructure is growing. However, in the case of a wireless charging system, stability issues are being raised because it requires a large amount of power compared with conventional wired charging. SAE J2954 is a standard for building autonomous vehicle wireless charging infrastructure, and the standard defines a communication method between a vehicle and a power transmission system. SAE J2954 recommends using physical media such as Wi-Fi, Bluetooth, and UWB as a wireless charging communication method for autonomous vehicles to enable communication between the vehicle and the charging pad. In particular, UWB is a suitable solution for indoor and outdoor charging environments because it exhibits robust communication capabilities in indoor environments and is not sensitive to interference. In this standard, the process for building a wireless power transmission system is divided into several stages from the start to the completion of charging. In this study, UWB technology is used as a means of fine alignment, a process in the wireless power transmission system. To determine the applicability to an actual autonomous vehicle wireless power transmission system, experiments were conducted based on distance, and the distance information was collected from UWB. To improve the accuracy of the distance data obtained from UWB, we propose a Single Model and Multi Model that apply machine learning and deep learning techniques to the collected data through a three-step preprocessing process.

A Study on Analysis of Research Trends and Intellectual Structure in the Overseas Cataloging Research (해외 목록학 연구동향 및 지적구조 분석)

  • Ji Won Lee;Sung Sook Lee
    • Journal of the Korean Society for Information Management / v.41 no.1 / pp.367-387 / 2024
  • This study aims to identify the recent trends and intellectual structure of international research in the field of cataloging, which is undergoing a major change due to the enactment of new standards and rules and the anticipated future. For this purpose, we collected 680 articles published in the 14 years since 2010 and, after preprocessing, analyzed 1,942 author keywords extracted from them. The main findings of the analysis are as follows. First, overseas cataloging research has seen notable growth since 2017. Second, the most frequent research topics were cataloging, metadata, RDA, university libraries, authority control, linked data, FRBR, catalog, LCSH, libraries, and online cataloging. Third, the research themes were divided into two clusters, one related to the traditional aspects of library cataloging and the other related to the more recently discussed topics of authority control, cooperative cataloging, RDA, and linked data; these were further subdivided into 14 subclusters. Fourth, we examined the growth index and standard performance index of the 14 keyword clusters and found that all but one cluster showed growth. This study is significant in that it can be used as a basis for predicting the future development of cataloging in Korean academia and practice and for related education.

What Concerns Does ChatGPT Raise for Us?: An Analysis Centered on CTM (Correlated Topic Modeling) of YouTube Video News Comments (ChatGPT는 우리에게 어떤 우려를 초래하는가?: 유튜브 영상 뉴스 댓글의 CTM(Correlated Topic Modeling) 분석을 중심으로)

  • Song, Minho;Lee, Soobum
    • Informatization Policy / v.31 no.1 / pp.3-31 / 2024
  • This study aimed to examine public concerns in South Korea, considering the country's unique context, triggered by the advent of generative artificial intelligence such as ChatGPT. To achieve this, comments on 102 YouTube news videos related to ethical issues were collected using a Python scraper, and morphological analysis and preprocessing were carried out using Textom on 15,735 comments. These comments were then analyzed using a Correlated Topic Model (CTM). The analysis identified six primary topics within the comments: "Legal and Ethical Considerations"; "Intellectual Property and Technology"; "Technological Advancement and the Future of Humanity"; "Potential of AI in Information Processing"; "Emotional Intelligence and Ethical Regulations in AI"; and "Human Imitation." Structuring these topics based on a correlation coefficient value of over 10% revealed three main categories: "Legal and Ethical Considerations"; "Issues Related to Data Generation by ChatGPT" (Intellectual Property and Technology; Potential of AI in Information Processing; Human Imitation); and "Fear for the Future of Humanity" (Technological Advancement and the Future of Humanity; Emotional Intelligence and Ethical Regulations in AI). The study confirmed the coexistence of various concerns along with the growing interest in generative AI like ChatGPT, including worries specific to the historical and social context of South Korea. These findings suggest the need for national-level efforts to ensure data fairness.

A Study on Developing a Web Care Model for Audiobook Platforms Using Machine Learning (머신러닝을 이용한 오디오북 플랫폼 기반의 웹케어 모형 구축에 관한 연구)

  • Dahoon Jeong;Minhyuk Lee;Taewon Lee
    • Information Systems Review / v.26 no.1 / pp.337-353 / 2024
  • The purpose of this study is to investigate the relationship between consumer reviews and managerial responses, aiming to explore the necessity of webcare for efficiently managing consumer reviews. We intend to propose a methodology for effective webcare and to construct a webcare model using machine learning techniques based on an audiobook platform. In this study, we selected four audiobook platforms and conducted data collection and preprocessing for consumer reviews and managerial responses. We utilized techniques such as topic modeling, topic inconsistency analysis, and DBSCAN, along with various machine learning methods for analysis. The experimental results yielded significant findings in clustering managerial responses and predicting responses to consumer reviews, proposing an efficient methodology considering resource constraints and costs. This research provides academic insights by constructing a webcare model through machine learning techniques and practical implications by suggesting an efficient methodology, considering the limited resources and personnel of companies. The proposed webcare model in this study can be utilized as strategic foundational data for consumer engagement and providing useful information, offering both personalized responses and standardized managerial responses.

A Study on People Counting in Public Metro Service using Hybrid CNN-LSTM Algorithm (Hybrid CNN-LSTM 알고리즘을 활용한 도시철도 내 피플 카운팅 연구)

  • Choi, Ji-Hye;Kim, Min-Seung;Lee, Chan-Ho;Choi, Jung-Hwan;Lee, Jeong-Hee;Sung, Tae-Eung
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.131-145 / 2020
  • In line with the trend of industrial innovation, IoT technology, which is used in a variety of fields, is emerging as a key element in the creation of new business models and the provision of user-friendly services when combined with big data. Data accumulated from Internet-of-Things (IoT) devices are being used in many ways to build convenience-oriented smart systems, because they make customized intelligent systems possible through analysis of user environments and patterns. Recently, IoT has been applied to innovation in the public domain, for smart cities and smart transportation, for example to address traffic and crime problems using CCTV. In particular, when planning underground services or establishing a passenger-flow control information system to improve the convenience of citizens and commuters in congested public transportation such as subways and urban railways, it is necessary to consider comprehensively both the ease of securing real-time service data and the stability of security. However, previous studies that use image data are limited by privacy issues and by degraded object detection performance under abnormal conditions. The IoT device-based sensor data used in this study are free from privacy issues because they do not require the identification of individuals, and they can be used effectively to build intelligent public services for large numbers of unspecified people. We use IoT-based infrared sensor devices for an intelligent pedestrian tracking system in the metro service, which many people use daily; the temperature data measured by the sensors are transmitted in real time. To collect real-time sensor data, the experimental environment was set up at the equally spaced midpoints of a 4×4 grid on the ceiling of subway entrances with high passenger traffic, and the temperature change caused by objects entering and leaving the detection spots was measured. In preprocessing, a reference value is set for each of the 16 areas, and the difference between each area's temperature and its reference value is calculated per unit of time. This approach accentuates movement within the detection area. In addition, the values were scaled by a factor of 10 so that temperature differences between areas are reflected more sensitively. For example, if the temperature collected from the sensor at a given time was 28.5℃, the value was changed to 285 for analysis. The data collected from the sensors thus have the characteristics of both time series data and image data with 4×4 resolution. Reflecting these characteristics of the measured, preprocessed data, we propose a hybrid algorithm, CNN-LSTM (Convolutional Neural Network-Long Short Term Memory), that combines a CNN, which performs well in image classification, with an LSTM, which is especially suitable for analyzing time series data. In the study, the CNN-LSTM algorithm is used to predict the number of people passing through one of the 4×4 detection areas. We validated the proposed model by comparing its performance with that of other artificial intelligence algorithms: Multi-Layer Perceptron (MLP), Long Short Term Memory (LSTM), and RNN-LSTM (Recurrent Neural Network-Long Short Term Memory). In the experiment, the proposed CNN-LSTM hybrid model showed the best predictive performance among them. By using the proposed devices and models, various metro services, such as real-time monitoring of public transport facilities and congestion-based emergency response services, are expected to be provided without legal issues concerning personal information. However, the data were collected from only one side of the entrances, and data collected over a short period were used for prediction. A limitation is therefore that applicability in other environments remains to be verified. In the future, the proposed model is expected to become more reliable if sufficient experimental data are collected in various environments or if the training data are extended with measurements from other sensors.
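
The preprocessing described in this abstract (per-area reference values, differencing, and scaling by 10) can be sketched as follows. The abstract does not state the order of the scaling and differencing steps or how the reference values are chosen, so the order used here and all sample values are assumptions.

```python
SCALE = 10  # the abstract's example maps 28.5 °C to 285


def preprocess_frame(frame, reference):
    """Scale a 4x4 temperature frame by 10 and subtract the scaled per-area reference.

    frame, reference: 4x4 nested lists of temperatures in °C.
    Returns a 4x4 nested list of integer differences (assumed order: scale, then subtract).
    """
    return [
        [round(frame[r][c] * SCALE) - round(reference[r][c] * SCALE) for c in range(4)]
        for r in range(4)
    ]


# Hypothetical reference values: an empty entrance at a uniform 28.5 °C.
reference = [[28.5] * 4 for _ in range(4)]

# Hypothetical reading: one cell warms to 30.0 °C as a person passes under it.
frame = [row[:] for row in reference]
frame[1][2] = 30.0

diff = preprocess_frame(frame, reference)
for row in diff:
    print(row)
```

A sequence of such 4×4 difference frames is what gives the data both an image-like and a time-series character, which motivates the CNN-LSTM combination.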

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.163-177 / 2019
  • As smartphones have become widely used, human activity recognition (HAR) tasks for recognizing the personal activities of smartphone users with multimodal data have been actively studied. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. In contrast, physical sensors, including accelerometer, magnetic field, and gyroscope sensors, are less vulnerable to privacy issues and can collect a large amount of data within a short time. This paper proposes a deep learning-based method for detecting accompanying status using only multimodal physical sensor data from the accelerometer, magnetic field sensor, and gyroscope. The accompanying status was defined as a subset of user interaction behavior, covering whether the user is accompanied by an acquaintance at close distance and whether the user is actively communicating with that acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method was introduced that consists of time synchronization of multimodal data from different physical sensors, data normalization, and sequence data generation. We applied nearest-neighbor interpolation to synchronize the timestamps of data collected from different sensors. Normalization was performed for each x, y, and z axis value of the sensor data, and the sequence data were generated using the sliding window method. The sequence data then became the input to the CNN, which extracts feature maps representing local dependencies of the original sequence. The CNN consisted of 3 convolutional layers and had no pooling layer, in order to maintain the temporal information of the sequence data. Next, the LSTM recurrent networks received the feature maps, learned long-term dependencies from them, and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were classified by a softmax classifier. The loss function of the model was the cross-entropy function, and the weights of the model were randomly initialized from a normal distribution with a mean of 0 and a standard deviation of 0.1. The model was trained using the adaptive moment estimation (ADAM) optimization algorithm with a mini-batch size of 128. We applied dropout to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decayed exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data. We collected smartphone data from a total of 18 subjects. Using the data, the model classified accompanying and conversation with accuracies of 98.74% and 98.83%, respectively. Both the F1 score and the accuracy of the model were higher than those of the majority vote classifier, support vector machine, and deep recurrent neural network. In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, we will further study transfer learning methods that enable models tailored to the training data to be transferred to evaluation data that follow a different distribution. We expect to obtain a model that exhibits robust recognition performance against changes in data that were not considered in the model learning stage.
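
Two of the preprocessing steps named in this abstract, nearest-neighbor time synchronization and sliding-window sequence generation, can be sketched in a few lines. The timestamps, values, window length, and stride below are hypothetical; the abstract does not give the actual settings.

```python
from bisect import bisect_left


def nearest_sample(timestamps, values, t):
    """Return the value whose timestamp is closest to t (timestamps sorted ascending)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    # On a tie, the earlier sample is kept.
    return values[i] if after - t < t - before else values[i - 1]


def sliding_windows(samples, window, stride):
    """Cut a sequence into overlapping fixed-length windows."""
    return [samples[s:s + window] for s in range(0, len(samples) - window + 1, stride)]


# Hypothetical gyroscope stream (timestamps in ms) resampled onto accelerometer timestamps.
gyro_t = [0, 10, 20, 30]
gyro_v = [1.0, 2.0, 3.0, 4.0]
accel_t = [1, 9, 22, 29]

synced = [nearest_sample(gyro_t, gyro_v, t) for t in accel_t]
windows = sliding_windows(synced, window=3, stride=1)
print(synced)
print(windows)
```

Each window would then be normalized per axis and passed to the CNN; the normalization scheme is not specified in the abstract, so it is left out of this sketch.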