• Title/Summary/Keyword: data preprocessing

Localized reliability analysis on a large-span rigid frame bridge based on monitored strains from the long-term SHM system

  • Liu, Zejia;Li, Yinghua;Tang, Liqun;Liu, Yiping;Jiang, Zhenyu;Fang, Daining
    • Smart Structures and Systems / v.14 no.2 / pp.209-224 / 2014
  • As more long-term structural health monitoring (SHM) systems are built, applying monitored data to assess the reliability of bridges has come under consideration. In this paper, based on a long-term SHM system whose sensors were embedded from the beginning of the bridge's construction, a method to calculate the localized reliability around an embedded sensor is proposed and implemented. In the reliability analysis, the probability distribution of loading can be taken from the statistics of stress converted from the monitored strain, which directly covers the effects of both live and dead loads; the mean value and deviation of the loads are therefore fully derived from the monitored data. The probability distribution of resistance can correspondingly be taken from the statistics of the strength of the bridge material. With five years of monitored strains, the localized reliabilities around the monitoring sensors of a bridge were computed by the method. Further, the monitored stresses are divided into two time segments within each one-year period, according to the local climate conditions, to compute the loading probability distribution, which helps to assess the reliability in different time segments and its evolution trends. The results show that the reliabilities and their evolution trends differ between parts of the bridge, although all parts remain reliable. The proposed method is feasible for assessing the localized reliabilities revealed by monitored data from a long-term SHM system of a bridge, which would help bridge engineers and managers decide on a bridge inspection or maintenance strategy.
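
The load-versus-resistance comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that stress follows from strain through Hooke's law, that load and resistance are independent and normally distributed, and that the reliability index takes the common first-order form beta = (mu_R - mu_S) / sqrt(sigma_R^2 + sigma_S^2). The function name, parameter names, and sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def reliability_index(strains, elastic_modulus, mu_r, sigma_r):
    """Return (beta, failure probability) around one sensor.

    Assumptions (not taken from the paper): linear-elastic stress,
    independent normal load S and resistance R.
    """
    # Convert monitored strains to stresses (Hooke's law, assumed).
    stresses = [elastic_modulus * e for e in strains]
    # Load statistics come entirely from the monitored data.
    mu_s = mean(stresses)
    sigma_s = stdev(stresses)
    # First-order reliability index for normal R and S.
    beta = (mu_r - mu_s) / math.sqrt(sigma_r ** 2 + sigma_s ** 2)
    # Failure probability P(R - S < 0) under the same assumptions.
    p_fail = NormalDist().cdf(-beta)
    return beta, p_fail


if __name__ == "__main__":
    # Hypothetical values: strains (dimensionless), modulus and strength in MPa.
    beta, p_fail = reliability_index(
        [100e-6, 120e-6, 140e-6], 30e3, mu_r=20.0, sigma_r=2.0
    )
    print(beta, p_fail)
```

Splitting the strain record into seasonal segments, as the abstract describes, would amount to calling the function once per segment and comparing the resulting indices.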

A Path Travel Time Estimation Study on Expressways using TCS Link Travel Times (TCS 링크통행시간을 이용한 고속도로 경로통행시간 추정)

  • Lee, Hyeon-Seok;Jeon, Gyeong-Su
    • Journal of Korean Society of Transportation / v.27 no.5 / pp.209-221 / 2009
  • Travel time estimation under given traffic conditions is important for providing drivers with travel time prediction information. However, the present expressway travel time estimation process cannot calculate a reliable travel time. The objective of this study is to estimate the path travel time spent in a through lane between origin tollgates and destination tollgates on an expressway, as a prerequisite for offering reliable prediction information. Useful and abundant toll collection system (TCS) data were used. The path travel time is estimated by combining the link travel times obtained through a preprocessing process. When TCS data are lacking, the TCS travel time for previous intervals is referenced using the linear interpolation method after analyzing the increase pattern of the travel time. When the TCS data are absent over a long-term period, the dynamic travel time is estimated using the VDS time-space diagram. The travel time estimated by the proposed model can be validated statistically by comparing it with the travel time obtained from vehicles traveling the path directly. The results show that the proposed model can be used to estimate a reliable travel time for a long-distance path in which there is a variety of travel times for the same departure time, the intervals are large, and the representative travel time changes irregularly over a short period.
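
The gap-filling step mentioned in this abstract can be sketched as plain linear interpolation over a series of link travel times. The abstract does not give the exact procedure, so this is only an illustration. It assumes missing intervals are marked with `None`, and it fills gaps at either end of the series with the nearest observed value, which is an added assumption. The function name `fill_gaps` is hypothetical.

```python
def fill_gaps(times):
    """Fill None entries by linear interpolation between observed neighbours.

    Leading and trailing gaps take the nearest observed value
    (an assumption; the abstract does not cover this case).
    """
    filled = list(times)
    known = [i for i, t in enumerate(filled) if t is not None]
    if not known:
        raise ValueError("at least one observed travel time is required")
    # Edge gaps: carry the nearest observation outward.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    # Interior gaps: interpolate between the two surrounding observations.
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            weight = (i - a) / (b - a)
            filled[i] = filled[a] + weight * (filled[b] - filled[a])
    return filled


if __name__ == "__main__":
    # Hypothetical link travel times in minutes, one per time interval.
    print(fill_gaps([None, 10.0, None, 14.0, None]))
```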

Multi-modal Image Processing for Improving Recognition Accuracy of Text Data in Images (이미지 내의 텍스트 데이터 인식 정확도 향상을 위한 멀티 모달 이미지 처리 프로세스)

  • Park, Jungeun;Joo, Gyeongdon;Kim, Chulyun
    • Database Research / v.34 no.3 / pp.148-158 / 2018
  • Optical character recognition (OCR) is a technique to extract and recognize text from images. It is an important preprocessing step in data analysis, since most actual text information is embedded in images. Many OCR engines have high recognition accuracy for images where the text is clearly separable from the background, such as black lettering on a white background. However, they have low recognition accuracy for images where the text is not easily separable from a complex background. To improve this low accuracy on complex images, it is necessary to transform the input image to make the text more noticeable. In this paper, we propose a method that segments an input image into text lines so that OCR engines can recognize each line more efficiently, and that determines the final output by comparing the recognition rates of a CLAHE module and a Two-step module, which distinguish text from background regions based on image processing techniques. Through thorough experiments comparing with the well-known OCR engines Tesseract and Abbyy, we show that our proposed method has the best recognition accuracy on images with complex backgrounds.

Analysis of the Status of Natural Language Processing Technology Based on Deep Learning (딥러닝 중심의 자연어 처리 기술 현황 분석)

  • Park, Sang-Un
    • The Journal of Bigdata / v.6 no.1 / pp.63-81 / 2021
  • The performance of natural language processing is rapidly improving due to the recent development and application of machine learning and deep learning technologies, and as a result, its field of application is expanding. In particular, as the demand for analysis of unstructured text data increases, interest in natural language processing (NLP) is also increasing. However, due to the complexity and difficulty of natural language preprocessing and of machine learning and deep learning theories, there are still high barriers to the use of NLP. In this paper, to give an overall understanding of NLP, we examine the main fields of NLP that are currently being actively researched and the current state of the major technologies centered on machine learning and deep learning, with the aim of providing a foundation for understanding and using NLP more easily. To this end, we investigated the change of NLP within artificial intelligence (AI) through the changes in the taxonomy of AI technology. The main areas of NLP, which consist of language models, text classification, text generation, document summarization, question answering, and machine translation, are explained with state-of-the-art deep learning models. In addition, the major deep learning models used in NLP are explained, and data sets and evaluation measures for performance evaluation are summarized. We hope that researchers who want to use NLP for various purposes in their fields will be able to understand the overall technical status and the main technologies of NLP through this paper.

Research of Water-related Disaster Monitoring Using Satellite Bigdata Based on Google Earth Engine Cloud Computing Platform (구글어스엔진 클라우드 컴퓨팅 플랫폼 기반 위성 빅데이터를 활용한 수재해 모니터링 연구)

  • Park, Jongsoo;Kang, Ki-mook
    • Korean Journal of Remote Sensing / v.38 no.6_3 / pp.1761-1775 / 2022
  • Due to unpredictable climate change, the frequency of water-related disasters and the scale of damage are continuously increasing. In terms of disaster management, it is essential to identify damaged areas over a wide region and to monitor them for mid-term and long-term forecasting. In the field of water disasters, research on remote sensing technology using Synthetic Aperture Radar (SAR) satellite images for wide-area monitoring is being actively conducted. Time-series analysis for monitoring requires a complex preprocessing process that collects a large number of images and accounts for the noisy characteristics of radar, and this takes a considerable amount of time. With the recent development of cloud computing technology, many platforms capable of performing spatiotemporal analysis using satellite big data have been proposed. Google Earth Engine (GEE) is a representative platform that provides about 600 satellite datasets for free and enables semi-real-time spatiotemporal analysis based on analysis-ready satellite image data. Therefore, in this study, immediate water disaster damage detection and mid- to long-term time-series observation were conducted using GEE. Through the Otsu technique, which is mainly used for change detection, changes in river width and flood area due to river flooding were confirmed, focusing on the torrential rains that occurred in 2020. In addition, in terms of disaster management, the trend of change in the time-series waterbody from 2018 to 2022 was confirmed. The short processing time of JavaScript-based coding and the strengths in spatiotemporal analysis and presentation of results are expected to enable use in the field of water disasters. In addition, the field of application is expected to expand through connection with various satellite big data in the future.
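
The Otsu technique named in this abstract picks the threshold that maximizes the between-class variance of a histogram. The sketch below is a plain-Python illustration of that idea, not the study's Google Earth Engine code. It assumes the backscatter values have already been scaled to integers from 0 to 255, and that water is the darker class, as is typical for calm water in SAR imagery. The function name and sample data are hypothetical.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Values <= t form the dark class; values > t form the bright class.
    Input is assumed to be integers in the range [0, levels).
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of values in the dark class so far
    sum0 = 0  # sum of values in the dark class so far
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    # Hypothetical scaled backscatter: dark "water" and bright "land" pixels.
    pixels = [10, 12, 14] * 20 + [200, 205, 210] * 20
    t = otsu_threshold(pixels)
    water_mask = [p <= t for p in pixels]
    print(t, sum(water_mask))
```

Comparing the water masks of two acquisition dates pixel by pixel would then give a simple estimate of the change in flooded area.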

Analyzing the Phenomena of Hate in Korea by Text Mining Techniques (텍스트마이닝 기법을 이용한 한국 사회의 혐오 양상 분석)

  • Hea-Jin, Kim
    • Journal of the Korean Society for Library and Information Science / v.56 no.4 / pp.431-453 / 2022
  • Hate is a collective expression of exclusivity toward others, and it is fostered and reproduced through false public perception. This study aims to explore the objects and issues of hate discussed in our society using text mining techniques. To this end, we collected 17,867 news articles published from 1990 to 2020 and conducted co-word network and cluster analyses. To derive an explicit co-word network highly related to hate, in the preprocessing phase we split the text into sentences and extracted a total of 52,520 sentences containing the words 'hate', 'prejudice' and 'discrimination'. Analysis of word frequency in the collected news data showed that the subjects appearing most frequently in relation to hate in our society were women, race, and sexual minorities, and that the related issues were laws and crimes. Cluster analysis based on the co-word network yielded a total of six hate-related clusters. The largest cluster was 'genderphobic', accounting for 41.4% of the total, followed by 'sexual minority hatred' at 28.7%, 'racial hatred' at 15.1%, 'selective hatred' at 8.5%, 'political hatred' at 5.7%, and 'environmental hatred' at 0.3%. In the discussion, we comprehensively extracted from the collected news data all specific names of hate targets, which the cluster analysis had not specifically revealed.
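
The preprocessing and co-word counting described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline. It uses English text and a simple regular-expression tokenizer, whereas the study analyzed Korean news. It keeps a sentence if it contains any one of the three keywords, which is one reading of the abstract. All names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Keywords quoted in the abstract; matching on ANY of them is an assumption.
KEYWORDS = {"hate", "prejudice", "discrimination"}


def split_sentences(text):
    """Split text on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def coword_counts(documents, keywords=KEYWORDS):
    """Count word pairs that co-occur in keyword-bearing sentences."""
    pairs = Counter()
    for doc in documents:
        for sentence in split_sentences(doc):
            tokens = set(re.findall(r"[a-z]+", sentence.lower()))
            if not tokens & keywords:
                continue  # sentence filtered out in preprocessing
            # Each unordered pair of distinct words is one co-occurrence.
            for a, b in combinations(sorted(tokens), 2):
                pairs[(a, b)] += 1
    return pairs


if __name__ == "__main__":
    docs = ["Hate crime laws changed. The weather was fine."]
    counts = coword_counts(docs)
    print(counts[("crime", "hate")], counts[("fine", "weather")])
```

The resulting pair counts are the edge weights of a co-word network, which a clustering step could then partition.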

Implementation of CNN-based Classification Training Model for Unstructured Fashion Image Retrieval using Preprocessing with MASK R-CNN (비정형 패션 이미지 검색을 위한 MASK R-CNN 선형처리 기반 CNN 분류 학습모델 구현)

  • Seunga, Cho;Hayoung, Lee;Hyelim, Jang;Kyuri, Kim;Hyeon-Ji, Lee;Bong-Ki, Son;Jaeho, Lee
    • Journal of Korea Society of Industrial Information Systems / v.27 no.6 / pp.13-23 / 2022
  • In this paper, we propose an algorithm that classifies images of the detailed components of each fashion item, for unstructured data retrieval in the fashion field. Due to the COVID-19 environment, AI-based online shopping malls have been increasing recently. However, existing keyword search and personalized style recommendations based on user surfing behavior are limited in how accurately they can search unstructured data. In this study, images crawled from online shopping sites were preprocessed using Mask R-CNN, and the components of each fashion item were then classified with a CNN. We obtained accuracies of 93.28% for the collar of the shirt, 98.10% for the pattern of the shirt, and 91.73% for the 3-class fit of the jeans. We further obtained 81.59% for the 4-class fit of the jeans and 93.91% for the color of the jeans. For the decorated items, we also obtained accuracies of 91.20% for the washing of the jeans and 92.96% for the damage of the jeans.

Analysis of Deep Learning Research Trends Applied to Remote Sensing through Paper Review of Korean Domestic Journals (국내학회지 논문 리뷰를 통한 원격탐사 분야 딥러닝 연구 동향 분석)

  • Lee, Changhui;Yun, Yerin;Bae, Saejung;Eo, Yang Dam;Kim, Changjae;Shin, Sangho;Park, Soyoung;Han, Youkyung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.39 no.6 / pp.437-456 / 2021
  • In the field of remote sensing in Korea, starting in 2017, deep learning has begun to show efficient research results compared to existing research methods. Currently, research is being conducted to apply deep learning in almost all fields of remote sensing, from image preprocessing to applications. To analyze the research trend of deep learning applied to the remote sensing field, Korean domestic journal papers, published until October 2021, related to deep learning applied to the remote sensing field were collected. Based on the collected 60 papers, research trend analysis was performed while focusing on deep learning network purpose, remote sensing application field, and remote sensing image acquisition platform. In addition, open source data that can be effectively used to build training data for performing deep learning were summarized in the paper. Through this study, we presented the problems that need to be solved in order for deep learning to be established in the remote sensing field. Moreover, we intended to provide help in finding research directions for researchers to apply deep learning technology into the remote sensing field in the future.

Ground Subsidence Risk Grade Prediction Model Based on Machine Learning According to the Underground Facility Properties and Density (기계학습 기반 지하매설물 속성 및 밀집도를 활용한 지반함몰 위험도 예측 모델)

  • Sungyeol Lee;Jaemo Kang;Jinyoung Kim
    • Journal of the Korean GEO-environmental Society / v.24 no.4 / pp.23-29 / 2023
  • Ground subsidence is a mechanism in which damage to water supply or sewer pipes forms a waterway, soil particles move through the ground, a cavity forms, and the ground above it collapses. Therefore, ground subsidence occurs frequently, mainly in downtown areas where a large number of underground facilities are buried. Accordingly, research to predict the risk of ground subsidence is continuously being conducted. This study attempts to present a ground subsidence risk prediction model for two districts of ○○ city. A data set was constructed and preprocessed using the property data of underground facilities in the target area (year of service, pipe diameter), the density of underground facilities, and ground subsidence history data. By applying the data set to the machine learning model, the reliability of the selected model was evaluated, and the importance of the influencing factors used in predicting the ground subsidence risk, as derived from the model, is presented.

Development of a Real-time Ship Operational Efficiency Analysis Model (선박운항데이터 기반 실시간 선박운항효율 분석 모델 개발)

  • Taemin Hwang;Hyoseon Hwang;Ik-Hyun Youn
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.1 / pp.60-66 / 2023
  • Currently, the maritime industry is focusing on developing technologies that promote autonomy and intelligence, such as smart ships, autonomous ships, and eco-friendly technologies, to enhance ship operational efficiency. Many countries are conducting research on different methods to ensure ship safety while increasing operational efficiency. This study aims to develop a real-time ship operational efficiency analysis model using data analysis methods, to address the limitations of present technologies in the real-time evaluation of operational efficiency. The model selects ship operational efficiency factors and ship operational condition factors, compares the ship's present operational efficiency against the classified factors, and determines whether the present operational efficiency is appropriate. The study involved selecting a target ship, collecting data, preprocessing data, and developing classification models. The results were obtained by determining the improved ship operational efficiency based on the ship operational condition factors, in order to support ship operators.