• Title/Summary/Keyword: data pre-processing

On the Analysis of Natural Language Processing Morphology for the Specialized Corpus in the Railway Domain

  • Won, Jong Un;Jeon, Hong Kyu;Kim, Min Joong;Kim, Beak Hyun;Kim, Young Min
    • International Journal of Internet, Broadcasting and Communication / v.14 no.4 / pp.189-197 / 2022
  • Today we are exposed to a wide range of text-based media, such as newspapers, Internet articles, and social networking services (SNS), and the amount of text data we encounter has increased exponentially now that smartphones and other mobile devices have made Internet access widely available. Extracting useful information from large volumes of text is called text analysis, and it is increasingly carried out with natural language processing (NLP) techniques, which have advanced alongside recent developments in artificial intelligence. For this purpose, morphological analyzers based on everyday language have been released and are in use. Pre-trained language models, which acquire natural language knowledge through unsupervised learning on large corpora, have recently become a standard component of natural language processing, but conventional morphological analyzers are of limited use in specialized fields. In this paper, as preliminary work toward a natural language analysis model specialized for the railway domain, we present a procedure for constructing a railway-domain corpus.

Forming Limit Diagram of an Aluminum Tube Through Hydroforming Tests (액압성형 시험을 통한 알루미늄 튜브 재료의 성형한계도)

  • Kim J. S.;Lee J. K.;Park J. Y.;Lee D. J.;Kim H. Y.;Kim H. J.
    • Transactions of Materials Processing / v.14 no.6 s.78 / pp.514-519 / 2005
  • A tube hydroformability testing system was designed and fabricated that can apply forming conditions along an arbitrary pre-programmed path of internal pressure and axial feed. Free-bulging and T-forming tests were carried out on extruded aluminum (A6063) tube specimens with an outer diameter of 40.6 mm and a thickness of 2.25 mm. Nine different combinations of internal pressure and axial feed, each yielding a different strain path, were considered in order to induce bursting under various deformation modes. Major and minor strains were measured automatically from the deformed grids around the fracture using a stereo-vision-based surface strain measurement system named ASIAS. The forming limit diagram (FLD) of the A6063 tube material was successfully obtained. Most of the data points acquired from the free-bulging and T-forming tests fell in the negative minor strain region of the FLD and lay near the strain paths calculated from explicit finite element simulations. The forming limit obtained from tests after pre-tension was considerably lower than that from tests without pre-tension, which demonstrates the strain-path dependency of the forming limit that is well known in the sheet forming field.

PRE-PROCESSING OF GALAXIES IN THE FILAMENTS AROUND THE VIRGO CLUSTER

  • YOON, HYEIN;CHUNG, AEREE;SENGUPTA, CHANDREYEE;WONG, O. IVY;BUREAU, MARTIN;REY, SOO-CHANG;VAN GORKOM, J.H.
    • Publications of The Korean Astronomical Society / v.30 no.2 / pp.495-497 / 2015
  • Galaxies can be "pre-processed" in the low-density outskirts of a cluster, either by the ambient medium in the filaments or by tidal interactions with other galaxies, while they fall into the cluster. To probe how early and by which mechanisms galaxies can be affected before they enter high-density cluster environments, we are carrying out an atomic hydrogen ($\mathrm{H\,I}$) imaging study of a sample of galaxies selected from three filamentary structures around the Virgo cluster. Our sample consists of 14 late-type galaxies that are potentially interacting with their surroundings. The $\mathrm{H\,I}$ observations were made using the Westerbork Synthesis Radio Telescope, the Giant Metrewave Radio Telescope (GMRT), and the Jansky Very Large Array, with a column density sensitivity of $\approx 3$-$5 \times 10^{19}\,\mathrm{cm}^{-2}$ at $3\sigma$ per channel, which is low enough to detect faint $\mathrm{H\,I}$ features in the outer disks of galaxies. In this work, we present the $\mathrm{H\,I}$ data of two galaxies that were observed with the GMRT. We examine the $\mathrm{H\,I}$ morphology and kinematics for evidence of gas-gas and/or tidal interactions, and discuss which mechanism(s) could be responsible for pre-processing in these cases.

Comparison of CNN Structures for Detection of Surface Defects (표면 결함 검출을 위한 CNN 구조의 비교)

  • Choi, Hakyoung;Seo, Kisung
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.7 / pp.1100-1104 / 2017
  • A detector-based approach shows limited performance in inspecting defects such as shallow fine cracks and defects that are hard to distinguish from the background. Deep learning is widely used for object recognition, and its application to defect detection has gradually been attempted. Deep learning requires a large amount of training data, but data acquisition can be limited in some industrial applications. This paper investigates the feasibility of applying a CNN, one of the deep learning approaches, to surface defect inspection of industrial parts for which detection is difficult and training data are insufficient. VOV is adopted for pre-processing and to obtain a reasonable number of ROIs for data augmentation. A CNN is then applied for classification. Three CNN architectures, AlexNet, VGGNet, and a modified VGGNet, are compared in defect detection experiments.

Urban Growth Monitoring using Multi-temporal Satellite Images and Geographic Information

  • Lee, Kwang-Jae;Kim, Youn-Soo;Kim, Byung-Kyo
    • Proceedings of the KSRS Conference / 2003.11a / pp.470-472 / 2003
  • The primary goal of this paper is to analyze urban growth patterns using multi-temporal remote sensing images and geographic information data. To this end, data pre-processing is carried out first, and land-use maps are then generated from ancillary data sources by heads-up on-screen digitizing. Finally, using the results of the previous stages, the patterns of land-use and urban change are monitored with the proposed scheme. In this research, urban growth was monitored by applying the urban land-use changes derived from the multi-temporal images and geographic information data.

Using Machine Learning Algorithms for Housing Price Prediction: The Case of Islamabad Housing Data

  • Imran, Imran;Zaman, Umar;Waqar, Muhammad;Zaman, Atif
    • Soft Computing and Machine Intelligence / v.1 no.1 / pp.11-23 / 2021
  • House price prediction informs significant financial decisions for people working in the housing market as well as for potential buyers. Whether investing or buying a house to live in, a person entering the housing market is interested in the potential gain. This paper applies machine learning algorithms to develop intelligent regression models for house price prediction. The proposed research methodology consists of four stages: data collection; pre-processing the collected data and transforming it into the most suitable format; developing intelligent models using machine learning algorithms; and training, testing, and validating the models on house prices in the capital, Islamabad. The data used for model validation and testing are asking prices from online property stores, which provide a reasonable estimate of the city's housing market. The prediction model can significantly assist in predicting future housing prices in Pakistan. The regression results are encouraging and give promising directions for future prediction work on the collected dataset.

A Study on Building a Model for Safety Management of Small Buildings using Big Data (빅데이터를 활용한 소규모 건축물 안전관리 모델에 관한 연구)

  • Shin, Dongyoun
    • Journal of KIBIM / v.13 no.1 / pp.13-21 / 2023
  • The purpose of this study is to establish a system that manages building safety efficiently by finding correlations among safety-related elements of buildings and visualizing them intuitively. Data on small-scale buildings managed by public institutions and the government were collected, and an effective analysis and visualization environment was established through pre-processing. We selected safety-vulnerability factors, such as building structure and completion date, to find these relationships, and established a model that prioritizes management in order to identify vulnerable buildings.

A Hand Gesture Recognition Method using Inertial Sensor for Rapid Operation on Embedded Device

  • Lee, Sangyub;Lee, Jaekyu;Cho, Hyeonjoong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.2 / pp.757-770 / 2020
  • We propose a hand gesture recognition method that is compatible with a head-up display (HUD) having limited processing resources. For fast link adaptation with the HUD, it is necessary to process gesture recognition rapidly and to send the minimum amount of driver hand gesture data from the wearable device. We therefore use a method that recognizes each hand gesture with an inertial measurement unit (IMU) sensor based on revised correlation matching. Gesture recognition is performed by calculating the correlation between every axis of the acquired data set. By classifying pre-defined gesture values and actions, the proposed method enables rapid recognition. We also evaluate the performance of the algorithm, which can be implemented within wearable bands and requires a minimal processing load. The experimental results demonstrated the feasibility and effectiveness of our decomposed correlation matching method. Furthermore, we tested the proposed algorithm on a wearable platform device, using pre-defined gestures of specific motions, to confirm the effectiveness of the system. The experimental results validated the feasibility and effectiveness of the proposed hand gesture recognition system. Despite being based on a very simple concept, the proposed algorithm showed good recognition accuracy.
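
    As an illustration of the general idea of correlation-based gesture matching, the sketch below compares each axis of an IMU sample with stored gesture templates using the Pearson correlation coefficient and picks the template with the highest mean score. This is only a minimal sketch under stated assumptions: it is not the paper's revised or decomposed correlation matching, and the names `pearson`, `classify_gesture`, the axis keys, and the template values are all hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def classify_gesture(sample, templates):
    """Return (name, score) of the template whose axes correlate best with the sample.

    sample:    dict mapping axis name -> list of readings (hypothetical layout)
    templates: dict mapping gesture name -> dict of axis name -> list of readings
    """
    best_name, best_score = None, -2.0  # correlations lie in [-1, 1]
    for name, template in templates.items():
        # Mean per-axis correlation between the sample and this template.
        score = sum(pearson(sample[axis], template[axis]) for axis in template) / len(template)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical two-axis templates and a noisy sample resembling "up".
templates = {
    "up": {"x": [0, 1, 2, 3], "y": [3, 2, 1, 0]},
    "down": {"x": [3, 2, 1, 0], "y": [0, 1, 2, 3]},
}
sample = {"x": [0.1, 1.2, 1.9, 3.1], "y": [2.9, 2.1, 1.0, 0.2]}
name, score = classify_gesture(sample, templates)
```

    Because only sums and one square root per axis are needed, a matcher of this kind is cheap enough for a low-power wearable, which is the motivation the abstract gives for a correlation-based approach.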

Range-Doppler Clustering of Radar Data for Detecting Moving Objects (이동물체 탐지를 위한 레이다 데이터의 거리-도플러 클러스터링 기법)

  • Kim, Seongjoon;Yang, Dongwon;Jung, Younghun;Kim, Sujin;Yoon, Joohong
    • Journal of the Korea Institute of Military Science and Technology / v.17 no.6 / pp.810-820 / 2014
  • Recently, many studies have been reported on radar systems mounted on ground vehicles for autonomous driving, simultaneous localization and mapping (SLAM), and collision avoidance. In the near field, several hits per object are generated after signal processing of the radar data. Hence, clustering is an essential technique for estimating object shapes and positions precisely. This paper proposes a method that groups hits in the range-Doppler domain into clusters, each representing one object, according to pre-defined rules. The rules are based on perceptual cues for separating hits by object: the morphological connectedness between hits and the characteristics of the SNR distribution of the hits. In various simulations for performance assessment, the proposed method performed more effectively than other techniques.
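
    The connectedness cue mentioned in this abstract can be illustrated with a plain connected-component grouping of hits on a range-Doppler grid. The sketch below is an assumption-laden illustration, not the paper's rule set: it treats hits as integer (range bin, Doppler bin) cells, uses 8-connectivity, and ignores the SNR-distribution cue entirely. The function name `cluster_hits` and the sample hits are hypothetical.

```python
from collections import deque


def cluster_hits(hits):
    """Group (range_bin, doppler_bin) hits into 8-connected clusters.

    Returns a sorted list of clusters, each a sorted list of hit cells.
    """
    remaining = set(hits)
    clusters = []
    while remaining:
        seed = remaining.pop()
        queue = deque([seed])
        cluster = [seed]
        # Breadth-first search over neighbouring cells that also contain a hit.
        while queue:
            r, d = queue.popleft()
            for dr in (-1, 0, 1):
                for dd in (-1, 0, 1):
                    neighbour = (r + dr, d + dd)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        queue.append(neighbour)
                        cluster.append(neighbour)
        clusters.append(sorted(cluster))
    return sorted(clusters)


# Hypothetical hits: two well-separated objects.
hits = [(10, 5), (10, 6), (11, 6), (30, -2), (31, -3)]
clusters = cluster_hits(hits)
```

    A fuller implementation in the spirit of the abstract would add a second rule that splits or merges these groups according to how SNR is distributed across the hits.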

Development of Distributed Hydrological Analysis Tool for Future Climate Change Impacts Assessment of South Korea (전국 기후변화 영향평가를 위한 분포형 수문분석 툴 개발)

  • Kim, Seong Joon;Kim, Sang Ho;Joh, Hyung Kyung;Ahn, So Ra
    • Journal of The Korean Society of Agricultural Engineers / v.57 no.2 / pp.15-26 / 2015
  • The purpose of this paper is to develop a software tool, PGA-CC (Projection of hydrology via Grid-based Assessment for Climate Change), to evaluate the present hydrologic cycle and future watershed hydrology under climate change. PGA-CC is composed of a grid-based input data pre-processing module, a hydrologic cycle calculation module, an output analysis module, and an output data post-processing module. The grid-based hydrological model was coded in Fortran and compiled using Compaq Fortran 6.6c, and the graphical user interface was developed using Visual C#. Most other elements, viz. tables and graphs, as well as GIS functions, were implemented with MapWindow. The applicability of PGA-CC was tested by assessing the future hydrology of South Korea under the HadCM3 SRES B1 and A2 climate change scenarios. For the whole country, the tool successfully assessed the future hydrological components, including input data, evapotranspiration, soil moisture, surface runoff, lateral flow, and base flow. From the spatial outputs, we could understand the hydrological changes both seasonally and regionally.