• Title/Summary/Keyword: software-engineering

The Application Methods of FarmMap Reading in Agricultural Land Using Deep Learning (딥러닝을 이용한 농경지 팜맵 판독 적용 방안)

  • Wee Seong Seung;Jung Nam Su;Lee Won Suk;Shin Yong Tae
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.77-82 / 2023
  • The Ministry of Agriculture, Food and Rural Affairs has established the FarmMap, a digital map of agricultural land. In this study, we propose a way to apply deep learning to FarmMap reading of farmland categories such as paddy fields, fields, ginseng, fruit trees, facilities, and uncultivated land. The FarmMap digitizes real-world agricultural land from aerial and satellite images and is used as spatial information for planting status and drone operations. A reading manual is prepared and updated every year by demarcating the boundaries of agricultural land and reading its attributes. Human readings differ depending on the reader's ability and experience, and reading errors are difficult to verify in practice because of budget limitations. Since the FarmMap contains the location and class information of the corresponding objects in the image for five types of farmland attributes, an instance segmentation model based on ResNet50 was tested as a suitable AI technique, and its attribute readings were compared with those made by humans. If future work focuses on the attributes for which the two readings differ, it is expected to play a large role in reducing attribute errors and improving the accuracy of the digital map of agricultural land.
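
A minimal sketch of what an instance-segmentation reading step could look like, assuming torchvision's Mask R-CNN with a ResNet-50 backbone stands in for the paper's "ResNet50-based" model; the class list simply mirrors the categories named in the abstract, and the image path and score threshold are hypothetical.

```python
# Hedged sketch: instance segmentation over an aerial tile for farmland attributes.
# The model, class list, file name, and threshold are illustrative assumptions.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

CLASSES = ["background", "paddy", "field", "ginseng", "orchard", "facility", "uncultivated"]

# Mask R-CNN with a ResNet-50 backbone; in practice it would be fine-tuned on
# labeled FarmMap polygons before use. Only the inference path is shown here.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=len(CLASSES))
model.eval()

image = to_tensor(Image.open("aerial_tile.png").convert("RGB"))  # hypothetical tile
with torch.no_grad():
    prediction = model([image])[0]

for label, score in zip(prediction["labels"], prediction["scores"]):
    if score > 0.5:  # keep confident detections only
        print(CLASSES[label], float(score))
```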

Apartment Price Prediction Using Deep Learning and Machine Learning (딥러닝과 머신러닝을 이용한 아파트 실거래가 예측)

  • Hakhyun Kim;Hwankyu Yoo;Hayoung Oh
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.59-76 / 2023
  • Since the COVID-19 era, the rise in apartment prices has been unconventional. In this uncertain real estate market, price prediction research is very important. In this paper, a model is created to predict the actual transaction prices of future apartments after building a large dataset of 870,000 records from 2015 to 2020 by collecting and crawling various real estate sites and gathering as many variables as possible. The study first addresses multicollinearity by removing and combining variables. Five variable selection algorithms are then used to extract meaningful independent variables: Forward Selection, Backward Elimination, Stepwise Selection, L1 Regularization, and Principal Component Analysis (PCA). Four machine learning and deep learning algorithms, a deep neural network (DNN), XGBoost, CatBoost, and Linear Regression, are trained after hyperparameter optimization, and their predictive power is compared. In an additional experiment, the number of nodes and layers of the DNN is varied to find the most appropriate architecture. Finally, the best-performing model is used to predict the actual transaction prices of apartments in 2021, and the predictions are compared with the actual 2021 data. These results suggest that machine learning and deep learning can help investors make sound decisions when purchasing homes under various economic conditions.
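
A minimal sketch of the variable-selection-then-model-comparison pipeline the abstract describes, using L1 regularization as the selection step; the CSV file name, column names, and hyperparameters are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: select variables via L1 regularization, then compare models.
import pandas as pd
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor

df = pd.read_csv("apt_transactions_2015_2020.csv")  # hypothetical dataset
X, y = df.drop(columns=["price"]), df["price"]      # hypothetical target column
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# L1 regularization keeps only variables with non-zero coefficients.
lasso = LassoCV(cv=5).fit(X_tr, y_tr)
selected = X.columns[lasso.coef_ != 0]

models = {"linear": LinearRegression(), "xgboost": XGBRegressor(n_estimators=500)}
for name, model in models.items():
    model.fit(X_tr[selected], y_tr)
    mae = mean_absolute_error(y_te, model.predict(X_te[selected]))
    print(f"{name}: MAE={mae:,.0f}")
```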

Malicious Traffic Classification Using Mitre ATT&CK and Machine Learning Based on UNSW-NB15 Dataset (마이터 어택과 머신러닝을 이용한 UNSW-NB15 데이터셋 기반 유해 트래픽 분류)

  • Yoon, Dong Hyun;Koo, Ja Hwan;Won, Dong Ho
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.99-110 / 2023
  • This study proposes a classification of malicious network traffic using a cyber threat framework (MITRE ATT&CK) and machine learning to address the real-time traffic detection problems faced by current security monitoring systems. We mapped the UNSW-NB15 network traffic dataset to the MITRE ATT&CK framework to transform its labels and generated the final dataset through rare-class processing. After training several boosting-based ensemble models on the resulting dataset, we showed how these models classify network traffic using various performance metrics. Based on the F1 score, XGBoost without rare-class processing performed best in the multi-class traffic environment. Machine learning ensemble models built through MITRE ATT&CK label conversion and oversampling differ from existing studies, but they have limitations because (1) existing dataset labels cannot be matched perfectly to MITRE ATT&CK labels and (2) excessively sparse classes remain. Nevertheless, CatBoost with Borderline-SMOTE (B-SMOTE) achieved a classification accuracy of 0.9526, which is expected to enable automatic detection of normal and abnormal network traffic.
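
A minimal sketch of the rare-class oversampling plus boosting-ensemble step, assuming the UNSW-NB15 features and ATT&CK-mapped labels have already been prepared; the file names are hypothetical and the labels are assumed to be integer-encoded.

```python
# Hedged sketch: Borderline-SMOTE oversampling followed by boosting classifiers.
import numpy as np
from imblearn.over_sampling import BorderlineSMOTE
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from xgboost import XGBClassifier
from catboost import CatBoostClassifier

# Hypothetical pre-built arrays: features and ATT&CK-mapped integer labels.
X = np.load("unsw_nb15_attack_features.npy")
y = np.load("unsw_nb15_attack_labels.npy")
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.2, random_state=0)

# Borderline-SMOTE synthesizes minority-class samples near the class boundary.
X_bal, y_bal = BorderlineSMOTE(random_state=0).fit_resample(X_tr, y_tr)

for model in (XGBClassifier(), CatBoostClassifier(verbose=0)):
    model.fit(X_bal, y_bal)
    pred = model.predict(X_te)
    print(type(model).__name__, "macro F1:", f1_score(y_te, pred, average="macro"))
```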

Quantitative Estimation Method for ML Model Performance Change Due to Concept Drift (Concept Drift에 의한 ML 모델 성능 변화의 정량적 추정 방법)

  • Soon-Hong An;Hoon-Suk Lee;Seung-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.259-266 / 2023
  • It is very difficult to measure the performance of a machine learning model at the business service stage, so operational departments cannot manage model performance effectively. Academically, various studies have been conducted on concept drift detection methods to determine whether a model's state is still appropriate. Operational departments want to know the performance of an operating model quantitatively, but concept drift detection can only indicate the state of the model with respect to the data; it cannot estimate the model's quantitative performance. In this study, we propose a performance prediction model (PPM) that quantitatively estimates precision from concept drift statistics. The proposed approach induces artificial drift in samples extracted from the training data, measures the precision on those samples, builds a dataset of drift statistics and precision, and learns from it. The difference between actual and predicted precision on test data is then used to correct the error of the performance prediction model. The proposed PPM was applied to two models usable in real business, a loan underwriting model and a credit card fraud detection model, and it was confirmed that precision was predicted effectively.
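
A minimal sketch of the idea behind the PPM: inject artificial drift into samples of the training data, record a drift statistic together with the measured precision, and fit a regressor that maps drift to precision. The Gaussian-noise drift injection and the PSI-style drift statistic are assumptions for illustration, not the paper's exact procedure.

```python
# Hedged sketch: learn a drift-statistic -> precision regressor from perturbed samples.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import precision_score

def drift_statistic(ref, cur, bins=10):
    """A simple PSI-style divergence between reference and current feature values."""
    edges = np.histogram_bin_edges(ref, bins=bins)
    p = np.histogram(ref, bins=edges)[0] / len(ref) + 1e-6
    q = np.histogram(cur, bins=edges)[0] / len(cur) + 1e-6
    return float(np.sum((p - q) * np.log(p / q)))

def build_ppm(model, X_train, y_train, n_trials=50, rng=np.random.default_rng(0)):
    """model: an already-fitted binary classifier (e.g. loan underwriting)."""
    stats, precisions = [], []
    for _ in range(n_trials):
        idx = rng.choice(len(X_train), size=len(X_train) // 2, replace=False)
        X_s, y_s = X_train[idx].astype(float), y_train[idx]
        X_s += rng.normal(0, rng.uniform(0, 1.0), X_s.shape)  # artificial drift
        stats.append([drift_statistic(X_train[:, j], X_s[:, j]) for j in range(X_train.shape[1])])
        precisions.append(precision_score(y_s, model.predict(X_s)))
    return RandomForestRegressor().fit(np.array(stats), np.array(precisions))
```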

Estimation of Illuminant Chromaticity by Equivalent Distance Reference Illumination Map and Color Correlation (균등거리 기준 조명 맵과 색 상관성을 이용한 조명 색도 추정)

  • Kim Jeong Yeop
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.267-274 / 2023
  • In this paper, a method for estimating the illuminant chromaticity of the scene in an input image is proposed. The illuminant chromaticity is estimated using reference illuminant regions. The conventional method uses a fixed set of reference illuminants: by comparing the chromaticity distribution of the input image's pixels with chromaticity sets prepared in advance for each reference illuminant, the reference illuminant with the largest overlapping area is taken as the scene illuminant of that image. When calculating the overlapping area, the weights for each reference illuminant are applied in the form of a Gaussian distribution, but no clear standard for the variance value has been given. The proposed method extracts an independent reference chromaticity region for each reference illuminant, computes characteristic values in the r-g chromaticity plane of the RGB color coordinate system for all pixels of the input image, and evaluates the similarity between the image features and each independent chromaticity region; the illuminant with the highest similarity is taken as the illuminant chromaticity of the image. The performance of the proposed method was evaluated on a database of images and showed an average improvement of about 60% over the conventional basic method and about 53% over the conventional method with a Gaussian weight of 0.1.
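
A minimal sketch of mapping image pixels to the r-g chromaticity plane and scoring each reference illuminant by how much of the image's chromaticity mass falls inside its reference region; the histogram resolution and the overlap score are illustrative assumptions, not the paper's exact similarity measure.

```python
# Hedged sketch: r-g chromaticity histogram and reference-region overlap scoring.
import numpy as np

def rg_histogram(rgb, bins=64):
    """2D histogram in the r-g plane, where r = R/(R+G+B) and g = G/(R+G+B)."""
    rgb = rgb.reshape(-1, 3).astype(np.float64)
    s = rgb.sum(axis=1) + 1e-6
    r, g = rgb[:, 0] / s, rgb[:, 1] / s
    hist, _, _ = np.histogram2d(r, g, bins=bins, range=[[0, 1], [0, 1]])
    return hist / hist.sum()

def estimate_illuminant(image, reference_regions):
    """reference_regions: dict of illuminant name -> binary mask on the same r-g grid."""
    img_hist = rg_histogram(image)
    scores = {name: float((img_hist * region).sum()) for name, region in reference_regions.items()}
    return max(scores, key=scores.get)  # illuminant with the highest overlap
```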

Predicting the Number of Confirmed COVID-19 Cases Using Deep Learning Models with Search Term Frequency Data (검색어 빈도 데이터를 반영한 코로나 19 확진자수 예측 딥러닝 모델)

  • Sungwook Jung
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.387-398 / 2023
  • The COVID-19 outbreak has significantly impacted human lifestyles and behavior patterns. Because COVID-19 spreads through the air as well as through droplets and aerosols, people were advised to avoid face-to-face contact and crowded indoor places as much as possible. Therefore, a person who has been in contact with a COVID-19 patient, or who visited a place where a confirmed case occurred and is worried about infection, can reasonably be expected to search for COVID-19 symptoms on Google. In this study, an exploratory analysis using deep learning models (DNN and LSTM) was conducted to see whether the number of confirmed COVID-19 cases can be predicted by once again drawing on Google Trends, which previously played a major role in influenza surveillance and management, and combining it with data on confirmed COVID-19 cases. Notably, the search term frequency data used in this study are publicly available and do not invade privacy. With the deep neural network model, Seoul (9.6 million), the most populous city in South Korea, and Busan (3.4 million), the second most populous, recorded lower error rates when the forecasts included search term frequency data. These results indicate that search term frequency data plays an important role in cities with a population above a certain size. We also hope that these predictions can serve as evidence for policy decisions, such as relaxing regulations or implementing stronger preventive measures.
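
A minimal sketch of an LSTM forecaster that combines past case counts with search term frequency as a two-feature sequence; the window length, layer sizes, and the randomly generated input series are illustrative assumptions.

```python
# Hedged sketch: LSTM on (case count, search frequency) windows to predict next-day cases.
import numpy as np
import tensorflow as tf

WINDOW = 14  # use the past two weeks to predict the next day (assumed)

def make_windows(cases, trends):
    """Stack daily case counts and search term frequency as a 2-feature sequence."""
    features = np.stack([cases, trends], axis=-1)
    X = np.stack([features[i:i + WINDOW] for i in range(len(cases) - WINDOW)])
    y = cases[WINDOW:]
    return X, y

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, 2)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mae")

# The series would be daily data for one city (e.g. Seoul); random placeholders here.
cases, trends = np.random.rand(365), np.random.rand(365)
X, y = make_windows(cases, trends)
model.fit(X, y, epochs=5, verbose=0)
```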

Multi-Object Goal Visual Navigation Based on Multimodal Context Fusion (멀티모달 맥락정보 융합에 기초한 다중 물체 목표 시각적 탐색 이동)

  • Jeong Hyun Choi;In Cheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.407-418 / 2023
  • Multi-Object Goal Visual Navigation (MultiOn) is a visual navigation task in which an agent must visit multiple object goals in an unknown indoor environment in a given order. Existing models for the MultiOn task cannot exploit an integrated view of multimodal context because they use only a unimodal context map. To overcome this limitation, this paper proposes a novel deep neural network-based agent model for the MultiOn task. The proposed model, MCFMO, uses a multimodal context map containing visual appearance features, semantic features of environmental objects, and goal object features. It fuses these three heterogeneous features into a global multimodal context map with a point-wise convolutional neural network module. The model also adopts an auxiliary task learning module that predicts the observation status, goal direction, and goal distance, which helps it learn the navigation policy efficiently. Through various quantitative and qualitative experiments using the Habitat-Matterport3D simulation environment and scene dataset, we demonstrate the superiority of the proposed model.
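
A minimal sketch of fusing three heterogeneous context maps (visual appearance, object semantics, goal features) with a point-wise (1x1) convolution, as the abstract describes; the channel sizes and map resolution are illustrative assumptions, not the model's actual configuration.

```python
# Hedged sketch: point-wise (1x1) convolutional fusion of three context maps.
import torch
import torch.nn as nn

class PointwiseFusion(nn.Module):
    def __init__(self, c_visual=32, c_semantic=16, c_goal=8, c_out=64):
        super().__init__()
        # A 1x1 convolution mixes channels at every map cell without spatial mixing.
        self.fuse = nn.Sequential(
            nn.Conv2d(c_visual + c_semantic + c_goal, c_out, kernel_size=1),
            nn.ReLU(),
        )

    def forward(self, visual, semantic, goal):
        return self.fuse(torch.cat([visual, semantic, goal], dim=1))

fusion = PointwiseFusion()
v = torch.randn(1, 32, 128, 128)   # visual appearance map
s = torch.randn(1, 16, 128, 128)   # object semantic map
g = torch.randn(1, 8, 128, 128)    # goal object map
print(fusion(v, s, g).shape)       # -> torch.Size([1, 64, 128, 128])
```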

Real Estate Asset NFT Tokenization and FT Asset Portfolio Management (부동산 유동화 NFT와 FT 분할 거래 시스템 설계 및 구현)

  • Young-Gun Kim;Seong-Whan Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.419-430 / 2023
  • Currently, NFTs have no dominant application other than proving ownership of digital content, and they also suffer from low liquidity, which makes their prices difficult to predict. Real estate usually has very high barriers to investment because of its high price. Real estate can be converted into NFTs and divided into small-value fungible tokens (FTs), which can enlarge the investor community through greater liquidity and better accessibility. In this paper, we design and implement a system that allows ordinary users to invest in high-priced real estate through a Black-Litterman (BL) model-based portfolio investment interface. To this end, we target a set of real estate properties pegged as collateral and issue NFTs for the collateral on a blockchain. We use an oracle to obtain current real estate information and to monitor changing prices. After tokenizing the real estate into NFTs, we divide the NFTs into FTs at an easily accessible price, thereby lowering the entry price and providing high liquidity while limiting price volatility. In addition, we implemented a BL-based asset portfolio interface for composing effective portfolios when investing small amounts across multiple properties; using the BL model, investors can determine their asset portfolio. We implemented the whole system using Solidity smart contracts on the Flask web framework, with public data portals serving as oracle interfaces.
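
A minimal sketch of the Black-Litterman posterior-return computation that a BL-based portfolio interface would rely on; the three tokenized properties, the prior, and the investor view are illustrative numbers only, not the system's actual inputs.

```python
# Hedged sketch: standard Black-Litterman posterior expected returns.
import numpy as np

def black_litterman(pi, sigma, P, Q, omega, tau=0.05):
    """Combine equilibrium returns (pi) with investor views (P, Q, omega)."""
    ts_inv = np.linalg.inv(tau * sigma)
    middle = np.linalg.inv(ts_inv + P.T @ np.linalg.inv(omega) @ P)
    return middle @ (ts_inv @ pi + P.T @ np.linalg.inv(omega) @ Q)

pi = np.array([0.04, 0.05, 0.03])        # equilibrium returns of three tokenized properties
sigma = np.diag([0.02, 0.03, 0.015])     # covariance of property returns
P = np.array([[1.0, -1.0, 0.0]])         # view: property 1 outperforms property 2
Q = np.array([0.02])                     # ... by 2%
omega = np.array([[0.005]])              # uncertainty of that view
print(black_litterman(pi, sigma, P, Q, omega))
```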

Intrusion Detection Method Using Unsupervised Learning-Based Embedding and Autoencoder (비지도 학습 기반의 임베딩과 오토인코더를 사용한 침입 탐지 방법)

  • Junwoo Lee;Kangseok Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.8 / pp.355-364 / 2023
  • As advanced cyber threats continue to increase, it is difficult to detect new types of cyber attacks with existing pattern- or signature-based intrusion detection methods, so research on anomaly detection methods using data-driven artificial intelligence technology is increasing. Supervised anomaly detection methods are difficult to use in real environments because they require sufficient labeled data for training; therefore, unsupervised methods that learn from normal data and detect anomalies by finding patterns in the data itself have been actively studied. This study aims to extract latent vectors that preserve useful sequence information from sequence log data and to develop an anomaly detection model using the extracted latent vectors. Word2Vec was used to create dense vector representations corresponding to the characteristics of each sequence, and unsupervised autoencoders were developed to extract latent vectors from the sequence data expressed as dense vectors. Three autoencoder variants were used: a GRU (Gated Recurrent Unit)-based denoising autoencoder suited to sequence data, a one-dimensional convolutional autoencoder to address the limited short-term memory problem that a GRU can have, and an autoencoder combining a GRU with one-dimensional convolution. The data used in the experiments is the time-series-based NGIDS (Next Generation IDS Dataset). The experiments showed that the autoencoder combining a GRU with one-dimensional convolution was better than the GRU-only or convolution-only autoencoders: it was more efficient in training time for extracting useful latent patterns from the training data and showed stable anomaly detection performance with smaller fluctuations.
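
A minimal sketch of the autoencoder variant that combines one-dimensional convolution with a GRU, operating on Word2Vec-embedded sequences; the sequence length, embedding size, and layer widths are illustrative assumptions.

```python
# Hedged sketch: Conv1D + GRU autoencoder over Word2Vec-embedded sequence data.
import tensorflow as tf

SEQ_LEN, EMB_DIM = 100, 64  # assumed sequence length and Word2Vec dimension

inputs = tf.keras.layers.Input(shape=(SEQ_LEN, EMB_DIM))
# Encoder: local patterns via 1D convolution, then a sequence summary via GRU.
x = tf.keras.layers.Conv1D(64, kernel_size=3, padding="same", activation="relu")(inputs)
latent = tf.keras.layers.GRU(32)(x)  # latent vector preserving sequence information
# Decoder: expand the latent vector back to the full sequence.
x = tf.keras.layers.RepeatVector(SEQ_LEN)(latent)
x = tf.keras.layers.GRU(64, return_sequences=True)(x)
outputs = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(EMB_DIM))(x)

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# Anomalies are flagged when reconstruction error exceeds a threshold chosen on normal data.
```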

General Relation Extraction Using Probabilistic Crossover (확률적 교차 연산을 이용한 보편적 관계 추출)

  • Je-Seung Lee;Jae-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.8 / pp.371-380 / 2023
  • Relation extraction extracts relationships between named entities from text. Traditionally, relation extraction methods extract relations only between predetermined subject and object entities. In end-to-end relation extraction, however, all possible relations must be extracted by considering the positions of the subject and object for every pair of entities, which uses time and resources inefficiently. To alleviate this problem, this paper proposes a method that sets directions based on the positions of the subject and object and extracts relations according to those directions. The proposed method uses existing relation extraction data to generate direction labels indicating the direction in which the subject points to the object in a sentence, adds entity position tokens and entity types to sentences so that a pre-trained language model (KLUE-RoBERTa-base, RoBERTa-base) can predict the directions, and generates representations of the subject and object entities through a probabilistic crossover operation. These representations are then used to extract relations. Experimental results show that the proposed model performs about 3~4%p better than a method that predicts integrated labels. In addition, when the proposed model was trained on Korean and English data, performance was 1.7%p higher in English than in Korean, owing to the amount of data and differences between the languages, and the parameter values that produced the best performance differed. By limiting the number of directional cases that must be considered, the proposed model can reduce wasted resources in end-to-end relation extraction.
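
A minimal sketch of one plausible reading of the probabilistic crossover between subject and object entity representations: an element-wise Bernoulli mixing of the two vectors. The paper's exact operation may differ; the vector size and crossover probability are illustrative assumptions.

```python
# Hedged sketch: element-wise probabilistic crossover of two entity vectors.
import torch

def probabilistic_crossover(subj: torch.Tensor, obj: torch.Tensor, p: float = 0.5):
    """Swap each hidden dimension between the two entity vectors with probability p."""
    mask = (torch.rand_like(subj) < p).float()
    crossed_subj = mask * subj + (1 - mask) * obj
    crossed_obj = mask * obj + (1 - mask) * subj
    return crossed_subj, crossed_obj

# Entity vectors would come from a pre-trained encoder such as KLUE-RoBERTa-base;
# random placeholders stand in here.
subj = torch.randn(1, 768)
obj = torch.randn(1, 768)
s_new, o_new = probabilistic_crossover(subj, obj)
print(s_new.shape, o_new.shape)
```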