• Title/Summary/Keyword: 피처모델 (feature model)

A Study on Production Prediction Model using a Energy Big Data based on Machine Learning (에너지 빅데이터를 활용한 머신러닝 기반의 생산 예측 모형 연구)

  • Kang, Mi-Young;Kim, Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.453-456 / 2022
  • The role of the power grid is to ensure a stable power supply, which requires measures to prepare for instability that arises without notice. After the relationships between features are identified through exploratory analysis of weather data, a machine learning-based energy production prediction model is built. In this study, prediction reliability was increased by using principal component analysis to extract the features that affect energy production and then applying them to the machine learning model. The proposed model was used to predict energy production for a specific period, and comparing the predictions with the actual production values for that period confirmed the performance of the prediction based on principal component analysis.

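The principal component step described in the abstract above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' pipeline: the weather rows are made-up numbers, the helper names (`center`, `covariance`, `first_component`) are hypothetical, and power iteration is used here only as one simple way to obtain the leading component.

```python
def center(rows):
    """Subtract the per-column mean from every row."""
    means = [sum(col) / len(col) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    """Leading eigenvector of cov via power iteration (unit length)."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Made-up weather rows: [solar irradiance, temperature], for illustration only.
weather = [[2.0, 1.0], [3.0, 2.5], [4.0, 2.9], [5.0, 4.2], [6.0, 5.1]]
centered = center(weather)
cov = covariance(centered)
pc1 = first_component(cov)
# One PCA score per row; these would be the inputs to a downstream model.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in centered]
```

In a setting like the one the abstract describes, the `scores` column would replace the correlated raw weather features as input to whatever regression model is used.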

Automated Scoring of Argumentation Levels and Analysis of Argumentation Patterns Using Machine Learning (기계 학습을 활용한 논증 수준 자동 채점 및 논증 패턴 분석)

  • Lee, Manhyoung;Ryu, Suna
    • Journal of The Korean Association For Science Education / v.41 no.3 / pp.203-220 / 2021
  • We explored ways to improve the performance of automated scoring for scientific argumentation and analyzed argumentation patterns using the automated scoring models. For this purpose, we assessed the level of argumentation in students' scientific discourse in classrooms. The dataset consists of four units of argumentation features and argumentation levels for episodes. We used argumentation clusters and n-grams to enhance automated scoring accuracy. Three supervised learning algorithms yielded 33 automated scoring models, which achieved a scoring accuracy of 77.59% on average and up to 85.37%. In this process, we found that argumentation cluster patterns could enhance automated scoring accuracy. We then analyzed argumentation patterns using decision tree and random forest models. Our results were consistent with previous research in which justification in coordination with claim and evidence determines the quality of scientific argumentation. Our research method suggests a novel approach for analyzing the quality of scientific argumentation in classrooms.
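
As a rough illustration of the n-gram features mentioned in the abstract above, the sketch below counts contiguous n-grams over a sequence of argumentation codes. The code labels (`claim`, `evidence`, `justification`) and the function name `ngram_counts` are assumptions for illustration, not the authors' coding scheme.

```python
from collections import Counter

def ngram_counts(codes, n):
    """Count contiguous n-grams in a sequence of argumentation codes."""
    grams = [tuple(codes[i:i + n]) for i in range(len(codes) - n + 1)]
    return Counter(grams)

# Hypothetical coded episode of classroom discourse.
episode = ["claim", "evidence", "justification", "claim", "evidence"]
bigrams = ngram_counts(episode, 2)
```

Counts like these could serve as input features for a supervised classifier alongside the per-episode argumentation features.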

The Enhancement of intrusion detection reliability using Explainable Artificial Intelligence(XAI) (설명 가능한 인공지능(XAI)을 활용한 침입탐지 신뢰성 강화 방안)

  • Jung Il Ok;Choi Woo Bin;Kim Su Chul
    • Convergence Security Journal / v.22 no.3 / pp.101-110 / 2022
  • As artificial intelligence is used in more and more fields, attempts to solve problems in intrusion detection with it are also increasing. However, the black-box nature of machine learning, which cannot explain or trace the reasons behind its predictions, creates difficulties for the security professionals who must use it. To address this problem, research on explainable AI (XAI), which helps interpret and understand machine learning decisions, is increasing in various fields. In this paper, we therefore propose an explainable AI approach to enhance the reliability of machine learning-based intrusion detection predictions. First, the intrusion detection model is implemented with XGBoost, and the explanation of the model is implemented using SHAP. Comparing and analyzing conventional feature importance against the SHAP results then gives security experts a basis for trusting the predictions when making decisions. The experiment used the PKDD2007 dataset and analyzed the association between conventional feature importance and SHAP values, verifying that SHAP-based explainable AI is valid for giving security experts confidence in the predictions of intrusion detection models.
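
SHAP values are based on Shapley values, and the idea can be shown on a toy model. The sketch below computes exact Shapley values by brute force; it does not use XGBoost or the SHAP library, and the scoring function, feature names, and baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost is exponential in the number of features, so this only
    suits toy examples.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical request scorer over [url_length, special_chars, has_sql_keyword].
def toy_score(f):
    return 0.01 * f[0] + 0.1 * f[1] + 2.0 * f[2] * f[1]

x = [120, 5, 1]
baseline = [40, 0, 0]
phi = shapley_values(toy_score, x, baseline)
```

The attributions sum to the difference between the score for `x` and the score for `baseline`, which is the property that makes per-feature explanations of a single prediction comparable with global feature importance.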

Spatial Big Data Query Processing System Supporting SQL-based Query Language in Hadoop (Hadoop에서 SQL 기반 질의언어를 지원하는 공간 빅데이터 질의처리 시스템)

  • Joo, In-Hak
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.10 no.1 / pp.1-8 / 2017
  • In this paper we present a spatial big data query processing system that stores spatial data in Hadoop and queries the data with an SQL-based query language. The system stores large-scale spatial data in an HDFS-based storage system and supports spatial queries expressed in an SQL-based query language extended for spatial data processing. The query language supports the standard spatial data types and functions defined in the OGC simple feature model. This paper presents the development of the system's core functions, including query language parsing, query validation, query planning, and connection with the storage system. We compare the performance of the proposed system with an existing system, and our experiments show about a 58% improvement in query execution time over the existing system when executing region queries on spatial data stored in Hadoop.
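
To make the term "region query" concrete, the sketch below filters point features by a rectangular region in memory. It is only an illustration of the query semantics: the `Point` and `Envelope` types and the `region_query` function are hypothetical and say nothing about the system's query language, planner, or HDFS storage.

```python
from typing import NamedTuple

class Point(NamedTuple):
    id: str
    x: float
    y: float

class Envelope(NamedTuple):
    """Axis-aligned rectangle used as the query region."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, p: Point) -> bool:
        # Boundary points count as inside in this sketch.
        return (self.min_x <= p.x <= self.max_x
                and self.min_y <= p.y <= self.max_y)

def region_query(points, region):
    """Return the points that fall inside the region (linear scan)."""
    return [p for p in points if region.contains(p)]

points = [Point("a", 1.0, 1.0), Point("b", 5.0, 5.0), Point("c", 2.5, 0.5)]
hits = region_query(points, Envelope(0.0, 0.0, 3.0, 3.0))
```

A distributed system would partition or index the data so that most blocks can be skipped rather than scanned, but the predicate applied to each candidate feature is the same.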

A label induction method in the conditional random fields expressing long distance transition between separate entities in clinical narratives (임상 문서에서 서로 떨어진 개체명 간 전이 관계 표현을 위한 조건부무작위장 내 라벨 유도 기법 연구)

  • Lee, Wangjin;Choi, Jinwook
    • Annual Conference on Human and Language Technology / 2018.10a / pp.172-175 / 2018
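  • In clinical documents that describe a patient's medical history, clinical named entities are separated from one another by intervening non-entity words. Because the conditional random fields (CRF) model widely used for clinical named entity recognition follows the Markov property, transition information between separated entity labels is ignored in the model's computation. This paper introduces a method that modifies the structure of the CRF model so that the labeling model can express transition relations between separated entities. To keep computation efficient, the proposed CRF design retains a first-order structure that preserves the Markov property. To pass the label of a preceding entity on to a following entity, the model transfers the preceding entity's label onto the non-entity labels that follow it, so that the following entity can see the label of the preceding one. Although higher-order label transition information is conveyed, the model keeps a first-order transition structure and therefore remains faster to compute than an n-th order model. For evaluation, we used discharge summaries of patients discharged from the Division of Rheumatology at Seoul National University Hospital, tagged with history-related entities, together with the clinical named entity extraction data from the i2b2 2012/VA clinical natural language processing shared task, and compared the model against baseline CRF models (first- and second-order). Evaluating the models across feature combinations, we observed that the proposed model improved F1-score over the baseline models in almost all cases.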

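The label induction idea in the abstract above, carrying the preceding entity's label onto the non-entity labels that follow it, can be sketched as a preprocessing step on the label sequence. The `O-<ENTITY>` naming, the entity label names, and the `induce_labels` function are assumptions for illustration; the paper's actual label scheme may differ.

```python
def induce_labels(labels, outside="O"):
    """Rewrite each outside label to remember the most recent entity label.

    A first-order model trained on the rewritten sequence can then see,
    at the next entity, which entity label came before it.
    """
    induced = []
    last_entity = None
    for label in labels:
        if label == outside:
            induced.append(f"{outside}-{last_entity}" if last_entity else outside)
        else:
            last_entity = label
            induced.append(label)
    return induced

# Hypothetical label sequence for one sentence of a discharge summary.
labels = ["O", "PROBLEM", "O", "O", "TREATMENT", "O"]
induced = induce_labels(labels)
```

With this rewriting, the transition into `TREATMENT` comes from `O-PROBLEM` rather than a plain `O`, so a first-order transition carries information that would otherwise need a higher-order model.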

Data Preprocessing Method for Lightweight Automotive Intrusion Detection System (차량용 경량화 침입 탐지 시스템을 위한 데이터 전처리 기법)

  • Sangmin Park;Hyungchul Im;Seongsoo Lee
    • Journal of IKEEE / v.27 no.4 / pp.531-536 / 2023
  • This paper proposes a sliding window method with frame feature insertion for immediate attack detection on in-vehicle networks. This method guarantees real-time attack detection by labeling based on the attack status of the current frame. Experiments show that the proposed method improves detection performance by giving more weight to the current frame in CNN computation. The proposed model was designed based on a lightweight LeNet-5 architecture and it achieves 100% detection for DoS attacks. Additionally, by comparing the complexity with conventional models, the proposed model has been proven to be more suitable for resource-constrained devices like ECUs.
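
The labeling rule described in the abstract above, where each window takes the attack status of its current (most recent) frame, can be sketched as follows. The frame values, the `sliding_windows` function name, and the window size are hypothetical, and the paper's frame feature insertion step is not modeled here.

```python
def sliding_windows(frames, attack_flags, size):
    """Build overlapping windows labeled by the status of the newest frame.

    frames: per-frame feature values (here, made-up CAN IDs)
    attack_flags: 1 if the frame at that index is an attack frame, else 0
    """
    windows = []
    for end in range(size, len(frames) + 1):
        window = frames[end - size:end]
        label = attack_flags[end - 1]  # status of the current frame only
        windows.append((window, label))
    return windows

# Made-up trace: two injected frames with ID 0x000 in the middle.
frames = [0x100, 0x200, 0x000, 0x000, 0x100]
flags = [0, 0, 1, 1, 0]
windows = sliding_windows(frames, flags, size=3)
```

Labeling by the newest frame means a window is flagged as soon as an attack frame arrives, rather than only after attack frames fill most of the window.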

Applicability of Blockchain to Media and Analysis of Blockchain Architecture (블록체인 미디어 적용 가능성과 블록체인 아키텍처 분석)

  • Geum, Chang-Seop
    • Broadcasting and Media Magazine / v.23 no.3 / pp.85-91 / 2018
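  • Interest in blockchain has been growing in the media industry. This article examines whether blockchain, a trust infrastructure that creates confidence in a hyper-connected society, can be applied to the media industry. To that end, it considers why blockchain has come into the spotlight as a trust technology in the era of the Fourth Industrial Revolution and reviews cases in which it has been used in the media industry. Actually introducing blockchain into the media industry requires an analysis of its technological maturity. At present, the lack of an overall view of blockchain has led to a profusion of excessive optimism or pessimism about the meeting of media and blockchain. As a way through this problem, the architectures of Ethereum and Hyperledger Fabric, two representative open-source blockchain platforms, were reconstructed and analyzed. The analysis found that current blockchain platforms still fall short of the features the media industry requires. It should be recognized that blockchain technology today resembles the early days of the Internet. In the mid to long term, if research on blockchain-based direct-transaction business models is accompanied by research and development of blockchain media service platforms that can realize them, the media industry is expected to be shaken significantly by blockchain.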

CMDNet: Single Shot Architecture for Clickable Mobile Screen Object Detection (CMDNet: 클릭 가능한 모바일 화면 객체 탐지를 위한 싱글 샷 아키텍처)

  • Jo, Min-Seok;Han, Seong-Soo;Jeong, Chang-Sung
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.418-421 / 2021
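  • We propose an object detection network architecture for recognizing clickable objects on mobile device screens. With DSSD as the baseline, it uses a backbone network with added SE blocks, SSD layers, and an FPN structure. The network's input resolution is changed from the conventional 1:1 aspect ratio to a 1:2 ratio similar to that of mobile screens, so that features are extracted efficiently. We also build an efficient dataset for training the model: data were collected for clickable objects on mobile screens, and a total of 24,937 annotations were subdivided into 8 categories such as Text, Image, Button, and Region.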

Navigable Space-Relation Model for Indoor Space Analysis (실내 공간 분석을 위한 보행 공간관계 모델)

  • Lee, Seul-Ji;Lee, Ji-Yeong
    • Spatial Information Research / v.19 no.5 / pp.75-86 / 2011
  • Three-dimensional modeling of real-world cities is an essential task for city planning and decision-making. Many three-dimensional city models are being developed alongside the growth of wireless Internet and location-based services, which identify users' locations and provide information to consumers. In urban areas of Korea in particular, indoor as well as outdoor space modeling is needed because of the density of high-rise buildings. Location-based services should also be provided through spatial analysis, such as shortest-path computation, based on a space model. Many studies of three-dimensional city models use feature models. In a feature model, space is represented by combining primitives, and relationships among spaces are represented only if shared primitives are detected, so relationships between complex three-dimensional objects are difficult to define. In this study, we develop the Navigable Space-Relation Model (NSRM), a topological data model that efficiently represents spatial relationships between objects based on a network structure.
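
The kind of shortest-path analysis that a network-based space model makes possible can be sketched on a small node-edge graph. The abstract does not specify the NSRM data structures, so the adjacency dictionary, the space names, and the `shortest_path` function below are assumptions; the search is a plain breadth-first search over unweighted connections.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Fewest-hops path between two spaces, or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Hypothetical two-floor building: nodes are spaces, edges are walkable links.
building = {
    "room101": ["corridor1"],
    "room102": ["corridor1"],
    "corridor1": ["room101", "room102", "stairs"],
    "stairs": ["corridor1", "corridor2"],
    "corridor2": ["stairs", "room201"],
    "room201": ["corridor2"],
}
route = shortest_path(building, "room101", "room201")
```

Because connectivity is stored explicitly as edges, the relationship between two rooms does not have to be inferred from shared geometric primitives.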

A Study on Machine Learning Based Anti-Analysis Technique Detection Using N-gram Opcode (N-gram Opcode를 활용한 머신러닝 기반의 분석 방지 보호 기법 탐지 방안 연구)

  • Kim, Hee Yeon;Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.181-192 / 2022
  • The emergence of new malware is incapacitating existing signature-based malware detection techniques, and the application of various anti-analysis techniques makes malware difficult to analyze. Recent studies on signature-based malware detection are limited in that malware creators can easily bypass them. In this study, we therefore build a machine learning model that detects and classifies the anti-analysis techniques of the packers applied to malware, rather than relying on the characteristics of the malware itself. N-gram opcodes are extracted from malicious binaries to which various anti-analysis techniques of commercial packers have been applied, features are extracted using TF-IDF, and each anti-analysis technique is then detected and classified. Real-world malware samples packed using Themida and VMProtect with multiple anti-analysis techniques were used to train and test 6 machine learning models, and the optimal model achieved 81.25% accuracy for Themida and 95.65% accuracy for VMProtect.
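
The feature extraction described in the abstract above, opcode n-grams weighted by TF-IDF, can be sketched in a few lines. The opcode sequences are made up, the function names are hypothetical, and the weighting is the plain textbook form (term frequency times log of inverse document frequency), which may differ from the variant the authors used.

```python
import math
from collections import Counter

def opcode_ngrams(opcodes, n):
    """Join each run of n consecutive opcodes into one term."""
    return [" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

def tfidf(documents):
    """Map each document (a list of terms) to a {term: weight} dict."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Made-up opcode traces standing in for two packed samples.
sample_a = ["push", "mov", "call", "push", "mov", "call"]
sample_b = ["push", "mov", "xor", "jmp"]
docs = [opcode_ngrams(sample_a, 2), opcode_ngrams(sample_b, 2)]
vectors = tfidf(docs)
```

N-grams that occur in every sample, such as `push mov` here, receive zero weight, while n-grams specific to one sample stand out, which is what makes them useful as classifier features.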