• Title/Summary/Keyword: binary classification (이항 분류)

Search Result 51

Novel Intent Category Discovery using Contrastive Learning (대조학습을 활용한 새로운 의도 카테고리 발견)

  • Seungyeon Seo;Gary Geunbae Lee
    • Annual Conference on Human and Language Technology / 2023.10a / pp.107-112 / 2023
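  • Because labeled data are difficult to collect, semi-supervised and unsupervised learning, which train on unlabeled data, are being actively studied. As part of this line of work, this paper proposes the Novel Intent Category Discovery (NICD) problem and introduces a model intended as a baseline for NICD research. The NICD problem differs from existing semi-supervised learning problems in that the class sets of the labeled and unlabeled data do not overlap. The proposed model is built on RoBERTa with two added classifiers, and it predicts labels with a different classifier for the labeled dataset and for the unlabeled dataset. Training proceeds in two stages. In the first stage, a feature representation is learned from the labeled dataset. In the second stage, cross-entropy, binary cross-entropy, mean squared error, and supervised contrastive loss functions are adapted to the NICD problem and used for training. On the unlabeled dataset, the proposed model recorded an accuracy 24.74 higher than that of the best-performing image model.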
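
The abstract names several standard loss functions but does not say how they are adapted to NICD. As a point of reference only, the following minimal sketch shows the textbook forms of two of them, binary cross-entropy and mean squared error. The function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities.

    Textbook form only; the NICD-specific variant is not described in the abstract.
    """
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)

def mean_squared_error(targets, preds):
    """Mean of squared differences between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(targets, preds)) / len(targets)
```

A confident correct prediction gives a smaller binary cross-entropy than a confident wrong one, which is the property any adapted variant would presumably keep.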


A Study on Improvement of Design Method for Freeway Diverging Areas (고속도로 분류부 설계기법 개선 연구)

  • Park, Jae-Beom;Lee, Seung-Jun;Gang, Jeong-Gyu;Kim, Il-Hwan
    • Journal of Korean Society of Transportation / v.25 no.1 s.94 / pp.23-35 / 2007
  • Freeway diverging areas are very vulnerable to traffic accidents because of abrupt changes in vehicle speed and road geometry. Therefore, in designing diverging areas, much attention should be paid to safety. The present design criteria for freeway diverging areas regulate transition sections for lane changes, deceleration lanes, transition curves for direction changes, and other similar items. However, the design criteria were often violated in implementation because of ambiguities in the criteria. This study aims at clarifying and improving the present design criteria for freeway diverging areas. To this end, field survey data and traffic accident data for diverging areas were analyzed.

A study on variable selection and classification in dynamic analysis data for ransomware detection (랜섬웨어 탐지를 위한 동적 분석 자료에서의 변수 선택 및 분류에 관한 연구)

  • Lee, Seunghwan;Hwang, Jinsoo
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.497-505 / 2018
  • Ransomware attacks on computer systems are common all over the world. As antivirus and detection methods are constantly improved to detect and mitigate ransomware, the ransomware itself becomes equally better at avoiding detection. Several new methods have been implemented and tested to optimize protection against ransomware. In our work, 582 ransomware samples and 942 normalware samples, along with 30,967 dynamic action sequence variables, are used to detect ransomware efficiently. Several variable selection techniques are combined with various machine learning based classification techniques to protect systems from ransomware. Among the combinations tried, chi-square variable selection with random forest gives the best detection rates and accuracy.
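
To illustrate the chi-square variable selection step mentioned above, here is a minimal sketch that scores binary features against a binary class label with the Pearson chi-square statistic and keeps the top-ranked ones. Treating each dynamic action variable as a 0/1 indicator is an assumption made here for simplicity, and the function names are hypothetical; the abstract does not describe the authors' exact procedure.

```python
def chi_square_score(feature, labels):
    """Pearson chi-square statistic for a 2x2 table of a 0/1 feature vs. a 0/1 label."""
    n = len(labels)
    table = [[0, 0], [0, 0]]  # table[feature_value][label_value] holds observed counts
    for f, y in zip(feature, labels):
        table[f][y] += 1
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    score = 0.0
    for f in (0, 1):
        for y in (0, 1):
            expected = row[f] * col[y] / n
            if expected > 0:
                score += (table[f][y] - expected) ** 2 / expected
    return score

def select_top_k(columns, labels, k):
    """Return indices of the k feature columns with the largest chi-square scores."""
    scores = [chi_square_score(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

The selected columns would then be passed to a classifier such as a random forest, which is the combination the abstract reports as performing best.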

An Analysis of Factors Affecting Fintech Payment Service Acceptance Using Logistic Regression (로지스틱 회귀분석을 이용한 핀테크 결제 서비스 수용 요인 분석)

  • Hwang, Sin-Hae;Kim, Jeoung Kun
    • Journal of the Korea Society for Simulation / v.27 no.1 / pp.51-60 / 2018
  • This study aims to understand the crucial factors affecting users' adoption of Fintech payment services. On the basis of innovation diffusion theory and prior Fintech literature, this study classifies the factors influencing users' adoption of Fintech payment services into two dimensions: a service dimension containing complexity, perceived benefit, and trust in the service provider, and a user dimension containing personal innovativeness and security breach experience. The results of binary logistic regression show that the negative direct effects of perceived risk, complexity, and security accident experience on users' service adoption are statistically significant. Personal innovativeness has a positive effect on users' adoption of Fintech payment services. The moderation effect of security accident experience is also significant at p<0.05.
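
As a reminder of how binary logistic regression turns factor values into an adoption probability, the sketch below evaluates the logistic model for given coefficients and converts a coefficient into an odds ratio. The function names and any coefficient values passed to them are hypothetical; the abstract reports only the signs and significance of the effects.

```python
import math

def adoption_probability(intercept, coefs, features):
    """P(adopt) = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) for a binary logistic model."""
    z = intercept + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(coef):
    """Multiplicative change in the odds of adoption per unit increase in a factor."""
    return math.exp(coef)
```

A negative coefficient, as reported for complexity, corresponds to an odds ratio below 1 and lowers the predicted probability as the factor increases.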

The Detection of Unreliable Data in Survey Database (조사자료 데이터베이스의 허위 잠재 가능성 분류군 탐지)

  • Byon, Lu-Na;Han, Jeong-Hye
    • The KIPS Transactions:PartD / v.12D no.4 s.100 / pp.657-662 / 2005
  • Non-sampling error can occur at any time through intended or unintended errors by the interviewer or respondent, but it is very difficult to find in a survey database because it can hardly be computed mathematically and systematically. Until now, such errors have been found accidentally, through simple relations between items or through random field inspection. We therefore introduce a heuristic methodology that can detect interviewer error using statistical decision-making or data mining techniques, illustrated with a case study. It should help improve statistical quality and provide efficient field management for supervisors.

Fraud detection support vector machines with a functional predictor: application to defective wafer detection problem (불량 웨이퍼 탐지를 위한 함수형 부정 탐지 지지 벡터기계)

  • Park, Minhyoung;Shin, Seung Jun
    • The Korean Journal of Applied Statistics / v.35 no.5 / pp.593-601 / 2022
  • We call "fraud" the cases that occur infrequently but cause significant losses. Fraud detection is commonly encountered in various applications, including wafer production in the semiconductor industry. It is not trivial to directly extend standard binary classification methods to the fraud detection context because the misclassification cost of the fraud class is much higher than that of the normal class. In this article, we propose the functional fraud detection support vector machine (F2DSVM), which extends the fraud detection support vector machine (FDSVM) to handle functional covariates. The proposed method seeks a classifier for a functional predictor that achieves optimal performance while attaining the desired sensitivity level. F2DSVM, like the conventional SVM, has piecewise linear solution paths, allowing us to develop an efficient algorithm to recover entire solution paths, resulting in significantly improved computational efficiency. Finally, we apply the proposed F2DSVM to the defective wafer detection problem and assess its potential applicability.
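
The idea of meeting a desired sensitivity level can be illustrated with a much simpler stand-in than F2DSVM: given classifier scores, pick the largest decision threshold that still flags the required share of fraud cases. This is a post-hoc thresholding sketch and not the method proposed in the article; the function names are assumptions.

```python
import math

def threshold_for_sensitivity(scores, labels, target):
    """Largest threshold t such that flagging score >= t reaches the target sensitivity."""
    fraud_scores = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not fraud_scores:
        raise ValueError("at least one fraud case (label 1) is required")
    # Number of fraud cases that must score at or above the threshold.
    needed = max(1, math.ceil(target * len(fraud_scores)))
    return fraud_scores[needed - 1]

def sensitivity(scores, labels, threshold):
    """Share of fraud cases (label 1) whose score is at or above the threshold."""
    fraud = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in fraud) / len(fraud)
```

Lowering the threshold raises sensitivity at the cost of more false alarms on the normal class, which is the trade-off a sensitivity-constrained classifier has to manage.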

Factors Affecting Injury Severity in Pedestrian-Vehicle Crash by Novice Driver (초보 운전자에 의한 보행자-차량 교통사고의 심각도 영향 요인 분석)

  • Choe, Sae-Ro-Na;Park, Jun-Hyeong;O, Cheol
    • Journal of Korean Society of Transportation / v.29 no.4 / pp.43-51 / 2011
  • Since a variety of factors are associated with crash occurrence, analyzing the causes of crashes is a hard task for traffic researchers and engineers. Among the factors contributing to crashes, the characteristics of the driver are of keen interest. This study attempted to identify factors affecting the severity of pedestrian injury in collisions between pedestrians and vehicles. In particular, our analyses focused on novice drivers. A binary logistic regression technique was adopted for the analyses. The results showed that the driver's age, crash location, and frequency of violations were the dominant factors for severity. The findings are expected to provide useful information for effective policy- and education-based countermeasures.

Comparison of Methodologies for Characterizing Pedestrian-Vehicle Collisions (보행자-차량 충돌사고 특성분석 방법론 비교 연구)

  • Choi, Saerona;Jeong, Eunbi;Oh, Cheol
    • Journal of Korean Society of Transportation / v.31 no.6 / pp.53-66 / 2013
  • The major purpose of this study is to evaluate methodologies for predicting the injury severity of pedestrian-vehicle collisions. The methodologies evaluated and compared in this study are Binary Logistic Regression (BLR), the Ordered Probit Model (OPM), the Support Vector Machine (SVM), and the Decision Tree (DT) method. Valuable insights into applying these methodologies to analyze the characteristics of pedestrian injury severity are derived. For identifying causal factors affecting injury severity, statistical approaches such as BLR and OPM are recommended. On the other hand, to achieve better prediction performance, heuristic approaches such as SVM and DT are recommended. The outcome of this study is expected to be useful in developing various countermeasures for enhancing pedestrian safety.

Estimation of Freeway Accident Likelihood using Real-time Traffic Data (실시간 교통자료 기반 고속도로 교통사고 발생 가능성 추정 모형)

  • Park, Joon-Hyung;Oh, Cheol;NamKoong, Seong
    • Journal of Korean Society of Transportation / v.26 no.2 / pp.157-166 / 2008
  • This study proposed a model to estimate traffic accident likelihood using real-time traffic data obtained from freeway traffic surveillance systems. Traffic variables representing spatio-temporal variations in traffic conditions were used as independent variables in the proposed models. Binary logistic regression modeling was conducted to relate the traffic variables to accident data collected from the Seohaean freeway over three years, from 2004 to 2006. To obtain more reliable traffic variables, outlier filtering and data imputation were also performed. The model outputs, which are probabilistic measures of accident occurrence, could be used effectively not only in designing warning information systems but also in evaluating the effectiveness of various traffic operations strategies in terms of traffic safety.

Structure and expression of legal principles for artificial intelligence lawyers (인공지능 변호사를 위한 법리의 구조화와 그 표현)

  • Park, Bongcheol
    • Journal of the International Relations & Interdisciplinary Education / v.1 no.1 / pp.61-79 / 2021
  • In order to implement an artificial intelligence lawyer, this study looked at how to structure legal principles and then gave specific examples of how structured legal principles can be expressed in predicate logic. While previous studies suggested a method of introducing predicate logic for the reasoning engine of artificial intelligence lawyers, this study focused on expressing legal principles in predicate logic based on their structural form. Jurisprudence was limited to the content of articles and precedents, and the vertical hierarchy leading to 'legal facts - legal requirements - legal effect' and the horizontal hierarchy leading to 'legal effect - defense - defense' were examined. In addition, legal facts were classified, and it was explained that most legal facts can usually be expressed as unary or binary predicates. In future research, we plan to program the legal principles expressed in predicate logic and realize an inference engine for artificial intelligence lawyers.
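
To make the idea of unary and binary predicates concrete, the sketch below represents legal facts as small tuples and checks whether every fact required for a legal effect has been established. The class names, the predicate names, and the example facts are all hypothetical and chosen only for illustration; they do not come from the article.

```python
from typing import NamedTuple

class Unary(NamedTuple):
    """A fact about one entity, e.g. minor(alice)."""
    predicate: str
    subject: str

class Binary(NamedTuple):
    """A fact relating two entities, e.g. contract(alice, bob)."""
    predicate: str
    subject: str
    obj: str

def legal_effect(facts, requirements):
    """True when every required fact is among the established facts.

    Mirrors the vertical chain: legal facts -> legal requirements -> legal effect.
    """
    return all(req in facts for req in requirements)

# Hypothetical established facts and requirements.
facts = {Unary("minor", "alice"), Binary("contract", "alice", "bob")}
voidable = [Unary("minor", "alice"), Binary("contract", "alice", "bob")]
print(legal_effect(facts, voidable))
```

A fuller inference engine would also need rules and defenses, which this sketch leaves out.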