• Title/Abstract/Keywords: random sets

276 search results (processing time: 0.025 s)

An Algorithm for Improving the Accuracy of Privacy-Preserving Technique Based on Random Substitutions

  • 강주성;이창우;홍도원
    • The KIPS Transactions: Part C / Vol. 16C, No. 5 / pp.563-574 / 2009
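  • The random substitution technique is a practical privacy-preserving method with the advantages of broad applicability and guaranteed security in terms of privacy breaches. However, methods for improving the accuracy of random substitution for the sake of data utility have not been studied closely. In this paper, we propose an algorithm that improves accuracy, based on a more advanced theoretical analysis of the standard error of the random substitution technique. Through various experiments, we confirm that applying random substitution to original data following uniform and normal distributions yields accuracy that is not practical, and we verify the degree of accuracy improvement achieved by the improved algorithm. The proposed algorithm can raise accuracy to any desired level while maintaining the same privacy level as the existing random substitution technique, and we show that the additional computation it requires remains acceptable in practice.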

Comparison of tree-based ensemble models for regression

  • Park, Sangho;Kim, Chanmin
    • Communications for Statistical Applications and Methods / Vol. 29, No. 5 / pp.561-589 / 2022
  • When multiple classification and regression trees are combined, tree-based ensemble models such as random forest (RF) and Bayesian additive regression trees (BART) are produced. In this study, we compare the model structures and performances of various ensemble models in regression settings. RF learns from bootstrapped samples and selects a splitting variable from predictors gathered at each node. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data are investigated. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART outperforms RF on high-dimensional, highly correlated data. However, in all of the scenarios considered, RF has a shorter computation time. The performance of the two methods is also compared using two real data sets that represent the aforementioned situations, and the same conclusion is reached.

Application of Monte Carlo simulations to uncertainty assessment of ship powering prediction by the 1978 ITTC method

  • Seo, Jeonghwa;Park, Jongyeol;Go, Seok Cheon;Rhee, Shin Hyung;Yoo, Jaehoon
    • International Journal of Naval Architecture and Ocean Engineering / Vol. 13, No. 1 / pp.292-305 / 2021
  • The present study concerns the uncertainty assessment of powering prediction from towing tank model tests, as suggested by the International Towing Tank Conference (ITTC). The systematic uncertainty of towing tank tests was estimated from the ITTC allowances for test setup and measurement accuracy. The random uncertainty was varied from 0 to 8% of the measurement. Randomly generated inputs of test conditions and measurement data sets under systematic and random uncertainty were used to statistically analyze resistance and propulsive performance parameters at full scale. The error propagation through the extrapolation procedure was investigated in terms of sensitivity and the coefficient of determination. The uncertainty assessment showed that the uncertainty of the resulting powering prediction was smaller than the test uncertainty.
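
As a rough illustration of the Monte Carlo approach this abstract describes, the sketch below perturbs one measured input with a systematic bias and random noise, pushes each sample through an extrapolation function, and summarizes the spread of the result. The `extrapolate` function, the scale factor, the input value, and the uncertainty levels are all hypothetical placeholders; this is not the 1978 ITTC procedure and it does not reproduce the paper's results.

```python
import random
import statistics


def extrapolate(model_resistance):
    """Hypothetical stand-in for a model-to-full-scale extrapolation.

    NOT the 1978 ITTC method; a simple cubic scaling is assumed for illustration.
    """
    scale_factor = 25.0  # assumed geometric scale ratio (illustrative only)
    return model_resistance * scale_factor ** 3


def monte_carlo_uncertainty(measured, systematic_rel, random_rel,
                            n_samples=10_000, seed=0):
    """Propagate systematic and random input uncertainty by random sampling.

    Returns the mean of the extrapolated output and its relative standard deviation.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Systematic error: uniform within the assumed allowance.
        bias = rng.uniform(-systematic_rel, systematic_rel)
        # Random error: zero-mean Gaussian with the assumed relative spread.
        noise = rng.gauss(0.0, random_rel)
        outputs.append(extrapolate(measured * (1.0 + bias + noise)))
    mean = statistics.fmean(outputs)
    return mean, statistics.stdev(outputs) / mean


# Assumed inputs: 40.0 (arbitrary units), 1% systematic and 4% random uncertainty.
mean, rel_std = monte_carlo_uncertainty(measured=40.0,
                                        systematic_rel=0.01,
                                        random_rel=0.04)
print(f"mean = {mean:.1f}, relative std = {rel_std:.4f}")
```

Because this placeholder extrapolation is linear in its input, the relative spread of the output simply mirrors that of the input; a realistic procedure with several interacting inputs would behave differently.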

Random generator-controlled backpropagation neural network to predicting plasma process data

  • Kim, Sungmo;Kim, Sebum;Kim, Byungwhan
    • Korean Institute of Intelligent Systems: Conference Proceedings / Korea Fuzzy Logic and Intelligent Systems Society, ISIS 2003 / pp.599-602 / 2003
  • A new technique is presented to construct predictive models of plasma etch processes. This was accomplished by combining a backpropagation neural network (BPNN) and a random generator (RG). The RG played a critical role in controlling neuron gradients in the hidden layer. The predictive model constructed in this way is referred to as a randomized BPNN (RG-BPNN). The proposed scheme was evaluated with a set of experimental plasma etch process data. The etch process was characterized by a $2^3$ full factorial experiment. Four etch responses were modeled: aluminum (Al) etch rate, profile angle, Al selectivity, and dc bias. Additional test data were prepared to evaluate model appropriateness. The performance of the RG-BPNN was evaluated as a function of the number of hidden neurons and the range of the gradient. For a given range and number of hidden neurons, 100 sets of random neuron gradients were generated, and the best set among them was selected for evaluation. Compared to the conventional BPNN, the proposed RG-BPNN demonstrated about 50% improvement in all comparisons. This illustrates that the RG-BPNN with multi-valued gradients is an effective way to considerably improve the predictive ability of the current BPNN with a single-valued gradient.

Neighbor Discovery in a Wireless Sensor Network: Multipacket Reception Capability and Physical-Layer Signal Processing

  • Jeon, Jeongho;Ephremides, Anthony
    • Journal of Communications and Networks / Vol. 14, No. 5 / pp.566-577 / 2012
  • In randomly deployed networks, such as sensor networks, an important problem for each node is to discover its neighbor nodes so that connectivity amongst nodes can be established. In this paper, we consider this problem by incorporating physical-layer parameters, in contrast to most previous work, which assumed a collision channel. Specifically, the pilot signals that nodes transmit are successfully decoded if the strength of the received signal relative to the interference is sufficiently high. Thus, each node must extract signal parameter information from the superposition of an unknown number of received signals. This problem falls naturally within the purview of random set theory (RST), which generalizes standard probability theory by assigning sets, rather than values, to random outcomes. The contributions of the paper are twofold. First, we introduce the realistic effect of physical-layer considerations into the performance evaluation of logical discovery algorithms; such an introduction is necessary for an accurate assessment of how an algorithm performs. Second, given the double uncertainty of the environment (that is, the lack of knowledge of the number of neighbors along with the lack of knowledge of the individual signal parameters), we adopt the viewpoint of RST and demonstrate its advantage relative to the classical matched filter detection method.

Slangs and Short forms of Malay Twitter Sentiment Analysis using Supervised Machine Learning

  • Yin, Cheng Jet;Ayop, Zakiah;Anawar, Syarulnaziah;Othman, Nur Fadzilah;Zainudin, Norulzahrah Mohd
    • International Journal of Computer Science & Network Security / Vol. 21, No. 11 / pp.294-300 / 2021
  • Society today relies on social media on an everyday basis, which motivates the question of which supervised machine learning algorithms used in sentiment analysis are more accurate at detecting Malay internet slang and short forms that can be offensive to a person. This paper determines which of the chosen supervised machine learning algorithms detects internet slang and short forms with higher accuracy. To analyze the results of the supervised machine learning classifiers, we chose two datasets: one is political topic-based, and the other is the same set mixed with 50 tweets per targeted keyword. The datasets are manually labelled positive or negative, and the 275 tweets are then separated into training and testing sets. Naïve Bayes and Random Forest classifiers are then analyzed and evaluated on their performance. Our experimental results show that Random Forest is a better classifier than Naïve Bayes.

Influence Measures for a Test Statistic on Independence of Two Random Vectors

  • Jung Kang-Mo
    • Communications for Statistical Applications and Methods / Vol. 12, No. 3 / pp.635-642 / 2005
  • In statistical diagnostics, a large number of influence measures have been proposed for identifying outliers and influential observations. However, there seem to be few accounts of influence diagnostics for test statistics. We study influence analysis of the likelihood ratio test statistic for whether two sets of variables are uncorrelated with one another. The influence of observations is measured using the case-deletion approach and the influence function. We compare the proposed influence measures through two illustrative examples.

A new heuristics for the generalized assignment problem

  • Joo, Jaehun
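    • Korean Operations Research and Management Science Society: Conference Proceedings / Korean Institute of Industrial Engineers and Korean Operations Research and Management Science Society, Proceedings of the 1995 Spring Joint Conference; Chonnam National University; 28-29 Apr. 1995 / pp.47-53 / 1995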
  • The Generalized Assignment Problem (GAP) determines the minimum-cost assignment of n tasks to m workstations such that each task is assigned to exactly one workstation, subject to the capacity of each workstation. In this paper, we present a new heuristic search algorithm for GAPs. We then test it on 4 different benchmark sample sets of random problems, generated according to a uniform distribution, on a microcomputer.
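
For orientation, the problem described in this abstract is commonly written as the integer program below. This is the standard textbook formulation, not one taken from the paper; the symbols $c_{ij}$ (cost of assigning task $j$ to workstation $i$), $a_{ij}$ (capacity consumed by that assignment), and $b_i$ (capacity of workstation $i$) are assumed notation.

```latex
\begin{align*}
\min \quad & \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{i=1}^{m} x_{ij} = 1, && j = 1, \dots, n
    && \text{(each task goes to exactly one workstation)} \\
& \sum_{j=1}^{n} a_{ij}\, x_{ij} \le b_i, && i = 1, \dots, m
    && \text{(workstation capacity)} \\
& x_{ij} \in \{0, 1\}, && i = 1, \dots, m,\; j = 1, \dots, n
\end{align*}
```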

Design of the intelligent control-based job scheduler

  • 이창훈;서기성;정현호;우광방
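    • Institute of Control, Robotics and Systems: Conference Proceedings / Institute of Control, Robotics and Systems, Proceedings of the 1989 Korean Automatic Control Conference; Seoul, Korea; 27-28 Oct. 1989 / pp.286-289 / 1989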
  • The purpose of this paper is to design a job scheduling algorithm utilizing an intelligent control technique. A rulebase is built through the evaluation of rule-set scheduling. Twenty-four scheduling rule-sets and meta-rules are employed. An appropriate scheduling rule-set is selected based on this rulebase and the current manufacturing system status. Six criteria are used to evaluate the performance of scheduling. The performance of scheduling depends on random breakdowns of the major FMS components during simulation.

On the Estimation in Regression Models with Multiplicative Errors

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / Vol. 10, No. 1 / pp.193-198 / 1999
  • The estimation of parameters in regression models with multiplicative errors is usually based on the gamma or log-normal likelihoods. Under reciprocal misspecification, we compare the small sample efficiencies of two sets of estimators via a Monte Carlo study. We further consider the case where the errors are a random sample from a Weibull distribution. We compute the asymptotic relative efficiency of quasi-likelihood estimators on the original scale to least squares estimators on the log-transformed scale and perform a Monte Carlo study to compare the small sample performances of quasi-likelihood and least squares estimators.
