• Title/Abstract/Keywords: small data set

664 search results (processing time 0.037 s)

Effect of Sampling for Multi-set Cardinality Estimation

  • ;양대헌;이경희
    • KIPS Transactions on Computer and Communication Systems
    • /
    • Vol. 4, No. 1
    • /
    • pp.15-22
    • /
    • 2015
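  • Estimating the number of distinct elements in a multiset, excluding duplicates, is a well-known problem in network traffic measurement, and many algorithms have been proposed for it. Recently, an algorithm based on Linear Counting was developed that estimates multiset cardinality using very little memory. When there is too much data to process in full, a sample of the packets is used instead, and sampling is generally known to hurt accuracy. This paper shows, however, that when CSE is used to estimate multiset cardinality, sampling can actually give better results, in both accuracy and measurement range, than MCSE, which examines every packet. To demonstrate this, the authors carry out a mathematical analysis and experiments on real data, and compare CSE, MCSE, and CSES.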
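
The Linear Counting idea that the abstract builds on can be sketched as follows. This is a minimal illustration of the basic estimator only, not the CSE, MCSE, or CSES algorithms compared in the paper. The function name `linear_counting_estimate`, the bitmap size `m`, and the use of SHA-256 as the hash function are assumptions made for this example.

```python
import hashlib
import math


def linear_counting_estimate(items, m=1024):
    """Estimate the number of distinct items with an m-bit bitmap.

    Each item is hashed to one of m positions and that bit is set.
    With z bits still unset, the estimate is -m * ln(z / m).
    """
    bitmap = [False] * m
    for item in items:
        # SHA-256 is an assumption here; any well-mixed hash would do.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % m
        bitmap[index] = True

    zeros = bitmap.count(False)
    if zeros == 0:
        # Bitmap is saturated: the estimator is undefined, so we
        # signal it with infinity (a choice made for this sketch).
        return float("inf")
    return -m * math.log(zeros / m)


# Hypothetical usage: 500 distinct flow labels, each seen three times.
flows = [f"flow-{i}" for i in range(500)] * 3
print(round(linear_counting_estimate(flows)))
```

Because duplicates set the same bit, repeating an item never changes the bitmap, which is what lets the estimator count distinct elements rather than total elements.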

Abnormality Detection to Non-linear Multivariate Process Using Supervised Learning Methods

  • 손영태;윤덕균
    • IE Interfaces
    • /
    • Vol. 24, No. 1
    • /
    • pp.8-14
    • /
    • 2011
  • Principal Component Analysis (PCA) reduces the dimensionality of the process by creating a new set of variables, principal components (PCs), which attempt to reflect the true underlying process dimension. However, for highly nonlinear processes, this form of monitoring may not be efficient, since the process dimensionality cannot be represented by a small number of PCs. Examples include semiconductor, pharmaceutical, and chemical processes. Nonlinearly correlated process variables can be reduced to a set of nonlinear principal components through the application of Kernel Principal Component Analysis (KPCA). Support Vector Data Description (SVDD), which has roots in supervised learning theory, is a training algorithm based on structural risk minimization. Its control limit does not depend on the distribution, but adapts to the real data. This paper therefore proposes a nonlinear process monitoring technique based on supervised learning methods and KPCA. Simulated examples show that the proposed monitoring chart is more effective than the $T^2$ chart for nonlinear processes.
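
For reference, the SVDD boundary mentioned in this abstract is commonly written as the optimization problem below. This is the standard textbook formulation, not necessarily the exact variant used in the paper, and the symbols $a$, $R$, $\xi_i$, $C$, and $\phi$ are notation introduced here.

$$
\min_{R,\,a,\,\xi}\; R^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
\lVert \phi(x_i) - a \rVert^2 \le R^2 + \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,n.
$$

Here $a$ is the center and $R$ the radius of a hypersphere in the kernel feature space, and $C$ trades off the sphere's size against training points left outside it. A new observation is flagged when its distance to $a$ exceeds $R$, which is why the control limit follows the training data rather than an assumed distribution.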

3D Segmentation of a Diagnostic Object in Ultrasound Images Using LoG Operator

  • 정말남;곽종인;김상현;김남철
    • Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research
    • /
    • Vol. 24, No. 4
    • /
    • pp.247-257
    • /
    • 2003
  • This paper proposes a three-dimensional (3D) segmentation algorithm for extracting a diagnostic object from ultrasound images by using a LoG operator. In the proposed algorithm, 2D cutting planes are first obtained by the equiangular revolution of a cross-sectional plane about a reference axis of the 3D volume data. In each 2D ultrasound image, a region of interest (ROI) box that is included tightly in a diagnostic object of interest is set. Inside the ROI box, a LoG operator, where the value of $\sigma$ is adaptively selected by the distance between reference points and the variance of the 2D image, extracts edges in the 2D image. In post-processing, regions of the edge image are found by region filling, small regions in the region-filled image are removed, and the contour image of the object is obtained by morphological opening. Finally, a 3D volume of the diagnostic object is rendered from the set of contour images obtained by post-processing. Experimental results for tumor and gall bladder volume data show that the proposed method yields on average a twofold reduction in error rate over Krivanek's method when manually obtained results are used as reference data.
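
The edge-extraction step rests on the Laplacian-of-Gaussian (LoG) filter, whose response changes sign across an intensity edge. The sketch below illustrates that property with a fixed $\sigma$; the paper's adaptive choice of $\sigma$, the ROI box, and the post-processing are not reproduced. The helper names `log_kernel` and `filter_valid` are hypothetical.

```python
import math


def log_kernel(sigma, radius=None):
    """Build a sampled 2D Laplacian-of-Gaussian kernel as a list of rows."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))
    s2 = sigma * sigma
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            value = (-1.0 / (math.pi * s2 * s2)
                     * (1.0 - r2 / (2.0 * s2))
                     * math.exp(-r2 / (2.0 * s2)))
            row.append(value)
        kernel.append(row)
    # Shift the samples so they sum to zero; flat regions then give
    # a zero response despite truncation and sampling error.
    size = 2 * radius + 1
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def filter_valid(image, kernel):
    """Apply the kernel at every position where it fits inside the image."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    acc += image[i + a][j + b] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical test image: dark left half, bright right half.
step = [[0.0] * 10 + [1.0] * 10 for _ in range(7)]
response = filter_valid(step, log_kernel(sigma=1.0))[0]
# Positions where the response changes sign mark the edge.
crossings = [j for j in range(len(response) - 1)
             if response[j] * response[j + 1] < 0]
print(crossings)
```

Edge pixels are then taken at the zero crossings of the response, which is the usual way a LoG output is turned into an edge map.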

Price Forecasting on a Large Scale Data Set using Time Series and Neural Network Models

  • Preetha, KG;Remesh Babu, KR;Sangeetha, U;Thomas, Rinta Susan;Saigopika, Saigopika;Walter, Shalon;Thomas, Swapna
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 12
    • /
    • pp.3923-3942
    • /
    • 2022
  • Environment, price, regulation, and other factors influence the price of agricultural products, which is a social signal of product supply and demand. The price of many agricultural products fluctuates greatly due to the asymmetry between production and marketing details. Horticultural goods are particularly price sensitive because they cannot be stored for long periods of time. Forecasting the price of horticultural products is therefore crucial in designing a cropping plan. The proposed method guides farmers in planning the production and harvesting of agricultural products. Farmers can benefit from long-term forecasting since it helps them plan their planting and harvesting schedules. Customers can also profit from daily average price estimates for the short term. This paper studies time series models such as ARIMA and SARIMA and neural network models such as BPN and LSTM, which are used for wheat cost prediction in India. A large-scale available data set is collected and tested. The results show that, although ARIMA and SARIMA models are well suited for small-scale, continuous, and periodic data, BPN and LSTM provide more accurate and faster results for predicting weekly and monthly trends of price fluctuation.

Magnetic Field Correction Method of Magnetometers in Small Satellites

  • Lee, Seon-Ho;Rhee, Seung-Wu;Ahn, Hyo-Sung
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • Institute of Control, Robotics and Systems (ICROS), ICCAS 2003
    • /
    • pp.36-40
    • /
    • 2003
  • The satellite under consideration is supposed to operate in earth-point mode and sun-point mode in accordance with the mission requirements. The magnetic field correction is based on the orbit geometry, using a set of magnetic field data measured by the three-axis magnetometer, and its algorithm excludes the earth’s magnetic field model. Moreover, the usefulness of the proposed method is investigated through a simulation of KOMPSAT-1.

Definition of the neutronics benchmark of the NuScale-like core

  • Emil Fridman;Yurii Bilodid;Ville Valtavirta
    • Nuclear Engineering and Technology
    • /
    • Vol. 55, No. 10
    • /
    • pp.3639-3647
    • /
    • 2023
  • This paper defines a 3D full core neutronics benchmark which is based on the NuScale small modular reactor (SMR) concept. The paper provides a detailed description of the NuScale-like core, a list of expected outputs, and a reference solution to the benchmark exercises obtained with the Monte Carlo code Serpent. The benchmark was developed in the framework of the Euratom McSAFER project and can be used for verification of computational chains dedicated to 3D full-core neutronics simulations of water cooled SMRs. The paper is supplemented with a digital data set to ease the modeling process.

An Efficient Data Mining Algorithm based on the Database Characteristics

  • 박지현;고찬
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • Vol. 10, No. 1
    • /
    • pp.107-119
    • /
    • 2006
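  • With advances in Internet and Web technology, the amount of data accumulated in databases is growing rapidly. As the range of database applications expands, research on data mining, which seeks to discover useful knowledge in large databases, is being actively pursued. Most existing algorithms have evolved by reducing the candidate itemsets while also reducing the size of the database. However, these reductions are not needed at every stage of frequent itemset generation: they save time at some stages, but at others applying them costs more time than it saves. This paper proposes and experimentally evaluates an effective association rule mining algorithm for transaction databases in which transactions are short or the number of distinct items is relatively small. Using hashing, the algorithm scans the database once and stores every possible subset of each transaction in a hash table, so that it finds frequent itemsets in less time than existing algorithms and is unaffected by the minimum support.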
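
The single-scan idea described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: a Python `Counter` stands in for the hash table, and the function names `count_all_itemsets` and `frequent_itemsets` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_all_itemsets(transactions):
    """Scan the database once, counting every non-empty subset.

    A transaction with n distinct items contributes 2**n - 1 subsets,
    so this is only practical when transactions are short.
    """
    counts = Counter()
    for transaction in transactions:
        # Sort so each itemset has one canonical tuple as its key.
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    return counts


def frequent_itemsets(counts, min_support_count):
    """Keep the itemsets that occur at least min_support_count times."""
    return {itemset: count for itemset, count in counts.items()
            if count >= min_support_count}


# Hypothetical example database of four short transactions.
database = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
table = count_all_itemsets(database)
print(frequent_itemsets(table, min_support_count=2))
```

Because the table holds the count of every itemset, changing the minimum support only changes the final filter and needs no rescan, which matches the abstract's claim that the method is unaffected by the minimum support.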

SOHO Bankruptcy Prediction Using Modified Bagging Predictors

  • 김승혁;김종우
    • Journal of Intelligence and Information Systems
    • /
    • Vol. 13, No. 2
    • /
    • pp.15-26
    • /
    • 2007
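  • This study presents a bankruptcy prediction model for SOHOs using Modified Bagging Predictors, a modification of the existing Bagging Predictors. Many earlier studies have addressed bankruptcy prediction models for large enterprises and small and medium-sized enterprises, but research on models specific to SOHOs remains insufficient. Unlike loan screening for large enterprises and small and medium-sized enterprises, financial institutions' loan screening for SOHOs is not yet systematic and relies on fragmentary factors such as credit information scores, which creates a high risk that bad loans will lead to the institutions' insolvency. This study uses a SOHO loan data set from an actual domestic bank. Artificial neural networks and decision tree induction, which earlier studies found to perform well in bankruptcy prediction, were applied first but did not give satisfactory results. The study therefore applies Bagging Predictors, which had seen little use in earlier bankruptcy prediction research, and proposes an improved version, Modified Bagging Predictors. The results show that the proposed Modified Bagging Predictors outperform existing techniques such as artificial neural networks and Bagging Predictors in SOHO bankruptcy prediction.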
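
As background, plain bagging (bootstrap aggregating) can be sketched as follows. The abstract does not describe the modification behind Modified Bagging Predictors, so this shows only the unmodified baseline. The one-feature threshold classifier, the toy data, and all function names are assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a threshold rule on (x, label) pairs: predict 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        errors = sum((x >= threshold) != bool(label) for x, label in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(samples, n_models=25, seed=0):
    """Train one stump per bootstrap resample of the training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(samples) points with replacement.
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def bagging_predict(models, x):
    """Combine the stumps by majority vote."""
    votes = Counter(int(x >= threshold) for threshold in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: label 1 (say, "default") when the score is 5 or more.
training = [(score, int(score >= 5)) for score in range(10)]
ensemble = bagging_fit(training)
print(bagging_predict(ensemble, 9), bagging_predict(ensemble, 0))
```

An odd number of models avoids tied votes in the two-class case.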

DESIGN AND IMPLEMENTATION OF 3D TERRAIN RENDERING SYSTEM ON MOBILE ENVIRONMENT USING HIGH RESOLUTION SATELLITE IMAGERY

  • Kim, Seung-Yub;Lee, Ki-Won
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2006, Proceedings of ISRS 2006 PORSEC Volume I
    • /
    • pp.417-420
    • /
    • 2006
  • These days, mobile applications that deal with information content on mobile or handheld devices such as mobile communicators, PDAs, or WAP devices are among the most important industrial needs. The motivation of this study is the design and implementation of a mobile application that uses high-resolution satellite imagery, a large image data set. Although the major advantages of mobile devices are portability and mobility, limited system resources such as small memory, a slow CPU, low power, and a small screen are the main obstacles for developers who must handle a large volume of geo-based 3D models. Previous work has concentrated on GIS-based location awareness services on mobile devices; the mobile 3D terrain model that this study targets, with DEM (Digital Elevation Model) and high-resolution satellite imagery as source data, has not yet been considered in other mobile systems. The main functions of 3D graphic processing, or the pixel pipeline, in this prototype are implemented with the OpenGL|ES (Embedded System) standard API (Application Programming Interface) released by the Khronos Group. In the development stage, experiments are carried out to investigate the optimal operating environment and good performance: TIN-based vertex generation with regular elevation data, image tiling, image-vertex texturing, and text processing of Unicode and ASCII types.

Continuous and Accurate PCRAM Current-voltage Model

  • Jung, Chul-Moon;Lee, Eun-Sub;Min, Kyeong-Sik
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • Vol. 11, No. 3
    • /
    • pp.162-168
    • /
    • 2011
  • In this paper, we propose a new Verilog-A current-voltage model for multi-level-cell PCRAMs. This model can describe PCRAM operation not only in the full SET and RESET states but also in the partial resistance states. In addition, the three PCRAM operating regions (SET-RESET, Negative Differential Resistance, and strong-ON) are unified into one equation, so the new model contains no discontinuity that could cause a convergence problem. The percentage error between the measured data and this model is as small as 7.4% on average, compared to 60.1% for the previous piecewise model. The parameter extraction, which is embedded in the Verilog-A code, can be done automatically.