• Title/Abstract/Keywords: Data Density


Density-based Outlier Detection in Multi-dimensional Datasets

  • Wang, Xite;Cao, Zhixin;Zhan, Rongjuan;Bai, Mei;Ma, Qian;Li, Guanyu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 12 / pp. 3815-3835 / 2022
  • Density-based outlier detection is one of the hot issues in data mining. A point is identified as an outlier on the basis of the density of the points near it. Existing density-based detection algorithms have high time complexity. To reduce it, a new outlier detection algorithm, DODMD (Density-based Outlier Detection in Multidimensional Datasets), is proposed. Firstly, the concept of a micro-cluster is introduced on the basis of the ZH-tree. Each leaf node is regarded as a micro-cluster, and calculations are performed per micro-cluster to achieve batch filtering. To obtain n sets of approximate outliers quickly, a greedy method is used to calculate the boundary of the LOF (local outlier factor), and the minimum value is marked as LOFmin. Secondly, the outliers can be filtered out by LOFmin, the real outliers are calculated, and the result set is then updated to tighten the boundary. Finally, the accuracy and efficiency of the DODMD algorithm are verified on a real dataset and a synthetic dataset, respectively.
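For orientation, the LOF score that the abstract bounds can be sketched in a few lines. This is a brute-force version of the standard local outlier factor only, not DODMD: it has no ZH-tree, no micro-clusters, and no LOFmin pruning. The function name `lof_scores` and the sample points are hypothetical, and the sketch assumes distinct points and ignores distance ties.

```python
import math


def lof_scores(points, k):
    """Brute-force Local Outlier Factor for a small list of distinct points."""
    n = len(points)
    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours = []  # indices of the k nearest neighbours of each point
    k_distance = []  # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k-distance(j), d(i, j)).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical data: four points on a unit square plus one distant point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = lof_scores(sample, k=2)
```

Points inside a uniformly dense region score close to 1, while an isolated point scores well above 1. The brute-force version needs all pairwise distances, which is the cost that tree-based filtering schemes aim to avoid.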

Density Aware Energy Efficient Clustering Protocol for Normally Distributed Sensor Networks

  • Su, Xin;Choi, Dong-Min;Moh, Sang-Man;Chung, Il-Yong
    • 한국멀티미디어학회논문지 / Vol. 13, No. 6 / pp. 911-923 / 2010
  • In wireless sensor networks (WSNs), cluster-based data routing protocols have the advantages of reducing energy consumption and link maintenance cost. Unfortunately, most clustering protocols have been designed for uniformly distributed sensor networks, and some urgent situations do not allow thousands of sensor nodes to be deployed uniformly. For example, air vehicles or balloons may be responsible for deploying the sensor nodes, which leads to a normally distributed topology. To improve energy efficiency in such sensor networks, we propose a new cluster formation algorithm named DAEEC (Density Aware Energy-Efficient Clustering). In this algorithm, we define two kinds of clusters: Low Density (LD) clusters and High Density (HD) clusters. They are determined by the number of nodes participating in a cluster. During the data routing period, the HD clusters help the neighboring LD clusters forward the sensed data to the central base station. Thus, by considering the deployment density, DAEEC can distribute the energy dissipation evenly among all sensor nodes to improve network lifetime and average energy savings. Moreover, because the HD clusters are densely deployed, they can operate in the manner of our former algorithm EEVAR (Energy Efficient Variable Area Routing Protocol) to save energy. According to the performance analysis, DAEEC outperforms conventional data routing schemes in terms of energy consumption and network lifetime.

Efficiency and Robustness of Fully Adaptive Simulated Maximum Likelihood Method

  • Oh, Man-Suk;Kim, Dai-Gyoung
    • Communications for Statistical Applications and Methods / Vol. 16, No. 3 / pp. 479-485 / 2009
  • When part of the data is unobserved, the marginal likelihood of the parameters given the observed data often involves an analytically intractable high-dimensional integral, and hence it is hard to find the maximum likelihood estimate of the parameters. The simulated maximum likelihood (SML) method, which estimates the marginal likelihood via Monte Carlo importance sampling and optimizes the estimated marginal likelihood, has been used in many applications. A key issue in SML is to find a good proposal density from which Monte Carlo samples are generated. The optimal proposal density is the conditional density of the unobserved data given the parameters and the observed data, and attempts have been made to find a good approximation to it. Algorithms that adaptively improve the proposal density have been widely used due to their simplicity and efficiency. In this paper, we describe a fully adaptive algorithm which has been used by some practitioners but has not been well recognized in the statistical literature, and evaluate its estimation performance and robustness via a simulation study. The simulation study shows an improvement of orders of magnitude in the mean squared error compared to non-adaptive or partially adaptive SML methods. It is also shown that the fully adaptive SML is robust in the sense that it is insensitive to the starting points of the optimization routine.
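The importance-sampling step at the core of SML can be illustrated on a toy model. The sketch below is not the paper's fully adaptive algorithm. It assumes a hypothetical model in which the unobserved z follows N(theta, 1) and the observation y given z follows N(z, 1), so the exact marginal likelihood N(y; theta, 2) is known and can be compared with the estimate. All function and parameter names are illustrative.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def sml_likelihood(y, theta, prop_mean, prop_sd, n_draws, rng):
    """Importance-sampling estimate of p(y | theta) = integral of p(y|z) p(z|theta) dz.

    Draws z from the proposal N(prop_mean, prop_sd^2) and averages the
    importance weights p(y|z) p(z|theta) / q(z).
    """
    total = 0.0
    for _ in range(n_draws):
        z = rng.gauss(prop_mean, prop_sd)
        total += (
            normal_pdf(y, z, 1.0) * normal_pdf(z, theta, 1.0)
            / normal_pdf(z, prop_mean, prop_sd)
        )
    return total / n_draws


rng = random.Random(0)
y, theta = 1.0, 0.0
exact = normal_pdf(y, theta, math.sqrt(2.0))

# Proposal equal to the prior p(z | theta): unbiased but noisy.
prior_based = sml_likelihood(y, theta, theta, 1.0, 20000, rng)

# Optimal proposal p(z | y, theta) = N((y + theta) / 2, 1/2): every weight
# equals the marginal likelihood, so the estimate has zero variance.
optimal = sml_likelihood(y, theta, (y + theta) / 2, math.sqrt(0.5), 10, rng)
```

In this toy case the optimal proposal reproduces the exact value with only a handful of draws, which is why adaptive schemes try to move the proposal toward the conditional density of the unobserved data.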

MOISTURE CONTENT MEASUREMENT OF POWDERED FOOD USING RF IMPEDANCE SPECTROSCOPIC METHOD

  • Kim, K. B.;Lee, J. W.;S. H. Noh;Lee, S. S.
    • 한국농업기계학회: Conference Proceedings / 한국농업기계학회, 2000, THE THIRD INTERNATIONAL CONFERENCE ON AGRICULTURAL MACHINERY ENGINEERING. V.II / pp. 188-195 / 2000
  • This study was conducted to measure the moisture content of powdered food using an RF impedance spectroscopic method. In the frequency range of 1.0 to 30 MHz, the impedance (reactance and resistance) of a parallel-plate sample holder filled with wheat flour and red-pepper powder, whose moisture content ranges were 5.93 to 17.07% w.b. and 10.87 to 27.36% w.b., respectively, was characterized using a Q-meter (HP4342). The reactance was a better parameter than the resistance for estimating the moisture density, defined as the product of moisture content and bulk density, which was used in this study to eliminate the effect of bulk density on the RF spectral data. Multivariate data analyses such as principal component regression, partial least squares regression, and multiple linear regression were performed to develop a single calibration model, with moisture density and reactance spectral data as parameters, for determining the moisture content of both wheat flour and red-pepper powder. The best regression model was the one obtained by multiple linear regression. On unknown powdered-food data, its bias, standard error of prediction, and determination coefficient were 0.179% moisture content, 1.679% moisture content, and 0.8849, respectively.


Comparison of Correlations of Saturated Vapor Density for Some Refrigerants

  • 박경근;강병하;장시열
    • 설비공학논문집 / Vol. 19, No. 6 / pp. 457-463 / 2007
  • Various correlations of saturated vapor density in a truncated power series form are tested and compared in this study. A saturated vapor density correlation can be expressed by relating the logarithm of the reduced density to the reduced temperature. Five types of correlation have been investigated using saturated vapor density data for 22 pure-substance refrigerants from the ASHRAE (American Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc.) property tables and the NIST (National Institute of Standards and Technology) Chemistry WebBook. The correlations are fitted to the data points by the least squares method, with all data points equally weighted. The best type of correlation among the five is suggested. The results indicate that the best correlations with 3, 4, and 5 terms yield average AADs (Average Absolute Deviations) of 0.27%, 0.04%, and 0.02%, respectively, while widely used conventional correlations with 3, 4, and 5 terms yield 1.19%, 0.61%, and 0.17%. The suggested type of correlation could reduce the number of terms while improving performance.
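The two quantities the abstract relies on, a truncated power series for the logarithm of the reduced density and the AAD used to score a fit, can be sketched as follows. The abstract does not give the functional forms, so this assumes the series is written in powers of (1 - T_r) and that AAD is the mean absolute relative deviation in percent. The coefficients, exponents, and data values are hypothetical placeholders, not results from the paper.

```python
import math


def reduced_density(t_reduced, coeffs, exponents):
    """Evaluate ln(rho / rho_c) = sum_i a_i * (1 - T_r) ** e_i and return rho / rho_c.

    The series form is an assumption; the paper compares five variants that
    are not spelled out in the abstract.
    """
    tau = 1.0 - t_reduced
    return math.exp(sum(a * tau ** e for a, e in zip(coeffs, exponents)))


def average_absolute_deviation(predicted, reference):
    """AAD in percent: mean of |predicted - reference| / reference, times 100."""
    deviations = [abs(p - r) / r for p, r in zip(predicted, reference)]
    return 100.0 * sum(deviations) / len(deviations)


# Hypothetical 3-term correlation and hypothetical reference data.
coeffs = [-2.0, -1.5, -3.0]
exponents = [0.4, 1.2, 3.0]
t_points = [0.7, 0.8, 0.9]
reference = [0.05, 0.11, 0.23]
predicted = [reduced_density(t, coeffs, exponents) for t in t_points]
aad_percent = average_absolute_deviation(predicted, reference)
```

With this form the correlation returns a reduced density of exactly 1 at the critical temperature, where (1 - T_r) is zero.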

Variation of the Physical Properties of Coal depending upon the Quality

  • 권병두;허식
    • 자원환경지질 / Vol. 21, No. 1 / pp. 97-106 / 1988
  • The purpose of this study is to collect basic data that are prerequisite for the quantitative analysis of coal logging data. The study involves laboratory measurements of physical properties such as seismic velocities (P- and S-waves), resistivity, and density of domestic and imported foreign coals. The relationships between these properties were analyzed using cross-plots. The correlation between the physical properties of coal and the results of chemical analysis (calorific value, fixed carbon, ash, moisture, volatile matter, and sulfur contents) was also studied to obtain ideas about coal quality analysis using logging data. The results are summarized as follows: 1. $V_P$ is exponentially related to $V_S$, and the average value of $V_P$ is about 1.8 times that of $V_S$. 2. Since coal has a very low density compared with the surrounding sedimentary rocks, density logging appears to be the best method for identifying coal seams and evaluating their quality. 3. For domestic coals, the ash content and calorific value show a perfect inverse relationship. Since the density increases with ash content in a well-defined functional form, the ash content of domestic coals can be estimated from density measurements. 4. Because of their low ash content, low density, and high resistivity, foreign coals and domestic lignites are easily distinguished from domestic coals.


Study on the Relationship between the Forest Canopy Closure and Hyperspectral Signatures

  • Lin, Chinsu;Chang, Chein-I
    • 대한원격탐사학회: Conference Proceedings / 대한원격탐사학회, 2003, Proceedings of ACRS 2003 ISRS / pp. 72-74 / 2003
  • Forest canopy density is an ideal representative of forest habitat conditions. It can directly or indirectly depict the canopy structure and gap size in the forestland, and thus could be applied to the assessment of wildlife diversity. Since population surveys of vegetation and wildlife diversity are a key issue for sustainable forest ecosystem management, many research efforts over the last two decades have focused on forest canopy density using multispectral data. Unfortunately, prediction of canopy density using large-scale remote sensing data remains a challenging issue. Due to recent advances in hyperspectral image sensors, hyperspectral imagery is now available for environmental monitoring. In this paper, we conduct experiments to monitor complicated forestland environments that can be captured using hyperspectral imagery and further analyzed to test a prediction model of forest canopy density. The results show that 95% of canopy density could be well described using 2 difference vegetation indices (DVIs): the difference of blue and green reflectances, r_band_100 - r_band_150, and the difference of 2 short wave infrared reflectances, r_band_406 - r_band_410, with the wavelengths of bands no. 100, 150, 406, and 410 being 462.39 nm, 534.40 nm, 918.22 nm, and 924.41 nm, respectively.
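The two indices named in the abstract are plain band differences, which a short sketch makes concrete. The band numbers come from the abstract. The reflectance values and the function name are hypothetical, and the regression that maps the two DVIs to canopy density is not reproduced because its coefficients are not given.

```python
def difference_vegetation_index(reflectance, band_a, band_b):
    """DVI as the reflectance of band_a minus the reflectance of band_b."""
    return reflectance[band_a] - reflectance[band_b]


# Hypothetical reflectances for one pixel, keyed by band number.
pixel = {100: 0.04, 150: 0.09, 406: 0.42, 410: 0.40}

dvi_blue_green = difference_vegetation_index(pixel, 100, 150)  # bands at 462.39 nm and 534.40 nm
dvi_infrared = difference_vegetation_index(pixel, 406, 410)    # bands at 918.22 nm and 924.41 nm
```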


Density Adaptive Grid-based k-Nearest Neighbor Regression Model for Large Dataset

  • 유의기;정욱
    • 품질경영학회지 / Vol. 49, No. 2 / pp. 201-211 / 2021
  • Purpose: This paper proposes a density adaptive grid algorithm for the k-NN regression model to reduce the computation time for large datasets without significant prediction accuracy loss. Methods: The proposed method utilizes the concept of the grid with centroid to reduce the number of reference data points so that the required computation time is much reduced. Since the grid generation process in this paper is based on quantiles of original variables, the proposed method can fully reflect the density information of the original reference data set. Results: Using five real-life datasets, the proposed k-NN regression model is compared with the original k-NN regression model. The results show that the proposed density adaptive grid-based k-NN regression model is superior to the original k-NN regression in terms of data reduction ratio and time efficiency ratio, and provides a similar prediction error if the appropriate number of grids is selected. Conclusion: The proposed density adaptive grid algorithm for the k-NN regression model is a simple and effective model which can help avoid a large loss of prediction accuracy with faster execution speed and fewer memory requirements during the testing phase.
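The idea in the Methods sentence, a quantile-based grid whose cell centroids replace the raw reference points, can be sketched as below. This is a minimal reading of the abstract rather than the authors' implementation: the choice of the mean target per cell, the unweighted k-NN average, and all function names are assumptions.

```python
import math
from collections import defaultdict


def quantile_edges(values, n_bins):
    """Interior bin edges at empirical quantiles, so dense regions get narrower bins."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * i) // n_bins] for i in range(1, n_bins)]


def bin_index(value, edges):
    """Index of the bin containing value (number of edges at or below it)."""
    return sum(value >= edge for edge in edges)


def grid_centroids(X, y, n_bins):
    """Replace the reference set by one (centroid, mean target) pair per occupied cell."""
    dims = len(X[0])
    edges = [quantile_edges([row[d] for row in X], n_bins) for d in range(dims)]
    cells = defaultdict(list)
    for row, target in zip(X, y):
        key = tuple(bin_index(row[d], edges[d]) for d in range(dims))
        cells[key].append((row, target))
    centroids = []
    for members in cells.values():
        m = len(members)
        centre = tuple(sum(r[d] for r, _ in members) / m for d in range(dims))
        centroids.append((centre, sum(t for _, t in members) / m))
    return centroids


def knn_predict(centroids, query, k):
    """Plain k-NN regression using the centroids as the reference set."""
    nearest = sorted(centroids, key=lambda c: math.dist(c[0], query))[:k]
    return sum(t for _, t in nearest) / len(nearest)


# Hypothetical 1-D data: 100 reference points reduced to 10 centroids.
X = [(float(i),) for i in range(100)]
y = [2.0 * i for i in range(100)]
reference = grid_centroids(X, y, n_bins=10)
prediction = knn_predict(reference, (14.0,), k=1)
```

Here the reference set shrinks by a factor of ten, and each prediction searches 10 centroids instead of 100 points, which is the source of the time and memory savings the abstract reports.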

The Linear Density Predictive Models on the On-Ramp Junction in the Urban Freeway

  • 김태곤;신광식;김승길;김정서
    • 대한토목학회논문집 / Vol. 26, No. 1D / pp. 59-66 / 2006
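  • This study concerns the construction of linear density prediction models for on-ramp junctions on urban freeways. Real-time traffic characteristics were analyzed, and linear density prediction models were built and validated. In model construction, the coefficient of determination ($R^2$) was generally 0.7 or higher, indicating considerably high explanatory power for the linear regression models. In model validation, the correlation coefficient (r) was generally 0.8 or higher, indicating considerably high correlation. The linear density prediction models are therefore expected to be quite useful for estimating vehicle density and analyzing delay at on-ramp junctions on urban freeways.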
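The $R^2$ figure quoted in the abstract is the usual goodness-of-fit measure for a least-squares line, which the sketch below computes for a single predictor. The abstract does not name the predictors used for density, so the variable pairing and the numbers here are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations: a traffic measure x and the measured density y.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [12.0, 19.0, 33.0, 38.0, 52.0]
slope, intercept = fit_line(x, y)
fit_quality = r_squared(x, y, slope, intercept)
```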

Retrieval of Radial Velocity and Moment Based on the Power Spectrum Density of Scattered 1290 MHz Signals with Altitude

  • 조원기;권병혁;윤홍주
    • 한국전자통신학회논문지 / Vol. 13, No. 6 / pp. 1191-1198 / 2018
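  • A wind profiler radar provides vertical profiles of atmospheric physical signals and wind vectors at a fixed point. Because the wind vectors are computed by the manufacturer's data processing program, quality control is limited. To improve the quality of the wind vectors, the raw spectral data must therefore be understood and used. The raw data behind the wind vectors are power spectrum densities stored in binary form. In this study, we completed an algorithm that converts the raw data into real-valued spectral densities and implemented the spectrum-based zeroth and first moments to evaluate the usability of the raw data.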
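The zeroth and first moments of a Doppler power spectrum have standard definitions: total power and the power-weighted mean radial velocity. The sketch below computes both for one range gate. It assumes the spectrum has already been decoded to real values on an evenly spaced velocity axis and that noise has been removed. The binary decoding step is omitted because the file format is not described in the abstract, and the sample spectrum is hypothetical.

```python
def spectral_moments(velocities, psd):
    """Zeroth and first moments of a power spectral density.

    velocities: evenly spaced radial velocity bins (m/s)
    psd: power spectral density per bin, same length as velocities
    Returns (total power, power-weighted mean radial velocity).
    """
    dv = velocities[1] - velocities[0]
    m0 = sum(psd) * dv
    m1 = sum(v * s for v, s in zip(velocities, psd)) * dv / m0
    return m0, m1


# Hypothetical spectrum with its peak at +1 m/s.
velocities = [-2.0, -1.0, 0.0, 1.0, 2.0]
psd = [0.0, 0.0, 1.0, 2.0, 1.0]
power, radial_velocity = spectral_moments(velocities, psd)
```

Repeating the calculation for each altitude gate gives the vertical profile of radial velocity from which wind vectors are derived.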