• Title/Abstract/Keywords: Research dataset

Search results: 1,324 items (processing time: 0.028 seconds)

Turbo Positioning Using Link Reliability in Wireless Networks

  • Yun, Kyungsu;Park, Ji Kyu;Ahn, Jae Young;Kwon, Jae Kyun
    • ETRI Journal / Vol. 40, No. 1 / pp.101-110 / 2018
  • In wireless positioning systems using range measurements, non-line-of-sight (NLOS) links cause estimation errors. Several studies have attempted to improve the positioning performance by mitigating these NLOS errors. These studies, however, have focused on the performance of a dataset consisting of three or more links. Therefore, measurement errors induced by links are averaged, and a reliable link is not fully utilized in the dataset. This paper proposes a Link Reliability based on Range Measurement (LRRM) scheme, which specifies the relative reliability of each link using residuals. The link reliability becomes the input to a Link Residual Weighting (LRW) scheme, which is also proposed as a weighted positioning scheme. Moreover, LRRM and LRW together constitute a new turbo positioning scheme, in which the estimation errors are reduced considerably by iterative updates.

Clustering for Analysis of Raman Hyperspectral Dental Data

  • Jung, Sung-Hwan
    • Journal of Korea Multimedia Society / Vol. 16, No. 1 / pp.19-28 / 2013
  • In this research, we present an effective ICA-based clustering method for the analysis of large Raman hyperspectral dental data. The hyperspectral dataset, captured by an HR800 micro Raman spectrometer at UMKC-CRISP (University of Missouri-Kansas City Center for Research on Interfacial Structure and Properties), has 569 local points. Each point comprises 1,005 hyperspectral dentin data values. We compared the clustering effectiveness and clustering time when using the entire dataset directly and when using the scores obtained after PCA and ICA. In the experiments, using the scores after PCA and ICA yielded not only more detailed internal dentin information from a medical analysis perspective, but also clustering times about 7 to 19 times shorter. The ICA-based approach also performed better than the PCA-based one in terms of both the detailed internal dentin information and the clustering time. We could therefore confirm the effectiveness of ICA for the analysis of Raman hyperspectral dental data.

An Efficient Indexing Structure for Multidimensional Categorical Range Aggregation Query

  • Yang, Jian;Zhao, Chongchong;Li, Chao;Xing, Chunxiao
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 2 / pp.597-618 / 2019
  • Categorical range aggregation, which is conceptually equivalent to running a range aggregation query separately on multiple datasets, returns the query result on each dataset. The challenge is that when the number of datasets reaches hundreds or thousands, the query takes a lot of computation time and I/O. Previous work addressed range restrictions in only a single dimension, whereas in practice a growing number of applications need to compute statistics under multiple range restrictions. We propose the MCRI-Tree, an index structure designed to solve multi-dimensional categorical range aggregation (CRA) queries, which can utilize main memory to maximize the efficiency of CRA queries. Specifically, the MCRI-Tree answers any query in $O(nk^{n-1})$ I/Os (where $n$ is the number of dimensions, and $k$ denotes the maximum number of pages covered in one dimension among all the $n$ dimensions during a query). The practical efficiency of our technique is demonstrated with extensive experiments.
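As a rough illustration of how the stated bound grows with dimensionality, the expression $nk^{n-1}$ can be evaluated for a few values. The choice $k = 10$ and the values of $n$ are arbitrary assumptions made here for illustration, and the constants hidden by the big-O notation are ignored:

```latex
\begin{align*}
n = 1,\; k = 10 &: \quad n k^{n-1} = 1 \cdot 10^{0} = 1 \\
n = 2,\; k = 10 &: \quad n k^{n-1} = 2 \cdot 10^{1} = 20 \\
n = 3,\; k = 10 &: \quad n k^{n-1} = 3 \cdot 10^{2} = 300
\end{align*}
```

The bound is therefore polynomial in $k$ for a fixed number of dimensions, but grows quickly as $n$ increases.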

Supervised learning-based DDoS attacks detection: Tuning hyperparameters

  • Kim, Meejoung
    • ETRI Journal / Vol. 41, No. 5 / pp.560-573 / 2019
  • Two supervised learning algorithms, a basic neural network and a long short-term memory recurrent neural network, are applied to traffic including DDoS attacks. The joint effects of preprocessing methods and hyperparameters for machine learning on performance are investigated. Values representing attack characteristics are extracted from datasets and preprocessed by two methods. Binary classification and two optimizers are used. Some hyperparameters are obtained exhaustively for fast and accurate detection, while others are fixed with constants to account for performance and data characteristics. An experiment is performed via TensorFlow on three traffic datasets. Three scenarios are considered to investigate the effects of learning former traffic on sequential traffic analysis and the effects of learning one dataset on application to another dataset, and determine whether the algorithms can be used for recent attack traffic. Experimental results show that the used preprocessing methods, neural network architectures and hyperparameters, and the optimizers are appropriate for DDoS attack detection. The obtained results provide a criterion for the detection accuracy of attacks.

Labeling Big Spatial Data: A Case Study of New York Taxi Limousine Dataset

  • AlBatati, Fawaz;Alarabi, Louai
    • International Journal of Computer Science & Network Security / Vol. 21, No. 6 / pp.207-212 / 2021
  • Clustering unlabeled spatial datasets to convert them into labeled spatial datasets is a challenging task, especially for geographical information systems. In this study, we investigated the NYC Taxi Limousine Commission dataset and found that all of its spatio-temporal trajectories are unlabeled, which makes the dataset unsuitable for data mining tasks such as classification and regression. Therefore, it is necessary to convert the unlabeled spatial datasets into labeled ones. We use a clustering technique to do this for all the trajectory datasets. A key difficulty in applying machine learning classification algorithms in many applications is that they require a large amount of labeled data, and labeling big data is in many cases a costly process. In this paper, we show the effectiveness of utilizing a clustering technique for labeling spatial data, which leads to a high-accuracy classifier.

Motion classification using distributional features of 3D skeleton data

  • Woohyun Kim;Daeun Kim;Kyoung Shin Park;Sungim Lee
    • Communications for Statistical Applications and Methods / Vol. 30, No. 6 / pp.551-560 / 2023
  • Recently, there has been significant research into the recognition of human activities using three-dimensional sequential skeleton data captured by the Kinect depth sensor. Many of these studies employ deep learning models. This study introduces a novel feature selection method for this data and analyzes it using machine learning models. Due to the high-dimensional nature of the original Kinect data, effective feature extraction methods are required to address the classification challenge. In this research, we propose using the first four moments as predictors to represent the distribution of joint sequences and evaluate their effectiveness using two datasets: the exergame dataset, consisting of three activities, and the MSR daily activity dataset, composed of ten activities. The results show that, on average across different classifiers, our approach achieves higher accuracy than existing methods.
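The idea of summarizing each joint sequence by its first four moments can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "first four moments" means the mean, the population variance, and the standardized skewness and kurtosis, which the abstract does not spell out. The function names `four_moments` and `moment_features` are hypothetical.

```python
import math
from typing import List, Sequence


def four_moments(seq: Sequence[float]) -> List[float]:
    """Return [mean, variance, skewness, kurtosis] of one coordinate sequence.

    Assumption: population (1/n) variance and standardized third and
    fourth moments; the paper may define its moments differently.
    """
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    mean = sum(seq) / n
    centered = [x - mean for x in seq]
    var = sum(d * d for d in centered) / n
    if var == 0.0:
        # A constant sequence has no spread; report zero higher moments.
        return [mean, 0.0, 0.0, 0.0]
    std = math.sqrt(var)
    skew = sum((d / std) ** 3 for d in centered) / n
    kurt = sum((d / std) ** 4 for d in centered) / n
    return [mean, var, skew, kurt]


def moment_features(joint_sequences: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate the four moments of every joint-coordinate sequence.

    Each inner sequence is the time series of one coordinate of one joint,
    so a recording with J such series yields a fixed-length vector of 4*J
    features regardless of how many frames it contains.
    """
    features: List[float] = []
    for seq in joint_sequences:
        features.extend(four_moments(seq))
    return features
```

The resulting fixed-length vectors could then be passed to any standard classifier, which is what allows recordings of different lengths to be compared.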

Prediction of Depression from Machine Learning Data (머신러닝 데이터의 우울증에 대한 예측)

  • Jeong Hee KIM;Kyung-A KIM
    • Journal of Korea Artificial Intelligence Association / Vol. 1, No. 1 / pp.17-21 / 2023
  • The primary objective of this research is to utilize machine learning models to analyze factors tailored to each dataset for predicting mental health conditions. The study aims to develop appropriate models based on specific datasets, with the goal of accurately predicting mental health states through the analysis of distinct factors present in each dataset. This approach seeks to design more effective strategies for the prevention and intervention of depression, enhancing the quality of mental health services by providing personalized services tailored to individual circumstances. Overall, the research endeavors to advance the development of personalized mental health prediction models through data-driven factor analysis, contributing to the improvement of mental health services on an individualized basis.

Prediction of Transition Temperature and Magnetocaloric Effects in Bulk Metallic Glasses with Ensemble Models (앙상블 기계학습 모델을 이용한 비정질 소재의 자기냉각 효과 및 전이온도 예측)

  • Chunghee Nam
    • Korean Journal of Materials Research / Vol. 34, No. 7 / pp.363-369 / 2024
  • In this study, the magnetocaloric effect and transition temperature of bulk metallic glass, an amorphous material, were predicted through machine learning based on compositional features. From the Python module 'Matminer', 174 compositional features were obtained, and prediction performance was compared while reducing the number of compositional features to prevent overfitting. After optimization using RandomForest, an ensemble model, changes in prediction performance were analyzed according to the number of compositional features. The $R^2$ score was used as the performance metric in the regression prediction, and the best prediction performance was obtained using only 90 features when predicting transition temperature and 20 features when predicting magnetocaloric effects. The most important feature when predicting magnetocaloric effects was the 'Fe' compositional ratio. The feature importance method provided by 'scikit-learn' was applied to rank the compositional features. The feature importance method was found to be appropriate by comparing the prediction performance on the Fe-containing dataset with that on the full dataset.

A Study on the Sharing of Research Data in Library and Information Science Field (문헌정보학 분야 연구데이터 공유에 관한 연구)

  • Cho, Jane
    • Journal of the Korean Society for Information Management / Vol. 34, No. 4 / pp.59-79 / 2017
  • This study analyzed the type, subject, and level of openness of research data in the field of library and information science shared on Figshare, and statistically analyzed the characteristics of data with relatively high reusability. The results showed that datasets and papers were the most common data types, that "open access" and "research data" were the most common keywords, and that 70% of the data were published in a form that cannot be processed mechanically, such as PDF. The analysis of the relationship between the characteristics of research data and the degree of sharing showed that, in terms of subject, open access topics such as APC (Article Processing Charge) were the most common. In terms of data type, however, gray literature such as papers was found to be more highly utilized than datasets.

Estimating Litter Carbon Stock and Change on Forest in Gangwon Province from the National Forestry Inventory Data (국가산림자원조사 자료를 활용한 강원도 산림내 낙엽층의 탄소저장량 및 변화량 추정)

  • Lee, Sun Jeoung;Kim, Raehyun;Son, Yeong Mo;Yim, Jong Su
    • Journal of Climate Change Research / Vol. 8, No. 4 / pp.385-391 / 2017
  • This study was conducted to estimate litter carbon stock change from the National Forest Inventory (NFI) data for the national greenhouse gas inventory report. Litter carbon stocks were calculated from the NFI datasets of NFI5 (2008) and NFI6 (2013) in Gangwon province. The total carbon stock change of litter was $0.68 \pm 0.71$ t C/ha from NFI5 (2008) to NFI6 (2013); however, there was no significant difference between the 2008 and 2013 datasets. The litter carbon stock of coniferous stands was higher than that of deciduous stands in both NFI5 (2008) and NFI6 (2013) (P<0.05). As this was limited to a pilot study, we will assess litter carbon stock using more complete data from NFI systems. The results can be used as a data source for the forest sector of the national greenhouse gas inventory report.