• Title/Summary/Keyword: Data Collection Model

An Energy-Efficient Periodic Data Collection using Dynamic Cluster Management Method in Wireless Sensor Network (무선 센서 네트워크에서 동적 클러스터 유지 관리 방법을 이용한 에너지 효율적인 주기적 데이터 수집)

  • Yun, SangHun; Cho, Haengrae
    • IEMEK Journal of Embedded Systems and Applications / v.5 no.4 / pp.206-216 / 2010
  • Wireless sensor networks (WSNs) are used to collect various data in environment monitoring applications. Spatial clustering may reduce the energy consumption of data collection by partitioning the WSN into a set of spatial clusters with similar sensing data. For each cluster, only a few sensor nodes (samplers) report their sensing data to a base station (BS). The BS may predict the missing data of non-samplers using the spatial correlations between sensor nodes. ASAP is a representative data collection algorithm that uses spatial clustering. It periodically reconstructs the entire network into new clusters to accommodate changes in spatial correlations, which results in high message overhead. In this paper, we propose a new data collection algorithm named EPDC (Energy-efficient Periodic Data Collection). Unlike ASAP, EPDC identifies a specific cluster consisting of many dissimilar sensor nodes. It then reconstructs only that cluster into subclusters, each of which includes strongly correlated sensor nodes. EPDC also tries to reduce the message overhead by incorporating a judicious probabilistic model transfer method. We evaluate the performance of EPDC and ASAP using a simulation model. The experimental results show that EPDC improves performance by up to 84% compared to ASAP.

Analysis Model Evaluation based on IoT Data and Machine Learning Algorithm for Prediction of Acer Mono Sap Liquid Water

  • Lee, Han Sung; Jung, Se Hoon
    • Journal of Korea Multimedia Society / v.23 no.10 / pp.1286-1295 / 2020
  • It has become increasingly difficult to predict the amount of Acer mono sap that can be collected because of droughts and cold waves caused by recent climate change, and few studies have addressed the prediction of its collection volume. This study therefore proposes a Big Data prediction system based on meteorological information for the collection of Acer mono sap. The proposed system analyzes collected data and provides managers with a statistical chart of predicted values for the climate factors that affect the amount of sap collected, thus enabling efficient work. It was designed on Hadoop for data collection, processing, and analysis. The study also analyzed and proposed an optimal prediction model for the climate conditions that influence the collection volume by applying a multiple regression analysis model based on Hadoop and Mahout.

Numerical Simulation of Impactor Collection Efficiency according to Altitude (대기 고도에 따른 입자 포집용 관성 임팩터의 설계 및 포집효율 예측)

  • Kim, Gyuho; Yook, Se-Jin; Ahn, Kang-Ho
    • Particle and Aerosol Research / v.8 no.1 / pp.1-8 / 2012
  • In this study, the collection efficiency of inertial impactors was numerically simulated by employing the statistical Lagrangian particle tracking (SLPT) model. The SLPT model was shown to predict the impactor collection efficiency correctly when the numerically obtained collection efficiencies were compared with the experimental data of Marple et al. (1987) at normal pressure and the experimental data of Marjamäki et al. (2000) at low pressure. Based on the validation results, balloon-borne impactors with cut-off sizes of 1 μm, 2.5 μm, and 10 μm were designed. Then, the sampling flow rates of the inertial impactors required to keep the cut-off sizes constant at different pressures and temperatures were estimated according to altitude.

A Study on Quantity and Quality of Collected Rainwater by Collected Materials (우수 이용을 위한 포집재료별 포집수량과 수질에 관한 연구)

  • Lee, Young-Bok; Lee, Seung-Keun; Wang, Chang-Keun
    • Journal of Korean Society of Water and Wastewater / v.18 no.1 / pp.66-72 / 2004
  • In this study, the quantity and quality of rainwater collected by sand, gravel, soil, lawn, and concrete surfaces as collection materials were investigated, and a Rainwater Collection Prediction Model was developed to predict the amount of collected rainwater. The quantity of rainwater collected by the concrete surface, gravel, sand, soil, and lawn collection systems over an 8-month period was 1,067L (93.2%), 1,006L (87.8%), 902L (78.8%), 800L (69.9%), and 788.5L (68.8%), respectively. The average turbidity of the rainwater collected by the concrete surface, gravel, sand, soil, and lawn collection systems over the same period was 3.2NTU, 2.2NTU, 1.9NTU, 1.7NTU, and 1.5NTU, respectively. For the sand collection material, the amount predicted by the Model and the amount actually collected were 931.5L and 902L, which are very close. For the gravel collection material, the amount predicted by the Model and the amount actually collected were 1,028.2L and 1,006L, which are also very close. To simulate the optimal rainwater storage volume, rainfall and evaporation data for Dae-jeon city were used. For a sand collection system with an area of 30 m², the maximum storage volume was 17 m³, and a supply of 240L/day was secured for 62% of the year.
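
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' Rainwater Collection Prediction Model: the function names are hypothetical, and it assumes the percentages in parentheses are collection efficiencies relative to total rainfall on the collection area.

```python
def relative_error(predicted_l: float, measured_l: float) -> float:
    """Signed relative error of a predicted volume against the measured volume."""
    return (predicted_l - measured_l) / measured_l


def implied_rainfall(collected_l: float, efficiency_pct: float) -> float:
    """Total rainfall volume implied by a collected volume and its efficiency.

    Assumption: the percentage reported next to each volume is
    collected volume / total rainfall volume.
    """
    return collected_l / (efficiency_pct / 100.0)


# Sand system: 931.5 L predicted versus 902 L measured (values from the abstract).
sand_error = relative_error(931.5, 902.0)
print(f"sand prediction error: {sand_error:.1%}")

# Concrete surface: 1,067 L collected at 93.2% efficiency.
concrete_total = implied_rainfall(1067.0, 93.2)
print(f"implied total rainfall: {concrete_total:.0f} L")
```

Under that assumption, every material in the abstract implies roughly the same total rainfall volume, which suggests the reported volumes and percentages are internally consistent.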

A Specification-Based Methodology for Data Collection in Artificial Intelligence System (명세 기반 인공지능 학습 데이터 수집 방법)

  • Kim, Donggi; Choi, Byunggi; Lee, Jaeho
    • KIPS Transactions on Software and Data Engineering / v.11 no.11 / pp.479-488 / 2022
  • In recent years, with the rapid development of machine learning technology, research utilizing machine learning has been actively conducted in fields such as cognition, reasoning and judgment, and action among the various technologies that constitute intelligent systems. To utilize machine learning, it is indispensable to collect data for learning. However, the types of data generated vary with the environment in which the data is generated, and the types and forms of data required differ depending on the learning model to be used. As a result, an existing data collection method cannot be reused in a new environment, and a specialized data collection module must be developed each time. In this paper, we propose a specification-based methodology for data collection in artificial intelligence systems to solve these problems, ensure the reusability of the data collection method across data collection environments, and automate the implementation of the data collection function.

Non-Linear Error Identifier Algorithm for Configuring Mobile Sensor Robot

  • Rajaram, P.; Prakasam, P.
    • Journal of Electrical Engineering and Technology / v.10 no.3 / pp.1201-1211 / 2015
  • A WSN acts as an effective tool for tracking large-scale environments. In such environments, the battery life of the sensor network is limited by data collection, sensing, computation, and communication. To resolve this, a mobile robot is presented that identifies the data present in the partitioned sensor networks and passes it on to the sink. In the novel data collection algorithm, the performance of the data collecting operation is reduced because the mobile robot can be used only within a limited range. To enhance data collection in a changing environment, the Non Linear Error Identifier (NLEI) algorithm has been developed and is presented in this paper to configure the robot by means of non-linear error models. An experimental evaluation was conducted to estimate the performance of the proposed NLEI algorithm, and it was observed that the algorithm increases the error correction rate by up to 42% and efficiency by up to 60%.

Modeling Age-specific Cancer Incidences Using Logistic Growth Equations: Implications for Data Collection

  • Shen, Xing-Rong; Feng, Rui; Chai, Jing; Cheng, Jing; Wang, De-Bin
    • Asian Pacific Journal of Cancer Prevention / v.15 no.22 / pp.9731-9737 / 2014
  • Large scale secular registry or surveillance systems have been accumulating vast data that allow mathematical modeling of cancer incidence and mortality rates. Most contemporary models in this regard use time series and APC (age-period-cohort) methods and focus primarily on predicting or analyzing cancer epidemiology, with little attention paid to implications for designing cancer registry, surveillance or evaluation initiatives. This research models age-specific cancer incidence rates using logistic growth equations and explores their performance under different scenarios of data completeness in the hope of deriving clues for reshaping relevant data collection. The study used the China Cancer Registry Report 2012 as the data source. It employed 3-parameter logistic growth equations and modeled the age-specific incidence rates of all cancers and of the top 10 cancers presented in the registry report. The study performed 3 types of modeling, namely full age-span fitting, multiple 5-year-segment fitting and single-segment fitting. Model performance was measured with an adjusted goodness of fit that combines the sum of squared residuals and relative errors. Both model simulation and performance evaluation used self-developed algorithms programmed in the C# language with MS Visual Studio 2008. For models built upon full age-span data, predicted age-specific cancer incidence rates fitted very well with observed values for most cancers (except cervical and breast), with estimated goodness of fit (Rs) over 0.96. For any given cancer, the R value of the logistic growth model derived from observed data on urban residents was greater than or at least equal to that of the same model built on data from rural people. For models based on multiple 5-year-segment data, the Rs remained fairly high (over 0.89) until three-fourths of the data segments were excluded. For models using a fixed-length single segment of observed data, the older the age covered by the corresponding data segment, the higher the resulting Rs. Logistic growth models describe age-specific incidence rates very well for most cancers and may be used to inform data collection for purposes of monitoring and analyzing the cancer epidemic. With the help of appropriate logistic growth equations, the work volume of contemporary data collection, e.g., cancer registry and surveillance systems, may be reduced substantially.
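
For readers unfamiliar with the model family, one common 3-parameter form of the logistic growth equation is sketched below. The abstract does not state which parametrization the authors used, so the symbols here are assumptions: $I(a)$ is the incidence rate at age $a$, $K$ the upper asymptote, $r$ the growth rate, and $a_0$ the age at the inflection point.

```latex
I(a) = \frac{K}{1 + e^{-r\,(a - a_0)}}
```

In this form the curve rises slowly at young ages, fastest around $a = a_0$ where $I(a_0) = K/2$, and levels off toward $K$ at older ages.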

Performance Improvement Methods of a Spoken Chatting System Using SVM (SVM을 이용한 음성채팅시스템의 성능 향상 방법)

  • Ahn, HyeokJu; Lee, SungHee; Song, YeongKil; Kim, HarkSoo
    • KIPS Transactions on Software and Data Engineering / v.4 no.6 / pp.261-268 / 2015
  • In spoken chatting systems, users' spoken queries are converted to text queries using automatic speech recognition (ASR) engines. If the top-1 results of the ASR engines are incorrect, these errors are propagated to the spoken chatting systems. To improve the top-1 accuracy of ASR engines, we propose a post-processing model that rearranges the top-n outputs of ASR engines using a ranking support vector machine (RankSVM). In addition, a large number of chatting sentences are needed to train chatting systems. If new chatting sentences are not frequently added to the training data, the responses of the chatting systems will soon become outdated. To resolve this problem, we propose a data collection model that automatically selects chatting sentences from TV and movie scenarios using a support vector machine (SVM). In the experiments, the post-processing model showed a precision 4.4% higher and a recall rate 6.4% higher than the baseline model (without post-processing). The data collection model showed a high precision of 98.95% and a recall rate of 57.14%.
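
To put the data collection model's reported precision and recall in perspective, the two figures can be combined into an F1 score, their harmonic mean. The abstract does not report F1, so the sketch below is a derived illustration; the function name is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 98.95% and recall 57.14%, as reported in the abstract.
f1 = f1_score(0.9895, 0.5714)
print(f"F1 = {f1:.3f}")
```

The result is roughly 0.72. The gap between the two inputs shows the selector is tuned for precision: almost every sentence it accepts is usable, at the cost of missing a large share of candidates.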

An Analysis of the Determinants of the Collection Rate of Agricultural Plastic Waste (영농폐비닐 수거율 결정요인 분석)

  • Yi, Wooell; An, Donghwan
    • Journal of Korean Society of Rural Planning / v.25 no.3 / pp.11-18 / 2019
  • It is widely known that the incineration of agricultural plastic waste by farmers may cause large forest fires or fine dust in rural areas. Hence, how to increase the rate of collection and recycling of agricultural plastic waste is of concern to policy makers, especially with regard to the rural environment. The purpose of this study is to find the determinants of the collection rate of agricultural plastic waste. This study used data from 'Research on Agricultural Waste' by the Korea Environment Corporation from 2012 to 2015 for 163 regions. It found that the compensation rate for collection, the frequency of collecting services, and the quality of waste are important for increasing the collection rate. Regions with more elderly and low-income people are more likely to have a higher collection rate. Finally, the chief producing regions that specialize in a certain crop show a higher collection rate.

Implementation of AIoT Edge Cluster System via Distributed Deep Learning Pipeline

  • Jeon, Sung-Ho; Lee, Cheol-Gyu; Lee, Jae-Deok; Kim, Bo-Seok; Kim, Joo-Man
    • International Journal of Advanced Smart Convergence / v.10 no.4 / pp.278-288 / 2021
  • Recent IoT systems are cloud-based, so the large, continuous volumes of data collected from sensor nodes are processed on a data server through the cloud. However, in the centralized configuration of large-scale cloud computing, computational processing must be performed at a physical location where data collection and processing take place, and the need for edge computers to reduce the network load of the cloud system is gradually expanding. In this paper, a cluster system consisting of 6 inexpensive Raspberry Pi boards was constructed to perform fast data processing. We propose a "Kubernetes cluster system (KCS)" for processing large-scale data collection and analysis using model distribution and a data pipeline method. To evaluate performance, an ensemble deep learning model was built, and the accuracy, processing performance, and processing time obtained with the proposed KCS and with model distribution were compared and analyzed. As a result, the ensemble model was excellent in accuracy, but the KCS implemented as a data pipeline proved to be superior in processing speed.