• Title/Abstract/Keyword: Data Collecting


보행행태조사방법론의 변화와 모바일 빅데이터의 가능성 진단 연구 - 보행환경 분석연구 최근 사례를 중심으로 - (Changes in Measuring Methods of Walking Behavior and the Potentials of Mobile Big Data in Recent Walkability Researches)

  • 김현주;박소현;이선재
    • 대한건축학회논문집:계획계
    • /
    • Vol. 35, No. 1
    • /
    • pp.19-28
    • /
    • 2019
  • The purpose of this study is to evaluate the walking behavior analysis methodologies used in previous studies, in view of the growing demand for empirical data collection in urban and neighborhood planning. The preceding research is divided into five approaches: (1) recording, (2) surveys, (3) statistical data, (4) Global Positioning System (GPS) devices, and (5) mobile big data analysis. We then analyze this body of work and identify two changes in walkability research: (1) empirical data on people's actual walking and moving patterns are increasingly required, and (2) micro-walking behaviors such as actual routes, walking facilities, detours, and walking areas have begun to be measured. In line with these trends, the use of GPS devices and mobile big data has newly emerged. Finally, we analyze pedestrian data based on mobile big data in terms of 'application', distinguishing it from existing survey methodologies, and present four possibilities of mobile big data: (1) relief of the human, temporal, and spatial constraints of data collection, (2) improvement of the accuracy of collected data, (3) reduction of subjective intervention in data collection and preprocessing, and (4) expandability of walking environment research.

K-nn을 이용한 Hot Deck 기반의 결측치 대체 (Imputation of Missing Data Based on Hot Deck Method Using K-nn)

  • 권순창
    • 한국IT서비스학회지
    • /
    • Vol. 13, No. 4
    • /
    • pp.359-375
    • /
    • 2014
  • Researchers cannot avoid missing data when collecting data, because some respondents, whether arbitrarily or not, do not answer questions in studies and experiments. Missing data not only increase and distort standard deviations but also impair the estimation of parameters and the reliability of research results. Despite the widespread use of hot deck imputation, researchers have paid it little attention because it handles missing data in ambiguous ways. Hot deck can be complemented with k-nn, a machine learning method that can organize donor groups closest in properties to the record with missing data. This study therefore imputes missing data with a hot deck method based on k-nn. Listwise deletion and mean, mode, linear regression, and SVM imputation were compared and verified for nominal and ratio data types, and the k-nn hot deck reasonably obtained values closest to the original ones. Simulations with different numbers of neighbors and distance measures were carried out, and k-nn achieved the better performance. This study thus rediscovers hot deck imputation, which has failed to attract researchers' attention, and should help in selecting non-parametric methods that are less affected by the structure of missing data and its causes.
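As a rough illustration of the idea, a minimal k-nn hot-deck imputation can be sketched as follows. The function name, the toy records, and the mean-of-donors rule for ratio data are illustrative assumptions, not the paper's implementation; nominal data would take the donors' mode instead.

```python
import math

def knn_hot_deck(records, target_idx, missing_idx, k=3):
    """Impute the missing field of records[target_idx] by hot deck:
    rank the complete records (donors) by distance on the fields the
    target did observe, then donate from the k nearest donors."""
    target = records[target_idx]
    observed = [i for i, v in enumerate(target) if v is not None]
    donors = [r for j, r in enumerate(records)
              if j != target_idx and all(v is not None for v in r)]
    # Euclidean distance on the observed fields only
    donors.sort(key=lambda r: math.dist([target[i] for i in observed],
                                        [r[i] for i in observed]))
    values = [r[missing_idx] for r in donors[:k]]
    return sum(values) / len(values)  # mean of the donor group (ratio data)

records = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 11.0],
    [5.0, 5.0, 50.0],
    [1.05, 2.05, None],   # record with a missing third field
]
imputed = knn_hot_deck(records, target_idx=3, missing_idx=2, k=2)
```

With `k=2` the two nearest complete records donate 10.0 and 11.0, so the imputed value is their mean, 10.5, rather than the grand mean over all records that plain mean imputation would produce.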

영상정합을 이용한 교통밀도 수집방법과 수집 데이터 비교분석 (A study on the Traffic Density Collect System using View Synthesis and Data Analysis)

  • 박범진;노창균
    • 한국ITS학회 논문지
    • /
    • Vol. 17, No. 5
    • /
    • pp.77-87
    • /
    • 2018
  • Traffic density is known to be the most important of the macroscopic indicators because it is the most directly related to traffic demand (Traffic Engineering, 2004), and it is defined as the number of vehicles present on a given length of road at a specific instant. However, density is difficult to collect directly in the field owing to weather, road conditions, and cost. For this reason it has been studied less actively than speed and volume, and research is lacking both on collection methods and on the accuracy of measured values. This paper therefore measures density by synthesizing the images of multiple cameras. The density collected by this system was adopted as the ground truth, being based on the definition, and compared with densities calculated by traditional methods. The density calculated with the fundamental equation was closest to the ground truth, with an RMSE (Root Mean Square Error) of 1.8-2.5. We also examined easily overlooked issues, such as the collection interval to watch for when collecting density directly, by calculating instantaneous and average densities. Although the actual traffic condition at the test site was LOS B, the per-second instantaneous density varied widely, from a maximum of 16 veh/km down to a minimum of 2 veh/km, making the traffic condition hard to judge. The 15-minute average density computed at 30-second intervals, however, was 7.9-8.3 veh/km and identified LOS B correctly.
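The quantities the abstract compares can be sketched as follows. The function names and the sample numbers are illustrative assumptions; the relations themselves are the standard ones: the fundamental equation q = k·v (so k = q/v), and density by definition as vehicles per unit length, averaged over an interval.

```python
def density_from_flow_speed(flow_veh_per_h, space_mean_speed_km_per_h):
    """Fundamental equation of traffic flow: q = k * v, hence k = q / v.
    Flow in veh/h and space-mean speed in km/h give density in veh/km."""
    return flow_veh_per_h / space_mean_speed_km_per_h

def instantaneous_density(vehicle_count, segment_length_km):
    """Density by definition: vehicles present on the segment at an instant."""
    return vehicle_count / segment_length_km

def mean_density(per_interval_counts, segment_length_km):
    """Average of per-interval instantaneous densities, smoothing the
    second-to-second variation the abstract describes."""
    return sum(instantaneous_density(c, segment_length_km)
               for c in per_interval_counts) / len(per_interval_counts)

k_from_eq = density_from_flow_speed(1600, 80)   # 1600 veh/h at 80 km/h
k_avg = mean_density([4, 2, 3], 0.5)            # counts on a 0.5 km segment
```

The averaging step is why the 30-second/15-minute aggregation in the abstract judges LOS correctly while the raw per-second values do not: individual counts of 4, 2, and 3 vehicles on the same segment imply instantaneous densities of 8, 4, and 6 veh/km, but their mean is a stable 6 veh/km.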

한전계통의 송전망 고장확률 산정을 위한 상정고장 DB 관리시스템(ezCas) 개발 (Development of Outage Data Management System to Calculate the Probability for KEPCO Transmission Systems)

  • 차승태;전동훈;김태균;전명렬;추진부;김진오;이승혁
    • 대한전기학회:학술대회논문집
    • /
    • 대한전기학회 2004년도 하계학술대회 논문집 A
    • /
    • pp.88-90
    • /
    • 2004
  • Data are a critical utility asset, and collecting correct data on site leads to accurate information. Data gathered with foresight and properly formatted are both useful to the existing database and easily transferable to a newer, more comprehensive historical outage database. Deciding which data items to collect, however, can be an arduous task, often requiring the efforts of entire committees. This paper first discusses KEPCO's historical outage data for the past 10 years, which include meteorological data from the National Weather Service as well as failure rates, outage durations, and probability classifications. These collected data are automatically stored in an Outage Data Management System (ODMS), which allows easy access and display. ODMS has a straightforward, easy-to-use interface that lets users navigate through modules and insert, delete, or edit data. In particular, it will provide KEPCO not only with help in probabilistic security assessment but also with a platform for the future development of a Probability Estimation Program (PEP).


Implementation of Search Engine to Minimize Traffic Using Blockchain-Based Web Usage History Management System

  • Yu, Sunghyun;Yeom, Cheolmin;Won, Yoojae
    • Journal of Information Processing Systems
    • /
    • Vol. 17, No. 5
    • /
    • pp.989-1003
    • /
    • 2021
  • With the recent increase in the types of services provided by Internet companies, collection of various types of data has become a necessity. Data collectors corresponding to web services profit by collecting users' data indiscriminately and providing it to the associated services. However, the data provider remains unaware of the manner in which the data are collected and used. Furthermore, the data collector of a web service consumes web resources by generating a large amount of web traffic. This traffic can damage servers by causing service outages. In this study, we propose a website search engine that employs a system that controls user information using blockchains and builds its database based on the recorded information. The system is divided into three parts: a collection section that uses proxy, a management section that uses blockchains, and a search engine that uses a built-in database. This structure allows data sovereigns to manage their data more transparently. Search engines that use blockchains do not use internet bots, and instead use the data generated by user behavior. This avoids generation of traffic from internet bots and can, thereby, contribute to creating a better web ecosystem.

IoT에서 중요한 데이터를 위한 쿼럼 기반 적응적 전파 알고리즘의 설계 및 평가 (Design and Evaluation of a Quorum-Based Adaptive Dissemination Algorithm for Critical Data in IoTs)

  • 배인한;노흥태
    • 한국멀티미디어학회논문지
    • /
    • Vol. 22, No. 8
    • /
    • pp.913-922
    • /
    • 2019
  • The Internet of Things (IoT) envisions smart objects collecting and sharing data at a massive scale via the Internet. One challenging issue is how to disseminate data efficiently to the relevant data-consuming objects. In such a massive IoT network, mission-critical data dissemination imposes constraints on the message transfer delay between objects. Because of the low power and short communication range of IoT objects, data are relayed over multiple hops before arriving at the destination. In this paper, we propose a quorum-based adaptive dissemination algorithm (QADA) for critical data in the monitoring-based applications of massive IoTs. To design QADA, we first design a new stepped-triangular grid structure (sT-grid) that supports data dissemination and construct a triangular grid overlay in the fog layer above the lower IoT layer. We then propose a dissemination algorithm for the publish/subscribe model that adaptively uses triangular grid (T-grid) and sT-grid quorums in the constructed overlay, depending on how mission-critical the data are, and evaluate its performance with an analytical model.
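The property that makes grid quorums suit publish/subscribe dissemination is that any two quorums intersect, so a publisher's quorum always meets a subscriber's quorum. This can be illustrated with a plain square grid; the paper's T-grid and sT-grid constructions are not reproduced here, and the function name is an assumption for illustration.

```python
def grid_quorum(node, rows, cols):
    """Quorum of a node in a rows x cols grid (nodes numbered row-major):
    its entire row plus its entire column. Any two such quorums share at
    least one node, because one quorum's row crosses the other's column."""
    r, c = divmod(node, cols)
    row = {r * cols + j for j in range(cols)}
    col = {i * cols + c for i in range(rows)}
    return row | col

# In a 3x3 grid, even the two most distant corners have intersecting quorums.
q_pub = grid_quorum(0, 3, 3)   # publisher writes to its quorum
q_sub = grid_quorum(8, 3, 3)   # subscriber reads from its quorum
overlap = q_pub & q_sub
```

Writing data to one quorum and querying another therefore always finds the data at some overlap node, with quorum size O(sqrt(N)) instead of flooding all N nodes.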

국제해사기구 데이터수집시스템 도입에 따른 MRV 지원시스템의 기술적 분석 (Technical Analysis of an MRV System in Relation to the Implementation of a Data Collection System by the International Maritime Organization)

  • 강남선;이정엽;홍연정;변상수;김진형
    • 해양환경안전학회지
    • /
    • Vol. 23, No. 1
    • /
    • pp.122-129
    • /
    • 2017
  • This paper performs a technical verification of an MRV support system and an international shipping energy efficiency portal system in relation to the International Maritime Organization's introduction of the Data Collection System. We review the main contents of the SEEMP guidelines, including the data collection system and the fuel consumption data collection methodology, analyze the differences from the EU MRV, present response strategies for domestic shipping companies, and examine the technical suitability of the MRV support system. By managing emissions data in an integrated manner from the raw stage to the final stage, the MRV support system can reduce verification costs and improve work efficiency, and its data conversion function allows emissions data to be collected and reported in a standard format while shipping companies keep their current reporting procedures. In addition, access rights to the portal system are differentiated so that it can support data collection and reporting by shipping companies as well as data verification by verifiers, and reports can be submitted electronically, making it possible to respond to the international MRV regulations.

조선산업의 비용분석 데이터 웨어하우스 시스템 개발 (Development of Data Warehouse Systems to Support Cost Analysis in the Ship Production)

  • 황성룡;김재균;장길상
    • 산업공학
    • /
    • Vol. 15, No. 2
    • /
    • pp.159-171
    • /
    • 2002
  • Data warehouses integrate data from multiple heterogeneous information sources and transform them into a multidimensional representation for decision support applications. Data warehousing has emerged as one of the most powerful tools for delivering information to users. Most previous research has focused on marketing, customer service, financing, and the insurance industry; relatively little has been done on data warehouse systems in complex manufacturing industries such as ship production, which is characterized by complex product structures and production processes. In ship production, a data warehouse system is a requisite for effective cost analysis, because collecting and analyzing the diverse and large volumes of cost-related data (material and production costs, productivity) in its operational systems was becoming increasingly cumbersome and time consuming. This paper proposes an architecture for a data warehouse system to support cost analysis in ship production. To illustrate the usefulness of the proposed architecture, a prototype system is designed and implemented for an enterprise producing large-scale ships.

PAPG: Private Aggregation Scheme based on Privacy-preserving Gene in Wireless Sensor Networks

  • Zeng, Weini;Chen, Peng;Chen, Hairong;He, Shiming
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 10, No. 9
    • /
    • pp.4442-4466
    • /
    • 2016
  • This paper proposes a privacy-preserving aggregation scheme for sensor networks based on the designed P-Gene (PAPG). The P-Gene is constructed using a specially designed erasable data-hiding technique. With the P-Gene, each sensory data item can be hidden by the collecting sensor node, thereby protecting the privacy of that data item. The hidden data can then be reported directly to the cluster head that aggregates the data, and the aggregation result can be recovered from the hidden data at the cluster head. The designed P-Genes protect the privacy of each data item without additional data exchange or encryption. Given the flexible generation of the P-Genes, the proposed PAPG scheme adapts to dynamically changing reporting nodes. Apart from its favorable resistance to data loss, extensive analyses and simulations demonstrate that the PAPG scheme efficiently preserves privacy while incurring lower communication and computational overhead.

남성 음성 triphone DB 구축에 관한 연구 (Dialogic Male Voice Triphone DB Construction)

  • 김유진;백상훈;한민수;정재호
    • 한국음향학회지
    • /
    • Vol. 15, No. 2
    • /
    • pp.61-71
    • /
    • 1996
  • This paper reports on the construction of a triphone-unit database of dialogic (conversational) speech for speech synthesis. The work collected conversational speech from broadcast media and, through three rounds of transcription, aimed at triphone-unit segmentation and labeling at the phonetic-symbol level. Of the 10 hours of broadcast material collected in total, 6 hours were used to build the database and the remaining 4 hours were kept in reserve. We describe the whole process, from speech data collection to triphone-unit labeling, which differs in many respects from building a read-speech database, and discuss the planning and requirements needed to build a more systematic and consistent conversational speech database.
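The triphone unit itself, a phone labeled with its left and right context, can be illustrated with a minimal sketch. The HTK-style `left-center+right` notation and the `sil` (silence) padding at utterance edges are assumptions for illustration, not the paper's labeling scheme.

```python
def to_triphones(phones):
    """Turn a phone sequence into context-dependent triphone labels,
    padding the utterance edges with silence ('sil')."""
    padded = ['sil'] + list(phones) + ['sil']
    return [f'{padded[i-1]}-{padded[i]}+{padded[i+1]}'
            for i in range(1, len(padded) - 1)]

labels = to_triphones(['k', 'a', 'm'])
```

A three-phone utterance thus yields three triphone labels, one per center phone; segmentation then assigns each label a start and end time in the recording.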
