• Title/Abstract/Keywords: small data

Search results: 10,870 items (processing time: 0.043 s)

Bayes tests of independence for contingency tables from small areas

  • Jo, Aejung;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society / Vol. 28, No. 1 / pp.207-215 / 2017
  • In this paper we study pooling effects in Bayesian testing procedures of independence for contingency tables from small areas. In the small area estimation setup, a hierarchical Bayesian model is typically used to borrow strength across small areas. This technique of borrowing strength is used to construct a Bayes test of independence for contingency tables from small areas. Specifically, we consider methods of direct or indirect pooling in multinomial models through Dirichlet priors. We use the Bayes factor (equivalently, the ratio of the marginal likelihoods) to construct the Bayes test, and the marginal density is obtained by integrating the joint density function over all parameters. The Bayes test is computed by Monte Carlo integration based on the method proposed by Nandram and Kim (2002).
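
The Bayes factor mentioned in this abstract can be illustrated on a single table, where it has a closed form. The sketch below is not the authors' hierarchical small-area procedure and does not use the Monte Carlo method of Nandram and Kim (2002); it assumes one r x c multinomial table, symmetric Dirichlet priors with a hypothetical concentration parameter `prior`, and function names chosen only for illustration.

```python
from math import lgamma


def log_multivariate_beta(alphas):
    """Log of B(alpha) = prod_i Gamma(alpha_i) / Gamma(sum_i alpha_i)."""
    return sum(lgamma(a) for a in alphas) - lgamma(sum(alphas))


def log_bayes_factor_independence(table, prior=1.0):
    """Log Bayes factor of independence (H0) against the saturated model (H1).

    Assumptions (illustrative, not from the paper):
    - H0: cell probability p_ij = a_i * b_j, with a ~ Dirichlet(prior, ...)
      over rows and b ~ Dirichlet(prior, ...) over columns.
    - H1: all cell probabilities ~ Dirichlet(prior, ...) over the r*c cells.
    The multinomial coefficient is common to both marginals and cancels.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    cells = [n for row in table for n in row]

    # Marginal likelihood under independence: product of two Dirichlet-multinomials.
    log_m0 = (
        log_multivariate_beta([prior + n for n in row_totals])
        - log_multivariate_beta([prior] * len(row_totals))
        + log_multivariate_beta([prior + n for n in col_totals])
        - log_multivariate_beta([prior] * len(col_totals))
    )
    # Marginal likelihood under the saturated model: one Dirichlet-multinomial.
    log_m1 = (
        log_multivariate_beta([prior + n for n in cells])
        - log_multivariate_beta([prior] * len(cells))
    )
    return log_m0 - log_m1
```

A positive return value favours independence and a negative value favours association; a balanced table such as `[[10, 10], [10, 10]]` gives a positive value, while a diagonal table such as `[[20, 0], [0, 20]]` gives a strongly negative one.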

Prediction of small-scale leak flow rate in LOCA situations using bidirectional GRU

  • Hye Seon Jo;Sang Hyun Lee;Man Gyun Na
    • Nuclear Engineering and Technology / Vol. 56, No. 9 / pp.3594-3601 / 2024
  • It is difficult to detect a small-scale leak in a nuclear power plant (NPP) quickly and take appropriate action, and delays in doing so can have adverse effects on the plant. Because large-scale leak rates are already known to be predicted accurately, we propose leak flow rate prediction using the bidirectional gated recurrent unit (Bi-GRU) method to detect leakage quickly and accurately in small-scale leakage situations. The data were acquired by simulating small loss-of-coolant accidents (LOCAs), or small-scale leakage situations, using the modular accident analysis program (MAAP) code. To improve prediction performance, data were collected by distinguishing the break sizes in more detail. Prediction accuracy was further improved by performing both LOCA diagnosis and leak flow rate prediction in small LOCA situations. The prediction model developed using the Bi-GRU showed superior prediction performance compared with other artificial intelligence methods. Accordingly, the accurate and effective prediction model for small-scale leakage situations proposed herein is expected to support operators in making decisions and taking action.

Applications of response dimension reduction in large p-small n problems

  • Minjee Kim;Jae Keun Yoo
    • Communications for Statistical Applications and Methods / Vol. 31, No. 2 / pp.191-202 / 2024
  • The goal of this paper is to show how response dimension reduction facilitates multivariate regression analysis with high-dimensional responses. Multivariate regression, characterized by multi-dimensional response variables, is increasingly prevalent across diverse fields such as repeated measures, longitudinal studies, and functional data analysis. One of the key challenges in analyzing such data is managing the response dimensions, which can complicate the analysis due to an exponential increase in the number of parameters. Although response dimension reduction methods have been developed, there is no practically useful illustration for various types of data, such as so-called large p-small n data. This paper aims to fill this gap by showcasing how response dimension reduction can enhance the analysis of high-dimensional response data, thereby providing significant assistance to statistical practitioners and contributing to advancements in multiple scientific domains.

A Study on Real-time Data Preprocessing Technique for Small Millimeter Wave Radar

  • 최진규;신영철;홍순일;박창현;김윤진;김홍락;권준범
    • The Journal of the Institute of Internet, Broadcasting and Communication / Vol. 19, No. 6 / pp.79-85 / 2019
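  • Recent small radars call for the development of small millimeter-wave radars with high range resolution in order to neutralize a target's system with a single strike. For a small millimeter-wave radar with high range resolution to acquire and track a target, it must process a large volume of data in real time. This paper summarizes real-time data preprocessing methods for handling the large volume of data that a small millimeter-wave radar requires. The proposed preprocessing stages, namely a digital IF (intermediate frequency) receiver, window processing, and the DFT (discrete Fourier transform), were implemented on an FPGA (field-programmable gate array). Finally, the implemented real-time data preprocessing module was applied to a signal processor for a small millimeter-wave radar and verified through performance tests of the real-time data preprocessing function.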
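
The windowing and DFT stages named in this abstract can be sketched in software as follows. This is only an illustration under assumptions: the paper implements its pipeline on an FPGA and also includes a digital IF receiver, which is omitted here; the abstract does not name the window, so a Hann window is assumed; and a direct O(N^2) DFT is used for clarity rather than a fast transform. All function names are hypothetical.

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def dft(samples):
    """Direct discrete Fourier transform: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    size = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]


def preprocess(samples):
    """Apply the window to the samples, then take the DFT of the result."""
    window = hann_window(len(samples))
    return dft([s * w for s, w in zip(samples, window)])


def peak_bin(spectrum):
    """Index of the largest-magnitude bin in the non-negative-frequency half."""
    half = spectrum[: len(spectrum) // 2 + 1]
    return max(range(len(half)), key=lambda k: abs(half[k]))
```

For example, feeding `preprocess` a 32-sample cosine with four cycles per frame and passing the result to `peak_bin` locates the tone in bin 4; the window widens the main lobe but lowers leakage into distant bins.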

Processing Method of Mass Small File Using Hadoop Platform

  • 김창복;정재필
    • Journal of Advanced Navigation Technology / Vol. 18, No. 4 / pp.401-408 / 2014
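  • Hadoop consists of the MapReduce distributed processing programming model and the HDFS (Hadoop Distributed File System) distributed file system. Hadoop is a framework well suited to big data processing, but it has problems handling large numbers of small files. When Hadoop processes many small files, a mapper is created for each file, and a large amount of memory is needed to store the files' metadata. This paper compares and reviews several methods for processing large numbers of small files on the Hadoop platform. General compression is not suitable as a Hadoop processing format because the data must be processed by a single mapper regardless of its size. Processing with sequence files and Hadoop archive files removed the NameNode memory problem by compressing and merging the small files. Hadoop archive files merged small files faster than sequence files. Processing with the CombineFileInputFormat class requires no merging step and showed speed similar to that of big data processing.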
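
The mapper-count problem described in this abstract can be illustrated with a simplified model of input splits. The sketch below is not Hadoop code and is not the paper's experiment: it only contrasts one split per file with greedily packing files into splits of a bounded size, which is the general idea behind combining inputs. The real CombineFileInputFormat also takes node and rack locality into account, which this model ignores. The function names and the 128 MiB limit are assumptions for illustration.

```python
def splits_per_file(file_sizes):
    """One input split (and therefore one mapper) per file."""
    return [[size] for size in file_sizes]


def combined_splits(file_sizes, max_split_bytes):
    """Greedily pack consecutive files into splits of at most max_split_bytes.

    A file larger than the limit still gets a split of its own.
    """
    splits = []
    current = []
    current_bytes = 0
    for size in file_sizes:
        # Close the current split if adding this file would exceed the limit.
        if current and current_bytes + size > max_split_bytes:
            splits.append(current)
            current = []
            current_bytes = 0
        current.append(size)
        current_bytes += size
    if current:
        splits.append(current)
    return splits


MIB = 1024 * 1024
small_files = [1 * MIB] * 1000  # hypothetical workload: 1000 files of 1 MiB each

mappers_without_combining = len(splits_per_file(small_files))
mappers_with_combining = len(combined_splits(small_files, 128 * MIB))
```

Under these assumed sizes the per-file scheme launches 1000 mappers, while packing into 128 MiB splits needs 8, which is why no separate merge step is required to reduce task overhead.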

Multi-Channel High Speed Data Link Design for Small SAR Satellite Image Data Transmission

  • Kwag, Young K.
    • Proceedings of the IEEK Conference / The Institute of Electronics Engineers of Korea, 2002 ITC-CSCC -3 / pp.1436-1439 / 2002
  • In this paper, based on the data link model characterized by the spaceborne small SAR system, a high-rate multi-channel data link module is designed, comprising link storage, a link processor, a transmitter, and a wide-angle antenna. The design results are presented together with a performance analysis of the data link budget and of the multi-mode data rate associated with the SAR imaging modes of operation, from high resolution to wide swath.

A DESIGN OF SMALL DATA UTILIZATION SYSTEM FOR THE COMS

  • Seo Seok-Bae;Ku In-Hoi;Kang Chi-Ho;Lim Hyun-Su;Ahn Sang-IL
    • Proceedings of the KSRS Conference / The Korean Society of Remote Sensing, 2005 Proceedings of ISRS 2005 / pp.268-271 / 2005
  • COMS (Communication, Ocean, and Meteorological Satellite) will be launched at the end of 2008. To receive COMS LRIT, KARI (Korea Aerospace Research Institute) has finished the design and software implementation of the COMS SDUS (Small Data Utilization System). The SDUS is a small station that receives LRIT data in order to distribute satellite images, weather information, and so on. As a future project, KARI is preparing the COMS MDUS (Mass Data Utilization System), which can receive large volumes of data at rates above 2 Mbps (megabits per second).

An Encoding Method for Presentation of ISO 19848 Data Channel and Management of Ship Equipment Failure-Maintenance Types

  • Hwang, Hun-Gyu;Woo, Yun-Tae;Kim, Bae-Sung;Shin, Il-Sik;Lee, Jang-Se
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 24, No. 1 / pp.134-137 / 2020
  • Recently, emphasis has been placed on supporting vessel maintenance and management systems with data acquired from engine equipment. However, data exchange and management face limitations. To solve this problem, the ISO published ISO 19847 and ISO 19848. In this paper, we analyze the ISO 19848 requirements for identifying data channel IDs for ship equipment and present examples of applying encoding techniques. In addition, we show by example how the proposed technique can be applied to managing the failure and maintenance types of a ship's engine equipment. If this method is applied, a vessel's equipment can exchange data by sharing the code table and can express where a failure occurred and what response is needed or has been taken.

Access efficiency of small sized files in Big Data using various Techniques on Hadoop Distributed File System platform

  • Alange, Neeta;Mathur, Anjali
    • International Journal of Computer Science & Network Security / Vol. 21, No. 7 / pp.359-364 / 2021
  • In recent years, Hadoop usage has been increasing steadily. Further development of the technology and its outcomes is eagerly awaited across the globe to enable speedy access to data, and dependence on computers keeps growing. Big data is growing exponentially as the entire world works online, producing large amounts of data that are very difficult to handle and process within a short time. Industries now widely use the Hadoop framework to store and process the huge amounts of data placed on servers and to produce results within a specified time. Processing huge amounts of data made up of small files, and optimizing their storage, is a major problem. Various techniques, including HDFS, Sequence files, HAR, and NHAR, have already been proposed. In this paper, we discuss existing techniques developed for accessing and storing small files efficiently. Of these, we have specifically tried to implement the HDFS-HAR and NHAR techniques.

Flow Assessment and Prediction in the Asa River Watershed using different Artificial Intelligence Techniques on Small Dataset

  • Kareem Kola Yusuff;Adigun Adebayo Ismail;Park Kidoo;Jung Younghun
    • Proceedings of the Korea Water Resources Association Conference / Korea Water Resources Association, 2023 Conference / p.95 / 2023
  • Common hydrological problems of developing countries include poor data management, insufficient measuring devices, and ungauged watersheds, leading to small or unreliable data availability. This has greatly hindered the adoption of artificial intelligence techniques for flood risk mitigation and damage control in several developing countries. Climate datasets have been applied with resounding success, but they exhibit more uncertainties than ground-based measurements. To encourage AI adoption in developing countries with small ground-based datasets, we propose data augmentation for regression tasks and compare the performance of different AI models with and without data augmentation. More focus is placed on simple models that offer lower computational cost and higher accuracy than deeper models, which train longer and consume computing resources that may be insufficient in developing countries. To implement this approach, we modelled and predicted streamflow data of the Asa River Watershed located in Ilorin, Kwara State, Nigeria. Results revealed that adequate hyperparameter tuning and proper model selection improve streamflow prediction on small water datasets. This approach can be implemented in data-scarce regions to ensure that timely flood intervention and early warning systems are adopted in developing countries.
