• Title/Abstract/Keywords: Synthetic data generation

115 search results, processing time 0.027 s

Wind Data Simulation Using Digital Generation of Non-Gaussian Turbulence Multiple Time Series with Specified Sample Cross Correlations

  • 성승학;김욱;김경천;부정숙
    • 한국대기환경학회지 / Vol. 19, No. 5 / pp. 569-581 / 2003
  • A method of synthetic time series generation was developed and applied to the simulation of homogeneous turbulence in a periodic 3-D box and to the simulation of hourly wind data. The method can reproduce the sample auto and cross correlations of multiple time series almost exactly and can control the non-Gaussian distribution. Using the turbulence simulation, the influence of correlations, non-Gaussian distribution, and one-direction anisotropy on homogeneous structure was studied by investigating the spatial distribution of turbulence kinetic energy and enstrophy. Hourly wind data from Typhoon Robin were used to illustrate the capability of the method to simulate the sample cross correlations of multiple time series. The simulated typhoon data show fluctuations of similar shape and almost exactly the same sample auto and cross correlations as the Robin data.

Generation and Verification of Synthetic Wind Data With Seasonal Fluctuation Using Hidden Markov Model

  • 박석영;유기완
    • 한국항공우주학회지 / Vol. 49, No. 12 / pp. 963-969 / 2021
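  • In selecting sites for wind farms, wind data measured at a meteorological tower in the area are used to evaluate the wind speed distribution and power production. However, wind data measured at meteorological towers often have missing information, do not match the desired height, or are not long enough, which makes it difficult to carry out wind turbine control and performance simulations. Long-term, continuous wind data at the desired height are therefore very important for evaluating the annual energy production and capacity factor of a wind turbine or wind farm. In addition, where wind direction and wind speed vary distinctly by season, as on the Korean Peninsula, wind data that include wind speed and direction reflecting seasonal characteristics must be considered. This study presents a method for generating synthetic wind that accounts for variations in wind speed and direction using a hidden Markov model, a statistical method. The wind data used for statistical processing were measured by the Korea Meteorological Administration automatic weather station (AWS) on Maldo, in the Gogunsan Islands of Jeollabuk-do. The synthetic wind generated by the hidden Markov model is verified by comparing its statistical variables, wind energy density, seasonal mean wind speed, and prevailing wind direction with the measured data.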

Automatic Generation of Training Character Samples for OCR Systems

  • Le, Ha;Kim, Soo-Hyung;Na, In-Seop;Do, Yen;Park, Sang-Cheol;Jeong, Sun-Hwa
    • International Journal of Contents / Vol. 8, No. 3 / pp. 83-93 / 2012
  • In this paper, we propose a novel method that automatically generates real character images to familiarize existing OCR systems with new fonts. First, we generate synthetic character images using a simple degradation model. The synthetic data is used to train an OCR engine, and the trained OCR is used to recognize and label real character images that are segmented from ideal document images. Since the OCR engine cannot recognize all real character images accurately, a substring matching method is employed to fix wrongly labeled characters by comparing two strings: one is the string formed by the recognized characters in an ideal document image, and the other is the ordered string of characters that we intend to train and recognize. Based on our method, we build a system that automatically generates the 2350 most common Korean characters and 117 alphanumeric characters from new fonts. The ideal document images used in the system are postal envelope images with characters printed in ascending order of their codes. The proposed system achieved a labeling accuracy of 99%. Therefore, we believe that our system is effective in facilitating the generation of numerous character samples to enhance the recognition rate of existing OCR systems for fonts on which they have never been trained.
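
The label-correction step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that a generic sequence alignment from Python's standard `difflib` module is an acceptable stand-in for their substring matching, and the function name `correct_labels` is hypothetical. The recognized string is aligned against the known, ordered string of printed characters, and a mismatched run is relabeled only when both sides have the same length.

```python
from difflib import SequenceMatcher


def correct_labels(recognized, expected):
    """Relabel OCR output by aligning it with the known printed order.

    recognized: string of labels produced by the OCR engine, one per
                segmented character image.
    expected:   string of characters in the order they were printed.
    Returns a list with one label per recognized character; None marks
    an image whose label could not be resolved by the alignment.
    """
    labels = list(recognized)
    # autojunk=False disables difflib's popularity heuristic, which is
    # only meant for long sequences with many repeated elements.
    matcher = SequenceMatcher(None, recognized, expected, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of images and expected characters:
            # assume a one-to-one misrecognition and take the expected labels.
            labels[i1:i2] = expected[j1:j2]
        elif tag in ("replace", "delete"):
            # Extra or ambiguous images: leave them unlabeled.
            labels[i1:i2] = [None] * (i2 - i1)
        # "equal" runs keep the OCR label; "insert" runs have no image.
    return labels
```

For instance, `correct_labels("ABCDXFGH", "ABCDEFGH")` relabels the fifth image from `X` to `E`, because the runs before and after it match the printed order.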

Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms

  • Jeong, Haeyoung;Lee, Dae-Hee;Ryu, Choong-Min;Park, Seung-Hwan
    • Journal of Microbiology and Biotechnology / Vol. 26, No. 1 / pp. 207-212 / 2016
  • PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction.

Computational evaluation of wind loads on a standard tall building using LES

  • Dagnew, Agerneh K.;Bitsuamlak, Girma T.
    • Wind and Structures / Vol. 18, No. 5 / pp. 567-598 / 2014
  • In this paper, wind-induced aerodynamic loads on a standard tall building have been evaluated through the large-eddy simulation (LES) technique. The flow parameters of an open terrain were recorded downstream in an empty boundary layer wind tunnel (BLWT) and used to prescribe the transient inlet boundary of the LES simulations. Three different numerically generated inflow boundary conditions have been investigated to assess their suitability for LES. A high frequency pressure integration (HFPI) approach has been employed to obtain the wind load. A total of 280 pressure monitoring points have been systematically distributed on the surfaces of the LES model building. Similar BLWT experiments were also done to validate the numerical results. In addition, the effects of adjacent buildings were studied. Among the three wind field generation methods (synthetic, Smirnov's, and Lund's recycling method), LES with perturbation from the synthetic random flow approach showed better agreement with the BLWT data. In general, LES predicted peak wind loads comparable with the BLWT data, with a maximum difference of 15% and an average difference of 5% for the isolated building case; however, higher estimation errors were observed for cases where adjacent buildings were placed in the vicinity of the study building.

Synthetic Trajectory Generation Tool for Indoor Moving Objects

  • 류형규;김수진;이기준
    • 대한공간정보학회지 / Vol. 24, No. 4 / pp. 59-66 / 2016
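  • Research on moving objects requires moving-object data. For example, a performance study of moving-object query processing methods needs benchmark moving-object data before experiments can be run. For this reason, tools have been built that generate virtual moving objects traveling along roads or through outdoor spaces. Indoor space, however, has distinctive characteristics that differ from outdoor space, and a generator of indoor moving-object data must reflect them. Several moving-object generators for indoor space have been developed so far, but their trajectories are not realistic. Against this background, this paper introduces a tool that generates virtual moving objects in indoor space. The tool has the following features. First, the moving objects are modeled as pedestrians. Second, the various elements of a moving object can be expressed through a parameter model: the user can set as parameters everything from simple items, such as the number of pedestrians and the average pedestrian speed, to complex ones, such as the minimum distance between pedestrians and movement patterns. Third, we tried to reflect realistic characteristics of pedestrians. Finally, for data interoperability, we applied the moving-object data generation to the indoor space of an actual large shopping mall represented in IndoorGML, an international geospatial information standard.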

Application of Rainwater Harvesting System Reliability Model Based on Non-parametric Stochastic Daily Rainfall Generator to Haeundae District of Busan

  • 최치현;박무종;백천우;김상단
    • 한국물환경학회지 / Vol. 27, No. 5 / pp. 634-645 / 2011
  • A newly developed rainwater harvesting (RWH) system reliability model is evaluated for the roof area of buildings in the Haeundae District of Busan. The RWH system is used to supply water for toilet flushing, back garden irrigation, and air cooling. The model is portable because it is based on a non-parametric precipitation generation algorithm using a Markov chain. Precipitation occurrence is simulated using transition probabilities derived for each day of the year from the historical probability of wet and dry day state changes. Precipitation amounts are selected from a matrix of historical values within a moving 30-day window centered on the target day. The reliability of the RWH system is then determined for ranges of catchment area and tank volume using the synthetic precipitation data. The synthetic rainfall data reproduced the characteristics of precipitation in Busan well, and the reliabilities of the RWH system computed for each demand were high. Furthermore, for the study area using the RWH system, the reduction efficiencies for rooftop runoff inputs to the sewer system and for potable water demand are estimated at 23% and 53%, respectively.
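
The generator outlined in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the wet-day threshold, the input layout (a list of years, each a list of daily totals in mm), the fallback used when a state was never observed, and the 15-days-on-each-side window are all hypothetical choices made for the example.

```python
import random

WET_THRESHOLD = 0.0  # assumption: any positive daily total counts as wet


def is_wet(amount):
    return amount > WET_THRESHOLD


def fit_transitions(history):
    """Estimate P(wet on day d | state of day d-1) for each day of year.

    history: list of years, each a list of daily totals of equal length.
    Returns a list of dicts keyed by the previous day's state (True = wet).
    """
    n_days = len(history[0])
    p_wet = []
    for d in range(n_days):
        counts = {False: [0, 0], True: [0, 0]}  # prev state -> [n, n_wet]
        for year in history:
            # Simplification: day 0 looks back at the last day of the
            # same year rather than the previous year.
            prev = is_wet(year[d - 1])
            counts[prev][0] += 1
            counts[prev][1] += int(is_wet(year[d]))
        n_all = counts[False][0] + counts[True][0]
        n_wet = counts[False][1] + counts[True][1]
        fallback = n_wet / n_all  # used when a state was never observed
        p_wet.append({s: (c[1] / c[0] if c[0] else fallback)
                      for s, c in counts.items()})
    return p_wet


def window_amounts(history, d, half_window=15):
    """Historical wet-day amounts within +/- half_window days of day d."""
    n_days = len(history[0])
    return [year[(d + off) % n_days]
            for year in history
            for off in range(-half_window, half_window + 1)
            if is_wet(year[(d + off) % n_days])]


def generate_year(history, rng, half_window=15):
    """Generate one synthetic year of daily precipitation."""
    p_wet = fit_transitions(history)
    series, prev_wet = [], False  # assumption: the year starts after a dry day
    for d in range(len(history[0])):
        wet = rng.random() < p_wet[d][prev_wet]
        pool = window_amounts(history, d, half_window) if wet else []
        # Resample an observed amount instead of fitting a distribution.
        series.append(rng.choice(pool) if pool else 0.0)
        prev_wet = is_wet(series[-1])
    return series
```

Because amounts are resampled from observed values rather than drawn from a fitted distribution, every generated wet-day total is one that occurred in the record, which is what makes this kind of generator non-parametric.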

A new clustering algorithm based on the connected region generation

  • Feng, Liuwei;Chang, Dongxia;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 6 / pp. 2619-2643 / 2018
  • In this paper, a new clustering algorithm based on the connected region generation (CRG-clustering) is proposed. It is an effective and robust approach to clustering on the basis of the connectivity of the points and their neighbors. In the new algorithm, a connected region generating (CRG) algorithm is developed to obtain the connected regions and an isolated point set. Each connected region corresponds to a homogeneous cluster and this ensures the separability of an arbitrary data set theoretically. Then, a region expansion strategy and a consensus criterion are used to deal with the points in the isolated point set. Experimental results on the synthetic datasets and the real world datasets show that the proposed algorithm has high performance and is insensitive to noise.

SAR Processing Software for Ground Station

  • Kwak, Sung-Hee;Lee, Young-Ran;Shin, Dong-Seok;Park, Won-Kyu
    • 대한원격탐사학회: Conference Proceedings / 대한원격탐사학회, 2003, Proceedings of ACRS 2003 ISRS / pp. 634-636 / 2003
  • Satrec Initiative (Si) is developing a ground processing system for Synthetic Aperture Radar (SAR) data. SAR provides its own illumination and does not depend on sunlight, thus permitting continuous day/night operation and all-weather imaging. The system is capable of producing standard level products from the SAR signal. Hence, the system should be able to perform matched filtering, range compression, azimuth compression, multi-look image generation, and geocoded image generation. This paper describes the processing steps, including the algorithms, design, and accuracy of Si's SAR processing system, by comparing it with commercial software.

Med-StyleGAN2: A GAN-Based Synthetic Data Generation for Medical Image Generation

  • 최재하;김성연;변해린;이세연;이정수
    • 한국정보처리학회: Conference Proceedings / 한국정보처리학회, 2023 Fall Conference / pp. 904-905 / 2023
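  • This paper proposes Med-StyleGAN2 for medical image generation. Generative adversarial networks are effective for image generation but have limitations when applied to medical images. This study therefore proposes a StyleGAN-based training model specialized for medical image generation. The model can be used in a variety of medical imaging applications, and by evaluating the generated medical images quantitatively and qualitatively, we examine the potential for progress in the field of medical image generation.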