• Title/Abstract/Keywords: Data Normalization

483 search results (processing time: 0.026 s)

Financial Market Prediction and Improving the Performance Based on Large-scale Exogenous Variables and Deep Neural Networks

  • 천성길;이주홍;최범기;송재원
    • 스마트미디어저널 / Vol. 9, No. 4 / pp. 26-35 / 2020
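  • Attempts to predict future stock prices have been studied continuously for a long time. Unlike ordinary time series, however, financial time series exhibit non-stationarity, long-term dependency, and non-linearity, among other factors that hinder prediction. In addition, when the data cover a wide range of variables, manual variable selection reaches its limits, so the model itself should be able to extract useful variables automatically. This paper proposes a sliding time step normalization method that can normalize non-stationary data, a method that uses an LSTM-based AutoEncoder to predict future stock prices from variables compressed from all available variables, and moving transfer learning, which performs transfer learning over successive periods. Experiments show that feeding as many variables as possible through a neural network outperforms using only 100 major financial variables, and that sliding time step normalization, by normalizing the non-stationarity of the data in every interval, is effective in improving performance. Moving transfer learning evaluates the model on each step's test interval and then performs transfer learning, which is shown to improve performance over a long test interval.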
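The abstract does not spell out how sliding time step normalization is computed. A minimal sketch of one plausible reading follows: each value is z-scored using only the statistics of the window that ends at that value. The function name, the window length, and the sample prices are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import fmean, pstdev


def sliding_window_normalize(series, window):
    """Z-score each value using only the `window` most recent values (inclusive)."""
    if window < 2:
        raise ValueError("window must be at least 2")
    normalized = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        mean = fmean(chunk)
        std = pstdev(chunk)
        # A flat window has zero spread; map it to 0.0 instead of dividing by zero.
        normalized.append(0.0 if std == 0 else (chunk[-1] - mean) / std)
    return normalized


# Hypothetical closing prices with an upward drift (non-stationary mean).
prices = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]
scaled = sliding_window_normalize(prices, window=3)
```

Because every window is rescaled by its own mean and standard deviation, a slow drift in the price level does not leak into the normalized values. The first `window - 1` points have no full window and are dropped in this sketch.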

Automatic Pancreas Detection on Abdominal CT Images using Intensity Normalization and Faster R-CNN

  • 최시은;이성은;홍헬렌
    • 한국멀티미디어학회논문지 / Vol. 24, No. 3 / pp. 396-405 / 2021
  • In surgery to remove pancreatic cancer, it is important to determine the shape of the patient's pancreas. However, previous studies have had limited success in detecting the pancreas automatically in abdominal CT images, because the pancreas varies in shape, size, and location from patient to patient. In this paper, we therefore propose a method that uses Faster R-CNN based on Inception V2 to learn the various shapes of the pancreas across patients and adjacent slices and to detect the pancreas automatically in abdominal CT images. Model training and testing were performed using the NIH Pancreas-CT Dataset, and intensity normalization was applied to all data to improve pancreas detection accuracy. Additionally, according to the shape of the pancreas, the test dataset was divided into top, middle, and bottom slices to evaluate the model's performance on each subset. The mAP@.50IoU was 91.7% on the top slices and 95.4% on the bottom slices, and it was highest on the middle slices at 98.5%. These results confirm that the model can accurately detect the pancreas in CT images.
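The abstract does not say which intensity normalization was used. A common choice for CT data, shown here only as an assumed illustration, is to clip Hounsfield units to a soft-tissue window and rescale linearly to [0, 1]. The window bounds below are hypothetical defaults, not values taken from the paper.

```python
def normalize_intensity(hu_values, lower=-100.0, upper=240.0):
    """Clip Hounsfield units to [lower, upper] and rescale linearly to [0, 1]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    span = upper - lower
    return [(min(max(v, lower), upper) - lower) / span for v in hu_values]


# Hypothetical pixel row: air, window floor, soft tissue, window ceiling, dense bone.
row = [-1000.0, -100.0, 70.0, 240.0, 1200.0]
scaled_row = normalize_intensity(row)
```

Clipping first keeps air and bone from stretching the scale, so the contrast between soft tissues occupies most of the output range.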

Human Normalization Approach based on Disease Comparative Prediction Model between Covid-19 and Influenza

  • Janghwan Kim;Min-Yong Jung;Da-Yun Lee;Na-Hyeon Cho;Jo-A Jin;R. Young-Chul Kim
    • International Journal of Internet, Broadcasting and Communication / Vol. 15, No. 3 / pp. 32-42 / 2023
  • The unprecedented COVID-19 pandemic has caused serious problems worldwide. Previous approaches produced vaccines and preemptive testing tools through medical engineering. However, disparities between countries and regions leave some medical systems poorly resourced and medical institutions difficult to access. In advanced nations, the damage was even greater because high medical and examination costs kept people from going to the hospital. Therefore, from a software engineering perspective, we propose a learning model for determining coronavirus infection through software prediction models and tools based on symptom data. After a comparative analysis of various models (decision tree, Naive Bayes, KNN, multi-layer perceptron neural network), we chose the decision tree as the appropriate model. Because of a lack of data, additional survey data and overseas symptom data were applied and built into the judgment model. To guard against this, we also adopt a human normalization approach combined with a traditional Korean medicine approach. We expect that applying decision trees to the collected and analyzed data will make it possible to distinguish coronavirus infection, flu, allergy, and the common cold without medical examination or diagnostic tools.

Research on prediction and analysis of supercritical water heat transfer coefficient based on support vector machine

  • Ma Dongliang;Li Yi;Zhou Tao;Huang Yanping
    • Nuclear Engineering and Technology / Vol. 55, No. 11 / pp. 4102-4111 / 2023
  • To improve thermal-hydraulic calculation and analysis of supercritical water reactors, a support vector machine (SVM) model was trained on experimental supercritical water data and used to predict the heat transfer coefficient of supercritical water. The changes in prediction accuracy are analyzed with respect to the regularization penalty parameter C, the slack variable epsilon, and the Gaussian kernel function parameter gamma. The values predicted by the SVM model after parameter optimization are verified against the experimental test data. The results show that normalization of the data has a great influence on the prediction results. The slack variable has a relatively small influence on the accuracy of the predicted heat transfer coefficient, whereas gamma has the greatest impact. Compared with the results of traditional empirical formula methods, the trained SVM model has a smaller average error and standard deviation. The trained SVM model can therefore effectively predict and analyze the heat transfer coefficient of supercritical water.
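For reference, the three parameters named in the abstract appear in the standard ε-insensitive support vector regression problem with a Gaussian kernel, written out below. This is the textbook formulation, given on the assumption that the paper follows it; the abstract itself does not state the exact formulation used.

```latex
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right)
\quad \text{subject to} \quad
\begin{cases}
y_i - w^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\
w^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
\xi_i,\ \xi_i^{*} \ge 0,
\end{cases}
\qquad
K(x, x') = \exp\!\left( -\gamma \lVert x - x' \rVert^{2} \right)
```

The kernel depends on the Euclidean distance between inputs, so a feature with a large numeric range dominates that distance unless the inputs are rescaled. This is consistent with the reported sensitivity of the predictions to data normalization.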

The Algorithm Design and Implement of Microarray Data Classification using the Byesian Method

  • 박수영;정채영
    • 한국정보통신학회논문지 / Vol. 10, No. 12 / pp. 2283-2288 / 2006
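  • Recent advances in bioinformatics have made micro-scale experimental manipulation possible, so that the expression pattern of an entire genome can be observed on a single chip and the interactions among tens of thousands of genes can be studied at the same time. DNA microarray technology has thus opened a new direction for understanding complex organisms. Methods for effectively analyzing the large volume of genetic information obtained through this technology are therefore urgently needed. In this paper, using sample data from the Harvard University bioinformatics core group as experimental data, we designed and implemented a system that first applies normalization, the process of reducing or removing the noise that arises from various causes in microarray experiments, then uses the Bayesian algorithm ASA (Adaptive Simulated Annealing) method, a feature extraction method, to divide the data into two classes, and finally evaluates the accuracy. After Lowess normalization, the accuracy was 98.23%.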

An Efficiency Assessment for Reflectance Normalization of RapidEye Employing BRD Components of Wide-Swath satellite

  • Kim, Sang-Il;Han, Kyung-Soo;Yeom, Jong-Min
    • 대한원격탐사학회지 / Vol. 27, No. 3 / pp. 303-314 / 2011
  • Surface albedo is an important parameter of the surface energy budget, and its accurate quantification is of major interest to the global climate modeling community. In this paper, we therefore consider the direct solution of kernel-based bidirectional reflectance distribution function (BRDF) models for retrieving the normalized reflectance of a high-resolution satellite. BRD effects can be observed in data from wide-swath satellites such as SPOT/VGT (VEGETATION), which have sufficient angular sampling, but high-resolution satellites cannot obtain sufficient angular sampling over a pixel within a short period because of their narrow swath, which is a problem when applying a semi-empirical model. This makes it difficult to run a BRDF model to infer the normalized reflectance of high-resolution satellites. The principal purpose of this study is to estimate the normalized reflectance of a high-resolution satellite (RapidEye) using BRDF components from SPOT/VGT. We use a semi-empirical BRDF model to estimate the BRDF components from SPOT/VGT and to normalize the reflectance of RapidEye. This study used SPOT/VGT S1 (daily) data together with data from the multispectral sensor RapidEye. Isotropic values such as the normalized reflectance were closely related to the BRDF parameters and the kernels. We also show a scatter plot of the relationship between the SPOT/VGT and RapidEye isotropic values. Linear regression analysis between the two is performed using the SPOT/VGT parameters (isotropic, geometric, and volumetric scattering values) and the RapidEye kernel values (geometric and volumetric scattering kernels). Because BRDF parameters are difficult to calculate directly from high-resolution satellites, we use the BRDF parameters of SPOT/VGT. We also determine the weights for the geometric value, the volumetric scattering value, and the error through regression models. As a result, the weighting obtained through linear regression analysis produced good agreement. For all sites, the SPOT/VGT and RapidEye isotropic values were highly correlated (RMSE, bias) and were generally very consistent.
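The isotropic, geometric, and volumetric terms mentioned above correspond to the usual linear form of a kernel-driven semi-empirical BRDF model, shown below as a reminder. The symbols are conventional notation assumed here, and the abstract does not state which specific kernel functions were used.

```latex
R(\theta_s, \theta_v, \phi) \;=\; f_{\mathrm{iso}}
\;+\; f_{\mathrm{geo}}\, K_{\mathrm{geo}}(\theta_s, \theta_v, \phi)
\;+\; f_{\mathrm{vol}}\, K_{\mathrm{vol}}(\theta_s, \theta_v, \phi)
```

Here $\theta_s$ and $\theta_v$ are the solar and view zenith angles, $\phi$ is the relative azimuth, and $f_{\mathrm{iso}}$, $f_{\mathrm{geo}}$, $f_{\mathrm{vol}}$ are the parameters fitted to the angular samples. Since the model is linear in the three parameters, they can be estimated by ordinary least squares once the kernel values are known for each observation.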

Estimation of Road Sections Vulnerable to Black Ice Using Road Surface Temperatures Obtained by a Mobile Road Weather Observation Vehicle

  • 박문수;강민수;김상헌;정현채;장성빈;유동길;류성현
    • 대기 / Vol. 31, No. 5 / pp. 525-537 / 2021
  • Black ice on road surfaces in winter tends to cause severe accidents. Black ice events are very difficult to detect in advance because they are localized and sensitive to surface and upper meteorological variables. This study develops a methodology for detecting road sections vulnerable to black ice using road surface temperature data obtained from a mobile road weather observation vehicle. Seven experiments were conducted on the route from Nam-Wonju IC to Nam-Andong IC (132.5 km) on the Jungang Expressway between December 2020 and February 2021. First, the temporal road surface temperature data were converted to spatial data with a 50 m resolution. Then, the spatial road surface temperature was normalized to zero mean and unit standard deviation in three ways: simple normalization, linear detrending followed by normalization, and low-pass filtering followed by normalization. The resulting road thermal map was calculated in terms of road surface temperature differences. A road ice index was proposed using the normalized road temperatures and their horizontal differences. Road sections vulnerable to black ice were derived from the road ice indices and verified against road geometry, sky view, and other factors. Black ice was found to be possible not only on bridges but also on roads with a low sky view factor. These results are expected to be applicable to black ice alarm services for drivers.
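Two of the three normalization variants described above can be sketched with the standard library alone: scaling to zero mean and unit standard deviation, and removing a least-squares linear trend before scaling. The function names and the sample temperatures are hypothetical, and the low-pass filter variant is omitted because the abstract does not describe the filter.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        raise ValueError("a constant series cannot be scaled to unit standard deviation")
    return [(v - mean) / std for v in values]


def linear_detrend(values):
    """Subtract the least-squares straight line fitted against the sample index."""
    xs = range(len(values))
    x_mean = fmean(xs)
    y_mean = fmean(values)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sxx
    return [y - (y_mean + slope * (x - x_mean)) for x, y in zip(xs, values)]


# Hypothetical road surface temperatures (deg C) at consecutive 50 m cells.
temps_c = [-1.0, -0.5, -2.5, 0.5, 1.0, -0.5, 2.0]
simple = zscore(temps_c)
detrended = zscore(linear_detrend(temps_c))
```

Detrending first removes a gradual change along the route, for example from altitude or time of day, so that the normalized values emphasize local cold spots rather than the overall gradient.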

Evaluation of Potential Reference Genes for Quantitative RT-PCR Analysis in Fusarium graminearum under Different Culture Conditions

  • Kim, Hee-Kyoung;Yun, Sung-Hwan
    • The Plant Pathology Journal / Vol. 27, No. 4 / pp. 301-309 / 2011
  • The filamentous fungus Fusarium graminearum is an important cereal pathogen. Although quantitative real-time PCR (qRT-PCR) is commonly used to analyze the expression of important fungal genes, no detailed validation of reference genes for the normalization of qRT-PCR data has been performed in this fungus. Here, we evaluated 15 candidate reference genes, including those previously described as housekeeping genes and those selected from whole-transcriptome sequencing data. Using a combination of three statistical algorithms (BestKeeper, geNorm, and NormFinder), the variation in the expression of these genes was assessed under different culture conditions that favored mycelial growth, sexual development, and trichothecene mycotoxin production. Under conditions favoring mycelial growth, GzFLO and GzUBH expression was most stable in complete medium. Both EF1A and GzRPS16 expression was relatively stable under all conditions on carrot agar, including mycelial growth and the subsequent perithecial induction stage. These two genes were also most stable during trichothecene production. For the combined data set, GzUBH and EF1A were selected as the most stable. Thus, these genes are suitable reference genes for accurate normalization of qRT-PCR data in gene expression analyses of F. graminearum and other related fungi.
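To show how reference genes enter the normalization, here is a minimal sketch of the widely used ΔCt calculation. It assumes 100% amplification efficiency and made-up Ct values; the abstract does not state which quantification formula the authors applied, and the function name is hypothetical.

```python
from statistics import fmean


def relative_expression(ct_target, ct_references):
    """Return 2 ** -(Ct_target - mean reference Ct), assuming 100% PCR efficiency."""
    delta_ct = ct_target - fmean(ct_references)
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: one target gene and two reference genes.
level = relative_expression(22.0, [18.0, 20.0])
```

Averaging the reference Ct values is equivalent to taking the geometric mean of the reference quantities, which is why an unstable reference gene shifts every normalized result. That dependence is the motivation for validating reference genes under each culture condition.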

Analysis of Muscle Activities of Lower Extremity in Jumping Pattern

  • 이성철;황인승;조영재;김선정
    • 한국운동역학회지 / Vol. 15, No. 2 / pp. 155-165 / 2005
  • The purpose of this study was to compare muscle activity in the Double Legged Jump (DLJ) and the Single Legged Jump (SLJ) by normalizing muscle activity. Eight college students without lower extremity injuries were selected as subjects, and EMG data were collected from the vastus medialis and gastrocnemius. The entire motion was divided into eccentric and concentric contractions, and each contraction was divided into three sections of equal duration, giving a total of six phases. The EMG data of each phase were integrated and normalized. The muscle activity of the vastus medialis differed significantly between DLJ and SLJ for both eccentric and concentric contractions (p<.05). Overall muscle activity in SLJ increased by 33.6%, with an increase of approximately 25.9% in eccentric contraction and 40% in concentric contraction. The muscle activity of the gastrocnemius showed a similar pattern to that of the vastus medialis. In conclusion, this research suggests that the muscle activity of one motion can be used to normalize the analysis of another motion.
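The per-phase integration and normalization can be sketched as follows, assuming that each phase is integrated with the trapezoidal rule on the rectified signal and that one motion is expressed as a percentage of the other. The function names and the toy signals are hypothetical, and the authors' exact processing steps are not given in the abstract.

```python
def integrated_emg(samples, dt):
    """Trapezoidal integral of the rectified (absolute-value) signal."""
    rectified = [abs(s) for s in samples]
    return sum((a + b) / 2.0 * dt for a, b in zip(rectified, rectified[1:]))


def phase_iemg(samples, dt, phases=3):
    """Split samples into equal-length phases and integrate each one.

    Samples left over after the equal split are ignored in this sketch.
    """
    size = len(samples) // phases
    return [integrated_emg(samples[i * size:(i + 1) * size], dt) for i in range(phases)]


def percent_of_reference(values, reference):
    """Express each phase value as a percentage of the matching reference phase."""
    return [100.0 * v / r for v, r in zip(values, reference)]


# Toy signals: the single-leg jump has twice the amplitude of the double-leg jump.
dlj = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
slj = [2.0, -2.0, 2.0, -2.0, 2.0, -2.0]
dlj_phases = phase_iemg(dlj, dt=0.5)
slj_relative = percent_of_reference(phase_iemg(slj, dt=0.5), dlj_phases)
```

Expressing SLJ as a percentage of DLJ removes the dependence on electrode placement and skin impedance, which otherwise makes raw EMG amplitudes hard to compare across subjects.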

Short term Sensor's Drift Compensation by using Three Drift Correction Techniques

  • 전진영;최장식;변형기
    • 센서학회지 / Vol. 25, No. 4 / pp. 291-296 / 2016
  • An ideal chemical sensor must give similar results under the same conditions, regardless of time, for gases to be measured accurately. However, the actual responses of chemical sensors have shown a lack of repeatability and reproducibility because of drift caused by aging and contamination of the sensor and by environmental changes such as temperature and humidity. If these problems are not properly taken into account, the stability and reliability of systems using chemical sensors decrease. In this paper, we analyzed sensor drift and applied three different compensation methods (DWT (Discrete Wavelet Transform), Baseline Manipulation, and Internal Normalization) to reduce the effects of drift and thereby improve the short-term stability and reliability of chemical sensors. The sensor drift was analyzed with a trend line graph. We applied the three methods to successive data measured over three days and compared the results using the standard deviation as the criterion. DWT showed the lowest standard deviation (before compensation: 7.1219, DWT: 1.3644, Baseline Manipulation: 2.5209, Internal Normalization: 3.1425).
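As an illustration of the comparison criterion, the sketch below applies the differential form of baseline manipulation (response minus the baseline read just before exposure) and compares the standard deviation before and after. The differential form is only one common variant and is an assumption here. The readings are invented, and whether the authors used the population or sample standard deviation is not stated.

```python
from statistics import pstdev


def differential_baseline(responses, baselines):
    """Subtract each measurement's own baseline reading from its response."""
    return [r - b for r, b in zip(responses, baselines)]


# Invented readings: the baseline drifts upward and drags the raw responses with it.
baselines = [1.0, 1.5, 2.0, 2.5]
responses = [6.0, 6.5, 7.0, 7.5]

corrected = differential_baseline(responses, baselines)
spread_before = pstdev(responses)
spread_after = pstdev(corrected)
```

In this toy case the drift is purely additive, so subtracting the baseline removes it completely. Real sensors also show multiplicative drift, which is one reason the methods in the abstract leave different residual spreads.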