• Title/Summary/Keyword: Data Normalization

Financial Market Prediction and Improving the Performance Based on Large-scale Exogenous Variables and Deep Neural Networks (대규모 외생 변수 및 Deep Neural Network 기반 금융 시장 예측 및 성능 향상)

  • Cheon, Sung Gil;Lee, Ju Hong;Choi, Bum Ghi;Song, Jae Won
    • Smart Media Journal / v.9 no.4 / pp.26-35 / 2020
  • Predicting future stock prices has long been an active research topic. However, unlike general time-series data, financial time-series data pose obstacles to prediction such as non-stationarity, long-term dependence, and non-linearity. In addition, when the pool of candidate variables is very large, manual selection is limited, so the model should be able to extract useful variables automatically. In this paper, we propose three components: 'sliding time step normalization', which normalizes non-stationary data; an LSTM autoencoder, which compresses the full set of variables; and 'moving transfer learning', which divides the data into periods and performs transfer learning. The experiments show that feeding as many variables as possible through the neural network outperforms using only 100 major financial variables, and that applying 'sliding time step normalization' to the non-stationarity of the data in all sections improves performance. 'Moving transfer learning', which evaluates the model and performs transfer learning at each step of the test interval, improves performance over long test intervals.
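The abstract does not spell out how 'sliding time step normalization' is computed. As a rough illustration of the general idea only, the sketch below z-scores each point against a trailing window, so that a drifting price level does not dominate the scale. The function name, the window length, and the sample prices are assumptions made for this example, not details from the paper.

```python
from statistics import mean, pstdev


def sliding_window_normalize(series, window):
    """Z-score the last value of each trailing window of length `window`.

    Illustrative only: the paper's exact normalization is not given
    in the abstract.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    normalized = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        mu = mean(chunk)
        sigma = pstdev(chunk)
        # A flat window has zero spread; emit 0.0 instead of dividing by zero.
        normalized.append(0.0 if sigma == 0 else (chunk[-1] - mu) / sigma)
    return normalized


# Hypothetical price series with an upward drift.
prices = [100.0, 101.0, 103.0, 102.0, 110.0, 120.0, 119.0, 125.0]
scaled = sliding_window_normalize(prices, window=4)
```

Each output value depends only on its own window, so the result has `len(prices) - window + 1` entries and uses no information from later time steps.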

Automatic Pancreas Detection on Abdominal CT Images using Intensity Normalization and Faster R-CNN (복부 CT 영상에서 밝기값 정규화 및 Faster R-CNN을 이용한 자동 췌장 검출)

  • Choi, Si-Eun;Lee, Seong-Eun;Hong, Helen
    • Journal of Korea Multimedia Society / v.24 no.3 / pp.396-405 / 2021
  • In pancreatic cancer surgery, it is important to know the shape of the patient's pancreas. However, previous studies have had limited success in detecting the pancreas automatically in abdominal CT images, because the pancreas varies in shape, size, and location from patient to patient. In this paper, we therefore propose a method that learns the various pancreas shapes across patients and adjacent slices using Faster R-CNN based on Inception V2 and automatically detects the pancreas in abdominal CT images. Model training and testing were performed on the NIH Pancreas-CT Dataset, and intensity normalization was applied to all data to improve detection accuracy. In addition, the test dataset was divided according to the shape of the pancreas into top, middle, and bottom slices to evaluate the model's performance on each subset. The mAP@.50IoU was 91.7% on the top slices and 95.4% on the bottom slices, and was highest on the middle slices at 98.5%. These results confirm that the model can accurately detect the pancreas in CT images.
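The abstract does not state which intensity normalization was used. One common form for CT data, shown here purely as an assumption, clips intensities to a fixed window and rescales them linearly to [0, 1]. The window bounds and the sample values are hypothetical.

```python
def normalize_intensity(pixels, low=-100.0, high=240.0):
    """Clip each value to [low, high], then rescale linearly to [0, 1].

    The default window is an illustrative choice, not the paper's setting.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    span = high - low
    return [
        [(min(max(value, low), high) - low) / span for value in row]
        for row in pixels
    ]


# Hypothetical 2x3 patch of CT intensities.
slice_hu = [[-1000.0, -100.0, 70.0], [240.0, 400.0, 0.0]]
normalized = normalize_intensity(slice_hu)
```

Values outside the window saturate at 0 or 1, which keeps air and dense structures from compressing the contrast range available to soft tissue.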

Human Normalization Approach based on Disease Comparative Prediction Model between Covid-19 and Influenza

  • Janghwan Kim;Min-Yong Jung;Da-Yun Lee;Na-Hyeon Cho;Jo-A Jin;R. Young-Chul Kim
    • International Journal of Internet, Broadcasting and Communication / v.15 no.3 / pp.32-42 / 2023
  • COVID-19 caused serious problems worldwide, including a pandemic driven by an unprecedented infection. Previous approaches developed medical vaccines and preemptive testing tools through medical engineering. However, disparities between countries and regions make medical systems and institutions difficult to access where they are poor. In advanced nations, the damage was even greater because high medical and examination costs kept people from going to the hospital. Therefore, from a software engineering perspective, we propose a learning model that determines coronavirus infection through symptom-data-based software prediction models and tools. After a comparative analysis of various models (decision tree, Naive Bayes, KNN, multi-perceptron neural network), we chose the decision tree model as the most appropriate. Because of a lack of data, additional survey data and overseas symptom data were applied and built into the judgment model. To protect from this, we also adapt a human normalization approach with a traditional Korean medicine approach. We expect that applying decision trees to data collection and analysis will make it possible to distinguish coronavirus, flu, allergy, and cold without medical examination and diagnosis tools.

Research on prediction and analysis of supercritical water heat transfer coefficient based on support vector machine

  • Ma Dongliang;Li Yi;Zhou Tao;Huang Yanping
    • Nuclear Engineering and Technology / v.55 no.11 / pp.4102-4111 / 2023
  • To better perform thermal-hydraulic calculation and analysis of supercritical water reactors, the support vector machine (SVM) algorithm was trained on experimental data for supercritical water and used to predict and analyze its heat transfer coefficient. The change in prediction accuracy of the heat transfer coefficient was analyzed as a function of the regularization penalty parameter C, the slack variable epsilon, and the Gaussian kernel function parameter gamma. The values predicted by the SVM model after parameter optimization were verified against actual experimental test data. The results show that normalization of the data has a great influence on the prediction results. The slack variable has a relatively small influence on the accuracy of the predicted heat transfer coefficient, whereas gamma has the greatest impact. Compared with the results of traditional empirical formula methods, the trained SVM model has a smaller average error and standard deviation. Using the trained SVM model, the heat transfer coefficient of supercritical water can be effectively predicted and analyzed.
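One plausible reason normalization matters so much for a Gaussian-kernel SVM is that the kernel, commonly written as exp(-gamma * ||x - y||^2), depends on raw distances between samples. The sketch below compares kernel values before and after min-max scaling. The two features (pressure and mass flux), their values, and the gamma setting are hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def min_max_scale(rows):
    """Rescale every column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical samples: [pressure in MPa, mass flux in kg/(m^2 s)].
samples = [[23.0, 500.0], [25.0, 1500.0], [24.0, 1000.0]]
gamma = 0.5

raw_similarity = rbf_kernel(samples[0], samples[1], gamma)
scaled = min_max_scale(samples)
scaled_similarity = rbf_kernel(scaled[0], scaled[1], gamma)
```

On the raw data the large-valued feature drives the squared distance so high that the kernel underflows to zero, so every pair of samples looks equally dissimilar. After scaling, both features contribute and gamma has a usable range.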

The Algorithm Design and Implement of Microarray Data Classification using the Byesian Method (베이지안 기법을 적용한 마이크로어레이 데이터 분류 알고리즘 설계와 구현)

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.12 / pp.2283-2288 / 2006
  • Recent advances in bioinformatics technology make micro-level experiments possible, so the expression pattern of an entire genome can be observed on a chip and the interactions of thousands of genes analyzed at the same time. DNA microarray technology thus opens new directions for understanding complex organisms, and effective methods are needed to analyze the enormous amount of gene information it produces. In this paper, we used sample data from the bioinformatics core group at Harvard University. We designed and implemented a system that applies a normalization process to reduce or remove the noise introduced by various factors in the microarray experiment, divides the data into two classes using a Bayesian algorithm with ASA as the feature extraction method, and then evaluates the accuracy. The accuracy was 98.23% after Lowess normalization.

An Efficiency Assessment for Reflectance Normalization of RapidEye Employing BRD Components of Wide-Swath satellite

  • Kim, Sang-Il;Han, Kyung-Soo;Yeom, Jong-Min
    • Korean Journal of Remote Sensing / v.27 no.3 / pp.303-314 / 2011
  • Surface albedo is an important parameter of the surface energy budget, and its accurate quantification is of major interest to the global climate modeling community. In this paper, we therefore consider the direct solution of kernel-based bidirectional reflectance distribution function (BRDF) models for retrieving the normalized reflectance of a high-resolution satellite. BRD effects can be observed in data from wide-swath satellites such as SPOT/VGT (VEGETATION), which have sufficient angular sampling. High-resolution satellites, however, cannot obtain sufficient angular sampling over a pixel within a short period because of their narrow swath, which makes it difficult to run the semi-empirical BRDF model needed for reflectance normalization. The principal purpose of the study is to estimate the normalized reflectance of a high-resolution satellite (RapidEye) using BRDF components from SPOT/VGT. We use a semi-empirical BRDF model to estimate BRDF components from SPOT/VGT and to normalize the reflectance of RapidEye. The study used SPOT/VGT S1 (daily) data together with data from the multispectral sensor RapidEye. Isotropic values such as the normalized reflectance were closely related to the BRDF parameters and the kernels, and we show a scatter plot of the relationship between the SPOT/VGT and RapidEye isotropic values. Linear regression analysis is performed using the SPOT/VGT parameters (isotropic, geometric, and volumetric scattering values) and the RapidEye kernel values (geometric and volumetric scattering kernels). Because BRDF parameters are difficult to calculate directly from high-resolution satellites, we use the BRDF parameters of SPOT/VGT. We also determine the weights for the geometric value, the volumetric scattering value, and the error through regression models. As a result, the weighting obtained through linear regression analysis produced good agreement. For all sites, the SPOT/VGT and RapidEye isotropic values were highly correlated (RMSE, bias) and generally very consistent.
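For orientation, kernel-driven semi-empirical BRDF models are usually written as a linear combination of an isotropic term and two angular kernels. The form below is the generic one; the symbols are notation chosen for this note, and the abstract does not say which specific kernels the authors used.

```latex
R(\theta_s, \theta_v, \phi)
  = f_{\mathrm{iso}}
  + f_{\mathrm{vol}}\, K_{\mathrm{vol}}(\theta_s, \theta_v, \phi)
  + f_{\mathrm{geo}}\, K_{\mathrm{geo}}(\theta_s, \theta_v, \phi)
```

Here \theta_s and \theta_v are the solar and view zenith angles, \phi is the relative azimuth, K_vol and K_geo are the volumetric and geometric scattering kernels, and f_iso, f_vol, f_geo are the fitted weights. Under this reading, f_iso is the term that remains once the angular kernel contributions are removed, which is consistent with the abstract treating the isotropic value as the normalized reflectance.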

Estimation of Road Sections Vulnerable to Black Ice Using Road Surface Temperatures Obtained by a Mobile Road Weather Observation Vehicle (도로기상차량으로 관측한 노면온도자료를 이용한 도로살얼음 취약 구간 산정)

  • Park, Moon-Soo;Kang, Minsoo;Kim, Sang-Heon;Jung, Hyun-Chae;Jang, Seong-Been;You, Dong-Gill;Ryu, Seong-Hyen
    • Atmosphere / v.31 no.5 / pp.525-537 / 2021
  • Black ice on road surfaces in winter tends to cause severe accidents. Black ice events are very difficult to detect in advance because they are highly localized and sensitive to surface and upper meteorological variables. This study develops a methodology to detect road sections vulnerable to black ice using road surface temperature data obtained from a mobile road weather observation vehicle. Seven experiments were conducted on the route from Nam-Wonju IC to Nam-Andong IC (132.5 km) on the Jungang Expressway between December 2020 and February 2021. First, the temporal road surface temperature data were converted to spatial data with a 50 m resolution. Then, the spatial road surface temperature was normalized to zero mean and unit standard deviation using three approaches: simple normalization, linear detrending followed by normalization, and low-pass filtering followed by normalization. The resulting road thermal map was calculated in terms of road surface temperature differences. A road ice index was proposed based on the normalized road temperatures and their horizontal differences. Road sections vulnerable to black ice were derived from the road ice indices and verified against road geometry, sky view, and other factors. Black ice was found to be possible not only on bridges but also on roads with a low sky view factor. These results are expected to be applicable to a black ice alarm service for drivers.
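Two of the normalization variants named above can be sketched as follows: plain z-scoring, and removal of a least-squares linear trend before z-scoring. The horizontal differences between neighbouring 50 m cells are computed at the end. The helper names and temperature values are hypothetical, and the road ice index itself is not reproduced because the abstract does not give its formula.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def detrend_linear(values):
    """Subtract the least-squares straight line fitted against the cell index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sxx
    return [y - (y_bar + slope * (x - x_bar)) for x, y in zip(xs, values)]


# Hypothetical road surface temperatures (deg C), one value per 50 m cell.
temps = [-1.2, -0.8, -2.9, -0.5, 0.1, 0.4]

simple = zscore(temps)
detrended = zscore(detrend_linear(temps))
# Horizontal difference between each cell and the previous one.
differences = [b - a for a, b in zip(detrended, detrended[1:])]
```

Detrending first removes a gradual along-route change, for example from elevation, so that locally cold cells stand out in the normalized series.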

Evaluation of Potential Reference Genes for Quantitative RT-PCR Analysis in Fusarium graminearum under Different Culture Conditions

  • Kim, Hee-Kyoung;Yun, Sung-Hwan
    • The Plant Pathology Journal / v.27 no.4 / pp.301-309 / 2011
  • The filamentous fungus Fusarium graminearum is an important cereal pathogen. Although quantitative real-time PCR (qRT-PCR) is commonly used to analyze the expression of important fungal genes, no detailed validation of reference genes for the normalization of qRT-PCR data has been performed in this fungus. Here, we evaluated 15 candidate reference genes, including genes previously described as housekeeping genes and genes selected from whole-transcriptome sequencing data. Using a combination of three statistical algorithms (BestKeeper, geNorm, and NormFinder), we assessed the variation in the expression of these genes under different culture conditions that favored mycelial growth, sexual development, and trichothecene mycotoxin production. Under conditions favoring mycelial growth, the expression of GzFLO and GzUBH was most stable in complete medium. The expression of both EF1A and GzRPS16 was relatively stable under all conditions on carrot agar, including mycelial growth and the subsequent perithecial induction stage. These two genes were also the most stable during trichothecene production. For the combined data set, GzUBH and EF1A were selected as the most stable. Thus, these genes are suitable reference genes for accurate normalization of qRT-PCR data in gene expression analyses of F. graminearum and other related fungi.
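Once stable reference genes have been chosen, a common way to use them is to divide each sample's target-gene quantity by the geometric mean of its reference-gene quantities. The sketch below illustrates that step only. The gene names come from the abstract, but the quantities are invented for the example and the procedure is not claimed to be the authors' exact pipeline.

```python
from statistics import geometric_mean


def normalize_expression(target, references):
    """Divide each sample's target quantity by the geometric mean of the
    reference-gene quantities measured in the same sample."""
    n = len(target)
    if any(len(values) != n for values in references.values()):
        raise ValueError("every reference gene needs one value per sample")
    factors = [
        geometric_mean([values[i] for values in references.values()])
        for i in range(n)
    ]
    return [t / f for t, f in zip(target, factors)]


# Hypothetical relative quantities for three samples.
references = {"GzUBH": [1.0, 2.0, 4.0], "EF1A": [1.0, 2.0, 1.0]}
target = [2.0, 4.0, 6.0]
normalized = normalize_expression(target, references)
```

The geometric mean is preferred over the arithmetic mean here because it is less sensitive to a single reference gene with a much higher abundance.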

Analysis of Muscle Activities of Lower Extremity in Jumping Pattern (점프유형에 따른 하지의 근 활동 형태연구(근전도 데이터 표준화 방법을 중심으로))

  • Lee, Sung-Cheol;Hwang, In-Seong;Cho, Young-Jae;Kim, Sun-Jung
    • Korean Journal of Applied Biomechanics / v.15 no.2 / pp.155-165 / 2005
  • The purpose of this study was to compare muscle activity in the Double Legged Jump (DLJ) and the Single Legged Jump (SLJ) by normalizing muscle activity. Eight college students without lower-extremity injuries were selected as subjects, and EMG data were collected from the vastus medialis and gastrocnemius. The entire motion was divided into eccentric and concentric contractions, and each contraction was divided into three sections of equal duration, giving a total of 6 phases. The EMG data of each phase were integrated and normalized. The muscle activity of the vastus medialis differed significantly between DLJ and SLJ for both eccentric and concentric contractions (p < .05). The increase in overall muscle activity of SLJ was 33.6%, with increases of approximately 25.9% in eccentric contraction and 40% in concentric contraction. Moreover, the muscle activity data of the gastrocnemius were similar to those of the vastus medialis. In conclusion, this research suggests that the muscle activity of one motion can be normalized for the analysis of another motion.

Short term Sensor's Drift Compensation by using Three Drift Correction Techniques (세 가지 드리프트 보정 기법을 이용한 단기 센서 드리프트 보정)

  • Jeon, Jin-Young;Choi, Jang-Sik;Byun, Hyung-Gi
    • Journal of Sensor Science and Technology / v.25 no.4 / pp.291-296 / 2016
  • For accurate gas measurement, an ideal chemical sensor should give similar results under the same conditions regardless of time. In practice, however, the responses of chemical sensors lack repeatability and reproducibility because of drift, which is caused by aging and contamination of the sensor and by environmental changes such as temperature and humidity. If these problems are not properly taken into account, the stability and reliability of a system using chemical sensors decrease. In this paper, we analyzed sensor drift and applied three compensation methods (DWT (Discrete Wavelet Transform), Baseline Manipulation, and Internal Normalization) to reduce the effects of drift and improve the short-term stability and reliability of chemical sensors. The sensor drift was analyzed with a trend line graph, and the standard deviation was used as the criterion for comparing the methods. We applied the three methods to successive data measured over three days and compared the results. DWT gave the lowest standard deviation (before compensation: 7.1219, DWT: 1.3644, Baseline Manipulation: 2.5209, Internal Normalization: 3.1425).
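The comparison criterion described above, the standard deviation of repeated measurements before and after compensation, can be sketched with one simple form of baseline manipulation: subtracting each measurement's baseline reading from its response. The abstract does not say which variant the authors used, and the readings below are invented for illustration; they do not reproduce the reported figures.

```python
from statistics import pstdev


def baseline_difference(responses, baselines):
    """Differential baseline manipulation: response minus its own baseline."""
    if len(responses) != len(baselines):
        raise ValueError("responses and baselines must have the same length")
    return [r - b for r, b in zip(responses, baselines)]


# Hypothetical repeated measurements of the same gas; both the baseline and
# the response drift upward over time.
baselines = [10.0, 10.5, 11.0, 11.5, 12.0]
responses = [30.1, 30.4, 31.1, 31.4, 32.0]

compensated = baseline_difference(responses, baselines)
spread_before = pstdev(responses)
spread_after = pstdev(compensated)
```

A lower spread after compensation indicates that repeated measurements under the same condition agree more closely, which is the sense in which the abstract ranks the three methods.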