• Title/Summary/Keyword: Sample Vector


On the Performance of Sample-Adaptive Product Quantizer for Noisy Channels (표본적응 프러덕트 양자기의 전송로 잡음에서의 성능 분석에 관한 연구)

  • Kim, Dong Sik
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.3 s.303 / pp.81-90 / 2005
  • When signals quantized by a vector quantizer (VQ) are transmitted through noisy channels, the overall performance of the coding system depends strongly on the quantization scheme employed and on the effect of channel errors. To design an optimal coding system, the source and channel coding schemes should be jointly optimized, as in the channel-optimized VQ. As a suboptimal approach, we may consider the robust VQ (RVQ), in which an index assignment function that maps quantizer outputs to channel symbols is designed so that the effect of channel errors is minimized. Recently, a VQ that reduces encoding complexity, called the sample-adaptive product quantizer (SAPQ), has been proposed. SAPQ has a quantizer structure very similar to that of the product quantizer (PQ), yet its quantization performance can be better than that of PQ, and both its encoding complexity and its codebook memory requirement are lower than those of the regular full-search VQ. In this paper, SAPQ is employed to design an RVQ that is robust to channel errors by reducing the vector dimension. The codebook structure of SAPQ is discussed and experiments are presented from the viewpoint of robustness to noisy channels.
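
The abstract above contrasts SAPQ with the conventional product quantizer (PQ). As background only, the sketch below (Python/NumPy, with made-up toy codebooks) illustrates how a PQ encodes a vector by splitting it into sub-vectors and quantizing each against a small sub-codebook, which is the source of the reduced encoding complexity and codebook memory mentioned in the abstract; it is not the paper's SAPQ.

```python
import numpy as np

def pq_encode(x, sub_codebooks):
    """Encode vector x with a product quantizer.

    x             : (d,) input vector
    sub_codebooks : list of m arrays, each (k, d/m), one codebook per sub-vector
    Returns the m nearest-codeword indices (the symbols sent over the channel).
    """
    m = len(sub_codebooks)
    sub_vectors = np.split(x, m)                           # split x into m sub-vectors
    indices = []
    for sub_x, codebook in zip(sub_vectors, sub_codebooks):
        dists = np.sum((codebook - sub_x) ** 2, axis=1)    # squared Euclidean distances
        indices.append(int(np.argmin(dists)))              # index of nearest codeword
    return indices

def pq_decode(indices, sub_codebooks):
    """Reconstruct the quantized vector from the sub-codebook indices."""
    return np.concatenate([cb[i] for i, cb in zip(indices, sub_codebooks)])

# Toy example: dimension 8 split into m = 4 sub-vectors, k = 16 codewords each.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(16, 2)) for _ in range(4)]
x = rng.normal(size=8)
idx = pq_encode(x, codebooks)
x_hat = pq_decode(idx, codebooks)
print(idx, np.linalg.norm(x - x_hat))
```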

Re-SSS: Rebalancing Imbalanced Data Using Safe Sample Screening

  • Shi, Hongbo;Chen, Xin;Guo, Min
    • Journal of Information Processing Systems / v.17 no.1 / pp.89-106 / 2021
  • Different samples can have different effects on learning support vector machine (SVM) classifiers. To rebalance an imbalanced dataset, it is reasonable to remove non-informative samples and add informative samples for learning classifiers. Safe sample screening can identify some of the non-informative samples while retaining the informative ones. This study developed a resampling algorithm for Rebalancing imbalanced data using Safe Sample Screening (Re-SSS), which is composed of two modules: selecting Informative Samples (Re-SSS-IS) and rebalancing via a Weighted SMOTE (Re-SSS-WSMOTE). Re-SSS-IS selects informative samples from the majority class and determines a suitable regularization parameter for the SVM, while Re-SSS-WSMOTE generates informative minority samples. Both modules are based on safe sample screening. The experimental results show that Re-SSS can effectively improve classification performance on imbalanced classification problems.
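
The weighted-SMOTE component described above generates synthetic minority samples. The weighting derived from safe sample screening is specific to the paper, but the underlying SMOTE-style interpolation can be sketched as follows (Python/NumPy; the uniform sampling weights here are a placeholder assumption, not the Re-SSS-WSMOTE weights).

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, weights=None, rng=None):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation.

    X_min   : (n, d) minority-class samples
    weights : optional per-sample selection probabilities (Re-SSS-WSMOTE would
              derive these from safe sample screening; uniform here as a placeholder)
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(X_min)
    if weights is None:
        weights = np.full(n, 1.0 / n)
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(n, p=weights)                   # pick a (weighted) seed sample
        d = np.sum((X_min - X_min[i]) ** 2, axis=1)
        neighbors = np.argsort(d)[1:k + 1]             # k nearest minority neighbors
        j = rng.choice(neighbors)
        lam = rng.random()                             # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# Toy usage: 20 minority samples in 2-D, generate 40 synthetic ones.
rng = np.random.default_rng(1)
X_min = rng.normal(size=(20, 2))
X_syn = smote_like_oversample(X_min, n_new=40, rng=rng)
print(X_syn.shape)  # (40, 2)
```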

Weight Vector Analysis to Portfolio Performance with Diversification Constraints (비중 상한 제약조건에 따른 포트폴리오 성과에 대한 투자 비중 분석)

  • Park, Kyungchan;Kim, Hongseon;Kim, Seongmoon
    • Korean Management Science Review / v.33 no.4 / pp.51-64 / 2016
  • The maximum weight of a single stock in a mutual fund is limited by regulations that enforce diversification. Under incomplete information, adding constraints on portfolio weights has been reported to enhance performance in previous research. We analyze the weight vector to examine the effect of such additional constraints on portfolio performance by computing the Euclidean distance from the in-sample tangency portfolio, whereas previous research analyzed ex-post returns only. Empirical experiments were performed on the mean-variance and minimum-variance models with the Fama-French 30-industry and 10-industry portfolios over the last 1,000 months, from August 1932 to November 2015. We find that diversification-constrained portfolios have 7% to 26% smaller Euclidean distances to the benchmark portfolio than unconstrained portfolios, and a 3% to 11% greater Sharpe ratio.
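
The two quantities the study compares, the Euclidean distance of a weight vector from the in-sample tangency portfolio and the Sharpe ratio, are straightforward to compute; a minimal sketch follows (Python/NumPy; the example weights and return series are invented for illustration and are not the paper's data).

```python
import numpy as np

def weight_distance(w, w_benchmark):
    """Euclidean distance between a portfolio weight vector and a benchmark."""
    return np.linalg.norm(np.asarray(w) - np.asarray(w_benchmark))

def sharpe_ratio(returns, rf=0.0):
    """Sharpe ratio of a return series: mean excess return over its standard deviation."""
    excess = np.asarray(returns) - rf
    return excess.mean() / excess.std(ddof=1)

# Illustrative example: a weight-capped portfolio vs. the tangency weights.
w_tangency    = np.array([0.45, 0.30, 0.15, 0.10])
w_constrained = np.array([0.30, 0.30, 0.25, 0.15])      # e.g. a 30% cap per asset
rng = np.random.default_rng(2)
asset_returns = rng.normal(0.01, 0.05, size=(120, 4))   # 120 months, 4 assets (synthetic)

print(weight_distance(w_constrained, w_tangency))
print(sharpe_ratio(asset_returns @ w_constrained))
```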

Lindley Type Estimators with the Known Norm

  • Baek, Hoh-Yoo
    • Journal of the Korean Data and Information Science Society / v.11 no.1 / pp.37-45 / 2000
  • Consider the problem of estimating a $p \times 1$ mean vector $\underline{\theta}$ $(p \geq 4)$ under quadratic loss, based on a sample $\underline{x}_1, \cdots, \underline{x}_n$. We find an optimal decision rule within the class of Lindley-type decision rules, which shrink the usual estimator toward the mean of the observations, when the underlying distribution is a variance mixture of normals and the norm $\|\underline{\theta} - \bar{\theta}\underline{1}\|$ is known, where $\bar{\theta} = (1/p)\sum_{i=1}^{p} \theta_i$ and $\underline{1}$ is the column vector of ones.
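
For reference, the classical Lindley-type rule for a single observation $X \sim N_p(\underline{\theta}, \sigma^2 I_p)$ with $\sigma^2$ known shrinks toward the grand mean $\bar{X} = (1/p)\sum_{i=1}^{p} X_i$ and has the form below; the paper's optimal rule, for variance mixtures of normals with the norm known, is a refinement of this form rather than this exact estimator.

$$\delta(X) \;=\; \bar{X}\,\underline{1} \;+\; \left(1 - \frac{(p-3)\,\sigma^{2}}{\|X - \bar{X}\,\underline{1}\|^{2}}\right)\left(X - \bar{X}\,\underline{1}\right), \qquad p \geq 4 .$$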


SEQUENTIAL ESTIMATION OF THE MEAN VECTOR WITH BETA-PROTECTION IN THE MULTIVARIATE DISTRIBUTION

  • Kim, Sung Lai;Song, Hae In;Kim, Min Soo;Jang, Yu Seon
    • Journal of the Chungcheong Mathematical Society / v.26 no.1 / pp.29-36 / 2013
  • For the sequential beta-protection procedure, we define a reasonable stopping time and show that, for this stopping time, Wijsman's requirements, namely the coverage probability and beta-protection conditions, are satisfied when estimating the mean vector ${\mu}$ from a sample drawn from a multivariate normal population with unknown mean vector ${\mu}$ and a positive definite variance-covariance matrix ${\Sigma}$.

Default Prediction of Automobile Credit Based on Support Vector Machine

  • Chen, Ying;Zhang, Ruirui
    • Journal of Information Processing Systems / v.17 no.1 / pp.75-88 / 2021
  • The automobile credit business has developed rapidly in recent years, and defaults occur frequently. Credit defaults bring great losses to automobile financial institutions, so successful prediction of automobile credit default is of great significance. First, records with missing values are deleted; then a random forest is used for feature selection, and the sample data are randomly grouped. Finally, six prediction models are constructed: support vector machine (SVM), random forest, k-nearest neighbor (KNN), logistic regression, decision tree, and artificial neural network (ANN). The results show that all six machine learning models can be used to predict automobile credit default. Among them, the decision tree has the highest accuracy, 0.79, but the SVM has the best overall performance. In addition, random grouping can improve the efficiency of model operation to a certain extent, especially for the SVM.
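
The pipeline described above, random-forest feature selection feeding an SVM classifier, can be sketched roughly with scikit-learn as follows; the feature matrix is synthetic, the hyperparameters are left near their defaults, and this is an illustration of the general approach rather than the authors' exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in for the credit data after rows with missing values
# have been dropped: X holds borrower features, y holds default labels.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 20))
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Random-forest-based feature selection followed by an RBF-kernel SVM.
model = make_pipeline(
    SelectFromModel(RandomForestClassifier(n_estimators=200, random_state=0)),
    StandardScaler(),
    SVC(kernel="rbf"),
)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```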

How to improve oil consumption forecast using google trends from online big data?: the structured regularization methods for large vector autoregressive model

  • Choi, Ji-Eun;Shin, Dong Wan
    • Communications for Statistical Applications and Methods / v.29 no.1 / pp.41-51 / 2022
  • We forecast the US oil consumption level by taking advantage of Google Trends, the search volumes of specific terms that people search for on Google. We focus on whether a proper selection of Google Trends terms leads to an improvement in forecast performance for oil consumption. As forecast models, we consider the least absolute shrinkage and selection operator (LASSO) regression and the structured regularization method for the large vector autoregressive (VAR-L) model of Nicholson et al. (2017), which automatically select the Google Trends terms and the lags of the predictors. An out-of-sample forecast comparison reveals that reducing the high-dimensional Google Trends data set to a low-dimensional one with the LASSO and VAR-L models produces better forecast performance for oil consumption than frequently used forecast models such as the autoregressive model, the autoregressive distributed lag model, and the vector error correction model.
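
The LASSO step described above selects Google Trends terms and predictor lags automatically; a minimal sketch of that idea with scikit-learn follows (the trends matrix and consumption series are synthetic placeholders, and the VAR-L method of Nicholson et al. is not reproduced here).

```python
import numpy as np
from sklearn.linear_model import LassoCV

def build_lagged_matrix(y, X, max_lag):
    """Stack lags 1..max_lag of the target y and of every predictor column in X."""
    T = len(y)
    rows = []
    for t in range(max_lag, T):
        lagged = [y[t - l] for l in range(1, max_lag + 1)]
        for l in range(1, max_lag + 1):
            lagged.extend(X[t - l])
        rows.append(lagged)
    return np.array(rows), y[max_lag:]

# Synthetic placeholders: monthly oil consumption and 30 Google Trends series.
rng = np.random.default_rng(4)
T, n_terms, max_lag = 240, 30, 3
trends = rng.normal(size=(T, n_terms))
consumption = np.cumsum(rng.normal(size=T))

Z, target = build_lagged_matrix(consumption, trends, max_lag)
model = LassoCV(cv=5).fit(Z, target)                  # cross-validated LASSO
print("nonzero coefficients:", int(np.count_nonzero(model.coef_)))
```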

Wavelength selection by loading vector analysis in determining total protein in human serum using near-infrared spectroscopy and Partial Least Squares Regression

  • Kim, Yoen-Joo;Yoon, Gil-Won
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.4102-4102 / 2001
  • In multivariate analysis, the absorbance spectrum is measured over a band of wavelengths, and one often pays little attention to the size of this band. However, it is desirable to measure the spectrum at only the necessary wavelengths as long as acceptable prediction accuracy can be maintained. In this paper, a method of selecting an optimal band of wavelengths based on loading vector analysis was proposed and applied to determining total protein in human serum using near-infrared transmission spectroscopy and PLSR. The loading vectors of the full-spectrum PLSR were used as the reference for selecting wavelengths, but only the first loading vector was used, since it explains the spectrum best. Absorbance spectra of sera from 97 outpatients were measured at 1530-1850 nm with an interval of 2 nm. Total protein concentrations of the sera ranged from 5.1 to 7.7 g/dL. Spectra were measured with a Cary 5E spectrophotometer (Varian, Australia), with serum in a 5 mm-pathlength cuvette in the sample beam and air in the reference beam. Full-spectrum PLSR was applied to determine total protein from the sera. Next, the wavelength region of 1672-1754 nm was selected based on the first loading vector. The Standard Error of Cross Validation (SECV) of the full-spectrum (1530-1850 nm) PLSR and the selected-wavelength (1672-1754 nm) PLSR was 0.28 and 0.27 g/dL, respectively, so the prediction accuracy of the two bands was essentially equal. Wavelength selection based on the loading vector in PLSR appears simple and robust in comparison to other methods based on the correlation plot, the regression vector, and the genetic algorithm. As a reference for wavelength selection in PLSR, the loading vector has the advantage over the correlation plot that the former is based on a multivariate model whereas the latter is based on a univariate model. Wavelength selection by the first loading vector also requires shorter computation time than the genetic algorithm and does not require smoothing.
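
The selection step described above keeps the spectral region where the first PLS loading vector is most informative; a rough sketch of that logic with scikit-learn is shown below. The spectra are synthetic, and the keep-the-top-25%-of-|loading| threshold is an assumption for illustration, not the authors' criterion.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Synthetic stand-in for 97 serum spectra measured at 1530-1850 nm every 2 nm.
rng = np.random.default_rng(5)
wavelengths = np.arange(1530, 1852, 2)            # 161 wavelengths
X = rng.normal(size=(97, wavelengths.size))       # absorbance spectra
y = rng.uniform(5.1, 7.7, size=97)                # total protein, g/dL

# Full-spectrum PLSR; x_loadings_[:, 0] is the first loading vector.
pls = PLSRegression(n_components=5).fit(X, y)
first_loading = np.abs(pls.x_loadings_[:, 0])

# Keep wavelengths whose first-loading magnitude is in the top quartile,
# then refit PLSR on the selected band only.
threshold = np.quantile(first_loading, 0.75)
selected = first_loading >= threshold
pls_selected = PLSRegression(n_components=5).fit(X[:, selected], y)
print("selected band:", wavelengths[selected][0], "-", wavelengths[selected][-1],
      "nm,", int(selected.sum()), "points")
```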


Expression of Human KCNE1 Gene in Zebrafish (Zebrafish에서 인간 KCNE1 유전자 발현에 관한 연구)

  • Park, Hyeon Jeong;Yoo, Min
    • Journal of Life Science / v.27 no.5 / pp.524-529 / 2017
  • This study aimed to produce a transgenic zebrafish expressing the human KCNE1 gene. Initially, the entire CDS of the human KCNE1 gene was amplified from a human genomic DNA sample by polymerase chain reaction using a primer set engineered with restriction enzyme sites (EcoRI, BamHI) at the 5' end of each primer. The resulting 402 bp KCNE1 amplicon flanked by EcoRI and BamHI sites was obtained and subsequently cloned into the plasmid vector pPB-CMVp-EF1-GreenPuro. The integrity of the cloned CDS sequence was confirmed by DNA sequencing analysis. Next, the recombinant vector containing the human KCNE1 (pPB-CMVp-hKCNE1-EF1-GreenPuro) was introduced into fertilized zebrafish eggs by microinjection. Successful expression of the recombinant vector in the eggs was confirmed by the expression of the fluorescent protein encoded in the vector. Finally, to verify that stable expression of the human KCNE1 gene occurred in the transgenic animal, RNA was extracted from the animal and the presence of KCNE1 transcripts was confirmed by RT-PCR as well as DNA sequencing analysis. The study provides a methodology for constructing a useful transgenic animal model applicable to the development of diagnostic technologies for gene therapy of LQTS (Long QT Syndrome), as well as tools for cloning useful genes in fish.

Motion Field Estimation Using U-Disparity Map in Vehicle Environment

  • Seo, Seung-Woo;Lee, Gyu-Cheol;Yoo, Ji-Sang
    • Journal of Electrical Engineering and Technology / v.12 no.1 / pp.428-435 / 2017
  • In this paper, we propose a novel motion field estimation algorithm that applies a U-disparity map and forward-backward error removal in a vehicular environment. In general, motion exists in images obtained by a camera attached to a vehicle because of the vehicle's movement; however, the obtained motion vectors are inaccurate due to environmental factors such as illumination changes and vehicle shaking. It is particularly difficult to extract accurate motion vectors on the road surface because of the similarity of adjacent pixel values. The proposed algorithm therefore first removes the road surface region from the obtained image using a U-disparity map, and then computes the optical flow, which represents the motion vectors of objects, in the remaining part of the image. The algorithm also applies a forward-backward error removal technique to improve motion vector accuracy, and the vehicle's movement is predicted by applying RANSAC (RANdom SAmple Consensus) to the obtained motion vectors, resulting in a motion field. Experimental results show that the performance of the proposed algorithm is superior to that of an existing algorithm.
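
The forward-backward error removal and RANSAC steps mentioned above are standard building blocks; a minimal OpenCV sketch follows. The frame files are hypothetical, the thresholds are assumptions, and the U-disparity road-surface masking from the paper is not reproduced here.

```python
import cv2
import numpy as np

def filter_flow_fb(prev_gray, next_gray, fb_thresh=1.0):
    """Track sparse features and drop those that fail a forward-backward check."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
    fwd, st1, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    bwd, st2, _ = cv2.calcOpticalFlowPyrLK(next_gray, prev_gray, fwd, None)
    fb_err = np.linalg.norm(pts - bwd, axis=2).ravel()     # forward-backward error
    good = (st1.ravel() == 1) & (st2.ravel() == 1) & (fb_err < fb_thresh)
    return pts[good], fwd[good]

# Hypothetical consecutive frames from a vehicle-mounted camera.
prev_gray = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
next_gray = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
p0, p1 = filter_flow_fb(prev_gray, next_gray)

# Robust ego-motion estimate from the surviving motion vectors via RANSAC.
M, inliers = cv2.estimateAffinePartial2D(p0, p1, method=cv2.RANSAC,
                                         ransacReprojThreshold=3.0)
print("inlier ratio:", float(inliers.mean()))
```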