• Title/Summary/Keyword: Gradient-descent methods

Search Result 73

Comparison with two Gradient Methods through the application to the Vector Linear Predictor (두가지 gradient 방법의 벡터 선형 예측기에 대한 적용 비교)

  • Shin, Kwang-Kyun;Yang, Seung-In
    • Proceedings of the KIEE Conference / 1987.07b / pp.1595-1597 / 1987
  • Two gradient methods, the steepest descent method and the conjugate gradient descent method, are compared through application to vector linear predictors. It is found that the convergence rate of the conjugate gradient descent method is much faster than that of the steepest descent method.
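The convergence gap described in this abstract can be illustrated on a small quadratic problem. The following is a minimal sketch, not the paper's vector linear predictor: the matrix `A`, the vector `b`, the tolerance, and all function names are assumptions chosen for illustration. Both routines minimize 0.5·xᵀAx − bᵀx for a symmetric positive definite `A`; steepest descent uses an exact line search along the negative gradient, while conjugate gradient uses the standard linear CG recurrence.

```python
def matvec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def steepest_descent(A, b, x, tol=1e-10, max_iter=10_000):
    """Minimize 0.5*x^T A x - b^T x with exact line search along -gradient."""
    for k in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]  # residual = -gradient
        rr = dot(r, r)
        if rr ** 0.5 < tol:
            return x, k
        alpha = rr / dot(r, matvec(A, r))  # exact step size for a quadratic
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
    return x, max_iter


def conjugate_gradient(A, b, x, tol=1e-10, max_iter=10_000):
    """Linear conjugate gradient for a symmetric positive definite A."""
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
    p = r[:]  # first search direction is the steepest descent direction
    rr = dot(r, r)
    for k in range(max_iter):
        if rr ** 0.5 < tol:
            return x, k
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        # New direction is A-conjugate to the previous ones.
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


# Hypothetical ill-conditioned test problem (condition number 100).
A = [[100.0, 0.0], [0.0, 1.0]]
b = [1.0, 1.0]
x_sd, it_sd = steepest_descent(A, b, [0.0, 0.0])
x_cg, it_cg = conjugate_gradient(A, b, [0.0, 0.0])
print("steepest descent iterations:", it_sd)
print("conjugate gradient iterations:", it_cg)
```

In exact arithmetic, conjugate gradient solves an n-dimensional quadratic in at most n steps, whereas the iteration count of steepest descent grows with the condition number of `A`.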


Comparison of Gradient Descent for Deep Learning (딥러닝을 위한 경사하강법 비교)

  • Kang, Min-Jae
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.2 / pp.189-194 / 2020
  • This paper analyzes the gradient descent method, the most widely used method for training neural networks. Learning means updating the parameters so that the loss function reaches its minimum. The loss function quantifies the difference between actual and predicted values. The gradient descent method uses the slope of the loss function to update the parameters so as to minimize the error, and it is currently used in libraries that provide the best deep learning algorithms. However, these algorithms are provided as black boxes, making it difficult to identify the advantages and disadvantages of the various gradient descent methods. This paper analyzes the characteristics of four gradient descent methods in current use: the stochastic gradient descent method, the momentum method, the AdaGrad method, and the Adadelta method. The experiments used the Modified National Institute of Standards and Technology (MNIST) data set, which is widely used to verify neural networks. The network has two hidden layers: the first with 500 neurons, and the second with 300. The activation function of the output layer is the softmax function, and the rectified linear unit function is used for the remaining input and hidden layers. The loss function is the cross-entropy error.
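The four update rules named in this abstract can be written out in a few lines each. The sketch below is an illustration under stated assumptions, not the paper's experiment: it uses a two-parameter toy loss instead of the MNIST network, exact gradients instead of mini-batch estimates, and hyperparameter values (`lr`, `beta`, `rho`, `eps`) picked for the toy problem. All function names are hypothetical.

```python
import math


def loss(w):
    # Toy loss with different curvature per coordinate (an assumption).
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def grad(w):
    return [w[0], 10.0 * w[1]]


def sgd_step(w, g, state, lr=0.05):
    # Plain gradient step: w <- w - lr * g
    return [wi - lr * gi for wi, gi in zip(w, g)]


def momentum_step(w, g, state, lr=0.05, beta=0.9):
    # Velocity accumulates past gradients: v <- beta * v - lr * g
    v = state.setdefault("v", [0.0] * len(w))
    for i, gi in enumerate(g):
        v[i] = beta * v[i] - lr * gi
    return [wi + vi for wi, vi in zip(w, v)]


def adagrad_step(w, g, state, lr=0.5, eps=1e-8):
    # Per-parameter step size shrinks with the sum of squared gradients.
    h = state.setdefault("h", [0.0] * len(w))
    for i, gi in enumerate(g):
        h[i] += gi * gi
    return [wi - lr * gi / (math.sqrt(hi) + eps) for wi, gi, hi in zip(w, g, h)]


def adadelta_step(w, g, state, rho=0.95, eps=1e-6):
    # Running averages of squared gradients and squared updates replace
    # a hand-set learning rate.
    eg = state.setdefault("eg", [0.0] * len(w))
    ed = state.setdefault("ed", [0.0] * len(w))
    new_w = []
    for i, gi in enumerate(g):
        eg[i] = rho * eg[i] + (1.0 - rho) * gi * gi
        delta = -math.sqrt(ed[i] + eps) / math.sqrt(eg[i] + eps) * gi
        ed[i] = rho * ed[i] + (1.0 - rho) * delta * delta
        new_w.append(w[i] + delta)
    return new_w


def run(step, steps=200):
    """Apply one optimizer for a fixed number of steps; return the final loss."""
    w, state = [1.0, 1.0], {}
    for _ in range(steps):
        w = step(w, grad(w), state)
    return loss(w)


optimizers = {
    "sgd": sgd_step,
    "momentum": momentum_step,
    "adagrad": adagrad_step,
    "adadelta": adadelta_step,
}
initial_loss = loss([1.0, 1.0])
final_losses = {name: run(fn) for name, fn in optimizers.items()}
for name, value in final_losses.items():
    print(f"{name:9s} final loss = {value:.3e}")
```

Each step function keeps its own running statistics in `state`, which makes the difference between the methods visible: SGD is stateless, momentum stores a velocity, AdaGrad stores a cumulative sum, and Adadelta stores two decaying averages. The relative ranking on this toy loss should not be read as a result about MNIST.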

An Application of the Clustering Threshold Gradient Descent Regularization Method for Selecting Genes in Predicting the Survival Time of Lung Carcinomas

  • Lee, Seung-Yeoun;Kim, Young-Chul
    • Genomics & Informatics / v.5 no.3 / pp.95-101 / 2007
  • In this paper, we consider the variable selection methods in the Cox model when a large number of gene expression levels are involved with survival time. Deciding which genes are associated with survival time has been a challenging problem because of the large number of genes and relatively small sample size (n<

A Study on the Tensor-Valued Median Filter Using the Modified Gradient Descent Method in DT-MRI (확산텐서자기공명영상에서 수정된 기울기강하법을 이용한 텐서 중간값 필터에 관한 연구)

  • Kim, Sung-Hee;Kwon, Ki-Woon;Park, In-Sung;Han, Bong-Soo;Kim, Dong-Youn
    • Journal of Biomedical Engineering Research / v.28 no.6 / pp.817-824 / 2007
  • Tractography using Diffusion Tensor Magnetic Resonance Imaging (DT-MRI) is a method to determine the architecture of axonal fibers in the central nervous system by computing the direction of the principal eigenvector in the white matter of the brain. However, fiber tracking methods suffer from the noise in the diffusion tensor images, which affects the determination of the principal eigenvector. As the fiber tracking progresses, the accumulated error creates a large deviation between the calculated fiber and the real fiber. Mathematically, DT-MRI tractography is an ill-posed problem, which means that it is very sensitive to perturbations by noise. To reduce the noise in DT-MRI measurements, a tensor-valued median filter, which is reported to be denoising and structure-preserving in fiber tracking, is applied in the tractography. In this paper, we propose a modified gradient descent method that converges quickly and accurately to the optimal tensor-valued median filter by changing the step size. In addition, the performance of the modified gradient descent method is compared with that of other methods. We used a synthetic image consisting of 45 degree principal eigenvectors, and the corticospinal tract. For the synthetic image, the proposed method achieved 4.66%, 16.66% and 15.08% less error than the conventional gradient descent method for the error measures AE, AAE and AFA, respectively. For the corticospinal tract, at iteration number ten the proposed method achieved 3.78%, 25.71% and 11.54% less error than the conventional gradient descent method for the error measures AE, AAE and AFA, respectively.

Perceptron-like LVQ : Generalization of LVQ (퍼셉트론 형태의 LVQ : LVQ의 일반화)

  • Song, Geun-Bae;Lee, Haing-Sei
    • Journal of the Institute of Electronics Engineers of Korea CI / v.38 no.1 / pp.1-6 / 2001
  • In this paper we reanalyze Kohonen's learning vector quantization (LVQ) learning rule, which is based on Hebb's learning rule, from the viewpoint of a gradient descent method. Kohonen's LVQ can be classified into two algorithms according to learning mode: unsupervised LVQ (ULVQ) and supervised LVQ (SLVQ). These two algorithms can be represented as gradient descent methods if the target values of the output neurons are generated properly. As a result, we see that the LVQ learning method is a special case of a gradient descent method and also that LVQ is represented by a generalized perceptron-like LVQ (PLVQ).


Gradient Descent Approach for Value-Based Weighting (점진적 하강 방법을 이용한 속성값 기반의 가중치 계산방법)

  • Lee, Chang-Hwan;Bae, Joo-Hyun
    • The KIPS Transactions:PartB / v.17B no.5 / pp.381-388 / 2010
  • Naive Bayesian learning has been widely used in data mining, and it performs surprisingly well in many applications. However, because naive Bayesian learning assumes that all attributes are equally important, the posterior probabilities it estimates are sometimes poor. In this paper, we propose more fine-grained weighting methods, called value weighting, in the context of naive Bayesian learning. While current weighting methods assign a weight to each attribute, we assign a weight to each attribute value. We investigate how the proposed value weighting affects the performance of naive Bayesian learning. We develop new methods, based on gradient descent, for both value weighting and feature weighting in the context of naive Bayesian learning. The performance of the proposed methods has been compared with that of the attribute weighting method and general naive Bayesian learning, and the value weighting method performed better in most cases.

Hybrid Fuzzy Adaptive Control of LEGO Robots

  • Vaseak, Jan;Miklos, Marian
    • International Journal of Fuzzy Logic and Intelligent Systems / v.2 no.1 / pp.65-69 / 2002
  • The main drawback of "classical" fuzzy systems is the inability to design and maintain their database. To overcome this disadvantage, many types of extensions adding the adaptivity property to those systems have been designed. This paper deals with one of them: a new hybrid adaptation structure, called the gradient-incremental adaptive fuzzy controller, which connects gradient-descent methods with the so-called self-organizing fuzzy logic controller designed by Procyk and Mamdani. The aim is to incorporate the advantages of both principles. This controller was implemented and tested on a system of LEGO robots. The results and a comparison with a "classical" (non-adaptive) fuzzy controller designed by a human operator are also shown.

Fuzzy Modeling based on FCM Clustering Algorithm (FCM 클러스터링 알고리즘에 기초한 퍼지 모델링)

  • 윤기찬;오성권
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2000.10a / pp.373-373 / 2000
  • In this paper, we propose a fuzzy modeling algorithm which divides the input space more efficiently than conventional methods by taking into consideration correlations between components of the sample data. The proposed fuzzy modeling algorithm consists of two steps: coarse tuning, which determines the consequent parameters approximately using the FCRM clustering method, and fine tuning, which adjusts the premise and consequent parameters more precisely by a gradient descent algorithm. To evaluate the performance of the proposed fuzzy model, we use numerical data from a nonlinear function.


A survey on parallel training algorithms for deep neural networks (심층 신경망 병렬 학습 방법 연구 동향)

  • Yook, Dongsuk;Lee, Hyowon;Yoo, In-Chul
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.505-514 / 2020
  • Since a large amount of training data is typically needed to train Deep Neural Networks (DNNs), a parallel training approach is required. The Stochastic Gradient Descent (SGD) algorithm is one of the most widely used methods to train DNNs. However, since SGD is an inherently sequential process, parallelizing it requires some form of approximation. In this paper, we review various efforts to parallelize the SGD algorithm, and analyze the computational overhead, the communication overhead, and the effects of the approximations.

Development of Railway Vibration Evaluation System Using Actual Railway Vibration Database (실측 철도 진동 데이터베이스를 이용한 철도진동 평가 시스템 개발)

  • Lee, Hyunjun;Seo, Eun Seong;Hwang, Young Sup
    • KIPS Transactions on Software and Data Engineering / v.8 no.4 / pp.153-162 / 2019
  • Recently, it has become necessary to develop a technology for quantitatively evaluating railway vibration, both to prevent civil complaints about orbital structures caused by railway noise and to ensure the normal operation of ultra-precise equipment in orbital industrial complexes. The existing analytical method requires a very complicated dynamic response model, and it is difficult to secure the reliability of the result due to the inaccuracy of the demand model. Therefore, in this paper, we propose a railway vibration evaluation algorithm and system that deduces the vibration value generated by railway operation using linear regression and gradient descent, based on a database of actual railway vibration measurements that classifies the factors affecting railway vibration. The prediction results obtained by the proposed algorithm show higher efficiency and accuracy than the existing analytical methods.
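The combination of linear regression and gradient descent mentioned in this abstract can be sketched in a few lines. This is a minimal illustration only: the data points are synthetic values generated from y = 2x + 1, not railway measurements, and the function name, learning rate, and epoch count are assumptions. The model has one input feature, whereas the paper's system presumably uses several vibration factors.

```python
def fit_linear_gd(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample under the current parameters.
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(err^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errs, xs))
        grad_b = 2.0 / n * sum(errs)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic, noise-free data (hypothetical): y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_gd(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

With real measurements the inputs would normally be scaled first, since features with very different ranges force a smaller learning rate and slow convergence.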