• Title/Summary/Keyword: Stochastic Gradient Descent (확률적 경사 하강)

Search results: 8 (processing time: 0.024 seconds)

Adaptive stochastic gradient method under two mixing heterogenous models (두 이종 혼합 모형에서의 수정된 경사 하강법)

  • Moon, Sang Jun;Jeon, Jong-June
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.6
    • /
    • pp.1245-1255
    • /
    • 2017
  • Online learning is the process of obtaining the solution to a given objective function as data accumulates in real time or in batch units. The stochastic gradient descent method is one of the most widely used methods for online learning. It is not only easy to implement, but also has good solution properties under the assumption that the data-generating model is homogeneous. However, the stochastic gradient method can severely mislead online learning when this homogeneity is actually violated. We assume that the observations come from two heterogeneous generating models and propose a new stochastic gradient method that mitigates the problem of heterogeneous models. We introduce a robust mini-batch optimization method using statistical tests and investigate the convergence radius of the solution of the proposed method. Moreover, the theoretical results are confirmed by numerical simulations.
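The abstract above describes screening mini-batches with a statistical test before applying a gradient step. As an illustrative sketch only (the paper's actual test and model are not given here), the following uses a median/MAD outlier test on batch means for a one-dimensional least-squares loss with a contaminating second generating model; all names and constants are assumptions:

```python
import random
import statistics

def robust_sgd(data, lr=0.1, batch=4, z=2.5, epochs=10, seed=0):
    """Mini-batch SGD on the loss (w - x)^2, skipping batches whose mean
    deviates from the running median by more than z robust standard
    deviations (an illustrative stand-in for the paper's statistical test)."""
    rng = random.Random(seed)
    w, history = 0.0, []
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch):
            m = statistics.fmean(data[i:i + batch])       # batch mean
            if len(history) >= 20:
                med = statistics.median(history)
                mad = statistics.median(abs(h - med) for h in history) or 1e-9
                if abs(m - med) > z * 1.4826 * mad:       # flagged as heterogeneous
                    continue
            history.append(m)
            w -= lr * 2.0 * (w - m)                       # gradient step on (w - m)^2
    return w

# majority model centered at 1.0, contaminating model centered at 10.0
rng = random.Random(1)
data = ([rng.gauss(1.0, 0.2) for _ in range(400)]
        + [rng.gauss(10.0, 0.2) for _ in range(20)])
w_hat = robust_sgd(data)
```

Without the screening test, the iterate would drift toward the mixture mean; with it, batches containing points from the second model are rejected and the estimate stays near the majority model's center.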

Comparison of Gradient Descent for Deep Learning (딥러닝을 위한 경사하강법 비교)

  • Kang, Min-Jae
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.2
    • /
    • pp.189-194
    • /
    • 2020
  • This paper analyzes the gradient descent method, the method most used for training neural networks. Learning means updating a parameter so that the loss function, which quantifies the difference between actual and predicted values, reaches its minimum. The gradient descent method uses the slope of the loss function to update the parameter to minimize error, and is currently used in libraries that provide the best deep learning algorithms. However, these algorithms are provided as black boxes, making it difficult to identify the advantages and disadvantages of the various gradient descent methods. This paper analyzes the characteristics of the stochastic gradient descent method, the momentum method, the AdaGrad method, and the Adadelta method, which are currently in common use. The experiments used the Modified National Institute of Standards and Technology (MNIST) data set, which is widely used to verify neural networks. The network has two hidden layers: the first with 500 neurons and the second with 300. The activation function of the output layer is the softmax function, and the rectified linear unit function is used for the input and hidden layers. The loss function is cross-entropy error.
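The four update rules compared in the paper can be written compactly. A minimal sketch minimizing the toy loss f(w) = w², with hyperparameters chosen for illustration rather than taken from the paper:

```python
import math

def sgd(w, g, s, lr=0.1):
    return w - lr * g, s

def momentum(w, g, s, lr=0.1, beta=0.9):
    v = beta * s + lr * g              # s holds the velocity
    return w - v, v

def adagrad(w, g, s, lr=0.5, eps=1e-8):
    s = s + g * g                      # s accumulates squared gradients
    return w - lr * g / (math.sqrt(s) + eps), s

def adadelta(w, g, s, rho=0.95, eps=1e-6):
    Eg, Ed = s                         # running averages of g^2 and step^2
    Eg = rho * Eg + (1 - rho) * g * g
    step = math.sqrt(Ed + eps) / math.sqrt(Eg + eps) * g
    Ed = rho * Ed + (1 - rho) * step * step
    return w - step, (Eg, Ed)

def minimize(update, state, w=5.0, steps=200):
    for _ in range(steps):
        g = 2.0 * w                    # gradient of f(w) = w^2
        w, state = update(w, g, state)
    return w

results = {
    "sgd": minimize(sgd, 0.0),
    "momentum": minimize(momentum, 0.0),
    "adagrad": minimize(adagrad, 0.0),
    "adadelta": minimize(adadelta, (0.0, 0.0), steps=2000),
}
```

Even on this one-dimensional problem the characteristic behaviors show up: AdaGrad's effective step shrinks as squared gradients accumulate, and Adadelta starts very slowly because its running average of past updates begins at zero.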

An Optimal Filter Design for System Identification with GA (GA를 이용한 시스템 동정용 필터계수 최적화)

  • Song, Young-Jun;Kong, Seong-Gon
    • Proceedings of the KIEE Conference
    • /
    • 1999.07g
    • /
    • pp.2833-2835
    • /
    • 1999
  • This paper uses a hybrid adaptive algorithm that combines the conventional Least Mean Square (LMS) method, widely used to optimize the coefficients of adaptive filters for system identification, with the genetic algorithm (GA), which has recently been applied to various optimization problems. In designing an FIR filter, to compensate for the local-convergence problem caused by relying on the gradient descent concept, the algorithm employs a genetic algorithm, which proceeds using only stochastic operators, without deterministic rules such as differentiation. In turn, the heavy computational load and slow convergence caused by the stochastic operations of the genetic algorithm are compensated for by the gradient descent of LMS. By exploiting the strengths of the genetic algorithm and of the LMS algorithm so that each compensates for the other's weaknesses, the performance of the algorithm is improved, and the improved algorithm is used to find the optimal filter coefficients. The performance of the adaptive filter is then verified and evaluated using the obtained coefficients.

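The hybrid scheme described above, a gradient-free GA for global search followed by LMS gradient refinement, can be sketched as follows. The filter length, population size, and genetic operators are illustrative assumptions, not the paper's settings:

```python
import random

TRUE_H = [0.5, -0.3, 0.2]            # unknown 3-tap FIR system (illustrative)

def fir(h, x):
    """Filter input x with FIR coefficients h."""
    return [sum(h[k] * (x[n - k] if n >= k else 0.0) for k in range(len(h)))
            for n in range(len(x))]

def mse(h, x, d):
    y = fir(h, x)
    return sum((di - yi) ** 2 for di, yi in zip(d, y)) / len(d)

rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(200)]   # excitation signal
d = fir(TRUE_H, x)                             # desired response

# --- GA stage: global search using only stochastic operators ---
pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(30)]
for _ in range(40):
    pop.sort(key=lambda c: mse(c, x, d))
    elite = pop[:10]
    children = []
    while len(children) < 20:
        a, b = rng.sample(elite, 2)            # crossover by averaging
        children.append([(ai + bi) / 2 + rng.gauss(0, 0.05)  # gaussian mutation
                         for ai, bi in zip(a, b)])
    pop = elite + children
h = min(pop, key=lambda c: mse(c, x, d))

# --- LMS stage: fast gradient-based local refinement of the GA solution ---
mu = 0.05
for _ in range(3):                             # a few passes over the data
    for n in range(len(x)):
        xs = [x[n - k] if n >= k else 0.0 for k in range(3)]
        e = d[n] - sum(hk * xk for hk, xk in zip(h, xs))
        h = [hk + 2 * mu * e * xk for hk, xk in zip(h, xs)]
```

The GA avoids the local-convergence risk of a pure gradient method, while the LMS pass supplies the fast final convergence the GA lacks, matching the complementary roles described in the abstract.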

Privacy Preserving Techniques for Deep Learning in Multi-Party System (멀티 파티 시스템에서 딥러닝을 위한 프라이버시 보존 기술)

  • Hye-Kyeong Ko
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.3
    • /
    • pp.647-654
    • /
    • 2023
  • Deep learning is a useful method for classifying and recognizing complex data such as images and text, and its accuracy is the basis for making artificial-intelligence-based services on the Internet useful. However, the vast amount of user data used for training in deep learning has led to privacy-violation problems, and there is concern that companies that have collected personal and sensitive user data, such as photographs and voices, own the data indefinitely. Users cannot delete their data and cannot limit the purpose of its use. For example, data owners such as medical institutions that want to apply deep learning technology to patients' medical records cannot share patient data because of privacy and confidentiality issues, making it difficult to benefit from deep learning technology. In this paper, we design a deep learning technique with privacy preservation applied, allowing multiple workers to use a neural network model jointly, without sharing input datasets, in a multi-party system. We propose a method that can selectively share small subsets using an optimization algorithm based on modified stochastic gradient descent, and confirm that it facilitates training with increased learning accuracy while protecting private information.
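One common way to "selectively share small subsets" during distributed SGD is for each party to upload only its largest-magnitude gradient entries, keeping the rest local. Whether the paper uses exactly this rule is not stated here, so treat this as a hedged sketch with made-up numbers:

```python
def select_top_k(grad, k):
    """Share only the k largest-magnitude gradient entries;
    the remaining entries stay private (zeros are uploaded instead)."""
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    shared = [0.0] * len(grad)
    for i in idx:
        shared[i] = grad[i]
    return shared

# Two parties compute local gradients; each shares only 2 of 5 entries.
g1 = [0.9, -0.1, 0.05, -0.7, 0.2]
g2 = [-0.3, 0.8, 0.02, 0.1, -0.6]
agg = [a + b for a, b in zip(select_top_k(g1, 2), select_top_k(g2, 2))]

# A jointly used model applies the aggregated sparse update.
lr = 0.1
w = [0.0] * 5
w = [wi - lr * gi for wi, gi in zip(w, agg)]
```

Each party's full gradient (and hence its input data) is never revealed; only the sparse, aggregated update reaches the shared model.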

Optimal Algorithm and Number of Neurons in Deep Learning (딥러닝 학습에서 최적의 알고리즘과 뉴론수 탐색)

  • Jang, Ha-Young;You, Eun-Kyung;Kim, Hyeock-Jin
    • Journal of Digital Convergence
    • /
    • v.20 no.4
    • /
    • pp.389-396
    • /
    • 2022
  • Deep learning is based on the perceptron and is currently used in various fields such as image recognition, voice recognition, object detection, and drug development. Accordingly, a variety of learning algorithms have been proposed, and the number of neurons constituting a neural network varies greatly among researchers. This study analyzed the learning characteristics, according to the number of neurons, of the currently used SGD, momentum, AdaGrad, RMSProp, and Adam methods. To this end, a neural network was constructed with one input layer, three hidden layers, and one output layer. ReLU was applied as the activation function, cross-entropy error (CEE) as the loss function, and MNIST was used as the experimental dataset. As a result, it was concluded that 100-300 neurons, the Adam algorithm, and 200 iterations would be the most efficient for deep learning training. This study will provide implications for the choice of algorithm and a reference value for the number of neurons given new training data in the future.
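Since Adam came out best in this study, a compact sketch of its update rule may be useful. The quadratic toy loss, step count, and hyperparameters below are illustrative assumptions, not the study's settings:

```python
import math

def adam_step(w, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter w with gradient g."""
    m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)          # bias correction for zero initialization
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

w, m, v = 5.0, 0.0, 0.0
for t in range(1, 501):
    g = 2.0 * w                        # gradient of f(w) = w^2
    w, m, v = adam_step(w, g, m, v, t)
```

The bias-correction terms matter early on: without them the first steps would be scaled down by the zero-initialized moment estimates.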

Sparse and low-rank feature selection for multi-label learning

  • Lim, Hyunki
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.7
    • /
    • pp.1-7
    • /
    • 2021
  • In this paper, we propose a feature selection technique for multi-label classification. Many existing feature selection techniques select features by calculating the relation between features and labels, using measures such as mutual information. However, since the mutual information measure requires a joint probability, which is difficult to calculate from the actual given feature set, only a few features can be evaluated and only local optimization is possible. To move away from this local-optimization problem, we propose a feature selection technique that constructs a low-rank space within the entire given feature space and selects features with sparsity. To this end, we design a regression-based objective function using the nuclear norm, and propose a gradient descent algorithm to solve the optimization problem of this objective function. Based on multi-label classification experiments on four datasets with three performance measures, the proposed methodology showed better performance than existing feature selection techniques. In addition, experimental results showed that performance is insensitive to changes in the parameter values of the proposed objective function.
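An objective of the kind the abstract describes, a smooth regression loss plus a nuclear-norm penalty, is commonly handled with proximal gradient steps, where the proximal map of the nuclear norm soft-thresholds singular values. A minimal sketch under that assumption (restricted to a diagonal matrix, whose singular values are just the absolute diagonal entries, so no SVD routine is needed; the numbers are illustrative):

```python
def soft_threshold(s, lam):
    """Proximal map of lam * |s|, applied entrywise to singular values."""
    return max(s - lam, 0.0) if s > 0 else -max(-s - lam, 0.0)

# One proximal gradient step: W <- prox_{lam*||.||_*}(W - lr * grad).
# For a diagonal W the singular values are the |diagonal entries|.
diag = [3.0, -1.2, 0.4]          # current diagonal of W (illustrative)
grad = [0.5, -0.2, 0.1]          # gradient of the smooth regression loss
lr, lam = 1.0, 0.5
stepped = [w - lr * g for w, g in zip(diag, grad)]
shrunk = [soft_threshold(s, lam) for s in stepped]
```

The shrinkage step drives small singular values exactly to zero, which is what produces the low-rank (and, with an added row-sparsity penalty, feature-selecting) solutions the paper is after.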

Proof-of-principle Experimental Study of the CMA-ES Phase-control Algorithm Implemented in a Multichannel Coherent-beam-combining System (다채널 결맞음 빔결합 시스템에서 CMA-ES 위상 제어 알고리즘 구현에 관한 원리증명 실험적 연구)

  • Minsu Yeo;Hansol Kim;Yoonchan Jeong
    • Korean Journal of Optics and Photonics
    • /
    • v.35 no.3
    • /
    • pp.107-114
    • /
    • 2024
  • In this study, the feasibility of using the covariance matrix adaptation evolution strategy (CMA-ES) algorithm in a multichannel coherent-beam-combining (CBC) system was experimentally verified. We constructed a multichannel CBC system utilizing a spatial light modulator (SLM) as a multichannel phase-modulator array, along with a coherent light source at 635 nm, implemented the stochastic-parallel-gradient-descent (SPGD) and CMA-ES algorithms on it, and compared their performances. In particular, we evaluated the characteristics of the CMA-ES and SPGD algorithms in both a 16-channel rectangular format and a 19-channel honeycomb format. The results showed that, under the given conditions, the performances of the two algorithms were similar on average; however, the CMA-ES algorithm operated more stably than the SPGD algorithm, showing less variation with the initial phase setting. To the best of our knowledge, this study is the first proof-of-principle demonstration of the CMA-ES phase-control algorithm in a multichannel CBC system, and it is expected to be useful for future experimental studies of the effects of additional channel-number increments or external phase noise in multichannel CBC systems based on the CMA-ES phase-control algorithm.
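The SPGD baseline against which CMA-ES is compared perturbs all channel phases simultaneously and steps along the perturbation, scaled by the resulting change in the combining metric. A toy sketch with an idealized metric (unit-amplitude phasors; the gain, dither amplitude, and iteration count are assumptions, not the experiment's values):

```python
import math
import random

def combined_power(phases):
    """Idealized combining metric: |sum of unit-amplitude phasors|^2."""
    re = sum(math.cos(p) for p in phases)
    im = sum(math.sin(p) for p in phases)
    return re * re + im * im

def spgd(n=16, gain=30.0, delta=0.1, iters=3000, seed=0):
    rng = random.Random(seed)
    phases = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    j0 = combined_power(phases) / n**2          # normalized: 1.0 = ideal
    for _ in range(iters):
        du = [delta * rng.choice((-1.0, 1.0)) for _ in range(n)]
        jp = combined_power([p + d for p, d in zip(phases, du)]) / n**2
        jm = combined_power([p - d for p, d in zip(phases, du)]) / n**2
        # step along the dither, scaled by the measured metric change
        phases = [p + gain * (jp - jm) * d for p, d in zip(phases, du)]
    return j0, combined_power(phases) / n**2

j0, jf = spgd()
```

The two-sided measurement (jp, jm) makes the product (jp - jm) * du an unbiased stochastic estimate of the gradient direction, which is why the phases climb toward the coherent state without any per-channel phase measurement.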

Classification of Transport Vehicle Noise Events in Magnetotelluric Time Series Data in an Urban area Using Random Forest Techniques (Random Forest 기법을 이용한 도심지 MT 시계열 자료의 차량 잡음 분류)

  • Kwon, Hyoung-Seok;Ryu, Kyeongho;Sim, Ickhyeon;Lee, Choon-Ki;Oh, Seokhoon
    • Geophysics and Geophysical Exploration
    • /
    • v.23 no.4
    • /
    • pp.230-242
    • /
    • 2020
  • We performed a magnetotelluric (MT) survey to delineate the geological structures below a depth of 20 km in the Gyeongju area, where an earthquake with a magnitude of 5.8 occurred in September 2016. The measured MT data were severely distorted by electrical noise caused by subways, power lines, factories, houses, and farmlands, and by vehicle noise from passing trains and large trucks. Using machine-learning methods, we classified the MT time series data obtained near the railway and highway into two groups according to the inclusion of traffic noise. We applied three schemes, stochastic gradient descent, support vector machine, and random forest, to the time series data for high-speed-train noise. We formulated three datasets, Hx, Hy, and Hx & Hy, for the time series data of large-truck noise and applied the random forest method to each dataset. To evaluate the effect of removing the traffic noise, we compared the time series data, amplitude spectra, and apparent resistivity curves before and after the removal. We also examined the frequency range affected by traffic noise and checked the residual difference for artifact noise introduced during the noise-removal process.
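As a loose illustration of window-level classification of noisy time series, the toy ensemble below replaces a full random forest with bootstrap-trained decision stumps on two simple per-window features; the features, synthetic data, and thresholding rule are all assumptions, not the paper's pipeline:

```python
import random
import statistics

def features(window):
    """Two simple per-window features: amplitude range and mean |value|."""
    return (max(window) - min(window),
            statistics.fmean(abs(v) for v in window))

def train_stumps(data, labels, n_trees=25, seed=0):
    """Toy random-forest stand-in: each 'tree' is a decision stump on a
    randomly chosen feature, with its threshold fit on a bootstrap sample."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(data)) for _ in range(len(data))]  # bootstrap
        f = rng.randrange(2)                                        # random feature
        pos = [features(data[i])[f] for i in idx if labels[i] == 1]
        neg = [features(data[i])[f] for i in idx if labels[i] == 0]
        if not pos or not neg:
            continue
        thr = (statistics.fmean(pos) + statistics.fmean(neg)) / 2
        stumps.append((f, thr))
    return stumps

def predict(stumps, window):
    """Majority vote over stumps; 1 = traffic noise present."""
    votes = sum(1 for f, thr in stumps if features(window)[f] > thr)
    return 1 if votes * 2 > len(stumps) else 0

# synthetic stand-ins: quiet windows vs. high-amplitude 'traffic noise' windows
rng = random.Random(1)
quiet = [[rng.gauss(0, 0.1) for _ in range(50)] for _ in range(40)]
noisy = [[rng.gauss(0, 1.0) for _ in range(50)] for _ in range(40)]
data = quiet + noisy
labels = [0] * 40 + [1] * 40
model = train_stumps(data, labels)
acc = sum(predict(model, w) == y for w, y in zip(data, labels)) / len(data)
```

Bagging over bootstrap samples and random feature choice are the two ingredients a real random forest shares with this sketch; in practice one would use a full tree learner and richer spectral features.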