• Title/Summary/Keyword: Q-algorithm

Multi Behavior Learning of Lamp Robot based on Q-learning (강화학습 Q-learning 기반 복수 행위 학습 램프 로봇)

  • Kwon, Ki-Hyeon;Lee, Hyung-Bong
    • Journal of Digital Contents Society / v.19 no.1 / pp.35-41 / 2018
  • Q-learning, a reinforcement learning algorithm, is well suited to learning the goal of one behavior at a time over a combination of discrete states and actions. To learn multiple behaviors, a behavior-based architecture combined with an appropriate behavior coordination method lets a robot act quickly and reliably. Q-learning is a popular reinforcement learning method and is widely used in robot learning because it is simple, convergent, and, being off-policy, little affected by the training environment. In this paper, the Q-learning algorithm is applied to a lamp robot to learn multiple behaviors (human recognition and desk object recognition). Because the learning rate of Q-learning may affect the robot's performance while it learns multiple behaviors, we present an optimal multi-behavior learning model obtained by varying the learning rate.
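
The role of the learning rate in the abstract above can be illustrated with a minimal tabular Q-learning sketch. It is not the authors' lamp-robot implementation: the five-state corridor environment `chain_step`, the function name `q_learning`, and all parameter values are assumptions chosen only to show where the learning rate `alpha` enters the update.

```python
import random

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; step(state, action) returns (next_state, reward, done)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.randrange(n_actions)
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            # alpha controls how far the estimate moves toward the target.
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def chain_step(state, action):
    """Hypothetical corridor: action 1 moves right, action 0 moves left.
    Reaching state 4 ends the episode with reward 1."""
    nxt = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == 4 else 0.0), nxt == 4

q_table = q_learning(5, 2, chain_step, alpha=0.5)
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
```

Rerunning with different `alpha` values is one simple way to observe how the learning rate changes the speed at which the table settles.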

Solving the Gale-Shapley Problem by Ant-Q learning (Ant-Q 학습을 이용한 Gale-Shapley 문제 해결에 관한 연구)

  • Kim, Hyun;Chung, Tae-Choong
    • The KIPS Transactions:PartB / v.18B no.3 / pp.165-172 / 2011
  • In this paper, we propose an Ant-Q learning algorithm[1], modeled on the behavior of biological ants, as a new way to solve the Stable Marriage Problem (SMP)[3] introduced by Gale and Shapley[2]. The goal of SMP is to find an optimal stable matching based on the participants' preference lists (PL). A limitation of the Gale-Shapley algorithm is that the stable matching it produces favors only one side (male or female). We propose an alternative that can satisfy various requirements for SMP. ACS (Ant Colony System) is a swarm intelligence method that finds an optimal solution using ant pheromones. We improve the ACS technique by adding the Q-learning[9] concept. The resulting Ant-Q method can solve SMP under various requirements. The experimental results show that the proposed method performs well on the problem.
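
For context, the baseline that the abstract contrasts against can be sketched as the classic Gale-Shapley deferred-acceptance procedure. This is not the proposed Ant-Q method; the function name `gale_shapley` and the two-person preference lists are illustrative assumptions. The example shows the one-sidedness mentioned above: the matching changes depending on which side proposes.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching {proposer: receiver} via deferred acceptance."""
    # rank[r][p] = position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to propose to
    engaged_to = {}                               # receiver -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                 # receiver was unmatched
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                 # receiver trades up
            free.append(current)
        else:
            free.append(p)                    # proposal rejected
    return {p: r for r, p in engaged_to.items()}

# Hypothetical preference lists, most preferred first.
men = {"m1": ["w1", "w2"], "m2": ["w2", "w1"]}
women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}

men_propose = gale_shapley(men, women)    # every man gets his first choice
women_propose = gale_shapley(women, men)  # every woman gets her first choice
```

Both results are stable, yet each is best only for the proposing side, which is the limitation that motivates searching for matchings under other criteria.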

Dodecagon-based Q-learning Algorithm using SVM for Object Search of Robot (로봇의 목표물 추적을 위한 SVM과 12각형 기반의 Q-learning 알고리즘)

  • Seo, Sang-Wook;Jang, In-Hun;Sim, Kwee-Bo
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.227-230 / 2007
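  • In this paper, we propose a dodecagon-based Q-learning algorithm using SVM for object search by robots. To demonstrate the validity of the proposed algorithm, we set up an experiment with two robots, obstacles, and a single target, in which each robot searches for the hidden target. We ran the experiment with three methods: random search, a fusion model of DBAM and AMAB, and the proposed SVM and dodecagon-based Q-learning algorithm, and compared the three methods to verify the validity of the proposed approach.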

Quantization Performances and Iteration Number Statistics for Decoding Low Density Parity Check Codes (LDPC 부호의 복호를 위한 양자화 성능과 반복 횟수 통계)

  • Seo, Young-Dong;Kong, Min-Han;Song, Moon-Kyou
    • Journal of the Institute of Electronics Engineers of Korea TC / v.45 no.2 / pp.37-43 / 2008
  • The performance and hardware complexity of LDPC decoders depend on the quantization design parameters, namely the clipping threshold $c_{th}$ and the number of quantization bits q, as well as on the maximum number of decoding iterations. In this paper, the BER performance of LDPC codes is evaluated as a function of the clipping threshold $c_{th}$ and the number of quantization bits q through simulation. Comparing the quantized Min-Sum algorithm with the ideal Min-Sum algorithm shows that the quantized case with $c_{th}=2.5$ and q=6 performs best and approaches the ideal case. The decoding complexities are calculated, and the word error rates (WER) are estimated using the pdf obtained from a statistical analysis of the iteration numbers. These results can be used to trade off decoding performance against complexity in LDPC decoder design.
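
The two quantization parameters named above can be made concrete with a small sketch. The abstract does not specify the quantizer, so the symmetric uniform grid used here, the function name `quantize_llr`, and the level count $2^{q-1}-1$ per side are assumptions for illustration only.

```python
def quantize_llr(llr, c_th=2.5, q=6):
    """Clip a log-likelihood ratio to [-c_th, c_th] and round it to a
    symmetric uniform grid (assumed quantizer, not taken from the paper)."""
    max_level = 2 ** (q - 1) - 1            # 31 magnitude levels for q = 6
    step = c_th / max_level                 # spacing between adjacent levels
    clipped = max(-c_th, min(c_th, llr))    # saturate at the clipping threshold
    return round(clipped / step) * step     # snap to the nearest level

samples = [quantize_llr(x) for x in (-10.0, -0.3, 0.0, 1.0, 10.0)]
```

A smaller `c_th` saturates more messages while a larger one coarsens the grid for fixed `q`, which is the trade-off the simulations explore.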

STRONG CONVERGENCE OF STRICT PSEUDO-CONTRACTIONS IN Q-UNIFORMLY SMOOTH BANACH SPACES

  • Pei, Yonggang;Liu, Fujun;Gao, Qinghui
    • Journal of applied mathematics & informatics / v.33 no.1_2 / pp.13-31 / 2015
  • In this paper, we introduce a general iterative algorithm for finding a common element of the common fixed point set of an infinite family of ${\lambda}_i$-strict pseudo-contractions and the solution set of a general system of variational inclusions for two inverse strongly accretive operators in q-uniformly smooth Banach spaces. Then, we analyze the strong convergence of the iterative sequence generated by the proposed iterative algorithm under mild conditions.

A Nearly Optimal One-to-Many Routing Algorithm on k-ary n-cube Networks

  • Choi, Dongmin;Chung, Ilyong
    • Smart Media Journal / v.7 no.2 / pp.9-14 / 2018
  • The k-ary n-cube $Q^k_n$ is widely used in the design and implementation of parallel and distributed processing architectures. It consists of $k^n$ identical nodes; each node has degree 2n and is connected to its neighbors through bidirectional, point-to-point communication channels. On $Q^k_n$ we would like to transmit packets from a source node to 2n destination nodes simultaneously along paths on this network, with the $i^{th}$ packet transmitted along the $i^{th}$ path, where $0{\leq}i{\leq}2n-1$. In order for all packets to arrive at their destination nodes quickly and securely, we present an $O(n^3)$ routing algorithm on $Q^k_n$ that generates a set of one-to-many node-disjoint paths, where each path is either shortest or nearly shortest and the total length of the paths is nearly minimum, since the paths are mainly determined by the Hungarian method.
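
The topology described above can be sketched in a few lines. This shows only the network structure (neighbors and shortest-path length), not the paper's routing algorithm; the function names `neighbors` and `lee_distance` and the tuple representation of a node are assumptions.

```python
def neighbors(node, k):
    """Neighbors of a node in the k-ary n-cube: change one base-k digit
    by +1 or -1 modulo k. For k >= 3 this yields 2n distinct neighbors."""
    result = []
    for i, digit in enumerate(node):
        for delta in (1, -1):
            neighbor = list(node)
            neighbor[i] = (digit + delta) % k
            result.append(tuple(neighbor))
    return result

def lee_distance(a, b, k):
    """Shortest-path length between two nodes: per-digit wraparound distance."""
    return sum(min(abs(x - y), k - abs(x - y)) for x, y in zip(a, b))

source = (0, 0, 0)                 # a node of the 4-ary 3-cube
adjacent = neighbors(source, 4)    # six neighbors, matching degree 2n = 6
```

Each of the 2n neighbors is a natural first hop for one of the 2n node-disjoint paths that leave the source.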

GEOMETRIC DISTANCE FITTING OF PARABOLAS IN ℝ³

  • Kim, Ik Sung
    • Communications of the Korean Mathematical Society / v.37 no.3 / pp.915-938 / 2022
  • We are interested in the problem of fitting a parabola to a set of data points in ℝ³. It can usually be solved by minimizing the geometric distances from the fitted parabola to the given data points. In this paper, we propose a parabola fitting algorithm that minimizes the sum of the squares of the geometric distances in ℝ³. Our algorithm is mainly based on the steepest descent technique, which determines an adequate number λ such that h(λ) = Q(u - λ∇Q(u)) < Q(u). Some numerical examples are given to test our algorithm.
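
The step-size condition h(λ) = Q(u - λ∇Q(u)) < Q(u) can be sketched with a simple halving rule. The abstract does not say how λ is chosen, so the halving strategy, the function name `steepest_descent`, and the quadratic test objective below are assumptions; the actual objective in the paper is the sum of squared geometric distances to the parabola.

```python
def steepest_descent(Q, grad_Q, u, lam0=1.0, tol=1e-10, max_iter=1000):
    """Minimize Q by steepest descent, halving lam until Q strictly decreases."""
    for _ in range(max_iter):
        g = grad_Q(u)
        if sum(gi * gi for gi in g) < tol:   # gradient is numerically zero
            break
        lam = lam0
        # Search for lam with h(lam) = Q(u - lam * grad Q(u)) < Q(u).
        while Q([ui - lam * gi for ui, gi in zip(u, g)]) >= Q(u):
            lam *= 0.5
            if lam < 1e-16:                  # no decrease found; give up
                return u
        u = [ui - lam * gi for ui, gi in zip(u, g)]
    return u

# Hypothetical objective with its minimum at (1, -2).
def Q(u):
    return (u[0] - 1.0) ** 2 + 10.0 * (u[1] + 2.0) ** 2

def grad_Q(u):
    return [2.0 * (u[0] - 1.0), 20.0 * (u[1] + 2.0)]

u_min = steepest_descent(Q, grad_Q, [0.0, 0.0])
```

Requiring only a strict decrease is the weakest usable acceptance rule; a sufficient-decrease test such as the Armijo condition is a common stricter alternative.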

A Study of Field Application Process of Public Key Algorithm RSA Based on Mathematical Principles and Characteristics through a Diagnostic (수학원리와 특성 진단을 기반으로 한 공개키 RSA 알고리즘의 현장 적용 프로세스)

  • Noh, SiChoon;Song, EunJee;Moon, SongChul
    • Journal of Service Research and Studies / v.5 no.2 / pp.71-81 / 2015
  • The RSA public key encryption algorithm proceeds through a series of steps: prime numbers, key generation, factoring, the Euler function, key setup, congruence expressions, and the application process. These steps rest on mathematical principles. The first principle concerns how the primes are obtained and used: the product of two very large prime numbers is easy to compute, but tracing that product back to the original two primes is very hard. Given very large primes p and q, their product $n=p{\times}q$ is easy to compute, whereas recovering p and q from the composite number n is practically impossible. To obtain a mathematical one-way function whose inverse is hard to compute, the RSA encryption algorithm relies on the integer factorization problem for numbers with many digits. In addition to factoring, it relies on modular (mod) computations that are hard to reverse. Interest in encryption algorithms usually focuses on the stage at which they are first introduced, but practitioners also need to know the process by which they are applied in field work. This study presents a process scheme for applying encryption in the field, based on a diagnosis of the attributes of public key algorithms.
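
The mathematical steps listed above (primes, the product n, the Euler function, key setup) can be traced in a toy sketch. The tiny primes, the exponent e = 17, and the function name `toy_rsa_keypair` are assumptions for illustration; this is textbook RSA without padding, far too small to be secure, and it is not the field application process proposed in the study.

```python
def toy_rsa_keypair(p, q, e=17):
    """Build a textbook RSA key pair from two primes (illustration only)."""
    n = p * q                    # easy to compute, hard to factor when large
    phi = (p - 1) * (q - 1)      # Euler function of n for distinct primes p, q
    # Modular inverse of e (Python 3.8+); raises ValueError if gcd(e, phi) != 1.
    d = pow(e, -1, phi)
    return (e, n), (d, n)

public_key, private_key = toy_rsa_keypair(61, 53)

message = 65
cipher = pow(message, public_key[0], public_key[1])      # encrypt: m^e mod n
recovered = pow(cipher, private_key[0], private_key[1])  # decrypt: c^d mod n
```

Anyone who could factor n back into p and q could recompute `phi` and `d`, which is why the difficulty of factoring is the core of the scheme.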

Path selection algorithm for multi-path system based on deep Q learning (Deep Q 학습 기반의 다중경로 시스템 경로 선택 알고리즘)

  • Chung, Byung Chang;Park, Heasook
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.1 / pp.50-55 / 2021
  • A multi-path system is a system that uses multiple networks simultaneously. Such a system is expected to improve the communication speed, reliability, and security of a network. In this paper, we focus on path selection in a multi-path system. To select the optimal path, we propose a deep reinforcement learning algorithm that is rewarded by the round-trip time (RTT) of each network. Unlike the multi-armed bandit model, deep Q-learning is applied to account for rapidly changing conditions. Because RTT data arrive with a delay, we also propose an algorithm that compensates for the delayed reward. Moreover, we implement a testbed learning server to evaluate the performance of the proposed algorithm. The learning server contains a distributed database and a TensorFlow module to run the deep learning algorithm efficiently. Through simulation, we show that the proposed algorithm performs about 20% better than lowest-RTT selection.

A Study on the Algorithm for the Q-CDMA Base Station Receiver (Q-CDMA 기지국 수신기 알고리즘 연구)

  • 이태영;김환우
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.9 / pp.1812-1823 / 1994
  • In this paper, we focus on the simulation of receiver algorithms for the Q-CDMA reverse link modem to analyze its structure and performance. The receiver algorithm is characterized by processing a large amount of data to achieve reliable data transmission over a poor mobile channel environment. Following the Q-CDMA receiver scheme, we connect the code acquisition and code tracking models, which despread the input signals, with the RAKE-structure demodulator, which resolves the time-diversity signals caused by multipath propagation, and we test the connected system. The bit error rates are obtained for an arbitrary user under AWGN and multipath fading environments.
