• Title/Summary/Keyword: Q-algorithm


Multi-regional Anti-jamming Communication Scheme Based on Transfer Learning and Q Learning

  • Han, Chen;Niu, Yingtao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.7
    • /
    • pp.3333-3350
    • /
    • 2019
  • The smart jammer launches jamming attacks that degrade transmission reliability. In this paper, smart jamming attacks based on the communication probability over different channels are considered, and an anti-jamming Q-learning algorithm (AQLA) is developed to obtain anti-jamming knowledge for the local region. To accelerate the learning process across multiple regions, a multi-regional intelligent anti-jamming learning algorithm (MIALA), which utilizes knowledge transferred from neighboring regions, is proposed. The MIALA algorithm is evaluated through simulations, and the results show that it is capable of learning the jamming rules and effectively speeds up learning across the whole communication region when the jamming rules in neighboring regions are similar.
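
At its core, the AQLA described above is tabular Q-learning over channel choices. The following is a minimal sketch of such a channel-selection learner; the sweeping jammer model, the +1/-1 reward for an unjammed/jammed transmission, and all parameter values are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

# Minimal tabular Q-learning for channel selection under a (hypothetical) sweeping jammer.
# State: the channel jammed in the previous slot; action: the channel to transmit on next.
N_CHANNELS = 8
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

rng = np.random.default_rng(0)
Q = np.zeros((N_CHANNELS, N_CHANNELS))        # Q[state, action]

def jammer(t):
    """Assumed jamming rule: the jammer sweeps the channels cyclically."""
    return t % N_CHANNELS

state = jammer(0)
for t in range(1, 20_000):
    # epsilon-greedy channel selection
    if rng.random() < EPSILON:
        action = rng.integers(N_CHANNELS)
    else:
        action = int(np.argmax(Q[state]))

    jammed = jammer(t)
    reward = 1.0 if action != jammed else -1.0   # unjammed transmission succeeds
    next_state = jammed

    # standard Q-learning update
    Q[state, action] += ALPHA * (reward + GAMMA * Q[next_state].max() - Q[state, action])
    state = next_state

print("learned policy (last jammed channel -> next channel):", Q.argmax(axis=1))
```

Transferring knowledge from a neighboring region, as MIALA does, would then amount to initializing `Q` from that region's learned table instead of from zeros.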

Object tracking algorithm of Swarm Robot System for using SVM and Dodecagon based Q-learning (12각형 기반의 Q-learning과 SVM을 이용한 군집로봇의 목표물 추적 알고리즘)

  • Seo, Sang-Wook;Yang, Hyun-Chang;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.18 no.3
    • /
    • pp.291-296
    • /
    • 2008
  • This paper presents a dodecagon-based Q-learning and SVM algorithm for object search with multiple robots. We organized an experimental environment with several mobile robots, obstacles, and an object, and sent the robots into a hallway, where some obstacles were lying about, to search for a hidden object. In the experiments, we used four different control methods: a random search; a fusion model combining distance-based action making (DBAM) and area-based action making (ABAM) to determine the robots' next actions; and hexagon-based Q-learning and dodecagon-based Q-learning with SVM, each used to enhance the DBAM/ABAM fusion model.
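
The dodecagon discretization amounts to splitting the robot's surroundings into 12 angular sectors and running tabular Q-learning over them. Below is a minimal sketch of that idea (the SVM component is omitted); the state/action encoding, the progress-based reward, and all parameter values are assumptions for illustration only.

```python
import math
import random

N_SECTORS = 12                       # dodecagon: 12 discrete directions
ALPHA, GAMMA, EPSILON = 0.2, 0.9, 0.1

def sector_of(dx, dy):
    """Map a relative target position to one of the 12 angular sectors."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    return int(angle / (2 * math.pi / N_SECTORS))

# Q[state][action]: state = sector containing the target, action = movement direction
Q = [[0.0] * N_SECTORS for _ in range(N_SECTORS)]

def choose_action(state):
    if random.random() < EPSILON:
        return random.randrange(N_SECTORS)
    return max(range(N_SECTORS), key=lambda a: Q[state][a])

target = (5.0, 3.0)
for episode in range(2000):
    robot = [random.uniform(-5, 5), random.uniform(-5, 5)]
    for _ in range(30):
        s = sector_of(target[0] - robot[0], target[1] - robot[1])
        a = choose_action(s)
        heading = (a + 0.5) * 2 * math.pi / N_SECTORS
        old_dist = math.dist(robot, target)
        robot[0] += math.cos(heading)
        robot[1] += math.sin(heading)
        reward = old_dist - math.dist(robot, target)      # reward = progress toward the target
        s_next = sector_of(target[0] - robot[0], target[1] - robot[1])
        Q[s][a] += ALPHA * (reward + GAMMA * max(Q[s_next]) - Q[s][a])
```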

Active Frequency with a Positive Feedback Anti-Islanding Method Based on a Robust PLL Algorithm for Grid-Connected PV PCS

  • Lee, Jong-Pil;Min, Byung-Duk;Kim, Tae-Jin;Yoo, Dong-Wook;Yoo, Ji-Yoon
    • Journal of Power Electronics
    • /
    • v.11 no.3
    • /
    • pp.360-368
    • /
    • 2011
  • This paper proposes an active frequency with positive feedback anti-islanding method in the d-q frame, together with a robust phase-locked loop (PLL) algorithm based on the FFT concept. In general, PLL algorithms for grid-connected PV PCS use the d-q transformation and controllers that drive the imaginary part of the transformed voltage vector to zero. In a real grid system, the grid voltage is not ideal: it may be unbalanced, noisy, and contain many harmonics. For these reasons, the d-q transformed components are not pure DC components, and controller tuning of a PLL algorithm becomes difficult. The proposed PLL algorithm using the FFT concept exploits the strong noise cancellation characteristics of an FFT without a PI controller. Therefore, the proposed PLL algorithm needs no PI controller gain tuning and is hardly influenced by voltage drops, phase step changes, and harmonics. Islanding detection is a necessary feature of inverter-based photovoltaic (PV) systems in order to meet the stringent standard requirements for interconnection with an electrical grid. Both passive and active anti-islanding methods exist. Typically, active methods modify a given parameter, which also affects the shape and quality of the grid-injected current. In this paper, the active anti-islanding algorithm for a grid-connected PV PCS uses positive feedback control in the d-q frame. The proposed PLL and anti-islanding algorithms are implemented for a 250 kW PV PCS. This system has four DC/DC converters, each with a 25 kW power rating. This is only one-third of the total system power. The experimental results show that the proposed PLL, anti-islanding method, and topology demonstrate good performance in a 250 kW PV PCS.
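
The d-q transformation step the abstract relies on is the standard (Park) transform of the three-phase grid voltages: for an ideal balanced grid the q-component is pure DC, while imbalance and harmonics show up as ripple that conventional PI-based PLLs must be tuned against. The sketch below shows only that transform with an illustrative 5th-harmonic distortion; the paper's FFT-based filtering and positive-feedback frequency drift are not reproduced, and the frequencies and distortion level are assumptions.

```python
import numpy as np

def abc_to_dq(va, vb, vc, theta):
    """Amplitude-invariant Park transform of three-phase voltages at angle theta."""
    d = (2 / 3) * (va * np.cos(theta)
                   + vb * np.cos(theta - 2 * np.pi / 3)
                   + vc * np.cos(theta + 2 * np.pi / 3))
    q = (2 / 3) * (-va * np.sin(theta)
                   - vb * np.sin(theta - 2 * np.pi / 3)
                   - vc * np.sin(theta + 2 * np.pi / 3))
    return d, q

f, fs = 60.0, 10_000.0                    # grid and sampling frequencies (assumed)
t = np.arange(0, 0.1, 1 / fs)
theta = 2 * np.pi * f * t                 # PLL angle, here assumed perfectly locked
# distorted grid: a 5th harmonic added to phase a (illustrative, not the paper's data)
va = np.cos(theta) + 0.05 * np.cos(5 * theta)
vb = np.cos(theta - 2 * np.pi / 3)
vc = np.cos(theta + 2 * np.pi / 3)

d, q = abc_to_dq(va, vb, vc, theta)
print("mean q:", q.mean(), "  q ripple (std):", q.std())   # ripple caused by the distortion
```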

Partial Discharge Pattern Recognition using Neural Network (뉴우럴 네트워크에 의한 부분방전 패턴 인식)

  • Lee, June-Ho;Hozumi, Naohiro;Okamoto, Tatsuki
    • Proceedings of the KIEE Conference
    • /
    • 1995.07c
    • /
    • pp.1304-1306
    • /
    • 1995
  • In this study, a neural network algorithm incorporating a data standardization method was developed to discriminate phase-shifted partial discharge (PD) patterns such as the $\phi$-q-n pattern. In field PD measurements, it is not easy to acquire the absolute phase angles of PD pulses. As a consequence, one of the significant problems to be solved in applying the neural network algorithm to practical systems is to develop a method that can discriminate phase-shifted $\phi$-q-n patterns. Therefore, the authors established a new method that converts phase-shifted $\phi$-q-n patterns to a standardized $\phi$-q-n pattern that is not influenced by phase shifting. This new standardization method considerably improved the recognition performance of the neural network for phase-shifted $\phi$-q-n patterns.
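
The abstract does not spell out how the standardization works, so the following is only a hypothetical sketch of one way to make a $\phi$-q-n pattern phase-shift invariant before classification: circularly shift the phase axis so that a data-derived anchor bin (here, the phase bin with the largest total activity) always comes first. The 64x32 pattern size and the anchoring rule are assumptions, not the paper's method.

```python
import numpy as np

def standardize_phi_q_n(pattern):
    """
    Circularly shift a phi-q-n pattern (phase bins x charge bins) so that the
    phase bin with the largest total activity comes first, removing any unknown
    constant phase offset before the pattern is fed to the neural network.
    (Hypothetical standardization; the paper's actual method may differ.)
    """
    pattern = np.asarray(pattern, dtype=float)
    anchor = int(np.argmax(pattern.sum(axis=1)))   # dominant phase bin
    return np.roll(pattern, -anchor, axis=0)

# Two copies of the same pattern, one shifted by 17 phase bins,
# standardize to identical arrays:
rng = np.random.default_rng(1)
base = rng.random((64, 32))                        # stand-in for a 64x32 phi-q-n pattern
shifted = np.roll(base, 17, axis=0)
print(np.array_equal(standardize_phi_q_n(base), standardize_phi_q_n(shifted)))  # True
```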


Development of Semi-Active Control Algorithm Using Deep Q-Network (Deep Q-Network를 이용한 준능동 제어알고리즘 개발)

  • Kim, Hyun-Su;Kang, Joo-Won
    • Journal of Korean Association for Spatial Structures
    • /
    • v.21 no.1
    • /
    • pp.79-86
    • /
    • 2021
  • The control performance of a smart tuned mass damper (TMD) mainly depends on its control algorithm, and many control strategies have been proposed for semi-active control devices. Recently, machine learning has begun to be applied to the development of vibration control algorithms. In this study, reinforcement learning was employed to develop a semi-active control algorithm for a smart TMD composed of a magnetorheological (MR) damper. For this purpose, an 11-story building structure with a smart TMD was selected to construct the reinforcement learning environment, and a time history analysis of the example structure subjected to earthquake excitation was conducted during the learning procedure. Among various reinforcement learning algorithms, a deep Q-network (DQN) was used as the learning agent, and the command voltage sent to the MR damper is determined by the action produced by the DQN. Parametric studies on the hyper-parameters of the DQN were performed through numerical simulations. After appropriate training iterations of the DQN model with proper hyper-parameters, a DQN model for controlling the seismic responses of the example structure with the smart TMD was obtained. The developed DQN model can effectively control the smart TMD to reduce the seismic responses of the example structure.
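
The agent described above maps structural response measurements to discrete MR damper command voltages. The sketch below shows a generic DQN agent of that shape in PyTorch: the state dimension, voltage levels, network size, hyper-parameters, and the suggested reward are all placeholders, and the structural (time-history) simulation that would fill the replay buffer is omitted.

```python
import random
from collections import deque

import torch
import torch.nn as nn

# Hypothetical DQN agent for a smart TMD: the structural response (e.g., storey
# displacement and velocity) is the state; each discrete action is an MR damper
# command voltage.  All values below are placeholders, not the paper's.
STATE_DIM = 4
VOLTAGES = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
GAMMA, LR, BATCH = 0.99, 1e-3, 64

class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, len(VOLTAGES)))

    def forward(self, x):
        return self.net(x)

policy, target = QNet(), QNet()
target.load_state_dict(policy.state_dict())
optimizer = torch.optim.Adam(policy.parameters(), lr=LR)
replay = deque(maxlen=50_000)   # (state, action_idx, reward, next_state) tuples,
                                # produced by the (omitted) time-history simulation

def select_voltage(state, epsilon):
    """Epsilon-greedy choice of the MR damper command voltage index."""
    if random.random() < epsilon:
        return random.randrange(len(VOLTAGES))
    with torch.no_grad():
        return int(policy(torch.as_tensor(state, dtype=torch.float32)).argmax())

def train_step():
    """One DQN update from a replay mini-batch (the reward could be, for
    instance, the negative peak drift over the control step)."""
    if len(replay) < BATCH:
        return
    s, a, r, s2 = zip(*random.sample(replay, BATCH))
    s = torch.as_tensor(s, dtype=torch.float32)
    a = torch.as_tensor(a, dtype=torch.int64)
    r = torch.as_tensor(r, dtype=torch.float32)
    s2 = torch.as_tensor(s2, dtype=torch.float32)
    q = policy(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_target = r + GAMMA * target(s2).max(1).values
    loss = nn.functional.mse_loss(q, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```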

Q-learning to improve learning speed using Minimax algorithm (미니맥스 알고리즘을 이용한 학습속도 개선을 위한 Q러닝)

  • Shin, YongWoo
    • Journal of Korea Game Society
    • /
    • v.18 no.4
    • /
    • pp.99-106
    • /
    • 2018
  • Board games have many game pieces and large state spaces, so learning takes a long time. This paper uses a reinforcement learning algorithm, but reinforcement learning has a weakness: at the beginning, the learning speed is slow. Therefore, we tried to improve the learning speed with a heuristic that uses knowledge of the problem domain, consulting the game tree when several actions share the same best value during learning. To compare the existing character with the improved one, we produced a board game and let the improved character play against a one-sidedly attacking character. The improved character attacked its opponent while considering the game tree. Experimental results show that the improved character learned noticeably faster.
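
The key idea is the tie-breaking rule: early in Q-learning many actions share the same (often zero) best Q-value, and the paper resolves such ties with game-tree knowledge rather than at random. Below is a minimal sketch of that action-selection rule; `minimax_value` is a hypothetical stand-in for the paper's game-tree heuristic.

```python
import random
from collections import defaultdict

def select_action(Q, state, actions, minimax_value, epsilon=0.1):
    """
    Epsilon-greedy Q-learning action selection with game-tree tie-breaking:
    when several actions share the best Q-value (common early in learning),
    consult a shallow minimax evaluation instead of choosing among them at
    random.  `minimax_value(state, action)` stands in for the domain heuristic.
    """
    if random.random() < epsilon:
        return random.choice(actions)
    best_q = max(Q[(state, a)] for a in actions)
    tied = [a for a in actions if Q[(state, a)] == best_q]
    if len(tied) == 1:
        return tied[0]
    return max(tied, key=lambda a: minimax_value(state, a))

# usage sketch with a defaultdict Q-table and a dummy heuristic
Q = defaultdict(float)
action = select_action(Q, state="start", actions=[0, 1, 2],
                       minimax_value=lambda s, a: -a)   # dummy heuristic: prefers action 0
```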

The Development of Janggi Board Game Using Backpropagation Neural Network and Q Learning Algorithm (역전파 신경회로망과 Q학습을 이용한 장기보드게임 개발)

  • 황상문;박인규;백덕수;진달복
    • Journal of the Institute of Electronics Engineers of Korea TE
    • /
    • v.39 no.1
    • /
    • pp.83-90
    • /
    • 2002
  • This paper proposes a strategy learning method based on the fusion of a back-propagation neural network and the Q-learning algorithm for the two-person, deterministic janggi board game. The learning process is accomplished simply through self-play. The system consists of two parts: a move generator and a search kernel. The move generator produces the moves on the board, while the search kernel combines back-propagation and Q-learning with an $\alpha$-$\beta$ search algorithm to learn the evaluation function. Whereas temporal difference learning learns from the discrepancy between adjacent rewards, Q-learning acquires optimal policies even without prior knowledge of the effect of its moves on the environment, by learning the evaluation function from the augmented rewards. Based on the evaluation function learned over many games, it was shown that the winning percentage is, in general, linearly proportional to the amount of learning.
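
The search-kernel side pairs the learned evaluation function with $\alpha$-$\beta$ search. Below is a compact, generic $\alpha$-$\beta$ sketch of that pairing; `evaluate` stands in for the network-learned evaluation function, and `moves`/`apply_move` are assumed helpers for the janggi move generator, so this is an illustration rather than the paper's implementation.

```python
def alphabeta(board, depth, alpha, beta, maximizing, evaluate, moves, apply_move):
    """
    Generic alpha-beta search.  `evaluate(board)` is assumed to be the evaluation
    function learned by the back-propagation / Q-learning component; `moves(board)`
    and `apply_move(board, m)` are assumed move-generator helpers.
    """
    legal = moves(board)
    if depth == 0 or not legal:
        return evaluate(board)
    if maximizing:
        value = float("-inf")
        for m in legal:
            value = max(value, alphabeta(apply_move(board, m), depth - 1,
                                         alpha, beta, False, evaluate, moves, apply_move))
            alpha = max(alpha, value)
            if alpha >= beta:          # beta cutoff
                break
        return value
    else:
        value = float("inf")
        for m in legal:
            value = min(value, alphabeta(apply_move(board, m), depth - 1,
                                         alpha, beta, True, evaluate, moves, apply_move))
            beta = min(beta, value)
            if beta <= alpha:          # alpha cutoff
                break
        return value
```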

A Robot Soccer Strategy and Tactic Using Fuzzy Logic (퍼지 로직을 적용한 로봇축구 전략 및 전술)

  • Lee, Jeong-Jun;Ji, Dong-Min;Lee, Won-Chang;Kang, Geun-Taek;Joo, Moon G.
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.16 no.1
    • /
    • pp.79-85
    • /
    • 2006
  • This paper presents a strategy and tactics for robot soccer using a fuzzy logic mediator that determines a robot's action depending on the positions and the roles of two adjacent robots. The conventional Q-learning algorithm, where the number of states increases exponentially with the number of robots, is not suitable for a robot soccer system because it requires so much computation that processing cannot be done in real time. A modular Q-learning algorithm reduces the number of states by partitioning the concerned area, and a mediator algorithm is additionally used for cooperation among the robots. The proposed scheme implements the mediator among the robots with a fuzzy logic system, whose simple fuzzy rules keep the computation light and hence suitable for a robot soccer system. MiroSot simulations show the feasibility of the proposed scheme.
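
The mediator's job is to decide which of two adjacent robots should act on the ball. The sketch below shows one hypothetical fuzzy mediator of that kind: each robot's distance to the ball is fuzzified into NEAR/FAR sets and two simple rules yield a priority. The membership ranges, rule base, and role names are illustrative assumptions, not the paper's design.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def mediate(dist_robot1, dist_robot2):
    """
    Hypothetical fuzzy mediator for two adjacent robots: fuzzify each robot's
    distance to the ball, apply two rules (IF near THEN high priority, IF far
    THEN low priority), defuzzify by weighted average, and give the attacking
    role to the robot with the higher priority.
    """
    def priority(d):
        near = tri(d, -0.5, 0.0, 1.0)      # metres, illustrative ranges
        far = tri(d, 0.5, 2.0, 3.5)
        w = near + far
        return (near * 1.0 + far * 0.2) / w if w > 0 else 0.0

    p1, p2 = priority(dist_robot1), priority(dist_robot2)
    return ("robot1 attacks", "robot2 supports") if p1 >= p2 else ("robot2 attacks", "robot1 supports")

print(mediate(0.4, 1.8))   # ('robot1 attacks', 'robot2 supports')
```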

Q-Learning Policy and Reward Design for Efficient Path Selection (효율적인 경로 선택을 위한 Q-Learning 정책 및 보상 설계)

  • Yong, Sung-Jung;Park, Hyo-Gyeong;You, Yeon-Hwi;Moon, Il-Young
    • Journal of Advanced Navigation Technology
    • /
    • v.26 no.2
    • /
    • pp.72-77
    • /
    • 2022
  • Among reinforcement learning techniques, Q-Learning learns optimal policies by learning a Q-function that, for an action taken in a given state, predicts the expected future return. Q-Learning is widely used as a basic algorithm for reinforcement learning. In this paper, we studied the effectiveness of selecting and learning efficient paths by designing policies and rewards based on Q-Learning. In addition, the existing algorithm with a punishment-compensation policy and the proposed punishment-reinforced policy were compared by applying the same number of learning iterations to the 8x8 grid environment of the Frozen Lake game. This comparison showed that the punishment-reinforced Q-Learning policy proposed in this paper can significantly increase the learning speed compared with the conventional algorithm.
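
A minimal sketch of punishment-style reward shaping on the 8x8 Frozen Lake environment is shown below, assuming the gymnasium package and its FrozenLake8x8-v1 environment are available. The penalty value, hyper-parameters, and episode count are assumptions for illustration and are not the paper's reward design.

```python
import numpy as np
import gymnasium as gym   # assumes the gymnasium FrozenLake environment is installed

ALPHA, GAMMA, EPS, EPISODES = 0.1, 0.99, 0.1, 20_000
HOLE_PENALTY = -1.0       # assumed "punishment" value, not the paper's exact design

env = gym.make("FrozenLake8x8-v1", is_slippery=True)
Q = np.zeros((env.observation_space.n, env.action_space.n))
rng = np.random.default_rng(0)

for _ in range(EPISODES):
    state, _ = env.reset()
    done = False
    while not done:
        action = env.action_space.sample() if rng.random() < EPS else int(Q[state].argmax())
        next_state, reward, terminated, truncated, _ = env.step(action)
        # punishment-style shaping: penalize falling into a hole
        # instead of keeping the default reward of 0
        if terminated and reward == 0.0:
            reward = HOLE_PENALTY
        target = reward + GAMMA * (0.0 if terminated else Q[next_state].max())
        Q[state, action] += ALPHA * (target - Q[state, action])
        state, done = next_state, terminated or truncated
```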

D-q Equivalent Circuit-based Protection Algorithm for a Doubly-fed Induction Generator in the Time Domain

  • Kang, Yong-Cheol;Kang, Hae-Gweon;Lee, Ji-Hoon
    • Journal of Electrical Engineering and Technology
    • /
    • v.5 no.3
    • /
    • pp.371-378
    • /
    • 2010
  • Most modern wind turbines employ a doubly-fed induction generator (DFIG) system due to its many advantages, such as variable-speed operation, relatively high efficiency, and small converter size. The DFIG system uses a wound-rotor induction machine so that the magnetizing current of the generator can be fed from both the stator and the rotor. We propose a protection algorithm for a DFIG based on the d-q equivalent circuit in the time domain. In the DFIG, the voltages and currents on both the rotor side and the stator side are available. The proposed algorithm estimates the instantaneous induced voltages of the magnetizing inductance using the voltages and currents from both the stator and the rotor sides. If the difference between the two estimated induced voltages exceeds a threshold, the proposed algorithm declares an internal fault. The performance of the proposed algorithm is verified under various operating and fault conditions using a PSCAD/EMTDC simulator.
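
The decision step of such a scheme compares the magnetizing-branch induced voltage estimated from stator-side quantities with the one estimated from rotor-side quantities. The sketch below shows only that differential trip logic on already-estimated d-q samples; the voltage estimation itself (which needs the full d-q equivalent-circuit parameters), the threshold value, and the consecutive-sample security counter are assumptions, not the paper's exact criterion.

```python
import numpy as np

def trip_signal(e_stator_dq, e_rotor_dq, threshold, count=3):
    """
    Hypothetical trip logic for a d-q differential scheme: compare the induced
    voltage estimated from the stator side with the one estimated from the rotor
    side, sample by sample, and declare an internal fault only after `count`
    consecutive samples exceed the threshold.
    e_*_dq: arrays of shape (N, 2) holding the d and q estimates per sample.
    """
    diff = np.linalg.norm(np.asarray(e_stator_dq) - np.asarray(e_rotor_dq), axis=1)
    run = 0
    for k, over in enumerate(diff > threshold):
        run = run + 1 if over else 0
        if run >= count:
            return True, k       # fault detected at sample k
    return False, None

# toy usage: identical estimates -> no trip; diverging estimates -> trip
n = np.arange(200)
healthy = np.stack([np.cos(0.05 * n), np.sin(0.05 * n)], axis=1)
faulted = healthy.copy()
faulted[120:] += 0.5             # stator-side estimate diverges after sample 120
print(trip_signal(healthy, healthy, threshold=0.2))   # (False, None)
print(trip_signal(faulted, healthy, threshold=0.2))   # (True, 122)
```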