• Title/Summary/Keyword: Q-Learning


An Intelligent Video Streaming Mechanism based on a Deep Q-Network for QoE Enhancement (QoE 향상을 위한 Deep Q-Network 기반의 지능형 비디오 스트리밍 메커니즘)

  • Kim, ISeul;Hong, Seongjun;Jung, Sungwook;Lim, Kyungshik
    • Journal of Korea Multimedia Society / v.21 no.2 / pp.188-198 / 2018
  • With the recent development of high-speed wide-area wireless networks and the wide spread of high-performance wireless devices, the demand for seamless video streaming services in Long Term Evolution (LTE) network environments is ever increasing. To meet this demand and provide an enhanced Quality of Experience (QoE) to mobile users, Dynamic Adaptive Streaming over HTTP (DASH) has been actively studied to achieve QoE-enhanced video streaming service in dynamic network environments. However, the existing DASH algorithm for selecting the quality of requested video segments is procedural, so it is limited in adapting to dynamic network situations. To overcome this limitation, this paper proposes a novel quality selection mechanism based on a Deep Q-Network (DQN) model, the DQN-based DASH ABR ($DQN_{ABR}$) mechanism. The $DQN_{ABR}$ mechanism replaces the existing DASH ABR algorithm with an intelligent deep learning model that optimizes service quality for mobile users through reinforcement learning. The experimental analysis shows that, compared to existing approaches, the proposed solution performs better in adapting to dynamic wireless network situations and in improving the QoE of end users.
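
The quality-selection loop the abstract describes can be illustrated with reinforcement learning. The paper trains a Deep Q-Network; the sketch below substitutes a tabular Q-learning agent over a discretized (buffer, throughput) state so it stays self-contained. The bitrate ladder, state bins, QoE reward weights, and hyperparameters are all assumptions for illustration, not values from the paper.

```python
import random
import numpy as np

BITRATES = [300, 750, 1200, 1850, 2850]  # hypothetical bitrate ladder, kbps

def discretize(buffer_s, throughput_kbps):
    """Map continuous (buffer level, measured throughput) to a small state index."""
    b = min(int(buffer_s // 5), 3)            # 4 buffer bins, 5 s each
    t = min(int(throughput_kbps // 1000), 3)  # 4 throughput bins, 1 Mbps each
    return b * 4 + t

def qoe_reward(bitrate, prev_bitrate, rebuffer_s):
    """Common DASH QoE proxy: quality minus switch penalty minus rebuffer penalty."""
    return bitrate / 1000.0 - abs(bitrate - prev_bitrate) / 1000.0 - 4.3 * rebuffer_s

Q = np.zeros((16, len(BITRATES)))   # Q-table: state x quality level
alpha, gamma, eps = 0.1, 0.95, 0.1

def select_quality(state):
    """Epsilon-greedy quality selection for the next video segment."""
    if random.random() < eps:
        return random.randrange(len(BITRATES))
    return int(np.argmax(Q[state]))

def update(state, action, reward, next_state):
    """Standard Q-learning temporal-difference update."""
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])
```

A DQN version would replace the `Q` table with a neural network trained on the same TD target, which is what lets it generalize across continuous network states.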

A Study on Culinary Arts Major Students' Types of Subjectivity Recognition through a Restaurant Start-up Experience Program -Focused on Pop-up Restaurants- (외식창업교육 체험프로그램을 통한 조리전공 재학생의 주관적 인식유형 연구 -팝업레스토랑을 중심으로-)

  • Kim, Chan-Woo;Shin, Seoung-Hoon
    • The Journal of the Korea Contents Association / v.19 no.6 / pp.347-358 / 2019
  • This study used Q methodology to analyze culinary arts major students' subjectivity through their participation in a restaurant start-up experience program, a pop-up restaurant. The study identified a particular structure among the students' responses, consisting of five distinct types: an increased learning effect type (Type 1, N=4), a member collaboration importance type (Type 2, N=8), a marketing and PR need type (Type 3, N=6), a restaurant business plan type (Type 4, N=4), and an industry work experience required type (Type 5, N=3). The study also revealed that each type has its own diverse characteristics. The findings can serve as a foundation for similar future research using other methodologies, for example comparing differences among students or measuring at multiple points in time.

Development of Deep Learning Model for Fingerprint Identification at Digital Mobile Radio (무선 단말기 Fingerprint 식별을 위한 딥러닝 구조 개발)

  • Jung, Young-Giu;Shin, Hak-Chul;Nah, Sun-Phil
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.1 / pp.7-13 / 2022
  • Radio frequency fingerprinting refers to a methodology that extracts hardware-specific characteristics of a transmitter that are unintentionally embedded in the transmitted waveform. In this paper, we put forward a fingerprinting feature and a deep learning structure that can identify radios of the same type of Digital Mobile Radio (DMR) from their in-phase (I) and quadrature (Q) inputs. We propose using the magnitude of I/Q in polar coordinates as the RF fingerprinting feature and a modified ResNet-1D structure to identify it. Experimental results show that the proposed modified ResNet-1D structure achieves a recognition accuracy of 99.5% on 20 DMRs.
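
The feature-extraction step described above — taking the magnitude of the I/Q samples in polar coordinates before feeding a 1-D network — can be sketched as follows. The frame length and per-frame normalization are assumptions; the abstract does not specify them.

```python
import numpy as np

def iq_magnitude_feature(i, q, frame_len=128):
    """Convert raw I/Q samples to the polar-coordinate magnitude sequence
    used as the RF fingerprinting feature, framed for a 1-D CNN (ResNet-1D)."""
    mag = np.sqrt(np.asarray(i, dtype=float) ** 2 + np.asarray(q, dtype=float) ** 2)
    n = (len(mag) // frame_len) * frame_len        # drop the ragged tail
    frames = mag[:n].reshape(-1, frame_len)
    # per-frame standardization so absolute amplitude (e.g., gain settings)
    # does not dominate the hardware-specific fine structure
    frames = (frames - frames.mean(axis=1, keepdims=True)) / (
        frames.std(axis=1, keepdims=True) + 1e-9)
    return frames
```

Each row of the returned array would be one training example for the classifier.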

IRSML: An intelligent routing algorithm based on machine learning in software defined wireless networking

  • Duong, Thuy-Van T.;Binh, Le Huu
    • ETRI Journal / v.44 no.5 / pp.733-745 / 2022
  • In software-defined wireless networking (SDWN), optimal routing is one of the effective solutions for improving performance. Routing is done by many different methods, the most common being integer linear programming (ILP) formulations that build optimal routing metrics. These methods often focus on only one routing objective, such as minimizing the packet blocking probability, minimizing end-to-end delay (EED), or maximizing network throughput; it is difficult to consider multiple objectives concurrently in a routing algorithm. In this paper, we investigate the application of machine learning to routing control in SDWN and propose an intelligent routing algorithm based on machine learning to improve network performance. The proposed algorithm can optimize multiple routing objectives. Our idea is to combine supervised learning (SL) and reinforcement learning (RL) methods to discover new routes. SL is used to predict the performance metrics of the links, including EED, quality of transmission (QoT), and packet blocking probability (PBP); the routing itself is done by the RL method. We use the Q-value in the fundamental equation of RL to store the PBP, which is then used for route selection. Concurrently, the learning rate coefficient is flexibly changed to enforce the routing constraints, namely QoT and EED, during learning. Our performance evaluations based on OMNeT++ show that the proposed algorithm significantly improves network performance in terms of QoT, EED, packet delivery ratio, and network throughput compared with other well-known routing algorithms.
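
The abstract's core idea — storing the link PBP in the Q-value and switching the learning-rate coefficient according to the QoT/EED constraints — might look like the following minimal sketch. The update form, the epsilon-greedy next-hop selection, and all constants are assumptions, since the abstract does not give the exact equations.

```python
import random

def q_route_update(Q, node, next_hop, observed_pbp, qot_ok, eed_ok,
                   alpha_hi=0.5, alpha_lo=0.1):
    """One Q-value update in the spirit of IRSML: the Q-value stores the
    link's packet blocking probability (lower is better), and the learning
    rate is switched depending on whether the QoT/EED constraints hold."""
    alpha = alpha_hi if (qot_ok and eed_ok) else alpha_lo
    Q[node][next_hop] += alpha * (observed_pbp - Q[node][next_hop])
    return Q[node][next_hop]

def select_next_hop(Q, node, eps=0.1):
    """Epsilon-greedy selection of the neighbor with the lowest stored PBP."""
    neighbors = list(Q[node])
    if random.random() < eps:
        return random.choice(neighbors)      # explore an alternative route
    return min(neighbors, key=lambda n: Q[node][n])  # exploit the best link
```

Here `Q` is a nested dict mapping `node -> neighbor -> estimated PBP`; in the paper the PBP observation itself comes from the supervised-learning predictor.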

The Core Capabilities and Differences of Korean Credit Card Companies: Based on Q Analysis Results of Employers in Credit Card Companies (한국 신용카드기업의 역량과 차별성 : 신용카드기업 종사자에 대한 Q 분석결과를 중심으로)

  • Koh, Hyung-Myun
    • Survey Research / v.9 no.2 / pp.85-118 / 2008
  • There have been a great many ups and downs in the Korean credit card industry since 2000. It is certain that each credit card company copes with these situations by means of its organizational capabilities. According to the evolutionary view of institutionalism, a company's capabilities are composed of everyday routines, codes, rules, learning (by doing, using, and interacting), and decision making. The purpose of this article is to examine how two Korean credit card companies show their strategic differences and what factors expose each company's limits. To examine the companies' ordinary conduct, this study used Q methodology with a random sample of 16 members from each company. The results of the Q factor analysis make clear that each company still emphasizes economies of scale rather than innovation or improvement of its organizational and relational dynamics.


On the Radial Basis Function Networks with the Basis Function of q-Normal Distribution

  • Eccyuya, Kotaro;Tanaka, Masaru
    • Proceedings of the IEEK Conference / 2002.07a / pp.26-29 / 2002
  • The Radial Basis Function (RBF) network is known as an efficient method for classification problems and function approximation. The basis function of RBF networks is usually a normal distribution such as the Gaussian function. The output of the Gaussian function is maximal at the center and decreases as the distance from the center increases. For learning in neural networks, a method that treats a limited area of the input space is sometimes more useful than one that treats the whole input space. The q-normal distribution is a family of probability density functions that includes the Gaussian function. In this paper, we introduce RBF networks with the q-normal distribution as the basis function and use them to approximate a function.
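
A minimal sketch of such a network, using the Tsallis q-exponential as the q-normal basis: for q < 1 the basis has compact support, which matches the abstract's point about treating only a limited area of the input space. The least-squares fit and the specific parameter values are illustrative assumptions.

```python
import numpy as np

def q_exp(x, q):
    """Tsallis q-exponential [1 + (1-q)x]_+^(1/(1-q)); reduces to exp(x) at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return np.exp(x)
    base = 1.0 + (1.0 - q) * x
    return np.where(base > 0, base, 0.0) ** (1.0 / (1.0 - q))

def rbf_fit(x, y, centers, beta, q):
    """Least-squares fit of an RBF network whose basis is the
    q-normal function q_exp(-beta * r^2) around each center."""
    Phi = q_exp(-beta * (x[:, None] - centers[None, :]) ** 2, q)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def rbf_predict(x, centers, w, beta, q):
    """Evaluate the fitted RBF network at new inputs."""
    Phi = q_exp(-beta * (x[:, None] - centers[None, :]) ** 2, q)
    return Phi @ w
```

With q → 1 this recovers the ordinary Gaussian RBF network; decreasing q below 1 shrinks each basis function's support toward a bounded neighborhood of its center.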


Thrust and Propellant Mixture Ratio Control of Open Type Liquid Propellant Rocket Engine (개방형 액체추진제로켓엔진의 추력 및 혼합비 제어)

  • Jung, Young-Suk;Lee, Jung-Ho;Oh, Seung-Hyub
    • Proceedings of the KSME Conference / 2007.05a / pp.1143-1148 / 2007
  • The liquid propellant rocket engine (LRE) is one of the important parts for controlling the motion of a rocket. To operate the rocket within the error bounds of the planned trajectory, it is necessary to control the thrust of the LRE according to the required thrust profile and to control the mixture ratio of the propellants fed into the combustor so that it remains constant. It is not easy to control the thrust and mixture ratio, since there is cross-interference among the components of the LRE. In this study, a dynamic model of the LRE was constructed and its dynamic characteristics were analyzed under two control systems: PID control and PID+Q-ILC (Iterative Learning Control with Quadratic Criterion) control. From the analysis, it could be observed that the PID+Q-ILC control logic is more useful than the standard PID control system for control of the LRE.
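
Quadratic-criterion ILC (also called norm-optimal ILC) can be sketched on a toy first-order plant. The lifted-system update below is the standard textbook form u_{k+1} = u_k + (P'QP + R)^-1 P'Q e_k; the plant and weight values are illustrative assumptions, not the engine model from the paper.

```python
import numpy as np

def lifted_plant(a, b, N):
    """Lifted (matrix) form P of the SISO plant x_{t+1} = a*x_t + b*u_t, y_t = x_{t+1},
    so that the whole trial satisfies y = P @ u."""
    P = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1):
            P[i, j] = (a ** (i - j)) * b
    return P

def q_ilc_gain(P, q_weight=1.0, r_weight=0.01):
    """Norm-optimal ILC learning gain L = (P'QP + R)^-1 P'Q, which minimizes
    the quadratic criterion e'Qe + du'R du from trial to trial."""
    N = P.shape[0]
    Qm = q_weight * np.eye(N)
    Rm = r_weight * np.eye(N)
    return np.linalg.solve(P.T @ Qm @ P + Rm, P.T @ Qm)

def run_ilc(a, b, y_ref, trials=20):
    """Repeat the trial, updating the whole input trajectory between trials."""
    N = len(y_ref)
    P = lifted_plant(a, b, N)
    L = q_ilc_gain(P)
    u = np.zeros(N)
    for _ in range(trials):
        e = y_ref - P @ u      # tracking error observed on this trial
        u = u + L @ e          # quadratic-criterion ILC update
    return y_ref - P @ u       # residual error after learning
```

Because the task repeats every trial, the error contracts toward zero even though the per-trial controller (here, none at all; in the paper, PID) is imperfect.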


Actor-Critic Reinforcement Learning System with Time-Varying Parameters

  • Obayashi, Masanao;Umesako, Kosuke;Oda, Tazusa;Kobayashi, Kunikazu;Kuremoto, Takashi
    • Proceedings of the Institute of Control, Robotics and Systems (ICROS) Conference / 2003.10a / pp.138-141 / 2003
  • Recently, reinforcement learning has attracted the attention of many researchers because of its simple and flexible learning ability in any environment. Many reinforcement learning methods have been proposed so far, such as Q-learning, actor-critic, and the stochastic gradient ascent method. A reinforcement learning system is able to adapt to changes of the environment through its interaction with it. However, when the environment changes periodically, it is not able to adapt well. In this paper, we propose a reinforcement learning system that is able to adapt to periodic changes of the environment by introducing time-varying adjustable parameters. It is shown through a simulation study of a maze problem, with an aisle that opens and closes periodically, that the proposed method works well in such an environment, whereas the conventional method with constant adjustable parameters does not.
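
The idea of time-varying parameters can be illustrated with a small actor-critic agent whose policy parameters are coefficients of periodic (sin/cos) features of time, so the policy itself can track a periodically changing environment. The bandit-style environment, the period, the features, and the learning rates below are all assumptions for illustration, not the paper's maze setup.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20                       # assumed period of the environment
n_actions = 2
W = np.zeros((n_actions, 3)) # per-action weights on [1, sin, cos] time features
v = 0.0                      # critic: scalar value baseline
alpha_actor, alpha_critic = 0.2, 0.1

def features(t):
    """Periodic time features that make the policy time-varying."""
    return np.array([1.0, np.sin(2 * np.pi * t / T), np.cos(2 * np.pi * t / T)])

def policy(t):
    """Softmax policy over actions, as a function of the phase of time."""
    h = W @ features(t)
    p = np.exp(h - h.max())
    return p / p.sum()

def reward(action, t):
    # environment that changes periodically: action 0 pays off in the
    # first half of each period, action 1 in the second half
    good = 0 if (t % T) < T // 2 else 1
    return 1.0 if action == good else 0.0

for t in range(4000):
    p = policy(t)
    a = int(rng.choice(n_actions, p=p))
    r = reward(a, t)
    td = r - v                         # TD error against the baseline
    v += alpha_critic * td             # critic update
    grad = -p[:, None] * features(t)[None, :]
    grad[a] += features(t)             # grad of log softmax probability
    W += alpha_actor * td * grad       # actor update on time-varying params
```

A conventional agent with constant parameters can only learn one fixed action preference, so its average reward in this environment is capped near 0.5; the time-varying policy can follow the period.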


Effects of Interactions and Affective Factors in On-line English Grammar Courses in Higher Education (온라인 대학영문법 강의에서 상호작용과 정의적 요인이 교육효과에 미치는 영향)

  • Park, Deok-Jae
    • The Journal of the Korea Contents Association / v.12 no.4 / pp.510-519 / 2012
  • The purpose of this study is to investigate how interactions and affective factors influence on-line English grammar courses in higher education. This study addressed the following questions: (1) How do interactions proceed in on-line English grammar courses? (2) Do affective factors influence effective learning in on-line English grammar courses? A questionnaire was administered to 170 college students who had taken the on-line English course of K University. The data analysis of 300 college students' responses to their courses showed that e-learning has both positive and negative effects compared with face-to-face classroom instruction. The analysis showed that 17% of students had negative opinions of e-learning, 49.3% had positive opinions, and 33.3% were in the middle. However, the results demonstrated that immediate feedback and affective factors can be facilitated through a Q&A bulletin board and a feedback program in on-line learning. The negative effects of on-line learning can be addressed by a planned and well-supported on-line approach that includes a theory-based instructional model, rather than simply switching to 'blended learning' that combines face-to-face classroom instruction with on-line learning.

Policy Modeling for Efficient Reinforcement Learning in Adversarial Multi-Agent Environments (적대적 멀티 에이전트 환경에서 효율적인 강화 학습을 위한 정책 모델링)

  • Kwon, Ki-Duk;Kim, In-Cheol
    • Journal of KIISE: Software and Applications / v.35 no.3 / pp.179-188 / 2008
  • An important issue in multiagent reinforcement learning is how an agent should learn its optimal policy through trial-and-error interactions in a dynamic environment where other agents can influence its performance. Most previous works on multiagent reinforcement learning tend to apply single-agent reinforcement learning techniques without any extensions, or are based on unrealistic assumptions even though they build and use explicit models of other agents. In this paper, the basic concepts that constitute the common foundation of multiagent reinforcement learning techniques are first formulated, and then, based on these concepts, previous works are compared in terms of characteristics and limitations. After that, a policy model of the opponent agent and a new multiagent reinforcement learning method using this model are introduced. Unlike previous works, the proposed method utilizes a policy model instead of a Q-function model of the opponent agent. Moreover, this learning method can improve learning efficiency by using a model simpler than richer but time-consuming policy models such as finite state machines (FSMs) and Markov chains. In this paper, the Cat and Mouse game is introduced as an adversarial multiagent environment, and the effectiveness of the proposed multiagent reinforcement learning method is analyzed through experiments using this game as a testbed.
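
A policy model of the opponent, as opposed to a Q-function model, can be as simple as conditional action frequencies. The sketch below is a hypothetical minimal version (the paper's model may be richer); the learning agent would then marginalize its own action values over these predicted opponent probabilities.

```python
import collections

class OpponentPolicyModel:
    """Frequency-count policy model of the opponent agent: estimates
    P(opponent action | state) from observed play, as a cheaper alternative
    to modeling the opponent's full Q-function, FSM, or Markov chain."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        # Laplace prior of 1 per action, so unseen states give a uniform guess
        self.counts = collections.defaultdict(lambda: [1] * n_actions)

    def observe(self, state, opp_action):
        """Record one observed opponent move in the given state."""
        self.counts[state][opp_action] += 1

    def predict(self, state):
        """Return the estimated opponent action distribution for a state."""
        c = self.counts[state]
        total = sum(c)
        return [x / total for x in c]
```

Because the model is just counts, it updates in constant time per observation, which is the efficiency argument the abstract makes against richer opponent models.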