• Title/Summary/Keyword: Policy Gradient

A Study for Detecting Fuel-cut Driving of Vehicle Using GPS (GPS를 이용한 차량 연료차단 관성주행의 감지에 관한 연구)

  • Ko, Kwang-Ho
    • Journal of Digital Convergence / v.17 no.11 / pp.207-213 / 2019
  • The fuel-cut coast-down driving mode is activated when the accelerator pedal is released with the transmission gear engaged, and it is a default function of electronically controlled vehicle engines. Fuel economy improves because fuel injection stops during fuel-cut driving. This study proposes a fuel-cut detection method based on speed, acceleration, and road gradient data from a GPS sensor. It detects the fuel-cut driving mode by comparing a calculated acceleration with the real-time acceleration: the former is estimated from the driving resistance under fuel-cut conditions, and the latter comes from the GPS sensor. The detection accuracy was about 80% when the method was verified with road driving data. The result was obtained from 9,600 data sets of vehicle speed, acceleration, fuel consumption, and road gradient recorded during a 16-minute, 12 km road test on a route with rather steep slopes. The method makes it easy to detect fuel-cut without the injector signal, which otherwise requires a wired connection. The detection error arises because the variation range of the speed, acceleration, and road gradient data used for the road resistance force is larger than that of the fuel consumption data.
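The comparison described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the resistance model (rolling, aerodynamic, slope, and engine drag terms), every coefficient default, the tolerance `tol`, and the function names `coastdown_accel` and `is_fuel_cut` are assumptions introduced here.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def coastdown_accel(speed_mps, grade, mass_kg=1500.0, c_rr=0.012,
                    rho=1.2, cd_a=0.7, engine_drag_n=300.0):
    """Estimate acceleration (m/s^2) while coasting in gear with fuel cut.

    All coefficient defaults are hypothetical placeholders, not values
    taken from the paper. `grade` is rise over run (0.05 = 5% uphill).
    """
    theta = math.atan(grade)
    rolling = c_rr * mass_kg * G * math.cos(theta)   # rolling resistance, N
    aero = 0.5 * rho * cd_a * speed_mps ** 2         # aerodynamic drag, N
    slope = mass_kg * G * math.sin(theta)            # gravity along the road, N
    return -(rolling + aero + slope + engine_drag_n) / mass_kg


def is_fuel_cut(speed_mps, measured_accel, grade, tol=0.15):
    """Flag fuel-cut when the measured acceleration matches the estimate."""
    expected = coastdown_accel(speed_mps, grade)
    return abs(measured_accel - expected) <= tol
```

In practice the tolerance would need tuning against labeled driving data, since GPS-derived acceleration and gradient are noisy.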

Distribution of Phytoplankton and Bacteria in the Environmental Transitional Zone of Tropical Mangrove Area (열대 홍수림 주변 해역 환경 전이대의 식물플랑크톤 및 박테리아의 분포)

  • Choi, Dong Han;Noh, Jae Hoon;Ahn, Sung Min;Lee, Charity M.;Kim, Dongseon;Kim, Kyung-Tae;Kwon, Moon-Sang;Park, Heung-Sik
    • Ocean and Polar Research / v.35 no.4 / pp.415-425 / 2013
  • In order to understand phytoplankton and bacterial distribution in tropical coral reef ecosystems in relation to the mangrove community, their biomass and activities were measured in the sea waters of the Chuuk and the Kosrae lagoons located in Micronesia. Chlorophyll a and bacterial abundance showed maximal values in the seawater near the mangrove forests, and then steeply decreased as the distance increased from the mangrove forests, indicating that environmental conditions for these microorganisms changed greatly in lagoon waters. Together with chlorophyll a, abundance of Synechococcus and phototrophic picoeukaryotes and a variety of indicator pigments for dinoflagellates, diatoms, green algae and cryptophytes also showed similar spatial distribution patterns, suggesting that phytoplankton assemblages respond to the environmental gradient by changing community compositions. In addition, primary production and bacterial production were also highest in the bay surrounded by mangrove forest and lowest outside of the lagoon. These results suggest that mangrove waters play an important role in energy production and nutrient cycling in tropical coasts, undoubtedly receiving large inputs of organic matter from shore vegetation such as mangroves. However, the steep decrease of biomass and production of phytoplankton and heterotrophic bacteria within a short distance from the bay to the level of oligotrophic waters indicates that the effect of mangrove waters does not extend far away.

Robot Locomotion via RLS-based Actor-Critic Learning (RLS 기반 Actor-Critic 학습을 이용한 로봇이동)

  • Kim, Jong-Ho;Kang, Dae-Sung;Park, Joo-Young
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.7 / pp.893-898 / 2005
  • Because it requires only a small amount of computation and can handle stochastic policies explicitly, the actor-critic algorithm, a class of reinforcement learning methods, has recently attracted considerable interest in the area of artificial intelligence. The actor-critic network consists of the actor network for selecting control inputs and the critic network for estimating value functions. In the training stage, the actor and critic networks adapt their parameters in order to select good control inputs and to approximate the value functions accurately as quickly as possible. In this paper, we consider a new actor-critic algorithm that employs an RLS (Recursive Least Squares) method for critic learning and policy gradients for actor learning. The applicability of the considered algorithm is illustrated with experiments on a two-link robot arm.
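As a rough illustration of the RLS idea mentioned in this abstract, the sketch below shows a generic recursive least squares update for a linear model. It is not the paper's critic: how the critic forms its features and targets is not stated here, and the class name `RLS`, the forgetting factor, and the initial covariance scale `delta` are assumptions.

```python
import numpy as np


class RLS:
    """Generic recursive least squares for a linear model y ~ theta . phi."""

    def __init__(self, n_features, forgetting=1.0, delta=1e3):
        self.theta = np.zeros(n_features)      # weight estimate
        self.P = delta * np.eye(n_features)    # inverse-correlation estimate
        self.lam = forgetting                  # 1.0 means no forgetting

    def update(self, phi, target):
        """Incorporate one (feature vector, target) pair; return the prior error."""
        phi = np.asarray(phi, dtype=float)
        P_phi = self.P @ phi
        gain = P_phi / (self.lam + phi @ P_phi)
        error = target - self.theta @ phi
        self.theta = self.theta + gain * error
        # P is symmetric, so phi^T P equals (P phi)^T.
        self.P = (self.P - np.outer(gain, P_phi)) / self.lam
        return error
```

A critic built this way would call `update` once per observed transition, with a target derived from the reward and the next state's value estimate; that choice is left open here.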

Evaluation of Human Demonstration Augmented Deep Reinforcement Learning Policy Optimization Methods Using Object Manipulation with an Anthropomorphic Robot Hand (휴먼형 로봇 손의 사물 조작 수행을 이용한 인간 행동 복제 강화학습 정책 최적화 방법 성능 평가)

  • Park, Na Hyeon;Oh, Ji Heon;Ryu, Ga Hyun;Anazco, Edwin Valarezo;Lopez, Patricio Rivera;Won, Da Seul;Jeong, Jin Gyun;Chang, Yun Jung;Kim, Tae-Seong
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.858-861 / 2020
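  • For a robot to perform diverse and complex object manipulation as humans do, grasping with an anthropomorphic robot hand is essential. To train anthropomorphic robot hands with a high number of degrees of freedom (DoF), reinforcement learning optimization methods combined with human demonstrations have been proposed. In this study, we confirm the efficiency of behavior cloning by comparing the performance of Demonstration Augmented Natural Policy Gradient (DA-NPG), a reinforcement learning optimization method combined with human demonstrations, against NPG. We also carry out object manipulation tasks with an anthropomorphic robot hand on six types of objects to evaluate the optimization methods DA-NPG, DA-Trust Region Policy Optimization (DA-TRPO), and DA-Proximal Policy Optimization (DA-PPO). The comparison of DA-NPG and NPG demonstrated that behavior cloning is efficient for reinforcement learning of object manipulation with an anthropomorphic robot hand. In addition, DA-NPG showed performance similar to DA-TRPO while successfully grasping all objects, making it the most stable. In contrast, DA-TRPO and DA-PPO failed to manipulate some objects and showed unstable performance. The method proposed in this study is expected to be useful for developing object manipulation intelligence for anthropomorphic robot hands when applied to real anthropomorphic robots in the future.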

Research on the Polarization Effects of the Shandong Processing Trade and Strategy to Coordinate Its Development

  • Xiao, Dan Dan
    • Asian Journal of Business Environment / v.3 no.2 / pp.17-22 / 2013
  • Purpose - This dissertation is based on previous research, and analyzes processing trade, which constitutes a major section of foreign trade in Shandong Province. Research design, data, and methodology - The study uses survey data on polarization, which is a vital index reflecting the unbalanced growth of regional economic development. The article introduces the processing trade polarization index and the processing trade polarization fluctuation rate to predict the geographical polarization posture and development trends in Shandong Province. Results - The development of processing trade in Shandong Province shows a gradient from east to west. The first-line growth pole has been formed and developed, and the initial formation of the diffusion mechanism has taken place. However, coordination problems accompanying regional development have become increasingly prominent. Conclusions - This study focuses on the development of processing trade strategy and suggests overall coordination of development objectives, using non-balanced development goals. According to the regional characteristics and development objectives of the processing trade in Shandong Province, the region around the city is divided into four levels (innovation diffusion regions, enhanced growth areas, areas expected to undertake development, and areas to upgrade), each with different policy proposals.

Groundwater Modeling for Estimating Water Balance over Pyosun Watershed in Jeju Island (제주도 표선유역의 물수지 평가를 위한 지하수 유동 모델링)

  • Song, Sung-Ho;Lee, Gyu-Sang;An, Jung-Gi;Jeon, Sun-Geum;Yi, Myung-Jae
    • Journal of Environmental Science International / v.24 no.4 / pp.495-504 / 2015
  • To estimate the water balance of the Pyosun watershed in Jeju Island, the three-dimensional finite difference model MODFLOW was applied. The accuracy of the groundwater flow modeling was evaluated by comparing the recharge rate from the flow model with the existing one from a water balance model. The modeling result under the steady-state condition indicates that groundwater flows from Mt. Halla toward the South Sea and that the groundwater gradient gradually decreases with elevation. The annual recharge rate in the Pyosun watershed calculated by the groundwater flow modeling was 236 million $m^3/year$, only slightly lower than the recharge rate of 238 million $m^3/year$ from the existing water balance model. Therefore, groundwater flow modeling turned out to be useful for estimating the recharge rate in the Pyosun watershed, and it could support groundwater management policy for the watershed in the future.

The Protocol of Basic Body Ability for 4D Cycling System (4D 사이클링에 대한 기초 신체능력 프로토콜)

  • Kim, Ki-Bong;Lee, Sung-Han
    • Journal of Digital Convergence / v.11 no.11 / pp.313-320 / 2013
  • This paper introduces a four-dimensional cycle simulator that can recognize, in virtual reality, whether the road is ascending or descending, its gradient, and the condition of its surface. On the basis of this recognition, various virtual motion path situations are displayed on an LCD monitor mounted in front of the rider. These situations may provide not only a sense of reality but also the interest of a game, in contrast to previous exercise cycles, which can be monotonous. This paper addresses both the analysis and the estimation of basic bodily abilities for four-dimensional cycling.

Designing an Efficient Reward Function for Robot Reinforcement Learning of The Water Bottle Flipping Task (보틀플리핑의 로봇 강화학습을 위한 효과적인 보상 함수의 설계)

  • Yang, Young-Ha;Lee, Sang-Hyeok;Lee, Cheol-Soo
    • The Journal of Korea Robotics Society / v.14 no.2 / pp.81-86 / 2019
  • Robots are used at various industrial sites, but traditional methods of operating a robot are limited for some kinds of tasks. For a robot to accomplish a task, an accurate formulation of the relationship between the robot and its environment must be found and solved, which is complicated work. Accordingly, reinforcement learning for robots is being actively studied to overcome this difficulty. This study describes the process and results of applying reinforcement learning to such a task. The task the robot learns is bottle flipping, an activity that involves throwing a plastic bottle in an attempt to land it upright on its bottom. The complex movement of the liquid in the bottle while it is in the air makes this task difficult to solve in traditional ways, and reinforcement learning makes it easier. After a 3-DOF robotic arm is shown how to throw the bottle, the robot finds a better motion that succeeds at the task. Two reward functions are designed and their learning results are compared. A finite difference method is used to obtain the policy gradient. This paper focuses on the process of designing an efficient reward function to improve the bottle flipping motion.
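The finite difference approach to policy gradients named in this abstract can be sketched as below. This is a minimal illustration under stated assumptions: it uses central differences on a deterministic toy reward, whereas the paper's exact variant, policy parameterization, and reward functions are not given here. The names `finite_difference_gradient`, `improve`, and `toy_reward`, and the target values inside `toy_reward`, are hypothetical.

```python
def finite_difference_gradient(evaluate, params, eps=1e-2):
    """Estimate d(reward)/d(params) by perturbing one parameter at a time."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grad.append((evaluate(plus) - evaluate(minus)) / (2 * eps))
    return grad


def improve(evaluate, params, step=0.1, iterations=200):
    """Plain gradient ascent on the reward using the estimate above."""
    params = list(params)
    for _ in range(iterations):
        grad = finite_difference_gradient(evaluate, params)
        params = [p + step * g for p, g in zip(params, grad)]
    return params


def toy_reward(params):
    """Hypothetical stand-in for a rollout: best at speed 3.0 and angle 0.5."""
    speed, angle = params
    return -((speed - 3.0) ** 2 + (angle - 0.5) ** 2)
```

On a real robot each call to `evaluate` would be one or more physical throws, so the reward is noisy and the number of perturbations per update becomes the main cost.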

Predicting Reports of Theft in Businesses via Machine Learning

  • JungIn, Seo;JeongHyeon, Chang
    • International Journal of Advanced Culture Technology / v.10 no.4 / pp.499-510 / 2022
  • This study examines the reporting factors of crime against business in Korea and proposes a corresponding predictive model using machine learning. While many previous studies focused on the individual factors of theft victims, there is a lack of evidence on the reporting factors of crime against a business that serves the public good as opposed to those that protect private property. Therefore, we proposed a crime prevention model for the willingness factor of theft reporting in businesses. This study used data collected through the 2015 Commercial Crime Damage Survey conducted by the Korea Institute for Criminal Policy. It analyzed data from 834 businesses that had experienced theft during a 2016 crime investigation. The data showed a problem with unbalanced classes. To solve this problem, we jointly applied the Synthetic Minority Over Sampling Technique and the Tomek link techniques to the training data. Two prediction models were implemented. One was a statistical model using logistic regression and elastic net. The other involved a support vector machine model, tree-based machine learning models (e.g., random forest, extreme gradient boosting), and a stacking model. As a result, the features of theft price, invasion, and remedy, which are known to have significant effects on reporting theft offences, can be predicted as determinants of such offences in companies. Finally, we verified and compared the proposed predictive models using several popular metrics. Based on our evaluation of the importance of the features used in each model, we suggest a more accurate criterion for predicting var.

Enhancing VANET Security: Efficient Communication and Wormhole Attack Detection using VDTN Protocol and TD3 Algorithm

  • Vamshi Krishna. K;Ganesh Reddy K
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.1 / pp.233-262 / 2024
  • Due to the rapid evolution of vehicular ad hoc networks (VANETs), effective communication and security are now essential components in providing secure and reliable vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. However, due to their dynamic nature and potential threats, VANETs need to have strong security mechanisms. This paper presents a novel approach to improve VANET security by combining the Vehicular Delay-Tolerant Network (VDTN) protocol with the Deep Reinforcement Learning (DRL) technique known as the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. A store-carry-forward method is used by the VDTN protocol to resolve the problems caused by inconsistent connectivity and disturbances in VANETs. The TD3 algorithm is employed for capturing and detecting Worm Hole Attack (WHA) behaviors in VANETs, thereby enhancing security measures. By combining these components, it is possible to create trustworthy and effective communication channels as well as successfully detect and stop rushing attacks inside the VANET. Extensive evaluations and simulations demonstrate the effectiveness of the proposed approach, enhancing both security and communication efficiency.