Kuznetsov et al., 2021 - Google Patents

Solving continuous control with episodic memory

Document ID: 16502663327011426567
Author: Kuznetsov I, Filchenkov A
Publication year: 2021
Publication venue: arXiv preprint arXiv:2106.08832

Snippet

Episodic memory lets reinforcement learning algorithms remember and exploit promising experience from the past to improve agent performance. Previous works on memory mechanisms show benefits of using episodic-based data structures for discrete action …

Classifications

The classifications are assigned by a computer and are not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the classifications listed.
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Investment, e.g. financial instruments, portfolio management or fund management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management

Similar Documents

Publication | Publication Date | Title
Kuznetsov et al. | | Solving continuous control with episodic memory
Hambly et al. | | Recent advances in reinforcement learning in finance
Chen et al. | | Application of deep reinforcement learning on automated stock trading
Rose et al. | | A reinforcement learning approach to rare trajectory sampling
Liu et al. | | Practical deep reinforcement learning approach for stock trading
Chen et al. | | Generative inverse deep reinforcement learning for online recommendation
Li et al. | | Minimax-optimal multi-agent RL in Markov games with a generative model
Liu et al. | | Prioritized experience replay based on multi-armed bandit
US20240169237A1 (en) | | A computer implemented method for real time quantum compiling based on artificial intelligence
Bhambri et al. | | Reinforcement learning methods for Wordle: A POMDP/adaptive control approach
Cini et al. | | Deep reinforcement learning with weighted Q-Learning
Taveeapiradeecharoen et al. | | Dynamic model averaging for daily forex prediction: A comparative study
Shi et al. | | Multi actor hierarchical attention critic with RNN-based feature extraction
Shakya et al. | | A deep reinforcement learning approach for inventory control under stochastic lead time and demand
Chua et al. | | FedPEAT: Convergence of federated learning, parameter-efficient fine tuning, and emulator assisted tuning for artificial intelligence foundation models with mobile edge computing
Karda et al. | | Automation of noise sampling in deep reinforcement learning
Tokmak et al. | | PACSBO: Probably approximately correct safe Bayesian optimization
Nguyen et al. | | Nonmyopic multifidelity active search
Nabati et al. | | Representation-driven reinforcement learning
Yang et al. | | Continuous control for searching and planning with a learned model
Bossens et al. | | Lifetime policy reuse and the importance of task capacity
Izadi et al. | | Using rewards for belief state updates in partially observable Markov decision processes
Refael et al. | | LORENZA: Enhancing generalization in low-rank gradient LLM training via efficient zeroth-order adaptive SAM
Marfaing et al. | | Computer-assisted fraud detection, from active learning to reward maximization
Yin et al. | | Hashing over predicted future frames for informed exploration of deep reinforcement learning
