Saving this to read when I have time...

https://arxiv.org/abs/2202.05607
Online Decision Transformer
Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling. However, any practical instantiatio