Reinforcement Learning

leomao, bobogei81123, step5

Games

CartPole

  • Observation: 4-dimensional
  • Action Space: 2 actions
  • Reward:
OpenAI Evaluation

Acrobot

  • Observation: 4-dimensional
  • Action Space: 3 actions
  • Reward:

Dodge (Eat) Bullets

  • Observation: $120 \times 60$ RGB pixels
  • Action Space: 2 actions
  • Reward: each bullet hit gives -1 in the dodge variant (+1 in the eat variant)

2D Food Eating

  • Observation: $40 \times 40$ RGB pixels
  • Action Space: 4 actions
  • Reward: +1 for each food item eaten / -1 for each wall collision

Simplified 特訓 99 (Special Training 99)

  • Observation: $100 \times 100$ RGB pixels $\times 4$ frames
  • Action Space: 5 actions
  • Reward: -1 for each bullet hit
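
The 4-frame observation above can be maintained with a small rolling buffer. The following is a minimal sketch, not the authors' implementation: the class name `FrameStack` and its methods are hypothetical, and filling the buffer with copies of the first frame at episode start is an assumption. Frames are treated as opaque objects, so the same code works for image arrays.

```python
from collections import deque


class FrameStack:
    """Keeps the k most recent frames as one observation (hypothetical helper)."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # Assumption: at episode start, fill the stack with copies of the first frame.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.observation()

    def push(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Oldest frame first, newest frame last.
        return tuple(self.frames)
```

Stacking consecutive frames lets the network see motion (for example, bullet direction) that a single still image does not show.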

Approaches

DQN

  • CartPole, Acrobot → standard DQN
  • Image-based mini-games → CNN-based DQN
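
Both variants share the same learning rule and differ only in the network that produces the Q-values. As a rough sketch of that shared core, the code below shows the one-step Q-learning target and epsilon-greedy action selection in plain Python. The function names, the discount factor `gamma=0.99`, and the way `epsilon` is passed in are illustrative assumptions, not values taken from these slides. In a typical DQN setup `next_q_values` would come from a separate target network; whether that was done here is not stated.

```python
import random


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    gamma is an assumed discount factor, not a value from the slides.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Index of the largest Q-value (first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

For CartPole, `q_values` would have 2 entries; for Simplified 特訓 99 it would have 5, one per action.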

Important Observations

    Results

    Discussion & Questions