[Understanding Reinforcement Learning the Hard Way] Tutorial, Part 4-5

This time: Part 5: Visualizing an Agent’s Thoughts and Actions

Overview

it is just as important to understand how, and even more critically, why that agent is behaving in a certain way.

We can think of this visualization as providing a portal into the “thought process” of our agent. Does it know that it is in a good position when it is in a good position? Does it know that going down was a good thing to do when it went down? This can give us the insights needed to understand why our agent might not be performing ideally as we train it under different circumstances in different environments.
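One way to make these two questions concrete is to split the agent's Q-values into a state value ("am I in a good position?") and per-action advantages ("was going down a good choice?"). The following is only a minimal sketch: the four action names, the made-up Q-values, and the choice of the mean as the state value are all assumptions for illustration, not something taken from the tutorial's code.

```python
# Illustrative only: action names and Q-values are made up.
ACTIONS = ["up", "down", "left", "right"]


def decompose(q_values):
    """Split Q(s, a) into a state value and per-action advantages.

    Assumption: the state value is taken as the mean of the Q-values,
    so the advantages sum to zero.
    """
    value = sum(q_values) / len(q_values)
    advantages = [q - value for q in q_values]
    return value, advantages


q_example = [0.2, 1.0, 0.4, 0.4]  # hypothetical Q-values for one state
value, advantages = decompose(q_example)
best_action = ACTIONS[advantages.index(max(advantages))]

print("state value:", value)
for name, adv in zip(ACTIONS, advantages):
    print(f"advantage[{name}] = {adv:+.2f}")
print("preferred action:", best_action)
```

A high state value with a clearly positive advantage for one action suggests the agent "knows" both that the position is good and which move makes it so.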

Getting Inside Our Agent’s Head

Not only can we use the interface to explore how the agent does during training, we can also use it for testing and debugging our fully trained agents.

I still haven't figured out how to debug with it, but it would be great if it could be used for debugging!

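After training, I tested the agent: when only green squares appeared, the value of actions that move toward green went up, and when only red squares appeared, the value of actions that stay away from them became higher.

And when both green and red were removed entirely, the agent moved at random.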

So now we have a rough idea of how to test it, lol.
Just be mean to the agent, lol.
By the way, these mean cases do not appear to be included in the training data.
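The "be mean to the agent" test can be sketched as a small probe harness: build grids the agent never saw in training (only green, only red, nothing at all) and look at which actions get the highest value. Note that `toy_q` below is a hand-written heuristic standing in for the trained network, and the 5x5 grid, the positions, and the function names are all hypothetical. With a real agent you would replace `toy_q` with a call to its Q-network.

```python
# Probe harness sketch. toy_q is NOT a trained agent; it is a hypothetical
# stand-in so the example runs on its own.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SIZE = 5  # assumed grid size


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def toy_q(hero, greens, reds):
    """Score each move: closer to green is better, closer to red is worse."""
    q = {}
    for name, (dr, dc) in MOVES.items():
        # Clamp the next position so the hero stays inside the grid.
        nxt = (min(max(hero[0] + dr, 0), SIZE - 1),
               min(max(hero[1] + dc, 0), SIZE - 1))
        attract = sum(1.0 / (1 + manhattan(nxt, g)) for g in greens)
        repel = sum(1.0 / (1 + manhattan(nxt, r)) for r in reds)
        q[name] = attract - repel
    return q


def greedy_actions(q):
    """Return every action tied for the highest value."""
    best = max(q.values())
    return sorted(a for a, v in q.items() if v == best)


hero = (2, 2)
probes = {
    "only green": toy_q(hero, greens=[(4, 2)], reds=[]),
    "only red": toy_q(hero, greens=[], reds=[(4, 2)]),
    "empty grid": toy_q(hero, greens=[], reds=[]),
}
for name, q in probes.items():
    print(f"{name}: best actions = {greedy_actions(q)}")
```

In the empty grid every action ties, so an agent that breaks ties randomly would wander around, which matches the random movement observed above.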

While we may explicitly only think of the green as being rewarding and the red as being punishing, we are subconsciously constraining our actions by a desire to finish quickly.

Using the Control Center

The agent’s performance you will see was pretrained on the gridworld task for 40,000 episodes.

The actual Control Center is here.

The Control Center is a piece of software I plan to continue to develop as I work more with various Reinforcement Learning algorithms.
