Network Construction
# Actor-Critic network
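The body of the network is not shown in these notes, so the following is only a minimal sketch of the two parts an Actor-Critic model usually has: a policy head (actor) that outputs action probabilities and a value head (critic) that outputs V(s). The class name `ActorCritic`, the linear layers, and the sizes are assumptions for illustration. It uses plain Python instead of a deep learning framework so that it runs without dependencies.

```python
import math
import random


class ActorCritic:
    """Hypothetical linear actor-critic: softmax policy head plus scalar value head."""

    def __init__(self, state_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # Actor: one row of state_dim weights per action (produces logits).
        self.actor_w = [[rng.uniform(-0.1, 0.1) for _ in range(state_dim)]
                        for _ in range(n_actions)]
        # Critic: a single linear map from the state to V(s), initialised to zero.
        self.critic_w = [0.0] * state_dim

    def policy(self, state):
        """Return pi(a|s) using a numerically stable softmax over the logits."""
        logits = [sum(w * s for w, s in zip(row, state)) for row in self.actor_w]
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def value(self, state):
        """Return the critic's estimate V(s)."""
        return sum(w * s for w, s in zip(self.critic_w, state))


net = ActorCritic(state_dim=4, n_actions=2)
probs = net.policy([0.1, -0.2, 0.05, 0.0])
```

In a real implementation both heads would typically be small neural networks, possibly sharing a feature layer.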
Sampling Process
for i in range(1000):  # 1000 episodes
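The loop body is cut off above, so here is a hedged sketch of what an episode sampling loop of this shape might look like. `CorridorEnv`, `sample_episode`, and the uniform random policy are hypothetical stand-ins, not the environment or agent used in the original notes.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..4, start at cell 2, reward 1 at cell 4."""

    def reset(self):
        self.pos = 2
        self.steps = 0
        return self.pos

    def step(self, action):
        # Action 1 moves right, any other action moves left.
        self.pos += 1 if action == 1 else -1
        self.steps += 1
        reward = 1.0 if self.pos == 4 else 0.0
        # The episode ends at either wall or after 50 steps.
        done = self.pos in (0, 4) or self.steps >= 50
        return self.pos, reward, done


def sample_episode(env, choose_action):
    """Roll out one episode and return its (s, a, r, s', done) transitions."""
    state = env.reset()
    done = False
    transitions = []
    while not done:
        action = choose_action(state)
        next_state, reward, done = env.step(action)
        # An online agent would call update(state, action, reward, next_state, done) here.
        transitions.append((state, action, reward, next_state, done))
        state = next_state
    return transitions


rng = random.Random(0)
env = CorridorEnv()
returns = []
for i in range(1000):  # 1000 episodes
    episode = sample_episode(env, lambda s: rng.randrange(2))
    returns.append(sum(t[2] for t in episode))
```

Each transition has the same five fields that the `update` method in the following sections takes as arguments.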
Training Process: Actor-Critic
def update(self, state, action, reward, next_state, done):
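Only the signature of `update` survives in these notes. As an illustration, the sketch below shows one common form of the one-step Actor-Critic update. The critic moves V(s) toward the bootstrapped target r + γV(s'), and the actor weights the log-probability gradient by that same target as a rough estimate of Q(s, a). The tabular agent, the hyperparameters, and the choice of weighting are all assumptions, and the original code may differ.

```python
import math


class ActorCriticAgent:
    """Hypothetical tabular actor-critic with a softmax policy over action preferences."""

    def __init__(self, n_states, n_actions, gamma=0.9, actor_lr=0.1, critic_lr=0.1):
        self.gamma = gamma
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.prefs = [[0.0] * n_actions for _ in range(n_states)]  # policy logits
        self.values = [0.0] * n_states                             # V(s)

    def policy(self, state):
        row = self.prefs[state]
        top = max(row)
        exps = [math.exp(p - top) for p in row]
        total = sum(exps)
        return [e / total for e in exps]

    def update(self, state, action, reward, next_state, done):
        # Bootstrapped target: no future value once the episode has ended.
        next_value = 0.0 if done else self.values[next_state]
        td_target = reward + self.gamma * next_value
        td_error = td_target - self.values[state]

        # Actor step: d log pi(a|s) / d logit_b = 1[b == a] - pi(b|s),
        # weighted here by the target (assumed Q estimate), not by an advantage.
        probs = self.policy(state)
        for b in range(len(probs)):
            grad_log_pi = (1.0 if b == action else 0.0) - probs[b]
            self.prefs[state][b] += self.actor_lr * td_target * grad_log_pi

        # Critic step: move V(s) toward the target.
        self.values[state] += self.critic_lr * td_error
        return td_error


agent = ActorCriticAgent(n_states=5, n_actions=2)
delta = agent.update(state=3, action=1, reward=1.0, next_state=4, done=True)
```

After this single rewarded step, the preference for action 1 in state 3 goes up and V(3) moves from 0 toward 1.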
Training A2C (Advantage Actor-Critic)
def update(self, state, action, reward, next_state, done):
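The A2C update body is also missing here. A minimal sketch, assuming the usual one-step advantage estimate A(s, a) ≈ r + γV(s') − V(s): the actor's gradient is weighted by the advantage instead of a raw return estimate, which uses the critic as a baseline to reduce variance. The tabular `A2CAgent` and its hyperparameters are hypothetical.

```python
import math


class A2CAgent:
    """Hypothetical tabular advantage actor-critic (one-step advantage estimate)."""

    def __init__(self, n_states, n_actions, gamma=0.9, actor_lr=0.1, critic_lr=0.1):
        self.gamma = gamma
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.prefs = [[0.0] * n_actions for _ in range(n_states)]  # policy logits
        self.values = [0.0] * n_states                             # V(s)

    def policy(self, state):
        row = self.prefs[state]
        top = max(row)
        exps = [math.exp(p - top) for p in row]
        total = sum(exps)
        return [e / total for e in exps]

    def update(self, state, action, reward, next_state, done):
        next_value = 0.0 if done else self.values[next_state]
        # Advantage: how much better the outcome was than the critic expected.
        advantage = reward + self.gamma * next_value - self.values[state]

        # Actor step: raise pi(a|s) if advantage > 0, lower it if advantage < 0.
        probs = self.policy(state)
        for b in range(len(probs)):
            grad_log_pi = (1.0 if b == action else 0.0) - probs[b]
            self.prefs[state][b] += self.actor_lr * advantage * grad_log_pi

        # Critic step: the advantage doubles as the TD error for V(s).
        self.values[state] += self.critic_lr * advantage
        return advantage


agent = A2CAgent(n_states=5, n_actions=2)
agent.values[3] = 0.4  # pretend the critic already expects some reward from state 3
good = agent.update(state=3, action=1, reward=1.0, next_state=4, done=True)

other = A2CAgent(n_states=5, n_actions=2)
other.values[3] = 0.4
bad = other.update(state=3, action=1, reward=0.0, next_state=4, done=True)
```

The second call shows the effect of the baseline. A zero reward in a state where the critic expected 0.4 gives a negative advantage, so the probability of the chosen action goes down.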