Two most-related fields: (1) Operations research; (2) Control theory (classical / adaptive)
- The main difference: these fields typically assume the environment (the plant) can and should be directly approximated.
- In the vast majority of RL methods, we don't want to model the environment (the plant).
Question: When should you use RL?
- As a last resort: if your problem is not well suited to standard control methods, and it is not a supervised-learning problem, that is, the feedback is evaluative, the problem is sequential, and you don't expect the environment to be easy to model, then it is the perfect setting for reinforcement learning.
The key properties of RL: 1. Evaluative feedback. 2. Sequential decisions.
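The evaluative-feedback property can be sketched with a bandit-style toy problem (my own hypothetical example, not from the lecture): instructive feedback would tell the agent which action was correct, while evaluative feedback only scores the action actually taken, so the agent must estimate values from its own experience. The arm means `true_means` are assumed values the agent never observes directly.

```python
import random

# Hypothetical 3-armed bandit: the agent gets only evaluative feedback,
# i.e., a noisy score for the arm it pulled, never the best arm's label.
true_means = [0.2, 0.5, 0.8]  # assumed; unknown to the agent

def pull(arm):
    """Return a noisy reward for the chosen arm (evaluative feedback)."""
    return true_means[arm] + random.gauss(0, 0.1)

random.seed(0)
estimates = [0.0, 0.0, 0.0]  # the agent's running-mean value estimates
counts = [0, 0, 0]
for _ in range(300):
    arm = random.randrange(3)  # uniform exploration, kept simple for the sketch
    r = pull(arm)
    counts[arm] += 1
    estimates[arm] += (r - estimates[arm]) / counts[arm]  # incremental mean

best = max(range(3), key=lambda a: estimates[a])  # agent's best guess
```

After enough pulls the estimates concentrate around the true means, and `best` identifies the highest-value arm even though no step ever told the agent which arm was correct.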
The agent-environment interaction may require deliberately taking poor actions repeatedly in order to reach a state where a very large reward is available (e.g., years of undergraduate and graduate school before the payoff). That is why the sequential property matters.
The difference between a state set (in RL) and a state space: a space is a set that has some additional structure, such as a notion of distance. For example, a space of vectors where we use Euclidean distance to measure distances between them. Sets are more general: every space has a set underneath.
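The set-versus-space distinction can be made concrete in code (a sketch with made-up states): a state set is just a bag of labels with no geometry, while a state space of vectors comes with a distance function.

```python
import math

# A state *set*: just a collection of labels; no distance is defined
# between "hungry" and "asleep".
state_set = {"hungry", "full", "asleep"}

# A state *space*: vectors in R^2, with Euclidean distance as the
# extra structure layered on top of the underlying set of points.
def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

s1, s2 = (0.0, 0.0), (3.0, 4.0)
euclidean(s1, s2)  # → 5.0
```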