Class that maintains the current estimated value of each action and provides a method for returning that value.
actions | Stores the current action IDs with their current estimated value and step count: {action_id: [est_value, step_count]}. |
estimate_value | A reference to the estimation method chosen by the est_method_id. |
def rl_logic.ValueEstimator.__init__ (self, est_method_id = "sample_average")
Constructor.
- Parameters
-
self | The object pointer. |
est_method_id | ID of the method used to calculate the estimated value (default: "sample_average"). |
def rl_logic.ValueEstimator.delete_action_id (self, action_id)
Delete an action_id, together with its stored estimate and step count, from the current actions.
- Parameters
-
self | The object pointer. |
action_id | ID of the action being deleted. |
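As a rough illustration of the deletion, the sketch below operates on a plain dict with the documented {action_id: [est_value, step_count]} layout. The free function and its behaviour for unknown IDs (silently ignoring them) are assumptions; the real method may raise an error instead.

```python
def delete_action_id(actions, action_id):
    """Remove action_id and its [est_value, step_count] entry, if present."""
    # Assumption: unknown IDs are ignored. Raising KeyError would be an
    # equally plausible choice for the real class.
    actions.pop(action_id, None)


actions = {"left": [0.5, 2], "right": [1.0, 1]}
delete_action_id(actions, "left")
```

After the call only the "right" entry remains in the dict.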
def rl_logic.ValueEstimator.estimate_value (self)
Main method for calculating the estimated value.
The constructor rebinds it to the estimation method selected by est_method_id.
- Parameters
-
action_id | Identifier of the action. |
reward | Value of the assigned reward. |
- Returns
- Current estimated value.
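One way the constructor could select the estimation method is sketched below. The class name ValueEstimatorSketch, the lookup dict, and the ValueError for an unknown ID are all assumptions for illustration; this page does not show how rl_logic.ValueEstimator actually resolves est_method_id.

```python
class ValueEstimatorSketch:
    """Hypothetical stand-in for rl_logic.ValueEstimator (illustration only)."""

    def __init__(self, est_method_id="sample_average"):
        # {action_id: [est_value, step_count]}, as documented for `actions`.
        self.actions = {}
        # Assumed lookup table from method ID to bound method.
        methods = {"sample_average": self.estimate_value_by_sample_average}
        if est_method_id not in methods:
            raise ValueError(f"unknown est_method_id: {est_method_id!r}")
        # The instance attribute shadows the class-level placeholder below.
        self.estimate_value = methods[est_method_id]

    def estimate_value(self, action_id, reward):
        # Placeholder; replaced per instance in __init__.
        raise NotImplementedError

    def estimate_value_by_sample_average(self, action_id, reward):
        # Assumption: unseen actions start at estimate 0.0 with zero steps.
        est_value, step_count = self.actions.get(action_id, [0.0, 0])
        step_count += 1
        est_value += (reward - est_value) / step_count
        self.actions[action_id] = [est_value, step_count]
        return est_value
```

Callers then use estimate_value(action_id, reward) without knowing which estimation method is active.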
def rl_logic.ValueEstimator.estimate_value_by_sample_average (self, action_id, reward)
Estimate the value using the simple "sample average" method.
The method is described in R. Sutton and A. Barto's book Reinforcement Learning: An Introduction.
- Parameters
-
self | The object pointer. |
action_id | ID of the chosen action. |
reward | Reward received for the chosen action. |
- Returns
- Estimated value as a float.
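The sample-average estimate is the mean of all rewards received so far for an action, and it can be updated incrementally as Q_n = Q_{n-1} + (R_n - Q_{n-1}) / n. The sketch below applies that update to a dict with the documented {action_id: [est_value, step_count]} layout. The free function sample_average_update and the initial estimate of 0.0 for unseen actions are assumptions, not taken from the rl_logic source.

```python
def sample_average_update(actions, action_id, reward):
    """Update `actions` in place and return the new estimate as a float."""
    # Assumption: unseen actions start at estimate 0.0 with zero steps.
    est_value, step_count = actions.get(action_id, [0.0, 0])
    step_count += 1
    # Incremental mean: Q_n = Q_{n-1} + (R_n - Q_{n-1}) / n
    est_value += (reward - est_value) / step_count
    actions[action_id] = [est_value, step_count]
    return float(est_value)


actions = {}
for r in (1.0, 0.0, 2.0):
    q = sample_average_update(actions, "left", r)
```

After the three rewards 1.0, 0.0, and 2.0, the estimate for "left" equals their mean, 1.0, with a step count of 3. The incremental form avoids storing the full reward history.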
rl_logic.ValueEstimator.actions |
Stores the current action IDs with their current estimated value and step count: {action_id: [est_value, step_count]}.
rl_logic.ValueEstimator.estimate_value |
A reference to the estimation method chosen by the est_method_id.
The documentation for this class was generated from the following file: