xpag.agents.flax_agents.sac.sac_from_jaxrl.SACLearner
- class SACLearner(seed, observations, actions, actor_lr, critic_lr, temp_lr, hidden_dims, discount, tau, target_update_period, target_entropy, backup_entropy, init_temperature, init_mean, policy_final_fc_init_scale)
Bases: object
An implementation of the Soft Actor-Critic (SAC) variant described in https://arxiv.org/abs/1812.05905.
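As a rough illustration of how constructor arguments such as discount, tau and backup_entropy usually enter a SAC update, the sketch below computes a soft Bellman backup target and a Polyak-averaged target-network update on plain Python floats. It is not the code used by SACLearner: the function names soft_target_update and sac_backup_target, their signatures, and the scalar "parameters" are hypothetical and chosen only for illustration.

```python
from typing import Dict


def soft_target_update(params: Dict[str, float],
                       target_params: Dict[str, float],
                       tau: float) -> Dict[str, float]:
    """Polyak averaging (hypothetical helper): move each target
    parameter a fraction tau toward the matching online parameter."""
    return {name: tau * params[name] + (1.0 - tau) * target_params[name]
            for name in target_params}


def sac_backup_target(reward: float, mask: float,
                      next_q1: float, next_q2: float,
                      next_log_prob: float, temperature: float,
                      discount: float,
                      backup_entropy: bool = True) -> float:
    """Soft Bellman target for one transition (hypothetical helper).

    mask is 0.0 for a terminal transition and 1.0 otherwise.
    """
    # Clipped double-Q: use the smaller of the two target critic estimates.
    next_q = min(next_q1, next_q2)
    target = reward + discount * mask * next_q
    if backup_entropy:
        # Entropy bonus: subtract temperature * log pi(a'|s') from the backup.
        target -= discount * mask * temperature * next_log_prob
    return target


# Example call with made-up numbers.
new_target = soft_target_update({"w": 1.0}, {"w": 0.0}, tau=0.005)
y = sac_backup_target(reward=1.0, mask=1.0, next_q1=2.0, next_q2=3.0,
                      next_log_prob=-1.0, temperature=0.5, discount=0.99)
```

In this sketch, a small tau makes the target critic track the online critic slowly, and setting backup_entropy to False reduces the target to the usual clipped double-Q backup without the entropy term.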
Methods
sample_actions
- rtype: Array
update
- rtype: Dict[str, float]