xpag.agents.rljax_agents.algorithm.sac_discor.SACDisCor
- class SACDisCor(num_agent_steps, observation_dim, action_dim, seed, max_grad_norm=None, gamma=0.99, num_critics=2, buffer_size=1000000, batch_size=256, start_steps=10000, update_interval=1, tau=0.005, fn_actor=None, fn_critic=None, fn_error=None, lr_actor=0.0003, lr_critic=0.0003, lr_alpha=0.0003, lr_error=0.0003, units_actor=(256, 256), units_critic=(256, 256), units_error=(256, 256, 256), log_std_min=-20.0, log_std_max=2.0, d2rl=False, init_alpha=1.0, init_error=10.0, adam_b1_alpha=0.9)
Bases: DisCorMixIn, SAC
Methods
calculate_value
- rtype: Array
explore
get_key_list
get_mask
is_update
load_params
save_params
select_action
step
update
Attributes
kwargs_actor
kwargs_critic
name
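Example

The constructor signature above can be exercised roughly as follows. This is a minimal sketch, not a reference usage: the values for `num_agent_steps`, `observation_dim`, `action_dim`, and `seed` are placeholders, and whether `observation_dim` and `action_dim` expect integers or shape tuples is an assumption not confirmed by this page. The names `sac_discor_kwargs` and `make_agent` are hypothetical helpers introduced only for illustration. The remaining values repeat the defaults listed in the signature.

```python
# Keyword arguments for SACDisCor. The four required arguments are
# placeholders (assumed values); the rest repeat the documented defaults.
sac_discor_kwargs = dict(
    num_agent_steps=100_000,      # placeholder, not a documented default
    observation_dim=17,           # placeholder; int vs. shape tuple is an assumption
    action_dim=6,                 # placeholder; int vs. shape tuple is an assumption
    seed=0,                       # placeholder
    gamma=0.99,                   # documented default
    num_critics=2,                # documented default
    batch_size=256,               # documented default
    start_steps=10_000,           # documented default
    tau=0.005,                    # documented default
    units_error=(256, 256, 256),  # documented default
    init_error=10.0,              # documented default
)


def make_agent(**overrides):
    """Hypothetical helper: build a SACDisCor agent from the kwargs above.

    The import is deferred so that this module loads even when xpag
    (and its JAX dependencies) is not installed.
    """
    from xpag.agents.rljax_agents.algorithm.sac_discor import SACDisCor

    return SACDisCor(**{**sac_discor_kwargs, **overrides})
```

Calling `make_agent(seed=1)` would override only the seed and leave every other argument as listed; it requires xpag to be installed.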