Hello! I found that the current custom environment does not seem to support a discrete action space, so I switched the model to a one-dimensional continuous action space. However, my definition of action_space does not seem to take effect.
At first, I noticed that the actions output during training always hovered around 0.9, even though the range I had defined was [-20, 20]. To rule out randomness, I narrowed the range to [-0.1, 0.1], but the action passed into step() at every step was still roughly 0.9 to 1.1.
I also noticed that the upper and lower limits of the action space in "off_policy_marl.py" appear to be wrong; they simply return None. Could this be the root cause of the incorrect action range, or did something go wrong in how I defined the environment?
I checked observation_space and state_space, and both of their values look normal.
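For reference, the action space in question is a one-dimensional continuous space with bounds [-20, 20], declared roughly like this (a minimal sketch assuming a gymnasium-style Box; the actual environment code is not shown in this issue):

```python
from gymnasium.spaces import Box
import numpy as np

# Illustrative declaration only; how this attribute is attached to the
# custom environment class depends on how the env is registered with XuanCe.
action_space = Box(low=-20.0, high=20.0, shape=(1,), dtype=np.float32)
```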
Hi, the range of continuous actions returned by agent.policy in XuanCe is determined by the activation function you choose. For instance, the sigmoid activation restricts the action range to [0, 1], while tanh results in a range of [-1, 1]. This range cannot be changed in any other way.
If your custom environment has an action space of [low, high], you can rescale the actions within env.step() before they are executed. That means, if the action activation is tanh, you can modify your code like this:
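A minimal sketch of that rescaling (the class name, bound values, and internal details here are placeholders, not XuanCe's actual base-class API; adapt it to your own env.step()):

```python
import numpy as np

class MyCustomEnv:
    """Illustrative stand-in for the custom environment; names are placeholders."""

    def __init__(self):
        # Bounds of the action range you actually want to execute.
        self.act_low, self.act_high = -20.0, 20.0

    def step(self, action):
        # With a tanh action activation, the policy output lies in [-1, 1];
        # clip for safety, then linearly map it onto [act_low, act_high].
        action = np.clip(action, -1.0, 1.0)
        real_action = self.act_low + (action + 1.0) * 0.5 * (self.act_high - self.act_low)
        # ... execute real_action in the environment and return obs, reward, etc. ...
```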
Thanks for the reminder! There are two activation parameters in the configuration file, "activation" and "action_activation". What do they each correspond to?