
Request for documentation on the RNN method #122

Open
HawkQ opened this issue Mar 28, 2025 · 10 comments

HawkQ commented Mar 28, 2025

When using the multi-agent MAPPO algorithm, I tried replacing Basic_MLP with Basic_RNN. After also setting use_rnn: True in the config file, I get the following error:

Traceback (most recent call last):
  File "/Users/hawkq/Desktop/frigatebird_multi/new_run.py", line 22, in <module>
    Agent.train(configs.running_steps // configs.parallels)  # Train the model for numerous steps.
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 287, in train
    self.run_episodes(None, n_episodes=self.n_envs, test_mode=False)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 384, in run_episodes
    policy_out = self.action(obs_dict=obs_dict, state=state, avail_actions_dict=avail_actions,
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/mappo_agents.py", line 141, in action
    rnn_hidden_critic_new, values_out = self.policy.get_values(observation=critic_input,
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/policies/gaussian_marl.py", line 176, in get_values
    outputs = self.critic_representation[key](observation[key], *rnn_hidden[key])
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/representations/rnn.py", line 63, in forward
    output, hn = self.rnn(mlp_output, h)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/torch/nn/modules/rnn.py", line 1117, in forward
    raise RuntimeError(
RuntimeError: For unbatched 2-D input, hx should also be 2-D but got 3-D tensor

It looks like a tensor-dimension problem. I checked the documentation but could not find anything about this, so I am not sure what else needs to change in my environment code. Below are my action, state, and observation spaces:

        self.state_space = Box(-np.inf, np.inf, shape=[7 * self.num_agents, ], dtype=np.float32)  # global state: 7 features per agent, concatenated
        self.observation_space = {agent: Box(-np.inf, np.inf, shape=[14, ], dtype=np.float32) for agent in self.agents}  # 14-dim local observation per agent
        self.action_space = {agent: Box(-1, 1, shape=[2, ], dtype=np.float32) for agent in self.agents}  # 2-dim continuous action per agent

What else needs to be adjusted? Thanks for your help!
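
For reference, the RuntimeError above comes from PyTorch's shape contract for recurrent layers: an unbatched 2-D input of shape (seq_len, input_size) must be paired with a 2-D hidden state of shape (num_layers, hidden_size). A minimal sketch reproducing the same error outside xuance (all sizes here are illustrative):

import torch
import torch.nn as nn

gru = nn.GRU(input_size=14, hidden_size=64, num_layers=1)
x = torch.zeros(5, 14)     # unbatched input: (seq_len, input_size)
h = torch.zeros(1, 3, 64)  # batched hidden state: (num_layers, batch, hidden_size)
gru(x, h)                  # RuntimeError: For unbatched 2-D input, hx should also be 2-D but got 3-D tensor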

wenzhangliu (Collaborator) commented:

Hi, to switch the representation to an RNN, you need to configure the following:

use_rnn: True
rnn: "GRU"
recurrent_layer_N: 1
fc_hidden_sizes: [64, ]
recurrent_hidden_size: 64
N_recurrent_layers: 1
dropout: 0

HawkQ commented Mar 29, 2025

> Hi, to switch the representation to an RNN, you need to configure the following:
>
> use_rnn: True
> rnn: "GRU"
> recurrent_layer_N: 1
> fc_hidden_sizes: [64, ]
> recurrent_hidden_size: 64
> N_recurrent_layers: 1
> dropout: 0

Hi, after modifying as suggested, the error is as follows:

Traceback (most recent call last):
  File "/Users/hawkq/Desktop/frigatebird_multi/new_run.py", line 27, in <module>
    Agent = MAPPO_Agents(config=configs, envs=envs)  # Create a DDPG agent from XuanCe.
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/mappo_agents.py", line 24, in __init__
    super(MAPPO_Agents, self).__init__(config, envs)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/ippo_agents.py", line 24, in __init__
    self.policy = self._build_policy()  # build policy
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/mappo_agents.py", line 38, in _build_policy
    A_representation = self._build_representation(self.config.representation, self.observation_space, self.config)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/base/agents_marl.py", line 217, in _build_representation
    representation[key] = REGISTRY_Representation[representation_key](**input_representations)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/representations/mlp.py", line 40, in __init__
    self.output_shapes = {'state': (hidden_sizes[-1],)}
KeyError: -1

Could it be that my state data format needs to change?

wenzhangliu (Collaborator) commented:

You can refer to the parameter configuration here and check whether your format is consistent with it: https://github.com/agi-brain/xuance/blob/master/examples/mappo/mappo_mpe_configs/simple_spread_v3.yaml.

HawkQ commented Mar 29, 2025

My config file was adapted from simple_spread_v3.yaml, and using that yaml raises the same error:
RuntimeError: For unbatched 2-D input, hx should also be 2-D but got 3-D tensor
In fact, running the MPE test directly with simple_spread_v3.yaml also fails once the RNN-related settings are enabled:

Traceback (most recent call last):
  File "/Users/hawkq/Desktop/frigatebird_multi/testrun.py", line 13, in <module>
    runner.run()
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/runners/runner_marl.py", line 32, in run
    self.agents.train(n_train_steps)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 287, in train
    self.run_episodes(None, n_episodes=self.n_envs, test_mode=False)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/core/on_policy_marl.py", line 420, in run_episodes
    _, value_next = self.values_next(i_env=i, obs_dict=obs_dict[i], state=state[i],
TypeError: 'NoneType' object is not subscriptable
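
Judging from this traceback alone, state is None at the point where run_episodes indexes it per environment, which suggests the vectorized environment returned no global state here. A minimal reproduction of just that failure (not xuance's actual code):

state = None  # what run_episodes apparently received for the global state
state[0]      # TypeError: 'NoneType' object is not subscriptable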

wenzhangliu (Collaborator) commented:

Hi, have you tested this on algorithms such as VDN or MADDPG? Does the same problem occur there? I need this to determine at which step the problem arises.

HawkQ commented Apr 3, 2025

> Hi, have you tested this on algorithms such as VDN or MADDPG? Does the same problem occur there? I need this to determine at which step the problem arises.

Hi, since the config files provided on readthedocs are limited, I only modified the MADDPG config file, adding the following:

agent: "MADDPG"  # the learning algorithms_marl
env_name: "fb"
env_id: "fb_v0"
env_seed: 1
continuous_action: True
learner: "MADDPG_Learner"
policy: "MADDPG_Policy"
representation: "Basic_RNN"
vectorize: "DummyVecMultiAgentEnv"
runner: "MARL"
distributed_training: False

use_rnn: True
rnn: "GRU"
recurrent_layer_N: 1
fc_hidden_sizes: [64, ]
recurrent_hidden_size: 64
N_recurrent_layers: 1
dropout: 0

representation_hidden_size: []  # the units for each hidden layer
actor_hidden_size: [64, ]
critic_hidden_size: [64, ]
activation: 'leaky_relu'
activation_action: 'sigmoid'
use_parameter_sharing: True
use_actions_mask: False

MADDPG runs, but even with a single environment there is heavy read/write activity; I am not sure whether that is inherent to the RNN.

For VDN, the key parts of the config file are as follows:

agent: "VDN"  
env_name: "fb"
env_id: "fb_v0"
env_seed: 1
continuous_action: True
learner: "VDN_Learner"
policy: "Mixing_Q_network"
representation: "Basic_MLP"
vectorize: "DummyVecMultiAgentEnv"
runner: "MARL"
distributed_training: False

use_rnn: True
rnn: "GRU"
recurrent_layer_N: 1
fc_hidden_sizes: [64, ]
recurrent_hidden_size: 64
N_recurrent_layers: 1
dropout: 0

representation_hidden_size: [64, ]
q_hidden_size: [64, ]  # the units for each hidden layer
activation: "relu"

This raises the following error:

Traceback (most recent call last):
  File "/Users/hawkq/Desktop/frigatebird_multi/new_run.py", line 29, in <module>
    Agent = VDN_Agents(config=configs, envs=envs)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/vdn_agents.py", line 27, in __init__
    self.policy = self._build_policy()  # build policy
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/multi_agent_rl/vdn_agents.py", line 44, in _build_policy
    representation = self._build_representation(self.config.representation, self.observation_space, self.config)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/agents/base/agents_marl.py", line 217, in _build_representation
    representation[key] = REGISTRY_Representation[representation_key](**input_representations)
  File "/opt/anaconda3/envs/xuance_marl/lib/python3.8/site-packages/xuance/torch/representations/mlp.py", line 40, in __init__
    self.output_shapes = {'state': (hidden_sizes[-1],)}
KeyError: -1

This is the same error as reported above.

wenzhangliu (Collaborator) commented:

Hi, please confirm that the representation parameter is set to "Basic_RNN":

representation: "Basic_RNN"

HawkQ commented Apr 4, 2025

After changing VDN to representation: "Basic_RNN", it first errored out because my actions are continuous, but it runs after a simple switch to discrete actions (see the sketch below).
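
For completeness, a minimal sketch of that adjustment, assuming a gymnasium-style environment; the agent ids and the number of discrete actions are illustrative:

from gymnasium.spaces import Discrete

agents = ["agent_0", "agent_1"]  # illustrative agent ids
# A discrete action space in place of the continuous Box (5 actions is arbitrary):
action_space = {agent: Discrete(5) for agent in agents}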

wenzhangliu (Collaborator) commented Apr 4, 2025

VDN is designed for discrete actions.

HawkQ commented Apr 27, 2025

> Hi, have you tested this on algorithms such as VDN or MADDPG? Does the same problem occur there? I need this to determine at which step the problem arises.

After testing: VDN runs and MADDPG runs, but MAPPO raises the following error:

Traceback (most recent call last):
  File "C:\Users\HawkQ\Desktop\frigatebird_multi\new_run.py", line 30, in <module>
    Agent.train(configs.running_steps // configs.parallels)  # Train the model for numerous steps.
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\agents\core\on_policy_marl.py", line 287, in train
    self.run_episodes(None, n_episodes=self.n_envs, test_mode=False)
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\agents\core\on_policy_marl.py", line 384, in run_episodes
    policy_out = self.action(obs_dict=obs_dict, state=state, avail_actions_dict=avail_actions,
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\agents\multi_agent_rl\mappo_agents.py", line 141, in action
    rnn_hidden_critic_new, values_out = self.policy.get_values(observation=critic_input,
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\policies\gaussian_marl.py", line 176, in get_values
    outputs = self.critic_representation[key](observation[key], *rnn_hidden[key])
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\xuance\torch\representations\rnn.py", line 63, in forward
    output, hn = self.rnn(mlp_output, h)
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\Anaconda\envs\xuance_marl\lib\site-packages\torch\nn\modules\rnn.py", line 1117, in forward
    raise RuntimeError(
RuntimeError: For unbatched 2-D input, hx should also be 2-D but got 3-D tensor
