# Python – PyTorch: manually setting weight parameters with numpy array for GRU / LSTM

lstm, python, pytorch, rnn

I'm trying to fill a GRU/LSTM with manually defined parameters in PyTorch.

I have numpy arrays for the parameters, with shapes as defined in the documentation (https://pytorch.org/docs/stable/nn.html#torch.nn.GRU).

It seems to work but I'm not sure whether the returned values are correct.

Is this the right way to fill a GRU/LSTM with numpy parameters?

```python
gru = nn.GRU(input_size, hidden_size, num_layers,
             bias=True, batch_first=False, dropout=dropout, bidirectional=bidirectional)

def set_nn_wih(layer, parameter_name, w, l0=True):
    param = getattr(layer, parameter_name)
    if l0:
        for i in range(3*hidden_size):
            param.data[i] = w[i*input_size:(i+1)*input_size]
    else:
        for i in range(3*hidden_size):
            param.data[i] = w[i*num_directions*hidden_size:(i+1)*num_directions*hidden_size]

def set_nn_whh(layer, parameter_name, w):
    param = getattr(layer, parameter_name)
    for i in range(3*hidden_size):
        param.data[i] = w[i*hidden_size:(i+1)*hidden_size]

l0=True

for i in range(num_directions):
    for j in range(num_layers):
        if j == 0:
            wih = w0[i, :, :3*input_size]
            whh = w0[i, :, 3*input_size:]  # check
            l0=True
        else:
            wih = w[j-1, i, :, :num_directions*3*hidden_size]
            whh = w[j-1, i, :, num_directions*3*hidden_size:]
            l0=False

        if i == 0:
            set_nn_wih(
                gru, "weight_ih_l{}".format(j), torch.from_numpy(wih.flatten()),l0)
            set_nn_whh(
                gru, "weight_hh_l{}".format(j), torch.from_numpy(whh.flatten()))
        else:
            set_nn_wih(
                gru, "weight_ih_l{}_reverse".format(j), torch.from_numpy(wih.flatten()),l0)
            set_nn_whh(
                gru, "weight_hh_l{}_reverse".format(j), torch.from_numpy(whh.flatten()))

y, hn = gru(x_t, h_t)
```

The numpy arrays are defined as follows:

```python
rng = np.random.RandomState(313)
w0 = rng.randn(num_directions, hidden_size, 3*(input_size + hidden_size)).astype(np.float32)
w = rng.randn(max(1, num_layers-1), num_directions, hidden_size, 3*(num_directions*hidden_size + hidden_size)).astype(np.float32)
```

## Best Solution

That is a good question, and you already give a decent answer. However, it reinvents the wheel: there is a very elegant PyTorch internal routine that lets you do the same with much less effort, and it is applicable to any network.

The core concept here is PyTorch's `state_dict`. The state dictionary effectively contains the `parameters`, organized by the tree structure given by the relationship of the `nn.Modules` and their submodules, etc.

## The short answer

If you only want the code to load a value into a tensor using the `state_dict`, then call `load_state_dict(dict, strict=False)` on your model (where `dict` contains a valid `state_dict`). Here `strict=False` is crucial if you want to load only some parameter values.

## The long answer - including an introduction to PyTorch's `state_dict`

As an example of how a state dict looks, take a GRU with `input_size = hidden_size = 2` (chosen so that the entire state dict can be printed): it holds one tensor for each of `weight_ih_l0`, `weight_hh_l0`, `bias_ih_l0` and `bias_hh_l0`.

So the `state_dict` contains all the parameters of the network. If we have "nested" `nn.Modules`, we get the tree represented by the parameter names: each key is prefixed by the attribute name of the submodule that owns the parameter.

So what if you want not to extract the state dict, but to change it, and thereby the network's parameters? Use `nn.Module.load_state_dict(state_dict, strict=True)`.
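To make the state dict examples and the `load_state_dict` call concrete, here is a minimal sketch. It assumes a recent PyTorch install; the `Wrapper` class and the all-zero replacement values are hypothetical and exist only for illustration.

```python
import torch
import torch.nn as nn

# Tiny GRU so that the whole state dict fits on screen (sizes as in the text).
gru = nn.GRU(input_size=2, hidden_size=2)
print(gru.state_dict())  # OrderedDict mapping parameter name -> tensor

# Hypothetical wrapper module, only here to show how nested keys are named.
class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(input_size=2, hidden_size=2)
        self.out = nn.Linear(2, 1)

model = Wrapper()
# Keys are prefixed by the submodule's attribute name,
# e.g. 'rnn.weight_ih_l0' and 'out.weight'.
print(list(model.state_dict().keys()))

# Build a full replacement dict (same keys, same shapes, new values) and load it.
new_state = {name: torch.zeros_like(t) for name, t in gru.state_dict().items()}
gru.load_state_dict(new_state, strict=True)
```

After the last line, every parameter of `gru` is zero, because the replacement dict covered every key.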

`load_state_dict` allows you to load an entire `state_dict` with arbitrary values into an instantiated model of the same kind, as long as the keys (i.e. the parameter names) are correct and the values (i.e. the parameters) are `torch.Tensor` objects of the right shape. If the `strict` kwarg is set to `True` (the default), the dict you load has to exactly match the original state dict, except for the values of the parameters. That is, there has to be one new value for each parameter.

For the GRU example above, we need a tensor of the correct size (and the correct device, btw) for each of `'weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0'`. As we sometimes only want to load some values (as I think you want to do), we can set the `strict` kwarg to `False`. We can then load partial state dicts, e.g. one that only contains parameter values for `'weight_ih_l0'`.

As practical advice, I'd simply create the model you want to load values into, and then print the state dict (or at least a list of the keys and the respective tensor sizes).
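Listing the keys and tensor sizes could look like the sketch below. The sizes are hypothetical stand-ins for the ones in the question; a two-layer bidirectional GRU is used so that the `_l1` and `_reverse` names show up as well.

```python
import torch.nn as nn

# Hypothetical sizes, standing in for the ones used in the question.
gru = nn.GRU(input_size=4, hidden_size=3, num_layers=2, bidirectional=True)

# Collect and print every parameter name with the shape PyTorch expects for it.
shapes = {name: tuple(t.shape) for name, t in gru.state_dict().items()}
for name, shape in shapes.items():
    print(f"{name:24s} {shape}")
```

Note that in a bidirectional GRU the input weights of every layer after the first have `2 * hidden_size` columns, since those layers receive the concatenated outputs of both directions.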

That tells you the exact name of the parameter you want to change. You then simply create a state dict with the respective parameter name and tensor, and load it with `load_state_dict(..., strict=False)`.
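A minimal sketch of that last step, starting from a numpy array as in the question. The sizes and the random array are assumptions for illustration; the array must already have the shape PyTorch expects for the parameter it replaces.

```python
import numpy as np
import torch
import torch.nn as nn

input_size, hidden_size = 4, 3  # hypothetical sizes
gru = nn.GRU(input_size, hidden_size)

# numpy array with the shape of weight_ih_l0: (3 * hidden_size, input_size)
rng = np.random.RandomState(313)
w_ih = rng.randn(3 * hidden_size, input_size).astype(np.float32)

partial_state = {"weight_ih_l0": torch.from_numpy(w_ih)}

# With the default strict=True, a partial dict is rejected ...
try:
    gru.load_state_dict(partial_state)
except RuntimeError as err:
    print("strict=True failed:", str(err).splitlines()[0])

# ... with strict=False, only the keys that are present get overwritten.
result = gru.load_state_dict(partial_state, strict=False)
print(result.missing_keys)  # parameters that were left untouched

# Sanity check: the module now holds exactly the numpy values.
assert np.allclose(gru.weight_ih_l0.detach().numpy(), w_ih)
```

The value returned by `load_state_dict` lists the missing and unexpected keys, which is a cheap way to catch a misspelled parameter name when `strict=False` would otherwise skip it silently.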