
PyTorch: parameters() & state_dict()

by JK from Korea 2023. 4. 23.


 

Date: 2023.02.21                

 

* The PyTorch series will mainly touch on the problems I faced. For the actual code, check out my GitHub repository.

 

[.parameters() and .state_dict() in PyTorch]

.parameters() and .state_dict() are two methods I encountered while working with PyTorch models. Their purposes were unclear to me, so I did some additional searching.

[.parameters() method]
[Output]
[.state_dict() method]
[Output]
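The original post shows the code and its output as images. As a stand-in, here is a minimal sketch of both calls on a tiny model (the single `nn.Linear` layer and its sizes are my own choice for illustration):

```python
import torch.nn as nn

# A tiny model with one linear layer: 3 inputs -> 2 outputs
model = nn.Linear(3, 2)

# .parameters() yields the learnable tensors (here: weight, then bias)
for p in model.parameters():
    print(p.shape)  # torch.Size([2, 3]), then torch.Size([2])

# .state_dict() maps each parameter name to its tensor
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.shape))  # weight (2, 3), then bias (2,)
```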

In PyTorch, .parameters() and .state_dict() are used to get or manipulate the learnable parameters of a PyTorch model.

 

.parameters() returns an iterator over all the learnable parameters of the model, including both the weights and biases of each layer. It is commonly used for optimizing the model's parameters during training, such as computing gradients or updating weights with an optimizer.
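This is exactly how an optimizer receives the model's parameters. A short sketch of one training step (the layer sizes, learning rate, and random data are assumptions for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)
# The iterator from .parameters() is handed straight to the optimizer,
# which will update those same tensors in place.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(4, 3)
target = torch.randn(4, 1)
loss = nn.functional.mse_loss(model(x), target)

optimizer.zero_grad()
loss.backward()   # gradients are stored on each parameter's .grad
optimizer.step()  # each parameter is updated using its gradient
```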

 

.state_dict() returns an ordered dictionary that maps each parameter name to its corresponding tensor value. This dictionary contains the model's learnable parameters (plus persistent buffers such as BatchNorm running statistics). Note that it does not include the optimizer's state; the optimizer has its own .state_dict(). The returned dictionary can be saved to a file and later loaded to resume training or inference from the saved state.
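A sketch of saving and restoring this state. Because the model's state_dict and the optimizer's state_dict are separate, a resumable checkpoint usually stores both (the file name and layer sizes here are assumptions):

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# The model's state_dict holds only the model's tensors; the optimizer
# keeps its own state_dict, so save both to be able to resume training.
checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
}
torch.save(checkpoint, "checkpoint.pt")

# Restore into a freshly constructed model with the same architecture
restored = nn.Linear(3, 2)
loaded = torch.load("checkpoint.pt")
restored.load_state_dict(loaded["model"])
```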

 

[Difference between .parameters() and .state_dict()]

The difference between .parameters() and .state_dict() is that .parameters() returns an iterator over the learnable parameter tensors themselves, whereas .state_dict() returns a dictionary that maps each parameter name to its corresponding tensor value.

 

While both methods give access to a model's parameters, .state_dict() is typically used for saving and loading because it produces named tensors that can be serialized with torch.save() and restored with load_state_dict() (to resume training, the optimizer's own .state_dict() must be saved alongside it). A state_dict also makes it easy to move parameters between devices or processes, while .parameters() only provides an in-memory iterator over the parameters.
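One way to sketch the device-transfer point: torch.load() accepts a map_location argument that remaps the saved tensors onto a target device at load time (the file name and layer sizes are assumptions):

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 2)
torch.save(model.state_dict(), "model.pt")

# map_location remaps every saved tensor onto the target device,
# regardless of which device the model was saved from.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
state = torch.load("model.pt", map_location=device)

target = nn.Linear(3, 2).to(device)
target.load_state_dict(state)
```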

 

[Predictions Using Random Parameters]

We have not yet applied gradient descent or backpropagation to our model. So far, it can only perform a simple forward pass with randomly initialized parameters, so its predictions will be inaccurate. The following are the data and visualization for random weight and bias parameters.
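A minimal sketch of what such a forward pass looks like. The one-feature linear model and the assumed ground-truth relation y = 2x + 1 are my own illustration, not the post's actual data:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(1, 1)  # freshly initialized: random weight and bias

x = torch.linspace(0, 1, 5).unsqueeze(1)
y_true = 2 * x + 1                 # assumed ground-truth relation

with torch.no_grad():              # forward pass only, no training yet
    y_pred = model(x)

# With random parameters, these predictions are generally far from y_true
print(y_pred.squeeze())
```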

[predictions & actual test results]
[Poor Model Performance]

In the following posts, we will improve the model's performance using some familiar techniques, such as a loss function and backpropagation.
