
PyTorch: Train the Model

by JK from Korea 2023. 4. 23.


Date: 2023.03.10                

 

* The PyTorch series will mainly touch on the problems I faced. For the actual code, check out my GitHub repository.

 

[Setting up the Model]

So far, we have been building the model bit by bit. Since this model is simple regression, we first initialized the X and y data as torch tensors. Then we did the following. It's better shown in code than in writing.

[Field Variable Setup]
[Regression Model Setup]
[Training Stage. 100k Iterations.]

You might think 100k iterations is too much. It took approximately 30 seconds to run, but the results were quite satisfying.
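The steps above can be sketched end to end. This is not the actual code from the repository; it is a minimal sketch that assumes a simple linear relationship (y = 2x + 1 is a made-up example) fit with full-batch SGD and MSE loss:

```python
import torch

# Hypothetical data (an assumption for illustration): y = 2x + 1
X = torch.linspace(0, 1, 50).unsqueeze(1)
y = 2 * X + 1

# [Field Variable Setup] -- learnable weight and bias
weight = torch.randn(1, requires_grad=True)
bias = torch.randn(1, requires_grad=True)

# [Regression Model Setup] -- loss function and optimizer
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD([weight, bias], lr=0.01)

# [Training Stage. 100k Iterations.]
for epoch in range(100_000):
    y_pred = X * weight + bias   # forward pass
    loss = loss_fn(y_pred, y)    # compute the loss
    optimizer.zero_grad()        # clear accumulated gradients (Line 17)
    loss.backward()              # backpropagate (Line 20)
    optimizer.step()             # update the parameters (Line 23)

print(weight.item(), bias.item())  # should approach 2.0 and 1.0
```

The three lines discussed below (zero_grad, backward, step) form the core of almost every PyTorch training loop, in exactly this order.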

 

Before looking at the predictions, let’s review some of the new torch functions in the train model above.

 

Line 17: During training, the optimizer updates the network's weights and biases using the gradients of the loss function with respect to those parameters. However, gradients from the previous iteration are stored and accumulated in each parameter's gradient attribute, so they must be reset to zero before the new gradients are computed. The zero_grad() method clears the gradients of all parameters managed by the optimizer so that fresh gradients can be computed and used for the next update.
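A tiny sketch makes the accumulation behavior concrete. The numbers below are for a made-up toy tensor, not the model above:

```python
import torch

w = torch.tensor([3.0], requires_grad=True)

loss = (w ** 2).sum()   # dloss/dw = 2w = 6
loss.backward()
print(w.grad)           # tensor([6.])

# Without zeroing, a second backward pass ADDS to the stored gradient
loss = (w ** 2).sum()
loss.backward()
print(w.grad)           # tensor([12.]) -- accumulated, not replaced

# Clearing first gives the fresh gradient again
w.grad.zero_()          # optimizer.zero_grad() does this for every parameter
loss = (w ** 2).sum()
loss.backward()
print(w.grad)           # tensor([6.])
```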

 

Line 20: The backward() method takes two inputs:

  • tensor: The tensor for which gradients need to be computed.
  • gradient: The gradient of the loss function with respect to the output tensor. 

The backward() method computes the gradients of the tensor it is called on with respect to the leaf variables that contributed to its computation. Given the loss gradient ∂L/∂y, it applies the chain rule automatically and accumulates the resulting partial derivatives into the gradient attribute of each leaf tensor, where an optimizer can use them to update the tensor's values during training. The method returns nothing; its entire effect is the gradients it sets on the input tensor(s).
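The gradient argument only matters when the tensor being differentiated is not a scalar. A small sketch with made-up values:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2              # dy/dx = 2x

# y is non-scalar, so backward() needs dL/dy explicitly.
# Passing dL/dy = [1, 1, 1] yields dL/dx = dL/dy * dy/dx = 2x.
y.backward(gradient=torch.ones_like(y))
print(x.grad)           # tensor([2., 4., 6.])
```

When the loss is a scalar (the usual case, as in the training loop above), loss.backward() can be called with no argument because dL/dL = 1 is implied.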

 

Line 23: The step() method is used to update the values of the parameters by applying an optimization algorithm to the gradients. The method takes no inputs other than the optimizer itself, and it does not return anything. Its functionality is purely to update the values of the parameters using the computed gradients.
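For plain SGD, step() reduces to the textbook update w ← w − lr · w.grad, which a toy example can verify (values are made up for illustration):

```python
import torch

w = torch.tensor([5.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()   # dloss/dw = 2w = 10
opt.zero_grad()
loss.backward()
opt.step()              # w <- 5.0 - 0.1 * 10.0
print(w)                # tensor([4.], requires_grad=True)
```

Other optimizers (Adam, RMSprop, etc.) apply more elaborate update rules inside step(), but the calling pattern is identical.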

[Training Results (left) & Initial “y_train” values (right)]

I would say these are quite close. Next, we'll code the testing stage so that it compares the prediction accuracy.
