A Recipe for Applying Neural Networks to a Novel Problem
Date: 2022.12.25
*The original content of this post is from Andrej Karpathy’s blog.
[A Recipe for Training Neural Networks]
While building my own CNN, I repeatedly ran into the huge gap between “here is how a convolutional layer works” and “our network achieves state-of-the-art results.” Thankfully, Andrej has some words of wisdom about this concern. (Link)
Many machine learning frameworks carry the danger of “leaky abstractions.” As Andrej writes,
Backprop + SGD does not magically make your network work. Batch normalization does not magically make it converge faster. RNN doesn’t magically let you plug in text. If you insist on using the technology without understanding how it works you are likely to fail.
Andrej acknowledges that suffering is a perfectly natural part of getting a neural network to work well. However, it can be mitigated by being thorough, defensive, paranoid, and obsessed with visualizations of basically every possible thing.
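One concrete way to act on “visualize every possible thing” is to look at the exact tensors that go into the network right before model(x), after all augmentation and normalization, rather than the raw files on disk. A minimal sketch, assuming PyTorch tensors and matplotlib; the helper name and data loader are hypothetical:

```python
import matplotlib.pyplot as plt

# Hypothetical helper: inspect the exact (augmented, normalized) tensors
# that are about to be fed into the network, not the raw images on disk.
def show_batch(images, labels, n=8):
    """images: (B, C, H, W) float tensor, labels: (B,) tensor."""
    fig, axes = plt.subplots(1, n, figsize=(2 * n, 2))
    for ax, img, lbl in zip(axes, images[:n], labels[:n]):
        ax.imshow(img.permute(1, 2, 0).squeeze().cpu(), cmap="gray")
        ax.set_title(str(lbl.item()))
        ax.axis("off")
    plt.show()

# Usage, assuming a standard PyTorch DataLoader called train_loader:
# images, labels = next(iter(train_loader))
# show_batch(images, labels)
```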
[Our Goal]
Prevent the introduction of a lot of “unverified” complexity at once, which is bound to lead to bugs / misconfigurations that will take forever to find (if ever).
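One habit from the same recipe that keeps each added piece “verified” is to confirm the current pipeline can overfit a single tiny batch before adding anything new. A minimal PyTorch-style sketch; the model, shapes, and hyperparameters below are placeholders, not the recipe’s exact setup:

```python
import torch
import torch.nn as nn

# Placeholder model: the check matters, not the architecture.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(8 * 28 * 28, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# One small fixed batch. If the loss cannot be driven to ~0 here,
# something upstream (data, labels, loss, optimizer) is misconfigured.
x = torch.randn(2, 1, 28, 28)
y = torch.tensor([3, 7])

for step in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"loss on the tiny batch after 200 steps: {loss.item():.4f}")  # expect near zero
```

Only once this trivial case works is it worth layering in the real dataset, augmentation, regularization, and a bigger model, one verified step at a time.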