Broadcasting describes how an operation handles two arrays of different sizes. In general, the smaller array is broadcast (virtually expanded) to match the larger one, so that the two end up with compatible shapes.
Reference: "PyTorch 中 Tensor Broadcasting 详解" (Detailed explanation of tensor broadcasting in PyTorch), PyTorch 中文网.
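A minimal sketch of broadcasting in PyTorch; the tensor shapes below are just illustrative:

```python
import torch

# A (3, 1) column and a (1, 4) row broadcast to a common (3, 4) shape:
# each tensor is virtually expanded along its size-1 dimensions.
a = torch.arange(3).reshape(3, 1)   # shape (3, 1)
b = torch.arange(4).reshape(1, 4)   # shape (1, 4)

c = a + b                           # shape (3, 4)
print(c.shape)                      # torch.Size([3, 4])

# A scalar broadcasts against any shape.
x = torch.ones(2, 3)
print((x * 10).shape)               # torch.Size([2, 3])
```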
PyTorch implements a number of gradient-based optimization methods in torch.optim, including gradient descent. At a minimum, an optimizer takes the model parameters and a learning rate.

Optimizers do not compute the gradients for you, so you must call backward() yourself. You must also call optim.zero_grad() before calling backward(), because by default PyTorch adds in place to the .grad member variable rather than overwriting it.
optim.zero_grad() does both the detach_() and zero_() calls on the .grad variables of all of the optimizer's parameters.
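A minimal training-step sketch showing this workflow with torch.optim.SGD; the model, loss, and data here are hypothetical placeholders:

```python
import torch
import torch.nn as nn

# Toy model and data, just to illustrate the optimizer workflow.
model = nn.Linear(4, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # parameters + learning rate

x = torch.randn(8, 4)
y = torch.randn(8, 1)

for step in range(5):
    optimizer.zero_grad()          # clear accumulated .grad values first
    loss = criterion(model(x), y)  # forward pass
    loss.backward()                # compute gradients (the optimizer does not)
    optimizer.step()               # update parameters using the gradients
```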