General Rules of Matrix Differentiation
Consider two vectors
If we define
and
The gradient is different from the derivative: it can be seen as the transpose of the derivative.
For the derivative, we may have:
with the shape
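To make the shape relationship concrete, here is a minimal numerical sketch. It assumes the numerator-layout convention, in which the derivative of a function from R^n to R^m is an m-by-n matrix, so the derivative of a scalar-valued function is a 1-by-n row vector and its gradient is the n-by-1 transpose. The functions `f` and `g` and the helpers `jacobian` and `transpose` are hypothetical, chosen only for illustration.

```python
def jacobian(func, x, eps=1e-6):
    """Central-difference approximation of the derivative of func at x.

    Returns an m-by-n nested list: entry [i][j] approximates d func_i / d x_j.
    """
    n = len(x)
    m = len(func(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = func(x_plus), func(x_minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J


def transpose(M):
    """Swap rows and columns of a nested-list matrix."""
    return [list(col) for col in zip(*M)]


def f(x):
    # Hypothetical scalar-valued function R^3 -> R (returned as a 1-element list).
    return [x[0] ** 2 + 3 * x[1] + x[0] * x[2]]


def g(x):
    # Hypothetical vector-valued function R^3 -> R^2.
    return [x[0] * x[1], x[1] + x[2]]


x = [1.0, 2.0, 3.0]

derivative_f = jacobian(f, x)         # 1 x 3 row vector
gradient_f = transpose(derivative_f)  # 3 x 1 column vector
derivative_g = jacobian(g, x)         # 2 x 3 matrix (m x n)

print("derivative of f:", len(derivative_f), "x", len(derivative_f[0]))
print("gradient of f:  ", len(gradient_f), "x", len(gradient_f[0]))
print("derivative of g:", len(derivative_g), "x", len(derivative_g[0]))
```

Under the opposite (denominator-layout) convention the roles of rows and columns are swapped, so it is worth checking which convention a given text uses before comparing shapes.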