I'm not sure what you mean by the expression '(x*y)*(x*z)', but here is the logic behind incrementing the previous value of the gradient:
Consider an expression like y = (a * b) + (a * c).
When we backpropagate through (a * b) to find the gradient of y with respect to a and b, the gradient of y with respect to a is b * out.grad and the gradient of y with respect to b is a * out.grad (in this example out.grad is 1 at that point).
So what we currently have is
a.grad = b
b.grad = a
Then when we are trying to evaluate the second expression (a * c) by a similar procedure, we find
c.grad = a
but here we should not set a.grad = c, because that would overwrite the contribution already recorded from (a * b). Instead we increment the previous a.grad by c: a.grad += c.
In the end we should have:
a.grad = b+c
b.grad = a
c.grad = a
This is what we expect from regular calculus: differentiating y = (a * b) + (a * c) with respect to a gives b + c.
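The accumulation described above can be checked with a small sketch. The Value class here is a simplified stand-in written for this example (only + and *, no automatic topological sort), not the library's full class, and the input values 2, 3 and 4 are arbitrary assumptions.

```python
class Value:
    """Tiny scalar autograd node (simplified stand-in for illustration)."""

    def __init__(self, data, children=(), op=''):
        self.data = data
        self.grad = 0.0
        self._prev = children
        self._op = op
        self._backward = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other), '+')

        def _backward():
            # Addition passes the upstream gradient to both inputs.
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other), '*')

        def _backward():
            # += rather than =, so a node used twice sums both contributions.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out


a, b, c = Value(2.0), Value(3.0), Value(4.0)
ab = a * b
ac = a * c
y = ab + ac

# Backward pass by hand, visiting nodes in reverse order of construction.
y.grad = 1.0
y._backward()
ac._backward()
ab._backward()

print(a.grad, b.grad, c.grad)  # 7.0 2.0 2.0, i.e. b + c, a, a
```

If the two `+=` lines in `__mul__` were plain assignments, a.grad would end up holding only the contribution from whichever product was processed last.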
def __mul__(self, other):
    other = other if isinstance(other, Value) else Value(other)
    out = Value(self.data * other.data, (self, other), '*')
If you have an expression of type (x*y)*(x*z), then the gradient w.r.t. x is not additive, right?
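One way to test this is a quick numerical check. The sketch below assumes the expression is read as (x*y)*(x*z), uses arbitrary input values, and works with plain floats rather than the Value class. By the multivariable chain rule, x reaches the output along two paths, and the two contributions are summed even though the outer operation is a product.

```python
def f(x, y, z):
    return (x * y) * (x * z)


x, y, z = 2.0, 3.0, 5.0

# Forward pass, naming the two intermediate products.
u = x * y
v = x * z
out = u * v

# Backward pass written out by hand.
out_grad = 1.0
u_grad = v * out_grad   # d(out)/du = v
v_grad = u * out_grad   # d(out)/dv = u

x_grad = 0.0
x_grad += y * u_grad    # contribution through u = x*y
x_grad += z * v_grad    # contribution through v = x*z

# Central finite difference as an independent check.
h = 1e-6
numeric = (f(x + h, y, z) - f(x - h, y, z)) / (2 * h)

print(x_grad, 2 * x * y * z, numeric)
```

The accumulated value agrees with the closed form 2*x*y*z, which is what differentiating x^2 * y * z with respect to x gives.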