The code of the alignment bias is not consistent with Equation 9. #86

FacePoluke · 2023-05-15T11:50:23Z

The values outside of the diagonal are 1, not -inf.

ChengyuanYan · 2023-10-05T00:29:56Z

The function returns a boolean mask, so the placeholder value doesn't really matter. When the transformer computes attention, positions with a non-zero value (i.e. True in the returned matrix) are masked out, which is effectively equivalent to adding -inf to the attention scores at those positions before the softmax.
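
To make the equivalence concrete, here is a minimal framework-free sketch. It is not the repository's code: the helper names (`diagonal_bool_mask`, `additive_bias`, and so on) are hypothetical, and it assumes, as the issue implies, that Equation 9 puts 0 on the aligned (diagonal) entries and -inf everywhere else. It compares a boolean mask, where True means "blocked", with an additive -inf bias, and checks that both give the same attention weights after the softmax.

```python
import math

def softmax(row):
    # Numerically stable softmax; -inf entries contribute exp(-inf) = 0.0.
    # Assumes at least one finite entry per row.
    m = max(x for x in row if x != -math.inf)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def diagonal_bool_mask(T, S):
    # Hypothetical helper: True marks a blocked position (everything off the diagonal).
    return [[i != j for j in range(S)] for i in range(T)]

def additive_bias(T, S):
    # Hypothetical helper: 0 on the diagonal, -inf elsewhere (the assumed form of Equation 9).
    return [[0.0 if i == j else -math.inf for j in range(S)] for i in range(T)]

def apply_bool_mask(scores, mask):
    # A boolean mask replaces blocked scores with -inf before the softmax.
    return [[-math.inf if blocked else s for s, blocked in zip(s_row, m_row)]
            for s_row, m_row in zip(scores, mask)]

def apply_additive_bias(scores, bias):
    # An additive bias is simply summed with the raw scores.
    return [[s + b for s, b in zip(s_row, b_row)]
            for s_row, b_row in zip(scores, bias)]

# Arbitrary illustrative attention scores (T = S = 3).
scores = [[0.5, 1.0, -0.3],
          [2.0, 0.1, 0.7],
          [-1.2, 0.4, 0.9]]

via_bool = [softmax(r) for r in apply_bool_mask(scores, diagonal_bool_mask(3, 3))]
via_bias = [softmax(r) for r in apply_additive_bias(scores, additive_bias(3, 3))]

print(via_bool == via_bias)
```

Under these assumptions the two paths agree, so the 1 stored in the mask before it is converted to a boolean only marks which positions are blocked and never enters the attention scores as a number.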
