Relative Position Representations

업무외시간 2021. 3. 12. 18:29

For an efficient implementation, the attention-score equation from [1] can be expressed in tensor form as follows:

Key : (batch_size, head, seq_length, d)

Query : (batch_size, head, seq_length, d)

A : (seq_length, seq_length, d)

Tensor A holds the learnable embedding of the relative distance between each pair of positions (i, j). When the relative distance between i and j exceeds a threshold k, it is clipped, so the embedding at index k or -k is used instead.
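The clipping rule described above can be sketched as follows. This is a minimal illustration, not the code from the post or the paper: the names relative_position_index, build_A, and embedding_table are hypothetical, and the table is assumed to hold 2k + 1 learnable rows of size d, one for each clipped distance from -k to k.

```python
import numpy as np

def relative_position_index(seq_length, k):
    """Return a (seq_length, seq_length) matrix of table indices in [0, 2k]."""
    positions = np.arange(seq_length)
    # distance[i, j] = j - i, the signed relative distance from i to j
    distance = positions[None, :] - positions[:, None]
    # Clip to [-k, k], then shift by k so the result is a valid row index
    return np.clip(distance, -k, k) + k

def build_A(embedding_table, seq_length):
    """Gather A with shape (seq_length, seq_length, d) from a (2k + 1, d) table."""
    k = (embedding_table.shape[0] - 1) // 2
    return embedding_table[relative_position_index(seq_length, k)]

# Toy sizes and a random stand-in for the learnable table (assumptions)
seq_length, k, d = 4, 2, 8
embedding_table = np.random.default_rng(0).normal(size=(2 * k + 1, d))
index = relative_position_index(seq_length, k)
A = build_A(embedding_table, seq_length)
```

With these toy sizes, positions 2 and 3 are both at least k away from position 0, so A[0, 2] and A[0, 3] share the same embedding row.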

Query * Key : (batch_size, head, seq_length, seq_length)

Query * A : (batch_size, head, seq_length, seq_length)

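E : (batch_size, head, seq_length, seq_length), the attention logits obtained by adding the two terms above and scaling by 1/sqrt(d), following [1]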
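The shapes listed above can be checked with a short NumPy sketch. It assumes the key-side formulation of [1], where the logit for a pair (i, j) is q_i · (k_j + a_ij) / sqrt(d). The function name relative_attention_logits, the toy sizes, and the random inputs are illustrative assumptions, and einsum is used here in place of the reshape-and-transpose matmul described in the paper.

```python
import numpy as np

def relative_attention_logits(query, key, A):
    """Compute E = (Query * Key + Query * A) / sqrt(d).

    query, key: (batch_size, head, seq_length, d)
    A:          (seq_length, seq_length, d), shared across batch and heads
    returns:    (batch_size, head, seq_length, seq_length)
    """
    d = query.shape[-1]
    # Content term: dot product of q_i with k_j for every pair (i, j)
    qk = np.einsum("bhid,bhjd->bhij", query, key)
    # Position term: dot product of q_i with the relative embedding a_ij
    qa = np.einsum("bhid,ijd->bhij", query, A)
    return (qk + qa) / np.sqrt(d)

# Toy sizes and random inputs (assumptions for illustration only)
batch_size, head, seq_length, d = 2, 3, 5, 4
rng = np.random.default_rng(0)
query = rng.normal(size=(batch_size, head, seq_length, d))
key = rng.normal(size=(batch_size, head, seq_length, d))
A = rng.normal(size=(seq_length, seq_length, d))

E = relative_attention_logits(query, key, A)
print(E.shape)
```

Computing the two terms separately avoids materializing a (batch_size, head, seq_length, seq_length, d) tensor, which is what adding A to the keys by broadcasting would require.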

References

[1] P. Shaw et al., Self-Attention with Relative Position Representations, arXiv 2018

[2] medium.com/@_init_/how-self-attention-with-relative-position-representations-works-28173b8c245a
