Spectral Normalization


업무외시간 2021. 5. 22. 18:06

Phew, I could only barely follow this one, and only after reading the write-ups other people had already posted...

To improve the training stability of a GAN, the weight of each discriminator layer is divided by its largest singular value. Why???

- The largest singular value of each layer's weight plays the role of that layer's Lipschitz norm.

- Bounding the Lipschitz norm keeps the slope of the function below a fixed value (the scale of the gradient along its principal direction).

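- The Lipschitz norm of a layer g is the largest value that spectral_norm(gradient(g(h))) takes over all inputs h. If g(h) = Wh, the gradient is W for every h, so this is simply spectral_norm(W), and the spectral norm equals the largest singular value.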

- Spectral normalization therefore has the effect of adaptively regularizing the principal direction of the gradient.
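
To make the points above concrete, here is a rough NumPy sketch, not the reference implementation from [1]. It estimates the largest singular value of a weight matrix by power iteration, divides the weight by it, and checks that the resulting linear layer g(h) = Wh has a slope of at most about 1. The function names `spectral_norm` and `spectral_normalize`, the matrix size, and the iteration count are all assumptions made for illustration. In [1] the power iteration is run for only a single step per training update, whereas this sketch runs many steps so the estimate can be compared against a full SVD.

```python
import numpy as np


def spectral_norm(W, n_iters=500, seed=0):
    """Estimate the largest singular value of W by power iteration.

    Hypothetical helper for illustration; the name and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(W.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iters):
        # Alternate between the right and left singular vector estimates.
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    # For unit vectors u and v, u^T W v approaches the largest singular value.
    return float(u @ W @ v)


def spectral_normalize(W, n_iters=500):
    """Return W divided by its (estimated) largest singular value."""
    return W / spectral_norm(W, n_iters)


rng = np.random.default_rng(42)
W = rng.standard_normal((64, 32))  # stand-in for one discriminator layer weight

sigma_est = spectral_norm(W)
sigma_svd = float(np.linalg.svd(W, compute_uv=False)[0])  # exact value for comparison

W_sn = spectral_normalize(W)
sigma_after = float(np.linalg.svd(W_sn, compute_uv=False)[0])

# Empirical slope of g(h) = W_sn h on random input pairs:
# ||g(h1) - g(h2)|| / ||h1 - h2|| should not exceed the spectral norm of W_sn.
h1 = rng.standard_normal((1000, 32))
h2 = rng.standard_normal((1000, 32))
num = np.linalg.norm((h1 - h2) @ W_sn.T, axis=1)
den = np.linalg.norm(h1 - h2, axis=1)
max_slope = float(np.max(num / den))

print("power iteration:", sigma_est)
print("SVD            :", sigma_svd)
print("after SN       :", sigma_after)
print("max slope      :", max_slope)
```

In practice you would not hand-roll this: PyTorch, for example, provides `torch.nn.utils.spectral_norm` to wrap a layer. The sketch is only meant to show that "divide by the largest singular value" and "bound the slope of the layer by 1" are the same operation for a linear map.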

References

[1] T. Miyato et al., Spectral Normalization for Generative Adversarial Networks, ICLR 2018

[2] Keunwoo Choi (최근우) - http://keunwoochoi.blogspot.com/2018/01/

[3] Sanghyeon Na (나상현) - http://sanghyeonna.blogspot.com/2018/10/spectral-normalization-part3.html