Off-hours 2021. 3. 12. 16:21

Preprocess

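Time-grid based representation: each sequence is quantized to sixteenth-note intervals, so one bar spans 16 time steps. A monophonic melody uses 130 classes per step (128 note-on, 1 note-off, 1 rest). A drum pattern uses 2^9 = 512 classes per step, one for each combination of the 9 drums or cymbals that can be struck. Each step is encoded as a one-hot vector.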
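The encoding above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from the repository: the index layout (0 to 127 for note-on pitches, 128 for note-off, 129 for rest), the 4/4 bar with onsets given in quarter-note beats, the use of the rest class on every step where nothing starts or ends, and the bitmask ordering of the 9 drums are all choices made here for the example. The function names are hypothetical.

```python
import math

STEPS_PER_BAR = 16             # sixteenth-note grid, 16 steps per bar
MELODY_CLASSES = 130           # 128 note-on + 1 note-off + 1 rest
NOTE_OFF, REST = 128, 129      # assumed index layout (not from the source)
NUM_DRUMS = 9
DRUM_CLASSES = 2 ** NUM_DRUMS  # 512 hit combinations


def quantize(time_in_beats, steps_per_beat=4):
    """Snap a time in quarter-note beats to the nearest sixteenth-note step."""
    return math.floor(time_in_beats * steps_per_beat + 0.5)


def one_hot(index, size):
    """Return a list of zeros with a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def encode_melody_bar(notes):
    """Encode one bar of a monophonic melody.

    `notes` is a list of (onset_beats, duration_beats, midi_pitch),
    sorted by onset. Steps with no new event get the REST class
    (an assumption of this sketch).
    """
    classes = [REST] * STEPS_PER_BAR
    for onset, duration, pitch in notes:
        start = quantize(onset)
        end = quantize(onset + duration)
        if 0 <= start < STEPS_PER_BAR:
            classes[start] = pitch       # note-on overwrites an earlier note-off
        if start < end < STEPS_PER_BAR and classes[end] == REST:
            classes[end] = NOTE_OFF
    return [one_hot(c, MELODY_CLASSES) for c in classes]


def encode_drum_step(hits):
    """Encode the set of drums (indices 0..8) struck at one step as one of 512 classes."""
    index = 0
    for drum in set(hits):
        index |= 1 << drum               # one bit per drum
    return one_hot(index, DRUM_CLASSES)


bar = encode_melody_bar([(0.0, 1.0, 60), (1.0, 0.5, 62)])
drum_vec = encode_drum_step([0, 2])
```

Here `bar` is a list of 16 one-hot vectors of length 130, and `drum_vec` is a single one-hot vector of length 512 whose active index is the bitmask of the struck drums.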

 

Model

MusicVAE

- Recurrent VAE

- Latent vectors can capture the global characteristics of the data

- Hierarchical decoder: prevents posterior collapse and helps generate long-term sequences
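To make the VAE part of the list above concrete, here is a minimal sketch of the two latent-variable pieces, assuming the usual setup of a diagonal Gaussian posterior and a standard normal prior. The function names are hypothetical and nothing here is taken from the repository. The KL term is the quantity that drops toward zero under posterior collapse, which means the decoder is ignoring the latent vector.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for a 4-dimensional latent space.
mu = [0.3, -1.2, 0.0, 0.8]
log_var = [0.0, -0.5, 0.1, -1.0]

z = sample_latent(mu, log_var, random.Random(0))
kl = kl_to_standard_normal(mu, log_var)
```

When the posterior matches the prior exactly (`mu` all zeros, `log_var` all zeros), the KL term is zero and `z` carries no information about the input sequence.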

 

github.com/HanSangJun/MusicVAE

References

[1] A. Roberts et al., A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music, ICML 2018