I am trying to use weight normalization with data-dependent initialization, as described in Salimans and Kingma, 2016 (https://arxiv.org/pdf/1602.07868.pdf). I use two ...
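
For reference, here is a minimal sketch of the data-dependent initialization from that paper for a single dense layer. It is plain Python with no dependencies, not code from my actual model. The paper parameterizes the weights as w = g * v / ||v||, draws v from N(0, 0.05^2), runs one minibatch through the layer to get the pre-activations t = v.x / ||v||, and then sets g = 1 / sigma[t] and b = -mu[t] / sigma[t]. The function names, batch size, and layer sizes below are made up for illustration, and the nonlinearity is left out because the statistics are taken on the pre-activations.

```python
import math
import random
import statistics


def data_dependent_init(batch, n_out, seed=0):
    """Return (v, g, b) for one weight-normalized dense layer.

    batch: list of input vectors (the initialization minibatch).
    v is a list of n_out direction vectors; g and b are lists of n_out scalars.
    """
    rnd = random.Random(seed)
    n_in = len(batch[0])
    v, g, b = [], [], []
    for _ in range(n_out):
        # Direction vector drawn from N(0, 0.05^2), as suggested in the paper.
        v_k = [rnd.gauss(0.0, 0.05) for _ in range(n_in)]
        norm = math.sqrt(sum(c * c for c in v_k))
        # Pre-activations t = v.x / ||v||, i.e. the layer with g = 1 and b = 0.
        t = [sum(c * x_i for c, x_i in zip(v_k, x)) / norm for x in batch]
        mu = statistics.fmean(t)
        sigma = statistics.pstdev(t)
        v.append(v_k)
        g.append(1.0 / sigma)   # g = 1 / sigma[t]
        b.append(-mu / sigma)   # b = -mu[t] / sigma[t]
    return v, g, b


def dense_weight_norm(x, v, g, b):
    """Compute y_k = g_k * (v_k . x) / ||v_k|| + b_k for one input vector x."""
    out = []
    for v_k, g_k, b_k in zip(v, g, b):
        norm = math.sqrt(sum(c * c for c in v_k))
        out.append(g_k * sum(c * x_i for c, x_i in zip(v_k, x)) / norm + b_k)
    return out


# Hypothetical init batch: 256 samples, 10 features, deliberately off-centre.
rnd = random.Random(1)
batch = [[rnd.gauss(3.0, 2.0) for _ in range(10)] for _ in range(256)]
v, g, b = data_dependent_init(batch, n_out=4)
outputs = [dense_weight_norm(x, v, g, b) for x in batch]
```

After this initialization, each output unit should have roughly zero mean and unit variance over the initialization batch, which is a quick sanity check to run before training. With several layers, the same step would be applied layer by layer, feeding each layer the outputs of the already initialized layers below it.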