by **Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson & John Hopcroft**

ICLR, 2016

It is well known within the deep learning community that networks with different random initializations learn different weights. In *Convergent Learning: Do Different Neural Networks Learn the Same Representations?*, Li et al. study which properties of the learned representations are shared across networks trained from different initializations. Though the study has no immediate applications, a better understanding of what neural networks learn could inform areas such as model compression and the training of better model ensembles.

The basic analysis presented by the authors relies on aligning filters learned in different networks. To match filters, the authors propose computing the correlation between activations of different filters. Though correlation is a simple metric, a small study demonstrates that correlation yields similar results to mutual information, a more statistically powerful metric, and is considerably faster to compute. For all experiments the authors use CaffeNet networks.
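The correlation-based similarity described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes the activations of each network have already been collected into a matrix of shape `(n_samples, n_units)`, and the function name `cross_correlation` and the synthetic data are hypothetical.

```python
import numpy as np

def cross_correlation(acts1, acts2):
    """Pearson correlation between every unit of Net1 and every unit of Net2.

    acts1: array of shape (n_samples, n_units1)
    acts2: array of shape (n_samples, n_units2)
    Returns an (n_units1, n_units2) matrix of correlations.
    """
    # Standardize each unit to zero mean and unit variance over the samples.
    z1 = (acts1 - acts1.mean(axis=0)) / acts1.std(axis=0)
    z2 = (acts2 - acts2.mean(axis=0)) / acts2.std(axis=0)
    # The mean of products of standardized values is the Pearson correlation.
    return z1.T @ z2 / acts1.shape[0]

# Synthetic stand-in for real activations (an assumption for illustration):
# "Net2" holds the same features as "Net1" in shuffled order, plus noise.
rng = np.random.default_rng(0)
acts1 = rng.normal(size=(1000, 4))
perm = np.array([2, 0, 3, 1])  # Net2 unit j copies Net1 unit perm[j]
acts2 = acts1[:, perm] + 0.1 * rng.normal(size=(1000, 4))

corr = cross_correlation(acts1, acts2)
best_match = corr.argmax(axis=1)  # most correlated Net2 unit for each Net1 unit
print(best_match.tolist())
```

On this toy data the best match for each Net1 unit recovers the inverse of the shuffle, which is the behavior one would hope for when two networks learn the same features in a different order.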

After defining a way to measure the similarity between filters, the authors conduct a series of experiments to better understand how filters align across networks. A first set of experiments asks whether there is a direct mapping from filters learned in one network to filters learned in a second network. Filters are aligned by matching each filter in the first network (Net1) to its most highly correlated filter in the second network (Net2). The paper explores both a mapping in which a filter in Net2 may be matched to multiple filters in Net1 and a traditional one-to-one bipartite mapping. Both mappings lead to similar conclusions: many filters in Net1 have highly similar filters in Net2, though some filters do not match well. Furthermore, filters align best in Conv1 and Conv5, and worse in the layers in between. It is unsurprising that filters do not align well after Conv1, since differences in filters can propagate from layer to layer, but it is less intuitive why filters align better in Conv5 than in Conv4. One possibility is that both networks are trained on the same label set, which may constrain the layers closest to the output.
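The difference between the two mappings can be made concrete with a small sketch. It assumes a precomputed correlation matrix whose entry `(i, j)` is the correlation between Net1 filter `i` and Net2 filter `j`; the function names and the toy matrix are hypothetical, and SciPy's `linear_sum_assignment` is used here as one standard way to solve the one-to-one assignment, not necessarily the solver the authors used.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def semi_matching(corr):
    """Each Net1 filter picks its most correlated Net2 filter.

    A Net2 filter may be chosen by several Net1 filters.
    """
    return corr.argmax(axis=1)

def bipartite_matching(corr):
    """One-to-one assignment that maximizes the total correlation."""
    rows, cols = linear_sum_assignment(corr, maximize=True)
    return cols  # cols[i] is the Net2 filter assigned to Net1 filter i

# Toy correlation matrix: Net1 filters 0 and 1 both prefer Net2 filter 0.
corr = np.array([
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.10, 0.30, 0.70],
])

print(semi_matching(corr).tolist())       # [0, 0, 2]
print(bipartite_matching(corr).tolist())  # [1, 0, 2]
```

In the toy example the greedy mapping assigns Net2 filter 0 twice, while the bipartite mapping gives up a little correlation for Net1 filter 0 so that every filter receives a distinct partner.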

Though some filters do not match well, it is still possible that a group of neurons in one network interact to extract features similar to a single filter or a group of filters in another network. In a second set of experiments, the authors demonstrate that a filter in Net1 can be represented as a linear combination of filters in Net2, found by minimizing a sum-of-squares prediction loss with L1 regularization. An average of 4.7 conv1 filters from Net2 is needed to represent each conv1 filter in Net1 with only a small loss in accuracy. The authors then use agglomerative clustering to match groups of filters between networks. Clustering results demonstrate that networks learn groups of filters which span similar subspaces, but the individual filters spanning a given subspace differ across networks. Another method to explore clusters learned by different networks could be representational similarity analysis (RSA) [1], which compares representational dissimilarity matrices (RDMs) and is used in the neuroscience community to measure whether different neural codes represent the same information.
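The sparse prediction idea can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' procedure: the activations are synthetic, the names `lasso_ista`, `net2`, and `net1_unit` are hypothetical, and the L1-regularized least-squares problem is solved with a plain iterative soft-thresholding (ISTA) loop chosen only to keep the example dependency-free.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n) * ||X w - y||^2 + lam * ||w||_1 with ISTA."""
    n = X.shape[0]
    # Step size = 1 / Lipschitz constant of the gradient of the squared loss.
    step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n  # gradient of the squared-error term
        w = w - step * grad
        # Soft-thresholding is the proximal step for the L1 penalty.
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
    return w

# Synthetic activations (an assumption): one Net1 unit is a noisy
# combination of two of the ten Net2 units.
rng = np.random.default_rng(0)
net2 = rng.normal(size=(2000, 10))
net1_unit = 0.7 * net2[:, 1] + 0.5 * net2[:, 4] + 0.05 * rng.normal(size=2000)

w = lasso_ista(net2, net1_unit, lam=0.05)
support = np.flatnonzero(np.abs(w) > 1e-6)  # Net2 units with nonzero weight
print(support.tolist())
```

The L1 penalty drives the weights of irrelevant Net2 units to exactly zero, so the size of the support plays the role of the "number of Net2 filters needed" figure quoted above. A larger `lam` trades prediction accuracy for a sparser mapping.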

The experiments in this paper provide insight into what kinds of weights neural networks learn. None of the results are particularly surprising, but the experiments are thorough and provide a solid foundation for future applications. Though it is unlikely that the results change substantially for other networks, it would be worthwhile to conduct similar experiments on them to confirm that the patterns observed in this paper are consistent across network architectures. The authors also concentrate their analysis on convolutional filters, but fully connected layers could be analyzed in a similar way. Additionally, network design is increasingly moving towards deeper networks. The authors note that matching is best at Conv1 and Conv5 for CaffeNet, but it is possible that matching ceases to occur at very deep layers in current models such as VGG, GoogLeNet and ResNet. If this is the case, it is unclear how the analysis presented can lead to the applications suggested by the authors.

[1] Kriegeskorte, Nikolaus, Marieke Mur, and Peter A. Bandettini. “Representational similarity analysis - connecting the branches of systems neuroscience.” *Frontiers in Systems Neuroscience* 2 (2008): 4.