VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow

Max Planck Institute for Informatics, Saarland Informatics Campus, Germany
Learning from model weights

Unlike traditional feature visualization (FV) methods, which often produce artifacts or repetitive patterns, VITAL generates more understandable visualizations through feature distribution matching. Our approach scales effectively to modern architectures (rows), generalizes well across diverse classes (columns), and better captures meaningful network representations.

Abstract

Neural networks are widely adopted to solve complex and challenging tasks. Especially in high-stakes decision-making, understanding their reasoning process is crucial, yet proves challenging for modern deep networks. Feature visualization (FV) is a powerful tool to decode what information neurons are responding to and hence to better understand the reasoning behind such networks. In particular, in FV we generate human-understandable images that reflect the information detected by neurons of interest. However, current methods often yield unrecognizable visualizations, exhibiting repetitive patterns and visual artifacts that are hard to understand for a human. To address these problems, we propose to guide FV through statistics of real image features combined with measures of relevant network flow to generate prototypical images. Our approach yields human-understandable visualizations that both qualitatively and quantitatively improve over state-of-the-art FVs across various architectures. As such, it can be used to decode which information the network uses, complementing mechanistic circuits that identify where it is encoded.

Class Neuron Visualization

Class neuron visualization aims to reveal what a neural network "sees" when it thinks about a specific class (e.g., "dog" or "airplane"). This is done by generating an image that maximally activates the output neuron corresponding to that class. The resulting visualization gives insight into the features the model associates with that category, such as shapes, textures, or patterns. VITAL enhances this by aligning the visualizations with real-world feature distributions: the feature distribution of the generated image is matched to that of real images from the same class using a sort-matching algorithm. The result is a clearer, more realistic, and more interpretable visualization that helps us understand how the model perceives different classes.
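To make the sort-matching idea concrete, here is a minimal sketch in plain Python. It assumes the generated and real features are equal-length lists of scalar values (for example, one flattened feature channel); the function names `sort_matching_target` and `sort_matching_loss` are hypothetical and not taken from the VITAL code. The squared-error loss is likewise an assumption made for illustration, and a real implementation would operate on tensors inside a differentiable framework.

```python
def sort_matching_target(generated, real):
    """Give each generated value the real value that has the same rank.

    The smallest generated value is paired with the smallest real value,
    the second smallest with the second smallest, and so on.
    """
    if len(generated) != len(real):
        raise ValueError("generated and real must have the same length")
    # Indices of the generated values, ordered from smallest to largest.
    order = sorted(range(len(generated)), key=lambda i: generated[i])
    sorted_real = sorted(real)
    target = [0.0] * len(generated)
    for rank, idx in enumerate(order):
        target[idx] = sorted_real[rank]
    return target


def sort_matching_loss(generated, real):
    """Mean squared distance between generated values and their targets.

    Assumption: squared error is used only for illustration.
    """
    target = sort_matching_target(generated, real)
    return sum((g - t) ** 2 for g, t in zip(generated, target)) / len(generated)


if __name__ == "__main__":
    generated = [0.2, 0.9, 0.1]
    real = [5.0, 1.0, 3.0]
    print(sort_matching_target(generated, real))
    print(sort_matching_loss(generated, real))
```

The loss is zero exactly when the generated values are a permutation of the real values, so minimizing it pulls the generated feature distribution toward the real one without forcing any particular spatial arrangement.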

Qualitative Results Across Architectures

Quantitative Results

Comparison of methods across different architectures trained on ImageNet. We report FID scores, CLIP zero-shot prediction scores, and top-1 accuracy. Bold and underlined entries indicate the best and second-best results, respectively.

Intermediate Neuron Visualization

Intermediate neuron visualization focuses on understanding how information is represented deep inside the network, rather than just at the classification layer. These internal neurons often respond to abstract concepts like "fur texture" or "wheel shapes," even if they are not directly tied to a class. By visualizing what activates these hidden neurons, we can uncover emergent concepts and compositional features the model builds up to make decisions. VITAL improves this process by filtering neurons based on their relevance and by guiding the visualizations with real feature statistics, leading to more meaningful and interpretable representations. Instead of only maximizing neuron activation, as traditional methods do, VITAL traces how much relevant information flows from the neuron toward the model's final decision for that class, and it aligns the feature distribution of the generated image with the feature distribution of the real images that activate the target neuron the most.
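The two selection steps described above can be sketched as follows. This is a toy illustration only: it assumes the per-image activations of the target neuron and the per-neuron relevance scores are already available as plain lists, and it does not show how VITAL computes relevance. The function names and the `keep_fraction` parameter are hypothetical choices made for this example.

```python
def top_activating_images(activations, k):
    """Return the indices of the k real images that activate the target neuron most.

    `activations[i]` is the target neuron's activation on real image i.
    """
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    return order[:k]


def relevant_neurons(relevance, keep_fraction=0.5):
    """Return the indices of the neurons with the highest relevance scores.

    Assumption: a fixed fraction of neurons is kept. The actual selection
    rule may differ; `keep_fraction` is a hypothetical parameter.
    """
    n_keep = max(1, int(round(len(relevance) * keep_fraction)))
    order = sorted(range(len(relevance)),
                   key=lambda i: relevance[i], reverse=True)
    return sorted(order[:n_keep])


if __name__ == "__main__":
    activations = [0.1, 2.5, 0.7, 1.9]   # one value per real image
    relevance = [0.05, 0.6, 0.0, 0.35]   # one value per neuron
    print(top_activating_images(activations, k=2))
    print(relevant_neurons(relevance, keep_fraction=0.5))
```

The features of the selected images would then serve as the real reference distribution, restricted to the neurons that survive the relevance filter.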

Qualitative Results for ResNet50

BibTeX

@misc{gorgun2025vitalunderstandablefeaturevisualization,
      title={VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow},
      author={Ada Gorgun and Bernt Schiele and Jonas Fischer},
      year={2025},
      eprint={2503.22399},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.22399},
}