Heatmap analysis of 300-dimensional word embeddings across 4 models
Dimensions sorted by correlation with word order (each word's position in its ordered sequence). Top = most positively correlated.
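A minimal sketch of how such a sorting could be computed, assuming the embeddings are stored as an array with one row per word, already in the intended order, and that "correlation" means Pearson correlation with the row index. The function name and the toy data are hypothetical and not taken from the original analysis.

```python
import numpy as np

def sort_dims_by_order_correlation(embeddings):
    """Rank embedding dimensions by Pearson correlation with row position.

    embeddings: array of shape (n_words, n_dims), rows in the intended word order.
    Returns (ranking, corr): dimension indices from most positive to most
    negative correlation, and the per-dimension correlations.
    """
    emb = np.asarray(embeddings, dtype=float)
    order = np.arange(emb.shape[0], dtype=float)

    # Center the positions and every dimension.
    order_c = order - order.mean()
    emb_c = emb - emb.mean(axis=0)

    # Pearson r per dimension; constant dimensions get r = 0 instead of NaN.
    denom = np.sqrt((order_c ** 2).sum() * (emb_c ** 2).sum(axis=0))
    safe = np.where(denom > 0, denom, 1.0)
    corr = np.where(denom > 0, (order_c @ emb_c) / safe, 0.0)

    # Descending sort: first index = most positively correlated dimension.
    ranking = np.argsort(-corr)
    return ranking, corr

# Toy data (made up): 10 ordered "words", 4 dimensions.
rng = np.random.default_rng(0)
pos = np.arange(10, dtype=float)
toy = np.stack([pos, -pos, rng.normal(size=10), np.zeros(10)], axis=1)
ranking, corr = sort_dims_by_order_correlation(toy)

# Rows = dimensions, columns = words; top row is the most positively correlated.
heatmap_rows = toy[:, ranking].T
```

The array `heatmap_rows` is what would be passed to a heatmap plotting call, with the top row holding the dimension that rises most consistently along the sequence.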
Similarity between all word pairs. Smooth gradients indicate learned order.
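The caption does not name the similarity measure. The sketch below assumes cosine similarity, a common choice for word embeddings, and uses a hypothetical helper and made-up toy vectors to show what a smooth gradient looks like numerically.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Return the (n_words, n_words) matrix of pairwise cosine similarities."""
    emb = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    # Leave all-zero vectors as zeros rather than dividing by zero.
    unit = emb / np.where(norms > 0, norms, 1.0)
    return unit @ unit.T

# Toy data (made up): 5 "words" spread evenly along a quarter circle.
angles = np.linspace(0.0, np.pi / 2, 5)
toy = np.stack([np.cos(angles), np.sin(angles)], axis=1)
sim = cosine_similarity_matrix(toy)
```

In this toy case, similarity to the first word falls off steadily with distance along the sequence, which is the kind of smooth gradient away from the diagonal that the caption refers to.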
Top 2 PCA components. Loops indicate learned circular/periodic structure.
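A rough sketch of projecting embeddings onto their top principal components, assuming PCA is computed by SVD of the mean-centered matrix. The helper name and the toy circle are illustrative assumptions, not the original pipeline.

```python
import numpy as np

def pca_project(embeddings, n_components=2):
    """Project rows onto the top principal components.

    Returns (coords, explained): coordinates of shape (n_words, n_components)
    and the fraction of variance explained by each kept component.
    """
    emb = np.asarray(embeddings, dtype=float)
    centered = emb - emb.mean(axis=0)

    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:n_components].T

    total = (s ** 2).sum()
    explained = s ** 2 / total if total > 0 else np.zeros_like(s)
    return coords, explained[:n_components]

# Toy data (made up): 12 "words" on a circle embedded in 5 dimensions.
theta = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
toy = np.zeros((12, 5))
toy[:, 1] = np.cos(theta)
toy[:, 3] = np.sin(theta)

coords, explained = pca_project(toy, n_components=2)
radii = np.linalg.norm(coords, axis=1)  # constant radius means a closed loop
```

For the toy circle, every projected point lies at the same distance from the origin, so a scatter plot of `coords` traces a closed loop. Real embeddings would show this only approximately.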
Top 6 PCA components plotted as you sweep along a semantic axis. Clean sine waves = learned periodic structure.
All PCA components as a heatmap surface. X = position in sequence (interpolated), Y = PCA component. Wave patterns visible as colored bands.
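One way such a surface might be built, assuming the PCA coordinates are already in sequence order and that the interpolation is linear between neighbouring words. The caption does not specify the interpolation method, and the helper and toy data below are hypothetical.

```python
import numpy as np

def interpolated_component_surface(coords, n_points=100):
    """Resample each PCA component along the sequence position.

    coords: array of shape (n_words, n_components), rows in sequence order.
    Returns (x_new, surface) where surface has shape (n_components, n_points):
    rows = PCA components (Y axis), columns = interpolated positions (X axis).
    """
    coords = np.asarray(coords, dtype=float)
    n_words, n_components = coords.shape
    x_old = np.arange(n_words)
    x_new = np.linspace(0, n_words - 1, n_points)

    # Linear interpolation of each component independently.
    surface = np.stack(
        [np.interp(x_new, x_old, coords[:, k]) for k in range(n_components)]
    )
    return x_new, surface

# Toy data (made up): two components that oscillate along 8 positions.
positions = np.arange(8)
toy_coords = np.stack(
    [np.sin(2 * np.pi * positions / 8), np.cos(2 * np.pi * positions / 8)],
    axis=1,
)
x_new, surface = interpolated_component_surface(toy_coords, n_points=50)
```

Passing `surface` to a heatmap call would render each component as a horizontal band whose color oscillates along X when the component is periodic.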
Mean |activation| per dimension, per word category. Normalized per row.
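A small sketch of this statistic, assuming each heatmap row is a word category and that "normalized per row" means dividing each row by its own maximum. Both are assumptions, since the caption does not say which axis is the row or which normalization is used. The function name, labels, and values are made up for illustration.

```python
import numpy as np

def category_activation_profile(embeddings, categories):
    """Mean absolute activation per dimension for each word category.

    Returns (names, normalized): sorted category names and an array of shape
    (n_categories, n_dims) in which each row is scaled to a maximum of 1.
    """
    emb = np.abs(np.asarray(embeddings, dtype=float))
    labels = np.asarray(categories)
    names = sorted(set(labels.tolist()))

    # One row per category: average |activation| over that category's words.
    profile = np.stack([emb[labels == name].mean(axis=0) for name in names])

    # Per-row max normalization; all-zero rows are left as zeros.
    row_max = profile.max(axis=1, keepdims=True)
    normalized = profile / np.where(row_max > 0, row_max, 1.0)
    return names, normalized

# Toy data (made up): 4 words, 3 dimensions, 2 categories.
toy = np.array([
    [1.0, -4.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 0.5, -2.0],
    [0.0, -0.5, 2.0],
])
labels = ["days", "days", "months", "months"]
names, profile = category_activation_profile(toy, labels)
```

Normalizing per row makes it easy to see which dimensions matter most within a category, at the cost of hiding differences in overall magnitude between categories.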
Quantitative comparison of geometric structure across models.