EMBEDDING SPECTROGRAM

Heatmap analysis of 300-dimensional word embeddings across 4 models

V28: Standard Skip-Gram

V33: Mixed SG+CBOW

V34: Dynamic Masking

Google: Google Word2Vec

1. Semantic Sweep Heatmaps

Embedding dimensions sorted by their correlation with word position in the sweep sequence. Top rows = most positively correlated.
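
One way to produce this ordering is sketched below. It assumes the embeddings arrive as a NumPy array whose rows already follow the semantic sweep order, and that "correlation" means Pearson correlation with the row index; the function name and return layout are hypothetical, not taken from the report.

```python
import numpy as np

def sort_dims_by_order_correlation(emb):
    """Sort embedding dimensions by Pearson correlation with word position.

    emb: array of shape (n_words, n_dims), rows in sweep order (assumed).
    Returns (heatmap, order, corr), where heatmap has one row per dimension
    and the first row is the most positively correlated dimension.
    """
    emb = np.asarray(emb, dtype=float)
    pos = np.arange(emb.shape[0], dtype=float)

    # Center both the positions and each embedding dimension.
    pos_c = pos - pos.mean()
    emb_c = emb - emb.mean(axis=0)

    # Pearson correlation per dimension; constant dimensions get 0.
    denom = np.linalg.norm(pos_c) * np.linalg.norm(emb_c, axis=0)
    corr = np.divide(pos_c @ emb_c, denom,
                     out=np.zeros(emb.shape[1]), where=denom > 0)

    # Descending order: most positive correlation first.
    order = np.argsort(-corr)
    return emb[:, order].T, order, corr
```

The returned `heatmap` can be passed to any image plotting routine; dimensions that anti-correlate with position end up at the bottom.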

Similarity between all word pairs. Smooth gradients indicate learned order.
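
A minimal sketch of such a pairwise matrix follows. The report does not say which similarity measure is used; cosine similarity is assumed here, and the function name is hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(emb):
    """Return the (n_words, n_words) cosine similarity matrix.

    emb: array of shape (n_words, n_dims). Cosine similarity is an
    assumption; the source only says "similarity".
    """
    emb = np.asarray(emb, dtype=float)
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    # Avoid division by zero for all-zero vectors.
    unit = emb / np.where(norms > 0, norms, 1.0)
    return unit @ unit.T
```

If the rows are in sweep order, a model that has learned the ordering should show similarity falling off smoothly with distance from the diagonal.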

Top 2 PCA components. Loops indicate learned circular/periodic structure.

Top 6 PCA components plotted as you sweep along a semantic axis. Clean sine waves = learned periodic structure.
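
The PCA projections behind both plots could be computed roughly as follows. This is a sketch using an SVD of the mean-centered embeddings for one word sequence; the function name is hypothetical, and the report's own preprocessing (for example any scaling or whitening) is not known.

```python
import numpy as np

def pca_project(emb, k):
    """Project embeddings onto their top-k principal components.

    emb: array of shape (n_words, n_dims), rows in sweep order (assumed).
    Returns (scores, explained): scores has shape (n_words, k) and
    explained holds the fraction of variance captured by each component.
    """
    emb = np.asarray(emb, dtype=float)
    centered = emb - emb.mean(axis=0)

    # Rows of vt are the principal axes; u * s gives the projected scores.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :k] * s[:k]

    total = np.sum(s ** 2)
    explained = s[:k] ** 2 / total if total > 0 else np.zeros_like(s[:k])
    return scores, explained
```

With `k=2`, plotting `scores[:, 0]` against `scores[:, 1]` gives the loop view; with `k=6`, plotting each column against word position gives the sweep view, where a periodic sequence would show up as sinusoid-like curves.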

All PCA components as a heatmap surface. X = position in sequence (interpolated), Y = PCA component. Wave patterns visible as colored bands.
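
The interpolated surface can be approximated as below. The source only says the position axis is "interpolated"; linear interpolation and the default of 200 samples are assumptions, and `interpolate_surface` is a hypothetical name. The input is a matrix of PCA scores with one row per word.

```python
import numpy as np

def interpolate_surface(scores, n_samples=200):
    """Resample PCA scores onto a dense grid of sequence positions.

    scores: array of shape (n_words, n_components), rows in sweep order.
    Returns (x_new, surface), where surface has shape
    (n_components, n_samples): Y = component, X = interpolated position.
    """
    scores = np.asarray(scores, dtype=float)
    n_words = scores.shape[0]
    x_src = np.arange(n_words)
    x_new = np.linspace(0, n_words - 1, n_samples)

    # Linear interpolation of each component along the sequence (assumed).
    surface = np.stack([
        np.interp(x_new, x_src, scores[:, j])
        for j in range(scores.shape[1])
    ])
    return x_new, surface
```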

Mean |activation| per dimension, per word category. Normalized per row.
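
A sketch of this per-category profile is given below. It assumes one row per word category and that "normalized per row" means dividing each row by its own maximum; both the layout and the normalization are assumptions, and the function name is hypothetical.

```python
import numpy as np

def category_activation_profile(emb, labels):
    """Mean absolute activation per dimension for each word category.

    emb: array of shape (n_words, n_dims).
    labels: one category label per word.
    Returns (categories, profile), where profile has shape
    (n_categories, n_dims) and each row is scaled to a maximum of 1
    (row-max scaling is an assumption).
    """
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    categories = sorted(set(labels.tolist()))

    # Average |activation| over the words in each category.
    profile = np.stack([
        np.abs(emb[labels == c]).mean(axis=0) for c in categories
    ])

    # Scale each row by its maximum, leaving all-zero rows untouched.
    row_max = profile.max(axis=1, keepdims=True)
    profile = profile / np.where(row_max > 0, row_max, 1.0)
    return categories, profile
```

Row-wise scaling makes the strongest dimensions of each category comparable, at the cost of hiding absolute differences in activation magnitude between categories.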

Quantitative comparison of geometric structure across models.