EMBEDDING SPECTROGRAM

Heatmap analysis of 300-dimensional word embeddings across 4 models

V28: Standard Skip-Gram
V33: Mixed SG+CBOW
V34: Dynamic Masking
Google: Google Word2Vec

1. Semantic Sweep Heatmaps

Embedding dimensions sorted by their correlation with word order. Top row = most positively correlated.

Color scale: -1 to +1.
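The matrix behind this heatmap can be sketched in numpy. A minimal sketch, assuming row-ordered embeddings; the function name and toy data are illustrative, not the project's code:

```python
import numpy as np

def semantic_sweep(embeddings: np.ndarray) -> np.ndarray:
    """Sort embedding dimensions by correlation with word order.

    embeddings: (n_words, n_dims), rows already in sweep order.
    Returns (n_dims, n_words) for heatmap plotting; the top row is
    the dimension most positively correlated with position.
    """
    n_words, n_dims = embeddings.shape
    order = np.arange(n_words)
    # Pearson correlation of each dimension with word position
    # (assumes no dimension is constant, which would yield NaN).
    corr = np.array([np.corrcoef(order, embeddings[:, d])[0, 1]
                     for d in range(n_dims)])
    idx = np.argsort(corr)[::-1]   # most positively correlated first
    return embeddings[:, idx].T    # (n_dims, n_words)

# Toy example: dim 0 rises with position, dim 1 falls.
emb = np.column_stack([np.linspace(-1, 1, 10), np.linspace(1, -1, 10)])
grid = semantic_sweep(emb)
```

The resulting `grid` can be handed directly to a heatmap routine such as `matplotlib.pyplot.imshow`.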

2. Cosine Similarity Matrices

Cosine similarity between all word pairs. Smooth gradients off the diagonal indicate learned order.

Color scale: -1 to +1.
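Computing the similarity matrix is a one-liner once the vectors are unit-normalized. A minimal numpy sketch (function name and example vectors are assumptions):

```python
import numpy as np

def cosine_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between all word vectors.

    embeddings: (n_words, n_dims). Returns (n_words, n_words),
    values in [-1, +1], ones on the diagonal.
    """
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard zero vectors
    return unit @ unit.T

# Toy example: orthogonal, orthogonal, and a 45-degree blend.
emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sim = cosine_matrix(emb)
```

For words with learned order, neighboring rows of `sim` change gradually, producing the smooth gradients described above.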

3. PCA Loop Plots

Top 2 PCA components. Loops indicate learned circular/periodic structure.
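A loop plot projects each word onto the top two principal components and connects consecutive words. A minimal sketch using plain numpy SVD (no sklearn); the synthetic ring data is an assumption used only to demonstrate what a clean loop looks like:

```python
import numpy as np

def pca_components(embeddings: np.ndarray, k: int = 2) -> np.ndarray:
    """Project embeddings onto their top-k principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:k].T   # (n_words, k)

# Synthetic "circular" embeddings: 12 points on a circle in 5 dims.
theta = np.linspace(0, 2 * np.pi, 12, endpoint=False)
ring = np.zeros((12, 5))
ring[:, 0], ring[:, 1] = np.cos(theta), np.sin(theta)
xy = pca_components(ring, k=2)
# For a perfect loop, all points sit at the same radius from the origin.
radii = np.linalg.norm(xy, axis=1)
```

Plotting `xy[:, 0]` against `xy[:, 1]` with a line through consecutive points reveals whether the sweep closes into a loop.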

4. PCA Component Waves

Top 6 PCA components plotted as you sweep along a semantic axis. Clean sine waves = learned periodic structure.
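The waves are just the per-word values of each principal component, read off in sweep order. A minimal sketch, assuming the sweep axis is the row order of the embedding matrix (the toy data mixing sinusoids, a trend, and noise is illustrative):

```python
import numpy as np

def component_waves(embeddings: np.ndarray, k: int = 6) -> np.ndarray:
    """Top-k PCA component values for each word along the sweep.

    Returns (n_words, k): column j is component j's 'wave'.
    """
    centered = embeddings - embeddings.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:k].T

# Toy sweep: two frequencies, a linear trend, and a little noise.
theta = np.linspace(0, 4 * np.pi, 32)
rng = np.random.default_rng(0)
emb = np.column_stack([np.cos(theta), np.sin(theta),
                       np.cos(2 * theta), np.sin(2 * theta),
                       np.linspace(-1, 1, 32),
                       0.05 * rng.standard_normal(32)])
waves = component_waves(emb, k=6)
```

Each column of `waves` is one line in the plot; by construction the columns are mutually orthogonal.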

5. PCA Activation Surface

All PCA components as a heatmap surface. X = position in sequence (interpolated), Y = PCA component. Wave patterns visible as colored bands.

Color scale: -1 to +1.
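The surface stacks every component wave into one image, with positions linearly interpolated onto a fixed grid. A minimal sketch (function name, interpolation scheme, and random toy data are assumptions):

```python
import numpy as np

def activation_surface(embeddings: np.ndarray,
                       n_interp: int = 100) -> np.ndarray:
    """All PCA components as an (n_components, n_interp) surface.

    X = interpolated position along the sweep, Y = PCA component.
    """
    centered = embeddings - embeddings.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    comps = centered @ Vt.T                      # (n_words, n_components)
    pos = np.arange(len(embeddings))
    grid = np.linspace(0, len(embeddings) - 1, n_interp)
    # Linearly interpolate each component across word positions.
    return np.stack([np.interp(grid, pos, comps[:, j])
                     for j in range(comps.shape[1])])

emb = np.random.default_rng(1).standard_normal((8, 4))
surface = activation_surface(emb, n_interp=50)
```

Rendered with a diverging colormap, periodic components show up as the alternating colored bands described above.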

6. Dimension Activation Fingerprints

Mean |activation| per dimension, per word category. Normalized per row.

Color scale: 0 to max.
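The fingerprint matrix averages |activation| within each category, then rescales each row to its own maximum, matching the 0-to-max color scale. A minimal sketch (function name and toy labels are assumptions):

```python
import numpy as np

def fingerprints(embeddings: np.ndarray, labels) -> np.ndarray:
    """Mean |activation| per dimension for each word category.

    embeddings: (n_words, n_dims); labels: length-n_words category ids.
    Returns (n_categories, n_dims), each row scaled to peak at 1.
    """
    labels = np.asarray(labels)
    rows = []
    for c in sorted(set(labels.tolist())):
        mean_abs = np.abs(embeddings[labels == c]).mean(axis=0)
        rows.append(mean_abs / max(mean_abs.max(), 1e-12))  # row-normalize
    return np.array(rows)

# Toy example: category 0 lives in dim 0, category 1 in dim 1.
emb = np.array([[2.0, 0.0], [4.0, 0.0], [0.0, 1.0]])
fp = fingerprints(emb, [0, 0, 1])
```

Row normalization makes categories with very different overall magnitudes comparable in the same heatmap.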

7. Numerical Analysis Summary

Quantitative comparison of geometric structure across models.
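Two example descriptors that could feed such a summary are a circularity score (how uniform the PCA-plane radii are) and a loop-closure score (how close the last point returns to the first). A minimal sketch; the metric names and definitions are assumptions, not necessarily the ones used in the table:

```python
import numpy as np

def geometry_summary(embeddings: np.ndarray) -> dict:
    """Illustrative quantitative descriptors of embedding geometry.

    circularity:  1 - std/mean of radii in the top-2 PCA plane
                  (1.0 = perfect circle).
    loop_closure: gap between first and last point, relative to the
                  mean step size (≈1.0 = the loop closes cleanly).
    """
    centered = embeddings - embeddings.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    xy = centered @ Vt[:2].T
    radii = np.linalg.norm(xy, axis=1)
    steps = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    return {
        "circularity": 1.0 - radii.std() / radii.mean(),
        "loop_closure": np.linalg.norm(xy[0] - xy[-1]) / steps.mean(),
    }

# Sanity check on a perfect ring: both scores should be ~1.0.
theta = np.linspace(0, 2 * np.pi, 16, endpoint=False)
ring = np.column_stack([np.cos(theta), np.sin(theta), np.zeros(16)])
stats = geometry_summary(ring)
```

Scores like these make the visual differences between the four models comparable in a single table.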