Layer Analysis¶
Not all layers are equally effective for steering. PSYCTL includes tools to analyze which layers provide the best separation between personality-exhibiting and neutral activations.
CLI Usage¶
psyctl layer.analyze \
--model "google/gemma-3-270m-it" \
--layers "model.layers[*].mlp" \
--dataset "./dataset/steering" \
--method svm \
--top-k 5
Analysis Methods¶
SVM Analyzer¶
Trains a linear SVM at each layer to classify positive vs neutral activations:
- Score: Overall separation quality (higher is better)
- Accuracy: Classification accuracy
- Margin: SVM margin (larger = more robust)
Consensus Analyzer¶
Combines multiple analysis methods for more robust layer selection.
Layer Patterns¶
Wildcard patterns for targeting groups of layers:
| Pattern | Matches |
|---|---|
model.layers[*].mlp |
All MLP layers |
model.layers[5:15].mlp |
Layers 5-14 |
model.layers[::2].mlp |
Every other layer |
Interactive Notebook¶
Try the 05_layer_analysis notebook for a visual walkthrough with charts.