encoding.utils
- class encoding.utils.ActivationCache(cache_dir: str = 'cache')
Handles caching and loading of language model activations with multi-layer support; a usage sketch follows this class entry.
- __init__(cache_dir: str = 'cache')
Initialize the cache.
- Parameters:
cache_dir – Directory to store cached activations
- get_cache_path(cache_key: str) Path
Get the path for a cached activation file.
- load_activations(cache_key: str) numpy.ndarray | None
Load single layer activations from cache (backward compatibility).
- load_multi_layer_activations(cache_key: str) LazyLayerCache | None
Load multi-layer activations from cache with lazy loading.
- Parameters:
cache_key – Cache key for the activations
- Returns:
LazyLayerCache object if cache exists, None otherwise
- save_activations(cache_key: str, activations: numpy.ndarray)
Save single layer activations to cache (backward compatibility).
- save_multi_layer_activations(cache_key: str, all_layer_activations: Dict[int, numpy.ndarray], metadata: Dict[str, Any]) None
Save multi-layer activations to cache with lazy loading support.
- Parameters:
cache_key – Cache key for the activations
all_layer_activations – Dictionary mapping layer indices to activations
metadata – Metadata about the cached activations
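A minimal usage sketch for ActivationCache. The layer shapes, metadata fields, and cache-key string are illustrative assumptions; only the method calls come from the API above.

    import numpy as np
    from encoding.utils import ActivationCache

    cache = ActivationCache(cache_dir="cache")

    # Hypothetical activations for two transformer layers (shapes are illustrative).
    all_layer_activations = {
        0: np.random.randn(100, 768),
        1: np.random.randn(100, 768),
    }
    metadata = {"model_name": "gpt2", "context_type": "full"}  # assumed metadata fields
    cache_key = "gpt2_story01_full"  # any unique string chosen by the caller

    cache.save_multi_layer_activations(cache_key, all_layer_activations, metadata)

    # Later: reload lazily; returns None if nothing is cached under the key.
    lazy = cache.load_multi_layer_activations(cache_key)
    if lazy is not None:
        print(lazy.get_available_layers())  # [0, 1]
        layer0 = lazy.get_layer(0)          # (100, 768) array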
- class encoding.utils.LazyLayerCache(cache_file_path: str | Path)
Lazy-loading cache for multi-layer activations; see the sketch after the method listing below.
- __init__(cache_file_path: str | Path)
Initialize the lazy layer cache.
- Parameters:
cache_file_path – Path to the cache file
- clear_loaded_layers() None
Clear loaded layers from memory.
- get_available_layers() List[int]
Get list of available layers in the cache.
- Returns:
List of available layer indices
- get_layer(layer_idx: int) numpy.ndarray
Load specific layer on demand.
- Parameters:
layer_idx – Index of the layer to load
- Returns:
Layer activations as numpy array
- get_layers(layer_indices: List[int]) List[numpy.ndarray]
Load multiple specific layers.
- Parameters:
layer_indices – List of layer indices to load
- Returns:
List of layer activations
- get_metadata() Dict[str, Any]
Load only metadata (fast).
- Returns:
Dictionary containing cache metadata
- validate_context_type(expected_context_type: str) None
Validate that the cache was created with the expected context type.
- Parameters:
expected_context_type – Expected context type
- Raises:
ValueError – If context type doesn’t match
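A sketch of reading an existing cache file with LazyLayerCache directly. The file name, context type, and layer indices are assumptions for illustration.

    from encoding.utils import LazyLayerCache

    lazy = LazyLayerCache("cache/gpt2_story01_full.pkl")  # assumed file location

    print(lazy.get_metadata())           # cheap: reads only the metadata
    print(lazy.get_available_layers())   # e.g. [0, 1, ..., 11]

    # Raises ValueError if the cache was built with a different context type.
    lazy.validate_context_type("full")

    acts = lazy.get_layers([4, 8])       # load only the layers that are needed
    lazy.clear_loaded_layers()           # free memory once features are built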
- class encoding.utils.ModelSaver(base_dir: str = 'results')
Class for saving and loading model weights and hyperparameters; a usage sketch follows this class entry.
- __init__(base_dir: str = 'results')
Initialize the ModelSaver.
- Parameters:
base_dir – Base directory for saving results
- list_runs() List[Dict[str, Any]]
List all saved runs with their hyperparameters and metrics.
- Returns:
List of dictionaries containing run information
- load_encoding_model(run_dir: str | Path) Tuple[numpy.ndarray, numpy.ndarray, Dict[str, Any], Dict[str, Any]]
Load encoding model weights and hyperparameters.
- Parameters:
run_dir – Path to the run directory
- Returns:
Tuple of (weights, best_alphas, hyperparams, metrics)
- save_encoding_model(weights: numpy.ndarray, best_alphas: numpy.ndarray, hyperparams: Dict[str, Any], metrics: Dict[str, Any], save_weights: bool = False) Path
Save encoding model weights and hyperparameters.
- Parameters:
weights – Model weights (n_features, n_targets)
best_alphas – Best alpha values for each target
hyperparams – Dictionary of hyperparameters
metrics – Dictionary of evaluation metrics
save_weights – Whether to also write the weight matrix to disk (default: False)
- Returns:
Path to the run directory
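A usage sketch for ModelSaver. The array sizes and the hyperparameter/metric keys are illustrative assumptions; the call signatures follow the API above.

    import numpy as np
    from encoding.utils import ModelSaver

    saver = ModelSaver(base_dir="results")

    weights = np.random.randn(3072, 20484)     # (n_features, n_targets), assumed sizes
    best_alphas = np.full(20484, 10.0)         # one ridge alpha per target
    hyperparams = {"layer": 8, "n_delays": 4}  # assumed keys
    metrics = {"mean_test_r": 0.12}            # assumed keys

    run_dir = saver.save_encoding_model(
        weights, best_alphas, hyperparams, metrics, save_weights=True
    )

    # Reload a saved run and list everything saved under base_dir.
    w, alphas, hp, m = saver.load_encoding_model(run_dir)
    for run in saver.list_runs():
        print(run)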
- class encoding.utils.SpeechActivationCache(cache_dir: str = 'speech_cache')
Cache for speech model activations (multi-layer, stored in a single .pkl file); see the sketch after the method listing below.
- __init__(cache_dir: str = 'speech_cache')
- get_cache_key(*, audio_id: str, model_name: str, chunk_size: float, context_size: float, pool: str, target_sample_rate: int, dataset_type: str = 'speech', extra: Dict[str, Any] | None = None) str
- get_cache_path(cache_key: str) Path
- load_activations(cache_key: str) numpy.ndarray | None
- load_multi_layer_activations(cache_key: str) SpeechLazyLayerCache | None
- save_activations(cache_key: str, activations: numpy.ndarray)
- save_multi_layer_activations(cache_key: str, all_layer_activations: Dict[int, numpy.ndarray], metadata: Dict[str, Any], times: numpy.ndarray | None = None) None
- Persist all layers (plus metadata and optional chunk times) to a single pickle with the structure:
    {
        "metadata": { ... speech params ... },
        "layers":   { int: np.ndarray [n_chunks, D], ... },
        "times":    np.ndarray [n_chunks] or None
    }
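A sketch of the speech-cache round trip. The parameter values, metadata fields, and array shapes are examples only; get_cache_key builds a key string from the keyword arguments shown in the signature above.

    import numpy as np
    from encoding.utils import SpeechActivationCache

    speech_cache = SpeechActivationCache(cache_dir="speech_cache")

    key = speech_cache.get_cache_key(
        audio_id="story01",
        model_name="wav2vec2-base",   # assumed model identifier
        chunk_size=2.0,
        context_size=16.0,
        pool="mean",
        target_sample_rate=16000,
    )

    layers = {0: np.random.randn(50, 768), 1: np.random.randn(50, 768)}  # illustrative
    times = np.linspace(0.0, 98.0, 50)                                   # chunk times (s)
    metadata = {"model_name": "wav2vec2-base", "pool": "mean"}           # assumed fields

    speech_cache.save_multi_layer_activations(key, layers, metadata, times=times)
    lazy = speech_cache.load_multi_layer_activations(key)  # SpeechLazyLayerCache or None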
- class encoding.utils.SpeechLazyLayerCache(cache_file_path: str | Path)
Lazy loader for speech multi-layer activations (single .pkl file), API-compatible with LazyLayerCache; a usage sketch follows this class entry.
- __init__(cache_file_path: str | Path)
- clear_loaded_layers() None
- get_available_layers() List[int]
- get_layer(layer_idx: int) numpy.ndarray
- get_layers(layer_indices: List[int]) List[numpy.ndarray]
- get_metadata() Dict[str, Any]
- get_times() numpy.ndarray | None
- validate_params(*, expected: Dict[str, Any]) None
Validate core speech params match (e.g., model_name, chunk/context size, pool, sr). Raises ValueError on mismatch.
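A sketch of consuming a SpeechLazyLayerCache loaded from disk. The file path and the keys of the expected-parameter dictionary are assumptions (taken here to mirror the get_cache_key arguments).

    from encoding.utils import SpeechLazyLayerCache

    lazy = SpeechLazyLayerCache("speech_cache/story01_wav2vec2.pkl")  # assumed file name

    # Raises ValueError if the cached run used different speech settings.
    lazy.validate_params(expected={
        "model_name": "wav2vec2-base",
        "chunk_size": 2.0,
        "context_size": 16.0,
        "pool": "mean",
        "target_sample_rate": 16000,
    })

    feats = lazy.get_layer(1)   # (n_chunks, D) activations for one layer
    times = lazy.get_times()    # (n_chunks,) chunk timestamps, or None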
- encoding.utils.demean(v)
Removes the mean from each column of [v].
- encoding.utils.dm(v)
Removes the mean from each column of [v].
- encoding.utils.make_delayed(stim, delays, circpad=False)
Creates non-interpolated concatenated delayed versions of [stim] with the given [delays] (in samples).
If [circpad], instead of being padded with zeros, [stim] will be circularly shifted.
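A sketch of building finite-impulse-response (FIR) features with make_delayed. The shapes and delay values are illustrative, and the commented output shape assumes the delayed copies are concatenated column-wise.

    import numpy as np
    from encoding.utils import make_delayed

    stim = np.random.randn(300, 50)      # (n_timepoints, n_features)
    delays = [1, 2, 3, 4]                # delays in samples (e.g. TRs)

    delayed = make_delayed(stim, delays) # zero-padded shifts (circpad=False by default)
    print(delayed.shape)                 # expected (300, 200): one feature block per delay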
- encoding.utils.mcorr(c1, c2)
Matrix correlation. Finds the correlation between each column of [c1] and the corresponding column of [c2].
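A sketch of scoring predictions with mcorr; each column is treated as one target (e.g. a voxel), and the per-column output shape is an assumption.

    import numpy as np
    from encoding.utils import mcorr

    pred = np.random.randn(300, 1000)  # (n_timepoints, n_voxels), illustrative
    obs = np.random.randn(300, 1000)
    r = mcorr(pred, obs)               # one correlation per column, assumed shape (1000,)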
- encoding.utils.rescale(v)
Rescales each column of [v] to have unit variance.
- encoding.utils.rs(v)
Rescales each column of [v] to have unit variance.
- encoding.utils.unmask_correlations_for_plotting(masked_correlations: numpy.ndarray, mask_indices: numpy.ndarray, full_size: int) numpy.ndarray
Expand masked correlations back to full brain size for plotting.
- Parameters:
masked_correlations – Correlations from masked analysis (n_masked_voxels,)
mask_indices – Indices where mask was True (n_masked_voxels,)
full_size – Size of full brain (e.g., 20484 for fsaverage5)
- Returns:
full_correlations – Full-size array with NaNs in unmasked regions (full_size,)
- Return type:
numpy.ndarray
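A sketch of expanding masked correlations back to a full fsaverage5 surface for plotting; the mask used here is an arbitrary illustration.

    import numpy as np
    from encoding.utils import unmask_correlations_for_plotting

    full_size = 20484                                  # fsaverage5 vertex count
    mask_indices = np.arange(0, full_size, 2)          # illustrative mask: every other vertex
    masked_correlations = np.random.rand(mask_indices.size)

    full = unmask_correlations_for_plotting(masked_correlations, mask_indices, full_size)
    print(full.shape)            # (20484,)
    print(np.isnan(full).sum())  # vertices outside the mask are NaN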
- encoding.utils.validate_path(func)
Decorator to validate that assembly_path exists before loading.
- encoding.utils.xcorr(c1, c2)
Cross-column correlation. Finds the correlation between each row of [c1] and each row of [c2].
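A sketch of xcorr comparing every row of one matrix with every row of another; the shapes are illustrative, and the (n_rows_c1, n_rows_c2) result shape is an assumption.

    import numpy as np
    from encoding.utils import xcorr

    c1 = np.random.randn(20, 300)  # 20 patterns, 300 samples each
    c2 = np.random.randn(30, 300)
    R = xcorr(c1, c2)              # assumed correlation matrix of shape (20, 30)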
- encoding.utils.zs(v)
Z-scores (standardizes) each column of [v].
- encoding.utils.zscore(v)
Z-scores (standardizes) each column of [v].
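A sketch of the column-wise preprocessing helpers (zscore/zs, demean/dm, rescale/rs). The numpy expressions in the comments are approximate equivalents for intuition, not necessarily the library's exact implementation.

    import numpy as np
    from encoding.utils import demean, rescale, zscore

    v = np.random.randn(200, 10) * 5.0 + 3.0  # (n_samples, n_features)

    vz = zscore(v)   # ~ (v - v.mean(0)) / v.std(0): zero mean, unit variance per column
    vd = demean(v)   # ~ v - v.mean(0): zero mean per column
    vr = rescale(v)  # ~ v / v.std(0): unit variance per column

    print(vz.mean(0).round(3), vz.std(0).round(3))  # ~0 and ~1 per column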