encoding.utils

class encoding.utils.ActivationCache(cache_dir: str = 'cache')

Handles caching and loading of language model activations with multi-layer support.

__init__(cache_dir: str = 'cache')

Initialize the cache.

Parameters:

cache_dir – Directory to store cached activations

get_cache_path(cache_key: str) → Path

Get the path for a cached activation file.

load_activations(cache_key: str) → numpy.ndarray | None

Load single-layer activations from cache (kept for backward compatibility).

load_multi_layer_activations(cache_key: str) → LazyLayerCache | None

Load multi-layer activations from cache with lazy loading.

Parameters:

cache_key – Cache key for the activations

Returns:

LazyLayerCache object if cache exists, None otherwise

save_activations(cache_key: str, activations: numpy.ndarray)

Save single-layer activations to cache (kept for backward compatibility).

save_multi_layer_activations(cache_key: str, all_layer_activations: Dict[int, numpy.ndarray], metadata: Dict[str, Any]) → None

Save multi-layer activations to cache with lazy loading support.

Parameters:
  • cache_key – Cache key for the activations

  • all_layer_activations – Dictionary mapping layer indices to activations

  • metadata – Metadata about the cached activations

class encoding.utils.LazyLayerCache(cache_file_path: str | Path)

Lazy loading cache for multi-layer activations.

__init__(cache_file_path: str | Path)

Initialize the lazy layer cache.

Parameters:

cache_file_path – Path to the cache file

clear_loaded_layers() → None

Clear loaded layers from memory.

get_available_layers() → List[int]

Get list of available layers in the cache.

Returns:

List of available layer indices

get_layer(layer_idx: int) → numpy.ndarray

Load specific layer on demand.

Parameters:

layer_idx – Index of the layer to load

Returns:

Layer activations as numpy array

get_layers(layer_indices: List[int]) → List[numpy.ndarray]

Load multiple specific layers.

Parameters:

layer_indices – List of layer indices to load

Returns:

List of layer activations

get_metadata() → Dict[str, Any]

Load only metadata (fast).

Returns:

Dictionary containing cache metadata

validate_context_type(expected_context_type: str) → None

Validate that the cache was created with the expected context type.

Parameters:

expected_context_type – Expected context type

Raises:

ValueError – If context type doesn’t match
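
The lazy-loading pattern above can be sketched with a simplified stand-in (the class name LazyLayerCacheSketch is hypothetical and this is not the library implementation; it assumes a single pickle holding a "metadata" dict and a per-layer "layers" dict):

```python
import pickle
from pathlib import Path

class LazyLayerCacheSketch:
    """Simplified stand-in: layers are unpickled on first access and kept
    in memory until cleared. Assumes a {"metadata": ..., "layers": {...}}
    pickle layout; the real class may store its data differently."""

    def __init__(self, cache_file_path):
        self.path = Path(cache_file_path)
        self._loaded = {}  # layer_idx -> np.ndarray, filled on demand

    def _read(self):
        with open(self.path, "rb") as f:
            return pickle.load(f)

    def get_metadata(self):
        return self._read()["metadata"]

    def get_available_layers(self):
        return sorted(self._read()["layers"])

    def get_layer(self, layer_idx):
        if layer_idx not in self._loaded:  # only hit the file once per layer
            self._loaded[layer_idx] = self._read()["layers"][layer_idx]
        return self._loaded[layer_idx]

    def get_layers(self, layer_indices):
        return [self.get_layer(i) for i in layer_indices]

    def clear_loaded_layers(self):
        self._loaded.clear()  # free memory; layers reload on next access
```

Note that this sketch re-reads the whole pickle on every miss; a real implementation would index into the file (or split layers across files) so that get_metadata() is genuinely fast.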

class encoding.utils.ModelSaver(base_dir: str = 'results')

Class for saving and loading model weights and hyperparameters.

__init__(base_dir: str = 'results')

Initialize the ModelSaver.

Parameters:

base_dir – Base directory for saving results

list_runs() → List[Dict[str, Any]]

List all saved runs with their hyperparameters and metrics.

Returns:

List of dictionaries containing run information

load_encoding_model(run_dir: str | Path) → Tuple[numpy.ndarray, numpy.ndarray, Dict[str, Any], Dict[str, Any]]

Load encoding model weights and hyperparameters.

Parameters:

run_dir – Path to the run directory

Returns:

Tuple of (weights, best_alphas, hyperparams, metrics)

save_encoding_model(weights: numpy.ndarray, best_alphas: numpy.ndarray, hyperparams: Dict[str, Any], metrics: Dict[str, Any], save_weights: bool = False) → Path

Save encoding model weights and hyperparameters.

Parameters:
  • weights – Model weights (n_features, n_targets)

  • best_alphas – Best alpha values for each target

  • hyperparams – Dictionary of hyperparameters

  • metrics – Dictionary of evaluation metrics

  • save_weights – Whether to also write the weight arrays to disk (default: False)

Returns:

Path to the run directory
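
An illustrative save routine in the spirit of save_encoding_model (the function name save_run_sketch, the run-directory naming, and the file names and formats here are all assumptions, not the library's actual on-disk layout):

```python
import json
from pathlib import Path

import numpy as np

def save_run_sketch(base_dir, weights, best_alphas, hyperparams, metrics,
                    save_weights=False):
    """Save one run to a fresh directory; names and layout are assumptions."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    # Next free run directory: run_000, run_001, ...
    run_dir = base / f"run_{len(list(base.glob('run_*'))):03d}"
    run_dir.mkdir()
    (run_dir / "hyperparams.json").write_text(json.dumps(hyperparams))
    (run_dir / "metrics.json").write_text(json.dumps(metrics))
    if save_weights:  # weight matrices can be large, so persisting is opt-in
        np.save(run_dir / "weights.npy", weights)
        np.save(run_dir / "best_alphas.npy", best_alphas)
    return run_dir
```

Keeping hyperparameters and metrics in small JSON files next to the (optional) weight arrays is what makes a later list_runs()-style scan cheap.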

class encoding.utils.SpeechActivationCache(cache_dir: str = 'speech_cache')

Caching for speech model activations (multi-layer, single .pkl file).

__init__(cache_dir: str = 'speech_cache')
get_cache_key(*, audio_id: str, model_name: str, chunk_size: float, context_size: float, pool: str, target_sample_rate: int, dataset_type: str = 'speech', extra: Dict[str, Any] | None = None) → str
get_cache_path(cache_key: str) → Path
load_activations(cache_key: str) → numpy.ndarray | None
load_multi_layer_activations(cache_key: str) → SpeechLazyLayerCache | None
save_activations(cache_key: str, activations: numpy.ndarray)
save_multi_layer_activations(cache_key: str, all_layer_activations: Dict[int, numpy.ndarray], metadata: Dict[str, Any], times: numpy.ndarray | None = None) → None
Persist everything to a single pickle with the structure:

    {
        "metadata": { … speech params … },
        "layers": { int: np.ndarray [n_chunks, D], … },
        "times": np.ndarray [n_chunks] or None
    }
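
A round-trip through this layout can be sketched as follows (the metadata keys shown are illustrative, not the full set of speech params):

```python
import os
import pickle
import tempfile

import numpy as np

# Build the documented single-pickle payload (n_chunks=4, D=8 here).
payload = {
    "metadata": {"model_name": "example-model", "chunk_size": 0.5},
    "layers": {0: np.zeros((4, 8)), 1: np.ones((4, 8))},
    "times": np.linspace(0.0, 1.5, 4),
}

fd, path = tempfile.mkstemp(suffix=".pkl")
os.close(fd)
with open(path, "wb") as f:
    pickle.dump(payload, f)   # persist everything in one file

with open(path, "rb") as f:
    loaded = pickle.load(f)   # read it back
os.unlink(path)

assert set(loaded) == {"metadata", "layers", "times"}
```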

class encoding.utils.SpeechLazyLayerCache(cache_file_path: str | Path)

Lazy (API-compatible) loader for speech multi-layer activations (single .pkl file).

__init__(cache_file_path: str | Path)
clear_loaded_layers() → None
get_available_layers() → List[int]
get_layer(layer_idx: int) → numpy.ndarray
get_layers(layer_indices: List[int]) → List[numpy.ndarray]
get_metadata() → Dict[str, Any]
get_times() → numpy.ndarray | None
validate_params(*, expected: Dict[str, Any]) → None

Validate that core speech parameters (e.g., model_name, chunk/context size, pool, sample rate) match the cached values. Raises ValueError on mismatch.

encoding.utils.demean(v)

Removes the mean from each column of [v].

encoding.utils.dm(v)

Alias of demean(); removes the mean from each column of [v].

encoding.utils.make_delayed(stim, delays, circpad=False)

Creates non-interpolated concatenated delayed versions of [stim] with the given [delays] (in samples).

If [circpad], instead of being padded with zeros, [stim] will be circularly shifted.
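
A minimal re-implementation sketch of this behavior (illustrative, not the library source; the name make_delayed_sketch is ours):

```python
import numpy as np

def make_delayed_sketch(stim, delays, circpad=False):
    """Stack copies of stim, shifted by each delay (in samples), side by side."""
    n_t, _ = stim.shape
    shifted_copies = []
    for d in delays:
        if circpad:
            shifted = np.roll(stim, d, axis=0)   # circular shift in time
        else:
            shifted = np.zeros_like(stim)        # zero padding at the edges
            if d > 0:
                shifted[d:] = stim[:n_t - d]
            elif d < 0:
                shifted[:d] = stim[-d:]
            else:
                shifted = stim.copy()
        shifted_copies.append(shifted)
    return np.hstack(shifted_copies)             # (n_t, n_features * len(delays))
```

With stim of shape (n_t, f) and delays [1, 2, 3, 4], the result has shape (n_t, 4·f): the standard FIR design matrix used to fit a hemodynamic response in encoding models.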

encoding.utils.mcorr(c1, c2)

Matrix correlation. Find the correlation between each column of [c1] and the corresponding column of [c2].
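
An illustrative sketch of the computation (assumes c1 and c2 have the same shape; mcorr_sketch is our name, not the library's):

```python
import numpy as np

def mcorr_sketch(c1, c2):
    """Pearson correlation between each column of c1 and the matching column of c2."""
    z1 = (c1 - c1.mean(axis=0)) / c1.std(axis=0)  # z-score each column
    z2 = (c2 - c2.mean(axis=0)) / c2.std(axis=0)
    return (z1 * z2).mean(axis=0)                 # one correlation per column
```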

encoding.utils.rescale(v)

Rescales each column of [v] to have unit variance.

encoding.utils.rs(v)

Alias of rescale(); rescales each column of [v] to have unit variance.

encoding.utils.unmask_correlations_for_plotting(masked_correlations: numpy.ndarray, mask_indices: numpy.ndarray, full_size: int) numpy.ndarray

Expand masked correlations back to full brain size for plotting.

Parameters:
  • masked_correlations – Correlations from masked analysis (n_masked_voxels,)

  • mask_indices – Indices where mask was True (n_masked_voxels,)

  • full_size – Size of full brain (e.g., 20484 for fsaverage5)

Returns:

Full-size array with NaNs in unmasked regions (full_size,)

Return type:

numpy.ndarray
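
The expansion amounts to a NaN-filled scatter, as in this sketch (unmask_sketch is our name for the illustration):

```python
import numpy as np

def unmask_sketch(masked_correlations, mask_indices, full_size):
    """Scatter masked values into a full-size array; NaN everywhere else."""
    full = np.full(full_size, np.nan)        # unmasked vertices stay NaN
    full[mask_indices] = masked_correlations # place values at their true indices
    return full
```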

encoding.utils.validate_path(func)

Decorator to validate that assembly_path exists before loading.
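
The decorator's body is not shown here; a plausible sketch, assuming it wraps a method on an object with an assembly_path attribute (the attribute name comes from the docstring; the exact exception type is our assumption):

```python
import functools
from pathlib import Path

def validate_path_sketch(func):
    """Fail fast if self.assembly_path does not exist before loading."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        path = Path(self.assembly_path)
        if not path.exists():
            raise FileNotFoundError(f"assembly_path not found: {path}")
        return func(self, *args, **kwargs)
    return wrapper
```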

encoding.utils.xcorr(c1, c2)

Cross-column correlation. Finds the correlation between each row of [c1] and each row of [c2].
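
An illustrative sketch (xcorr_sketch is our name; it returns an (n_rows1, n_rows2) matrix of correlations between every row of c1 and every row of c2):

```python
import numpy as np

def xcorr_sketch(c1, c2):
    """Pearson correlation between each row of c1 and each row of c2."""
    z1 = (c1 - c1.mean(axis=1, keepdims=True)) / c1.std(axis=1, keepdims=True)
    z2 = (c2 - c2.mean(axis=1, keepdims=True)) / c2.std(axis=1, keepdims=True)
    return z1 @ z2.T / c1.shape[1]  # (n_rows1, n_rows2) correlation matrix
```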

encoding.utils.zs(v)

Alias of zscore(); z-scores (standardizes) each column of [v].

encoding.utils.zscore(v)

Z-scores (standardizes) each column of [v].
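
The column-wise helpers (demean/dm, rescale/rs, zscore/zs) can be sketched in a few lines of NumPy (illustrative re-implementations with our own _sketch names, not the library source):

```python
import numpy as np

def demean_sketch(v):
    """Remove the mean from each column."""
    return v - v.mean(axis=0)

def rescale_sketch(v):
    """Scale each column to unit variance."""
    return v / v.std(axis=0)

def zscore_sketch(v):
    """Standardize each column: zero mean, unit variance."""
    return (v - v.mean(axis=0)) / v.std(axis=0)
```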