encoding.utils

class encoding.utils.ActivationCache(cache_dir: str = 'cache')

Handles caching and loading of language model activations with multi-layer support.

__init__(cache_dir: str = 'cache')

Initialize the cache.

Parameters:

cache_dir – Directory to store cached activations

get_cache_path(cache_key: str) → Path

Get the path for a cached activation file.

load_activations(cache_key: str) → numpy.ndarray | None

Load single-layer activations from cache (kept for backward compatibility).

load_multi_layer_activations(cache_key: str) → LazyLayerCache | None

Load multi-layer activations from cache with lazy loading.

Parameters:

cache_key – Cache key for the activations

Returns:

LazyLayerCache object if cache exists, None otherwise

save_activations(cache_key: str, activations: numpy.ndarray)

Save single-layer activations to cache (kept for backward compatibility).

save_multi_layer_activations(cache_key: str, all_layer_activations: Dict[int, numpy.ndarray], metadata: Dict[str, Any]) → None

Save multi-layer activations to cache with lazy loading support.

Parameters:
  • cache_key – Cache key for the activations

  • all_layer_activations – Dictionary mapping layer indices to activations

  • metadata – Metadata about the cached activations

class encoding.utils.LazyLayerCache(cache_file_path: str | Path)

Lazy loading cache for multi-layer activations.

__init__(cache_file_path: str | Path)

Initialize the lazy layer cache.

Parameters:

cache_file_path – Path to the cache file

clear_loaded_layers() → None

Clear loaded layers from memory.

get_available_layers() → List[int]

Get list of available layers in the cache.

Returns:

List of available layer indices

get_layer(layer_idx: int) → numpy.ndarray

Load specific layer on demand.

Parameters:

layer_idx – Index of the layer to load

Returns:

Layer activations as numpy array

get_layers(layer_indices: List[int]) → List[numpy.ndarray]

Load multiple specific layers.

Parameters:

layer_indices – List of layer indices to load

Returns:

List of layer activations

get_metadata() → Dict[str, Any]

Load only metadata (fast).

Returns:

Dictionary containing cache metadata

validate_context_type(expected_context_type: str) → None

Validate that the cache was created with the expected context type.

Parameters:

expected_context_type – Expected context type

Raises:

ValueError – If context type doesn’t match
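
The lazy-loading pattern above can be sketched with a simplified stand-in (the class name LazyLayerCacheSketch is hypothetical and this is not the library implementation; it assumes a single pickle holding a "metadata" dict and a per-layer "layers" dict):

```python
import pickle
from pathlib import Path

class LazyLayerCacheSketch:
    """Simplified stand-in: layers are unpickled on first access and kept
    in memory until cleared. Assumes a {"metadata": ..., "layers": {...}}
    pickle layout; the real class may store its data differently."""

    def __init__(self, cache_file_path):
        self.path = Path(cache_file_path)
        self._loaded = {}  # layer_idx -> np.ndarray, filled on demand

    def _read(self):
        with open(self.path, "rb") as f:
            return pickle.load(f)

    def get_metadata(self):
        return self._read()["metadata"]

    def get_available_layers(self):
        return sorted(self._read()["layers"])

    def get_layer(self, layer_idx):
        if layer_idx not in self._loaded:  # only hit the file once per layer
            self._loaded[layer_idx] = self._read()["layers"][layer_idx]
        return self._loaded[layer_idx]

    def get_layers(self, layer_indices):
        return [self.get_layer(i) for i in layer_indices]

    def clear_loaded_layers(self):
        self._loaded.clear()  # free memory; layers reload on next access
```

Note that this sketch re-reads the whole pickle on every miss; a real implementation would index into the file (or split layers across files) so that get_metadata() is genuinely fast.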

class encoding.utils.ModelSaver(base_dir: str = 'results')

Class for saving and loading model weights and hyperparameters.

__init__(base_dir: str = 'results')

Initialize the ModelSaver.

Parameters:

base_dir – Base directory for saving results

list_runs() → List[Dict[str, Any]]

List all saved runs with their hyperparameters and metrics.

Returns:

List of dictionaries containing run information

load_encoding_model(run_dir: str | Path) → Tuple[numpy.ndarray, numpy.ndarray, Dict[str, Any], Dict[str, Any]]

Load encoding model weights and hyperparameters.

Parameters:

run_dir – Path to the run directory

Returns:

Tuple of (weights, best_alphas, hyperparams, metrics)

save_encoding_model(weights: numpy.ndarray, best_alphas: numpy.ndarray, hyperparams: Dict[str, Any], metrics: Dict[str, Any], save_weights: bool = False) → Path

Save encoding model weights and hyperparameters.

Parameters:
  • weights – Model weights (n_features, n_targets)

  • best_alphas – Best alpha values for each target

  • hyperparams – Dictionary of hyperparameters

  • metrics – Dictionary of evaluation metrics

  • save_weights – Whether to also write the weight arrays to disk (default: False)

Returns:

Path to the run directory
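
An illustrative save routine in the spirit of save_encoding_model (the function name save_run_sketch, the run-directory naming, and the file names and formats here are all assumptions, not the library's actual on-disk layout):

```python
import json
from pathlib import Path

import numpy as np

def save_run_sketch(base_dir, weights, best_alphas, hyperparams, metrics,
                    save_weights=False):
    """Save one run to a fresh directory; names and layout are assumptions."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    # Next free run directory: run_000, run_001, ...
    run_dir = base / f"run_{len(list(base.glob('run_*'))):03d}"
    run_dir.mkdir()
    (run_dir / "hyperparams.json").write_text(json.dumps(hyperparams))
    (run_dir / "metrics.json").write_text(json.dumps(metrics))
    if save_weights:  # weight matrices can be large, so persisting is opt-in
        np.save(run_dir / "weights.npy", weights)
        np.save(run_dir / "best_alphas.npy", best_alphas)
    return run_dir
```

Keeping hyperparameters and metrics in small JSON files next to the (optional) weight arrays is what makes a later list_runs()-style scan cheap.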

class encoding.utils.SpeechActivationCache(cache_dir: str = 'speech_cache')

Caching for speech model activations (multi-layer, single .pkl file).

__init__(cache_dir: str = 'speech_cache')
get_cache_key(*, audio_id: str, model_name: str, chunk_size: float, context_size: float, pool: str, target_sample_rate: int, dataset_type: str = 'speech', extra: Dict[str, Any] | None = None) → str
get_cache_path(cache_key: str) → Path
load_activations(cache_key: str) → numpy.ndarray | None
load_multi_layer_activations(cache_key: str) → SpeechLazyLayerCache | None
save_activations(cache_key: str, activations: numpy.ndarray)
save_multi_layer_activations(cache_key: str, all_layer_activations: Dict[int, numpy.ndarray], metadata: Dict[str, Any], times: numpy.ndarray | None = None) → None
Persist everything to a single pickle with the structure:

    {
        "metadata": { … speech params … },
        "layers": { int: np.ndarray [n_chunks, D], … },
        "times": np.ndarray [n_chunks] or None
    }
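
A round-trip through this layout can be sketched as follows (the metadata keys shown are illustrative, not the full set of speech params):

```python
import os
import pickle
import tempfile

import numpy as np

# Build the documented single-pickle payload (n_chunks=4, D=8 here).
payload = {
    "metadata": {"model_name": "example-model", "chunk_size": 0.5},
    "layers": {0: np.zeros((4, 8)), 1: np.ones((4, 8))},
    "times": np.linspace(0.0, 1.5, 4),
}

fd, path = tempfile.mkstemp(suffix=".pkl")
os.close(fd)
with open(path, "wb") as f:
    pickle.dump(payload, f)   # persist everything in one file

with open(path, "rb") as f:
    loaded = pickle.load(f)   # read it back
os.unlink(path)

assert set(loaded) == {"metadata", "layers", "times"}
```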

class encoding.utils.SpeechLazyLayerCache(cache_file_path: str | Path)

Lazy (API-compatible) loader for speech multi-layer activations (single .pkl file).

__init__(cache_file_path: str | Path)
clear_loaded_layers() → None
get_available_layers() → List[int]
get_layer(layer_idx: int) → numpy.ndarray
get_layers(layer_indices: List[int]) → List[numpy.ndarray]
get_metadata() → Dict[str, Any]
get_times() → numpy.ndarray | None
validate_params(*, expected: Dict[str, Any]) → None

Validate that core speech parameters (e.g., model_name, chunk/context size, pool, sample rate) match the cached values. Raises ValueError on mismatch.

encoding.utils.demean(v)

Removes the mean from each column of [v].

encoding.utils.dm(v)

Alias of demean(); removes the mean from each column of [v].

encoding.utils.make_delayed(stim, delays, circpad=False)

Creates non-interpolated concatenated delayed versions of [stim] with the given [delays] (in samples).

If [circpad], instead of being padded with zeros, [stim] will be circularly shifted.
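
A minimal re-implementation sketch of this behavior (illustrative, not the library source; the name make_delayed_sketch is ours):

```python
import numpy as np

def make_delayed_sketch(stim, delays, circpad=False):
    """Stack copies of stim, shifted by each delay (in samples), side by side."""
    n_t, _ = stim.shape
    shifted_copies = []
    for d in delays:
        if circpad:
            shifted = np.roll(stim, d, axis=0)   # circular shift in time
        else:
            shifted = np.zeros_like(stim)        # zero padding at the edges
            if d > 0:
                shifted[d:] = stim[:n_t - d]
            elif d < 0:
                shifted[:d] = stim[-d:]
            else:
                shifted = stim.copy()
        shifted_copies.append(shifted)
    return np.hstack(shifted_copies)             # (n_t, n_features * len(delays))
```

With stim of shape (n_t, f) and delays [1, 2, 3, 4], the result has shape (n_t, 4·f): the standard FIR design matrix used to fit a hemodynamic response in encoding models.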

encoding.utils.mcorr(c1, c2)

Matrix correlation. Find the correlation between each column of [c1] and the corresponding column of [c2].
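
An illustrative sketch of the computation (assumes c1 and c2 have the same shape; mcorr_sketch is our name, not the library's):

```python
import numpy as np

def mcorr_sketch(c1, c2):
    """Pearson correlation between each column of c1 and the matching column of c2."""
    z1 = (c1 - c1.mean(axis=0)) / c1.std(axis=0)  # z-score each column
    z2 = (c2 - c2.mean(axis=0)) / c2.std(axis=0)
    return (z1 * z2).mean(axis=0)                 # one correlation per column
```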

encoding.utils.rescale(v)

Rescales each column of [v] to have unit variance.

encoding.utils.rs(v)

Alias of rescale(); rescales each column of [v] to have unit variance.

encoding.utils.unmask_correlations_for_plotting(masked_correlations: numpy.ndarray, mask_indices: numpy.ndarray, full_size: int) numpy.ndarray

Expand masked correlations back to full brain size for plotting.

Parameters:
  • masked_correlations – Correlations from masked analysis (n_masked_voxels,)

  • mask_indices – Indices where mask was True (n_masked_voxels,)

  • full_size – Size of full brain (e.g., 20484 for fsaverage5)

Returns:

Full-size array with NaNs in unmasked regions (full_size,)

Return type:

numpy.ndarray
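
The expansion amounts to a NaN-filled scatter, as in this sketch (unmask_sketch is our name for the illustration):

```python
import numpy as np

def unmask_sketch(masked_correlations, mask_indices, full_size):
    """Scatter masked values into a full-size array; NaN everywhere else."""
    full = np.full(full_size, np.nan)        # unmasked vertices stay NaN
    full[mask_indices] = masked_correlations # place values at their true indices
    return full
```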

encoding.utils.validate_path(func)

Decorator to validate that assembly_path exists before loading.
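
The decorator's body is not shown here; a plausible sketch, assuming it wraps a method on an object with an assembly_path attribute (the attribute name comes from the docstring; the exact exception type is our assumption):

```python
import functools
from pathlib import Path

def validate_path_sketch(func):
    """Fail fast if self.assembly_path does not exist before loading."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        path = Path(self.assembly_path)
        if not path.exists():
            raise FileNotFoundError(f"assembly_path not found: {path}")
        return func(self, *args, **kwargs)
    return wrapper
```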

encoding.utils.xcorr(c1, c2)

Cross-column correlation. Finds the correlation between each row of [c1] and each row of [c2].
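
An illustrative sketch (xcorr_sketch is our name; it returns an (n_rows1, n_rows2) matrix of correlations between every row of c1 and every row of c2):

```python
import numpy as np

def xcorr_sketch(c1, c2):
    """Pearson correlation between each row of c1 and each row of c2."""
    z1 = (c1 - c1.mean(axis=1, keepdims=True)) / c1.std(axis=1, keepdims=True)
    z2 = (c2 - c2.mean(axis=1, keepdims=True)) / c2.std(axis=1, keepdims=True)
    return z1 @ z2.T / c1.shape[1]  # (n_rows1, n_rows2) correlation matrix
```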

encoding.utils.zs(v)

Alias of zscore(); z-scores (standardizes) each column of [v].

encoding.utils.zscore(v)

Z-scores (standardizes) each column of [v].
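
The column-wise helpers (demean/dm, rescale/rs, zscore/zs) can be sketched in a few lines of NumPy (illustrative re-implementations with our own _sketch names, not the library source):

```python
import numpy as np

def demean_sketch(v):
    """Remove the mean from each column."""
    return v - v.mean(axis=0)

def rescale_sketch(v):
    """Scale each column to unit variance."""
    return v / v.std(axis=0)

def zscore_sketch(v):
    """Standardize each column: zero mean, unit variance."""
    return (v - v.mean(axis=0)) / v.std(axis=0)
```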