Note

This page is reference documentation. It only explains the function signature, not how to use it. Please refer to the user guide for the big picture.

brainprep.interfaces.incremental_pca

brainprep.interfaces.incremental_pca(image_files_regex, output_dir, batch_size=10, dryrun=False)

Perform an Incremental PCA with 2 components on a collection of images matched by a regex pattern, processing them in batches.

The function loads all images matching the provided regex, splits them into batches, and incrementally fits a PCA model using scikit-learn’s IncrementalPCA. Each image is flattened into a 1D vector before processing. After fitting, the function transforms all batches to obtain the first two principal components for each image. These components are saved in a TSV file as two columns named pc1 and pc2. BIDS entities (participant_id, session, run) are extracted from filenames using parse_bids_keys and included in the output table.
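The flatten, fit-in-batches, then transform flow described above can be sketched as follows. This is a minimal illustration built on scikit-learn's IncrementalPCA, not brainprep's actual implementation: the helper names incremental_pca_sketch and components_to_tsv are hypothetical, the images are random in-memory arrays rather than files matched by a regex, and BIDS entity extraction is left out. One assumption worth noting: partial_fit needs at least as many samples as components in each batch, so the sketch merges a short trailing batch into the others, which may differ from how brainprep splits batches.

```python
import csv
import io

import numpy as np
from sklearn.decomposition import IncrementalPCA


def incremental_pca_sketch(images, batch_size=10, n_components=2):
    """Fit IncrementalPCA batch by batch, then project every image."""
    if len(images) < n_components:
        raise ValueError("Need at least as many images as components.")
    # Flatten each image into a 1D vector: shape (n_images, n_features).
    flat = np.stack([np.asarray(img, dtype=float).ravel() for img in images])
    # Assumption of this sketch: choose the batch count so that every batch
    # holds at least max(batch_size, n_components) samples.
    if batch_size is None:
        n_batches = 1
    else:
        n_batches = max(1, len(flat) // max(batch_size, n_components))
    batches = np.array_split(flat, n_batches)
    model = IncrementalPCA(n_components=n_components)
    for batch in batches:
        model.partial_fit(batch)  # incremental update of the PCA model
    # Second pass: project each batch onto the fitted components.
    return np.vstack([model.transform(batch) for batch in batches])


def components_to_tsv(components):
    """Serialize the two components as TSV text with pc1 and pc2 columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["pc1", "pc2"])
    writer.writerows(components.tolist())
    return buffer.getvalue()


# Synthetic stand-in for 25 same-sized 3D images.
rng = np.random.default_rng(0)
images = [rng.normal(size=(4, 4, 4)) for _ in range(25)]
components = incremental_pca_sketch(images, batch_size=10)
tsv_text = components_to_tsv(components)
```

Fitting and transforming in two separate passes mirrors the description above: the model must see every batch before any image is projected, otherwise early batches would be projected onto components that later batches still change.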

Parameters:

image_files_regex (str): A regex matching the image files, one image per file. All images must have the same size.
output_dir (Directory): Directory where the TSV file containing the values of the first two PCA components will be saved, along with a directory containing the graphs for all batches.
batch_size (int): Number of images to use in each batch. If None, a single batch is used. Default is 10.
dryrun (bool): If True, skip actual computation and file writing. Default is False.

Returns:

pca_file (File): Path to the generated pca.tsv file containing the PCA results.

Raises:

ValueError: If no image matches the regex pattern, or if the dataset contains fewer than 2 images, which prevents PCA computation.