habitat_analysis module

Habitat Analysis module for the HABIT package.

This module provides:

  • HabitatAnalysis: main class for habitat clustering analysis

  • Configuration schemas: HabitatAnalysisConfig, ResultColumns

  • Analyzer classes: HabitatMapAnalyzer (formerly HabitatFeatureExtractor)

habit.core.habitat_analysis.get_import_errors()[source]

Get the dictionary of import errors that occurred during module loading.

Returns:

Dictionary mapping class names to error messages

Return type:

dict

habit.core.habitat_analysis.get_available_classes()[source]

Get the dictionary of successfully imported classes.

Returns:

Dictionary mapping class names to their classes

Return type:

dict

habit.core.habitat_analysis.is_class_available(class_name: str) → bool[source]

Check whether a specific class is available.

Parameters:

class_name (str) -- Name of the class to check

Returns:

True if the class is available, False otherwise

Return type:

bool

Core Analysis

HabitatAnalysis is the main entry-point class for running habitat analysis.

Configuration

These classes define the configuration structure for habitat analysis; understanding them is essential for customizing the analysis workflow.

Configuration schemas for habitat analysis workflows. Uses Pydantic for robust validation and type safety.

class habit.core.habitat_analysis.config_schemas.HabitatAnalysisConfig(*, config_file: str | None = None, config_version: str | None = None, data_dir: str, out_dir: str, run_mode: Literal['train', 'predict'] = 'train', pipeline_path: str | None = None, FeatureConstruction: FeatureConstructionConfig | None = None, HabitatsSegmention: HabitatsSegmentionConfig | None = None, processes: Annotated[int, Gt(gt=0)] = 2, plot_curves: bool = True, save_images: bool = True, save_results_csv: bool = True, random_state: int = 42, verbose: bool = True, debug: bool = False)[source]

Bases: BaseConfig

Root model for the entire habitat analysis configuration.

data_dir: str
out_dir: str
config_file: str | None
run_mode: Literal['train', 'predict']
pipeline_path: str | None
FeatureConstruction: FeatureConstructionConfig | None
HabitatsSegmention: HabitatsSegmentionConfig | None
processes: int
plot_curves: bool
save_images: bool
save_results_csv: bool
random_state: int
verbose: bool
debug: bool
validate_mode_dependent_fields()[source]

Validate that required fields are present based on run_mode.

  • In train mode: FeatureConstruction and HabitatsSegmention are required

  • In predict mode: FeatureConstruction is optional, but HabitatsSegmention.clustering_mode is needed

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
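The rule enforced by validate_mode_dependent_fields() can be sketched on plain dictionaries (illustrative only; check_mode_fields and its error messages are hypothetical, not part of the package):

```python
def check_mode_fields(config: dict) -> list:
    """Return a list of validation errors for mode-dependent fields.

    Sketch of HabitatAnalysisConfig.validate_mode_dependent_fields():
    train mode requires both FeatureConstruction and HabitatsSegmention;
    predict mode only needs HabitatsSegmention.clustering_mode.
    """
    errors = []
    mode = config.get("run_mode", "train")
    if mode == "train":
        for field in ("FeatureConstruction", "HabitatsSegmention"):
            if config.get(field) is None:
                errors.append(f"{field} is required in train mode")
    elif mode == "predict":
        seg = config.get("HabitatsSegmention") or {}
        if "clustering_mode" not in seg:
            errors.append("HabitatsSegmention.clustering_mode is required in predict mode")
    return errors

# Train mode with no sections: both requirements are reported.
train_errors = check_mode_fields({"run_mode": "train"})

# Predict mode only needs the clustering_mode field.
predict_errors = check_mode_fields({
    "run_mode": "predict",
    "HabitatsSegmention": {"clustering_mode": "two_step"},
})
```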

class habit.core.habitat_analysis.config_schemas.VoxelLevelConfig(*, method: str, params: ~typing.Dict[str, ~typing.Any] = <factory>)[source]

Bases: BaseModel

method: str
params: Dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.SupervoxelLevelConfig(*, supervoxel_file_keyword: str = '*_supervoxel.nrrd', method: str = 'mean_voxel_features()', params: ~typing.Dict[str, ~typing.Any] = <factory>)[source]

Bases: BaseModel

supervoxel_file_keyword: str
method: str
params: Dict[str, Any]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.PreprocessingMethod(*, method: Literal['winsorize', 'minmax', 'zscore', 'robust', 'log', 'binning', 'variance_filter', 'correlation_filter'], global_normalize: bool = False, winsor_limits: List[float] | None = None, n_bins: int | None = None, bin_strategy: Literal['uniform', 'quantile', 'kmeans'] | None = None, variance_threshold: float | None = None, corr_threshold: float | None = None, corr_method: Literal['pearson', 'spearman', 'kendall'] | None = None)[source]

Bases: BaseModel

method: Literal['winsorize', 'minmax', 'zscore', 'robust', 'log', 'binning', 'variance_filter', 'correlation_filter']
global_normalize: bool
winsor_limits: List[float] | None
n_bins: int | None
bin_strategy: Literal['uniform', 'quantile', 'kmeans'] | None
variance_threshold: float | None
corr_threshold: float | None
corr_method: Literal['pearson', 'spearman', 'kendall'] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.PreprocessingConfig(*, methods: ~typing.List[~habit.core.habitat_analysis.config_schemas.PreprocessingMethod] = <factory>)[source]

Bases: BaseModel

methods: List[PreprocessingMethod]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
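A PreprocessingConfig is an ordered methods list, and each PreprocessingMethod entry is applied in sequence. A minimal sketch of that chaining, implementing only two of the eight documented methods (apply_method and apply_pipeline are hypothetical helpers, not the package API):

```python
from statistics import mean, stdev

def apply_method(values, spec):
    """Apply one PreprocessingMethod-style step (sketch: only 'minmax'
    and 'zscore' are implemented here)."""
    if spec["method"] == "minmax":
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]
    if spec["method"] == "zscore":
        m, s = mean(values), stdev(values)
        return [(v - m) / s for v in values]
    raise ValueError(f"unsupported method: {spec['method']}")

def apply_pipeline(values, methods):
    """Apply a PreprocessingConfig-style 'methods' list in order."""
    for spec in methods:
        values = apply_method(values, spec)
    return values

# Rescale to [0, 1], then standardize: order matters, as in the config.
out = apply_pipeline([2.0, 4.0, 6.0, 8.0],
                     [{"method": "minmax"}, {"method": "zscore"}])
```

Because the steps run in list order, swapping the two entries produces a different result; the methods list is therefore a pipeline definition, not an unordered set.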

class habit.core.habitat_analysis.config_schemas.FeatureConstructionConfig(*, voxel_level: VoxelLevelConfig, supervoxel_level: SupervoxelLevelConfig | None = None, preprocessing_for_subject_level: PreprocessingConfig | None = None, preprocessing_for_group_level: PreprocessingConfig | None = None)[source]

Bases: BaseModel

voxel_level: VoxelLevelConfig
supervoxel_level: SupervoxelLevelConfig | None
preprocessing_for_subject_level: PreprocessingConfig | None
preprocessing_for_group_level: PreprocessingConfig | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.OneStepSettings(*, min_clusters: int = 2, max_clusters: int = 10, fixed_n_clusters: int | None = None, selection_method: Literal['silhouette', 'calinski_harabasz', 'davies_bouldin', 'inertia', 'kneedle'] = 'silhouette', plot_validation_curves: bool = True)[source]

Bases: BaseModel

Settings for one-step clustering mode (voxel -> habitat directly).

In one-step mode, each subject is clustered independently. You can either:

  1. Specify a fixed number of clusters (fixed_n_clusters)

  2. Let the algorithm automatically select the optimal number of clusters (min/max_clusters + selection_method)

min_clusters: int
max_clusters: int
fixed_n_clusters: int | None
selection_method: Literal['silhouette', 'calinski_harabasz', 'davies_bouldin', 'inertia', 'kneedle']
plot_validation_curves: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
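The selection_method criteria differ in direction: by their standard definitions, silhouette and calinski_harabasz scores are better when higher, while davies_bouldin and inertia are better when lower. A sketch of selecting k from precomputed scores (the package's exact selection logic, including kneedle elbow detection, is not shown; select_n_clusters is hypothetical):

```python
# Direction of each cluster-validity criterion (standard definitions;
# 'kneedle' elbow detection is omitted for brevity).
MAXIMIZE = {"silhouette", "calinski_harabasz"}
MINIMIZE = {"davies_bouldin", "inertia"}

def select_n_clusters(scores: dict, method: str) -> int:
    """Pick the candidate k whose score is best for the given criterion."""
    if method in MAXIMIZE:
        return max(scores, key=scores.get)
    if method in MINIMIZE:
        return min(scores, key=scores.get)
    raise ValueError(f"unsupported selection method: {method}")

scores = {2: 0.41, 3: 0.55, 4: 0.48}   # e.g. silhouette score per k
best_k = select_n_clusters(scores, "silhouette")
```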

class habit.core.habitat_analysis.config_schemas.ConnectedComponentPostprocessConfig(*, enabled: bool = False, min_component_size: Annotated[int, Ge(ge=1)] = 30, connectivity: Literal[1, 2, 3] = 1, reassign_method: Literal['neighbor_vote'] = 'neighbor_vote', max_iterations: Annotated[int, Ge(ge=1)] = 3)[source]

Bases: BaseModel

Connected-component post-processing settings for label-map cleanup.

enabled: bool
min_component_size: int
connectivity: Literal[1, 2, 3]
reassign_method: Literal['neighbor_vote']
max_iterations: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
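A rough sketch of what 'neighbor_vote' cleanup does on a label map: components smaller than min_component_size are found by flood fill and reassigned to the majority label among their neighbors. This 2-D, connectivity=1, single-iteration version is illustrative only, not the package's implementation:

```python
from collections import Counter, deque

def clean_small_components(grid, min_size):
    """Reassign components smaller than min_size to the majority label
    of their neighboring voxels (sketch of 'neighbor_vote' cleanup)."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            # Flood-fill the 4-connected component containing (y, x).
            label, comp, queue = grid[y][x], [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] \
                            and grid[ny][nx] == label:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) < min_size:
                # Vote among labels bordering the component.
                votes = Counter(
                    grid[ny][nx]
                    for cy, cx in comp
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1))
                    if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] != label
                )
                if votes:
                    winner = votes.most_common(1)[0][0]
                    for cy, cx in comp:
                        grid[cy][cx] = winner
    return grid

grid = [[1, 1, 1],
        [1, 2, 1],
        [1, 1, 1]]
clean_small_components(grid, min_size=2)   # the isolated '2' is absorbed
```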

class habit.core.habitat_analysis.config_schemas.SupervoxelClusteringConfig(*, algorithm: ~typing.Literal['kmeans', 'gmm', 'slic'] = 'kmeans', n_clusters: int = 50, random_state: int = 42, max_iter: int = 300, n_init: int = 10, compactness: float = 0.1, sigma: float = 0.0, enforce_connectivity: bool = True, one_step_settings: ~habit.core.habitat_analysis.config_schemas.OneStepSettings = <factory>)[source]

Bases: BaseModel

algorithm: Literal['kmeans', 'gmm', 'slic']
n_clusters: int
random_state: int
max_iter: int
n_init: int
compactness: float
sigma: float
enforce_connectivity: bool
one_step_settings: OneStepSettings
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.HabitatClusteringConfig(*, algorithm: Literal['kmeans', 'gmm'] = 'kmeans', max_clusters: int = 10, min_clusters: int | None = 2, habitat_cluster_selection_method: str | List[str] = 'inertia', fixed_n_clusters: int | None = None, random_state: int = 42, max_iter: int = 300, n_init: int = 10)[source]

Bases: BaseModel

algorithm: Literal['kmeans', 'gmm']
max_clusters: int
min_clusters: int | None
habitat_cluster_selection_method: str | List[str]
fixed_n_clusters: int | None
random_state: int
max_iter: int
n_init: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.HabitatsSegmentionConfig(*, clustering_mode: ~typing.Literal['one_step', 'two_step', 'direct_pooling'] = 'two_step', supervoxel: ~habit.core.habitat_analysis.config_schemas.SupervoxelClusteringConfig = <factory>, habitat: ~habit.core.habitat_analysis.config_schemas.HabitatClusteringConfig = <factory>, postprocess_supervoxel: ~habit.core.habitat_analysis.config_schemas.ConnectedComponentPostprocessConfig = <factory>, postprocess_habitat: ~habit.core.habitat_analysis.config_schemas.ConnectedComponentPostprocessConfig = <factory>)[source]

Bases: BaseModel

clustering_mode: Literal['one_step', 'two_step', 'direct_pooling']
supervoxel: SupervoxelClusteringConfig
habitat: HabitatClusteringConfig
postprocess_supervoxel: ConnectedComponentPostprocessConfig
postprocess_habitat: ConnectedComponentPostprocessConfig
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.ResultColumns[source]

Bases: object

Centralized column name definitions for pipeline outputs.

This avoids magic strings across the codebase and keeps feature/metadata column handling consistent in all pipeline steps and managers.

SUBJECT = 'Subject'
SUPERVOXEL = 'Supervoxel'
COUNT = 'Count'
HABITATS = 'Habitats'
ORIGINAL_SUFFIX = '-original'
classmethod metadata_columns() → List[str][source]

Return the list of metadata column names (non-feature columns).

Returns:

Columns that are metadata and should not be treated as features

Return type:

List[str]

classmethod is_feature_column(col_name: str) → bool[source]

Check whether a column name represents a feature (not metadata).

Parameters:

col_name -- Column name to check

Returns:

True if the column is a feature column

Return type:

bool
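Based on the constants and docstrings above, the class behaves roughly like this standalone sketch (the exact membership of metadata_columns() and the complement-based is_feature_column logic are assumptions consistent with the documentation, not copied from the source):

```python
from typing import List

class ResultColumnsSketch:
    """Standalone sketch of ResultColumns (constants copied from the docs;
    the real class lives in habit.core.habitat_analysis.config_schemas)."""
    SUBJECT = "Subject"
    SUPERVOXEL = "Supervoxel"
    COUNT = "Count"
    HABITATS = "Habitats"
    ORIGINAL_SUFFIX = "-original"

    @classmethod
    def metadata_columns(cls) -> List[str]:
        # Assumed membership: all documented non-feature constants.
        return [cls.SUBJECT, cls.SUPERVOXEL, cls.COUNT, cls.HABITATS]

    @classmethod
    def is_feature_column(cls, col_name: str) -> bool:
        # A column is a feature iff it is not one of the metadata columns.
        return col_name not in cls.metadata_columns()

columns = ["Subject", "Supervoxel", "firstorder_Mean-original"]
features = [c for c in columns if ResultColumnsSketch.is_feature_column(c)]
```

Centralizing the names this way is what lets every pipeline step split feature columns from metadata columns without hard-coded strings.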

class habit.core.habitat_analysis.config_schemas.FeatureExtractionConfig(*, config_file: str | None = None, config_version: str | None = None, params_file_of_non_habitat: str, params_file_of_habitat: str, raw_img_folder: str, habitats_map_folder: str, out_dir: str, n_processes: int = 4, habitat_pattern: str = '*_habitats.nrrd', feature_types: List[str], n_habitats: int | None = None, debug: bool = False)[source]

Bases: BaseConfig

Configuration for habitat feature extraction workflow.

params_file_of_non_habitat: str
params_file_of_habitat: str
raw_img_folder: str
habitats_map_folder: str
out_dir: str
n_processes: int
habitat_pattern: str
feature_types: List[str]
n_habitats: int | None
debug: bool
model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.PathsConfig(*, params_file: str, images_folder: str, out_dir: str)[source]

Bases: BaseModel

Paths configuration for radiomics extraction.

params_file: str
images_folder: str
out_dir: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.ProcessingConfig(*, n_processes: ~typing.Annotated[int, ~annotated_types.Gt(gt=0)] = 2, save_every_n_files: ~typing.Annotated[int, ~annotated_types.Gt(gt=0)] = 5, process_image_types: ~typing.List[str] | None = None, target_labels: ~typing.List[int] = <factory>)[source]

Bases: BaseModel

Processing configuration for radiomics extraction.

n_processes: int
save_every_n_files: int
process_image_types: List[str] | None
target_labels: List[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.ExportConfig(*, export_by_image_type: bool = True, export_combined: bool = True, export_format: Literal['csv', 'json', 'pickle'] = 'csv', add_timestamp: bool = True)[source]

Bases: BaseModel

Export configuration for radiomics extraction.

export_by_image_type: bool
export_combined: bool
export_format: Literal['csv', 'json', 'pickle']
add_timestamp: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.LoggingConfig(*, level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] = 'INFO', console_output: bool = True, file_output: bool = True)[source]

Bases: BaseModel

Logging configuration for radiomics extraction.

level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']
console_output: bool
file_output: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class habit.core.habitat_analysis.config_schemas.RadiomicsConfig(*, config_file: str | None = None, config_version: str | None = None, paths: ~habit.core.habitat_analysis.config_schemas.PathsConfig, processing: ~habit.core.habitat_analysis.config_schemas.ProcessingConfig = <factory>, export: ~habit.core.habitat_analysis.config_schemas.ExportConfig = <factory>, logging: ~habit.core.habitat_analysis.config_schemas.LoggingConfig = <factory>, params_file: str | None = None, images_folder: str | None = None, out_dir: str | None = None, n_processes: int | None = None)[source]

Bases: BaseConfig

Configuration for traditional radiomics feature extraction.

paths: PathsConfig
processing: ProcessingConfig
export: ExportConfig
logging: LoggingConfig
params_file: str | None
images_folder: str | None
out_dir: str | None
n_processes: int | None
model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

Analysis Strategies

Different strategies determine how habitat features are derived from the ROI.

Two-step strategy: voxel -> supervoxel -> habitat clustering. Refactored to use HabitatPipeline with the template method pattern.

class habit.core.habitat_analysis.strategies.two_step_strategy.TwoStepStrategy(analysis: HabitatAnalysis)[source]

Bases: BaseClusteringStrategy

Two-step clustering strategy using HabitatPipeline.

Flow:

  1. Voxel feature extraction (Pipeline Step 1)

  2. Subject-level preprocessing (Pipeline Step 2)

  3. Individual clustering (voxel -> supervoxel) (Pipeline Step 3)

  4. Supervoxel feature extraction (conditional) (Pipeline Step 4)

  5. Supervoxel feature aggregation (Pipeline Step 5)

  6. Combine supervoxels (Pipeline Step 6) - merge all subjects' supervoxels

  7. Group-level preprocessing (Pipeline Step 7)

  8. Population clustering (supervoxel -> habitat) (Pipeline Step 8)

Note: This strategy supports parallel processing through HabitatPipeline. Use config.processes to control the number of parallel workers for individual-level steps (Steps 1-5). Group-level steps (6-8) process all subjects together.

__init__(analysis: HabitatAnalysis)[source]

Initialize the two-step strategy.

Parameters:

analysis -- HabitatAnalysis instance with shared utilities

One-step strategy: voxel -> habitat clustering per subject. Refactored to use HabitatPipeline with the template method pattern.

class habit.core.habitat_analysis.strategies.one_step_strategy.OneStepStrategy(analysis: HabitatAnalysis)[source]

Bases: BaseClusteringStrategy

One-step clustering strategy using HabitatPipeline.

Flow:

  1. Voxel feature extraction (Pipeline Step 1)

  2. Subject-level preprocessing (Pipeline Step 2)

  3. Individual clustering (voxel -> habitat per subject) (Pipeline Step 3)

  4. Supervoxel aggregation (Pipeline Step 4) - calculates means per habitat

  5. Combine supervoxels (Pipeline Step 5) - merge all subjects' results

Note: This strategy supports parallel processing through HabitatPipeline. Use config.processes to control the number of parallel workers.

__init__(analysis: HabitatAnalysis)[source]

Initialize the one-step strategy.

Parameters:

analysis -- HabitatAnalysis instance with shared utilities

Direct pooling strategy: concatenate all voxel features across subjects and cluster once. Refactored to use HabitatPipeline with the template method pattern.

class habit.core.habitat_analysis.strategies.direct_pooling_strategy.DirectPoolingStrategy(analysis: HabitatAnalysis)[source]

Bases: BaseClusteringStrategy

Direct pooling strategy using HabitatPipeline.

## Overview

This strategy pools (concatenates) voxel features from ALL subjects into a single feature matrix before clustering. This enables the discovery of population-level tissue patterns that are representative across the entire cohort.

## Workflow

  1. Voxel feature extraction (Pipeline Step 1) - extract features for each subject

  2. Subject-level preprocessing (Pipeline Step 2) - normalize within each subject

  3. Concatenate all voxels (Pipeline Step 3) - merge all subjects' voxels into one matrix

  4. Group-level preprocessing (Pipeline Step 4) - apply population-level transformations

  5. Population clustering (Pipeline Step 5) - cluster all voxels -> discover habitats

## Why Pool All Voxels?

Rationale: By pooling voxels from all subjects, the clustering algorithm can discover tissue patterns that are consistent and reproducible across the entire population. This approach is particularly effective for:

  • Discovering common biological phenotypes (e.g., "highly perfused tissue" vs "necrotic tissue")

  • Identifying dominant habitat patterns shared by multiple subjects

  • Quickly prototyping and exploring population-level tissue heterogeneity

## About Data Leakage

Important: This strategy is NOT equivalent to label leakage in the traditional machine learning sense. Here's why:

  • Unsupervised Learning: Habitat discovery is an UNSUPERVISED process (no labels involved)

  • Feature Space Only: Pooling occurs in the FEATURE space (imaging intensities), not the label space (clinical outcomes)

  • Pre-modeling Step: Habitat segmentation is performed BEFORE building predictive models

  • Pipeline Isolation: When used in predictive workflows, the clustering model is fitted on training data only and applied to test data via the saved Pipeline

Analogy: It's similar to performing k-means clustering on pooled MRI intensities to discover tissue types—the clustering doesn't "know" which subjects are diseased vs healthy.

## Use Cases

Recommended for:

  • Exploratory analysis to discover dominant tissue patterns

  • Fast prototyping and hypothesis generation

  • Cohorts with moderate inter-subject variability

  • Studies focusing on population-level habitat characterization

Not recommended for:

  • Extremely heterogeneous cohorts where individual differences dominate

  • Small sample sizes (prefer the Two-Step or One-Step strategies)

  • Studies requiring subject-specific habitat definitions

## Parallel Processing

This strategy supports parallel processing through HabitatPipeline:

  • config.processes: controls the number of parallel workers for individual-level steps (Steps 1-2)

  • Group-level steps (3-5): process all subjects together (not parallelized)

__init__(analysis: HabitatAnalysis)[source]

Initialize the direct pooling strategy.

Parameters:

analysis -- HabitatAnalysis instance with shared utilities

Base strategy interface for habitat analysis.

class habit.core.habitat_analysis.strategies.base_strategy.BaseClusteringStrategy(analysis: HabitatAnalysis)[source]

Bases: ABC

Abstract base class for habitat analysis strategies.

Each strategy should implement run() and return a results DataFrame.

__init__(analysis: HabitatAnalysis)[source]

Initialize the strategy with a HabitatAnalysis instance.

Parameters:

analysis -- HabitatAnalysis instance with shared utilities and configuration

run(subjects: List[str] | None = None, save_results_csv: bool | None = None, load_from: str | None = None) → DataFrame[source]

Template method for executing the strategy.

This method defines the algorithm skeleton. Subclasses can override specific steps if needed, but most will only need to implement strategy-specific logic in hooks.

Parameters:
  • subjects -- List of subjects to process (None means all subjects)

  • save_results_csv -- Whether to save results to CSV (defaults to config.save_results_csv)

  • load_from -- Optional path to a saved pipeline. If provided, the pipeline is loaded and only transform() is executed.

Returns:

Results DataFrame
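The template method pattern used by the strategies can be sketched as a small ABC: run() fixes the skeleton, and subclasses supply the hooks. ToyOneStep and the hook names extract/cluster are illustrative only, not the real API, and the real run() returns a pandas DataFrame rather than a list:

```python
from abc import ABC, abstractmethod

class StrategySketch(ABC):
    """Minimal template-method sketch of BaseClusteringStrategy.run():
    the skeleton is fixed; subclasses fill in strategy-specific steps."""

    def run(self, subjects=None):
        subjects = subjects or ["subj_01"]
        rows = []
        for subject in subjects:
            features = self.extract(subject)     # hook: per-subject steps
            rows.append(self.cluster(features))  # hook: clustering step
        return rows

    @abstractmethod
    def extract(self, subject): ...

    @abstractmethod
    def cluster(self, features): ...

class ToyOneStep(StrategySketch):
    def extract(self, subject):
        return {"subject": subject, "n_voxels": 100}

    def cluster(self, features):
        return {**features, "habitat": 1}

result = ToyOneStep().run(["subj_01", "subj_02"])
```

This is why the docs say most subclasses only implement hooks: the ordering, iteration, and result collection live once in the base class.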

Managers

These managers coordinate the concrete analysis steps, such as feature extraction, clustering, and result aggregation.

Feature Manager for Habitat Analysis. Handles all feature extraction and preprocessing logic.

class habit.core.habitat_analysis.managers.feature_manager.FeatureManager(config: HabitatAnalysisConfig, logger: Logger)[source]

Bases: object

Manages feature extraction and preprocessing for habitat analysis.

__init__(config: HabitatAnalysisConfig, logger: Logger)[source]

Initialize the FeatureManager.

Parameters:
  • config -- Habitat analysis configuration

  • logger -- Logger instance

set_data_paths(images_paths: Dict, mask_paths: Dict)[source]

Set image and mask paths.

set_logging_info(log_file_path: str, log_level: int)[source]

Set logging info for subprocesses.

extract_voxel_features(subject: str) → Tuple[str, DataFrame, DataFrame, dict][source]

Extract voxel-level features for a single subject.

Parameters:

subject -- Subject ID to process

Returns:

Tuple of (subject_id, feature_df, raw_df, mask_info)

extract_supervoxel_features(subject: str) → Tuple[str, DataFrame | Exception][source]

Extract supervoxel-level features from supervoxel maps.

Parameters:

subject -- Subject ID to process

Returns:

Tuple of (subject_id, features_df or Exception)

apply_preprocessing(feature_df: DataFrame, level: str) → DataFrame[source]

Apply preprocessing based on level (user-facing interface).

This method provides a simplified interface for applying preprocessing at different levels.

Parameters:
  • feature_df -- DataFrame to preprocess

  • level -- 'subject' for individual level, 'group' for population level

Returns:

Preprocessed DataFrame

Note

Group-level preprocessing is typically handled by Pipeline steps automatically. This method is primarily used for subject-level preprocessing.

calculate_supervoxel_means(subject: str, feature_df: DataFrame, raw_df: DataFrame, supervoxel_labels: ndarray, n_clusters_supervoxel: int) → DataFrame[source]

Calculate supervoxel-level features (aggregated from voxel features).

setup_supervoxel_files(subjects: List[str], failed_subjects: List[str], out_folder: str) → None[source]

Set up the dictionary mapping subjects to supervoxel files.

clean_features(features: DataFrame) → DataFrame[source]

Clean a feature DataFrame: handle types, inf, and NaN values.
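The kind of cleanup clean_features performs can be sketched on a single row of features. This is pure Python for illustration: the real method operates on a pandas DataFrame, and replacing non-finite or non-numeric values with 0.0 is an assumption, not the package's documented policy:

```python
import math

def clean_row(row: dict) -> dict:
    """Coerce values to float; replace inf/NaN/unparseable values with 0.0
    (sketch of the type/inf/NaN handling done by clean_features)."""
    cleaned = {}
    for key, value in row.items():
        try:
            value = float(value)
        except (TypeError, ValueError):
            value = 0.0   # non-numeric entry
        cleaned[key] = value if math.isfinite(value) else 0.0
    return cleaned

row = {"firstorder_Mean": "3.5", "glcm_Contrast": float("inf"), "bad": None}
cleaned = clean_row(row)
```

Cleaning like this matters because downstream clustering algorithms fail on NaN or infinite feature values.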

Analyzers & Extractors

Habitat feature extraction tool (refactored version). This tool provides functionality for extracting features from habitat maps:

  1. Radiomic features of raw images within different habitats

  2. Radiomic features of habitats within the entire ROI

  3. Number of disconnected regions and volume percentage for each habitat

  4. MSI (Mutual Spatial Integrity) features from habitat maps

  5. ITH (Intratumoral Heterogeneity) scores from habitat maps

class habit.core.habitat_analysis.analyzers.habitat_analyzer.HabitatMapAnalyzer(params_file_of_non_habitat=None, params_file_of_habitat=None, raw_img_folder=None, habitats_map_folder=None, out_dir=None, n_processes=None, habitat_pattern=None, voxel_cutoff=10)[source]

Bases: object

Habitat Map Analyzer Class (Refactored)

This class provides functionality for extracting various features from habitat maps:

  1. Radiomic features of raw images within different habitats

  2. Radiomic features of habitats within the entire ROI

  3. Number of disconnected regions and volume percentage for each habitat

  4. MSI (Mutual Spatial Integrity) features from habitat maps

  5. ITH (Intratumoral Heterogeneity) index from habitat maps

__init__(params_file_of_non_habitat=None, params_file_of_habitat=None, raw_img_folder=None, habitats_map_folder=None, out_dir=None, n_processes=None, habitat_pattern=None, voxel_cutoff=10)[source]

Initialize the habitat feature extractor.

Parameters:
  • params_file_of_non_habitat -- Parameter file for extracting radiomic features from raw images

  • params_file_of_habitat -- Parameter file for extracting radiomic features from habitat images

  • raw_img_folder -- Root directory of raw images

  • habitats_map_folder -- Root directory of habitat maps

  • out_dir -- Output directory

  • n_processes -- Number of processes to use

  • habitat_pattern -- Pattern for matching habitat files

  • voxel_cutoff -- Voxel threshold for filtering small regions in MSI feature calculation

get_mask_and_raw_files()[source]

Get paths to all original images and habitat maps.

process_subject(subj, images_paths, habitat_paths, mask_paths=None, feature_types=None)[source]

Process a single subject for habitat feature extraction.

extract_features(images_paths, habitat_paths, mask_paths=None, feature_types=None)[source]

Extract habitat features for all subjects.

run(feature_types: List[str] | None = None, n_habitats: int | None = None)[source]

Run the complete analysis pipeline.

Parameters:
  • feature_types -- Types of features to extract

  • n_habitats -- Number of habitats to process (None for auto-detection)

Voxel-level radiomics feature extractor

class habit.core.habitat_analysis.extractors.voxel_radiomics_extractor.VoxelRadiomicsExtractor(**kwargs)[source]

Bases: BaseClusteringExtractor

Extract voxel-level radiomics features from an image within the mask region using PyRadiomics' voxel-based extraction.

__init__(**kwargs)[source]

Initialize the voxel-level radiomics feature extractor.

Parameters:

**kwargs -- Additional parameters

extract_features(image_data: str | Image, mask_data: str | Image, **kwargs) → DataFrame[source]

Extract voxel-level radiomics features from an image within the mask region.

Parameters:
  • image_data -- Path to image file or SimpleITK image object

  • mask_data -- Path to mask file or SimpleITK mask object

  • **kwargs -- Additional parameters: subj (subject name), img_name (name of the image to append to feature names)

Returns:

Extracted voxel-level radiomics features

Return type:

pd.DataFrame

get_feature_names() → List[str][source]

Get feature names.

Returns:

List of feature names

Return type:

List[str]

Supervoxel-level radiomics feature extractor

class habit.core.habitat_analysis.extractors.supervoxel_radiomics_extractor.SupervoxelRadiomicsExtractor(params_file: str | None = None, **kwargs)[source]

Bases: BaseClusteringExtractor

Extract radiomics features for each supervoxel in the supervoxel map

__init__(params_file: str | None = None, **kwargs)[source]

Initialize the supervoxel radiomics feature extractor.

Parameters:
  • params_file -- Path to a PyRadiomics parameter file, or a YAML string containing parameters

  • **kwargs -- Additional parameters

extract_features(image_data: str | Image, supervoxel_map: str | Image, config_file: str | None = None, **kwargs) → DataFrame[source]

Extract radiomics features for each supervoxel in the supervoxel map.

Parameters:
  • image_data -- Path to image file or SimpleITK image object

  • supervoxel_map -- Path to supervoxel map file or SimpleITK image object

  • config_file -- Path to PyRadiomics parameter file (overrides the one given in the constructor)

  • **kwargs -- Additional parameters

Returns:

DataFrame with radiomics features for each supervoxel

Return type:

pd.DataFrame

get_feature_names() → List[str][source]

Get feature names.

Returns:

List of feature names

Return type:

List[str]