habitat_analysis module
Habitat Analysis module for the HABIT package.
This module provides:
- HabitatAnalysis: main class for habitat clustering analysis
- Configuration schemas: HabitatAnalysisConfig, ResultColumns
- Analyzer classes: HabitatMapAnalyzer (formerly HabitatFeatureExtractor)
- habit.core.habitat_analysis.get_import_errors() [source]
Get dictionary of import errors that occurred during module loading.
- Returns:
Dictionary mapping class names to error messages
- habit.core.habitat_analysis.get_available_classes() [source]
Get dictionary of successfully imported classes.
- Returns:
Dictionary mapping class names to their classes
- habit.core.habitat_analysis.is_class_available(class_name: str) -> bool [source]
Check if a specific class is available.
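The three helpers above follow a common lazy-import guard pattern. Below is a minimal, self-contained sketch of that pattern; the `_AVAILABLE`/`_ERRORS` registries and `_try_import` helper are illustrative, not the package's actual internals — only the public helper signatures mirror the documented API.

```python
# Minimal sketch of the import-guard pattern behind the module's helpers.
# Registry names are hypothetical; only the public function signatures
# (get_import_errors, get_available_classes, is_class_available) mirror
# the documented API.
import importlib
from typing import Any, Dict

_AVAILABLE: Dict[str, Any] = {}
_ERRORS: Dict[str, str] = {}

def _try_import(class_name: str, module_path: str) -> None:
    """Attempt an import; record the class on success, the error on failure."""
    try:
        module = importlib.import_module(module_path)
        _AVAILABLE[class_name] = getattr(module, class_name)
    except Exception as exc:  # broad by design: any failure is recorded
        _ERRORS[class_name] = str(exc)

def get_import_errors() -> Dict[str, str]:
    return dict(_ERRORS)

def get_available_classes() -> Dict[str, Any]:
    return dict(_AVAILABLE)

def is_class_available(class_name: str) -> bool:
    return class_name in _AVAILABLE

_try_import("OrderedDict", "collections")      # succeeds
_try_import("MissingClass", "no_such_module")  # fails; the error is recorded
```

This lets callers degrade gracefully (e.g., skip an analyzer whose optional dependency is missing) instead of failing at import time.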
Core Analysis
HabitatAnalysis is the main entry-point class for performing habitat analysis.
Configuration
These classes define the configuration structure for habitat analysis; understanding them is essential for customizing the analysis workflow.
Configuration schemas for habitat analysis workflows. Uses Pydantic for robust validation and type safety.
- class habit.core.habitat_analysis.config_schemas.HabitatAnalysisConfig(*, config_file: str | None = None, config_version: str | None = None, data_dir: str, out_dir: str, run_mode: Literal['train', 'predict'] = 'train', pipeline_path: str | None = None, FeatureConstruction: FeatureConstructionConfig | None = None, HabitatsSegmention: HabitatsSegmentionConfig | None = None, processes: Annotated[int, Gt(gt=0)] = 2, plot_curves: bool = True, save_images: bool = True, save_results_csv: bool = True, random_state: int = 42, verbose: bool = True, debug: bool = False) [source]
Bases: BaseConfig
Root model for the entire habitat analysis configuration.
- FeatureConstruction: FeatureConstructionConfig | None
- HabitatsSegmention: HabitatsSegmentionConfig | None
- validate_mode_dependent_fields() [source]
Validate that required fields are present based on run_mode.
- In train mode: FeatureConstruction and HabitatsSegmention are required.
- In predict mode: FeatureConstruction is optional, but HabitatsSegmention.clustering_mode is needed.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
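As a sketch of how the schema's fields fit together, here is a hypothetical train-mode configuration expressed as a plain dict. The path values and the voxel-level `method` value are placeholders; in practice the dict would be validated by HabitatAnalysisConfig, which forbids extra keys (`extra: 'forbid'`).

```python
# Hypothetical train-mode configuration mirroring HabitatAnalysisConfig.
# Field names match the schema above; paths and the voxel-level method
# value are placeholders.
config = {
    "data_dir": "/path/to/images",
    "out_dir": "/path/to/output",
    "run_mode": "train",            # 'train' requires the two sections below
    "FeatureConstruction": {
        "voxel_level": {"method": "raw", "params": {}},  # hypothetical method
    },
    "HabitatsSegmention": {         # spelling follows the schema's field name
        "clustering_mode": "two_step",
    },
    "processes": 2,
    "random_state": 42,
}

# Mimic validate_mode_dependent_fields(): in train mode both sections
# must be present.
if config["run_mode"] == "train":
    missing = [k for k in ("FeatureConstruction", "HabitatsSegmention")
               if config.get(k) is None]
    assert not missing, f"train mode requires: {missing}"
```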
- class habit.core.habitat_analysis.config_schemas.VoxelLevelConfig(*, method: str, params: Dict[str, Any] = <factory>) [source]
Bases: BaseModel
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.SupervoxelLevelConfig(*, supervoxel_file_keyword: str = '*_supervoxel.nrrd', method: str = 'mean_voxel_features()', params: Dict[str, Any] = <factory>) [source]
Bases: BaseModel
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.PreprocessingMethod(*, method: Literal['winsorize', 'minmax', 'zscore', 'robust', 'log', 'binning', 'variance_filter', 'correlation_filter'], global_normalize: bool = False, winsor_limits: List[float] | None = None, n_bins: int | None = None, bin_strategy: Literal['uniform', 'quantile', 'kmeans'] | None = None, variance_threshold: float | None = None, corr_threshold: float | None = None, corr_method: Literal['pearson', 'spearman', 'kendall'] | None = None) [source]
Bases: BaseModel
- method: Literal['winsorize', 'minmax', 'zscore', 'robust', 'log', 'binning', 'variance_filter', 'correlation_filter']
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.PreprocessingConfig(*, methods: List[PreprocessingMethod] = <factory>) [source]
Bases: BaseModel
- methods: List[PreprocessingMethod]
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.FeatureConstructionConfig(*, voxel_level: VoxelLevelConfig, supervoxel_level: SupervoxelLevelConfig | None = None, preprocessing_for_subject_level: PreprocessingConfig | None = None, preprocessing_for_group_level: PreprocessingConfig | None = None) [source]
Bases: BaseModel
- voxel_level: VoxelLevelConfig
- supervoxel_level: SupervoxelLevelConfig | None
- preprocessing_for_subject_level: PreprocessingConfig | None
- preprocessing_for_group_level: PreprocessingConfig | None
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.OneStepSettings(*, min_clusters: int = 2, max_clusters: int = 10, fixed_n_clusters: int | None = None, selection_method: Literal['silhouette', 'calinski_harabasz', 'davies_bouldin', 'inertia', 'kneedle'] = 'silhouette', plot_validation_curves: bool = True) [source]
Bases: BaseModel
Settings for one-step clustering mode (voxel -> habitat directly).
In one-step mode, each subject is clustered independently. You can either:
1. Specify a fixed number of clusters (fixed_n_clusters)
2. Let the algorithm automatically select the optimal number of clusters (min_clusters/max_clusters + selection_method)
- selection_method: Literal['silhouette', 'calinski_harabasz', 'davies_bouldin', 'inertia', 'kneedle']
- model_config: ClassVar[ConfigDict] = {}
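Automatic selection over [min_clusters, max_clusters] amounts to scoring each candidate k with a cluster-validity index and picking the best one according to selection_method. A dependency-free toy sketch of that selection step (the scores are made up; in a real run silhouette or Calinski-Harabasz values would come from a library such as scikit-learn, and inertia/kneedle need elbow detection rather than a simple min/max):

```python
# Toy sketch of cluster-count selection from precomputed validity scores.
from typing import Dict

def select_n_clusters(scores: Dict[int, float], method: str) -> int:
    """Pick k from {k: score}. Higher is better for silhouette and
    calinski_harabasz; lower is better for davies_bouldin. inertia and
    kneedle require elbow detection and are omitted from this sketch."""
    if method in ("silhouette", "calinski_harabasz"):
        return max(scores, key=scores.get)
    if method == "davies_bouldin":
        return min(scores, key=scores.get)
    raise ValueError(f"unsupported in this sketch: {method}")

# Hypothetical silhouette scores for candidate k = 2..5
silhouette_scores = {2: 0.41, 3: 0.58, 4: 0.52, 5: 0.37}
best_k = select_n_clusters(silhouette_scores, "silhouette")
```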
- class habit.core.habitat_analysis.config_schemas.ConnectedComponentPostprocessConfig(*, enabled: bool = False, min_component_size: Annotated[int, Ge(ge=1)] = 30, connectivity: Literal[1, 2, 3] = 1, reassign_method: Literal['neighbor_vote'] = 'neighbor_vote', max_iterations: Annotated[int, Ge(ge=1)] = 3) [source]
Bases: BaseModel
Connected-component post-processing settings for label-map cleanup.
- model_config: ClassVar[ConfigDict] = {}
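To make the cleanup concrete, here is a toy 2-D sketch of the neighbor_vote idea: connected components smaller than min_component_size are reassigned to the majority label among their neighbors. This is a single-pass illustration on a list-of-lists grid, not the package's implementation (which operates on 3-D label maps and iterates up to max_iterations).

```python
# Toy 2-D sketch of connected-component cleanup with neighbor-vote
# reassignment. Parameter names mirror ConnectedComponentPostprocessConfig;
# the implementation is illustrative only.
from collections import Counter, deque

def _components(grid, label):
    """Yield connected components (sets of (r, c)) of a given label,
    using 4-connectivity (connectivity=1 in 2D)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != label or (r, c) in seen:
                continue
            comp, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                comp.add((cr, cc))
                for nr, nc in ((cr-1, cc), (cr+1, cc), (cr, cc-1), (cr, cc+1)):
                    if 0 <= nr < rows and 0 <= nc < cols \
                            and grid[nr][nc] == label and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            yield comp

def remove_small_components(grid, min_component_size):
    """Reassign components smaller than min_component_size to the majority
    label among their 4-neighbors (one pass; the real config iterates)."""
    labels = {v for row in grid for v in row}
    for label in labels:
        for comp in list(_components(grid, label)):
            if len(comp) >= min_component_size:
                continue
            votes = Counter()
            for r, c in comp:
                for nr, nc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1)):
                    if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) \
                            and (nr, nc) not in comp:
                        votes[grid[nr][nc]] += 1
            if votes:
                winner = votes.most_common(1)[0][0]
                for r, c in comp:
                    grid[r][c] = winner
    return grid

grid = [
    [1, 1, 1, 1],
    [1, 2, 1, 1],   # the lone 2 is a one-voxel island
    [1, 1, 1, 1],
]
remove_small_components(grid, min_component_size=2)
```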
- class habit.core.habitat_analysis.config_schemas.SupervoxelClusteringConfig(*, algorithm: Literal['kmeans', 'gmm', 'slic'] = 'kmeans', n_clusters: int = 50, random_state: int = 42, max_iter: int = 300, n_init: int = 10, compactness: float = 0.1, sigma: float = 0.0, enforce_connectivity: bool = True, one_step_settings: OneStepSettings = <factory>) [source]
Bases: BaseModel
- one_step_settings: OneStepSettings
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.HabitatClusteringConfig(*, algorithm: Literal['kmeans', 'gmm'] = 'kmeans', max_clusters: int = 10, min_clusters: int | None = 2, habitat_cluster_selection_method: str | List[str] = 'inertia', fixed_n_clusters: int | None = None, random_state: int = 42, max_iter: int = 300, n_init: int = 10) [source]
Bases: BaseModel
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.HabitatsSegmentionConfig(*, clustering_mode: Literal['one_step', 'two_step', 'direct_pooling'] = 'two_step', supervoxel: SupervoxelClusteringConfig = <factory>, habitat: HabitatClusteringConfig = <factory>, postprocess_supervoxel: ConnectedComponentPostprocessConfig = <factory>, postprocess_habitat: ConnectedComponentPostprocessConfig = <factory>) [source]
Bases: BaseModel
- supervoxel: SupervoxelClusteringConfig
- habitat: HabitatClusteringConfig
- postprocess_supervoxel: ConnectedComponentPostprocessConfig
- postprocess_habitat: ConnectedComponentPostprocessConfig
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.ResultColumns [source]
Bases: object
Centralized column name definitions for pipeline outputs.
This avoids magic strings across the codebase and keeps feature/metadata column handling consistent in all pipeline steps and managers.
- SUBJECT = 'Subject'
- SUPERVOXEL = 'Supervoxel'
- COUNT = 'Count'
- HABITATS = 'Habitats'
- ORIGINAL_SUFFIX = '-original'
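Using the shared constants instead of magic strings keeps feature/metadata column handling consistent. A small illustrative sketch — the class is re-declared locally so the snippet is self-contained (in the package you would import ResultColumns from config_schemas), and the rows and the 'firstorder_Mean' feature name are made up:

```python
# Local re-declaration of the documented constants, for a self-contained
# example; in practice import ResultColumns from config_schemas.
class ResultColumns:
    SUBJECT = 'Subject'
    SUPERVOXEL = 'Supervoxel'
    COUNT = 'Count'
    HABITATS = 'Habitats'
    ORIGINAL_SUFFIX = '-original'

# Hypothetical pipeline rows keyed by the shared constants.
rows = [
    {ResultColumns.SUBJECT: "sub-001", ResultColumns.SUPERVOXEL: 1,
     ResultColumns.COUNT: 350, ResultColumns.HABITATS: 2},
    {ResultColumns.SUBJECT: "sub-001", ResultColumns.SUPERVOXEL: 2,
     ResultColumns.COUNT: 120, ResultColumns.HABITATS: 1},
]

# Feature columns can be built from the suffix constant rather than a
# hard-coded string scattered through the codebase:
feature_col = "firstorder_Mean" + ResultColumns.ORIGINAL_SUFFIX

# Metadata lookups use the same constants, e.g. the habitat label of the
# largest supervoxel:
habitat_of_largest = max(rows, key=lambda r: r[ResultColumns.COUNT])[ResultColumns.HABITATS]
```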
- class habit.core.habitat_analysis.config_schemas.FeatureExtractionConfig(*, config_file: str | None = None, config_version: str | None = None, params_file_of_non_habitat: str, params_file_of_habitat: str, raw_img_folder: str, habitats_map_folder: str, out_dir: str, n_processes: int = 4, habitat_pattern: str = '*_habitats.nrrd', feature_types: List[str], n_habitats: int | None = None, debug: bool = False) [source]
Bases: BaseConfig
Configuration for habitat feature extraction workflow.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
- class habit.core.habitat_analysis.config_schemas.PathsConfig(*, params_file: str, images_folder: str, out_dir: str) [source]
Bases: BaseModel
Paths configuration for radiomics extraction.
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.ProcessingConfig(*, n_processes: Annotated[int, Gt(gt=0)] = 2, save_every_n_files: Annotated[int, Gt(gt=0)] = 5, process_image_types: List[str] | None = None, target_labels: List[int] = <factory>) [source]
Bases: BaseModel
Processing configuration for radiomics extraction.
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.ExportConfig(*, export_by_image_type: bool = True, export_combined: bool = True, export_format: Literal['csv', 'json', 'pickle'] = 'csv', add_timestamp: bool = True) [source]
Bases: BaseModel
Export configuration for radiomics extraction.
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.LoggingConfig(*, level: Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'] = 'INFO', console_output: bool = True, file_output: bool = True) [source]
Bases: BaseModel
Logging configuration for radiomics extraction.
- model_config: ClassVar[ConfigDict] = {}
- class habit.core.habitat_analysis.config_schemas.RadiomicsConfig(*, config_file: str | None = None, config_version: str | None = None, paths: PathsConfig, processing: ProcessingConfig = <factory>, export: ExportConfig = <factory>, logging: LoggingConfig = <factory>, params_file: str | None = None, images_folder: str | None = None, out_dir: str | None = None, n_processes: int | None = None) [source]
Bases: BaseConfig
Configuration for traditional radiomics feature extraction.
- paths: PathsConfig
- processing: ProcessingConfig
- export: ExportConfig
- logging: LoggingConfig
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Analysis Strategies
Different strategies determine how habitat features are derived from the ROI.
Two-step strategy: voxel -> supervoxel -> habitat clustering. Refactored to use HabitatPipeline with the template method pattern.
- class habit.core.habitat_analysis.strategies.two_step_strategy.TwoStepStrategy(analysis: HabitatAnalysis) [source]
Two-step clustering strategy using HabitatPipeline.
Flow:
1) Voxel feature extraction (Pipeline Step 1)
2) Subject-level preprocessing (Pipeline Step 2)
3) Individual clustering (voxel -> supervoxel) (Pipeline Step 3)
4) Supervoxel feature extraction (conditional) (Pipeline Step 4)
5) Supervoxel feature aggregation (Pipeline Step 5)
6) Combine supervoxels (Pipeline Step 6) - merge all subjects' supervoxels
7) Group-level preprocessing (Pipeline Step 7)
8) Population clustering (supervoxel -> habitat) (Pipeline Step 8)
Note: This strategy supports parallel processing through HabitatPipeline. Use config.processes to control the number of parallel workers for individual-level steps (Steps 1-5). Group-level steps (6-8) process all subjects together.
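Conceptually, the two stages reduce to: cluster each subject's voxels into supervoxels, summarize each supervoxel by its mean feature vector, then cluster the pooled supervoxel summaries into habitats. A deliberately simplified 1-D sketch of that flow — thresholding stands in for k-means/GMM so the example stays dependency-free, and all data are made up:

```python
# Simplified 1-D two-step sketch: threshold "clustering" stands in for
# k-means/GMM; the data and the 0.5 cut are arbitrary.
from statistics import mean

def cluster_1d(values, cut):
    """Toy clustering: label 0 below the cut, 1 at or above it."""
    return [0 if v < cut else 1 for v in values]

subjects = {                     # hypothetical per-subject voxel intensities
    "sub-001": [0.1, 0.2, 0.8, 0.9],
    "sub-002": [0.15, 0.85, 0.95, 0.05],
}

# Steps 1-3: per-subject clustering (voxel -> supervoxel)
supervoxel_means = []            # steps 4-6: summarize and pool
for subj, voxels in subjects.items():
    labels = cluster_1d(voxels, cut=0.5)
    for lab in set(labels):
        members = [v for v, l in zip(voxels, labels) if l == lab]
        supervoxel_means.append((subj, lab, mean(members)))

# Step 8: population clustering (supervoxel -> habitat) on the pooled means
habitats = cluster_1d([m for _, _, m in supervoxel_means], cut=0.5)
```

The key property illustrated: the final habitat labels are defined at the population level, so supervoxels from different subjects with similar feature profiles land in the same habitat.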
One-step strategy: voxel -> habitat clustering per subject. Refactored to use HabitatPipeline with the template method pattern.
- class habit.core.habitat_analysis.strategies.one_step_strategy.OneStepStrategy(analysis: HabitatAnalysis) [source]
One-step clustering strategy using HabitatPipeline.
Flow:
1) Voxel feature extraction (Pipeline Step 1)
2) Subject-level preprocessing (Pipeline Step 2)
3) Individual clustering (voxel -> habitat per subject) (Pipeline Step 3)
4) Supervoxel aggregation (Pipeline Step 4) - calculates means per habitat
5) Combine supervoxels (Pipeline Step 5) - merge all subjects' results
Note: This strategy supports parallel processing through HabitatPipeline. Use config.processes to control the number of parallel workers.
Direct pooling strategy: concatenate all voxel features across subjects and cluster once. Refactored to use HabitatPipeline with the template method pattern.
- class habit.core.habitat_analysis.strategies.direct_pooling_strategy.DirectPoolingStrategy(analysis: HabitatAnalysis) [source]
Direct pooling strategy using HabitatPipeline.
## Overview
This strategy pools (concatenates) voxel features from ALL subjects into a single feature matrix before clustering. This enables the discovery of population-level tissue patterns that are representative across the entire cohort.
## Workflow
Voxel feature extraction (Pipeline Step 1) - extract features for each subject
Subject-level preprocessing (Pipeline Step 2) - normalize within each subject
Concatenate all voxels (Pipeline Step 3) - merge all subjects' voxels into one matrix
Group-level preprocessing (Pipeline Step 4) - apply population-level transformations
Population clustering (Pipeline Step 5) - cluster all voxels -> discover habitats
## Why Pool All Voxels?
Rationale: By pooling voxels from all subjects, the clustering algorithm can discover tissue patterns that are consistent and reproducible across the entire population. This approach is particularly effective for:
- Discovering common biological phenotypes (e.g., "highly perfused tissue" vs "necrotic tissue")
- Identifying dominant habitat patterns shared by multiple subjects
- Quickly prototyping and exploring population-level tissue heterogeneity
## About Data Leakage
Important: This strategy is NOT equivalent to label leakage in the traditional machine learning sense. Here's why:
Unsupervised Learning: Habitat discovery is an UNSUPERVISED process (no labels involved)
Feature Space Only: Pooling occurs in the FEATURE space (imaging intensities), not the label space (clinical outcomes)
Pre-modeling Step: Habitat segmentation is performed BEFORE building predictive models
Pipeline Isolation: When used in predictive workflows, the clustering model is fitted on training data only and applied to test data via the saved Pipeline
Analogy: It's similar to performing k-means clustering on pooled MRI intensities to discover tissue types—the clustering doesn't "know" which subjects are diseased vs healthy.
## Use Cases
Recommended for:
- Exploratory analysis to discover dominant tissue patterns
- Fast prototyping and hypothesis generation
- Cohorts with moderate inter-subject variability
- Studies focusing on population-level habitat characterization
Not recommended for:
- Extremely heterogeneous cohorts where individual differences dominate
- Small sample sizes (prefer Two-Step or One-Step strategies)
- Studies requiring subject-specific habitat definitions
## Parallel Processing
This strategy supports parallel processing through HabitatPipeline:
- config.processes: controls parallel workers for individual-level steps (Steps 1-2)
- Group-level steps (3-5): process all subjects together (not parallelized)
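The pooling step itself is just a row-wise concatenation of per-subject feature matrices, with a parallel subject index kept so each voxel's habitat label can be mapped back after population clustering. A minimal sketch with made-up data:

```python
# Minimal direct-pooling sketch: merge per-subject voxel feature rows into
# one matrix (Step 3), keeping a parallel subject index so cluster labels
# can be mapped back to subjects after population clustering (Step 5).
per_subject = {                       # hypothetical voxel feature rows
    "sub-001": [[0.1, 1.0], [0.9, 0.2]],
    "sub-002": [[0.2, 0.8], [0.8, 0.1], [0.85, 0.15]],
}

pooled, subject_index = [], []
for subj, rows in per_subject.items():
    pooled.extend(rows)               # one shared feature matrix
    subject_index.extend([subj] * len(rows))

# After clustering `pooled`, labels[i] belongs to subject_index[i].
```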
Base strategy interface for habitat analysis.
- class habit.core.habitat_analysis.strategies.base_strategy.BaseClusteringStrategy(analysis: HabitatAnalysis) [source]
Bases: ABC
Abstract base class for habitat analysis strategies.
Each strategy should implement run() and return a results DataFrame.
- __init__(analysis: HabitatAnalysis) [source]
Initialize the strategy with a HabitatAnalysis instance.
- Parameters:
analysis -- HabitatAnalysis instance with shared utilities and configuration
- run(subjects: List[str] | None = None, save_results_csv: bool | None = None, load_from: str | None = None) -> DataFrame [source]
Template method for executing the strategy.
This method defines the algorithm skeleton. Subclasses can override specific steps if needed, but most will only need to implement strategy-specific logic in hooks.
- Parameters:
subjects -- List of subjects to process (None means all subjects)
save_results_csv -- Whether to save results to CSV (defaults to config.save_results_csv)
load_from -- Optional path to a saved pipeline. If provided, the pipeline is loaded and only transform() is executed.
- Returns:
Results DataFrame
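The template-method layout described for run() can be sketched as a fixed skeleton that calls overridable hooks. The hook names below are illustrative, not the package's actual hook names:

```python
# Illustrative template-method skeleton; hook names are hypothetical.
from abc import ABC, abstractmethod
from typing import List

class BaseStrategySketch(ABC):
    def run(self, subjects: List[str]) -> List[str]:
        """Fixed skeleton: per-subject extraction, then group clustering.
        Subclasses supply only the strategy-specific hooks."""
        trace = []
        for subj in subjects:
            trace.append(self.extract(subj))   # individual-level hook
        trace.append(self.cluster(subjects))   # group-level hook
        return trace

    @abstractmethod
    def extract(self, subject: str) -> str: ...

    @abstractmethod
    def cluster(self, subjects: List[str]) -> str: ...

class ToyTwoStep(BaseStrategySketch):
    def extract(self, subject):
        return f"features:{subject}"

    def cluster(self, subjects):
        return f"habitats:{len(subjects)} subjects"

trace = ToyTwoStep().run(["sub-001", "sub-002"])
```

The design keeps the step ordering in one place (the base class), so a new strategy only implements its hooks.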
Managers
These managers coordinate the concrete analysis steps, such as feature extraction, clustering, and result aggregation.
Feature Manager for Habitat Analysis. Handles all feature extraction and preprocessing logic.
- class habit.core.habitat_analysis.managers.feature_manager.FeatureManager(config: HabitatAnalysisConfig, logger: Logger) [source]
Bases: object
Manages feature extraction and preprocessing for habitat analysis.
- __init__(config: HabitatAnalysisConfig, logger: Logger) [source]
Initialize FeatureManager.
- Parameters:
config -- Habitat analysis configuration
logger -- Logger instance
- extract_voxel_features(subject: str) -> Tuple[str, DataFrame, DataFrame, dict] [source]
Extract voxel-level features for a single subject.
- Parameters:
subject -- Subject ID to process
- Returns:
Tuple of (subject_id, feature_df, raw_df, mask_info)
- extract_supervoxel_features(subject: str) -> Tuple[str, DataFrame | Exception] [source]
Extract supervoxel-level features from supervoxel maps.
- Parameters:
subject -- Subject ID to process
- Returns:
Tuple of (subject_id, features_df or Exception)
- apply_preprocessing(feature_df: DataFrame, level: str) -> DataFrame [source]
Apply preprocessing based on level (user-facing interface).
This method provides a simplified interface for applying preprocessing at different levels.
- Parameters:
feature_df -- DataFrame to preprocess
level -- 'subject' for individual level, 'group' for population level
- Returns:
Preprocessed DataFrame
Note
Group-level preprocessing is typically handled by Pipeline steps automatically. This method is primarily used for subject-level preprocessing.
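The practical difference between the two levels is the scope of the statistics used: subject-level preprocessing normalizes each subject with its own statistics, while group-level preprocessing uses statistics pooled over all subjects. A dependency-free min-max example with made-up intensities:

```python
# Subject-level vs group-level min-max scaling on made-up intensities.
def minmax(values, lo, hi):
    """Scale values into [0, 1] given a chosen min (lo) and max (hi)."""
    return [(v - lo) / (hi - lo) for v in values]

subjects = {"sub-001": [10.0, 20.0], "sub-002": [100.0, 200.0]}

# Subject level: each subject is scaled by its own min/max, so both
# subjects span the full [0, 1] range independently.
subject_scaled = {s: minmax(v, min(v), max(v)) for s, v in subjects.items()}

# Group level: one min/max pooled over everyone, so between-subject
# intensity differences are preserved.
pooled = [v for vals in subjects.values() for v in vals]
group_scaled = {s: minmax(v, min(pooled), max(pooled)) for s, v in subjects.items()}
```

Which level is appropriate depends on whether between-subject intensity differences are signal (keep them: group level) or scanner/protocol nuisance (remove them: subject level).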
- calculate_supervoxel_means(subject: str, feature_df: DataFrame, raw_df: DataFrame, supervoxel_labels: ndarray, n_clusters_supervoxel: int) -> DataFrame [source]
Calculate supervoxel-level features (aggregated from voxel features).
Analyzers & Extractors
Habitat Feature Extraction Tool (refactored version). This tool provides functionality for extracting features from habitat maps:
1. Radiomic features of raw images within different habitats
2. Radiomic features of habitats within the entire ROI
3. Number of disconnected regions and volume percentage for each habitat
4. MSI (Mutual Spatial Integrity) features from habitat maps
5. ITH (Intratumoral Heterogeneity) scores from habitat maps
- class habit.core.habitat_analysis.analyzers.habitat_analyzer.HabitatMapAnalyzer(params_file_of_non_habitat=None, params_file_of_habitat=None, raw_img_folder=None, habitats_map_folder=None, out_dir=None, n_processes=None, habitat_pattern=None, voxel_cutoff=10) [source]
Bases: object
Habitat Map Analyzer class (refactored)
This class provides functionality for extracting various features from habitat maps:
1. Radiomic features of raw images within different habitats
2. Radiomic features of habitats within the entire ROI
3. Number of disconnected regions and volume percentage for each habitat
4. MSI (Mutual Spatial Integrity) features from habitat maps
5. ITH (Intratumoral Heterogeneity) index from habitat maps
- __init__(params_file_of_non_habitat=None, params_file_of_habitat=None, raw_img_folder=None, habitats_map_folder=None, out_dir=None, n_processes=None, habitat_pattern=None, voxel_cutoff=10) [source]
Initialize the habitat feature extractor.
- Parameters:
params_file_of_non_habitat -- Parameter file for extracting radiomic features from raw images
params_file_of_habitat -- Parameter file for extracting radiomic features from habitat images
raw_img_folder -- Root directory of raw images
habitats_map_folder -- Root directory of habitat maps
out_dir -- Output directory
n_processes -- Number of processes to use
habitat_pattern -- Pattern for matching habitat files
voxel_cutoff -- Voxel threshold for filtering small regions in MSI feature calculation
- process_subject(subj, images_paths, habitat_paths, mask_paths=None, feature_types=None) [source]
Process a single subject for habitat feature extraction.
Voxel-level radiomics feature extractor
- class habit.core.habitat_analysis.extractors.voxel_radiomics_extractor.VoxelRadiomicsExtractor(**kwargs) [source]
Bases: BaseClusteringExtractor
Extract voxel-level radiomics features from the image within the mask region using PyRadiomics' voxel-based extraction.
- __init__(**kwargs) [source]
Initialize voxel-level radiomics feature extractor.
- Parameters:
**kwargs -- Additional parameters
- extract_features(image_data: str | Image, mask_data: str | Image, **kwargs) -> DataFrame [source]
Extract voxel-level radiomics features from the image within the mask region.
- Parameters:
image_data -- Path to image file or SimpleITK image object
mask_data -- Path to mask file or SimpleITK mask object
**kwargs -- Additional parameters:
  subj: subject name
  img_name: name of the image to append to feature names
- Returns:
Extracted voxel-level radiomics features
- Return type:
pd.DataFrame
Supervoxel-level radiomics feature extractor
- class habit.core.habitat_analysis.extractors.supervoxel_radiomics_extractor.SupervoxelRadiomicsExtractor(params_file: str | None = None, **kwargs) [source]
Bases: BaseClusteringExtractor
Extract radiomics features for each supervoxel in the supervoxel map.
- __init__(params_file: str | None = None, **kwargs) [source]
Initialize supervoxel radiomics feature extractor.
- Parameters:
params_file -- Path to PyRadiomics parameter file or YAML string containing parameters
**kwargs -- Additional parameters
- extract_features(image_data: str | Image, supervoxel_map: str | Image, config_file: str | None = None, **kwargs) -> DataFrame [source]
Extract radiomics features for each supervoxel in the supervoxel map.
- Parameters:
image_data -- Path to image file or SimpleITK image object
supervoxel_map -- Path to supervoxel map file or SimpleITK image object
config_file -- Path to PyRadiomics parameter file (overrides the one in constructor)
**kwargs -- Additional parameters
- Returns:
DataFrame with radiomics features for each supervoxel
- Return type:
pd.DataFrame