common module
Common components shared across HABIT modules.
Configuration System
HABIT uses a strongly typed, YAML-based configuration system.
Configuration utilities for loading, saving, and resolving configurations. This module combines configuration I/O and path resolution capabilities.
- habit.core.common.config_loader.load_config(config_path: str, resolve_paths: bool = True) → Dict[str, Any]
Load a configuration file and optionally resolve relative paths.
- Parameters:
config_path -- Path to the configuration file
resolve_paths -- Whether to resolve relative paths (default: True)
- Returns:
Configuration dictionary
- Return type:
Dict[str, Any]
- Raises:
FileNotFoundError -- If the configuration file is not found
ValueError -- If the file format is not supported
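- habit.core.common.config_loader.save_config(config: Dict[str, Any], config_path: str) → None
Save a configuration dictionary to a file.
- Parameters:
config -- Configuration dictionary to save
config_path -- Path of the file to write
- Raises:
ValueError -- If the file format is not supported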
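- habit.core.common.config_loader.validate_config(config: Dict[str, Any], required_keys: List[str] | None = None) → bool
Validate that a configuration contains the required keys.
- Parameters:
config -- Configuration dictionary to validate
required_keys -- Keys that must be present (default: None)
- Returns:
Whether the configuration is valid
- Return type:
bool
- Raises:
ValueError -- If required keys are missing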
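The required-key check described above can be illustrated with a minimal sketch. The function name validate_required_keys is hypothetical, and the sketch assumes only top-level keys are checked; the actual validate_config may behave differently.

```python
from typing import Any, Dict, List, Optional


def validate_required_keys(config: Dict[str, Any],
                           required_keys: Optional[List[str]] = None) -> bool:
    """Hypothetical stand-in for validate_config: checks top-level keys only."""
    missing = [key for key in (required_keys or []) if key not in config]
    if missing:
        # Report every missing key at once so they can be fixed together.
        raise ValueError(f"Missing required configuration keys: {missing}")
    return True


config = {"data_dir": "./data", "output_dir": "./results"}
print(validate_required_keys(config, ["data_dir", "output_dir"]))  # True
```

Collecting all missing keys before raising is a design choice of this sketch, not something the documentation above specifies.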
- class habit.core.common.config_loader.PathResolver(config_path: str | Path | None = None, base_dir: str | Path | None = None, extra_suffixes: List[str] | None = None, extra_exact: List[str] | None = None, custom_patterns: Dict[str, List[str]] | None = None)
Bases: object
A flexible path resolver for configuration files.
Resolves relative paths in configuration dictionaries to absolute paths, using the configuration file's directory as the base.
- base_dir
Base directory for resolving relative paths
- Type:
Path
- patterns
Patterns for identifying path fields
- Type:
Dict
Example
>>> resolver = PathResolver('/path/to/config.yaml')
>>> resolved_config = resolver.resolve(config_dict)
>>> print(f"Resolved {resolver.resolved_count} paths")
- __init__(config_path: str | Path | None = None, base_dir: str | Path | None = None, extra_suffixes: List[str] | None = None, extra_exact: List[str] | None = None, custom_patterns: Dict[str, List[str]] | None = None)
Initialize the PathResolver.
- Parameters:
config_path -- Path to the configuration file (used to determine base_dir)
base_dir -- Explicit base directory for resolving paths (overrides config_path)
extra_suffixes -- Additional suffix patterns to match (e.g., ['_location'])
extra_exact -- Additional exact match patterns (e.g., ['my_path_field'])
custom_patterns -- Complete custom patterns dict to replace defaults
Note
Either config_path or base_dir must be provided.
- is_path_field(key: str) → bool
Check if a field name represents a path (key-based detection).
- Parameters:
key -- The field name to check
- Returns:
True if the field is likely a path field
- is_path_value(value: str) → bool
Check if a string value looks like a path (value-based detection).
Detection strategies:
1. Starts with a relative path prefix: ./, ../, or ..
2. Ends with a common file extension: .yaml, .nii.gz, .csv, etc.
3. Matches a path-like pattern: contains path separators in a meaningful way
- Parameters:
value -- The string value to check
- Returns:
True if the value looks like a path
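The three detection strategies can be sketched as a small heuristic. The function name looks_like_path and the prefix and extension lists are assumptions made for illustration; the lists PathResolver actually uses are not shown in this reference.

```python
# Hypothetical lists; the real resolver's prefixes and extensions may differ.
RELATIVE_PREFIXES = ("./", "../", ".\\", "..\\")
KNOWN_EXTENSIONS = (".yaml", ".yml", ".json", ".csv", ".nii", ".nii.gz")


def looks_like_path(value: str) -> bool:
    """Return True if a string value plausibly names a file or directory."""
    if not value or "://" in value:
        # Empty strings and URLs are never treated as local paths.
        return False
    # Strategy 1: relative path prefix.
    if value in (".", "..") or value.startswith(RELATIVE_PREFIXES):
        return True
    # Strategy 2: common file extension.
    if value.lower().endswith(KNOWN_EXTENSIONS):
        return True
    # Strategy 3: a separator between two non-empty parts, and no spaces.
    parts = [p for p in value.replace("\\", "/").split("/") if p]
    return len(parts) >= 2 and " " not in value


print(looks_like_path("../data/images"))  # True
print(looks_like_path("mask.nii.gz"))     # True
print(looks_like_path("kmeans"))          # False
```

Value-based heuristics like this can produce false positives (for example, a string such as "1/2"), which is presumably why the resolver combines them with key-based detection.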
- should_resolve(key: str, value: Any) → bool
Determine if a key-value pair should have its path resolved.
Combines key-based and value-based detection strategies.
- Parameters:
key -- The field name
value -- The field value
- Returns:
True if this field should be resolved as a path
- resolve_path(path_value: str) → str
Resolve a single path value.
- Parameters:
path_value -- The path string to resolve
- Returns:
Absolute path if the input was relative and exists; otherwise the original path
- resolve(config: Dict[str, Any], _path_prefix: str = '') → Dict[str, Any]
Resolve all path fields in a configuration dictionary.
- Parameters:
config -- Configuration dictionary to process
_path_prefix -- Internal use for tracking nested paths
- Returns:
New dictionary with resolved paths (original dict is not modified)
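The behaviour documented for resolve and resolve_path (recurse through nested sections, rewrite only relative paths that exist, leave the input untouched) can be sketched as follows. This is not HABIT's implementation: the function name resolve_paths and the key suffixes in PATH_SUFFIXES are hypothetical, and only key-based detection is shown.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict

PATH_SUFFIXES = ("_path", "_dir", "_file")  # hypothetical key patterns


def resolve_paths(config: Dict[str, Any], base_dir: Path) -> Dict[str, Any]:
    """Return a new dict with existing relative paths made absolute."""
    resolved: Dict[str, Any] = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Recurse into nested sections; each call builds a fresh dict.
            resolved[key] = resolve_paths(value, base_dir)
        elif isinstance(value, str) and key.endswith(PATH_SUFFIXES):
            candidate = base_dir / value
            is_relative = not Path(value).is_absolute()
            # Only rewrite relative paths that exist under base_dir.
            resolved[key] = (
                str(candidate.resolve())
                if is_relative and candidate.exists()
                else value
            )
        else:
            # Non-path values (numbers, lists, ...) are passed through as-is.
            resolved[key] = value
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "images").mkdir()
    cfg = {"data_dir": "images", "model": {"weights_path": "missing.pt"}}
    out = resolve_paths(cfg, base)
    print(Path(out["data_dir"]).is_absolute())  # True
    print(out["model"]["weights_path"])         # missing.pt
    print(cfg["data_dir"])                      # images
```

Building a new dictionary at every level, rather than editing in place, is what keeps the caller's original configuration unchanged.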
- habit.core.common.config_loader.resolve_config_paths(config: Dict[str, Any], config_path: str | Path, extra_patterns: List[str] | None = None, verbose: bool = False) → Dict[str, Any]
Convenience function to resolve paths in a configuration dictionary.
This is the recommended way to use path resolution in most cases.
- Parameters:
config -- Configuration dictionary to process
config_path -- Path to the configuration file (for determining base directory)
extra_patterns -- Additional patterns for path field detection
verbose -- If True, print information about resolved paths
- Returns:
New configuration dictionary with resolved paths
Example
>>> config = load_config('demo_data/config.yaml')
>>> config = resolve_config_paths(config, 'demo_data/config.yaml')
- habit.core.common.config_loader.load_config_with_paths(config_path: str | Path, extra_patterns: List[str] | None = None, resolve_paths: bool = True) → Dict[str, Any]
Load a configuration file and optionally resolve relative paths.
This is a convenience function that combines load_config and path resolution.
- Parameters:
config_path -- Path to the configuration file
extra_patterns -- Additional patterns for path field detection
resolve_paths -- Whether to resolve relative paths (default: True)
- Returns:
Configuration dictionary with resolved paths
Example
>>> config = load_config_with_paths('demo_data/config.yaml')
Configuration validation middleware and utilities.
Provides unified configuration validation and loading across all HABIT modules.
- class habit.core.common.config_validator.ConfigValidator
Bases: object
Unified configuration validator and loader.
Provides a single entry point for loading and validating configurations across all HABIT modules.
- static validate_and_load(config_path: str | Path, config_class: Type[ConfigType], resolve_paths: bool = True, strict: bool = True) → ConfigType
Load and validate configuration from file.
This is the recommended way to load configurations in HABIT. It provides:
- Automatic path resolution
- Unified error handling
- Type-safe configuration objects
- Parameters:
config_path -- Path to configuration file
config_class -- Configuration class (must inherit from BaseConfig)
resolve_paths -- Whether to resolve relative paths (default: True)
strict -- Whether to raise exceptions on validation errors (default: True)
- Returns:
Validated configuration instance
- Raises:
FileNotFoundError -- If the configuration file is not found
ConfigValidationError -- If validation fails and strict=True
Example
>>> from habit.core.habitat_analysis.config_schemas import HabitatAnalysisConfig
>>> config = ConfigValidator.validate_and_load(
...     'config.yaml',
...     HabitatAnalysisConfig
... )
- static validate_dict(config_dict: Dict[str, Any], config_class: Type[ConfigType], config_path: str | None = None, strict: bool = True) → ConfigType
Validate configuration dictionary.
- Parameters:
config_dict -- Configuration dictionary
config_class -- Configuration class
config_path -- Optional path for error reporting
strict -- Whether to raise exceptions on validation errors
- Returns:
Validated configuration instance
- Raises:
ConfigValidationError -- If validation fails and strict=True
- static safe_validate(config_dict: Dict[str, Any], config_class: Type[ConfigType], default: ConfigType | None = None) → ConfigType | None
Safely validate configuration (on failure, returns the default, which is None unless specified, instead of raising).
Useful for optional configurations or when you want to handle validation errors gracefully.
- Parameters:
config_dict -- Configuration dictionary
config_class -- Configuration class
default -- Default value to return on validation failure
- Returns:
Validated configuration instance or default
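The "return a default instead of raising" pattern behind safe_validate can be sketched with a plain dataclass standing in for a BaseConfig subclass. ClusteringConfig and its fields are hypothetical, and the sketch catches TypeError and ValueError; in HABIT the failure would presumably surface as ConfigValidationError instead.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass
class ClusteringConfig:
    """Hypothetical schema standing in for a BaseConfig subclass."""
    n_clusters: int
    method: str = "kmeans"

    def __post_init__(self) -> None:
        if not isinstance(self.n_clusters, int) or self.n_clusters < 1:
            raise ValueError("n_clusters must be a positive integer")


def safe_validate(config_dict: Dict[str, Any], config_class: Type[T],
                  default: Optional[T] = None) -> Optional[T]:
    """Build config_class from a dict, returning default if validation fails."""
    try:
        return config_class(**config_dict)
    except (TypeError, ValueError):
        # TypeError: unknown or missing fields; ValueError: bad field values.
        return default


print(safe_validate({"n_clusters": 4}, ClusteringConfig))
# ClusteringConfig(n_clusters=4, method='kmeans')
print(safe_validate({"n_clusters": 0}, ClusteringConfig))  # None
print(safe_validate({"clusters": 4}, ClusteringConfig))    # None
```

The unknown-key case mirrors the extra = 'forbid' setting documented for BaseConfig: a dataclass constructor also rejects fields it does not declare.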
- habit.core.common.config_validator.load_and_validate_config(config_path: str | Path, config_class: Type[ConfigType], resolve_paths: bool = True) → ConfigType
Convenience function for loading and validating configurations.
This is a shorthand for ConfigValidator.validate_and_load().
- Parameters:
config_path -- Path to configuration file
config_class -- Configuration class
resolve_paths -- Whether to resolve relative paths
- Returns:
Validated configuration instance
Example
>>> from habit.core.habitat_analysis.config_schemas import HabitatAnalysisConfig
>>> config = load_and_validate_config('config.yaml', HabitatAnalysisConfig)
Base configuration classes for unified configuration management.
This module provides:
1. BaseConfig: Abstract base class for all configuration schemas
2. ConfigValidationError: Custom exception for configuration validation errors
3. ConfigAccessor: Unified interface for accessing configuration values
- exception habit.core.common.config_base.ConfigValidationError(message: str, errors: Dict[str, Any] | None = None, config_path: str | None = None)
Bases: Exception
Custom exception for configuration validation errors.
Provides detailed information about validation failures.
- class habit.core.common.config_base.BaseConfig(*, config_file: str | None = None, config_version: str | None = None)
Bases: BaseModel, ABC
Abstract base class for all configuration schemas in HABIT.
Provides common functionality:
- Version tracking
- Configuration file path tracking
- Validation hooks
- Accessor methods
All configuration classes should inherit from this base class.
- class Config
Bases: object
Pydantic configuration.
- extra = 'forbid'
- validate_assignment = True
- use_enum_values = True
- __init__(**data: Any)
Initialize configuration with validation.
- Parameters:
**data -- Configuration data
- Raises:
ConfigValidationError -- If validation fails
- classmethod from_dict(config_dict: Dict[str, Any], config_path: str | None = None) → ConfigType
Create configuration instance from dictionary.
- Parameters:
config_dict -- Configuration dictionary
config_path -- Optional path to configuration file (for error reporting)
- Returns:
Configuration instance
- Raises:
ConfigValidationError -- If validation fails
- classmethod from_file(config_path: str | Path) → ConfigType
Load configuration from file.
- Parameters:
config_path -- Path to configuration file (YAML or JSON)
- Returns:
Configuration instance
- Raises:
FileNotFoundError -- If the configuration file is not found
ConfigValidationError -- If validation fails
- to_dict(exclude_none: bool = False, exclude_unset: bool = False) → Dict[str, Any]
Convert configuration to dictionary.
- Parameters:
exclude_none -- Whether to exclude None values
exclude_unset -- Whether to exclude unset values
- Returns:
Configuration dictionary
- get(key: str, default: Any | None = None) → Any
Get configuration value by key (dictionary-like access).
This method provides backward compatibility with dictionary access patterns. However, direct attribute access (config.field_name) is preferred.
- Parameters:
key -- Configuration key (supports dot notation for nested keys)
default -- Default value if key not found
- Returns:
Configuration value or default
- validate() → bool
Validate configuration (re-validate after modifications).
- Returns:
True if valid
- Raises:
ConfigValidationError -- If validation fails
- __getitem__(key: str) → Any
Dictionary-like access for backward compatibility.
Prefer direct attribute access: config.field_name
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class habit.core.common.config_base.ConfigAccessor(config: BaseConfig | Dict[str, Any])
Bases: object
Unified interface for accessing configuration values.
Provides a consistent API for accessing configuration regardless of whether it's a Pydantic model or a dictionary.
This class helps transition from dictionary-based config access to strongly-typed Pydantic model access.
- __init__(config: BaseConfig | Dict[str, Any])
Initialize config accessor.
- Parameters:
config -- Configuration object (BaseConfig instance or dict)
- get(key: str, default: Any | None = None) → Any
Get configuration value by key.
Supports:
- Direct attribute access for Pydantic models: config.field_name
- Dot notation for nested access: config.section.subsection.field
- Dictionary access for backward compatibility
- Parameters:
key -- Configuration key (supports dot notation)
default -- Default value if key not found
- Returns:
Configuration value or default
- has(key: str) → bool
Check if configuration contains a key.
- Parameters:
key -- Configuration key (supports dot notation)
- Returns:
True if key exists
- get_section(section_name: str) → BaseConfig | Dict[str, Any] | None
Get a configuration section.
- Parameters:
section_name -- Section name (supports dot notation)
- Returns:
Configuration section or None
- property raw_config: BaseConfig | Dict[str, Any]
Get raw configuration object.
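The dot-notation lookup that ConfigAccessor offers over both model objects and dictionaries can be sketched as a single walk through the key parts. The helper name get_by_dots is hypothetical, and SimpleNamespace stands in for a Pydantic model so the sketch needs only the standard library.

```python
from types import SimpleNamespace
from typing import Any

_MISSING = object()  # sentinel so that stored None values are still found


def get_by_dots(config: Any, key: str, default: Any = None) -> Any:
    """Look up 'a.b.c' in nested dicts and/or attribute-style objects."""
    current = config
    for part in key.split("."):
        if isinstance(current, dict):
            current = current.get(part, _MISSING)
        else:
            current = getattr(current, part, _MISSING)
        if current is _MISSING:
            return default
    return current


config = SimpleNamespace(
    clustering=SimpleNamespace(n_clusters=5),
    io={"output_dir": "./results"},
)
print(get_by_dots(config, "clustering.n_clusters"))   # 5
print(get_by_dots(config, "io.output_dir"))           # ./results
print(get_by_dots(config, "io.missing", "fallback"))  # fallback
```

Choosing dict lookup or attribute lookup at each level is what lets one accessor serve code that is partway through a migration from dictionaries to typed models.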
Data Utilities
DataFrame Utilities
Common utility functions for DataFrame operations. Eliminates code duplication across the codebase.
- habit.core.common.dataframe_utils.remove_nan_arrays(*arrays: ndarray) → List[ndarray]
Remove NaN values from multiple arrays simultaneously.
- Parameters:
*arrays -- Variable number of numpy arrays
- Returns:
List of arrays with NaN rows removed
Example
>>> y_true = np.array([0, 1, np.nan, 1])
>>> y_pred = np.array([0.2, 0.8, 0.5, np.nan])
>>> clean_true, clean_pred = remove_nan_arrays(y_true, y_pred)
>>> len(clean_true)
2
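Removing NaN values "simultaneously" means dropping a position from every array whenever any one of them is NaN there, so the arrays stay aligned. A minimal NumPy sketch of that idea follows; the function name drop_nan_positions is hypothetical, and it assumes 1-D arrays of equal length.

```python
from typing import List

import numpy as np


def drop_nan_positions(*arrays: np.ndarray) -> List[np.ndarray]:
    """Keep only positions where every 1-D input array is non-NaN."""
    if not arrays:
        return []
    converted = [np.asarray(a, dtype=float) for a in arrays]
    # Start with all positions valid, then AND in each array's non-NaN mask.
    valid = np.ones(converted[0].shape[0], dtype=bool)
    for a in converted:
        valid &= ~np.isnan(a)
    return [a[valid] for a in converted]


y_true = np.array([0, 1, np.nan, 1])
y_pred = np.array([0.2, 0.8, 0.5, np.nan])
clean_true, clean_pred = drop_nan_positions(y_true, y_pred)
print(clean_true.tolist(), clean_pred.tolist())  # [0.0, 1.0] [0.2, 0.8]
```

Positions 2 and 3 are both dropped because each holds a NaN in one of the two arrays, which matches the length of 2 in the example above.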
- habit.core.common.dataframe_utils.create_prediction_dataframe(y_true: ndarray, y_pred_proba: ndarray, y_pred: ndarray | None = None) → DataFrame
Create a DataFrame for prediction data.
- Parameters:
y_true -- True labels array
y_pred_proba -- Predicted probabilities array
y_pred -- Optional predicted labels array
- Returns:
DataFrame with columns y_true, y_pred_proba, and (if provided) y_pred
- Return type:
DataFrame
Example
>>> df = create_prediction_dataframe(
...     y_true=np.array([0, 1, 0]),
...     y_pred_proba=np.array([0.2, 0.8, 0.3]),
...     y_pred=np.array([0, 1, 0])
... )
>>> df.columns.tolist()
['y_true', 'y_pred_proba', 'y_pred']
- habit.core.common.dataframe_utils.clean_prediction_data(y_true: ndarray | List, y_pred_proba: ndarray | List, y_pred: ndarray | List | None = None) → Tuple[ndarray, ndarray, ndarray | None]
Clean prediction data by removing NaN values.
- Parameters:
y_true -- True labels
y_pred_proba -- Predicted probabilities
y_pred -- Optional predicted labels
- Returns:
Tuple of (y_true_clean, y_pred_proba_clean, y_pred_clean)
Example
>>> y_true = np.array([0, 1, np.nan, 1])
>>> y_pred_proba = np.array([0.2, 0.8, 0.5, np.nan])
>>> clean_true, clean_prob, clean_pred = clean_prediction_data(y_true, y_pred_proba)
>>> len(clean_true)
2
- habit.core.common.dataframe_utils.ensure_dataframe(data: DataFrame | ndarray, columns: List[str] | None = None) → DataFrame
Ensure input is a DataFrame, converting from numpy if necessary.
- Parameters:
data -- Input data (DataFrame or numpy array)
columns -- Optional column names for numpy arrays
- Returns:
DataFrame representation of the data
Example
>>> arr = np.array([[1, 2], [3, 4]])
>>> df = ensure_dataframe(arr, columns=['a', 'b'])
>>> isinstance(df, pd.DataFrame)
True
- habit.core.common.dataframe_utils.validate_binary_labels(y: ndarray) → None
Validate that labels are binary (0 or 1).
- Parameters:
y -- Label array to validate
- Raises:
ValueError -- If labels are not binary
Example
>>> validate_binary_labels(np.array([0, 1, 0, 1]))
>>> validate_binary_labels(np.array([0, 1, 2]))
ValueError: Labels must be binary (0 or 1)
- habit.core.common.dataframe_utils.validate_probabilities(y_pred_proba: ndarray) → None
Validate that predicted probabilities are in valid range [0, 1].
- Parameters:
y_pred_proba -- Predicted probabilities array
- Raises:
ValueError -- If probabilities are outside [0, 1] range
Example
>>> validate_probabilities(np.array([0.2, 0.8, 0.5]))
>>> validate_probabilities(np.array([0.2, 1.5, 0.5]))
ValueError: Probabilities must be in range [0, 1]
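Both validators above follow the same shape: compute an element-wise mask and raise ValueError if any element fails. The sketch below uses hypothetical function names, and its treatment of NaN (rejected by both checks) is an assumption rather than documented HABIT behaviour.

```python
import numpy as np


def check_binary_labels(y: np.ndarray) -> None:
    """Raise ValueError unless every label is 0 or 1."""
    if not np.isin(np.asarray(y), [0, 1]).all():
        raise ValueError("Labels must be binary (0 or 1)")


def check_probabilities(p: np.ndarray) -> None:
    """Raise ValueError unless every value lies in [0, 1]."""
    p = np.asarray(p, dtype=float)
    # NaN fails both comparisons, so it is rejected as well.
    if not ((p >= 0) & (p <= 1)).all():
        raise ValueError("Probabilities must be in range [0, 1]")


check_binary_labels(np.array([0, 1, 0, 1]))    # passes silently
check_probabilities(np.array([0.2, 0.8, 0.5]))  # passes silently
try:
    check_probabilities(np.array([0.2, 1.5, 0.5]))
except ValueError as exc:
    print(exc)  # Probabilities must be in range [0, 1]
```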
- habit.core.common.dataframe_utils.normalize_probabilities(y_pred_proba: ndarray) → ndarray
Normalize probabilities to [0, 1] range using min-max scaling.
- Parameters:
y_pred_proba -- Predicted probabilities array
- Returns:
Normalized probabilities in [0, 1] range
Example
>>> probs = np.array([0.1, 0.2, 0.3])
>>> norm_probs = normalize_probabilities(probs)
>>> np.all((norm_probs >= 0) & (norm_probs <= 1))
True
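Min-max scaling maps each value x to (x - min) / (max - min), so the smallest input becomes 0 and the largest becomes 1. The sketch below uses a hypothetical function name, and its handling of a constant array (returning zeros to avoid dividing by zero) is an assumption; normalize_probabilities may treat that case differently.

```python
import numpy as np


def min_max_normalize(scores: np.ndarray) -> np.ndarray:
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        # Assumption: a constant input has no spread, so return all zeros
        # rather than dividing by zero.
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)


print(min_max_normalize(np.array([2.0, 4.0, 6.0])).tolist())  # [0.0, 0.5, 1.0]
```

Note that min-max scaling changes values that were already valid probabilities: an input of 0.1, 0.2, 0.3 is stretched to span the full [0, 1] range, so the result is a rescaled score rather than a calibrated probability.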