Core Module
- class mmpp.core.ScanResult(path, attributes, error=None)[source]
Bases: object
Data class for storing scan results from a single zarr folder.
- class mmpp.core.ZarrJobResult(path, attributes)[source]
Bases: object
Enhanced zarr job result with integrated Pyzfn functionality.
- __init__(path, attributes)[source]
Initialize ZarrJobResult with path and attributes.
Parameters:
- path : str
Path to the zarr folder
- attributes : Dict[str, Any]
Metadata attributes
- property z: Group
Get the zarr group (lazy loaded).
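The source only says the group is "lazy loaded" and does not show how. A minimal sketch of one common way to do this, assuming `functools.cached_property`; the class `LazyResult`, its counter, and the dictionary standing in for an opened group are all hypothetical:

```python
from functools import cached_property


class LazyResult:
    """Toy result object that opens its store only on first access."""

    def __init__(self, path):
        self.path = path
        self.open_count = 0  # counts how often the store was opened

    @cached_property
    def z(self):
        # Hypothetical stand-in for opening the zarr group at self.path.
        self.open_count += 1
        return {"opened": self.path}


result = LazyResult("sim.zarr")
print(result.open_count)  # nothing opened yet
first = result.z
second = result.z
print(result.open_count)  # opened once, then served from the cache
```

Constructing many result objects stays cheap this way, because the store is touched only when `z` is actually read.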
- property script: Syntax | None
Check if there’s a .mx3* file in the parent directory with the same name as the zarr simulation. If found, return syntax-highlighted content using rich.
- Returns:
Syntax-highlighted script or None if no file found
- Return type:
Optional[Syntax]
- property pp
Pretty print the zarr tree.
- rm(dset)[source]
Remove a group or dataset.
Parameters:
- dset : str
Name of dataset or group to remove
- mkdir(name)[source]
Create nested directories.
Parameters:
- name : str
Directory path to create
- get_raw(dset, slices=slice(None, None, None))[source]
Get raw zarr dataset or data using direct indexing. Handles datasets with special characters (like minus) in names.
Parameters:
- dset : str
Dataset name (can contain special characters)
- slices : ArraySlice, optional
Array slicing specification (default: all data)
Returns:
Union[zarr.Array, np.ndarray]
Raw zarr dataset or numpy array if sliced
Example:
    # For dataset names with special characters like "m_z5-8"
    data = result.get_raw("m_z5-8")[:]
    # or with slicing
    data = result.get_raw("m_z5-8", slice(0, 100))
- get_raw_data(dset, slices=slice(None, None, None))[source]
Get raw data as numpy array from dataset with special characters.
Parameters:
- dset : str
Dataset name (can contain special characters)
- slices : ArraySlice, optional
Array slicing specification (default: all data)
Returns:
np.ndarray
Numpy array with original dtype
- get_raw_f32(dset, slices=slice(None, None, None))[source]
Get raw data as float32 array from dataset with special characters.
Parameters:
- dset : str
Dataset name (can contain special characters)
- slices : ArraySlice, optional
Array slicing specification (default: all data)
Returns:
npf32
Float32 numpy array
- get_raw_c64(dset, slices=slice(None, None, None))[source]
Get raw data as complex64 array from dataset with special characters.
Parameters:
- dset : str
Dataset name (can contain special characters)
- slices : ArraySlice, optional
Array slicing specification (default: all data)
Returns:
npc64
Complex64 numpy array
- list_datasets()[source]
List all available datasets in the zarr group. Useful for finding datasets with special characters.
Returns:
List[str]
List of dataset names
- find_datasets(pattern)[source]
Find datasets matching a pattern (supports wildcards).
Parameters:
- pattern : str
Pattern to match (supports * and ? wildcards)
Returns:
List[str]
List of matching dataset names
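The `*` and `?` wildcards described here resemble shell-style globbing. A minimal sketch, assuming the matching behaves like Python's standard `fnmatch` module (the source does not say which matcher `find_datasets` uses); the helper `match_datasets` and the dataset names are made up for illustration:

```python
from fnmatch import fnmatchcase


def match_datasets(names, pattern):
    """Return the names that match a shell-style pattern (* and ?)."""
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical dataset names, for illustration only.
names = ["m", "m_z5-8", "m_z11-12", "B_ext", "table"]

print(match_datasets(names, "m_z*"))  # names starting with "m_z"
print(match_datasets(names, "?"))     # single-character names
```

With this kind of matching, a minus sign in a name such as `m_z5-8` needs no escaping, which suits the "special characters" use case mentioned for `list_datasets`.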
- get_dset(dset)[source]
Get zarr dataset.
Parameters:
- dset : str
Dataset name
Returns:
zarr.Array
The zarr dataset
- get_f32(dset, slices)[source]
Get float32 array from dataset.
Parameters:
- dset : str
Dataset name
- slices : ArraySlice
Array slicing specification
Returns:
npf32
Float32 numpy array
- get_c64(dset, slices)[source]
Get complex64 array from dataset.
Parameters:
- dset : str
Dataset name
- slices : ArraySlice
Array slicing specification
Returns:
npc64
Complex64 numpy array
- property mpl: MMPPlotter
Get matplotlib plotter for this single result.
- property matplotlib: MMPPlotter
Get matplotlib plotter for this single result (alias for mpl).
- mmpp.core.find_largest_m_dataset(zarr_path)[source]
Automatically find the m dataset with the largest time dimension.
Parameters:
- zarr_path : str
Path to zarr file
Returns:
str
Name of the largest m dataset (e.g., “m_z5-8”, “m_z11-12”, or fallback “m”)
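The selection rule can be sketched without opening a zarr store. The example below is a simplified illustration, not the library's implementation: it assumes the time dimension is axis 0 and that candidate names start with "m", neither of which the source confirms. The helper `pick_largest_m` and the shapes are hypothetical:

```python
def pick_largest_m(shapes, fallback="m"):
    """Pick the "m..." dataset with the most time steps.

    `shapes` maps dataset names to array shapes. The time axis is
    assumed to be axis 0 (an assumption, not taken from the source).
    """
    candidates = {
        name: shape[0]
        for name, shape in shapes.items()
        if name.startswith("m") and len(shape) > 0
    }
    if not candidates:
        # Nothing suitable found: fall back to the plain "m" name.
        return fallback
    return max(candidates, key=candidates.get)


# Hypothetical shapes, for illustration only.
shapes = {
    "m": (1, 1, 64, 64, 3),
    "m_z5-8": (500, 4, 64, 64, 3),
    "m_z11-12": (2000, 2, 64, 64, 3),
    "B_ext": (5000, 3),
}

print(pick_largest_m(shapes))
print(pick_largest_m({"B_ext": (5000, 3)}))
```

Note that `B_ext` is ignored even though it has the longest first axis, because only names beginning with "m" are considered in this sketch.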
- class mmpp.core.MMPP(base_path, max_workers=8, database_name='mmpy_database', debug=False)[source]
Bases: object
Multi-threaded scanner for zarr folders with pandas database creation and search functionality.
This class scans directories recursively for .zarr folders, extracts metadata using Pyzfn, and creates a searchable pandas database.
- __init__(base_path, max_workers=8, database_name='mmpy_database', debug=False)[source]
Initialize the MMPP.
Parameters:
- base_path : str
Base directory path to scan for zarr folders OR direct path to .zarr file
- max_workers : int, optional
Maximum number of worker threads for scanning (default: 8)
- database_name : str, optional
Name of the database file (without extension, default: “mmpy_database”)
- debug : bool, optional
Enable debug logging (default: False)
- __getitem__(index)[source]
Get zarr result by index or batch operations by slice.
Parameters:
- index : Union[int, slice]
Index of the result to get or slice for batch operations
Returns:
Union[ZarrJobResult, BatchOperations]
Single zarr result for integer index or batch operations for slice
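The int-versus-slice behaviour is a standard `__getitem__` dispatch. A minimal sketch with toy classes; `Results` and `Batch` are hypothetical stand-ins and do not reproduce `MMPP` or `BatchOperations`:

```python
class Batch:
    """Hypothetical stand-in for a batch-operations wrapper."""

    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)


class Results:
    """Toy container: an int index gives one item, a slice gives a batch."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice selects several items, so wrap them in a batch object.
            return Batch(self._items[index])
        if isinstance(index, int):
            return self._items[index]
        raise TypeError(f"index must be int or slice, not {type(index).__name__}")


results = Results(["job0", "job1", "job2", "job3"])
print(results[1])         # a single item
print(len(results[1:3]))  # a batch holding two items
```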
- property mpl: MMPPlotter
Get matplotlib plotter for all results.
- property matplotlib: MMPPlotter
Get matplotlib plotter for all results (alias for mpl).
- scan(force=False)[source]
Scan the base directory for zarr folders and create/update the database.
Parameters:
- force : bool, optional
If True, force rescan even if database exists (default: False)
Returns:
pd.DataFrame
The resulting database DataFrame
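A recursive, multi-threaded scan of this kind can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: `find_zarr_dirs`, `read_metadata`, and this `scan` function are hypothetical, the metadata step is a placeholder for reading zarr attributes, and the result is a list of dictionaries rather than a pandas DataFrame:

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def find_zarr_dirs(base_path):
    """Recursively collect directories whose name ends with .zarr."""
    found = []
    for root, dirs, _files in os.walk(base_path):
        for d in list(dirs):
            if d.endswith(".zarr"):
                found.append(os.path.join(root, d))
                dirs.remove(d)  # do not descend into the store itself
    return sorted(found)


def read_metadata(path):
    """Hypothetical per-folder step; a real scanner would read zarr attributes."""
    return {"path": path, "name": os.path.basename(path)}


def scan(base_path, max_workers=8):
    """Process every .zarr folder in a thread pool."""
    paths = find_zarr_dirs(base_path)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the inputs.
        return list(pool.map(read_metadata, paths))


# Build a throwaway directory tree to scan.
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "run_a", "sim1.zarr", "m"))
    os.makedirs(os.path.join(base, "run_b", "nested", "sim2.zarr"))
    os.makedirs(os.path.join(base, "run_b", "not_a_store"))
    records = scan(base)

print([r["name"] for r in records])
```

Threads fit this workload because reading metadata from many folders is dominated by file I/O rather than computation.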
- force_rescan()[source]
Force a complete rescan of the directory structure.
Returns:
pd.DataFrame
The resulting database DataFrame
- get_parsing_examples(zarr_path)[source]
Get examples of how a specific path would be parsed. Useful for debugging path parsing.
Parameters:
- zarr_path : str
Path to analyze
Returns:
Dict[str, Any]
Dictionary showing parsing results for each component
- find(**kwargs)[source]
Find zarr folders that match the given criteria. Returns a PlotterProxy with plotting capabilities.
- Return type:
Union[PlotterProxy, list[ZarrJobResult]]
Parameters:
- **kwargs : Any
Attribute criteria to match (e.g., PBCx=1, Nx=1296, solver=3)
Returns:
PlotterProxy
Proxy object containing ZarrJobResult objects with plotting capabilities
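Keyword criteria of this form amount to an equality filter over the metadata. A dependency-free sketch of the idea, using plain dictionaries in place of the pandas database; the function `find_records` and the rows are hypothetical, and exact-equality matching on every key is an assumption about how criteria combine:

```python
def find_records(records, **criteria):
    """Return the records whose attributes equal every given criterion."""
    return [
        rec
        for rec in records
        if all(key in rec and rec[key] == value for key, value in criteria.items())
    ]


# Hypothetical metadata rows, for illustration only.
records = [
    {"path": "a.zarr", "PBCx": 1, "Nx": 1296, "solver": 3},
    {"path": "b.zarr", "PBCx": 0, "Nx": 1296, "solver": 3},
    {"path": "c.zarr", "PBCx": 1, "Nx": 648, "solver": 5},
]

print([r["path"] for r in find_records(records, PBCx=1, Nx=1296)])
print([r["path"] for r in find_records(records, solver=3)])
```

In this sketch a record that lacks one of the requested keys never matches, and calling the function with no criteria returns every record.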
- find_paths(**kwargs)[source]
Find zarr folder paths that match the given criteria.
Parameters:
- **kwargs : Any
Attribute criteria to match (e.g., PBCx=1, Nx=1296)
Returns:
List[str]
List of paths to zarr folders matching the criteria
- find_by_path_param(**kwargs)[source]
Find zarr folders that match path-extracted parameters specifically.
Parameters:
- **kwargs : Any
Path parameter criteria to match (e.g., solver=3, f0=2.15e+09)
Returns:
List[ZarrJobResult]
List of ZarrJobResult objects matching the criteria
- find_by_path_param_paths(**kwargs)[source]
Find zarr folder paths that match path-extracted parameters specifically.
Parameters:
- **kwargs : Any
Path parameter criteria to match (e.g., solver=3, f0=2.15e+09)
Returns:
List[str]
List of paths to zarr folders matching the criteria
- get_job(path)[source]
Get a specific job by its path.
Parameters:
- path : str
Path to the zarr folder
Returns:
Optional[ZarrJobResult]
ZarrJobResult object or None if not found
- get_all_jobs()[source]
Get all jobs as ZarrJobResult objects.
Returns:
List[ZarrJobResult]
List of all ZarrJobResult objects in the database
- get_database()[source]
Get the current database DataFrame.
Returns:
Optional[pd.DataFrame]
The database DataFrame or None if not loaded
- get_unique_values(column)[source]
Get unique values for a specific column.
Parameters:
- column : str
Column name
Returns:
List[Any]
List of unique values in the column
- get_summary()[source]
Get a summary of the database.
Returns:
Dict[str, Any]
Summary information about the database
- get_path_parameters(zarr_path)[source]
Get parameters extracted from a specific zarr path.
Parameters:
- zarr_path : str
Path to the zarr folder
Returns:
Dict[str, Any]
Dictionary of parameters extracted from the path
- get_path_parameter_summary()[source]
Get a summary of all path-extracted parameters and their unique values.
Returns:
Dict[str, List[Any]]
Dictionary mapping parameter names to lists of unique values
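The shape of such a summary can be illustrated with a few lines of plain Python. This is a sketch only: `unique_values_by_key` and the sample rows are hypothetical, and keeping values in first-seen order is a choice made here, not something the source specifies:

```python
def unique_values_by_key(rows):
    """Map each parameter name to its unique values, in first-seen order."""
    summary = {}
    for row in rows:
        for key, value in row.items():
            seen = summary.setdefault(key, [])
            if value not in seen:
                seen.append(value)
    return summary


# Hypothetical path-extracted parameters, for illustration only.
rows = [
    {"solver": 3, "f0": 2.15e9},
    {"solver": 3, "f0": 2.60e9},
    {"solver": 5, "f0": 2.15e9},
]

print(unique_values_by_key(rows))
```

A summary like this makes it easy to see which values are available before passing criteria such as `solver=3` to a path-parameter search.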
- list_data(limit=10)[source]
Display a formatted list of all data in the database.
Parameters:
- limit : int, optional
Maximum number of entries to display (default: 10, use -1 for all)
- show(max_rows=1000, height=400)[source]
Show interactive pandas DataFrame viewer.
Parameters:
- max_rows : int, optional
Maximum number of rows to display (default: 1000)
- height : int, optional
Height of the table in pixels (default: 400)
- set_interactive_mode(enabled=True)[source]
Enable or disable interactive display mode.
Parameters:
- enabled : bool, optional
Whether to enable interactive mode (default: True)
- mmpp.core.mmpp(base_path, force=False, **kwargs)[source]
Convenience function to create and initialize an MMPP instance.
Parameters:
- base_path : str
Base directory path to scan
- force : bool, optional
If True, force rescan even if database exists (default: False)
- **kwargs : Any
Additional arguments passed to MMPP constructor
Returns:
MMPP
Initialized processor instance