Batch Operations Tutorial

Overview

MMPP provides powerful batch processing capabilities that allow you to perform operations on entire directories of simulation results. This is particularly useful when you have many simulation outputs and want to process them all at once.

Basic Usage

Getting All Results

To work with all results in a directory, use slice notation:

import mmpp

# Load all results from a directory
op = mmpp.MMPP('path/to/results')
all_results = op[:]  # Returns BatchOperations instance

FFT Analysis on Multiple Results

# Perform FFT analysis on all results
batch_fft = all_results.fft
fft_results = batch_fft.compute_all(
    dset='m_z5-8'
)

print(f"Processed {len(fft_results)} results")
for result in fft_results:
    if result['success']:
        print(f"✓ {result['file']}: {result['result'].shape}")
    else:
        print(f"✗ {result['file']}: {result['error']}")

Mode Analysis

# Compute FMR modes for all results
mode_results = all_results.fft.modes.compute_modes(
    dset='m_z5-8',
    parallel=True,
    max_workers=4
)

# Filter successful results
successful_modes = [r for r in mode_results if r['success']]
print(f"Successfully computed modes for {len(successful_modes)} files")

Advanced Features

Custom Processing

You can define custom processing functions:

def custom_analysis(zarr_result):
    """Custom analysis function"""
    # Your custom processing logic here
    return {"custom_metric": some_value}

# Note: Custom analysis functions would need custom implementation
# This is an example of what could be possible:
# custom_results = process_custom_analysis(all_results, custom_analysis)

Error Handling

The batch operations provide comprehensive error handling:

results = all_results.fft.compute_all('m_z5-8')

# Check for errors
errors = [r for r in results if not r['success']]
if errors:
    print(f"Found {len(errors)} errors:")
    for error in errors:
        print(f"  {error['file']}: {error['error']}")

Progress Tracking

All batch operations support progress tracking:

# Enable progress bars
results = all_results.fft.modes.compute_modes(
    dset='m_z5-8',
    progress=True,  # Shows progress bar
    parallel=True,
    max_workers=8
)

Best Practices

  1. Use parallel processing for large datasets with parallel=True

  2. Enable progress tracking with progress=True for long operations

  3. Handle errors gracefully by checking the success field in results

  4. Limit workers with max_workers to avoid overwhelming your system

  5. Use appropriate datasets - make sure the dataset name exists in your files

Example: Complete Workflow

import mmpp

# Load data
op = mmpp.MMPP('simulation_results/')
batch = op[:]

print(f"Found {len(batch.zarr_results)} simulation files")

# Compute FFT spectra
print("Computing FFT spectra...")
spectra = batch.fft.compute_all(
    dset='m_z5-8',
    parallel=True,
    progress=True,
    max_workers=6
)

# Compute modes
print("Computing FMR modes...")  
modes = batch.fft.modes.compute_modes(
    dset='m_z5-8', 
    parallel=True,
    progress=True,
    max_workers=6
)

# Generate summary
successful_spectra = sum(1 for r in spectra if r['success'])
successful_modes = sum(1 for r in modes if r['success'])

print(f"\\nResults Summary:")
print(f"  Spectra computed: {successful_spectra}/{len(spectra)}")
print(f"  Modes computed: {successful_modes}/{len(modes)}")

# Save results for later analysis
import pickle
with open('batch_results.pkl', 'wb') as f:
    pickle.dump({
        'spectra': spectra,
        'modes': modes
    }, f)

This workflow demonstrates the power of batch operations for processing large simulation datasets efficiently.