Workflow Creation

The MARRMOTWorkflow class is the central component of MarrmotFlow. This guide explains how to create and configure workflows for different use cases.

Basic Workflow Setup

Creating a basic workflow requires minimal configuration:

from marrmotflow import MARRMOTWorkflow
import geopandas as gpd

# Load catchment data
catchments = gpd.read_file("catchments.shp")

# Create workflow
workflow = MARRMOTWorkflow(
    name="MyWorkflow",
    cat=catchments,
    forcing_files=["climate_data.nc"],
    forcing_vars={"precip": "precipitation", "temp": "temperature"}
)

Required Parameters

name

A string identifier for your workflow:

workflow = MARRMOTWorkflow(name="WatershedAnalysis2024")

cat

Catchment data as a GeoDataFrame or file path:

# From GeoDataFrame
catchments = gpd.read_file("catchments.shp")
workflow = MARRMOTWorkflow(cat=catchments)

# From file path
workflow = MARRMOTWorkflow(cat="catchments.shp")

forcing_vars

Dictionary mapping standard variable names to your data variable names:

forcing_vars = {
    "precip": "precipitation",  # Required
    "temp": "temperature"       # Required
}

Optional Parameters

forcing_files

Paths to your forcing data files:

# Single file
forcing_files = "climate_data.nc"

# Multiple files
forcing_files = [
    "precip_2020.nc",
    "temp_2020.nc",
    "climate_2021.nc"
]

forcing_units

Units for your forcing variables:

forcing_units = {
    "precip": "mm/day",
    "temp": "celsius"
}

pet_method

Method for calculating potential evapotranspiration:

# Available methods
pet_method = "penman_monteith"  # Default
pet_method = "hamon"

model_number

MARRMOT model(s) to use:

# Single model
model_number = 7  # HBV-96

# Multiple models
model_number = [7, 37, 1]  # HBV-96, GR4J, and Collie River Basin 1

Time Zone Configuration

forcing_time_zone

Time zone of your forcing data:

forcing_time_zone = "UTC"
forcing_time_zone = "America/Edmonton"
forcing_time_zone = "Europe/London"

model_time_zone

Time zone for model execution:

model_time_zone = "America/Vancouver"

Advanced Configuration Examples

Multi-Model Watershed Analysis

workflow = MARRMOTWorkflow(
    name="MultiModelComparison",
    cat="large_watershed.shp",
    forcing_files=[
        "era5_precip_2010_2020.nc",
        "era5_temp_2010_2020.nc"
    ],
    forcing_vars={
        "precip": "total_precipitation",
        "temp": "2m_temperature"
    },
    forcing_units={
        "precip": "m/day",  # ERA5 uses meters
        "temp": "kelvin"    # ERA5 uses Kelvin
    },
    pet_method="penman_monteith",
    model_number=[7, 37, 1, 2],  # Multiple models for comparison
    forcing_time_zone="UTC",
    model_time_zone="America/Edmonton"
)

Regional Climate Study

workflow = MARRMOTWorkflow(
    name="ClimateChangeImpact",
    cat=gpd.read_file("regional_catchments.geojson"),
    forcing_files="gcm_downscaled_data.nc",
    forcing_vars={
        "precip": "pr",      # CMIP6 standard names
        "temp": "tas"
    },
    forcing_units={
        "precip": "kg m-2 s-1",  # CMIP6 standard units
        "temp": "K"
    },
    pet_method="penman_monteith",
    model_number=37,  # GR4J for this study
    forcing_time_zone="UTC",
    model_time_zone="local"
)

Error Handling

Common errors and solutions:

Missing Required Parameters:

try:
    workflow = MARRMOTWorkflow(name="Test")
except ValueError as e:
    print(f"Error: {e}")
    # Catchment (cat) must be provided

Invalid File Paths:

import os

forcing_file = "nonexistent_file.nc"
if not os.path.exists(forcing_file):
    print(f"Warning: {forcing_file} does not exist")

Incompatible Units:

import pint

ureg = pint.UnitRegistry()
try:
    ureg.parse_expression("invalid_unit")
except pint.UndefinedUnitError:
    print("Invalid unit specified")

Best Practices

Use descriptive names: Choose meaningful workflow names for easier identification
Validate inputs: Check that files exist and units are valid before creating workflows
Document your choices: Keep track of why you chose specific models and methods
Start simple: Begin with single models and basic configurations before advancing
Test with subsets: Use small catchments or short time periods for initial testing