# TASK: ISOXML Processor — Add Rate Summaries to metadata.json

**Priority:** HIGH
**Agent:** engineer
**Filed:** 2026-02-16
**Filed by:** Doug (via Claude Code session)

---

## Goal

The ISOXML processor (`FarmData_API_Integration/processors/isoxml_processor.py`) outputs metadata.json with product names (varieties, fertilizers) but does NOT include rates. The rates already exist in coverage_points.csv (columns: `rate_raw`, `rate`, `rate_fert_raw`, `rate_fert`) but are never summarized into metadata.json. Add average/target rates per product to the metadata summary.

## The Gap

**Current metadata.json output:**
```json
{
  "field": "2025-09-05-15-47-55",
  "crop_year": 2026,
  "crop_name": "WINTER WHEAT",
  "varieties": ["WHEAT-HR"],
  "fertilizers": ["46-00-00", "11-52-0-0"],
  "seed_date": "20250905",
  "planted_acres": 39.91,
  "point_count": 38008
}
```

**What it should also include:**
```json
{
  "rates": {
    "seed": {
      "product": "WHEAT-HR",
      "avg_rate": 85.3,
      "target_rate": 85.0,
      "unit": "lb/ac"
    },
    "fertilizer": {
      "products": [
        {"product": "46-00-00", "avg_rate": 60.0, "target_rate": 60.0, "unit": "lb/ac"},
        {"product": "11-52-0-0", "avg_rate": 45.2, "target_rate": 45.0, "unit": "lb/ac"}
      ]
    }
  }
}
```

## What Exists

- `coverage_points.csv` already has: `rate_raw`, `rate` (seed rate, scaled), `rate_fert_raw`, `rate_fert` (fertilizer rate, scaled)
- The processor already reads DDI values and applies scale factors during BIN parsing
- Product names are already detected and written to metadata
- The `rate` column = seed product rate; `rate_fert` column = fertilizer product rate

## Steps

### 1. After coverage_points DataFrame is built, compute rate summaries
```python
# Filter to work_state=1 only (active application, not headlands/turns)
active = df[df['work_state'] == 1]

# Seed rate summary (from the 'rate' column)
if 'rate' in active.columns and active['rate'].notna().any():
    nonzero = active.loc[active['rate'] > 0, 'rate']
    seed_avg = round(nonzero.mean(), 2) if len(nonzero) > 0 else 0
    seed_median = round(nonzero.median(), 2) if len(nonzero) > 0 else 0
    # target_rate: mode of values rounded to 1 decimal (prescription target)
    seed_target = float(nonzero.round(1).mode().iloc[0]) if len(nonzero) > 0 else 0

# Fertilizer rate summary (from the 'rate_fert' column)
if 'rate_fert' in active.columns and active['rate_fert'].notna().any():
    nonzero_fert = active.loc[active['rate_fert'] > 0, 'rate_fert']
    fert_avg = round(nonzero_fert.mean(), 2) if len(nonzero_fert) > 0 else 0
    fert_target = float(nonzero_fert.round(1).mode().iloc[0]) if len(nonzero_fert) > 0 else 0
```

### 2. Add to metadata dict before writing
Add a `"rates"` key to the metadata dict. Include:
- `avg_rate` — mean of non-zero active points
- `median_rate` — median of non-zero active points
- `target_rate` — mode (most common value when rounded to 1 decimal), represents the prescription target
- `min_rate` / `max_rate` — for VR verification (variable rate spread)
- `unit` — detect from DDI if possible, otherwise default to raw units with a note
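The stats above could be centralized in one helper so seed and fertilizer columns share the same logic. A hedged sketch (`summarize_rate_series` is a hypothetical name, not an existing processor function; the "raw units" default follows the fallback noted above):

```python
from typing import Optional

import pandas as pd


def summarize_rate_series(rates: pd.Series, unit: str = "raw units") -> Optional[dict]:
    """Summarize non-zero rate values into the stats listed above.

    Returns None when no non-zero values exist, so the caller can
    simply omit the section instead of writing zeros.
    """
    nonzero = rates.dropna()
    nonzero = nonzero[nonzero > 0]
    if nonzero.empty:
        return None
    return {
        "avg_rate": round(float(nonzero.mean()), 2),
        "median_rate": round(float(nonzero.median()), 2),
        # mode of values rounded to 1 decimal == the prescription target
        # (pandas mode() sorts ascending; ties resolve to the smallest value)
        "target_rate": float(nonzero.round(1).mode().iloc[0]),
        "min_rate": round(float(nonzero.min()), 2),
        "max_rate": round(float(nonzero.max()), 2),
        "unit": unit,
    }
```

The None return doubles as the all-zero edge-case signal from step 3, so the caller never writes a zeroed-out section.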

### 3. Handle edge cases
- Some ZIPs have rate columns that are ALL zero (e.g., work_state tracking only) — omit rates section, don't write zeros
- Some ZIPs have rate but no rate_fert (single-product runs) — only include what exists
- When multiple fertilizer products exist but only one rate_fert column, note that rates are combined (Topcon limitation: single fert rate channel for multi-product tanks)
- Spray operations may have rate columns too — include them with appropriate labeling
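The omission rules above might wire together like this (a sketch, not the processor's actual API: `build_rates_section`, the injected `summarize` callable, and the product-list parameters are all assumptions):

```python
def build_rates_section(df, seed_products, fert_products, summarize):
    """Assemble the 'rates' dict, skipping anything all-zero or absent."""
    active = df[df['work_state'] == 1]
    rates = {}

    if 'rate' in active.columns:
        seed = summarize(active['rate'])
        if seed is not None:  # all-zero column -> None -> omit, don't write zeros
            seed['product'] = seed_products[0] if seed_products else None
            rates['seed'] = seed

    if 'rate_fert' in active.columns:
        fert = summarize(active['rate_fert'])
        if fert is not None:
            entry = {'products': [dict(fert, product=p) for p in fert_products]}
            # Topcon limitation: one rate_fert channel may cover several products
            if len(fert_products) > 1:
                entry['note'] = 'combined rate across multiple fertilizer products'
            rates['fertilizer'] = entry

    return rates or None  # empty dict -> omit the whole section
```

Returning None when nothing qualifies keeps the caller's logic symmetrical: no rates, no `"rates"` key.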

### 4. Backfill existing processed data (optional follow-up)
After the processor is updated, a batch re-read of existing coverage_points.csv files can backfill rates into their metadata.json without re-processing the ZIPs. This is a separate step — just adding the capability to the processor is the primary goal.
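The backfill could walk the processed output tree, recompute summaries from each coverage_points.csv, and merge them into the sibling metadata.json. A hedged sketch, assuming metadata.json sits next to its coverage_points.csv and that a `compute_rates` callable (hypothetical) wraps whatever summary logic lands in the processor:

```python
import json
from pathlib import Path

import pandas as pd


def backfill_rates(root: Path, compute_rates) -> int:
    """Add a 'rates' key to every metadata.json that sits beside a
    coverage_points.csv, without re-processing the original ZIPs.

    Returns the number of metadata.json files updated.
    """
    updated = 0
    for csv_path in root.rglob('coverage_points.csv'):
        meta_path = csv_path.with_name('metadata.json')
        if not meta_path.exists():
            continue
        meta = json.loads(meta_path.read_text())
        if 'rates' in meta:
            continue  # already backfilled; idempotent re-runs are safe
        rates = compute_rates(pd.read_csv(csv_path))
        if rates:  # omit when all-zero / absent, per the edge cases above
            meta['rates'] = rates  # only adds a key; existing fields untouched
            meta_path.write_text(json.dumps(meta, indent=2))
            updated += 1
    return updated
```

Skipping files that already have a `"rates"` key makes the batch safe to re-run, and writing only an added key preserves the backward-compatibility requirement below.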

## Files to Modify

- **Primary:** `FarmData_API_Integration/processors/isoxml_processor.py` — the `_build_metadata()` or equivalent function that writes metadata.json
- **Test with:** Any Blanchet or Doug seeding ZIP that produces rate columns (e.g., the 91300 4TANK-MTRG files)

## Verification

- Process a known ZIP and confirm metadata.json now has a `"rates"` section
- Rates should match a manual spot-check of coverage_points.csv (mean of non-zero rates over work_state=1 rows)
- Existing metadata fields must not change (backward compatible — only adding new keys)
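The manual spot-check can be done directly in pandas. A small sketch (the function name and csv path parameter are illustrative, not part of the processor):

```python
import pandas as pd


def spot_check_seed_avg(csv_path: str) -> float:
    """Mean of non-zero seed rates over work_state=1 rows,
    for comparison against metadata.json rates.seed.avg_rate."""
    df = pd.read_csv(csv_path)
    active = df[df['work_state'] == 1]
    nonzero = active.loc[active['rate'] > 0, 'rate']
    return round(float(nonzero.mean()), 2)
```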

## DO NOT
- Do NOT change the BIN parsing or rate scaling logic — that already works
- Do NOT change the coverage_points.csv output format
- Do NOT remove or rename any existing metadata.json fields
- Do NOT break batch mode (`--batch`) — rate summary must work silently
