
# TASK: Full GDrive Sync + Run Annotator on FarmDefender Field Pics

**Priority:** HIGH
**Agent:** engineer
**Filed:** 2026-02-16
**Filed by:** Doug (via Claude Code session)

---

## Goal

1. Sync ALL GDrive data to the Pi's external drive so it's always available locally
2. Set up a cron job so it stays in sync automatically
3. Run the auto-annotator on the REAL FarmDefender ground-level field photos (NOT drone photos)
4. Update the web review interface to point at the correct data

**IMPORTANT:** The previous test (Job #56) used DRONE photos from Dailey. That was WRONG. We need ground-level phone/camera photos from the FarmDefender field pics folder. Delete any drone-based test results and re-do with the correct photos.

---

## Step 1: Full GDrive Sync

Two rclone remotes are configured:

| Remote | What | Size | Sync To |
|--------|------|------|---------|
| `gdrive:` | Doug's personal GDrive | ~267 GB | `/data/gdrive/` |
| `clients:` | Client data GDrive | ~137 GB | `/data/clients/` (already synced; may be stale) |

### Sync commands:

```bash
# Sync personal GDrive (267 GB — will take hours, that's fine)
mkdir -p /data/gdrive
rclone sync gdrive: /data/gdrive/ --transfers 8 --checkers 16 --progress 2>&1 | tee /data/gdrive_sync.log

# Re-sync clients GDrive (refresh what's already there)
rclone sync clients: /data/clients/ --transfers 8 --checkers 16 --progress 2>&1 | tee /data/clients_sync.log
```

**Notes:**
- Use `rclone sync` (mirror), not `rclone copy` (additive). `sync` deletes local files that no longer exist on the remote, which is what we want: an exact mirror.
- If `clients:` token is expired, skip it and note the error. Doug will need to re-auth.
- 1.8TB drive, 151GB used, ~1.4TB free. Plenty of room for both (~400GB total).
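
Before kicking off a multi-hour transfer, a quick reachability check on both remotes can save a wasted run (remote names as in the table above; `rclone lsd` is a cheap top-level listing call):

```bash
# Pre-flight: confirm each remote answers before starting a multi-hour sync.
status=""
for remote in gdrive: clients:; do
  if rclone lsd "$remote" >/dev/null 2>&1; then
    status="$status $remote=ok"
  else
    # Likely an expired token: skip that remote and note it for Doug.
    status="$status $remote=unreachable"
  fi
done
echo "remote check:$status"
```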

### Set up cron for daily sync:

```bash
# Add to crontab — sync both drives at 3am daily
(crontab -l 2>/dev/null; echo "0 3 * * * rclone sync gdrive: /data/gdrive/ --transfers 4 --checkers 8 --log-file /data/gdrive_sync.log --log-level INFO 2>&1") | crontab -
(crontab -l 2>/dev/null; echo "30 3 * * * rclone sync clients: /data/clients/ --transfers 4 --checkers 8 --log-file /data/clients_sync.log --log-level INFO 2>&1") | crontab -
```
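
Note that the two lines above append blindly, so re-running the setup schedules duplicate jobs. A small helper (hypothetical, named `add_cron_once` here for this task) makes the step idempotent:

```bash
# Add a crontab entry only if it is not already present.
add_cron_once() {
  line="$1"
  command -v crontab >/dev/null 2>&1 || { echo "crontab not available"; return 0; }
  if crontab -l 2>/dev/null | grep -qF "$line"; then
    echo "already scheduled: $line"
  else
    (crontab -l 2>/dev/null; echo "$line") | crontab - || echo "failed to update crontab"
  fi
}

add_cron_once "0 3 * * * rclone sync gdrive: /data/gdrive/ --transfers 4 --checkers 8 --log-file /data/gdrive_sync.log --log-level INFO 2>&1"
add_cron_once "30 3 * * * rclone sync clients: /data/clients/ --transfers 4 --checkers 8 --log-file /data/clients_sync.log --log-level INFO 2>&1"
```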

---

## Step 2: Run Annotator on FarmDefender Field Pics

Once the GDrive sync completes, the FarmDefender photos will be at:

```
/data/gdrive/FarmTech/FarmDefender/Field pics/
├── Weeds/          # 924 photos — PRIMARY TARGET for green-on-brown
├── Bare ground/    # 517 photos — negative examples (no weeds)
├── Rocks/          # 114 photos — for rock annotator
├── Green Peas/     # 186 photos — for green-on-green
├── Winter wheat/   # 117 photos — for green-on-green
├── Barley/         # 71 photos — for green-on-green
├── Lentils/        # 25 photos — for green-on-green
├── Alfalfa/        # 21 photos — for green-on-green
├── Show pics/      # 209 photos — demo shots
├── Canola/         # empty
├── Chickpeas/      # empty
└── Spring Wheat/   # empty
```

**All photos are HEIC format** (iPhone). The annotator supports HEIC via `pillow-heif`. Make sure it's installed:

```bash
pip install pillow-heif
```
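
A quick check that the import actually works (this only verifies the package is importable; it doesn't exercise the annotator's HEIC path):

```bash
# Verify pillow-heif is importable by the same python3 the annotator uses.
if python3 -c "import pillow_heif" 2>/dev/null; then
  heif_status="installed"
else
  heif_status="missing (run: pip install pillow-heif)"
fi
echo "pillow-heif: $heif_status"
```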

### Run HSV auto-annotator on the Weeds folder:

```bash
cd /data/Sandbox/Annotator

# Smoke-test first at default sensitivity (5); `head` only trims the output to 30 lines
python3 auto_annotators/hsv_green.py "/data/gdrive/FarmTech/FarmDefender/Field pics/Weeds/" 5 2>&1 | head -30

# If that works, run on all 924
python3 auto_annotators/hsv_green.py "/data/gdrive/FarmTech/FarmDefender/Field pics/Weeds/" 5

# Compare sensitivity 3 and 8 (first 15 lines of output each)
python3 auto_annotators/hsv_green.py "/data/gdrive/FarmTech/FarmDefender/Field pics/Weeds/" 3 2>&1 | head -15
python3 auto_annotators/hsv_green.py "/data/gdrive/FarmTech/FarmDefender/Field pics/Weeds/" 8 2>&1 | head -15
```
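
The three sensitivity runs above can also be combined into one loop (same script and positional sensitivity argument as used in this task):

```bash
# Sweep HSV sensitivities 3, 5, 8 over the Weeds folder, trimming each run's output.
WEEDS="/data/gdrive/FarmTech/FarmDefender/Field pics/Weeds/"
for s in 3 5 8; do
  echo "=== hsv_green sensitivity $s ==="
  python3 auto_annotators/hsv_green.py "$WEEDS" "$s" 2>&1 | head -15
done
```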

### Run green-on-green on crop folders:

```bash
# Green peas (most photos)
python3 auto_annotators/green_on_green.py "/data/gdrive/FarmTech/FarmDefender/Field pics/Green Peas/" 40

# Winter wheat
python3 auto_annotators/green_on_green.py "/data/gdrive/FarmTech/FarmDefender/Field pics/Winter wheat/" 40

# Barley
python3 auto_annotators/green_on_green.py "/data/gdrive/FarmTech/FarmDefender/Field pics/Barley/" 40
```
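
Since the folder listing above flags Lentils and Alfalfa for green-on-green as well, a loop over all five non-empty crop folders may save typing (same script and positional argument, 40, as above):

```bash
# Run green_on_green over every non-empty crop folder from the listing.
BASE="/data/gdrive/FarmTech/FarmDefender/Field pics"
for crop in "Green Peas" "Winter wheat" "Barley" "Lentils" "Alfalfa"; do
  echo "--- green_on_green: $crop ---"
  python3 auto_annotators/green_on_green.py "$BASE/$crop/" 40 \
    || echo "run failed for $crop (check path and annotator output)"
done
```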

---

## Step 3: Update Web Review Interface

Point the web review at the FarmDefender weed photos (not drone photos):

```bash
cd /data/Sandbox/Annotator/web_review
python3 app.py --data-dir "/data/gdrive/FarmTech/FarmDefender/Field pics/Weeds" --port 5100
```

If the review server is already running, kill it and restart with the correct path.
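
One way to do that, assuming the server was launched as `python3 app.py` from `web_review/` (match the process with `pkill -f`, then restart detached):

```bash
# Kill any running review server and relaunch it on port 5100.
PORT=5100
DIR=/data/Sandbox/Annotator/web_review
pkill -f "web_review/app.py" 2>/dev/null || echo "no running review server found"
if [ -d "$DIR" ]; then
  cd "$DIR"
  nohup python3 app.py --data-dir "/data/gdrive/FarmTech/FarmDefender/Field pics/Weeds" \
    --port "$PORT" > /tmp/web_review.log 2>&1 &
  echo "review server restarted on port $PORT (log: /tmp/web_review.log)"
else
  echo "missing $DIR; start the server manually"
fi
```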

---

## Step 4: Clean Up Drone Test Data

Remove all drone-related test data from the previous run:

```bash
# Remove drone test batch (if not already removed)
rm -rf /data/Sandbox/Annotator/data/test_batch/
rm -rf /data/Sandbox/Annotator/data/test_results/

# Update TEST_SUMMARY.md to note it was replaced
```

---

## Step 5: Report Results

Update `/data/Sandbox/Annotator/TEST_SUMMARY.md` with:

- Confirmation that GDrive sync is running and cron is set up
- How many FarmDefender photos were found at each path
- HSV annotator results on Weeds folder (detections per image at sensitivity 3, 5, 8)
- Green-on-green results on crop folders
- Any HEIC-related issues
- Whether the web review is running and accessible
- Disk usage after sync
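
A few one-liners that gather most of those numbers (paths, port, and cron entries from the steps above; assumes the `Field pics` folder has synced):

```bash
# Photo count per subfolder (HEIC only, per the format note above).
FP="/data/gdrive/FarmTech/FarmDefender/Field pics"
for d in "$FP"/*/; do
  [ -d "$d" ] || continue
  count=$(find "$d" -maxdepth 1 -iname '*.heic' | wc -l)
  echo "$count  $d"
done

# Disk usage after sync.
df -h /data 2>/dev/null || echo "df: /data not mounted"

# Confirm the daily sync cron entries exist.
crontab -l 2>/dev/null | grep rclone || echo "no rclone cron entries found"
```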

---

## DO NOT

- Do NOT use drone photos for annotator testing (that was the wrong photo type)
- Do NOT delete any GDrive files (read-only remotes)
- Do NOT install ultralytics/YOLO (not needed for HSV/green-on-green testing)
- Do NOT modify the auto-annotator core code (hsv_green.py, green_on_green.py)
- Do NOT conflict with existing Pi dashboard (port 5000). Use port 5100 for review
- Do NOT start annotation with drone/satellite/orthomosaic imagery — ground-level only

## IMPORTANT CONTEXT

- These are GROUND-LEVEL photos taken with a phone/camera, NOT drone imagery
- The whole point is training AI for FarmDefender spot spraying cameras (ground-level mounted)
- Green on brown = green weeds on bare soil. The easiest CV problem in ag.
- HEIC support requires `pillow-heif` package
- The previous test (Job #56) must be superseded — those results are invalid
