From Research Plot to Commercial Scale: A Two-Season Playbook

Research plots are the wrong model for evaluating commercial agricultural robots. A 5-acre research plot generates academic performance data — the machine works under controlled conditions with dedicated oversight. It tells you almost nothing about how the machine performs at 500 acres with normal crew rotation, variable soil conditions, and the ordinary entropy of a working farm.

But most operators don't have the resources to deploy a machine at 500 acres until they've tested it. The answer is a structured two-phase program that treats season one as deliberate measurement infrastructure, not a technology trial, and uses the season-one data to make a fully informed season-two scale decision.

This playbook is a step-by-step framework. It's designed for operations considering their first or second autonomous robot deployment in a single major crop type. The timelines are indexed to a standard Northern Hemisphere growing season with a primary weeding window; adjust for your crop calendar.

Season One: Measurement Infrastructure, Not Deployment

The frame shift matters. If you deploy a robot in season one with the goal of "seeing if it works," you will end up with anecdotes and a vendor sales rep's interpretation of your data. If you deploy with the goal of building measurement infrastructure to answer specific economic questions, you will end up with defensible numbers.

8 Weeks Before First Field Pass: Define the Questions

Before the machine arrives — ideally before you've signed the pilot agreement — write down the three economic questions you need season one to answer.

Good examples:

"What is the all-in per-acre cost of robotic weeding at our actual operating schedule, compared to our current cost of $X/acre for hand weeding?"
"Does the machine's weed control effectiveness reduce our crop loss rate measurably vs. prior seasons?"
"What is the operator time required per acre to support machine operation, including startup, monitoring, maintenance, and recalibration?"

Bad examples:

"Does the robot work?"
"Is it worth it?"
"Do we like it?"

The three economic questions determine exactly what data you need to collect. Write the data collection plan — who collects what, in what format, how often — before the machine arrives.

6 Weeks Before First Field Pass: Baseline Measurement Period

Run your operation as you normally would for four weeks. Log:

Labor hours by task (hand weeding, crop scouting, chemical application) — to the nearest 15 minutes per task
Chemical inputs consumed and cost, by field and date
Crop quality metrics relevant to your operation (weed density counts, crop loss rate, canopy closure timing — whatever your agronomist already monitors)
Machinery time for any field preparation, cultivation, or spray passes related to the target use case

This is your baseline. You will compare everything in season one against these numbers. Without it, you have no comparison.
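
As a rough illustration of the labor part of that baseline log, here is a minimal Python sketch. The record layout, the field and task names, and the hourly rate are hypothetical, not a prescribed format, and the sketch assumes each entry covers distinct acres.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LaborEntry:
    field: str      # hypothetical field identifier, e.g. "F1"
    task: str       # e.g. "hand_weeding", "scouting", "chemical_application"
    minutes: float  # raw minutes worked on the task
    acres: float    # acres covered by this entry (assumed distinct per entry)


def round_to_quarter_hour(minutes: float) -> float:
    """Round raw minutes to the nearest 15 minutes and return hours."""
    return round(minutes / 15) * 15 / 60


def labor_cost_per_acre(entries, hourly_rate):
    """Return {task: labor cost per acre} from a list of LaborEntry records."""
    hours = defaultdict(float)
    acres = defaultdict(float)
    for e in entries:
        hours[e.task] += round_to_quarter_hour(e.minutes)
        acres[e.task] += e.acres
    return {t: hours[t] * hourly_rate / acres[t] for t in hours if acres[t] > 0}
```

Chemical inputs and machinery time can be logged the same way. What matters is that every record carries a task and an acreage, so per-acre cost can be extracted later.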

Some operations have historical records going back 3–5 seasons that can serve as baseline with less burden. Use them if they exist and are in a format that makes per-acre and per-task cost extraction feasible.

4 Weeks Before First Field Pass: Infrastructure Readiness

Run the site assessment checklist from the field constraints article (article 4 in this series). Close every gap before the machine arrives.

Confirm in writing:

Cellular (LTE) coverage mapped across all planned operating areas
RTK correction source confirmed and tested at the field level
Charging infrastructure installed and load-tested
Operator assigned, trained, and scheduled for the full season (not just the first week)
Agronomist briefed and committed to crop-health KPI monitoring

Sign-off checklist: have the operator, the agronomist, and the operations manager all review and sign off on the readiness checklist. This document becomes evidence at the season-one review.

3 Weeks Before First Field Pass: Staff Preparation

Run a mandatory hands-on session with every crew member who will work alongside the machine. Not a briefing meeting — a session where they physically interact with the machine, see its range of motion, understand what it detects and what it doesn't, and practice the responses to common machine-stop situations.

Cover explicitly:

What the machine does and what it doesn't do (which tasks stay with the crew)
How to respond when the machine stops (what to check, what not to touch)
The communication protocol for reporting issues (who to tell, how)
Why the machine is here (economic framing: the machine is a tool for the crew's job, not a threat to it)

Hold this session in the language your crew works in. If translation is required, provide it — a briefing that 30% of your crew didn't understand is not a briefing.

Season One, Weeks 1–4: Constrained Deployment on a Single Field

Deploy the machine on one field, one crop, one shift pattern. Resist vendor pressure to expand scope in the first month.

Data collection in the first month is the job. Log every incident (machine stop, sensor cleaning required, map recalibration, maintenance event). Log every operator hour. Log coverage per day against planned coverage. Identify and document any variance from the vendor's performance spec.
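
One possible shape for the daily log described above, as a minimal Python sketch. The field names and the 10% tolerance for "below spec" are assumptions made for illustration. Neither the vendor nor this playbook defines them.

```python
from dataclasses import dataclass, field


@dataclass
class DailyLog:
    date: str              # ISO date string, e.g. "2025-05-12"
    planned_acres: float
    covered_acres: float
    machine_hours: float
    operator_hours: float
    incidents: list = field(default_factory=list)  # one cause label per incident


def coverage_variance_pct(log: DailyLog) -> float:
    """Percent difference between actual and planned coverage (negative = shortfall)."""
    return (log.covered_acres - log.planned_acres) / log.planned_acres * 100


def below_spec(log: DailyLog, spec_acres_per_hour: float,
               tolerance_pct: float = 10.0) -> bool:
    """True when the day's acres/hour is more than tolerance_pct below vendor spec."""
    if log.machine_hours == 0:
        return True  # a day with no machine hours counts as a miss
    rate = log.covered_acres / log.machine_hours
    return rate < spec_acres_per_hour * (1 - tolerance_pct / 100)
```

A day planned at 20 acres that covers 15 acres in 6 machine hours has a coverage variance of -25% and runs at 2.5 acres/hour, which this sketch flags against a 3.0 acres/hour spec.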

At week 4, hold a structured review with the operator, agronomist, and operations manager. Ask:

Is the data collection protocol working? Are we getting the numbers we need?
Are there any field-constraint issues not identified in the pre-deployment assessment?
Is the operator time per acre tracking above or below the plan?
Is the machine operating within its spec, or are there performance gaps?

Adjust the protocol based on the review. Do not expand scope. Fix the measurement before expanding the operation.

Season One, Weeks 5–12: Full-Season Operation and Data Collection

Run the machine for the remainder of the operating window on the original field scope. Log consistently. Let the agronomist monitor crop health in the treated area against an adjacent untreated control area if possible — even a 5-acre control strip generates useful comparison data.

At week 8 of a 12-week operating window (roughly the midpoint of this phase), conduct a second structured review. By this point you should have enough data to tell whether the per-acre cost trajectory is tracking toward a result that would support a scale decision. If the trajectory is clearly negative at week 8, there is no reason to wait until week 12 to assess, and the pilot agreement should already contain a written kill criterion that defines this.
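
A week-8 trajectory check could look something like the sketch below. It assumes the kill criterion is written as a ceiling on cumulative cost per acre at the review week. That form, the function names, and the inputs are hypothetical, and your pilot agreement may define the criterion differently.

```python
def running_cost_per_acre(weekly_costs, weekly_acres):
    """Cumulative cost per acre after each week of operation."""
    trajectory = []
    total_cost = 0.0
    total_acres = 0.0
    for cost, acres in zip(weekly_costs, weekly_acres):
        total_cost += cost
        total_acres += acres
        trajectory.append(total_cost / total_acres if total_acres else float("inf"))
    return trajectory


def kill_triggered(trajectory, kill_ceiling, review_week=8):
    """True when the running per-acre cost at the review week exceeds the ceiling."""
    return trajectory[review_week - 1] > kill_ceiling
```

The point of the sketch is that the check is mechanical once the ceiling is written down. The review meeting then discusses causes, not whether the number counts.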

End of Season One: The Scale Decision Package

Compile the following into a single document before the post-season review meeting:

Operational performance summary:

Total acres covered vs. plan
Uptime rate (hours operated / hours planned × 100)
Average coverage rate (acres/hour) vs. vendor spec
Incident rate and categorization (machine stops by cause)
Operator hours per acre: actual vs. plan
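
The operational metrics above reduce to a few ratios. A minimal Python sketch follows; the function names are illustrative, and only the uptime formula is taken directly from the list.

```python
from collections import Counter


def uptime_rate(hours_operated: float, hours_planned: float) -> float:
    """Uptime rate in percent: hours operated / hours planned x 100."""
    return hours_operated / hours_planned * 100


def coverage_rate(acres_covered: float, hours_operated: float) -> float:
    """Average coverage rate in acres per hour."""
    return acres_covered / hours_operated


def operator_hours_per_acre(operator_hours: float, acres_covered: float) -> float:
    """Operator support time per acre covered."""
    return operator_hours / acres_covered


def incidents_by_cause(causes):
    """Count machine stops by cause label, most frequent first."""
    return Counter(causes).most_common()
```

For example, 300 hours operated against 400 planned gives an uptime rate of 75%, and 600 acres over those 300 hours gives 2.0 acres/hour to compare against the vendor spec.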

Economic summary:

All-in cost per acre: capital cost, energy, software, operator time, maintenance, vendor service
Comparison to baseline hand-crew cost per acre
Comparison to baseline chemical input cost per acre
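
The all-in figure can be assembled as in the sketch below. It assumes capital is spread evenly over a chosen number of seasons (straight-line amortization). That is one common convention, not something this playbook prescribes, and all parameter names are illustrative.

```python
def all_in_cost_per_acre(acres, capital_cost, useful_life_seasons, energy,
                         software, operator_hours, operator_rate,
                         maintenance, vendor_service):
    """All-in seasonal cost per acre, with capital amortized straight-line."""
    capital_per_season = capital_cost / useful_life_seasons  # assumption: even spread
    total = (capital_per_season + energy + software
             + operator_hours * operator_rate + maintenance + vendor_service)
    return total / acres


def savings_vs_baseline_pct(robot_cost_per_acre, baseline_cost_per_acre):
    """Percent by which the robot's per-acre cost beats the baseline (negative = worse)."""
    return (baseline_cost_per_acre - robot_cost_per_acre) / baseline_cost_per_acre * 100
```

With made-up numbers: a 50,000 machine amortized over 5 seasons, plus 500 energy, 1,500 software, 100 operator hours at 20/hour, 600 maintenance, and 400 vendor service across 100 acres comes to 150 per acre, a 25% saving against a 200 per acre baseline.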

Crop-health summary (agronomist-authored):

Weed control effectiveness: weed density in treated area vs. control area
Crop quality metrics: any difference in loss rate, canopy closure, uniformity
Any concerns about secondary effects (soil compaction in machine tracks, mechanical damage to root zone, residue from cultivation implements)
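
One simple way to express weed control effectiveness is the percent reduction in mean weed density in the treated area relative to the control strip. The sketch below assumes the agronomist records comparable counts (for example, weeds per quadrat) in both areas. The function name and the choice of metric are illustrative, and your agronomist may prefer a different measure.

```python
from statistics import mean


def weed_control_effectiveness(treated_counts, control_counts):
    """Percent reduction in mean weed density, treated area vs. control area."""
    control_mean = mean(control_counts)
    treated_mean = mean(treated_counts)
    return (control_mean - treated_mean) / control_mean * 100
```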

Infrastructure and operational lessons:

What would you do differently if you ran season one again?
What infrastructure investments did the operation require beyond the plan?
What is the operator's assessment of the machine's integration into the crew's workflow?

This document is what supports a scale decision. If it doesn't contain these sections, the decision will be made on intuition rather than data.

The Scale Decision: What Season One Data Should Answer

The scale decision comes down to two outcomes: proceed to commercial deployment at X acres in season two, or exit or pause the program.

Proceed if:

All-in per-acre cost beats current method by ≥15%, or is projected to beat it by ≥20% at season-two utilization (because capital is largely sunk and per-acre fixed cost declines as acres increase)
Crop-health KPIs show no degradation vs. baseline (the agronomist confirms no secondary effects that offset the cost savings)
Operator time per acre is within 20% of plan (if it's significantly higher, the scale economics change materially)
Uptime rate is ≥75% (below this, the effective coverage rate makes the per-acre capital cost uncompetitive)
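
The projection in the first criterion rests on simple arithmetic: fixed seasonal cost is spread over more acres, while variable cost per acre stays roughly constant. A minimal sketch, with hypothetical names and made-up figures:

```python
def projected_cost_per_acre(fixed_cost_per_season, variable_cost_per_acre, acres):
    """Per-acre cost when a fixed seasonal cost is spread over a given acreage."""
    return fixed_cost_per_season / acres + variable_cost_per_acre


# Illustrative figures only: 12,000 fixed per season, 30 variable per acre.
season_one = projected_cost_per_acre(12000, 30, acres=100)  # 150.0 per acre
season_two = projected_cost_per_acre(12000, 30, acres=300)  # 70.0 per acre
```

The sketch treats variable cost per acre as constant. If operator time or maintenance per acre changes with scale, the projection needs those terms adjusted.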

Exit or pause if:

Per-acre cost is within 10% of current method — the margin is insufficient to justify the operational complexity
The agronomist has identified crop-health concerns that require further assessment
Operator time per acre is >40% above plan — the labor offset required to run the machine may eliminate the labor savings the machine was supposed to generate
Uptime is below 60%: the machine was not reliable enough to generate clean economics data, and a second season at the same scale will not produce a different result
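
The two lists above can be written down as an explicit rule, which makes the gaps between the thresholds visible. The Python sketch below is one reading of the criteria, not a definitive implementation. It assumes any exit condition overrides the proceed conditions, and it reduces the agronomist's judgment to a single yes/no flag. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SeasonOneResult:
    savings_pct: float                  # all-in per-acre savings vs. current method
    projected_savings_pct: float        # projected savings at season-two utilization
    crop_health_ok: bool                # agronomist confirms no degradation
    operator_time_over_plan_pct: float  # operator time per acre, percent above plan
    uptime_pct: float


def scale_decision(r: SeasonOneResult):
    """Return (decision, reasons); decision is 'proceed', 'exit_or_pause', or 'ambiguous'."""
    exit_reasons = []
    if r.savings_pct <= 10:
        exit_reasons.append("per-acre cost within 10% of current method")
    if not r.crop_health_ok:
        exit_reasons.append("crop-health concerns require further assessment")
    if r.operator_time_over_plan_pct > 40:
        exit_reasons.append("operator time more than 40% above plan")
    if r.uptime_pct < 60:
        exit_reasons.append("uptime below 60%")
    if exit_reasons:  # assumption: any exit condition overrides proceed
        return "exit_or_pause", exit_reasons

    proceed = (
        (r.savings_pct >= 15 or r.projected_savings_pct >= 20)
        and r.operator_time_over_plan_pct <= 20
        and r.uptime_pct >= 75
    )
    if proceed:
        return "proceed", []
    return "ambiguous", ["between thresholds: apply the pre-written kill criterion"]
```

A result with 12% savings, operator time 30% over plan, and 70% uptime trips no exit condition and meets no proceed condition, so the sketch returns "ambiguous".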

The hardest case: results are ambiguous. Per-acre cost is within the margin, crop health is mixed, uptime is borderline. This is where the kill criterion you wrote before season one matters. If you defined a binary outcome condition in advance and the data doesn't pass it, the decision is already made. If you didn't define it in advance, the vendor will argue for a season-two extension at the same scale with adjustments. Vendor optimism about "once we fix X" is not a season-two plan.

Season Two: Commercial Deployment

If season one produced a proceed decision:

Expand scope incrementally: Move from one field to two or three fields in the same crop type. Do not immediately expand to a different crop type — each crop type requires its own validation of performance assumptions.

Build the operational capability: Season two is when you hire or designate the technician who can handle routine maintenance, train a backup operator, and integrate the robot's data output into your farm management system rather than keeping it in a separate vendor dashboard.

Renegotiate the contract: You now have season-one performance data. Use it to renegotiate any per-acre service fees, software subscription rates, or support contract terms. A vendor who wants to be your long-term partner will engage with this data. A vendor who deflects it is a vendor planning to raise prices in year three.

Continue measuring: Season two is not the end of the measurement discipline. Cost per acre in season two should be tracked monthly against the season-one baseline. Any significant deviation — upward or downward — is a signal worth investigating.
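
Monthly tracking against the season-one baseline can be as simple as the sketch below. The 10% threshold for a "significant" deviation is an assumption chosen for illustration, so set your own. The function name and input format are hypothetical.

```python
def flag_deviations(monthly_cost_per_acre, baseline, threshold_pct=10.0):
    """Return (month, deviation %) for months deviating from baseline by the threshold or more."""
    flags = []
    for month, cost in monthly_cost_per_acre.items():
        deviation = (cost - baseline) / baseline * 100
        if abs(deviation) >= threshold_pct:  # upward and downward both count
            flags.append((month, round(deviation, 1)))
    return flags
```

A month that comes in well below baseline is flagged too, since an unexpected drop can point to skipped passes or missing log entries rather than a real improvement.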

The One Thing Pilots That Scale Get Right

The programs that move from pilot to commercial deployment in two seasons are almost always operated by someone who owns the program end-to-end — from the pre-deployment baseline to the post-season scale decision. That person usually isn't the farm manager or the tech lead. It's often a hybrid — someone who understands field operations, reads financial data, and can talk to the vendor's engineering team without getting lost.

If that person doesn't exist at your operation today, the first investment isn't the robot. It's building the operational competency to run one. That investment has a useful life much longer than any single vendor relationship.

Next in this series: vendor selection for agricultural robots — what to look for in pilot data, financing terms, and exit clauses.