Experiments & A/B Testing

zend.sh experiments let you run A/B (or A/B/C) tests on email copy across a campaign. The system tracks reply rates, interested rates, and bounce rates per variant, then uses statistical significance testing to identify a winner.

How experiments work

  1. Create an experiment on a campaign, specifying the variants and primary metric
  2. The send worker automatically distributes sends across variants per their weights
  3. Use the results endpoint to check statistical significance at any time
  4. Promote the winning variant when ready

Creating an experiment

import Zend from 'zend-sh'

const zend = new Zend({ apiKey: process.env.ZEND_API_KEY })

const experiment = await zend.experiments.create({
  campaign_id: 'cmp_abc123',
  name: 'Subject line test — Q1 outreach',
  primary_metric: 'reply_rate',
  optimization_mode: 'thompson_sampling',
  duration_days: 14,
  variants: [
    {
      name: 'Control',
      config: {
        subject: 'Quick question about {{company}}',
        body_html: '<p>Hi {{first_name}}, I noticed you are hiring engineers at {{company}}...'
      },
      weight: 50
    },
    {
      name: 'Direct',
      config: {
        subject: '{{first_name}} — can we talk?',
        body_html: '<p>Hi {{first_name}}, I will keep this short...'
      },
      weight: 50
    }
  ]
})

console.log(experiment.id) // exp_xyz789

Optimization modes

Fixed split

Traffic is distributed according to the weights you specify (e.g., 50/50). Weights do not change during the experiment.

optimization_mode: 'fixed_split'
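
To illustrate what a weighted split means in practice, here is a minimal sketch of weighted random selection. It is not the send worker's actual implementation, which this guide does not describe. The helper name `pickVariant` and its `rng` parameter are hypothetical.

```javascript
// Hypothetical helper: pick a variant in proportion to its weight.
// `rng` must return a number in [0, 1); it defaults to Math.random.
function pickVariant(variants, rng = Math.random) {
  const total = variants.reduce((sum, v) => sum + v.weight, 0)
  let threshold = rng() * total
  for (const variant of variants) {
    threshold -= variant.weight
    if (threshold < 0) return variant
  }
  // Guard against floating-point edge cases
  return variants[variants.length - 1]
}

const variants = [
  { name: 'Control', weight: 50 },
  { name: 'Direct', weight: 50 }
]

// Simulate 10,000 sends and count how many each variant receives
const counts = { Control: 0, Direct: 0 }
for (let i = 0; i < 10000; i++) {
  counts[pickVariant(variants).name] += 1
}
console.log(counts) // close to an even split for 50/50 weights
```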

Thompson sampling

A multi-armed bandit algorithm shifts allocation toward the better-performing variant as results accumulate. Weights are recalculated after each batch of sends.

optimization_mode: 'thompson_sampling'

Thompson sampling is recommended when you want to minimize sends to poor-performing variants. Fixed split is better when you need a clean statistical test, because every variant keeps the same share of sends for the whole run.
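
The sketch below shows one common way to turn Thompson sampling into per-batch weights: model each variant's reply rate with a Beta posterior, draw from every posterior many times, and use the share of draws each variant wins as its weight. This is an assumption made for illustration. The guide does not specify how zend.sh computes its weights, and the function names here are hypothetical.

```javascript
// Gamma(k, 1) for a positive integer k, as a sum of k exponential draws.
function sampleGammaInt(k, rng) {
  let sum = 0
  for (let i = 0; i < k; i++) sum -= Math.log(1 - rng()) // 1 - rng() is in (0, 1]
  return sum
}

// Beta(a, b) for positive integers a and b, built from two gamma draws.
function sampleBetaInt(a, b, rng) {
  const x = sampleGammaInt(a, rng)
  const y = sampleGammaInt(b, rng)
  return x / (x + y)
}

// Estimate each variant's probability of having the best reply rate.
// Uses a uniform Beta(1, 1) prior, so the posterior is
// Beta(replies + 1, sent - replies + 1).
function thompsonWeights(stats, draws = 2000, rng = Math.random) {
  const wins = stats.map(() => 0)
  for (let d = 0; d < draws; d++) {
    let best = 0
    let bestSample = -1
    stats.forEach((s, i) => {
      const sample = sampleBetaInt(s.replies + 1, s.sent - s.replies + 1, rng)
      if (sample > bestSample) {
        bestSample = sample
        best = i
      }
    })
    wins[best] += 1
  }
  return stats.map((s, i) => ({ name: s.name, weight: wins[i] / draws }))
}

// Illustrative numbers only, not real campaign data
const weights = thompsonWeights([
  { name: 'Control', sent: 200, replies: 5 },
  { name: 'Direct', sent: 200, replies: 40 }
])
console.log(weights) // 'Direct' receives nearly all of the weight
```

Early in an experiment the posteriors overlap heavily, so the weights stay close to even. They diverge only as evidence accumulates.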

Viewing results

const results = await zend.experiments.results('exp_xyz789')

console.log(results.winner)          // 'Direct' or null
console.log(results.recommendation)  // Human-readable recommendation
console.log(results.variants)        // Per-variant rates

for (const variant of results.variants) {
  console.log(
    `${variant.name}: ${variant.emails_sent} sent, ` +
    `${variant.reply_rate}% reply rate`
  )
}

// Pairwise significance tests
for (const test of results.significance) {
  console.log(
    `${test.variant_a} vs ${test.variant_b}: ` +
    `significant=${test.significant}, p=${test.p_value}`
  )
}

A variant is declared a winner when the difference in the primary metric is statistically significant at p < 0.05. Until then, results.winner is null.
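
To make the p < 0.05 threshold concrete, here is a sketch of a pooled two-proportion z-test, a standard way to compare two rates. The guide does not say which test the results endpoint runs, so treat this as an illustrative assumption. The function names are hypothetical.

```javascript
// Standard normal CDF using the Abramowitz-Stegun approximation of erf
// (formula 7.1.26), accurate to roughly 1.5e-7.
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2
  const t = 1 / (1 + 0.3275911 * x)
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
    t * (-1.453152027 + t * 1.061405429))))
  const erf = 1 - poly * Math.exp(-x * x)
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf)
}

// Two-sided p-value for the difference between two reply rates.
function twoProportionPValue(repliesA, sentA, repliesB, sentB) {
  const pA = repliesA / sentA
  const pB = repliesB / sentB
  const pooled = (repliesA + repliesB) / (sentA + sentB)
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / sentA + 1 / sentB))
  if (se === 0) return 1 // both rates are 0% or both 100%: no detectable difference
  const z = (pA - pB) / se
  return 2 * (1 - normalCdf(Math.abs(z)))
}

// Illustrative numbers only: 12/200 replies vs 30/200 replies
const p = twoProportionPValue(12, 200, 30, 200)
console.log(p < 0.05) // true: this pair would count as significant
```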

Getting the full experiment detail

const experiment = await zend.experiments.get('exp_xyz789')

// When using thompson_sampling, win probabilities are included
if (experiment.win_probabilities) {
  for (const [name, prob] of Object.entries(experiment.win_probabilities)) {
    console.log(`${name}: ${(prob * 100).toFixed(1)}% probability of being best`)
  }
}

Promoting a winning variant

When the experiment has a clear winner, promote it to lock in the result:

// Option 1: auto-select the significant winner
const result = await zend.experiments.promote('exp_xyz789')

console.log(result.winner)        // 'Direct'
console.log(result.winner_variant_id)

// Option 2: force-promote a specific variant instead.
// Use one option or the other, not both.
const forced = await zend.experiments.promote('exp_xyz789', {
  variant_name: 'Direct'
})

console.log(forced.winner)        // 'Direct'

Promoting marks the experiment as completed and records the winner_variant_id. You can then update the campaign step to use the winning copy for all future sends.
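
The results and promote calls can be combined into a small guard that promotes only once a winner has been reported. This is a sketch: `promoteIfWinner` is a hypothetical helper, and it assumes only the response fields shown in this guide (`winner`, `recommendation`).

```javascript
// Hypothetical helper: promote only when the results endpoint reports a winner.
// `client` is any object exposing experiments.results() and
// experiments.promote() as shown in this guide.
async function promoteIfWinner(client, experimentId) {
  const results = await client.experiments.results(experimentId)
  if (!results.winner) {
    // No significant winner yet: leave the experiment running
    return { promoted: false, recommendation: results.recommendation }
  }
  const promotion = await client.experiments.promote(experimentId, {
    variant_name: results.winner
  })
  return { promoted: true, winner: promotion.winner }
}
```

Called as `await promoteIfWinner(zend, 'exp_xyz789')`, it returns `{ promoted: false, ... }` while the test is still inconclusive, so it is safe to run on a schedule.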

Best practices

  • Run experiments for at least 7 days to account for day-of-week variation in reply rates
  • Use sufficient sample sizes: aim for 100+ sends per variant before drawing conclusions
  • Test one variable at a time: subject line vs subject line, or body copy vs body copy
  • Set duration_days based on your send volume: a high-volume campaign reaches significance in fewer days, but keep the 7-day minimum so day-of-week effects are still covered
  • Use Thompson sampling for conversion-oriented tests: it reduces waste by routing fewer sends to poor performers
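
The duration and sample-size guidelines above can be combined into a rough estimate for duration_days. The sketch below assumes sends are split evenly across variants. `estimateDurationDays` and its parameters are hypothetical and not part of the zend.sh SDK.

```javascript
// Hypothetical helper: smallest duration that satisfies both the
// minimum-days guideline and the minimum-sends-per-variant guideline.
function estimateDurationDays({
  variantCount,
  dailySends,
  minSendsPerVariant = 100,
  minDays = 7
}) {
  const daysForVolume = Math.ceil((variantCount * minSendsPerVariant) / dailySends)
  return Math.max(minDays, daysForVolume)
}

// Low volume: 2 variants at 20 sends/day need 10 days to reach 100 sends each
console.log(estimateDurationDays({ variantCount: 2, dailySends: 20 })) // 10

// High volume: 3 variants at 500 sends/day hit the sample size in a day,
// so the 7-day minimum applies
console.log(estimateDurationDays({ variantCount: 3, dailySends: 500 })) // 7
```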

API reference