Eleven Questions Most Buyers Forget When Evaluating a Demand Gen Agency
Most B2B marketing leaders spend more time reviewing agency decks than they do stress-testing the agency’s ability to actually deliver pipeline, and that gap is where bad retainers get signed. The slides are polished, the case studies are real, the team in the pitch is senior. Six months later you’re staring at a dashboard full of CTR and CPL and wondering where the pipeline went.
This guide is eleven questions to ask before you sign. Each one is built to surface a specific failure mode that decks are designed to hide. Use them in the evaluation, not in the post-mortem.
1. What Pipeline Number Will You Be Accountable To?
Ask the agency to commit to a pipeline number in writing, with the math that produced it, before you sign. If the answer is CPL, CTR, or “qualified leads,” the agency is selling activity, not outcomes. Pipeline accountability means a stated dollar figure of sourced or influenced pipeline per quarter, tied to your conversion rates, your average deal size, and your sales cycle.
The strong answer distinguishes between pipeline sourced (marketing as first touch) and pipeline influenced (a marketing touchpoint inside an active deal). Both matter. Sourced pipeline tells you the agency is creating new demand; influenced pipeline tells you the agency is accelerating deals already in motion. An agency that conflates the two, or only reports one, is hiding something.
The math is the test. A real answer sounds like: pipeline target → conversion rates → budget model → channel mix. A fake answer sounds like “we’ll drive qualified pipeline to your goals.”
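One way to sanity-check the math an agency hands you is to back-solve it yourself. The sketch below is a minimal illustration, not any agency's actual model: the stage names, the `FunnelAssumptions` fields, and every number in the example are hypothetical placeholders to swap for figures from your own CRM. It also ignores sales-cycle lag, which a real model would need.

```python
from dataclasses import dataclass


@dataclass
class FunnelAssumptions:
    """Illustrative inputs; every value should come from your own CRM."""
    avg_deal_size: float     # average opportunity value, in dollars
    sql_to_opp_rate: float   # share of SQLs that become pipeline opportunities
    mql_to_sql_rate: float   # share of MQLs that sales accepts as SQLs
    cost_per_mql: float      # blended media cost per MQL, in dollars


def back_solve_quarter(pipeline_target: float, f: FunnelAssumptions) -> dict:
    """Work backwards from a quarterly pipeline target to volume and media budget."""
    opps = pipeline_target / f.avg_deal_size
    sqls = opps / f.sql_to_opp_rate
    mqls = sqls / f.mql_to_sql_rate
    return {
        "opportunities": round(opps, 1),
        "sqls": round(sqls, 1),
        "mqls": round(mqls, 1),
        "media_budget": round(mqls * f.cost_per_mql, 2),
    }


# Hypothetical example: $1M of sourced pipeline in one quarter.
plan = back_solve_quarter(
    1_000_000,
    FunnelAssumptions(avg_deal_size=50_000, sql_to_opp_rate=0.5,
                      mql_to_sql_rate=0.2, cost_per_mql=250),
)
print(plan)
```

If the agency's stated budget is far below what a back-solve like this implies, one of the conversion assumptions is doing more work than the deck admits, and that is the assumption to ask about.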
2. Who Actually Runs the Account, and Are They in This Room?
The strategist in the pitch should be the strategist on the account. If the senior team is presenting and the work hands off to a junior account manager you’ve never met, you’re buying two different products: the sales experience and the delivery experience. They are rarely the same quality.
Ask for the names, the LinkedIn profiles, and the hours-per-week commitment of every person who will touch your account. Get it in the SOW. The “senior-only pod” pitch is meaningful only if it survives contract negotiation. Most agencies will quietly swap in a junior AM after month three, and the only protection is naming the team in writing with a re-staffing clause that requires your approval.
3. How Do You Handle the CRM and Attribution Mess We Already Have?
The agency’s answer reveals whether they’ve ever run inside a real company. A strong answer is specific: they explain exactly how they connect MQL to SQL to Won inside your CRM, name the enrichment layer they use for targeting, and describe the holdout test they run when last-click attribution lies to you. Vague assurances about “clean data” are a signal to keep interviewing.
A weak agency will tell you they need clean data before they can start. A strong one will tell you what they do with the data you have. They’ll name the platforms they integrate with, the conversion events they map, and the incrementality method they use to validate what paid media actually sourced. Forrester research finds that aligned organizations sharing CRM dashboards and unified lead definitions convert 30%+ of MQLs, compared to a 13% baseline at siloed organizations, which means attribution hygiene isn’t a reporting nicety; it’s a pipeline multiplier.
The alignment perception gap is stark: 82% of C-level executives believe their sales and marketing teams are aligned, while only 65% of frontline sales and marketing professionals report alignment (Alignment perception gap, 2024), and companies struggling with that misalignment are 2x more likely to miss revenue goals.
This matters because last-click attribution credits paid social with roughly 60% of revenue it didn’t earn in most B2B environments. If the agency reports against last-click without naming the distortion, every monthly recap will overstate their contribution. The numbers won’t survive a CFO audit. Ask for their last three clients’ CRM audit findings and what they changed based on data.
4. What’s the First 90 Days, and What Happens Before Media Goes Live?
The first 90 days separates agencies that have a system from agencies that have a process deck. Ask for the week-by-week plan. If campaigns go live in week one, the agency is skipping foundations and you’ll pay for it in quarter two when the program plateaus.
A real first 90 days looks like 4 to 6 weeks of foundations work (audit, positioning review, performance model, CRM enrichment, conversion-event mapping), with campaigns going live at the end of that window and a structured experiment cadence from there. The agencies that compress this to “we’ll start spending next Monday” are the same agencies that will tell you in month four that “we need to rebuild the positioning,” at your expense and on your time.
5. Show Me a Recent Client Where the Pipeline Math Didn’t Work
This question is the single best filter in the evaluation. Every agency has a Slalom-style case study with a 6-point brand lift and a 34% conversion improvement. Ask for the opposite: a recent engagement where the pipeline math didn’t work, and what they did about it.
The answer tells you three things. Whether the agency runs honest reads, or declares victory before the holdout test. Whether they name confounders (sales team turnover, ICP shift, product launch slip) before you have to ask. And whether they have the operating maturity to course-correct mid-retainer instead of riding out the contract while reporting activity metrics.
If the agency cannot name a recent loss, they either don’t have enough clients to have lost one, or they’re not willing to tell you the truth. Either answer is disqualifying.
6. How Do You Report Brand and Performance as One Program?
Brand and performance are the same program on different timelines. Only about 5% of B2B buyers are in-market at any given moment (LinkedIn B2B Institute, Ehrenberg-Bass, 2021), so most of the long-term return comes from the 95% who are not yet buying, yet most agencies report only on the in-market 5% that performance campaigns capture. Branded search returns roughly $13 per $1 spent; non-branded search returns roughly $0.68 per $1 spent.
The gap widens when attribution is precise: average Google Ads lead costs are $70.11 in 2025, up 5% year-over-year, with Display CPA averaging $65.80 to $90.80, which means the non-branded channel’s unit economics collapse if incrementality isn’t validated (Google Ads B2B benchmark, 2025). If the agency cannot connect the two in reporting, they cannot optimize the two together.
Ask whether they’ll report brand and performance metrics in one unified dashboard or separate decks, and which one your exec team will actually use.
This reporting choice determines whether the team can course-correct: companies with poor sales-marketing alignment rank it as a top challenge 44% of the time, and those struggling with alignment are 2x more likely to miss revenue goals, making unified dashboards a structural requirement for accountability (Pipeline360 alignment ranking, 2024). Ask how they measure branded search incrementality and baseline demand. An agency that hands you a separate “brand deck” presented quarterly and ignored monthly is treating two halves of the same program as unrelated problems. That structure starves one to feed the other, and you absorb the CAC consequence.
7. What Does Your Testing Cadence Actually Look Like?
Three ad variations per quarter starves the system. Ask the agency how many creative variants they run, how often, across which audiences, and what their hypothesis-to-result documentation looks like. The answer should be a structured experiment slate across creative, audience, channel, and offer, every test logged with a hypothesis and a review date.
The failure mode here is the “creative refresh” model, where the agency makes new ads when the old ones fatigue but never tests competing hypotheses. You end up with a portfolio of ads that all say roughly the same thing, all targeted at roughly the same audience, all converting at roughly the same rate. The system can’t learn because the tests aren’t real.
A strong agency will show you their experiment log from another client (anonymized). If they don’t have one, they don’t run experiments. They run campaigns.
8. What’s Your Stance on MQLs?
The MQL question is a values test, not a tactics test. The wrong answer is “we’ll hit your MQL target.” MQLs are an internal handoff signal between marketing and sales; they are not a business outcome. Agencies that optimize for MQL volume will deliver MQL volume, and your sales team will spend six months telling you the leads are unqualified before you realize the agency hit the number the contract specified.
The right answer is that MQLs are a leading indicator measured against sales-accepted opportunities and pipeline, and the agency will report all three. The MQL-to-SQL benchmark for B2B SaaS is 18 to 22%, with top-quartile teams reaching 25 to 35% (Forrester, 2024).
This fits a broader pattern: only 17% of B2B buyers’ total buying time is spent in direct contact with vendors, and 80% of the buying journey happens without sales involvement (Gartner B2B Buying Research, 2024), which means agencies and sales teams optimizing for MQL volume are often chasing activity in the 17% window while the other 80% moves invisibly. If the agency’s program is generating MQLs that convert below 15% to SQL, the targeting is wrong, the offer is wrong, or both, and the agency should be the one telling you, not the sales VP.
Ask how they handle the conversation when MQL volume looks great but SQL conversion is degrading. The agency that says “that’s a sales problem” is the agency you don’t hire.
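The thresholds above are easy to turn into a check you can run on any monthly report. This is a rough sketch only: the function name and the band labels are made up for illustration, and because the article gives no guidance for rates between 22% and 25%, the code treats that gap as "at or above benchmark" by assumption.

```python
def mql_to_sql_band(mqls: int, sqls: int) -> tuple[float, str]:
    """Classify an MQL-to-SQL rate against the benchmarks quoted in this section."""
    if mqls <= 0:
        raise ValueError("mqls must be positive")
    rate = sqls / mqls
    if rate < 0.15:
        label = "below 15%: revisit targeting and offer"
    elif rate < 0.18:
        label = "under the 18 to 22% benchmark"
    elif rate < 0.25:
        # Assumption: the 22 to 25% gap is treated as benchmark-level.
        label = "at or above benchmark"
    else:
        label = "top-quartile range"
    return rate, label


# Hypothetical month: 1,000 MQLs, 120 accepted by sales.
rate, label = mql_to_sql_band(1000, 120)
print(f"{rate:.0%} -> {label}")
```

Tracking this rate month over month is what surfaces the "volume looks great, conversion is degrading" pattern early enough to act on it.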
9. What’s in the Stack, and What Do You Refuse to Touch?
A good demand gen agency names what they do and what they don’t. Demand generation ends at the form submit. What happens after (SDR cadences, sales enablement, CRM administration, deal desk) is someone else’s job, and an agency that claims to do all of it is either lying or spread so thin they’re bad at all of it.
Ask the agency to name their stack: ad platforms, attribution tools, enrichment providers, dashboarding, experimentation tooling. Ask what they refuse to take on. The agencies with the cleanest answer here are usually the ones doing the work; the agencies with the longest list of capabilities are usually the ones selling the work and subcontracting the delivery.
This is also where the AI question lands. Every agency will tell you they “use AI.” Ask where, specifically. Audience builds? Creative production? Reporting? Optimization analysis? AI doing real work inside a named workflow is real; AI as a marketing claim with no underlying workflow is not.
10. How Do You Get Fired?
Most agency contracts have a 90-day out clause buried in the fine print and a term that auto-renews unless you give 60 days’ notice. The agency knows this. You should too.
Ask the agency directly: under what conditions do you expect to be fired, and what’s the off-ramp? The honest answer names the conditions (missed pipeline targets for two consecutive quarters, leadership change on either side, strategic shift in your GTM) and the process (a structured wind-down, data and asset handover, no holding your ad accounts hostage). The dishonest answer is “we don’t really think about that.”
This question also surfaces how the agency thinks about its own accountability. Agencies that have written exit criteria into their own SOPs are the ones that take the pipeline number seriously. Agencies that get squirmy here are the ones planning to ride out the contract on activity reports.
11. What Will You Tell Me on Month Four That I Don’t Want to Hear?
The last question is the one that surfaces whether the agency will tell you the truth when uncomfortable. Every retainer hits a moment around month four where something isn’t working. The positioning is off. The ICP is wrong. The sales team isn’t following up on leads. The product page doesn’t match the ad.
The agency’s job in that moment is to tell you, with the data, what they think the problem is, even when the problem is upstream of their work. Ask them what conversation they expect to have with you in month four. If the answer is “we’ll optimize toward the goals,” they’re going to optimize quietly and let the program decay. If the answer is a specific list of risks they expect to surface (positioning gaps, sales handoff friction, attribution distortion, ICP mismatch), they’re going to run the program like operators.
The agencies that name confounders before you ask are the ones worth hiring. Everyone else is selling decks.
The Eleven Questions Together Are the Test
Any single question can be answered well by an agency that’s good at pitching. The eleven together cannot. Run them in the same conversation, take notes on which questions get crisp answers and which get hedged, and look at the pattern. An agency that answers seven of eleven well and dodges four is telling you exactly which four will become the problem in month six.
The right partner is the one whose pitch and delivery are the same product. The questions above are the cheapest way to find out which is which, before the SOW is signed.
How the Demand Gen Agency Landscape Stacks Up
Most “demand gen agencies” are shops selling six things and accountable for none of them; a smaller set specialize, and an even smaller set specialize in demand gen specifically. The matrix below compares scope, accountability model, and team structure across the categories you’re most likely to evaluate.
| Agency | Scope | Accountability Model | Team Structure |
|---|---|---|---|
| Moving Parade | Demand gen only; nothing else | Pipeline number stated in SOW, reported monthly | Senior-only pod; pitch team is delivery team |
| Refine Labs | Demand creation + positioning rebuild | Brand-led demand creation over 2 to 3 quarters | Consultancy model; methodology-driven |
| Powered by Search | Paid media + SEO + content | Mixed: pipeline plus organic growth | Vertical specialists in B2B SaaS |
| Kalungi | Full marketing function as a fractional team | Marketing function output (broad) | Fractional CMO + supporting team |
| Heinz Marketing | Sales-marketing alignment + advisory | Alignment outcomes over execution | Advisory-heavy, less hands-on media |
| Directive | Strategy + paid media + content | Qualified pipeline elevation from MQL | Mid-market to enterprise B2B focus |
The columns are the dimensions the eleven questions surface. Scope tells you what you’re actually buying. Accountability model tells you what they’ll defend at month six. Team structure tells you who you’ll be talking to when something breaks.
Frequently Asked Questions
How long should a demand gen agency evaluation take?
Four to eight weeks for a real evaluation, not two. The short version is a pitch competition; the longer version includes reference calls, a paid audit or strategy sprint, and a working session with the team that would actually run your account. Compressing the process below four weeks usually means you’re picking on chemistry.
Should I ask for references, and what do I ask them?
Yes, and ask for references whose retainers ended, not just current clients. Current-client references are coached. Past-client references will tell you why the engagement ended, what the off-ramp looked like, and whether the agency held the data and assets hostage. Those are the answers that matter.
What’s a fair retainer range for a mid-market B2B demand gen program?
$8K to $15K per month for a Scale-tier retainer managing $25K to $100K of media, and 12 to 18% of spend (or $15K to $40K flat) for larger programs.
These retainer ranges reflect the true loaded cost of running a demand function: the fully loaded annual cost of a B2B marketing hire is $130,000 to $715,000, with base salary representing only 50 to 65% of total cost (True loaded cost of a B2B marketing hire, 2026), which means a $10K monthly retainer ($120K annual) buys you roughly one senior-equivalent FTE plus tooling and media infrastructure. Anything below $5K per month buys you a junior team and a template; anything above $40K per month should come with a dedicated senior pod and custom reporting infrastructure.
How do I tell if an agency’s case studies are real?
Ask for the named client contact, the time period, and the math behind the headline number. Real case studies survive that ask. Fabricated or exaggerated ones get vague about the timeframe, the attribution model, or whether the headline number was sourced pipeline or influenced pipeline. The Slalom Zero Legacy program is a useful reference point for what a real B2B case study reads like: named channels, named partners, specific lift numbers, and the parts that didn’t work.
Should the agency take ownership of my CRM?
No. Demand generation ends at the form submit. An agency that wants ownership of your CRM is expanding scope into territory where they will be bad. A good demand gen agency integrates with your CRM as a targeting and reporting asset, and refers you to specialist partners for CRM administration and sales enablement.
What’s the right way to handle the holdout test conversation?
Ask the agency to design a holdout in the first 90 days, run it in quarter two, and report the result whether it’s flattering or not. A geo holdout or audience holdout is the cleanest way to validate that paid media is creating incremental pipeline rather than capturing demand that would have converted anyway. Agencies that refuse holdouts are agencies that don’t want to know the answer.
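The arithmetic behind a holdout read is simple enough to check yourself. The sketch below shows one common way to compute it, assuming an audience holdout where a comparable group of accounts sees no paid media; the function name, field names, and example numbers are all hypothetical, and it deliberately leaves out significance testing, which a real read on small samples would need.

```python
def holdout_incrementality(exposed_accounts: int, exposed_opps: int,
                           holdout_accounts: int, holdout_opps: int) -> dict:
    """Estimate how many opportunities paid media added beyond the holdout baseline."""
    if exposed_accounts <= 0 or holdout_accounts <= 0:
        raise ValueError("both groups need at least one account")
    # Opportunity rate with no paid media, taken from the holdout group.
    baseline_rate = holdout_opps / holdout_accounts
    # What the exposed group would likely have produced anyway.
    expected_without_media = baseline_rate * exposed_accounts
    incremental_opps = exposed_opps - expected_without_media
    incremental_share = incremental_opps / exposed_opps if exposed_opps else 0.0
    return {
        "baseline_rate": baseline_rate,
        "expected_without_media": expected_without_media,
        "incremental_opps": incremental_opps,
        "incremental_share": incremental_share,
    }


# Hypothetical quarter: 10,000 exposed accounts with 300 opportunities,
# 2,000 held-out accounts with 40 opportunities.
read = holdout_incrementality(10_000, 300, 2_000, 40)
print(read)
```

The number to compare against the agency's recap is `incremental_share`: the fraction of reported opportunities that the holdout suggests would not have happened without the media.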
How do I know if the agency is actually using AI or just making a marketing claim?
Ask them to walk you through a specific AI-driven workflow in their stack: which platform, which task, what the human reviews, what the output is. Real AI use shows up as named workflows with humans in the loop. AI as a marketing claim shows up as “we use AI across the platform” with no specifics. The difference is whether the AI is shipping work or shipping slides.
What’s the single biggest red flag in a demand gen pitch?
The agency that promises a specific MQL number in the first 60 days without asking about your conversion rates, sales cycle, or average deal size.
This red flag is especially visible when the agency hasn’t validated buying-journey participation: 94% of B2B buyers now use LLMs during their buying process, yet vendor-interaction count remains constant at 16 touchpoints per person with the winning vendor (Buyer-side LLM adoption, 2025), meaning the agency’s ability to influence the journey depends on understanding where those 16 touches land, not on MQL volume alone. They are selling you activity that fits a number, not pipeline that fits your business. The pipeline math has to come before the commitment, not after.
How Moving Parade Answers These Eleven Questions in the SOW
Moving Parade exists because most agencies that call themselves “demand gen” are shops with better positioning, accountable for activity instead of pipeline. The eleven questions above are the questions Moving Parade SOWs are written to answer. Pipeline numbers are stated, with the math that produced them. The senior pod in the pitch is the senior pod on the account. The 4 to 6 week foundations phase is a contractual sequence, not a recommendation. Branded search incrementality and holdout tests are part of standard reporting, not a premium add-on.
The single-discipline focus is the strategy. Demand generation ends at the form submit, and pretending the line doesn’t exist is how agencies end up bad at everything. Moving Parade refers trusted partners for CRM administration, sales enablement, PR, events, and brand identity systems, and stays focused on the one thing it does.
Every retainer ties to a stated pipeline number with stated math, reported monthly. If the math stops working, the conversation is structured, the confounders are named, and the off-ramp is clean.
Moving Parade is the demand gen partner built for the eleven-question evaluation, not the deck review. Free demand gen audit with a pipeline math reality-check for qualified B2B companies past PMF.