Lead Quality Scoring: Criteria and Metrics

Your sales team says the leads are weak. Marketing says sales never follows up. Both are partly right, and you have no way to settle it because nobody agreed on what a good lead looks like in the first place.

Lead quality scoring fixes that argument with numbers. You define what a buyer looks like, attach points to the signals that matter, and let the score sort a noisy pipeline into "call today" and "nurture later." Done well, it raises close rates, shortens sales cycles, and stops your best reps from burning hours on tire-kickers.

This is a working guide. You will get the criteria worth scoring, how to weight them, the metrics that tell you the model is right, and the mistakes that quietly wreck most scoring systems.

What lead quality scoring actually measures

A score is a prediction. Each lead gets a number that estimates how likely they are to become a paying customer worth keeping. That is the whole job.

Two questions sit underneath every score. Does this person match who buys from you (fit)? And are they showing signs of buying intent right now (behavior)? Fit answers "should we sell to them," behavior answers "are they ready." A high score needs both. A VP at your perfect-fit company who never opens an email is a maybe. A junior analyst who downloaded every asset you own still cannot sign a contract.

Scoring sits one step ahead of lead qualification. Qualification is the human conversation that confirms budget and need. Scoring is the automated triage that decides who gets that conversation first.

Fit criteria: does this lead match your buyer

Fit is the demographic and firmographic match between the lead and your ideal customer profile (ICP). It rarely changes after the form is filled. You score it once and move on.

Start with the attributes that separated your won deals from your lost ones over the last year. Pull that list from your CRM, not from a brainstorm. The patterns are usually clearer than people expect.

Common fit signals worth points:

Company size. Headcount or revenue band. If you close mid-market and choke on enterprise procurement, score 50 to 500 employees high and 5,000-plus low.
Industry. Verticals where your case studies land. A logistics tool scores warehousing high and dentistry near zero.
Job title and seniority. Decision-makers and economic buyers score above influencers, who score above students and job seekers.
Geography. Regions you actually serve and can support in the right time zone and language.
Tech stack. For software, the tools they already run can signal fit (a Salesforce shop for a Salesforce integration).

Score negative points too. A free email domain (gmail, outlook) on a "request enterprise demo" form is a mild warning sign and worth a small deduction. A competitor's domain is a clear negative. A job title of "student" on a B2B form should drop the score, not just leave it flat.
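Here is a minimal sketch of fit scoring with both positive and negative points. Every point value, the domain lists, and the Lead fields are assumptions made up for illustration. Replace them with what your own closed-won data shows.

```python
from dataclasses import dataclass

# All values below are hypothetical placeholders, not recommendations.
FREE_EMAIL_DOMAINS = {"gmail.com", "outlook.com"}
COMPETITOR_DOMAINS = {"rivalcorp.example"}
TARGET_INDUSTRIES = {"warehousing", "logistics"}
SENIORITY_POINTS = {"decision_maker": 15, "influencer": 5, "student": -15}


@dataclass
class Lead:
    email: str
    employees: int
    industry: str
    seniority: str  # expected to be a key of SENIORITY_POINTS


def fit_score(lead: Lead) -> int:
    """Sum positive and negative fit points for one lead."""
    score = 0
    # Company size: reward the mid-market band, penalize 5,000-plus.
    if 50 <= lead.employees <= 500:
        score += 20
    elif lead.employees >= 5000:
        score -= 10
    # Industry: points only for verticals you already win in.
    if lead.industry.lower() in TARGET_INDUSTRIES:
        score += 15
    # Seniority: unknown titles get zero instead of raising an error.
    score += SENIORITY_POINTS.get(lead.seniority, 0)
    # Email domain: small deduction for free mail, large one for competitors.
    domain = lead.email.rsplit("@", 1)[-1].lower()
    if domain in FREE_EMAIL_DOMAINS:
        score -= 5
    if domain in COMPETITOR_DOMAINS:
        score -= 30
    return score


vp = Lead("ana@acme-freight.example", 200, "Warehousing", "decision_maker")
student = Lead("sam@gmail.com", 1, "education", "student")
print(fit_score(vp), fit_score(student))  # 50 -20
```

Because fit rarely changes, you can compute this once when the form is submitted and store it on the lead record.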

Behavior criteria: is this lead actually interested

Behavior is what the lead does. It moves daily, and recent actions matter more than old ones. This is where intent shows up before anyone fills out a "contact sales" form.

Weight actions by how close they sit to a buying decision. Reading a top-of-funnel blog post is mild interest. Visiting your pricing page three times in two days is a hand going up.

Behavior signal	Buying intent	Suggested weight
Opened a marketing email	Low	+1
Read a blog post	Low	+2
Downloaded a guide or whitepaper	Medium	+5
Attended a webinar	Medium	+8
Visited the pricing page	High	+12
Started a free trial or demo request	Very high	+20
Replied "remove me" / unsubscribed	Negative	-10

Weights are illustrative. Calibrate them against your own closed-won data before trusting them.

Two behavior mechanics separate a decent model from a sharp one. Recency: a pricing-page visit this week beats one from three months ago, so decay old points over time. Frequency: five pricing visits is a stronger signal than one, so let repeated actions stack up to a cap. Without decay, leads accumulate points forever and your "hot" list fills with people who went cold last quarter.
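Recency and frequency can be sketched in a few lines. The weights below come from the table above. The 30-day half-life and the cap of three repeats per action are assumptions chosen for the example, not tested values.

```python
from datetime import date

# Weights copied from the behavior table; event names are hypothetical.
WEIGHTS = {
    "email_open": 1,
    "blog_read": 2,
    "guide_download": 5,
    "webinar": 8,
    "pricing_visit": 12,
    "trial_or_demo": 20,
    "unsubscribe": -10,
}
HALF_LIFE_DAYS = 30  # assumed: points halve every 30 days
MAX_REPEATS = 3      # assumed: count at most 3 occurrences per action


def behavior_score(events, today):
    """events: list of (action, date). Returns a decayed, capped score."""
    by_action = {}
    for action, when in events:
        by_action.setdefault(action, []).append(when)

    total = 0.0
    for action, dates in by_action.items():
        # Frequency: keep only the most recent MAX_REPEATS occurrences.
        for when in sorted(dates, reverse=True)[:MAX_REPEATS]:
            # Recency: exponential decay based on the age of the event.
            age_days = (today - when).days
            total += WEIGHTS.get(action, 0) * 0.5 ** (age_days / HALF_LIFE_DAYS)
    return round(total, 1)


today = date(2024, 6, 30)
print(behavior_score([("pricing_visit", today)], today))             # 12.0
print(behavior_score([("pricing_visit", date(2024, 4, 1))], today))  # 1.5
```

Note that this sketch decays negative events too, so an old unsubscribe fades over time. If you want deductions to stick, exclude negative weights from the decay.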

How to weight and combine the criteria

The most useful structure keeps fit and behavior on separate axes instead of crushing them into one number. A single blended score hides the difference between "great fit, no interest" and "poor fit, very active," and those two leads need completely different treatment.

Grade fit A through D and behavior 1 through 4, then plot the two grades on a grid.

The top-right quadrant (good fit, high intent) goes straight to sales. Good fit with low intent goes to lead nurturing until behavior catches up. High intent with poor fit gets a careful look before anyone spends time on it. Low on both gets dropped.
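The four quadrants translate into a simple routing rule. This sketch assumes A and B count as good fit and that behavior grades 1 and 2 count as high intent, with 1 as the most active. The queue names are made up.

```python
def route(fit_grade: str, behavior_grade: int) -> str:
    """Map a fit grade (A-D) and a behavior grade (1-4) to a queue."""
    good_fit = fit_grade.upper() in {"A", "B"}   # assumed cutoff
    high_intent = behavior_grade in {1, 2}       # assumed: 1 is hottest

    if good_fit and high_intent:
        return "sales"          # good fit, high intent
    if good_fit:
        return "nurture"        # good fit, waiting on behavior
    if high_intent:
        return "manual_review"  # active but poor fit
    return "drop"               # low on both


print(route("A", 1), route("B", 4), route("D", 1), route("D", 4))
```

Keeping the two grades as separate inputs is the point: a blended number could not tell "B4" from "D1".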

If you do need a single threshold number, set it from data. Find the score where leads start converting at a rate sales considers worth their time, then call that line your MQL bar. That handoff is the same boundary covered in MQL vs SQL, just expressed as a number instead of a stage name.
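One way to find that line, as a rough sketch: walk up through candidate thresholds and stop at the lowest one where leads at or above it convert at the rate sales asked for. The step size, the minimum sample size, and the sample history are all assumptions for illustration.

```python
def find_mql_threshold(leads, min_rate, step=10, min_leads=5):
    """leads: list of (score, converted). Returns the lowest threshold whose
    at-or-above conversion rate reaches min_rate, or None if none does."""
    max_score = max(score for score, _ in leads)
    for threshold in range(0, int(max_score) + 1, step):
        above = [won for score, won in leads if score >= threshold]
        # Stop when too few leads remain to trust the rate.
        if len(above) < min_leads:
            break
        if sum(above) / len(above) >= min_rate:
            return threshold
    return None


# Made-up history: (score, became a customer)
history = (
    [(10, False)] * 8
    + [(30, False)] * 6 + [(30, True)] * 2
    + [(50, True)] * 3 + [(50, False)] * 3
    + [(70, True)] * 4 + [(70, False)] * 1
)
print(find_mql_threshold(history, min_rate=0.5))  # 40
```

If the function returns None, no threshold meets the bar on the data you have, which is a signal to revisit the weights before picking a line.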

Metrics that tell you the model works

A scoring model is a hypothesis. These metrics tell you whether the hypothesis holds.

Conversion rate by score band. The headline check. Split leads into bands (0 to 20, 21 to 40, and so on) and measure how many in each band became customers. Higher bands should close at higher rates, cleanly and consistently. If your 80-plus band converts at the same rate as your 21-to-40 band, the score is noise and needs rebuilding.
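A small sketch of the band check, assuming your history is a list of (score, converted) pairs. For simplicity the bands here start at multiples of the width (0 to 19, 20 to 39, and so on), and the sample data is invented.

```python
from collections import defaultdict


def conversion_by_band(leads, width=20):
    """leads: iterable of (score, converted). Returns {band_floor: rate}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for score, converted in leads:
        floor = (int(score) // width) * width
        totals[floor] += 1
        wins[floor] += int(converted)
    return {band: wins[band] / totals[band] for band in sorted(totals)}


def rates_never_drop(rates):
    """True if each band converts at least as well as the band below it."""
    values = list(rates.values())
    return all(a <= b for a, b in zip(values, values[1:]))


sample = [(5, False), (15, False), (25, False), (35, True),
          (45, True), (55, False), (85, True), (95, True)]
rates = conversion_by_band(sample)
print(rates)                    # {0: 0.0, 20: 0.5, 40: 0.5, 80: 1.0}
print(rates_never_drop(rates))  # True
```

With real data, also look at how many leads sit in each band. A band with three leads in it proves very little either way.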

MQL-to-SQL acceptance rate. Of the leads scoring high enough to pass to sales, how many did sales accept as real opportunities? Below 50 percent usually means your fit criteria are too loose or your behavior weights reward activity that does not signal buying.

Win rate of scored leads. Track close rates for high scorers versus the rest. The gap is the measurable value of your model. No gap, no value.

Average score of closed-won deals. Run this quarterly. If your winners cluster at scores your model calls mediocre, your weights are pointed at the wrong signals.

False negative rate. The one teams forget. How many customers came from leads the model scored low? A few are fine. A pattern means you are leaving revenue on the table by ignoring a segment, and the model needs a new criterion.
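The other four checks fit in one small report. This is a sketch under assumptions: the Outcome fields are hypothetical, "high" means at or above your MQL threshold, and false negative rate is taken as the share of customers who came from below that threshold.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Outcome:
    score: float
    accepted_by_sales: bool
    won: bool


def share(rows, predicate):
    """Fraction of rows matching predicate; 0.0 for an empty list."""
    return sum(predicate(r) for r in rows) / len(rows) if rows else 0.0


def model_report(outcomes, threshold):
    high = [o for o in outcomes if o.score >= threshold]
    low = [o for o in outcomes if o.score < threshold]
    winners = [o for o in outcomes if o.won]
    return {
        "acceptance_rate": share(high, lambda o: o.accepted_by_sales),
        "win_rate_gap": share(high, lambda o: o.won) - share(low, lambda o: o.won),
        "avg_winner_score": mean(o.score for o in winners) if winners else None,
        "false_negative_rate": share(winners, lambda o: o.score < threshold),
    }


# Invented outcomes for illustration only.
outcomes = [
    Outcome(80, True, True), Outcome(70, True, False),
    Outcome(65, False, False), Outcome(60, True, True),
    Outcome(40, False, True), Outcome(20, False, False),
    Outcome(15, False, False), Outcome(10, False, False),
]
print(model_report(outcomes, threshold=50))
```

On this sample, acceptance is 0.75, the win rate gap is 0.25, winners average a score of 60, and one customer in three came from below the line.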

For the longer-term view, pair scoring with cohort analysis so you can see whether high-scoring leads also become high-retention, high-LTV customers. A lead that closes fast but churns in three months is not the quality you want to optimize for.

Common mistakes that break scoring

Scoring activity instead of intent. Email opens and blog reads feel like signals because they are easy to track. They barely correlate with revenue. Weight the few high-intent actions (pricing views, demo requests, repeat visits) far above the easy ones.

Building it once and walking away. Your ICP shifts, your content changes, your market moves. A model tuned for last year's buyer slowly drifts wrong. Review weights every quarter against actual closed deals.

No negative scoring. Without subtractions, scores only climb. A lead who unsubscribed, listed a fake company, or works at a competitor should lose points, not sit frozen at their peak.

Sales never sees the reasoning. A bare number ("87") tells a rep nothing. Pass the why alongside it: "VP of Ops at a 200-person SaaS firm, viewed pricing twice this week." That context is what turns a score into a useful call.
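If your scoring step keeps track of which rules fired, producing the "why" is cheap. A minimal sketch, assuming each rule contributes a (points, description) pair. The point values in the example are made up.

```python
def explain(score, reasons, top_n=3):
    """reasons: list of (points, description). Returns a one-line summary
    of the top_n contributions by absolute size, positive or negative."""
    top = sorted(reasons, key=lambda r: abs(r[0]), reverse=True)[:top_n]
    summary = "; ".join(f"{desc} ({pts:+d})" for pts, desc in top)
    return f"Score {score}: {summary}"


print(explain(87, [
    (15, "VP of Ops"),
    (20, "200-person SaaS firm"),
    (24, "viewed pricing twice this week"),
    (1, "opened an email"),
]))
```

Sorting by absolute size means a large deduction shows up in the summary too, which is often what the rep most needs to know.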

Overcomplicating the first version. Forty criteria with fragile half-point weights look sophisticated and predict nothing. Start with five or six strong signals, prove they sort leads, then add nuance. A simple model you trust beats an elaborate one you cannot explain.

How to build your first model in a week

You do not need a data science team to start. You need your CRM history and an afternoon.

Pull your last 50 to 100 closed deals. Note the firmographics they shared and the actions they took before buying. Patterns will jump out.
List the five or six signals that separate winners from losers. Two or three fit, two or three behavior. Resist the urge to add more yet.
Assign rough weights. Strong signal, big number. Use the weight table above as a starting point, not gospel.
Score your current open pipeline and sanity-check it. Do the leads sales already loves land near the top? If your obvious best lead scores in the middle, your weights are off.
Set the MQL threshold and route leads. Pick the score where conversion gets interesting and send everything above it to sales.
Review in 30 and 90 days. Compare conversion by band, adjust, repeat.
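The sanity check in the fourth step can be done by eye, or with a rough sketch like this one. It assumes you have a score per open lead and a short list of leads sales already rates highly. The lead names and the top-quarter cutoff are invented.

```python
def favorites_outside_top(scores, favorites, top_fraction=0.25):
    """scores: {lead_name: score}. Returns the sales favorites that do
    not rank in the top slice of the pipeline, sorted by name."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, round(len(ranked) * top_fraction))
    top = set(ranked[:cutoff])
    return sorted(name for name in favorites if name not in top)


pipeline = {
    "acme": 90, "wayne": 85, "globex": 75, "initech": 40,
    "umbrella": 35, "hooli": 20, "stark": 10, "wonka": 5,
}
print(favorites_outside_top(pipeline, ["acme", "initech"]))  # ['initech']
```

Any name that comes back is a lead sales loves and the model does not. Either the weights are missing a signal or sales is wrong about that lead, and both are worth knowing.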

The model gets smarter every cycle because real outcomes correct your guesses. The first version only has to beat "whoever filled out a form most recently."

FAQ

What is the difference between lead scoring and lead grading?

Grading usually refers to fit only: how well a lead matches your ICP, often graded A to D. Scoring traditionally means behavior: points for actions that signal intent. The strongest models use both together, which is why many teams now use "lead scoring" to mean the combined fit-plus-behavior system described here.

How many points should the MQL threshold be?

There is no universal number, and copying someone else's threshold is a mistake. Find it in your own data: the score at which leads start converting at a rate your sales team considers worth their time. That conversion inflection point is your line. Recheck it as the model changes.

Should I use AI or predictive scoring instead of manual rules?

Predictive scoring, where a model finds the patterns for you, works well once you have enough closed deals to train on (often several hundred). Below that, the math overfits to noise. Start with manual rules you understand, then graduate to predictive scoring when your data volume supports it. Tools in HubSpot, Salesforce, and similar platforms offer both modes.

How often should I update the scoring model?

Review it quarterly at minimum. Update sooner if you launch a new product, shift your target market, or sales starts complaining that high scores are not converting. Treat the model as living: it decays as your business changes.

Can lead scoring hurt my results?

Yes, if it is wrong and nobody checks it. A miscalibrated model sends sales chasing the wrong people and buries good leads in the nurture pile. That is why the metrics section matters: a model you do not measure is a guess you have automated. Watch conversion by band and the false negative rate.

Does lead scoring work for small B2B teams?

It works especially well for small teams, because they cannot afford to waste rep hours. You do not need automation software to start: a simple spreadsheet that tags leads by fit and recent behavior already beats gut feel. Add tooling once the volume justifies it.

Checklist before you ship a scoring model

Fit and behavior scored on separate axes, not blended into one number
Criteria pulled from real closed-deal patterns, not opinions
High-intent actions weighted well above easy actions like email opens
Negative scoring in place for bad-fit and disengaged leads
Behavior points decay with time and stack with frequency
MQL threshold set from your conversion data, not borrowed
Score reasoning passed to sales, not just the number
Conversion-by-band and acceptance rate tracked from day one
A calendar reminder to review weights next quarter

Get the criteria right and your pipeline starts sorting itself: the people worth a call rise to the top, and your reps stop guessing. Get them wrong and you have automated a bad habit at scale.

If your lead flow has volume but your close rates and sales feedback say the quality is off, that is usually a scoring and routing problem, not a traffic one. Send us your last quarter of pipeline data and we will map where good leads are leaking out. Book a 30-minute review with Lead The Way and we will show you which signals in your funnel actually predict revenue.