Methodology

How we test, score, and rank every AI headshot tool.

No tool pays to appear here, and none pays for a higher score. Every ranking on this site comes from the same weighted rubric, the same blind judging panel, and the same 146-tool test set — re-run every quarter. Here is exactly how it works.

Methodology version: v4.2

Last full re-test: June 2026

Tools in test set: 146

Paid placements: none

What we stand on

Six rules we don't break.

A review site is only worth reading if its incentives are clean. These are the commitments that govern every score, ranking, and recommendation we publish.

Rule 01

No pay-to-rank.

A vendor cannot buy a spot on the leaderboard, a higher score, or an “Editor's pick” badge. Rankings are decided before any commercial conversation happens — and never revisited because of one.

Rule 02

We pay for our own accounts.

Every tool is tested on a plan we bought at the public price, in an unmarked account. We do not use vendor-supplied demo accounts, press seats, or hand-picked sample galleries.

Rule 03

Scoring is blind.

Judges rate output without knowing which tool produced it. Brand names are stripped before the panel sees a single portrait, so reputation can't inflate or deflate a score.

Rule 04

Affiliate links never move a rank.

We earn a commission when readers use our coupon codes, and we disclose it loudly. Whether a tool has an affiliate program has zero weight in the rubric.

Rule 05

We show our work.

Every score traces back to a published rubric and a dated test run. When we change a ranking, we say what changed and why — see the changelog at the bottom of this page.

Rule 06

We retest on a schedule.

Models update constantly. A score is only valid for the version we tested, so we re-run the full test set every quarter and date-stamp each result on the tool's review.

The scorecard

Six things we score. Weighted.

Every tool gets a score from 0 to 10 on each of these six criteria. The criteria are weighted: realism and likeness matter most, because a headshot that doesn't look like you is worthless no matter how cheap it is. The weighted average of the six scores becomes the editor score on the leaderboard.

#	Criterion	In short	What we look at	Weight
1	Realism	Does it read as a photo?	Skin texture, catchlights in the eyes, hair detail, believable fabric and background. We dock points hard for the classic AI tells: plastic skin, melted ears, garbled glasses, extra fingers.	25%
2	Likeness	Is it still you?	We compare output against the source selfies for bone structure, hairline, age, and complexion. Tools that “average out” your face into a generic attractive stranger lose the most points here.	25%
3	Value	Quality per dollar	Price against the number of usable keepers, not raw output count. A $49 pack that yields three keepers scores worse than a $19 pack that yields ten. Coupon pricing is noted but scored at list price.	20%
4	Variety	Wardrobe & scenes	Range of usable looks (backgrounds, wardrobe, lighting, crops) and whether they actually differ. Twenty near-identical navy-suit shots count as one look, not twenty.	15%
5	Speed	Selfie to download	Wall-clock time from finishing the upload to having downloadable keepers in hand, measured on a standard plan at a typical weekday hour, not the marketing “as fast as” claim.	10%
6	Trust	Privacy & guarantees	Data handling, photo deletion policy, refund and money-back terms, and how easy it is to actually get your money back. Dark patterns in cancellation flows cost points.	5%
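
As a rough illustration of how the six weights combine into one editor score, here is a minimal Python sketch. The weights are taken from the scorecard; the function name `editor_score`, the sample scores, and the rounding to one decimal place are assumptions made for the example, not a description of the site's actual tooling.

```python
# Weights from the scorecard (they sum to 100%).
WEIGHTS = {
    "realism": 0.25,
    "likeness": 0.25,
    "value": 0.20,
    "variety": 0.15,
    "speed": 0.10,
    "trust": 0.05,
}


def editor_score(scores: dict[str, float]) -> float:
    """Weighted average of six 0-10 criterion scores (hypothetical helper)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the six criteria")
    for name, value in scores.items():
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total, 1)  # one-decimal rounding is an assumption


# Made-up scores for an imaginary tool, not a real test result.
example = {
    "realism": 9.0,
    "likeness": 9.0,
    "value": 8.0,
    "variety": 8.0,
    "speed": 7.0,
    "trust": 8.0,
}
print(editor_score(example))  # prints 8.4
```

Because realism and likeness each carry 25%, a one-point drop in either moves the result five times as much as a one-point drop in trust.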

In the lab

How we run a single test.

Every tool goes through the identical pipeline below, using the same controlled set of source photos across multiple real faces. Consistency is the whole point — same input, same prompts, same judges.

Same faces, same files

We feed each tool an identical kit of source selfies from a panel of 12 volunteer faces, spanning ages, skin tones, and hair types. No tool gets cleaner inputs than another.

Matched prompts

We request the same brief from every tool — a LinkedIn set, a creative set, a casual set — using each tool's nearest equivalent presets so we compare like for like.

Blind scoring

Output is stripped of branding and shuffled into a single pool. Five judges score each portrait against the rubric without knowing which tool made it.
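
One way to picture the strip-and-shuffle step is the Python sketch below. It is an illustration under assumptions: the record fields (`tool`, `image`), the function name `build_blind_pool`, and the use of random hex IDs are all hypothetical, and the real pipeline may work differently.

```python
import random
import uuid


def build_blind_pool(portraits, seed=None):
    """Return (pool, key).

    pool: shuffled records that carry only an anonymous ID and the image.
    key:  private mapping from anonymous ID back to the tool name,
          kept away from the judges until scoring is finished.
    """
    rng = random.Random(seed)
    pool, key = [], {}
    for portrait in portraits:
        blind_id = uuid.uuid4().hex  # random ID with no brand information
        key[blind_id] = portrait["tool"]
        pool.append({"id": blind_id, "image": portrait["image"]})
    rng.shuffle(pool)  # mix all tools into a single pool
    return pool, key


# Hypothetical input: tool names and file names are placeholders.
portraits = [
    {"tool": "tool-a", "image": "a1.jpg"},
    {"tool": "tool-b", "image": "b1.jpg"},
    {"tool": "tool-a", "image": "a2.jpg"},
]
pool, key = build_blind_pool(portraits, seed=7)
```

Judges would see only `pool`; `key` is joined back in after every portrait has a score.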

Aggregate & date-stamp

Scores are averaged, weighted, and tied to the exact tool version and test date. The result publishes to the leaderboard and the tool's review page.
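
The averaging and date-stamping could look roughly like this in Python. The record layout, the function name `aggregate`, and the sample numbers are assumptions for illustration; the weighting step is left out to keep the sketch short.

```python
from datetime import date
from statistics import mean


def aggregate(judge_scores, tool_version, test_date):
    """Average each criterion across judges and attach version and date.

    judge_scores: one dict per judge, mapping criterion name to a 0-10 score.
    """
    criteria = judge_scores[0].keys()
    averaged = {c: mean(j[c] for j in judge_scores) for c in criteria}
    return {
        "scores": averaged,
        "tool_version": tool_version,  # the exact version that was tested
        "test_date": test_date.isoformat(),
    }


# Two judges and two criteria only, with made-up scores.
result = aggregate(
    [{"realism": 8, "likeness": 6}, {"realism": 10, "likeness": 8}],
    tool_version="example-v1",
    test_date=date(2026, 6, 1),
)
```

Keeping the version and date on the same record as the scores is what lets a published number be traced back to one specific test run.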

The sample

One quarter of testing, by the numbers.

This is the volume behind a single quarterly refresh of the leaderboard. It's why a full re-test takes our team the better part of a month.

146 tools in the active test set

12 volunteer faces across demographics

31k portraits scored blind by the panel

5 independent judges per portrait

Who's judging

The panel, and the keeper question.

Scores are only as good as the people behind them. Here's who scores, and the one question that breaks every tie.

A mixed panel, on purpose

Our five-judge panel isn't five photographers. It's deliberately mixed: a portrait photographer, a recruiter who screens LinkedIn profiles all day, a dating-app coach, a brand designer, and one ordinary person with no industry stake at all.

That mix matters because a headshot has different jobs. A recruiter and a Hinge user are looking for almost opposite things, and a tool that nails one can fail the other. Averaging across the panel keeps any single taste from dominating the score.

The 10-second keeper test

When scores are close, we fall back to the only question that matters in the real world: would you actually publish this? Each judge spends ten seconds per portrait — the same glance a recruiter or a match gives it — and marks it keeper or pass.

We count keepers per dollar, not portraits per pack. A tool can spit out 300 images, but if only four survive the keeper test, that's what we score it on. It's the closest thing we have to measuring the thing you actually care about.
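
The keepers-per-dollar idea reduces to a single division, sketched here in Python with the $49 and $19 packs from the scorecard as inputs. The function name is hypothetical, and how this ratio is then mapped onto the 0 to 10 Value score is not specified on this page, so the sketch stops at the ratio.

```python
def keepers_per_dollar(keepers: int, list_price: float) -> float:
    """Usable keepers divided by list price (coupon discounts ignored)."""
    if list_price <= 0:
        raise ValueError("list_price must be positive")
    if keepers < 0:
        raise ValueError("keepers cannot be negative")
    return keepers / list_price


pricey_pack = keepers_per_dollar(keepers=3, list_price=49.0)
cheap_pack = keepers_per_dollar(keepers=10, list_price=19.0)

# The $19 pack delivers far more keepers per dollar than the $49 pack.
print(f"{pricey_pack:.3f} vs {cheap_pack:.3f}")  # prints 0.061 vs 0.526
```

Note that the raw image count never enters the calculation: a 300-image pack with four keepers is scored on the four.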

A score, decoded

9.0+

Editor's-pick tier. Passes the recruiter test >90% of the time; we'd use it ourselves.

8.0–8.9

Strong. A great pick for most people, with one or two trade-offs we'll name.

7.0–7.9

Good enough for LinkedIn, weaker on likeness or variety. Read the caveats.

Below 7.0

Not recommended for now. Usually a likeness or realism problem.
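
For readers who want the bands as a lookup, a minimal Python sketch follows. The thresholds and labels come from the tiers above; the function name `tier` and the assumption that scores are compared after rounding to one decimal place are illustrative.

```python
def tier(score: float) -> str:
    """Map a 0-10 editor score to its band (hypothetical helper)."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    score = round(score, 1)  # assumption: bands apply to one-decimal scores
    if score >= 9.0:
        return "Editor's-pick tier"
    if score >= 8.0:
        return "Strong"
    if score >= 7.0:
        return "Good enough for LinkedIn"
    return "Not recommended for now"


print(tier(8.4))  # prints Strong
```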

Full disclosure

How we make money.

The clearest signal that a review is honest is a plain answer to “how do you get paid?” Here's ours, with nothing left out.

Where the money comes from

Reader coupon codes.

When you use one of our negotiated codes — like GURU20 — the vendor pays us a commission, and you get a discount you wouldn't get otherwise. That's the entire business model. It keeps the site free to read.

We negotiate codes after a tool earns its ranking, never before. A tool's score is locked in based on the rubric long before any commercial conversation happens.

What we'll never do

Sell a rank.

We don't take payment for placement, higher scores, badges, or “sponsored review” coverage dressed up as editorial. We've turned down vendors who asked.

Affiliate status carries zero weight in the rubric. Several tools we rank highly pay us nothing, and we still recommend them — because the scorecard says so. If our incentives ever conflict with your interests, the rubric wins.

Show our work

Methodology changelog.

When the rubric or the test set changes, it goes here — dated and explained. Older scores are re-run against the new method before any ranking moves.

May 2026

v4.2

Likeness weight raised to 25%.

As base models got more photorealistic, “looks like a photo” stopped separating the field — but “looks like you” still did. We raised likeness from 20% to 25% and trimmed speed accordingly. Three tools shifted rank as a result.

Feb 2026

v4.1

Added the Trust criterion.

We introduced a 5% Trust score covering data deletion, refund terms, and cancellation dark patterns, after reader reports of tools that made it hard to get a refund. Weight came out of Value.

Nov 2025

v4.0

Expanded the panel to five judges.

We grew the blind-scoring panel from three to five and formalized the mixed-background requirement, so no single profession's taste dominates a score.

Aug 2025

v3.5

Switched to keepers-per-dollar for Value.

We stopped rewarding raw output volume and began scoring Value on usable keepers instead, after finding that high-volume tools were padding packs with near-duplicate shots.

See the rubric in action on the 2026 leaderboard.

On the method

Methodology, answered.

No. Rankings come from the weighted rubric and blind panel scoring, and they're locked in before any commercial conversation. We earn money from reader coupon codes, but affiliate status has zero weight in the score — several highly ranked tools pay us nothing.

Before the panel sees any output, we strip the branding and shuffle every portrait into one pool. Judges rate each image against the rubric without knowing which tool produced it, so reputation can't inflate or deflate a score.

We re-run the full 146-tool test set every quarter, and we date-stamp every result. A score is only valid for the tool version we tested — when a model updates meaningfully, we retest before changing its rank.

Realism and likeness together make up half the score because a headshot that doesn't read as a real photo of you fails at its only job, no matter how cheap or fast it is. Speed and price are tie-breakers, not the headline.

Five judges with deliberately different stakes: a portrait photographer, a recruiter, a dating-app coach, a brand designer, and one person with no industry stake. Averaging across them keeps any single taste from dominating.

Tell us. Reader reports are how the Trust criterion and the keepers-per-dollar change both happened. We can't promise a rank will move, but every flagged issue gets checked against the published rubric and the dated test run.

The Sunday Brief

One verdict a week. No fluff.

Every Sunday: one new AI headshot tool reviewed against this rubric, one fresh coupon code, one reader makeover. 84,000+ subscribers. Unsubscribe in one click.

