Relevance Quality

How to read analytics as a relevance-improvement signal — and the workflow for turning low-CTR queries into synonym, curation, and ranking changes.

Search analytics is only valuable if it changes what you ship. This page connects the analytics surfaces back to the Relevance tuning workflow: what to look at, what action to take, and how to measure that the action worked.

The relevance-quality loop

analytics (low CTR, zero-result, reformulation)
        │
        ▼
diagnose (search preview, curation panel)
        │
        ▼
tune (synonym, queryBy weight, curation, content fix)
        │
        ▼
re-measure (same query, next period)

It's a slow loop — at least one period (24 h or 7 d) per cycle because you need enough volume on the affected query to read the signal. Don't rush it.

What "good relevance" looks like in the data

Three numbers, read together:

Metric	Target (storefront)	Target (knowledge base)
Average CTR	25–40 %	35–60 %
Zero-result rate	< 5 %	< 10 %
p95 search latency	< 200 ms	< 200 ms

These are rough planks, not SLOs — your catalog shape and your widget UX move them around. The trend matters more than the absolute number: a 5-point CTR drop after a deploy is much worse than a steady 18 % CTR for two quarters.

Reading low-CTR queries

A low-CTR query is one of three things:

Wrong results. The list does not contain what the user wanted.
Right results, wrong order. What the user wanted is on page 3.
Right list, wrong UX. Results are correct but unclickable — image missing, title truncated, price hidden.

You diagnose by running the query in the Search Preview with the same queryBy, filterBy, and sortBy your storefront uses. From there:

If the right product isn't there at all → it's case 1. Synonym / queryBy weight fix.
If the right product is on page 3+ → case 2. Pin it via curation.
If the right product is on page 1 → case 3. UX, not search. Send to the front-end team.

Synonyms vs curations vs queryBy weights

Three tools, three different jobs:

Tool	Best for	Watch out
Synonyms	Vocabulary mismatch ("airpods" ↔ "wireless earbuds", "pants" ↔ "trousers", regional terms).	Synonyms apply to the whole index — they affect every query containing the root term. Be specific.
queryBy weights	Adjusting which fields drive relevance for all queries on a surface ("title:10, brand:5").	Field weights are blunt — they don't differentiate by user intent. Tune sparingly.
Curations	"Always pin these IDs for this exact query." Branded queries, merchandising, seasonal promotions.	Curations are exact-match on query string by default. They don't fix related queries.

A common mistake is to reach for a curation when a synonym would solve the cluster of related queries. Read the reformulations in Queries to see whether you're looking at one query or twenty similar ones.

Putting it together — example

You see in topQueries:

Query	Searches	CTR	Zero
`trainers`	412	3 %	no
`running trainers`	88	6 %	no
`nike trainers`	65	8 %	no

All three are low-CTR but non-zero. Run trainers in the preview: results are mostly shoe accessories, not actual shoes. Action:

Add synonym trainers ↔ sneakers, running shoes. This is the vocabulary fix.
Also raise the weight of categories in queryBy so a categorised-as-shoes document outranks an uncategorised accessory.
Wait a week and recheck the same three queries' CTR in the next period.

If CTR rose to ~25 % on trainers, the synonym solved the cluster. If trainers improved but nike trainers stayed flat, you've also got a brand-field weight issue — boost brand in queryBy.

When relevance tuning won't help

Some signal looks like a relevance problem but isn't:

Stale catalog. If the right product was deprecated last week but is still in the index, you need a re-ingest, not a synonym.
Pricing / availability. Out-of-stock items on top of the result list will tank CTR no matter how relevant they are. Filter or boost availability in the storefront query path.
Image-less results. If the front-end requires an image and 60 % of your catalog has none, CTR will look catastrophic on every query.

Cross-check with the Activity tab — recent reindexes, schema changes, or sync-job failures often explain the drop.

Telemetry-driven testing

For high-stakes changes (a synonym that affects 30 % of queries; a new default queryBy), gate the rollout:

Apply the change in a staging org / preview index.
Replay the production query stream against staging (export → script).
Compare CTR per query bucket.
Roll out only if the staging bucket beats production.

The replay path uses analyticsEvents export + the Search Workspace preview API. It's manual; for a fully managed A/B framework, see the Operations / Performance doc.

Aggregating across multiple indexes

If you operate multiple indexes (storefront + help-center + blog), look at relevance quality per index, not org-wide. Different content types have different healthy CTRs. Mixing them masks problems.

Read by metadata.indexSlug from the raw events; the dashboard's index filter does the same join.

Analytics overview
Queries
No-results
Relevance tuning — the dashboard panel
Search workspace — the preview tool
Multi-search and querying — queryBy weights

Relevance Quality

On this page