add URL redirects with ETS-cached plug, broken URL tracking, and admin UI

Redirects context with redirect/broken_url schemas, chain flattening,
ETS cache for fast lookups in the request pipeline. BrokenUrlTracker
plug logs 404s. Auto-redirect on product slug change via upsert_product
hook. Admin redirects page with active/broken tabs, manual create form.
RedirectPrunerWorker cleans up old broken URLs. 1227 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jamey
2026-02-26 14:14:14 +00:00
parent 23e95a3de6
commit 6e57af82fc
21 changed files with 1493 additions and 24 deletions

@@ -1,12 +1,12 @@
# URL redirects
> Status: Planned
> Tasks: #7881 in PROGRESS.md
> Tasks: #7882 in PROGRESS.md
> Tier: 3 (Compliance & quality — SEO dependency)
## Goal
Preserve link equity and customer experience when product URLs change or products are removed. Automatically handle the most common cases, use analytics data to identify what actually matters, and surface anything ambiguous for admin review.
Preserve link equity and customer experience when product URLs change, products are removed, or collections are renamed. Automatically handle the most common cases, use analytics data to identify what actually matters, and surface anything ambiguous for admin review.
## Why it matters
@@ -20,11 +20,15 @@ Most redirect implementations just provide a manual table. The insight here is t
## Three layers
### Layer 1: Automatic redirect creation on slug change
### Layer 1: Automatic redirect creation on slug change or deletion
The most common case. When a product's title changes during sync, the slug changes, and the old `/products/old-slug` URL breaks. We detect this automatically in `upsert_product/2`.
Three triggers, all detected during provider sync:
**Hook point:** `lib/berrypod/products.ex:421–425`, the `product ->` branch in `upsert_product/2` where `update_product(product, attrs)` is called. At this point we have `product.slug` (old) and can compute the new slug from `attrs[:title]`.
#### 1a. Product slug change
When a product's title changes during sync, the slug changes, and the old `/products/old-slug` URL breaks. Detected in `upsert_product/2`.
**Hook point:** `lib/berrypod/products.ex` — the `product ->` branch in `upsert_product/2` where `update_product(product, attrs)` is called. At this point we have `product.slug` (old) and can compute the new slug from `attrs[:title]`.
```elixir
product ->
@@ -48,6 +52,30 @@ product ->
`create_auto/1` uses `on_conflict: :nothing` on the `from_path` unique index — safe to call repeatedly if sync runs multiple times.
#### 1b. Product deletion
When a product is removed during sync, create a redirect to the most specific relevant page. Look up the product's category before deletion and redirect to that collection page. If no category is known, fall back to `/`.
Google's guidance is that a 301 to an irrelevant page (soft 404) is worse than a clean 404, so the redirect target must make sense — the collection page shows related products the customer might want.
```elixir
# In delete_product/1, before the actual deletion
category = product.category
target = if category, do: "/collections/#{Slug.slugify(category)}", else: "/"
Redirects.create_auto(%{
from_path: "/products/#{product.slug}",
to_path: target,
source: :auto_product_deleted
})
```
#### 1c. Collection slug change
Categories come from provider tags. If a tag is renamed, the category slug changes and `/collections/old-slug` breaks. Same detection logic — compare old vs new slug in the category upsert path and create a redirect.
Lower priority than products (collection URLs change less often), but the same mechanism handles it.
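The old-vs-new slug comparison is the same for products and collections, so it can be sketched as one pure helper. This is a minimal illustration only: `SlugChangeSketch` and `redirect_attrs/4` are hypothetical names, and reusing the `:auto_slug_change` source for collections is an assumption rather than something the plan specifies.

```elixir
defmodule SlugChangeSketch do
  # Hypothetical helper, not part of the codebase. Builds the attrs for an
  # automatic redirect when a slug has changed, or returns nil when it has not.
  # `prefix` is "/products" or "/collections".
  def redirect_attrs(prefix, old_slug, new_slug, source) do
    if old_slug == new_slug do
      nil
    else
      %{
        from_path: "#{prefix}/#{old_slug}",
        to_path: "#{prefix}/#{new_slug}",
        source: source
      }
    end
  end
end
```

A caller in the category upsert path would pass any non-nil result to `Redirects.create_auto/1`, for example `SlugChangeSketch.redirect_attrs("/collections", "t-shirts", "tees", :auto_slug_change)`.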
### Layer 2: A `redirects` table checked early in the Plug pipeline
One table, one Plug, all redirect types flow through the same path.
@@ -70,23 +98,55 @@ defmodule BerrypodWeb.Plugs.Redirects do
def init(opts), do: opts
def call(%{request_path: path} = conn, _opts) do
case Redirects.lookup(path) do
{:ok, redirect} ->
Redirects.increment_hit_count(redirect)
def call(conn, _opts) do
path = conn.request_path
# Normalise: trailing slash removal (except root)
# and lowercase path (not query params)
normalised = path |> maybe_strip_trailing_slash() |> String.downcase()
cond do
# Trailing slash or case mismatch — redirect to canonical form
normalised != path ->
location = append_query(normalised, conn.query_string)
conn
|> put_resp_header("location", redirect.to_path)
|> put_resp_header("location", location)
|> send_resp(301, "")
|> halt()
# Check redirect table (ETS-cached)
match?({:ok, _}, Redirects.lookup(path)) ->
{:ok, redirect} = Redirects.lookup(path)
Redirects.increment_hit_count(redirect)
location = append_query(redirect.to_path, conn.query_string)
conn
|> put_resp_header("location", location)
|> send_resp(redirect.status_code, "")
|> halt()
:not_found ->
true ->
conn
end
end
defp maybe_strip_trailing_slash("/"), do: "/"
defp maybe_strip_trailing_slash(path), do: String.trim_trailing(path, "/")
defp append_query(path, ""), do: path
defp append_query(path, qs), do: "#{path}?#{qs}"
end
```
The Plug handles three concerns in one pass:
1. **Trailing slash normalisation**: `/products/foo/` → `/products/foo`. Phoenix generates no-trailing-slash URLs, so this is the canonical form. Prevents duplicate content in Google's index.
2. **Case normalisation**: `/Products/Foo` → `/products/foo`. URL paths are case-sensitive per RFC 3986 (only the scheme and host are case-insensitive), but mixed-case URLs cause duplicate content issues. Shopify lowercases everything. Only applies to the path, not query params (those can be case-sensitive for variant selectors like `?Color=Sand`).
3. **Redirect table lookup**: custom redirects from the `redirects` table.
All three preserve query params. This matters for variant selection URLs (`?Color=Sand&Size=S`) surviving a product slug change redirect.
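To make the normalisation rules easy to check in isolation, here is a standalone sketch of the first two concerns as a pure function. `CanonicalPathSketch` and `check/2` are hypothetical names; the private helpers mirror the ones in the Plug above.

```elixir
defmodule CanonicalPathSketch do
  # Hypothetical standalone version of the plug's normalisation step.
  # Returns {:redirect, location} when the path is not in canonical form,
  # otherwise :ok. Only the path is lowercased; the query string is untouched.
  def check(path, query_string \\ "") do
    canonical = path |> strip_trailing_slash() |> String.downcase()

    if canonical == path do
      :ok
    else
      {:redirect, append_query(canonical, query_string)}
    end
  end

  defp strip_trailing_slash("/"), do: "/"
  defp strip_trailing_slash(path), do: String.trim_trailing(path, "/")

  defp append_query(path, ""), do: path
  defp append_query(path, qs), do: "#{path}?#{qs}"
end
```

With this sketch, `CanonicalPathSketch.check("/Products/Foo/", "Color=Sand")` returns `{:redirect, "/products/foo?Color=Sand"}`, while `check("/")` and `check("/products/foo")` both return `:ok`.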
**Caching:** The redirect lookup is on the hot path for every request. Use ETS for an in-memory cache, populated on app start and invalidated on any redirect create/update/delete.
```elixir
@@ -159,7 +219,7 @@ create table(:redirects, primary_key: false) do
add :from_path, :string, null: false # "/products/old-classic-tee"
add :to_path, :string, null: false # "/products/classic-tee-v2" or "/"
add :status_code, :integer, default: 301 # 301 permanent, 302 temporary
add :source, :string, null: false # "auto_slug_change" | "analytics_detected" | "admin"
add :source, :string, null: false # "auto_slug_change" | "auto_product_deleted" | "analytics_detected" | "admin"
add :confidence, :float # FTS5 match score for analytics_detected, nil otherwise
add :hit_count, :integer, default: 0 # incremented each time this redirect fires
timestamps()
@@ -201,6 +261,7 @@ Table of all redirects with columns: from path, to path, source (badge: auto/det
Sources:
- `auto_slug_change` — created automatically when sync detected a slug change. Trust these.
- `auto_product_deleted` — created automatically when a product was removed. Targets the category collection page or `/`.
- `analytics_detected` — created from analytics + FTS5 match. Show confidence score. Worth reviewing.
- `admin` — manually created.
@@ -238,11 +299,31 @@ Redirects.create_auto({from: /products/old, to: /products/new})
─────
Customer visits /products/old-slug
Provider deletes product
delete_product/1
Look up product category before deletion
Redirects.create_auto({from: /products/slug, to: /collections/category or /})
→ ETS cache invalidated
─────
Any request hits the Plug
1. Trailing slash? → 301 to canonical (preserving query params)
2. Mixed case path? → 301 to lowercase (preserving query params)
3. Redirect table match? → 301/302 to target (preserving query params)
4. None of the above → pass through to router
─────
Customer visits /products/old-slug?Color=Sand
BerrypodWeb.Plugs.Redirects checks ETS cache
↓ hit
301 → /products/new-slug
301 → /products/new-slug?Color=Sand
hit_count incremented
─────
@@ -275,6 +356,12 @@ Sees sorted list of broken URLs by prior traffic
Enters destination → creates redirect
ETS cache warmed → Plug now catches future requests
─────
Weekly Oban cron
Prune auto redirects with 0 hits older than 90 days
```
---
@@ -458,6 +545,23 @@ They complement each other. The redirect preserves SEO and visitor experience fo
---
## Auto-pruning
Auto-created redirects with zero hits are pruned after 90 days via a weekly Oban cron job. This prevents unbounded growth if products are renamed repeatedly.
```elixir
# Weekly cron: prune stale auto-redirects
from(r in Redirect,
where: r.source in ["auto_slug_change", "auto_product_deleted"] and r.hit_count == 0,
where: r.inserted_at < ago(90, "day")
)
|> Repo.delete_all()
```
Redirects that have been used at least once are kept forever — they're demonstrably serving traffic. Manual (`admin`) and analytics-detected redirects are excluded from auto-pruning; the admin can delete them manually if needed.
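The pruning rule can also be expressed as a pure predicate, which is handy for unit-testing the three exclusions without a database. This is a sketch under assumptions: `PruneRuleSketch` and `prunable?/2` are hypothetical names, and `redirect` is a plain map rather than the real schema struct.

```elixir
defmodule PruneRuleSketch do
  @auto_sources ["auto_slug_change", "auto_product_deleted"]
  @max_age_days 90

  # Hypothetical pure-Elixir mirror of the query's WHERE clauses.
  # `redirect` is a map with :source, :hit_count and :inserted_at (a DateTime).
  def prunable?(redirect, now) do
    redirect.source in @auto_sources and
      redirect.hit_count == 0 and
      DateTime.diff(now, redirect.inserted_at, :second) > @max_age_days * 86_400
  end
end
```

A redirect is prunable only when all three conditions hold: it was auto-created, it has never fired, and it was inserted more than 90 days before `now`.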
---
## Implementation notes
**Slug change detection is safe to add with no behaviour change** for products that don't change slug. The `on_conflict: :nothing` insert ensures idempotency across repeated syncs.
@@ -480,10 +584,11 @@ They complement each other. The redirect preserves SEO and visitor experience fo
- `lib/berrypod/redirects/redirect.ex` — schema
- `lib/berrypod/redirects/broken_url.ex` — schema
- `lib/berrypod/redirects.ex` — context: `lookup/1`, `create_auto/1`, `create_manual/1`, `warm_cache/0`, `invalidate_cache/1`, `increment_hit_count/1`, `list_broken_urls/0`, `record_broken_url/2`
- `lib/berrypod_web/plugs/redirects.ex` — new Plug
- `lib/berrypod/products.ex` — slug change detection in `upsert_product/2`
- `lib/berrypod_web/plugs/redirects.ex` — new Plug (redirects + trailing slash + case normalisation)
- `lib/berrypod/products.ex` — slug change detection in `upsert_product/2`, redirect on deletion in `delete_product/1`
- `lib/berrypod_web/live/shop/error.ex` — hook analytics query on 404
- `lib/berrypod_web/live/admin/redirects_live.ex` — new LiveView (3 tabs)
- `lib/berrypod/workers/redirect_pruner_worker.ex` — weekly Oban cron for auto-pruning
- Router — `/admin/redirects` route, ETS cache warm on startup
- Admin nav — new sidebar link
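For orientation, the cache functions listed for the context (`warm_cache/0`, `lookup/1`, `invalidate_cache/1`) could look roughly like the sketch below. It is an illustration under assumptions, not the implementation in `lib/berrypod/redirects.ex`: the module name, table name, and the choice to pass the redirect list into `warm/1` instead of querying the database are all hypothetical.

```elixir
defmodule RedirectCacheSketch do
  # Hypothetical table name; the real module and table names may differ.
  @table :redirect_cache_sketch

  # Create the table if needed, then load every redirect keyed by from_path.
  def warm(redirects) do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    end

    :ets.delete_all_objects(@table)
    :ets.insert(@table, Enum.map(redirects, &{&1.from_path, &1}))
    :ok
  end

  # Same return shape the Plug expects: {:ok, redirect} or :not_found.
  def lookup(path) do
    case :ets.lookup(@table, path) do
      [{^path, redirect}] -> {:ok, redirect}
      [] -> :not_found
    end
  end

  # Drop a single entry after a redirect is updated or deleted.
  def invalidate(path) do
    :ets.delete(@table, path)
    :ok
  end
end
```

An ETS table is destroyed when the process that created it exits, so in a real application `warm/1` would be called from a long-lived process started under the supervision tree.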
@@ -491,10 +596,19 @@ They complement each other. The redirect preserves SEO and visitor experience fo
- `upsert_product/2` with title change creates redirect automatically
- `upsert_product/2` with no title change does not create redirect
- `delete_product/1` creates redirect to category collection page
- `delete_product/1` with no category creates redirect to `/`
- Redirect Plug: matching path → 301, no match → passthrough
- Redirect Plug: query string preserved on redirect (`?Color=Sand` survives)
- Redirect Plug: trailing slash stripped (`/products/foo/` → `/products/foo`)
- Redirect Plug: mixed case normalised (`/Products/Foo` → `/products/foo`)
- Redirect Plug: root `/` trailing slash not stripped
- Redirect Plug: ETS cache hit (no DB call)
- 404 handler: path with analytics history → broken_url record created
- 404 handler: path with no analytics history → nothing recorded
- FTS5 auto-resolve: confident match → redirect created; no match → broken_url pending
- Redirect chain flattening: A→B, new B→C → stored as A→C
- `hit_count` incremented on each redirect fire
- Auto-pruning: 0-hit auto redirects older than 90 days deleted
- Auto-pruning: manual and analytics-detected redirects excluded
- Auto-pruning: redirects with hits > 0 preserved regardless of age
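The chain-flattening case in the list above (A→B, then a new B→C, stored as A→C) can be sketched over an in-memory map. This is a hypothetical model for reasoning about the test, not the context's actual code: `ChainFlattenSketch` and `add/3` are invented names, and the `{:error, :loop}` return is an assumption about how a cycle might be rejected.

```elixir
defmodule ChainFlattenSketch do
  # Hypothetical in-memory model: `redirects` maps from_path => to_path and is
  # assumed to be flat already (no to_path is also a from_path).
  # Adding from -> to resolves `to` one hop if it is itself redirected, then
  # repoints every existing redirect that targeted `from` at the final target.
  def add(redirects, from, to) do
    final = Map.get(redirects, to, to)

    if final == from do
      {:error, :loop}
    else
      flattened =
        redirects
        |> Map.new(fn {f, t} -> {f, if(t == from, do: final, else: t)} end)
        |> Map.put(from, final)

      {:ok, flattened}
    end
  end
end
```

Because the map is kept flat after every insert, a single-hop resolution is enough and no request ever follows more than one redirect.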