diff --git a/config/config.exs b/config/config.exs
index ea4b501..c955a8e 100644
--- a/config/config.exs
+++ b/config/config.exs
@@ -95,7 +95,8 @@ config :berrypod, Oban,
        {"*/30 * * * *", Berrypod.Orders.FulfilmentStatusWorker},
        {"0 */6 * * *", Berrypod.Sync.ScheduledSyncWorker},
        {"0 3 * * *", Berrypod.Analytics.RetentionWorker},
-       {"0 4 * * *", Berrypod.Orders.AbandonedCartPruneWorker}
+       {"0 4 * * *", Berrypod.Orders.AbandonedCartPruneWorker},
+       {"0 5 * * 1", Berrypod.Workers.RedirectPrunerWorker}
      ]}
   ],
   queues: [images: 2, sync: 1, checkout: 1]
diff --git a/docs/plans/url-redirects.md b/docs/plans/url-redirects.md
index 3d72f43..d76364c 100644
--- a/docs/plans/url-redirects.md
+++ b/docs/plans/url-redirects.md
@@ -1,12 +1,12 @@
 # URL redirects
 
 > Status: Planned
-> Tasks: #78–81 in PROGRESS.md
+> Tasks: #78–82 in PROGRESS.md
 > Tier: 3 (Compliance & quality — SEO dependency)
 
 ## Goal
 
-Preserve link equity and customer experience when product URLs change or products are removed. Automatically handle the most common cases, use analytics data to identify what actually matters, and surface anything ambiguous for admin review.
+Preserve link equity and customer experience when product URLs change, products are removed, or collections are renamed. Automatically handle the most common cases, use analytics data to identify what actually matters, and surface anything ambiguous for admin review.
 
 ## Why it matters
 
@@ -20,11 +20,15 @@ Most redirect implementations just provide a manual table. The insight here is t
 
 ## Three layers
 
-### Layer 1: Automatic redirect creation on slug change
+### Layer 1: Automatic redirect creation on slug change or deletion
 
-The most common case. When a product's title changes during sync, the slug changes, and the old `/products/old-slug` URL breaks. We detect this automatically in `upsert_product/2`.
+Three triggers, all detected during provider sync:
 
-**Hook point:** `lib/berrypod/products.ex:421–425` — the `product ->` branch in `upsert_product/2` where `update_product(product, attrs)` is called. At this point we have `product.slug` (old) and can compute the new slug from `attrs[:title]`.
+#### 1a. Product slug change
+
+When a product's title changes during sync, the slug changes, and the old `/products/old-slug` URL breaks. Detected in `upsert_product/2`.
+
+**Hook point:** `lib/berrypod/products.ex` — the `product ->` branch in `upsert_product/2` where `update_product(product, attrs)` is called. At this point we have `product.slug` (old) and can compute the new slug from `attrs[:title]`.
 
 ```elixir
 product ->
@@ -48,6 +52,30 @@ product ->
 
 `create_auto/1` uses `on_conflict: :nothing` on the `from_path` unique index — safe to call repeatedly if sync runs multiple times.
 
+#### 1b. Product deletion
+
+When a product is removed during sync, create a redirect to the most specific relevant page. Look up the product's category before deletion and redirect to that collection page. If no category is known, fall back to `/`.
+
+Google's guidance is that a 301 to an irrelevant page (soft 404) is worse than a clean 404, so the redirect target must make sense — the collection page shows related products the customer might want.
+
+```elixir
+# In delete_product/1, before the actual deletion
+category = product.category
+target = if category, do: "/collections/#{Slug.slugify(category)}", else: "/"
+
+Redirects.create_auto(%{
+  from_path: "/products/#{product.slug}",
+  to_path: target,
+  source: "auto_product_deleted"
+})
+```
+
+#### 1c. Collection slug change
+
+Categories come from provider tags. If a tag is renamed, the category slug changes and `/collections/old-slug` breaks. Same detection logic — compare old vs new slug in the category upsert path and create a redirect.
+
+Lower priority than products (collection URLs change less often), but the same mechanism handles it.
+
 ### Layer 2: A `redirects` table checked early in the Plug pipeline
 
 One table, one Plug, all redirect types flow through the same path.
@@ -70,23 +98,55 @@ defmodule BerrypodWeb.Plugs.Redirects do
 
   def init(opts), do: opts
 
-  def call(%{request_path: path} = conn, _opts) do
-    case Redirects.lookup(path) do
-      {:ok, redirect} ->
-        Redirects.increment_hit_count(redirect)
+  def call(conn, _opts) do
+    path = conn.request_path
+
+    # Normalise: trailing slash removal (except root)
+    # and lowercase path (not query params)
+    normalised = path |> maybe_strip_trailing_slash() |> String.downcase()
+
+    cond do
+      # Trailing slash or case mismatch — redirect to canonical form
+      normalised != path ->
+        location = append_query(normalised, conn.query_string)
 
         conn
-        |> put_resp_header("location", redirect.to_path)
+        |> put_resp_header("location", location)
+        |> send_resp(301, "")
+        |> halt()
+
+      # Check redirect table (ETS-cached)
+      match?({:ok, _}, Redirects.lookup(path)) ->
+        {:ok, redirect} = Redirects.lookup(path)
+        Redirects.increment_hit_count(redirect)
+        location = append_query(redirect.to_path, conn.query_string)
+
+        conn
+        |> put_resp_header("location", location)
         |> send_resp(redirect.status_code, "")
         |> halt()
 
-      :not_found ->
+      true ->
         conn
     end
   end
+
+  defp maybe_strip_trailing_slash("/"), do: "/"
+  defp maybe_strip_trailing_slash(path), do: String.trim_trailing(path, "/")
+
+  defp append_query(path, ""), do: path
+  defp append_query(path, qs), do: "#{path}?#{qs}"
 end
 ```
 
+The Plug handles three concerns in one pass:
+
+1. **Trailing slash normalisation** — `/products/foo/` → `/products/foo`. Phoenix generates no-trailing-slash URLs, so this is the canonical form. Prevents duplicate content in Google's index.
+2. **Case normalisation** — `/Products/Foo` → `/products/foo`. URLs are technically case-sensitive per RFC 3986, but mixed-case URLs cause duplicate content issues. Shopify lowercases everything. Only applies to the path, not query params (those can be case-sensitive for variant selectors like `?Color=Sand`).
+3. **Redirect table lookup** — custom redirects from the `redirects` table.
+
+All three preserve query params. This matters for variant selection URLs (`?Color=Sand&Size=S`) surviving a product slug change redirect.
+
 **Caching:** The redirect lookup is on the hot path for every request. Use ETS for an in-memory cache, populated on app start and invalidated on any redirect create/update/delete.
 
 ```elixir
@@ -159,7 +219,7 @@ create table(:redirects, primary_key: false) do
   add :from_path, :string, null: false  # "/products/old-classic-tee"
   add :to_path, :string, null: false  # "/products/classic-tee-v2" or "/"
   add :status_code, :integer, default: 301  # 301 permanent, 302 temporary
-  add :source, :string, null: false  # "auto_slug_change" | "analytics_detected" | "admin"
+  add :source, :string, null: false  # "auto_slug_change" | "auto_product_deleted" | "analytics_detected" | "admin"
   add :confidence, :float  # FTS5 match score for analytics_detected, nil otherwise
   add :hit_count, :integer, default: 0  # incremented each time this redirect fires
   timestamps()
@@ -201,6 +261,7 @@ Table of all redirects with columns: from path, to path, source (badge: auto/det
 
 Sources:
 - `auto_slug_change` — created automatically when sync detected a slug change. Trust these.
+- `auto_product_deleted` — created automatically when a product was removed. Targets the category collection page or `/`.
 - `analytics_detected` — created from analytics + FTS5 match. Show confidence score. Worth reviewing.
 - `admin` — manually created.
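The normalisation rules listed for the Plug in this plan can be checked in isolation, without a `Plug.Conn`. What follows is a minimal sketch under that assumption: `CanonicalPathSketch` and `check/2` are hypothetical names, not part of this change. It also collapses a slash-only path such as `//` to `/`, an edge case the plan does not address.

```elixir
defmodule CanonicalPathSketch do
  @moduledoc """
  Hypothetical helper (not part of the diff) mirroring the path
  normalisation rules described for the redirect Plug.
  """

  # Returns :ok when the path is already canonical, or
  # {:redirect, location} with the query string carried over unchanged.
  def check(path, query_string \\ "") do
    canonical = path |> strip_trailing_slash() |> String.downcase()

    if canonical == path do
      :ok
    else
      {:redirect, append_query(canonical, query_string)}
    end
  end

  # The root path keeps its slash; every other path loses trailing slashes.
  defp strip_trailing_slash("/"), do: "/"

  defp strip_trailing_slash(path) do
    case String.trim_trailing(path, "/") do
      # Assumption: a path made only of slashes collapses to the root
      "" -> "/"
      trimmed -> trimmed
    end
  end

  # Only the path is lowercased; the query string is appended as received.
  defp append_query(path, ""), do: path
  defp append_query(path, qs), do: path <> "?" <> qs
end
```

Keeping the rules in pure functions like this would let the trailing-slash, mixed-case and query-string cases from the test list be covered without building a connection.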
@@ -238,11 +299,31 @@ Redirects.create_auto({from: /products/old, to: /products/new})
 
         ─────
 
-Customer visits /products/old-slug
+Provider deletes product
+        ↓
+delete_product/1
+        ↓
+Look up product category before deletion
+        ↓
+Redirects.create_auto({from: /products/slug, to: /collections/category or /})
+        → ETS cache invalidated
+
+        ─────
+
+Any request hits the Plug
+        ↓
+1. Trailing slash? → 301 to canonical (preserving query params)
+2. Mixed case path? → 301 to lowercase (preserving query params)
+3. Redirect table match? → 301/302 to target (preserving query params)
+4. None of the above → pass through to router
+
+        ─────
+
+Customer visits /products/old-slug?Color=Sand
         ↓
 BerrypodWeb.Plugs.Redirects checks ETS cache
         ↓ hit
-301 → /products/new-slug
+301 → /products/new-slug?Color=Sand
 hit_count incremented
 
         ─────
@@ -275,6 +356,12 @@ Sees sorted list of broken URLs by prior traffic
 Enters destination → creates redirect
         ↓
 ETS cache warmed → Plug now catches future requests
+
+        ─────
+
+Weekly Oban cron
+        ↓
+Prune auto redirects with 0 hits older than 90 days
 ```
 
 ---
@@ -458,6 +545,23 @@ They complement each other. The redirect preserves SEO and visitor experience fo
 
 ---
 
+## Auto-pruning
+
+Auto-created redirects with zero hits are pruned after 90 days via a weekly Oban cron job. This prevents unbounded growth if products are renamed repeatedly.
+
+```elixir
+# Weekly cron: prune stale auto-redirects
+from(r in Redirect,
+  where: r.source in ["auto_slug_change", "auto_product_deleted"] and r.hit_count == 0,
+  where: r.inserted_at < ago(90, "day")
+)
+|> Repo.delete_all()
+```
+
+Redirects that have been used at least once are kept forever — they're demonstrably serving traffic. Manual (`admin`) and analytics-detected redirects are excluded from auto-pruning; the admin can delete them manually if needed.
+
+---
+
 ## Implementation notes
 
 **Slug change detection is safe to add with no behaviour change** for products that don't change slug. The `on_conflict: :nothing` insert ensures idempotency across repeated syncs.
@@ -480,10 +584,11 @@ They complement each other. The redirect preserves SEO and visitor experience fo
 - `lib/berrypod/redirects/redirect.ex` — schema
 - `lib/berrypod/redirects/broken_url.ex` — schema
 - `lib/berrypod/redirects.ex` — context: `lookup/1`, `create_auto/1`, `create_manual/1`, `warm_cache/0`, `invalidate_cache/1`, `increment_hit_count/1`, `list_broken_urls/0`, `record_broken_url/2`
-- `lib/berrypod_web/plugs/redirects.ex` — new Plug
-- `lib/berrypod/products.ex` — slug change detection in `upsert_product/2`
+- `lib/berrypod_web/plugs/redirects.ex` — new Plug (redirects + trailing slash + case normalisation)
+- `lib/berrypod/products.ex` — slug change detection in `upsert_product/2`, redirect on deletion in `delete_product/1`
 - `lib/berrypod_web/live/shop/error.ex` — hook analytics query on 404
 - `lib/berrypod_web/live/admin/redirects_live.ex` — new LiveView (3 tabs)
+- `lib/berrypod/workers/redirect_pruner_worker.ex` — weekly Oban cron for auto-pruning
 - Router — `/admin/redirects` route, ETS cache warm on startup
 - Admin nav — new sidebar link
 
@@ -491,10 +596,19 @@ They complement each other. The redirect preserves SEO and visitor experience fo
 
 - `upsert_product/2` with title change creates redirect automatically
 - `upsert_product/2` with no title change does not create redirect
+- `delete_product/1` creates redirect to category collection page
+- `delete_product/1` with no category creates redirect to `/`
 - Redirect Plug: matching path → 301, no match → passthrough
+- Redirect Plug: query string preserved on redirect (`?Color=Sand` survives)
+- Redirect Plug: trailing slash stripped (`/products/foo/` → `/products/foo`)
+- Redirect Plug: mixed case normalised (`/Products/Foo` → `/products/foo`)
+- Redirect Plug: root `/` trailing slash not stripped
 - Redirect Plug: ETS cache hit (no DB call)
 - 404 handler: path with analytics history → broken_url record created
 - 404 handler: path with no analytics history → nothing recorded
 - FTS5 auto-resolve: confident match → redirect created; no match → broken_url pending
 - Redirect chain flattening: A→B, new B→C → stored as A→C
 - `hit_count` incremented on each redirect fire
+- Auto-pruning: 0-hit auto redirects older than 90 days deleted
+- Auto-pruning: manual and analytics-detected redirects excluded
+- Auto-pruning: redirects with hits > 0 preserved regardless of age
diff --git a/lib/berrypod/application.ex b/lib/berrypod/application.ex
index 516bd84..939c49a 100644
--- a/lib/berrypod/application.ex
+++ b/lib/berrypod/application.ex
@@ -7,6 +7,10 @@ defmodule Berrypod.Application do
 
   @impl true
   def start(_type, _args) do
+    # Create ETS table here so the supervisor process owns it (lives forever).
+    # The Task below only warms it with data from the DB.
+    Berrypod.Redirects.create_table()
+
     children = [
       BerrypodWeb.Telemetry,
       Berrypod.Repo,
@@ -20,6 +24,8 @@ defmodule Berrypod.Application do
       Supervisor.child_spec({Task, &Berrypod.Mailer.load_config/0}, id: :load_email_config),
       {DNSCluster, query: Application.get_env(:berrypod, :dns_cluster_query) || :ignore},
       {Phoenix.PubSub, name: Berrypod.PubSub},
+      # Warm redirect cache from DB (table already created above)
+      Supervisor.child_spec({Task, &Berrypod.Redirects.warm_cache/0}, id: :redirect_cache),
       # Background job processing
       {Oban, Application.fetch_env!(:berrypod, Oban)},
       # Analytics: daily-rotating salt and ETS event buffer
diff --git a/lib/berrypod/products.ex b/lib/berrypod/products.ex
index dccc741..4eeb4fd 100644
--- a/lib/berrypod/products.ex
+++ b/lib/berrypod/products.ex
@@ -387,6 +387,20 @@ defmodule Berrypod.Products do
   Deletes a product.
   """
   def delete_product(%Product{} = product) do
+    # Create a redirect before deletion so the old URL doesn't 404
+    target =
+      if product.category do
+        "/collections/#{Slug.slugify(product.category)}"
+      else
+        "/"
+      end
+
+    Berrypod.Redirects.create_auto(%{
+      from_path: "/products/#{product.slug}",
+      to_path: target,
+      source: "auto_product_deleted"
+    })
+
     Repo.delete(product)
   end
 
@@ -419,9 +433,22 @@ defmodule Berrypod.Products do
         {:ok, product, :unchanged}
 
       product ->
+        old_slug = product.slug
+
         case update_product(product, attrs) do
-          {:ok, product} -> {:ok, product, :updated}
-          error -> error
+          {:ok, updated} ->
+            if old_slug != updated.slug do
+              Berrypod.Redirects.create_auto(%{
+                from_path: "/products/#{old_slug}",
+                to_path: "/products/#{updated.slug}",
+                source: "auto_slug_change"
+              })
+            end
+
+            {:ok, updated, :updated}
+
+          error ->
+            error
         end
     end
   end
diff --git a/lib/berrypod/products/product.ex b/lib/berrypod/products/product.ex
index 25e7bb7..4cdb7df 100644
--- a/lib/berrypod/products/product.ex
+++ b/lib/berrypod/products/product.ex
@@ -157,9 +157,22 @@ defmodule Berrypod.Products.Product do
   def compute_checksum(_), do: nil
 
   defp generate_slug_if_missing(changeset) do
-    case get_field(changeset, :slug) do
-      nil ->
-        title = get_change(changeset, :title) || get_field(changeset, :title)
+    slug_change = get_change(changeset, :slug)
+    title_change = get_change(changeset, :title)
+    current_slug = get_field(changeset, :slug)
+
+    cond do
+      # Explicit slug provided — use it as-is
+      slug_change != nil ->
+        changeset
+
+      # Title changed — regenerate slug to match
+      title_change != nil ->
+        put_change(changeset, :slug, Slug.slugify(title_change))
+
+      # No slug yet — generate from title
+      current_slug == nil ->
+        title = get_field(changeset, :title)
 
         if title do
           put_change(changeset, :slug, Slug.slugify(title))
@@ -167,7 +180,8 @@ defmodule Berrypod.Products.Product do
           changeset
         end
 
-      _ ->
+      # Slug exists and title didn't change — keep it
+      true ->
         changeset
     end
   end
diff --git a/lib/berrypod/redirects.ex b/lib/berrypod/redirects.ex
new file mode 100644
index 0000000..44d4ae3
--- /dev/null
+++ b/lib/berrypod/redirects.ex
@@ -0,0 +1,357 @@
+defmodule Berrypod.Redirects do
+  @moduledoc """
+  Manages URL redirects and broken URL tracking.
+
+  Redirects are cached in ETS for fast lookup on every request.
+  The cache is warmed on application start and invalidated on
+  any redirect create/update/delete.
+  """
+
+  import Ecto.Query
+  alias Berrypod.Repo
+  alias Berrypod.Redirects.{Redirect, BrokenUrl}
+
+  @table :redirects_cache
+  @pubsub_topic "redirects"
+
+  def subscribe do
+    Phoenix.PubSub.subscribe(Berrypod.PubSub, @pubsub_topic)
+  end
+
+  defp broadcast(message) do
+    Phoenix.PubSub.broadcast(Berrypod.PubSub, @pubsub_topic, message)
+  end
+
+  # ── ETS cache ──
+
+  def start_cache do
+    create_table()
+    warm_cache()
+  end
+
+  def create_table do
+    if :ets.whereis(@table) == :undefined do
+      :ets.new(@table, [:set, :public, :named_table, read_concurrency: true])
+    end
+
+    @table
+  end
+
+  def warm_cache do
+    redirects =
+      Repo.all(from r in Redirect, select: {r.from_path, {r.to_path, r.status_code, r.id}})
+
+    for {from_path, value} <- redirects do
+      :ets.insert(@table, {from_path, value})
+    end
+
+    :ok
+  end
+
+  defp invalidate_cache(from_path) do
+    :ets.delete(@table, from_path)
+  end
+
+  defp put_cache(from_path, to_path, status_code, id) do
+    :ets.insert(@table, {from_path, {to_path, status_code, id}})
+  end
+
+  # ── Lookup ──
+
+  @doc """
+  Looks up a redirect by path. Checks ETS cache first, falls back to DB.
+  """
+  def lookup(path) do
+    case :ets.lookup(@table, path) do
+      [{^path, {to_path, status_code, id}}] ->
+        {:ok, %{to_path: to_path, status_code: status_code, id: id}}
+
+      [] ->
+        case Repo.one(from r in Redirect, where: r.from_path == ^path) do
+          nil ->
+            :not_found
+
+          redirect ->
+            put_cache(redirect.from_path, redirect.to_path, redirect.status_code, redirect.id)
+
+            {:ok,
+             %{to_path: redirect.to_path, status_code: redirect.status_code, id: redirect.id}}
+        end
+    end
+  end
+
+  # ── Create ──
+
+  @doc """
+  Creates an automatic redirect (from sync events).
+
+  Flattens redirect chains: if the new redirect's `to_path` is itself
+  a `from_path` in an existing redirect, follows it to the final destination.
+  Also updates any existing redirects that point to the new `from_path`
+  to point directly to the final destination instead.
+
+  Uses `on_conflict: :nothing` so repeated sync calls are safe.
+  """
+  def create_auto(attrs) do
+    to_path = resolve_chain(attrs[:to_path] || attrs["to_path"])
+    attrs = Map.put(attrs, :to_path, to_path)
+    from_path = attrs[:from_path] || attrs["from_path"]
+
+    # Flatten any existing redirects that point to our from_path
+    flatten_incoming(from_path, to_path)
+
+    changeset = Redirect.changeset(%Redirect{}, attrs)
+
+    case Repo.insert(changeset, on_conflict: :nothing, conflict_target: :from_path) do
+      {:ok, redirect} ->
+        put_cache(redirect.from_path, redirect.to_path, redirect.status_code, redirect.id)
+        broadcast({:redirects_changed, :created})
+        {:ok, redirect}
+
+      error ->
+        error
+    end
+  end
+
+  @doc """
+  Creates a manual redirect (from admin UI).
+  """
+  def create_manual(attrs) do
+    attrs = Map.put(attrs, :source, "admin")
+    to_path = resolve_chain(attrs[:to_path] || attrs["to_path"])
+    attrs = Map.put(attrs, :to_path, to_path)
+    from_path = attrs[:from_path] || attrs["from_path"]
+
+    flatten_incoming(from_path, to_path)
+
+    changeset = Redirect.changeset(%Redirect{}, attrs)
+
+    case Repo.insert(changeset) do
+      {:ok, redirect} ->
+        put_cache(redirect.from_path, redirect.to_path, redirect.status_code, redirect.id)
+        broadcast({:redirects_changed, :created})
+        {:ok, redirect}
+
+      error ->
+        error
+    end
+  end
+
+  # Follow redirect chains to find the final destination
+  defp resolve_chain(path, seen \\ MapSet.new()) do
+    if MapSet.member?(seen, path) do
+      # Circular — stop here
+      path
+    else
+      case Repo.one(from r in Redirect, where: r.from_path == ^path, select: r.to_path) do
+        nil -> path
+        next -> resolve_chain(next, MapSet.put(seen, path))
+      end
+    end
+  end
+
+  # Update any redirects whose to_path matches old_to to point to new_to instead
+  defp flatten_incoming(old_to, new_to) do
+    from(r in Redirect, where: r.to_path == ^old_to)
+    |> Repo.update_all(set: [to_path: new_to])
+
+    # Refresh cache for any updated redirects
+    from(r in Redirect, where: r.to_path == ^new_to)
+    |> Repo.all()
+    |> Enum.each(fn r -> put_cache(r.from_path, r.to_path, r.status_code, r.id) end)
+  end
+
+  # ── Update / Delete ──
+
+  @doc """
+  Updates an existing redirect.
+  """
+  def update_redirect(%Redirect{} = redirect, attrs) do
+    changeset = Redirect.changeset(redirect, attrs)
+
+    case Repo.update(changeset) do
+      {:ok, updated} ->
+        # Old from_path may have changed
+        if redirect.from_path != updated.from_path do
+          invalidate_cache(redirect.from_path)
+        end
+
+        put_cache(updated.from_path, updated.to_path, updated.status_code, updated.id)
+        {:ok, updated}
+
+      error ->
+        error
+    end
+  end
+
+  @doc """
+  Deletes a redirect.
+  """
+  def delete_redirect(%Redirect{} = redirect) do
+    case Repo.delete(redirect) do
+      {:ok, deleted} ->
+        invalidate_cache(deleted.from_path)
+        broadcast({:redirects_changed, :deleted})
+        {:ok, deleted}
+
+      error ->
+        error
+    end
+  end
+
+  @doc """
+  Increments the hit count for a redirect.
+  """
+  def increment_hit_count(%{id: id}) do
+    from(r in Redirect, where: r.id == ^id)
+    |> Repo.update_all(inc: [hit_count: 1])
+  end
+
+  # ── Listing ──
+
+  @doc """
+  Lists all redirects, ordered by most recent first.
+  """
+  def list_redirects do
+    from(r in Redirect, order_by: [desc: r.inserted_at])
+    |> Repo.all()
+  end
+
+  @doc """
+  Gets a single redirect by ID.
+  """
+  def get_redirect!(id), do: Repo.get!(Redirect, id)
+
+  # ── Broken URLs ──
+
+  @doc """
+  Records or updates a broken URL entry.
+
+  If the path already exists, increments the 404 count and updates last_seen_at.
+  """
+  def record_broken_url(path, prior_hits) do
+    now = DateTime.utc_now() |> DateTime.truncate(:second)
+
+    result =
+      case Repo.one(from b in BrokenUrl, where: b.path == ^path) do
+        nil ->
+          %BrokenUrl{}
+          |> BrokenUrl.changeset(%{
+            path: path,
+            prior_analytics_hits: prior_hits,
+            first_seen_at: now,
+            last_seen_at: now
+          })
+          |> Repo.insert()
+
+        %{status: status} when status in ["ignored", "resolved"] ->
+          {:ok, :skipped}
+
+        existing ->
+          existing
+          |> BrokenUrl.changeset(%{
+            recent_404_count: existing.recent_404_count + 1,
+            last_seen_at: now
+          })
+          |> Repo.update()
+      end
+
+    case result do
+      {:ok, %BrokenUrl{}} -> broadcast({:broken_urls_changed, path})
+      _ -> :ok
+    end
+
+    result
+  end
+
+  @doc """
+  Lists broken URLs, sorted by prior analytics hits (highest impact first).
+  """
+  def list_broken_urls(status \\ "pending") do
+    from(b in BrokenUrl,
+      where: b.status == ^status,
+      order_by: [desc: b.prior_analytics_hits, desc: b.recent_404_count]
+    )
+    |> Repo.all()
+  end
+
+  @doc """
+  Resolves a broken URL by creating a redirect and updating the record.
+  """
+  def resolve_broken_url(%BrokenUrl{} = broken_url, to_path) do
+    case create_manual(%{from_path: broken_url.path, to_path: to_path}) do
+      {:ok, redirect} ->
+        broken_url
+        |> BrokenUrl.changeset(%{status: "resolved", resolved_redirect_id: redirect.id})
+        |> Repo.update()
+
+      error ->
+        error
+    end
+  end
+
+  @doc """
+  Marks a broken URL as ignored.
+  """
+  def ignore_broken_url(%BrokenUrl{} = broken_url) do
+    result =
+      broken_url
+      |> BrokenUrl.changeset(%{status: "ignored"})
+      |> Repo.update()
+
+    case result do
+      {:ok, _} -> broadcast({:broken_urls_changed, broken_url.path})
+      _ -> :ok
+    end
+
+    result
+  end
+
+  @doc """
+  Marks a broken URL as resolved (e.g. after creating a redirect for it).
+  """
+  def mark_broken_url_resolved(%BrokenUrl{} = broken_url) do
+    result =
+      broken_url
+      |> BrokenUrl.changeset(%{status: "resolved"})
+      |> Repo.update()
+
+    case result do
+      {:ok, _} -> broadcast({:broken_urls_changed, broken_url.path})
+      _ -> :ok
+    end
+
+    result
+  end
+
+  @doc """
+  Gets a broken URL by ID.
+  """
+  def get_broken_url!(id), do: Repo.get!(BrokenUrl, id)
+
+  @doc """
+  Gets a pending broken URL by path, or nil.
+  """
+  def get_broken_url_by_path(path) do
+    Repo.one(from b in BrokenUrl, where: b.path == ^path and b.status == "pending")
+  end
+
+  # ── Pruning ──
+
+  @doc """
+  Prunes auto-created redirects with zero hits older than the given number of days.
+  """
+  def prune_stale_redirects(max_age_days \\ 90) do
+    {count, _} =
+      from(r in Redirect,
+        where: r.source in ["auto_slug_change", "auto_product_deleted"] and r.hit_count == 0,
+        where: r.inserted_at < ago(^max_age_days, "day")
+      )
+      |> Repo.delete_all()
+
+    # Clear the table before re-warming: warm_cache/0 only inserts, so pruned rows would linger
+    if count > 0, do: :ets.delete_all_objects(@table) && warm_cache()
+
+    {:ok, count}
+  end
+end
diff --git a/lib/berrypod/redirects/broken_url.ex b/lib/berrypod/redirects/broken_url.ex
new file mode 100644
index 0000000..ad74631
--- /dev/null
+++ b/lib/berrypod/redirects/broken_url.ex
@@ -0,0 +1,38 @@
+defmodule Berrypod.Redirects.BrokenUrl do
+  use Ecto.Schema
+  import Ecto.Changeset
+
+  @primary_key {:id, :binary_id, autogenerate: true}
+  @foreign_key_type :binary_id
+
+  @statuses ~w(pending resolved ignored)
+
+  schema "broken_urls" do
+    field :path, :string
+    field :prior_analytics_hits, :integer, default: 0
+    field :recent_404_count, :integer, default: 1
+    field :first_seen_at, :utc_datetime
+    field :last_seen_at, :utc_datetime
+    field :status, :string, default: "pending"
+
+    belongs_to :resolved_redirect, Berrypod.Redirects.Redirect
+
+    timestamps()
+  end
+
+  def changeset(broken_url, attrs) do
+    broken_url
+    |> cast(attrs, [
+      :path,
+      :prior_analytics_hits,
+      :recent_404_count,
+      :first_seen_at,
+      :last_seen_at,
+      :status,
+      :resolved_redirect_id
+    ])
+    |> validate_required([:path, :first_seen_at, :last_seen_at])
+    |> validate_inclusion(:status, @statuses)
+    |> unique_constraint(:path)
+  end
+end
diff --git a/lib/berrypod/redirects/redirect.ex b/lib/berrypod/redirects/redirect.ex
new file mode 100644
index 0000000..56939dc
--- /dev/null
+++ b/lib/berrypod/redirects/redirect.ex
@@ -0,0 +1,31 @@
+defmodule Berrypod.Redirects.Redirect do
+  use Ecto.Schema
+  import Ecto.Changeset
+
+  @primary_key {:id, :binary_id, autogenerate: true}
+  @foreign_key_type :binary_id
+
+  @sources ~w(auto_slug_change auto_product_deleted analytics_detected admin)
+
+  schema "redirects" do
+    field :from_path, :string
+    field :to_path, :string
+    field :status_code, :integer, default: 301
+    field :source, :string
+    field :confidence, :float
+    field :hit_count, :integer, default: 0
+
+    timestamps()
+  end
+
+  def changeset(redirect, attrs) do
+    redirect
+    |> cast(attrs, [:from_path, :to_path, :status_code, :source, :confidence])
+    |> validate_required([:from_path, :to_path, :source])
+    |> validate_inclusion(:source, @sources)
+    |> validate_inclusion(:status_code, [301, 302])
+    |> validate_format(:from_path, ~r"^/", message: "must start with /")
+    |> validate_format(:to_path, ~r"^/", message: "must start with /")
+    |> unique_constraint(:from_path)
+  end
+end
diff --git a/lib/berrypod/workers/redirect_pruner_worker.ex b/lib/berrypod/workers/redirect_pruner_worker.ex
new file mode 100644
index 0000000..91b787a
--- /dev/null
+++ b/lib/berrypod/workers/redirect_pruner_worker.ex
@@ -0,0 +1,20 @@
+defmodule Berrypod.Workers.RedirectPrunerWorker do
+  @moduledoc """
+  Weekly Oban cron job that prunes auto-created redirects
+  with zero hits older than 90 days.
+  """
+
+  use Oban.Worker, queue: :default, max_attempts: 1
+
+  @impl Oban.Worker
+  def perform(_job) do
+    {:ok, count} = Berrypod.Redirects.prune_stale_redirects()
+
+    if count > 0 do
+      require Logger
+      Logger.info("Pruned #{count} stale redirect(s) with 0 hits")
+    end
+
+    :ok
+  end
+end
diff --git a/lib/berrypod_web/components/layouts/admin.html.heex b/lib/berrypod_web/components/layouts/admin.html.heex
index b33c257..39bf749 100644
--- a/lib/berrypod_web/components/layouts/admin.html.heex
+++ b/lib/berrypod_web/components/layouts/admin.html.heex
@@ -118,6 +118,14 @@
 <.icon name="hero-envelope" class="size-5" /> Email
+
No redirects yet.
+ <% else %>

| From | To | Source | Hits | Created | |
|---|---|---|---|---|---|
| {redirect.from_path} | {redirect.to_path} | {redirect.source} | {redirect.hit_count} | {Calendar.strftime(redirect.inserted_at, "%d %b %Y")} | |

No broken URLs detected.
+ <% else %>

| Path | Prior traffic | 404s | First seen | Last seen | |
|---|---|---|---|---|---|
| {broken_url.path} | {broken_url.prior_analytics_hits} | {broken_url.recent_404_count} | {Calendar.strftime(broken_url.first_seen_at, "%d %b %Y")} | {Calendar.strftime(broken_url.last_seen_at, "%d %b %Y")} | |

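
Resolving a broken URL goes through `create_manual/1`, which, like `create_auto/1`, flattens redirect chains before inserting. The block below is a minimal in-memory sketch of that flattening rule, using a plain map in place of the `redirects` table. `RedirectChainSketch` and its functions are hypothetical names for illustration; the sketch makes no database or ETS calls and is not the `Berrypod.Redirects` implementation.

```elixir
defmodule RedirectChainSketch do
  @moduledoc """
  Hypothetical stand-in for the redirects table: a map of
  from_path => to_path, used only to illustrate chain flattening.
  """

  # Follow from -> to links until the path has no further redirect.
  # `seen` guards against circular chains: stop at the first repeated path.
  def resolve(redirects, path, seen \\ MapSet.new()) do
    if MapSet.member?(seen, path) do
      path
    else
      case Map.fetch(redirects, path) do
        {:ok, next} -> resolve(redirects, next, MapSet.put(seen, path))
        :error -> path
      end
    end
  end

  # Add from -> to so that no stored redirect needs a second hop.
  def add(redirects, from, to) do
    final = resolve(redirects, to)

    redirects
    # Re-point existing redirects that targeted `from` at the final destination
    |> Map.new(fn {f, t} -> {f, if(t == from, do: final, else: t)} end)
    # Like on_conflict: :nothing in create_auto/1, an existing entry for `from` is kept
    |> Map.put_new(from, final)
  end
end
```

With this model, adding A→B and then B→C leaves both A and B pointing directly at C, which is the "A→B, new B→C → stored as A→C" case from the plan's test list.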