I rolled out magic-link sign-in to replace passwords in the portal earlier this year. The happy path took an afternoon — Rails's generates_token_for, a mailer on top of Devise, two controller actions. The edge cases took two months.

If you're walking into magic-link auth for the first time, this is what I'd tell you. Six distinct production fires, each one with the symptom users saw, the root cause I eventually traced, the fix, and the lesson it left behind. Code samples are Rails + Devise, but the failure modes are universal. Anyone shipping passwordless auth in any stack will hit at least four of these.

A small note on the format: this is a war diary, not a recipe. The fires are ordered logically rather than chronologically, and you'll notice that Fires 3 and 6 both follow Fire 1 — because they were caused by its fix. That ordering is part of the value. The "fix one thing, break two more" pattern is real, and it deserves to be on the page.


Part 1: The setup

A quick context before the fires. The portal is a real-estate investment platform. Investors sign in infrequently — weeks or months between sessions. Session recordings showed the password step was the most-failed step in the entire flow: people forgot passwords, fought the strength meter, abandoned the form altogether. Magic links were the lower-friction alternative. Of course there were trade-offs: I'd be shifting the failure surface from "remember password" to "receive and click email." But the failure surface I already had was clearly worse than the one I was trading into. At least, that's what I thought at the time.

The base implementation is small. generates_token_for :magic_login (a Rails 7.1+ primitive) on the Person model produces a signed token tying a few model columns, a purpose, and a TTL into a URL-safe string. A mailer sends the URL. A controller has two actions: one to request the link (POST), one to verify and sign in (originally a single GET). The flow at a glance:

Sequence diagram of the baseline magic-link flow with four actors (User, Browser, App, Inbox). The User clicks "log in with magic link" in the Browser, which POSTs to the App. The App sends an email to the Inbox and returns an "Email sent" page. The User clicks the "Sign in" CTA in the email, the Browser issues a GET /verify, and the App signs the user in and redirects to the dashboard.

And the three files in their original form (this is the baseline state — before any of the fires):

# app/models/person.rb
class Person < ApplicationRecord
  devise :database_authenticatable, :confirmable, :trackable, :rememberable

  generates_token_for :magic_login, expires_in: 15.minutes do
    email
    # ... and a few other columns; we'll come back to this in Fire 1
  end
end

# app/mailers/magic_link_mailer.rb
def send_link(person)
  @url = magic_link_verify_url(
    token: person.generate_token_for(:magic_login)
  )
  mail(to: person.email, subject: "Your sign-in link")
end

# app/controllers/people/magic_links_controller.rb
def create
  MagicLinkService.request(params[:email])
  flash[:magic_link_email] = params[:email]
  redirect_to sent_magic_link_path
end

def sent
  @email = flash[:magic_link_email]
end

def verify
  if (person = MagicLinkService.authenticate(params[:token]))
    sign_in(person)
    redirect_to dashboard_path
  else
    redirect_to expired_magic_link_path
  end
end

Roughly fifty lines across three files. That's the entire baseline before any of the fires.

A quick word about Devise, since it'll keep showing up: it's a long-established authentication library in the Rails ecosystem — widely used, around for over a decade. Authentication is delivered as a stack of opt-in modules that you include on your user model: one for password-based sign-in, one for session management, one for tracking who signed in and when, one for confirming email addresses on sign-up, and several more. Each module wires up the controllers, routes, mailers, and views for the flow it owns. You compose the set you want; Devise does the rest.

Two of those modules were already in place on the Person model from the password era, and they'll be relevant later, so I want to name them upfront:

  • :trackable updates sign_in_count, current_sign_in_at, last_sign_in_at, current_sign_in_ip, and last_sign_in_ip on every successful sign-in.
  • :confirmable requires a confirmed email address before sign-in is allowed.

Both modules look orthogonal to magic links. They're not. Each one will cause a separate fire. Keep them in mind as the diagnoses unfold.

The implementation went out behind a feature flag, ran cleanly in staging for a week, and shipped to production. The first fire arrived two days later.


Part 2: The six fires

Each fire follows the same micro-template: Symptom (what users or metrics saw), Diagnosis (root cause and the non-obvious mechanism), Fix (code and brief explanation), Lesson (one paragraph of takeaway).

Fire 1: Email prefetchers were burning valid tokens

Symptom. Within 48 hours of the rollout, support tickets arrived describing the same failure: users clicked the magic link in their email and landed on "Link expired", even though the click was within the 15-minute TTL I'd set. The reports clustered around two populations — executives on Outlook with Mimecast in front, and a handful of Gmail users. I couldn't reproduce it on a Gmail consumer account in QA.

Diagnosis. Two layers stacked on top of each other:

  1. Corporate email gateways — Outlook Safe Links, Mimecast, Proofpoint — issue GET requests against every link in incoming emails for malware scanning. Before the user has even opened the email. Some consumer providers do the same: Gmail in particular pre-fetches links for some users from rotating cloud IPs. There is no reliable way to distinguish a scanner from a real browser. User-Agent strings get spoofed, IP ranges overlap with legitimate traffic, and the more aggressive scanners now run headless browsers. You can't UA-sniff your way out of this.

  2. The GET /magic_link/verify action was the one that signed the user in. Successful sign-in triggered Devise's :trackable, which mutated last_sign_in_at. And — here's the trap — last_sign_in_at happens to be one of the columns generates_token_for includes in the signature. The signature no longer matched the previously-issued token. The single-use token was burned by the scanner; by the time the human clicked, the link was already invalid.

The non-obvious bit is that second layer. generates_token_for looks simple from the outside — it signs a payload. But the payload is implicit: it's whatever columns you declared in the generates_token_for block. If your auth side effects touch any of those columns, your tokens are self-invalidating. And :trackable touches one of them on every sign-in, by default, silently. The two modules look orthogonal; they're not.
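
To make that mechanism concrete, here is a minimal plain-Ruby sketch of a token that embeds a fingerprint of the record. It is not how Rails implements generates_token_for internally: the HMAC scheme, the Person struct, and the helper names (fingerprint, issue_token, token_valid?) are all assumptions made for illustration. The only point is that once a column in the fingerprint changes, every token issued before the change stops verifying.

```ruby
require "openssl"
require "json"

SECRET = "demo-secret" # stand-in for the app's real signing secret

# Hypothetical stand-in for a Person row; only the columns the sketch needs.
Person = Struct.new(:id, :email, :last_sign_in_at, keyword_init: true)

# Plays the role of the block passed to generates_token_for.
def fingerprint(person)
  [person.email, person.last_sign_in_at.to_s]
end

def sign(data)
  OpenSSL::HMAC.hexdigest("SHA256", SECRET, data)
end

def issue_token(person, now:, ttl: 15 * 60)
  payload = JSON.generate(
    "id" => person.id, "exp" => now.to_i + ttl, "fp" => fingerprint(person)
  )
  encoded = [payload].pack("m0") # base64 without newlines
  "#{encoded}.#{sign(encoded)}"
end

# True only if the signature is intact, the token is unexpired, and the
# embedded fingerprint still matches the record as it looks right now.
# (A production check would compare signatures in constant time.)
def token_valid?(token, person, now:)
  encoded, signature = token.split(".", 2)
  return false unless signature && sign(encoded) == signature

  data = JSON.parse(encoded.unpack1("m0"))
  return false if now.to_i >= data["exp"]

  data["fp"] == fingerprint(person)
end

t0 = Time.utc(2024, 1, 1, 12, 0, 0)
person = Person.new(id: 1, email: "investor@example.com", last_sign_in_at: nil)
token = issue_token(person, now: t0)

before_scanner = token_valid?(token, person, now: t0 + 30)

# A scanner's GET signs the user in; :trackable-style bookkeeping runs.
person.last_sign_in_at = t0 + 30

after_scanner = token_valid?(token, person, now: t0 + 60)
puts "valid before scanner: #{before_scanner}, after: #{after_scanner}"
```

The token is still inside its TTL on the second check. It fails only because the record it was fingerprinted against has moved on.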

Fix. Split the verify endpoint into two actions:

  • GET /magic_link/verify becomes fully idempotent. It renders a branded interstitial page with a "Continue" button. It does not call the auth service, does not sign_in, does not mutate the Person. It emits Cache-Control: no-store and Referrer-Policy: no-referrer (I'll revisit that second one in Fire 3). Safe for any prefetcher to hit.
  • POST /magic_link/verify is the actual authentication. Triggered by a Stimulus controller that auto-submits the form ~400 ms after the page becomes visible. The scanners I was dealing with issue GETs and don't submit forms, so in practice the POST only fired for real users.
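
The split can be sketched without Rails as a tiny endpoint object that returns Rack-style [status, headers, body] triples. Everything here is an assumption for illustration: the VerifyEndpoint class, the in-memory token store, and the /dashboard and /expired paths. What it shows is that the GET branch touches no state, so repeated scanner hits leave the token intact for the human's POST.

```ruby
class VerifyEndpoint
  attr_reader :signed_in

  def initialize(valid_tokens)
    @unused = valid_tokens.dup # token => email, hypothetical in-memory store
    @signed_in = []
  end

  # Returns a Rack-style [status, headers, body] triple.
  def call(http_method, token)
    case http_method
    when "GET"
      # Renders the interstitial and mutates nothing, so any prefetcher
      # can hit it as many times as it likes.
      [200, { "Cache-Control" => "no-store" }, "interstitial with Continue button"]
    when "POST"
      email = @unused.delete(token) # single use: consumed here and only here
      if email
        @signed_in << email
        [303, { "Location" => "/dashboard" }, ""]
      else
        [303, { "Location" => "/expired" }, ""]
      end
    else
      [405, { "Allow" => "GET, POST" }, ""]
    end
  end
end

endpoint = VerifyEndpoint.new({ "abc123" => "investor@example.com" })

3.times { endpoint.call("GET", "abc123") } # scanner traffic
status, headers, _body = endpoint.call("POST", "abc123") # the human's form submit
puts "#{status} #{headers["Location"]}"
```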

The shape of the verify flow changes accordingly. Where the baseline had one GET /verify that signed the user in, there are now three steps in its place: the GET renders an inert page, the browser auto-submits a POST, and the POST does the actual authentication:

Sequence diagram of the magic-link flow after Fire 1. Same four actors as the baseline, but the verify step is now three messages instead of one: the Browser GETs /verify and receives the interstitial page (no auth, idempotent), a note above the Browser lifeline shows that JS auto-submits ~400 ms after the page becomes visible, then the Browser POSTs /verify and the App signs in and redirects to the dashboard.

The Stimulus controller driving the auto-submit is small. The visibilitychange listener handles cases where the link is opened in a background tab — I want the auto-submit to fire when the user actually looks at the page, not before:

// magic_link_auto_submit_controller.js
import { Controller } from "@hotwired/stimulus"

export default class extends Controller {
  static targets = ["form"]

  connect() {
    if (document.visibilityState === "visible") {
      this.scheduleSubmit()
    } else {
      document.addEventListener("visibilitychange", this.onVisible)
    }
  }

  disconnect() {
    document.removeEventListener("visibilitychange", this.onVisible)
    if (this.timer) clearTimeout(this.timer)
  }

  onVisible = () => {
    if (document.visibilityState === "visible") this.scheduleSubmit()
  }

  scheduleSubmit() {
    this.timer = setTimeout(() => this.formTarget.requestSubmit(), 400)
  }
}

The visible "Continue" button stays in the page as a no-JS and accessibility fallback. Real users with JS enabled rarely need to click it, because the form auto-submits in under half a second.

Lesson. Any side effect on a GET endpoint will be triggered by a scanner before the user clicks. If your action is non-idempotent — sign-in, confirmation, single-use token consumption, anything that touches state — it has to live behind a POST that JavaScript triggers. And when you use generates_token_for, audit which columns are in the signature against which columns your auth flow mutates. The interaction is silent and the failure mode is invisible from logs, unless you happen to correlate signed-token TTL with scanner traffic on the same URL. I didn't, and that's how I learned.

Fire 2: The confirmation step was redundant

To understand this one, a bit of backstory on :confirmable is needed.

When Devise's :confirmable module is enabled on a model, it does two things automatically. First, whenever a record is created with an email address, it fires off a confirmation email containing a unique link. Second — and this is the load-bearing part — it blocks sign-in for any record whose confirmed_at column is NULL. Even if the user authenticates correctly, Devise rejects the session and routes them to a "please confirm your email" page. The user is meant to click the link in the confirmation email, which sets confirmed_at, and only then can they sign in.

That's the flow that was already in place before magic links existed, designed around password-based sign-in. New investor → confirmation email → click link → password flow available.

Symptom. When magic links shipped, that confirmation flow stayed exactly where it was — I didn't touch :confirmable because, on paper, the two features were unrelated. Now imagine the new shape of the world: an operator creates an investor's account from the back office, and the investor's inbox receives two emails — the auto-fired Devise confirmation email, and (depending on how the account was provisioned) a magic-link email asking them to log in. Two emails, both legitimate, both pointing at the portal.

The investor does the natural thing: they pick the more obvious one, the magic link, and click it.

The magic-link verification itself succeeds. The token signature is valid, Devise authenticates the request. But then :confirmable steps in and refuses to let the session proceed, because confirmed_at is still NULL on that Person. Instead of the portal, the investor lands on the "please confirm your email" wall — having just clicked a magic link in that exact email a second ago. To actually get in, they have to dig back into their inbox, find the other email (the confirmation one), and click its link. Operators ended up walking users through this in support calls, repeatedly.

Diagnosis. This isn't a bug in the usual sense. It's two correct modules that don't know about each other. :confirmable was designed for a world where the only proof of mailbox control happens at sign-up time, via its own dedicated email. Its mental model is "the email is unverified until my link is clicked." Magic links, meanwhile, verify exactly the same property on every login: clicking a link delivered to an inbox proves the user controls that inbox. The two features are doing the same job, by the same mechanism, with the same level of assurance.

The trouble is that :confirmable has no way of knowing that. From its perspective, the magic-link click is invisible — it's a different controller, a different code path, a different model column. So it keeps blocking access until its own link is clicked. The user is being asked to prove the same thing twice, using two different emails, because the framework can't tell that they're equivalent.

Fix. Acknowledge the equivalence explicitly. On the magic-link POST handler, after a successful token verification, mark the person as confirmed if they aren't already, then proceed with sign-in:

person.confirm unless person.confirmed?
sign_in(person)

person.confirm is a Devise-provided method from :confirmable itself. It sets confirmed_at to the current time and saves the record. Calling it manually from the magic-link flow is the bridge between the two modules: it tells :confirmable "yes, mailbox control has been proven, even though it wasn't via your own link." Now when sign_in(person) runs, :confirmable sees confirmed_at populated and gets out of the way.

Existing confirmation-link emails still work for users who happen to click them first — the password flow is untouched. They're just no longer mandatory for users coming in via magic link.

Lesson. Magic-link click is functionally equivalent to email confirmation. They verify the same property by the same mechanism, but the framework treats them as independent because the modules don't talk to each other. When you bolt magic links onto an existing :confirmable flow, collapse the two steps — otherwise you're charging the user twice for the same proof, and the second email feels like an arbitrary hurdle they've already cleared. It's the kind of decision that's only obvious in retrospect, because the documentation for each module is written as if the other doesn't exist.

Fire 3: The fix to Fire 1 broke CSRF for everyone

Symptom. Hours after the Fire 1 fix went live, telemetry showed every magic-link POST returning HTTP 422 with ActionController::InvalidAuthenticityToken. Nobody could sign in. Not a population — everyone. The test suite was green. Staging had passed manual QA.

Diagnosis. The interstitial set Referrer-Policy: no-referrer as part of the Fire 1 hardening — I didn't want the token-bearing URL leaking into Referer headers on the auto-submitted POST. That part was correct.

What I missed: modern browsers translate Referrer-Policy: no-referrer not just into a stripped Referer header, but also into an Origin: null value on the subsequent POST. And Rails' default forgery_protection_origin_check validates that Origin matches the host. null matches nothing. The CSRF token in the form was correct; the origin check was what failed.
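
A simplified model of that interaction, in plain Ruby. This is a sketch of the behavior as I understand it, not Rails' source: the method names, the error class, and the two-policy browser model are assumptions for illustration.

```ruby
class InvalidAuthenticityToken < StandardError; end

# Simplified origin check: a missing Origin header is tolerated, a literal
# "null" is rejected loudly, anything else must equal the app's base URL.
def valid_request_origin?(origin, base_url)
  raise InvalidAuthenticityToken, "browser sent Origin: null" if origin == "null"

  origin.nil? || origin == base_url
end

# What the browser puts in Origin on a same-origin form POST.
# Only the two policies discussed in this post are modeled.
def origin_header_for(referrer_policy, page_origin)
  referrer_policy == "no-referrer" ? "null" : page_origin
end

base_url = "https://portal.example.com" # hypothetical host

results = %w[no-referrer strict-origin].to_h do |policy|
  origin = origin_header_for(policy, base_url)
  outcome =
    begin
      valid_request_origin?(origin, base_url) ? :accepted : :rejected
    rescue InvalidAuthenticityToken
      :rejected
    end
  [policy, outcome]
end

results.each { |policy, outcome| puts "#{policy}: #{outcome}" }
```

The form's CSRF token never enters the picture here, which is why a perfectly valid token could not save the request.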

The reason the test suite missed it was equally subtle: config/environments/test.rb had config.action_controller.allow_forgery_protection = false — a sensible default for unit and feature specs that aren't testing CSRF. Even the :js feature spec driving real Chromium through the full GET → POST cycle passed, because CSRF enforcement was off in test.

Fix. A one-line change:

# app/controllers/people/magic_links_controller.rb
response.headers["Referrer-Policy"] = "strict-origin"

With strict-origin, the browser sends Origin (scheme + host + port only, with no path and no query string) on the POST. The token-bearing URL still does not leak via Referer, and forgery_protection_origin_check has the value it needs. strict-origin sends the bare origin at most, and nothing at all on an HTTPS-to-HTTP downgrade, which makes it a sensible default for any page that holds a sensitive token.

The accompanying regression spec opts into CSRF locally instead of flipping the global setting:

# spec/requests/people/magic_links_spec.rb
RSpec.describe "magic link CSRF", type: :request do
  around do |example|
    original = ActionController::Base.allow_forgery_protection
    ActionController::Base.allow_forgery_protection = true
    example.run
  ensure
    ActionController::Base.allow_forgery_protection = original
  end

  it "rejects POSTs with Origin: null" do
    # ...
  end

  it "accepts POSTs with the Origin a real browser sends" do
    # ...
  end
end

I now apply this pattern — opt-in CSRF on the specific endpoint, in its own request spec — prophylactically to every auth-related controller in the codebase. Flipping CSRF on globally in the test env would have been too invasive; opt-in covers the security-critical paths without rewriting the rest of the suite.

Lesson. CSRF in Rails depends on both the token in the form and the Origin header in the request. Referrer-Policy shapes the Origin header as well as Referer. If you set no-referrer on a page that POSTs to its own host, you break your own CSRF check. strict-origin is the right knob: it strips the token-bearing path from Referer while keeping Origin intact. And if your test environment globally disables CSRF, you need opt-in CSRF specs on auth-critical endpoints. Otherwise the regression that took down sign-in in production will pass every test you have. Mine did.

Fire 4: The user's email rendered as a stray flash message

Symptom. On the "Email sent" confirmation page (rendered right after the user requests a magic link), the email address rendered twice: once inside the instructions paragraph ("we've sent a link to user@example.com") and once as an unlabeled standalone line above the notice. Visually it looked like a UI bug. Functionally it was harmless. Aesthetically it was embarrassing.

┌──────────────────────────────────────────────────────┐
│                                                      │
│  user@example.com                          ← [1]     │
│                                                      │
│  Link sent                                           │
│                                                      │
│  If your email is registered, we've sent             │
│  a sign-in link to user@example.com.       ← [2]     │
│                                                      │
└──────────────────────────────────────────────────────┘

  [1] Stray flash — picked up unintentionally by the
      layout's generic flash partial.
  [2] Intended use — rendered inside the instructions
      paragraph by the view.

Diagnosis. The request controller used the flash as a data carrier across the post-redirect:

# magic_links_controller.rb
def create
  MagicLinkService.request(params[:email])
  flash[:magic_link_email] = params[:email]
  redirect_to sent_magic_link_path
end

def sent
  @email = flash[:magic_link_email]
end

That's a perfectly idiomatic use of the flash — except for one detail. The shared layout has a generic flash partial that iterates over every flash key and renders each one as a .flash-message:

<%# layouts/_devise_flash.html.erb %>
<% flash.each do |type, message| %>
  <div class="flash-message flash-message--<%= type %>"><%= message %></div>
<% end %>

So the email was being picked up twice: once intentionally inside @email in the view, once unintentionally by the layout's flash iteration.

Fix. Read the value from the flash and then immediately remove it before the view renders:

def sent
  @email = flash[:magic_link_email]
  flash.delete(:magic_link_email)
end

The cross-request carrier semantics are preserved — the email still arrives from the redirect, the page still shows it inside the instructions paragraph, the regression test that asserts "email must come from the immediate redirect, not from a stale session" still passes. The generic flash partial no longer sees the key.

Lesson. The flash is shared infrastructure. Anything you put in it is fair game for any layout code that iterates the whole hash — and most layouts do, sooner or later. If you need a one-shot carrier that the layout shouldn't render, delete it after reading. The alternative is to teach the layout partial a deny-list, which is brittle and pushes the coupling in the wrong direction. Better: keep the producer responsible for cleaning up.

Fire 5: A 15-minute TTL was too short for corporate inboxes

Symptom. Six weeks after the rollout — once Fires 1, 2, 3, and 4 were behind me — a senior executive at a corporate client wrote in saying his magic links always expired before he could click. He was on Outlook with Mimecast in front. My first suspicion was that Fire 1 had regressed somehow. New Relic showed something worse: across the entire user population, roughly 40% of magic-link clicks were landing on /expired.

Diagnosis. Not a regression. Fire 1's GET/POST split was intact and the GET endpoint was genuinely idempotent — scanners weren't burning tokens. The cause was simpler, and harder to fix: 15 minutes is just not enough time for corporate email to deliver, scan, route, quarantine-review, and surface a link to a human. Especially a busy human who reads email in batches twice a day.

I'd picked 15 minutes from the security side of the trade-off — short enough to limit the blast radius of a leaked link, long enough for a focused B2C user to click. What I hadn't done was calibrate against B2B inboxes. A Mimecast quarantine review can add 5–10 minutes on its own. An executive who reads email twice a day will routinely consume the TTL before opening the message at all.

Fix. Bump the TTL. From 15 minutes to 60 minutes, via the single constant that feeds everything else:

# app/models/person.rb
MAGIC_LINK_EXPIRY = 60.minutes

generates_token_for :magic_login, expires_in: MAGIC_LINK_EXPIRY do
  # ...
end

The constant propagates automatically to the token's lifetime, the mailer body ("60 minutes" derived from the constant, so future bumps don't desync), and the rate-limit window. That last one is worth calling out: I intentionally coupled the magic-link rate limit to MAGIC_LINK_EXPIRY. If the rate-limit window were longer than the token's lifetime, a user could stay rate-limited after every link they hold has expired, locked out with nothing left to click. If it were shorter, the number of simultaneously valid links could drift above the request cap. Either way it's confusing UX, and probably a footgun for the next person reading the code.

# app/services/magic_link_service.rb
RATE_LIMIT_WINDOW = Person::MAGIC_LINK_EXPIRY
RATE_LIMIT_REQUESTS = 3
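
A plain-Ruby sketch of the same idea, with one constant feeding the mailer copy and the limiter. The names with a _SECONDS suffix, the expiry_copy helper, and the in-memory MagicLinkRateLimiter are hypothetical; the production code uses ActiveSupport durations and whatever store backs the real rate limit.

```ruby
MAGIC_LINK_EXPIRY_SECONDS = 60 * 60 # plain-Ruby stand-in for 60.minutes

# Both settings derive from the single constant above.
RATE_LIMIT_WINDOW_SECONDS = MAGIC_LINK_EXPIRY_SECONDS
RATE_LIMIT_REQUESTS = 3

# Mailer copy computed from the constant, so it can't drift out of sync.
def expiry_copy
  "This link expires in #{MAGIC_LINK_EXPIRY_SECONDS / 60} minutes."
end

# Hypothetical in-memory sliding-window limiter keyed by email.
class MagicLinkRateLimiter
  def initialize
    @requests = Hash.new { |hash, key| hash[key] = [] }
  end

  # Allows a request if fewer than RATE_LIMIT_REQUESTS were made by this
  # email inside the trailing window; records the request when allowed.
  def allow?(email, now: Time.now)
    recent = @requests[email].select { |at| now - at < RATE_LIMIT_WINDOW_SECONDS }
    allowed = recent.size < RATE_LIMIT_REQUESTS
    recent << now if allowed
    @requests[email] = recent
    allowed
  end
end

limiter = MagicLinkRateLimiter.new
start = Time.utc(2024, 1, 1, 9, 0, 0)
attempts = 4.times.map { |i| limiter.allow?("investor@example.com", now: start + i) }
puts expiry_copy
puts attempts.inspect
```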

The constant change also surfaced a test-hygiene improvement. Two specs had hardcoded the string 15 minutes. They now read from Person::MAGIC_LINK_EXPIRY directly, so a future TTL change doesn't require touching the tests.

Post-deploy, the /expired landing rate dropped sharply, settling below 10%. Most of the residual was users who genuinely opened email more than an hour late, which is unrecoverable by TTL alone.

Lesson. B2B and B2C have completely different time-to-click distributions. Whatever TTL feels right from a security angle, calibrate it against production telemetry before shipping. A token that expires before half your users click is worse than a slightly looser TTL — the security gain is theoretical, the funnel drop is measurable. And when you do settle on a value, anchor it to a single constant and let every related setting (mailer copy, rate-limit window, test expectations) derive from it. Future-you will thank present-you when the next calibration is needed.
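
The calibration itself is a few lines once you have time-to-click data. A minimal sketch, assuming you can export the delay between email send and link click per request; the expired_rate helper is hypothetical and the sample delays are synthetic, invented purely to exercise the function.

```ruby
# Fraction of clicks that would land on the expired page for a given TTL.
def expired_rate(click_delays_seconds, ttl_seconds)
  return 0.0 if click_delays_seconds.empty?

  expired = click_delays_seconds.count { |delay| delay > ttl_seconds }
  expired.to_f / click_delays_seconds.size
end

# Synthetic delays in seconds. Not production data.
sample_delays = [45, 120, 600, 1_500, 2_400, 3_000, 5_400, 14_400, 200, 90]

[15, 30, 60].each do |minutes|
  rate = expired_rate(sample_delays, minutes * 60)
  puts format("TTL %2d min: %.0f%% expired", minutes, rate * 100)
end
```

Run it against real delays before picking a TTL, and again whenever the customer mix shifts.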

Fire 6: A double-submit race was clobbering successful sign-ins

Symptom. A handful of users reported the same odd experience: they clicked the magic link, briefly saw the dashboard, and were then bounced to /expired. The link was apparently valid (they had clearly seen the dashboard) but the page somehow concluded that it wasn't. New Relic over a 2.5-hour window made it concrete: 7 out of 26 magic-link POSTs returning 422 — pairs of POSTs from the same User-Agent, milliseconds apart. The 422 rate (~27%) was high enough to be a real funnel drop.

Diagnosis. The Stimulus auto-submit from Fire 1's fix had a race in it. The 400 ms timer fired one submit; if the user also clicked the visible "Continue" button before then, both submits raced each other. The first POST succeeded — token consumed, session rotated. The second POST hit CSRF rejection because the session had just rotated under it, the previous CSRF token was no longer valid, and the user landed on /expired even though the first POST had already signed them in correctly. From their perspective: a momentary dashboard, then an error screen. Confusing.

A bit of backstory on why this got past the first round of fixing. Recall from Fire 1 that the Stimulus controller auto-submits the form via a 400 ms timer (this.formTarget.requestSubmit()), and that there's also a visible "Continue" button on the page as a no-JS / accessibility fallback. In an earlier iteration I had also added a click listener on that button — to suppress the case where a user clicked it twice in quick succession. The listener used the standard JavaScript trick for "make this fire only once": addEventListener("click", handler, { once: true }). The { once: true } option tells the browser to auto-remove the listener after it fires the first time. It's the canonical way to stop an impatient user from clicking a submit button twice in a row.

What { once: true } does not do, and this is what I'd missed, is prevent submission via any other code path. The 400 ms timer wasn't simulating a click on the button; it was calling this.formTarget.requestSubmit() directly, which triggers the form's native submission and bypasses the click listener entirely. The actual sequence was: the user clicks "Continue", the click listener fires, it auto-detaches ({ once: true } doing exactly what it promised), the form submits. Then the timer, which has been counting since the page became visible, fires moments later, calls requestSubmit() on the same form, and the form submits again. No click handler involved this time. Nothing to suppress. Two POSTs. The guard was watching the wrong door.

Fix. Both layers needed to change.

Client-side, an inFlight guard in the Stimulus controller that returns early on any subsequent submit attempt regardless of how it was triggered:

// bound to the form via `data-action="submit->magic-link-auto-submit#submit"`,
// so it fires for both the timer-driven submit and any click on the visible button
submit(event) {
  if (this.inFlight) {
    event.preventDefault()
    return
  }
  this.inFlight = true
  // form proceeds with default submission
}

Server-side, an idempotent-replay branch in the POST handler. If the token has already been consumed but Warden already has a signed-in person attached to the session — and that person both belongs to the current developer/tenant and is still eligible for magic-link sign-in — that's a legitimate replay. Redirect to the dashboard instead of /expired. The user is signed in; I just need to acknowledge the second POST gracefully:

# people/magic_links_controller.rb
def authenticate
  if (person = MagicLinkService.authenticate(params[:token]))
    sign_in(person)
    redirect_to dashboard_path and return # happy path — the normal post-login redirect
  end

  if (person = eligible_replay_person)
    Rails.logger.info(
      event: "magic_link.replay_acknowledged",
      person_id: person.id
    )
    redirect_to dashboard_path and return
  end

  redirect_to expired_magic_link_path
end

private

def eligible_replay_person
  signed_in = warden.user(:person)
  return unless signed_in
  
  # the app is multi-tenant; `developer` is the tenant model
  return unless signed_in.belongs_to_developer?(current_developer)
  return unless signed_in.can_use_magic_link?

  signed_in
end

A few details that matter:

  • The replay is gated on tenant membership (belongs_to_developer?) and on the person being eligible for magic links (can_use_magic_link?). A signed-in person from a different tenant is not a legitimate replay; it's a session-confusion bug.
  • The forensic log entry is deliberately structured (keyed). When this branch fires, I want to be able to query for it in aggregate later.

Both layers matter. The client guard catches the common case — timer-vs-click on a single device. The server idempotent path catches anything that slips through: browser back/forward, double-tap on touch devices, retries from email clients that open links twice. Same principle as Stripe webhooks: assume at-least-once delivery and design the receiver to deduplicate.
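
The server-side half reduces to a three-way decision, sketched here in plain Ruby. The MagicLinkAuthenticator class, the Hash standing in for the Warden-backed session, and the tenant_person_ids argument are all assumptions for illustration; the real controller delegates to MagicLinkService and Warden as shown earlier.

```ruby
class MagicLinkAuthenticator
  def initialize(tokens)
    @unused_tokens = tokens.dup # token => person_id, hypothetical store
  end

  # session is a plain Hash standing in for the real session.
  # Returns :signed_in, :replay_acknowledged, or :expired.
  def authenticate(token, session, tenant_person_ids:)
    if (person_id = @unused_tokens.delete(token))
      session[:person_id] = person_id
      return :signed_in
    end

    # Token already consumed: only acknowledge the replay if this session
    # already holds a signed-in person who belongs to the current tenant.
    signed_in = session[:person_id]
    return :replay_acknowledged if signed_in && tenant_person_ids.include?(signed_in)

    :expired
  end
end

auth = MagicLinkAuthenticator.new({ "tok-1" => 7 })
session = {}

first = auth.authenticate("tok-1", session, tenant_person_ids: [7])
second = auth.authenticate("tok-1", session, tenant_person_ids: [7]) # the racing POST
stranger = auth.authenticate("tok-1", {}, tenant_person_ids: [7])    # no session
puts [first, second, stranger].inspect
```

A consumed token presented without a matching signed-in session still ends up on the expired page, so the replay branch does not turn the link into a reusable credential.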

Lesson. Client-side guards alone are not enough for idempotency. The server has to have an idempotent replay path for any single-use auth flow, because the client is simply not under your control. Browsers retry. Users double-tap. Network conditions cause re-submits. Accessibility tooling triggers events in ways you can't anticipate. The fix at both layers is the right answer. Doing only one is a half-fix, and it will resurface in production — probably at the worst possible moment.


Part 3: What I'd do differently

A few observations, stepping back from all six fires.

The "fix one thing, break two more" pattern is structural, not anecdotal. Fires 3 and 6 were both caused by Fire 1's fix. The GET/POST split with JS auto-submit is the right architecture — Fire 1's lesson stands — but the implementation seam introduced new failure modes of its own. When I ship a non-trivial auth change now, I budget time for the follow-up fires. They arrive, usually within hours of the deploy.

Production telemetry was load-bearing for half the fires. Fire 3 showed up as a wall of 422s in telemetry, and Fires 5 and 6 would have been invisible to me without aggregate data. The 40% /expired landing rate and the 27% 422 ratio were both population-level signals: no single user reported them clearly enough to diagnose without metrics. If your auth flow doesn't emit structured events for failure modes (expired, replayed, CSRF-rejected, scanner-triggered), you're flying blind.
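
A minimal sketch of what "structured events" can mean, using only Ruby's standard Logger with a JSON formatter. The build_event_logger helper and the event names other than magic_link.replay_acknowledged are hypothetical; in the app itself these would go through Rails.logger and whatever aggregator sits behind it.

```ruby
require "json"
require "logger"
require "stringio"
require "time"

# Builds a logger that writes one JSON object per line, so failure modes
# can be counted in aggregate instead of grepped out of free text.
def build_event_logger(io)
  logger = Logger.new(io)
  logger.formatter = proc do |_severity, time, _progname, payload|
    JSON.generate({ at: time.getutc.iso8601 }.merge(payload)) + "\n"
  end
  logger
end

io = StringIO.new
events = build_event_logger(io)

events.info({ event: "magic_link.expired", person_id: 42 })
events.info({ event: "magic_link.replay_acknowledged", person_id: 42 })
events.info({ event: "magic_link.expired", person_id: 77 })

# Aggregate by event name, the way a log query would.
counts = io.string.lines.map { |line| JSON.parse(line)["event"] }.tally
puts counts.inspect
```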

The most subtle gotcha was :trackable feeding the token signature. Devise's :trackable mutating last_sign_in_at, a column I had declared in the generates_token_for block, is the kind of interaction you can miss even after reading both sets of docs from beginning to end. Anyone using generates_token_for on a Devise model should audit which columns the signature covers against which columns the auth flow mutates. The check costs ten minutes. The alternative is Fire 1.

Test env had CSRF off. That single line in config/environments/test.rb was why Fire 3 was undetectable in the existing suite. The fix wasn't to flip CSRF on globally — too invasive for the rest of the suite — but to add opt-in CSRF specs on security-critical endpoints via an around block. Worth doing prophylactically on any auth-related controller, before you need it.

Magic links are not "auth made simple." They're a different surface area with different failure modes. The happy path is genuinely simpler than passwords — no remembering, no strength meters, no rotating policies. The edge cases are arguably worse. If you're picking magic links to reduce the auth surface, calibrate that expectation against the six fires above. The right reason to pick magic links is that the failure modes you're trading into — corporate gateways, prefetchers, TTL calibration, idempotency races — are ones you can engineer for. The failure modes you're trading out of — forgotten passwords, abandoned sign-ups, strength-meter friction — are ones your users can't engineer for, no matter how patient they are.

I'm glad I made the trade. I just wish someone had written this post before I started.

