Security · Entra ID · April 2025

Protecting against token theft in Microsoft Entra ID

MFA is no longer enough on its own. Attackers have learned to steal tokens after authentication is complete — bypassing MFA entirely. Here's how Continuous Access Evaluation and token binding change the equation.

Multi-factor authentication is table stakes now. Most organisations have rolled it out, and the security posture improvement is real. But threat actors have adapted. The attack pattern that's become increasingly common — and that I see impacting partners and customers in the A/NZ market — is token theft via adversary-in-the-middle (AiTM) phishing.

The mechanics are straightforward and the implications are significant: an attacker intercepts the authentication session after MFA is completed, steals the session token, and replays it from a completely different location. From the application's perspective, the token is valid. No further authentication is required. The attacker is in.

This is a post about how Microsoft Entra ID's advanced capabilities — specifically Continuous Access Evaluation (CAE) and token binding — address this threat vector, and what you can do right now to reduce your exposure.

How token theft actually works

Modern authentication in Entra ID uses OAuth 2.0 tokens. When a user authenticates (including completing MFA), they receive an access token and a refresh token. These tokens are then presented to services — Microsoft 365, Azure, third-party apps — to prove identity without re-authenticating on every request.

AiTM phishing attacks use a reverse proxy (tools like Evilginx2 are commonly cited in threat intelligence reports) that sits between the user and the legitimate identity provider. The user authenticates normally — including MFA — but the attacker's proxy captures the session cookie or token on the way through. The attacker now has a valid, authenticated token they can replay.

The key insight: MFA protects the authentication event. It doesn't protect the token that results from it. Once the token is issued and stolen, MFA has already done its job — from the system's perspective, the authentication was legitimate.

Continuous Access Evaluation (CAE)

CAE is Microsoft's response to the token lifetime problem. Traditionally, an access token's lifetime is fixed at issuance: by default, Entra ID assigns each one a random value between 60 and 90 minutes (75 minutes on average). During that window, the token is valid regardless of what happens to the user account or network conditions.
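To make the lifetime window concrete, here is a minimal sketch that reads the `iat` and `exp` claims from a JWT and reports how long the token stays usable. It assumes a JWT-formatted token issued for your own API (tokens for Microsoft's first-party resources should be treated as opaque), and the demo token is a hypothetical, unsigned stand-in built only for illustration. Nothing here verifies a signature.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature (inspection only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload_b64 = parts[1]
    # base64url segments in a JWT omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_remaining(claims: dict, now: Optional[float] = None) -> int:
    """Seconds until the 'exp' claim; negative if the token has expired."""
    current = time.time() if now is None else now
    return int(claims["exp"] - current)


def _b64url(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical demo token with a 75-minute lifetime (not a real Entra ID token).
issued_at = 1_700_000_000
demo_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"iat": issued_at, "exp": issued_at + 75 * 60}),
    "signature-placeholder",
])

claims = decode_jwt_payload(demo_token)
# Ten minutes after issuance, 65 minutes of validity remain.
print(seconds_remaining(claims, now=issued_at + 600))  # prints 3900
```

Whoever holds the token during those remaining seconds can use it, which is the gap CAE is designed to narrow.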

CAE changes this by creating a near-real-time communication channel between Entra ID and CAE-capable applications (Exchange Online, SharePoint Online, Teams, and others). When a critical event occurs (a user's account is disabled, their password is changed, their session is revoked, or their network location changes), Entra ID can signal the application to re-evaluate the token almost immediately, rather than waiting for the token to expire naturally.
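On the client side, a CAE re-evaluation typically surfaces as a 401 response whose `WWW-Authenticate` header carries `error="insufficient_claims"` and a base64-encoded `claims` value. The sketch below shows one way a client might extract that challenge. The function name and the sample header are illustrative assumptions, and the parsing is deliberately simple rather than a full RFC-compliant header parser.

```python
import base64
import json
import re
from typing import Optional


def extract_claims_challenge(www_authenticate: str) -> Optional[str]:
    """Return the decoded claims challenge JSON string, or None if absent."""
    # Collect key="value" parameters from the header value.
    params = dict(re.findall(r'(\w+)="([^"]*)"', www_authenticate))
    if params.get("error") != "insufficient_claims" or "claims" not in params:
        return None
    # Accept both standard and URL-safe base64, with or without padding.
    encoded = params["claims"].replace("-", "+").replace("_", "/")
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8")


# Illustrative challenge payload; the values are made up for this example.
challenge = {"access_token": {"nbf": {"essential": True, "value": "1604106651"}}}
encoded_challenge = base64.b64encode(json.dumps(challenge).encode("utf-8")).decode("ascii")
header = (
    'Bearer realm="", '
    'authorization_uri="https://login.microsoftonline.com/common/oauth2/authorize", '
    'error="insufficient_claims", '
    f'claims="{encoded_challenge}"'
)

decoded = extract_claims_challenge(header)
print(json.loads(decoded)["access_token"]["nbf"]["value"])  # prints 1604106651
```

The decoded string is what the client hands back to its authentication library when requesting a fresh token. In MSAL for Python this is, as far as I know, the `claims_challenge` argument on the token acquisition methods, with the client declaring CAE support through `client_capabilities`; check the MSAL documentation for your version before relying on either.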

What CAE defends against

Enabling and verifying CAE

CAE is enabled by default for supported services — you don't need to turn it on. What you do need to ensure is that your Conditional Access policies are configured to use it effectively. Specifically:

Token binding

Token binding is the more fundamental defence: cryptographically tying a token to a private key held by the client, with possession of that key proven over the TLS connection. A bound token cannot be used by a different client, even if it's stolen, because the attacker won't have the corresponding private key.

Browser-based token binding (RFC 8471) has had a complicated adoption history, and is not yet widely deployed across the ecosystem. However, Microsoft has implemented a version of this concept through Proof of Possession (PoP) tokens in Entra ID — particularly for Continuous Access Evaluation-enabled flows.

PoP tokens require the client to prove possession of a private key when presenting the token. An attacker who steals the token but not the private key cannot replay it. This is a meaningful shift from bearer token semantics, where possession of the token is the proof.
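The key-binding half of that idea can be sketched with the standard library alone. The example below computes an RFC 7638 JWK thumbprint and compares it with a confirmation claim in the token, modelled on the `cnf.jkt` convention from DPoP (RFC 9449). This is an assumption chosen for illustration, not a description of Entra ID's wire format. The key values are placeholders, and a real resource server must also verify the request signature made with the private key, which needs a cryptography library and is omitted here.

```python
import base64
import hashlib
import hmac
import json

# Members RFC 7638 requires in the thumbprint input, per key type.
_REQUIRED_MEMBERS = {
    "RSA": ("e", "kty", "n"),
    "EC": ("crv", "kty", "x", "y"),
}


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 JWK thumbprint (RFC 7638), base64url-encoded without padding."""
    members = _REQUIRED_MEMBERS[jwk["kty"]]
    # Canonical form: required members only, sorted keys, no whitespace.
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        separators=(",", ":"),
        sort_keys=True,
    ).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def key_matches_token(token_claims: dict, presented_jwk: dict) -> bool:
    """True if the presented public key is the one the token was bound to."""
    expected = token_claims.get("cnf", {}).get("jkt", "")
    return hmac.compare_digest(expected, jwk_thumbprint(presented_jwk))


# Placeholder public keys (not valid curve points), for illustration only.
client_key = {"kty": "EC", "crv": "P-256", "x": "client-x", "y": "client-y", "kid": "ignored"}
attacker_key = {"kty": "EC", "crv": "P-256", "x": "attacker-x", "y": "attacker-y"}

# The issuer embeds the client's key thumbprint when it mints the token.
token_claims = {"sub": "user@example.com", "cnf": {"jkt": jwk_thumbprint(client_key)}}

print(key_matches_token(token_claims, client_key))    # prints True
print(key_matches_token(token_claims, attacker_key))  # prints False
```

An attacker who replays the stolen token has to present a key of their own, and its thumbprint will not match the one baked into the token.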

Practical takeaway: PoP token support is currently most mature in native application flows. If you're building line-of-business applications on top of Entra ID, using MSAL (Microsoft Authentication Library) with PoP support enabled is a meaningful hardening step.

What to do right now

CAE and token binding are directionally correct defences, but they don't replace good security hygiene. The practical steps I'd recommend for any Entra ID tenant:

  1. Enable Microsoft Entra ID Protection (formerly Identity Protection): Its risk detections can flag token theft signals (impossible travel, unfamiliar sign-in properties, anonymous IP), and risk-based Conditional Access policies can then require step-up authentication even when a valid token is presented.
  2. Enforce compliant device requirements via Conditional Access: A stolen token replayed from an unmanaged device will fail Intune compliance checks if your CA policies require them. This is one of the most effective controls against AiTM token replay.
  3. Deploy phishing-resistant MFA: FIDO2 security keys and Windows Hello for Business bind the authentication to the legitimate sign-in origin, so a sign-in relayed through an AiTM proxy on a look-alike domain fails and there is no session token for the proxy to capture. Where possible, migrate high-value accounts away from push notification MFA.
  4. Implement sign-in frequency controls: Set the Conditional Access sign-in frequency appropriately for your risk tolerance. Shorter-lived sessions limit the window of opportunity for stolen token replay.
  5. Review your named locations and monitor for impossible travel alerts: A token being used simultaneously from Sydney and Eastern Europe is a signal your SIEM and Entra ID monitoring should be surfacing.
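As a rough illustration of steps 2 and 4, the sketch below builds a Conditional Access policy body in the shape I understand the Microsoft Graph `conditionalAccessPolicy` resource to use. The display name, the 8-hour default, and the excluded account ID are assumptions for the example. The policy is created in report-only state and nothing is sent to Graph.

```python
import json
from typing import List


def build_policy(sign_in_frequency_hours: int, excluded_user_ids: List[str]) -> dict:
    """Policy body requiring a compliant device and bounding session length."""
    return {
        "displayName": "Require compliant device with bounded session (sketch)",
        # Report-only, so the impact can be reviewed before enforcement.
        "state": "enabledForReportingButNotEnforced",
        "conditions": {
            "clientAppTypes": ["all"],
            "users": {
                "includeUsers": ["All"],
                # Keep emergency access accounts out of scope.
                "excludeUsers": excluded_user_ids,
            },
            "applications": {"includeApplications": ["All"]},
        },
        "grantControls": {
            "operator": "OR",
            "builtInControls": ["compliantDevice"],
        },
        "sessionControls": {
            "signInFrequency": {
                "isEnabled": True,
                "type": "hours",
                "value": sign_in_frequency_hours,
            }
        },
    }


# Hypothetical object ID standing in for a break-glass account.
policy = build_policy(8, ["00000000-0000-0000-0000-000000000000"])
print(json.dumps(policy, indent=2))
```

The resulting JSON would be the body of a POST to `https://graph.microsoft.com/v1.0/identity/conditionalAccess/policies`, made by a caller with permission to manage Conditional Access. Validate it in report-only mode against your own tenant before switching `state` to `enabled`.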

The bigger picture

The shift from "authenticate the user" to "continuously evaluate the session" is the right architectural direction for identity security. CAE represents a meaningful step toward that model. But it works best when your Conditional Access policies are well-designed, your device management is mature, and you're actively monitoring the signals that Entra ID produces.

Token theft isn't a vulnerability that gets patched — it's an exploitation of a design characteristic of bearer tokens. The defence is architectural, and it requires layering controls rather than relying on any single mechanism.
