Jonny Asmar

From WKWebView to CEF: embedding Chromium in a Tauri app

The Google OAuth wall that forced an engine swap, the punchout trick that makes the embed look native, and the hit-test layering that makes clicks land.

atrium has a pane type that holds a real Chromium browser. Not an iframe. Not a "preview" mode of the main webview. A separate Chromium process, rendering inside a tile of the workspace, sitting next to a Claude Code session or a markdown canvas. You can drag-resize it, point it at a localhost dev server, open DevTools on it, and watch hot reload while an AI agent works in the pane next door.

What it looks like from the outside is unremarkable. Web inside a desktop app. We've been doing that for thirty years.

What it took to actually ship was one engine migration I really did not want to do, and an education in where macOS draws the line between "a thing your CSS controls" and "a thing the OS controls." This is meant to be useful to anyone who has thought, even for a minute, "I should put a browser inside my desktop app," and then encountered the surprising amount of platform plumbing standing between them and a working version of that.

The spike that worked, until Google OAuth

The first version of atrium's browser pane was a second WKWebView, embedded through Tauri's unstable multi-webview support. This is the path Tauri makes easy. You construct a tauri::webview::WebviewBuilder, point it at a URL, and a sibling webview shows up next to your main one. From the React side, it's just another pane.

The prototype merged into main in early April. For about two weeks I was happy with it. Then I started running into things the WKWebView path made hard, and one thing it made effectively impossible.

The impossible thing was Google OAuth. Modern Google sign-in detects when it's running inside an embedded browser and refuses to complete the flow, a security policy aimed at credential phishing. The exact heuristics aren't public, but in practice WKWebView gets flagged immediately. Users would click "Sign in with Google," watch the redirect happen, and then see a polite refusal page telling them to use a "secure browser." Gmail wouldn't load. Figma's sign-in failed. Anything that routed through Google Identity Services failed.

There are workarounds people use in other apps. Custom user-agent strings. External browser redirects with a deep-link back. We tried a couple. None of them got us out of WKWebView's detection envelope, and the ones that involved bouncing the user out to Safari broke the workspace experience the browser pane existed to preserve. If you're going to embed a browser at all, the entire point is that the user doesn't have to leave your app.

The hard things were less dramatic but added up. WKWebView's inspector speaks the Safari Web Inspector protocol, not the Chrome DevTools Protocol (CDP), which meant the agent-automation surface (programmatic browser navigate, browser eval, browser snapshot) was harder to wire up cleanly. Click and key synthesis worked but felt fragile. And on macOS, hooking WKWebView's hit testing required swizzling a class whose name has the Wry crate version embedded in it (wry::wkwebview::class::wry_web_view::WryWebView0.54.4), discoverable at runtime via objc_getClassList, breakable on every dependency bump.

By mid-April it was clear WKWebView wasn't going to make it past the Google OAuth wall, and that even if we found a workaround, every adjacent capability we wanted from a "real" browser pane was a fight against the engine. The planning doc for the engine swap landed on April 17. The first CEF swap landed over a weekend. The part that took the next month was making it feel native inside the workspace.

What replaced WKWebView is more code, a heavier runtime, and a much more controllable architecture. CEF (Chromium Embedded Framework) is, to be blunt, a more capable embedded browser. It speaks CDP. It moves page rendering into Chromium renderer subprocesses, so a renderer crash in one pane doesn't take down the main app. It supports passkeys and FedCM. In our Google OAuth and Gmail validation, it did not trip the embedded-browser refusal, because the surface is Chromium rather than WKWebView.

The tradeoff is real. CEF adds about 170MB to the app bundle. Initialization is heavier. There's a separate helper subprocess with its own crash boundary. But for a user-facing browser pane that has to work with the entire real web, those costs are the right ones to pay.

The punchout trick

The hardest thing to explain about embedding a non-web view inside a React app is that there is no good way to do it inside React. React renders to the DOM. The DOM lives inside a single webview. A Chromium browser engine running as a separate process cannot be a DOM node, because the DOM was not designed to host that kind of thing.

What you can do is take advantage of macOS's compositing layer. The Tauri app's main window contains an NSView hierarchy. The React-rendering webview is one NSView in that hierarchy. If you set that webview to not draw its own background, and you put an atrium-owned wrapper view (containing the CEF browser) in the same NSView hierarchy with a lowered zPosition, then anywhere the React layer renders nothing, the transparent gap in the webview lets the wrapper show through visually from behind.

That's the punchout trick. The transparency punches a visual hole through React; the wrapper view fills the hole from behind.

The actual structure looks like this:

// 1. The main webview (Wry-backed) must not paint its own background.
[wryWebView setDrawsBackground:NO];

// 2. CEF lives inside an atrium-owned wrapper NSView with a layer backing
//    and a lowered z-position, so it sits behind the main webview in the
//    visual stack. Hidden by default; shown only when a browser pane opens.
NSView *cefWrapper = [[NSView alloc] init];
cefWrapper.wantsLayer = YES;
cefWrapper.layer.zPosition = -1.0;
cefWrapper.hidden = YES;
[contentView addSubview:cefWrapper];

// 3. The CEF view is a child of the wrapper, filling it.
cefView.frame = NSMakeRect(0, 0, wrapperW, wrapperH);
[cefWrapper addSubview:cefView];

On the React side, the BrowserPane component renders a transparent div at the pane's tile coordinates and registers that element with a central scheduler called browserGeometryManager. The manager observes layout changes, throttles updates to roughly frame-rate, and pushes the resulting bounds through a Tauri command (setBrowserCefPaneBounds). Rust resizes the wrapper to match, the CEF child reflows to fill, and the two layers stay aligned through pane resize, room switch, window drag, anything that moves the visual.
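The throttling step can be sketched roughly as follows. This is a minimal illustration, not atrium's browserGeometryManager: the BoundsCoalescer class, its schedule callback, and its push callback are all hypothetical names. In a webview, schedule would wrap requestAnimationFrame and push would wrap the Tauri command; both are injected here so the sketch runs anywhere.

```typescript
// Hypothetical names throughout; the real manager is not shown in this post.
type Bounds = { x: number; y: number; width: number; height: number };
type FrameScheduler = (run: () => void) => void;

function sameBounds(a: Bounds, b: Bounds | null): boolean {
  return b !== null && a.x === b.x && a.y === b.y &&
    a.width === b.width && a.height === b.height;
}

class BoundsCoalescer {
  private pending: Bounds | null = null;
  private lastSent: Bounds | null = null;
  private scheduled = false;
  private schedule: FrameScheduler;
  private push: (bounds: Bounds) => void;

  // schedule: in a webview, (run) => requestAnimationFrame(() => run()).
  // push: in the app, this would wrap the command that resizes the wrapper.
  constructor(schedule: FrameScheduler, push: (bounds: Bounds) => void) {
    this.schedule = schedule;
    this.push = push;
  }

  // Called on every layout change. Only the latest bounds survive to the frame.
  update(next: Bounds): void {
    this.pending = next;
    if (this.scheduled) return;
    this.scheduled = true;
    this.schedule(() => this.flush());
  }

  // Runs once per scheduled frame and skips the push if nothing changed.
  private flush(): void {
    this.scheduled = false;
    const bounds = this.pending;
    this.pending = null;
    if (bounds === null || sameBounds(bounds, this.lastSent)) return;
    this.lastSent = bounds;
    this.push(bounds);
  }
}
```

The point of the shape is that a burst of layout changes inside one frame costs one native call, and a frame that ends on the same bounds as the last push costs none.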

There are surprises in this. The biggest one was that CALayer.zPosition does not control hit testing. It controls visual compositing only. Two siblings at the same depth, one with zPosition = -1, will render in z-position order but hit-test in subview order. I learned this by setting the wrapper's zPosition correctly, watching it draw correctly behind the main webview, and watching every single click still go to the CEF view. macOS uses subview order, full stop, for input routing. The visual override does not extend to events.

The second surprise was that you can't reorder subviews to fix this, at least not safely. [parent addSubview:positioned:relativeTo:] works for visual reorder, but if the view being moved is a Chromium-backed webview with pending work (which CEF always is) the operation can freeze, blank, or desynchronize the view hierarchy. The right move isn't to reorder. The right move is to leave both subviews in place and suppress hit-testing on whichever one shouldn't receive events at any given moment.

That's what the next part is about.

The OS swallows your clicks before JavaScript ever sees them

Once the punchout works visually, you have a working browser embed. Drop a React overlay over it (a Radix dropdown, a Sonner toast, a modal) and the next bug surfaces. Clicks on the overlay fall through to the browser underneath. Hovering works. Visual state updates work. Clicking does nothing.

This bug looks like a React problem for the first two hours. It is not.

The hit-test decision happens at the AppKit layer, before the JavaScript event loop of either webview gets a chance to see the event. AppKit sees two webviews at overlapping coordinates, walks the view hierarchy in subview order, and gives the event to whichever one accepts it. The CEF view, by default, accepts every event inside its frame. So even though React drew a dropdown menu on top, AppKit doesn't know about the dropdown. AppKit only knows about subviews. The CEF view says "I'll take this," takes it, and your React component never fires.

There are a few "obvious" fixes that don't work, and they're worth naming because every developer who hits this tries at least two of them.

pointer-events: none on the BrowserPane container element does nothing. The property is a CSS thing. It governs DOM event dispatch inside the React webview, but the event in question never makes it into the React webview's event loop. AppKit consumed it at the subview level.

Tab-index and focus management don't help. Focus is a different concern from hit-testing. You can have focus inside React and still lose every click.

A mouseover JavaScript handler on the React side, calling some setBrowserHitTesting(false) Tauri command, would theoretically work for hovering overlays. But the mouseover event also never reaches React for the same reason as click. CEF eats it. You're locked out of the JavaScript event loop entirely, for any pointer event over the CEF view's frame.

The only path that works is native, per-event. You override hitTest: on the CEF view's class so that, when a global "overlay open" flag is set, the override returns nil instead of self. AppKit interprets nil as "nothing in this view is under the cursor," and keeps hit-testing the sibling views at that point. The React webview, sitting in front visually and rendering an overlay, accepts the event. The click lands.
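To make the routing rule concrete, here is a toy model of it in TypeScript. It is not AppKit code and not atrium's override. ViewModel, routeClick, and the overlayOpen flag are hypothetical names, and the model only captures two behaviors: siblings are tried last-added first regardless of zPosition, and a view that declines lets the next sibling answer.

```typescript
// A TypeScript model of the routing described above. Not AppKit code.
type Point = { x: number; y: number };
type Rect = { x: number; y: number; width: number; height: number };

interface ViewModel {
  name: string;
  frame: Rect;
  zPosition: number;          // drawing order only; deliberately unused below
  acceptsHit: () => boolean;  // stands in for hitTest: returning self or nil
}

function contains(r: Rect, p: Point): boolean {
  return p.x >= r.x && p.x < r.x + r.width &&
    p.y >= r.y && p.y < r.y + r.height;
}

// Siblings are tried front-most first, where "front-most" means last added.
function routeClick(subviews: ViewModel[], p: Point): string | null {
  for (const view of subviews.slice().reverse()) {
    if (contains(view.frame, p) && view.acceptsHit()) return view.name;
  }
  return null;
}

// Hypothetical flag, flipped from the React side when an overlay opens.
let overlayOpen = false;

const subviews: ViewModel[] = [
  { name: "react-webview", frame: { x: 0, y: 0, width: 1200, height: 800 },
    zPosition: 0, acceptsHit: () => true },
  { name: "cef-wrapper", frame: { x: 400, y: 0, width: 800, height: 800 },
    zPosition: -1, acceptsHit: () => !overlayOpen },
];

const click: Point = { x: 600, y: 300 };
const routedBefore = routeClick(subviews, click); // "cef-wrapper"
overlayOpen = true;
const routedAfter = routeClick(subviews, click);  // "react-webview"
```

Note that the wrapper's zPosition of -1 never enters the decision. That is the whole bug in miniature: the view that draws behind still answers first.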

That's the mechanism. The interesting engineering question is: how does the React side correctly maintain the "overlay open" flag, so the native layer toggles hit-testing at the right times?

The overlay registry, after the counter broke

The first implementation was a counter. Every React overlay incremented an overlayCountAtom on open, decremented on close. A predicate (isOverlaySuspendedAtom = count > 0) was read by a Rust command that toggled CEF's hit-testing. Wired up cleanly. Worked for about a week.

Then browser panes stopped accepting clicks. Not in a specific overlay state. Globally, permanently, until app restart.

The counter had drifted. The decrement wasn't running in two cases that turned out to be common:

The first was Radix onOpenChange. Radix's controlled-mode overlays don't fire onOpenChange when the open state changes via the prop. Several components passed open={someAtom} for centrally-controlled visibility, which meant the close happened by the atom flipping false, but onOpenChange only fires for user close (clicking outside, pressing Escape, etc.). The decrement registered to onOpenChange simply didn't run.

The second was unmount-during-open. A pane closes mid-transition, a parent re-renders with the child gone, a portal disappears. The component unmounts. The effect cleanup runs, but the effect cleanup also depended on the open-state flag to know whether to decrement. In both cases the decrement got missed.

After a few hours of normal app usage, the counter would have drifted upward by 3, 5, 12. It never came back down. The browser was "always suspended," meaning clicks always fell through to React, and React, having no overlay, ignored them. The effect was indistinguishable from "browser panes are dead."

The fix replaced the counter with a registry: Map<id, HTMLElement | null>, keyed by useId(). The suspended predicate became registry.size > 0. Register and deregister are idempotent. Double-register is a Map.set on an existing key, which overwrites the entry and leaves the size unchanged. Double-deregister is a Map.delete of a missing key, a no-op. Cleanup removes the current ID only when that instance believes it is open. Repeated calls don't compound state.
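A minimal sketch of that registry, under some assumptions: OverlayRegistry and AnchorLike are hypothetical names, AnchorLike stands in for HTMLElement so the code runs outside a browser, and the ids are plain strings where the real ones come from useId().

```typescript
// AnchorLike stands in for HTMLElement so the sketch runs outside a browser.
type AnchorLike = { isConnected: boolean };

class OverlayRegistry {
  private entries = new Map<string, AnchorLike | null>();

  // id would come from React's useId() in the component that owns the overlay.
  register(id: string, anchor: AnchorLike | null = null): void {
    this.entries.set(id, anchor); // same id twice: still one entry
  }

  deregister(id: string): void {
    this.entries.delete(id); // unknown id: nothing happens
  }

  // Replaces the old "count > 0" predicate.
  get suspended(): boolean {
    return this.entries.size > 0;
  }
}

// The kind of sequence that made the counter drift: two opens, one close.
const registry = new OverlayRegistry();
registry.register("dropdown-1");
registry.register("dropdown-1");   // duplicate open; a counter would now read 2
registry.deregister("dropdown-1"); // a counter would be stuck at 1
const stillSuspended = registry.suspended; // false
registry.deregister("dropdown-1"); // duplicate close; still nothing to undo
```

Because the state is a set of identities rather than a tally, there is no sequence of repeated calls that can leave it above zero with nothing open.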

The diff removed more code than it added. That's usually the shape of a good architecture fix.

But the registry alone wasn't enough. DOM nodes can become orphaned in ways React doesn't tell you about. A manual removeChild. A parent setting display: none. A portal unmounting without going through the proper lifecycle. The registry entry still holds an ID and a node pointer. The browser is still suspended. The actual overlay no longer exists on screen.

The fix was a watchdog. A small component mounted at app root, sweeping the registry every 500ms. For every entry whose anchor element is disconnected from the document or hidden for two consecutive sweeps, the watchdog prunes the entry. Two sweeps because one might catch a node mid-transition; two consecutive misses means the node is gone-gone.

The watchdog is not elegant. It's a polling reconciler running once every half-second forever. But it has a useful property: it self-heals from every programmer mistake. Miss a ref wiring? Caught. Render a custom overlay that doesn't use the standard hook? Caught the moment it unmounts. Boring, durable, correct.
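The two-consecutive-misses rule is easy to get subtly wrong, so here is a sketch of one sweep. RegistryWatchdog and WatchedAnchor are hypothetical names, and the hidden field is assumed to be computed by the caller. Treating an entry with no anchor as missing is also an assumption, a guess at how a missed ref wiring would get caught.

```typescript
// WatchedAnchor is a stand-in for the DOM node; "hidden" is assumed to be
// derived by the caller (for example from layout), not read from an attribute.
interface WatchedAnchor { isConnected: boolean; hidden: boolean }

class RegistryWatchdog {
  // Consecutive sweeps in which each id looked gone.
  private misses = new Map<string, number>();

  // Runs one sweep and returns the ids it pruned from the registry.
  sweep(registry: Map<string, WatchedAnchor | null>): string[] {
    const pruned: string[] = [];
    registry.forEach((anchor, id) => {
      // Assumption: an entry with no anchor counts as missing.
      const gone = anchor === null || !anchor.isConnected || anchor.hidden;
      if (!gone) {
        this.misses.delete(id); // seen again, so the streak resets
        return;
      }
      const count = (this.misses.get(id) ?? 0) + 1;
      if (count >= 2) pruned.push(id);
      else this.misses.set(id, count);
    });
    for (const id of pruned) {
      registry.delete(id);
      this.misses.delete(id);
    }
    return pruned;
  }
}
```

In the app, something like setInterval(() => watchdog.sweep(registry), 500) would drive it. The streak reset is what protects a node caught mid-transition: one miss followed by a sighting starts the count over.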

Partial overlap was the last bug

For about a week after the registry shipped, the architecture felt finished. Then a new bug class showed up: partial-overlap overlays. A toast notification in the corner. A popout window's title bar, with the browser visible underneath. A maximize backdrop with the browser visible behind a dimmed scrim.

In all three cases, the overlay covers part of the browser pane, leaving the rest visible. The full-hit-test-suspend mechanism is binary. Either it suspends the browser entirely (the visible browser is dead, you can see it but you can't click the parts that aren't covered), or it doesn't suspend at all (every click goes through to the browser, the overlay is dead). Neither is correct. The correct behavior is per-pixel. If the cursor is currently inside the overlay's bounds, route the click to React; otherwise, let the browser have it.

This decision can't live in CSS, which is the OS-eats-events lesson again. It can't live in React, because React never sees the event. It has to live in native code, with the actual cursor coordinates.

The fix added an NSEvent local monitor for mouseMoved events. The monitor maintains a list of "active overlay hit-regions," rectangles in window coordinates, observed and published from the React side by a manager called popoutHitRegionManager. That manager uses ResizeObserver and IntersectionObserver on the overlay's anchor element, batches updates through requestAnimationFrame, and pushes the resulting rects through a Tauri command on change. When the cursor moves into a hit-region, the CEF view's hit-test override starts returning nil. When it leaves, it returns self again. The transitions happen on mouseMoved, so there's no latency between cursor position and click eligibility.
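The decision the native side makes can be modeled in a few lines. This is a TypeScript model of the logic, not the native implementation: HitRegionTracker, setRegion, and onMouseMoved are hypothetical names, and the rects are assumed to already be in window coordinates.

```typescript
type WindowPoint = { x: number; y: number };
type WindowRect = { x: number; y: number; width: number; height: number };

function rectContains(r: WindowRect, p: WindowPoint): boolean {
  return p.x >= r.x && p.x < r.x + r.width &&
    p.y >= r.y && p.y < r.y + r.height;
}

class HitRegionTracker {
  private regions = new Map<string, WindowRect>();
  private lastCursor: WindowPoint | null = null;
  private cursorInOverlay = false;

  // Published from the React side; null clears the region for that overlay.
  setRegion(id: string, rect: WindowRect | null): void {
    if (rect === null) this.regions.delete(id);
    else this.regions.set(id, rect);
    this.recompute(); // a region can appear or vanish under a still cursor
  }

  // Stands in for the mouseMoved monitor callback.
  onMouseMoved(p: WindowPoint): void {
    this.lastCursor = p;
    this.recompute();
  }

  // True when the browser view's hit-test should decline the event.
  get browserDeclinesHit(): boolean {
    return this.cursorInOverlay;
  }

  private recompute(): void {
    const p = this.lastCursor;
    this.cursorInOverlay = p !== null &&
      Array.from(this.regions.values()).some((r) => rectContains(r, p));
  }
}
```

One design choice worth calling out in the sketch: it recomputes on region changes as well as on cursor moves, so an overlay that closes under a stationary cursor does not leave the browser declining clicks until the mouse next moves.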

The React side of this stayed thin. The popout, toast, and maximize layers each register an anchor element with the manager on mount and clear it on unmount. The native bridge does the geometry. Two lines of consumer code per overlay type.

This is the kind of architecture I'd defend in a code review. Native owns native concerns (the OS-level decision). React owns React concerns (publishing what bounds are active right now). The boundary between them is one manager and one Tauri command. Nothing leaks.

The platform tax

There's a tier of CEF-specific work that didn't change the architecture but did delay the ship.

CEF initialization is heavy. The first call to cef::initialize plus the library load can cost anywhere from several hundred milliseconds to a couple of seconds depending on the machine, large enough to be visible on cold launch. The fix was to push CEF init into a deferred background job that runs after the first React paint. Browser panes simply don't render their content for the first second or so of app launch. Most users never notice. The ones who reach for a browser pane during that early window now see a small spinner instead of a freeze.

The CEF helper process is separate. Chromium's multi-process architecture means rendering happens in a subprocess (atrium Helper.app), independent of the main Rust binary. Every logging subscriber, every tracing setup, every panic handler you've configured for the main process applies only to the main process. The first time a helper renderer crashed silently and I had no idea why (no logs, no panic message, just a blank pane) I learned this. The helper process now wires up its own tracing subscriber from a small main() bootstrap, so helper-side failures finally show up where I can read them.

The CEF message pump needs careful scheduling. CEF sometimes schedules pump work from inside the current pump run. Tauri's run_on_main_thread helper has a same-thread shortcut that runs closures synchronously, which means a reentrant cef::do_message_loop_work call, which means an immediate panic. The fix was to stop using Tauri's same-thread shortcut for CEF pump work and route the pump through GCD async scheduling instead. One line of new code, one full afternoon of figuring out which one it should be.
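The failure mode is easier to see in a toy model than in a stack trace. The sketch below is TypeScript, not the Rust fix, and every name in it is hypothetical: ModelPump stands in for the message pump, runInline for a scheduler that runs closures synchronously on the same thread, and runDeferred for one that queues them until the current call has returned.

```typescript
type PumpScheduler = (work: () => void) => void;

class ModelPump {
  private running = false;
  runs = 0;

  // moreWorkPending models a request for another pump run made while one
  // is still in progress.
  doWork(moreWorkPending: boolean, schedule: PumpScheduler): void {
    if (this.running) throw new Error("reentrant pump call");
    this.running = true;
    try {
      this.runs += 1;
      if (moreWorkPending) schedule(() => this.doWork(false, schedule));
    } finally {
      this.running = false;
    }
  }
}

// Same-thread shortcut: the closure runs immediately, inside the current run.
const runInline: PumpScheduler = (work) => work();

// Deferred scheduling: the closure runs only after the current run returns.
const deferred: Array<() => void> = [];
const runDeferred: PumpScheduler = (work) => { deferred.push(work); };

function drainDeferred(): void {
  while (deferred.length > 0) {
    const work = deferred.shift();
    if (work) work();
  }
}
```

With runInline, the nested doWork call arrives while running is still true and throws. With runDeferred, the same request waits in the queue and runs cleanly once drainDeferred is called.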

These aren't architecture decisions, exactly. They're things you only find out after the architecture is right and you start running the result for a few weeks.

What's there now

The browser pane that ships in atrium today is a CEF view inside a wrapper NSView, with the main webview's drawsBackground=NO punching a visual hole above it. On top of that sits a registry-backed overlay suspension layer reconciled by a watchdog, a cursor-aware native bridge for partial-overlap cases, a deferred init path that keeps cold-launch fast, and a tracing subscriber inside the helper process.

That's more architecture than I expected to write when I started. It's also, in retrospect, hard to avoid. Each layer addresses a problem that the layer below can't solve from the wrong side of an event boundary.

Embedding Chromium was never just a renderer swap. It forced atrium to split ownership along the same boundary the OS uses. React owns layout intent. Rust owns native geometry. CEF owns the browser process. The browser pane works because those boundaries are explicit.