Reverse-Engineering Google NotebookLM
Google NotebookLM is one of the most quietly powerful AI products out there. You feed it your sources — PDFs, lecture notes, a YouTube video, a 200-page thesis — and you get an AI that only knows what you know, with none of the hallucinated filler from the open internet. From those sources it spins up quizzes, flashcards, mind maps, slide decks, even two-person podcast "audio overviews." All grounded in your documents.
There's no API. No REST endpoints. No SDK. No documentation. Google built this incredible product and bolted the front door shut.[1]
I hit that wall during exam season. I'd dumped a semester of college PDFs into a notebook — lecture slides, scanned notes, the chapters I'd avoided all term — and overnight I had something that only knew my syllabus, quizzing me on the exact units I was about to be tested on. By the third subject I was doing the same five clicks every time: create notebook, add sources, wait for processing, generate a quiz, download it. I wanted to script it. There was nothing to script against.
So I reverse-engineered the whole thing — the protocol, the auth, the artifact pipeline — and wrapped it in a TypeScript SDK that treats NotebookLM like it always should have had one.
What follows is the honest version of that story: the protocol I decoded by hand, the three bugs that nearly killed streaming, the auth flow that doesn't exist, every feature I packed in, and the things I had to delete to keep it alive.
Christmas Day, 2025
December 25th. Exams were done, I finally had time, and that five-click workflow was still nagging at me. I opened DevTools to watch what NotebookLM was actually doing on the wire.
It doesn't use a clean REST API under the hood. It speaks Google's internal batchexecute protocol, the same RPC-over-HTTP mechanism behind Google Docs, Search, and most of Google's web products. Every request is an application/x-www-form-urlencoded form body carrying nested JSON, and every response comes back wrapped in layers of arrays behind a )]}' junk prefix. Nothing is documented.
So I started mapping methods by hand. Each one is addressed by an obfuscated ID in the rpcids query parameter — zKMnhd to create a project, o3aBmc to list them, wXbhsf to load one, JFMDGd for sharing state. The arguments ride along in an f.req field as an array-of-arrays: roughly [[[rpcid, JSON.stringify(args), null, "generic"]]]. Position is everything. There are no field names — just indices, where slot 6 means one thing and slot 7 means another, and the only spec is the minified JavaScript that builds them.
My loop was crude but it worked: trigger an action in the UI, grab the outgoing request from the network tab, then diff payloads until I understood which index controlled what. Once you've decoded a handful, the shape of the rest starts to rhyme.
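The diffing step can be approximated in a few lines. What follows is a minimal sketch, assuming both captured payloads have already been parsed into nested arrays; `diffPaths` is a hypothetical helper for illustration, not something the SDK ships, and the two sample payloads are made up.

```typescript
// Hypothetical helper: list the index paths at which two positional payloads differ.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function diffPaths(a: Json, b: Json, path: string = "$"): string[] {
  if (Array.isArray(a) && Array.isArray(b)) {
    const out: string[] = [];
    const len = Math.max(a.length, b.length);
    for (let i = 0; i < len; i++) {
      // A slot missing on one side is compared as null.
      out.push(...diffPaths(a[i] ?? null, b[i] ?? null, `${path}[${i}]`));
    }
    return out;
  }
  // Leaves (and mismatched shapes) are compared by their JSON serialization.
  return JSON.stringify(a) === JSON.stringify(b) ? [] : [path];
}

// Two made-up captures of the same action with one UI option toggled.
const captureA: Json = [["My notebook", null, null, [2]], null, [1]];
const captureB: Json = [["My notebook", null, null, [3]], null, [1]];
console.log(diffPaths(captureA, captureB)); // the single differing slot: $[0][3][0]
```

Reporting paths rather than values keeps the output readable when a payload is mostly nulls, which is the common case with positional arguments.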
Stripped to its essentials, a single call looks like this — RPC id in the query string, arguments buried in an f.req envelope, response dug out from behind its junk prefix:
// Arguments are JSON, nested inside an array-of-arrays.
const body = new URLSearchParams({
  "f.req": JSON.stringify([[[rpcId, JSON.stringify(args), null, "generic"]]]),
  at: authToken, // the SNlM0e token scraped from the page
});
const res = await fetch(`${BATCHEXECUTE_URL}?rpcids=${rpcId}`, {
  method: "POST",
  headers: { "content-type": "application/x-www-form-urlencoded", cookie },
  body,
});
// Responses start with )]}' to block a naive eval(). Strip it, then pull the
// real payload out of the [["wrb.fr", null, "<escaped_json>"]] wrapper.
const raw = (await res.text()).replace(/^\)\]\}'/, "");
const data = JSON.parse(extractWrbFr(raw));
Multiply that by every action NotebookLM exposes (create, list, add source, generate, poll, download, share) and you have the skeleton of the SDK.
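The snippet above leans on an `extractWrbFr` helper without showing it. Here is one way it could look, as a sketch under assumptions: the response is taken to be the junk prefix followed by lines, some of which are JSON arrays of envelopes shaped like `["wrb.fr", <id or null>, "<json string>"]`. The real response carries more fields, and the SDK's own parser may work differently.

```typescript
// Assumed response shape: the )]}' prefix, then lines. Lines that start with "["
// are JSON arrays of envelopes; everything else (blank lines, length headers) is skipped.
function extractWrbFr(raw: string): string {
  const body = raw.replace(/^\)\]\}'/, "");
  for (const line of body.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("[")) continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // not a complete JSON line; this sketch simply ignores it
    }
    if (!Array.isArray(parsed)) continue;
    for (const envelope of parsed) {
      if (Array.isArray(envelope) && envelope[0] === "wrb.fr" && typeof envelope[2] === "string") {
        return envelope[2]; // still a JSON string: the caller parses it a second time
      }
    }
  }
  throw new Error("no wrb.fr envelope found in response");
}

// Made-up response; the "42" stands in for a length header and is not a real count.
const sampleResponse =
  ")]}'\n\n42\n" + JSON.stringify([["wrb.fr", null, JSON.stringify(["nb-1", "OS Final"])]]) + "\n";
const decodedPayload: unknown = JSON.parse(extractWrbFr(sampleResponse));
console.log(decodedPayload); // the inner array: "nb-1", "OS Final"
```

The double `JSON.parse` is the part that is easy to miss: the envelope is JSON, and the payload inside it is a JSON document serialized as a string.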
First commit went up at midnight on Christmas. By the next morning, notebooks were creating and listing; over the next few days came sources, artifacts, audio, video, slides, language handling, auto-refresh, and quota tracking.
The Three Bugs That Nearly Killed Streaming
Chat was the hardest feature. NotebookLM's chat uses a streaming endpoint (GenerateFreeFormStreamed) that returns chunked responses in a custom format:
<byte_count>
[["wrb.fr", null, "<escaped_json>"]]
Each chunk contains a snapshot of the full response so far, not a delta. The text can shrink between chunks: the API revises its own responses mid-stream. I'd never seen anything like it. It means you replace, never append, the opposite of every streaming API I'd touched:
let latest = "";
for await (const chunk of stream) {
  const snapshot = parseWrbFr(chunk); // full text so far, not a delta
  if (snapshot) latest = snapshot; // may be SHORTER than the previous one
}
return latest;
The snapshot model was the easy part to understand. Three smaller bugs were not.
Missing source-path. The streaming URL requires a source-path parameter that isn't documented anywhere. Without it, the request silently succeeds but returns empty responses. Hours staring at 200 OK with no data before I caught it in the network tab of a working session.
Double URL encoding. The source-path contains a URL-encoded notebook ID. When I encoded the full URL, the notebook ID got double-encoded. Google's servers rejected it silently. No error, no response, nothing. The fix was a single line — use the raw value — but finding it took a full day.
Wrong ID position in request body. The chat request body requires the notebook ID as the last parameter in a nested array. I had it in the wrong position. The API returned a generic error that gave no indication of what was actually wrong.
I rewrote the streaming client twice. The second version has this at the top:
CRITICAL FIXES MAINTAINED:
1. source-path parameter in URL
2. Single URL encoding (no double encoding)
3. notebookId as last parameter in request body
Warnings to my future self. Every line represents hours of debugging.
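The double-encoding trap is easy to reproduce in isolation. The sketch below shows one way this class of bug appears, not necessarily the exact path it took in the SDK. The base URL, the notebook ID, and the `/notebook/<id>` shape of `source-path` are all placeholders assumed for illustration.

```typescript
// Placeholders only: the real endpoint and IDs are not part of this sketch.
const demoNotebookId = "0f3c2a9e-1111-2222-3333-444455556666";
const rawSourcePath = `/notebook/${demoNotebookId}`; // assumed shape of source-path

function buildStreamUrl(base: string, sourcePath: string): string {
  // URLSearchParams percent-encodes each value exactly once when serialized.
  const query = new URLSearchParams({ "source-path": sourcePath });
  return `${base}?${query.toString()}`;
}

// Correct: hand over the raw value and let the serializer encode it.
const singleEncoded = buildStreamUrl("https://example.invalid/stream", rawSourcePath);

// Bug: encoding by hand first means every "%" is encoded again as "%25".
const doubleEncoded = buildStreamUrl(
  "https://example.invalid/stream",
  encodeURIComponent(rawSourcePath),
);

console.log(singleEncoded); // ...?source-path=%2Fnotebook%2F0f3c2a9e-...
console.log(doubleEncoded); // ...?source-path=%252Fnotebook%252F0f3c2a9e-...
```

A cheap guard is to search the final URL for `%25` before sending; in this kind of request it almost always signals an accidental second pass.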
Google Doesn't Want You Doing This
Every few weeks, something breaks. Not because my code changed — because Google's did.
NotebookLM is a living product. Google deploys constantly. RPC method signatures shift. Response formats change. Features appear and vanish.
Notebook descriptions don't exist. I built a full description feature — create, update, display. Then realized Google removed it from NotebookLM. The UI never had a description field. I'd been writing to a parameter that was silently ignored.
Artifact state 3 means... READY? The API returns numeric state codes. 0 = unknown, 1 = creating, 2 = ready, 3 = failed. At least, that's what I assumed. Then users reported state 3 for perfectly valid, fully-generated artifacts. In practice, the API uses both 2 and 3 to indicate "ready." The enum says FAILED = 3. The implementation maps both 2 and 3 to READY. Welcome to reverse engineering.
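In code, that quirk comes down to a small normalization step. A minimal sketch, using the numeric codes described above; the constant and function names are this example's own, not Google's or necessarily the SDK's.

```typescript
// Raw codes as described above. Code 3 is nominally "failed".
const RawState = { UNKNOWN: 0, CREATING: 1, READY: 2, FAILED: 3 } as const;

type ArtifactStatus = "unknown" | "creating" | "ready";

function normalizeState(code: number): ArtifactStatus {
  switch (code) {
    case RawState.CREATING:
      return "creating";
    case RawState.READY:
    case RawState.FAILED: // observed on fully generated artifacts, so treated as ready
      return "ready";
    default:
      return "unknown"; // 0 and anything not seen yet
  }
}

console.log([0, 1, 2, 3, 7].map((code) => normalizeState(code)));
// unknown, creating, ready, ready, unknown
```

Routing unrecognized codes to "unknown" rather than throwing means a new server-side state degrades a polling loop instead of crashing it.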
2FA breaks slide downloads. Downloading slides requires authenticated access to Google's rendering pipeline. When a user has 2FA enabled, the Playwright session needs special handling. Cookies expire differently. Session tokens behave differently.
The Authentication Nightmare
There's no API key. No OAuth flow. No service account. To authenticate, you need an auth token (extracted from a JavaScript variable called WIZ_global_data.SNlM0e on the page) and a full cookie string from an authenticated browser session.
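Pulling the token out of a fetched page can be as simple as one regular expression. This is a sketch that assumes the page embeds `WIZ_global_data` as inline JSON containing a `"SNlM0e":"<token>"` pair; `extractAuthToken` and the sample HTML are illustrative, and the real page layout may change at any time.

```typescript
// Assumes the token appears in the HTML as: "SNlM0e":"<token>"
function extractAuthToken(html: string): string {
  const token = /"SNlM0e":"([^"]+)"/.exec(html)?.[1];
  if (!token) {
    // Usually means the cookies were rejected or the page layout changed.
    throw new Error("SNlM0e not found in page");
  }
  return token;
}

// Made-up page fragment for illustration.
const samplePage =
  '<script>window.WIZ_global_data = {"SNlM0e":"AbCd:1234567890","other":"x"};</script>';
console.log(extractAuthToken(samplePage)); // AbCd:1234567890
```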
I built three methods, in increasing order of how much work I do for you:
Manual credentials. Copy your auth token and cookies from DevTools, paste into .env. Works immediately, expires in hours.
Browser extraction. The SDK opens Chromium, navigates to NotebookLM, waits for you to log in (including 2FA), then extracts credentials automatically.
Auto-login. Provide your Google email and password. The SDK uses Playwright to automate the entire login flow — type email, click next, type password, click next, handle 2FA prompts.
Then auto-refresh on top: a background timer that re-extracts the auth token from an active session before it expires. Connect once and the SDK keeps itself alive.
const sdk = new NotebookLMClient({
  auth: {
    email: process.env.GOOGLE_EMAIL,
    password: process.env.GOOGLE_PASSWORD,
    headless: false,
  },
  autoRefresh: true,
});
await sdk.connect();
It's not elegant. It's the only way to programmatically authenticate with a product that has no API.
Underneath, credential resolution follows a strict priority: explicit config, then environment variables, then a saved credentials.json, then auto-login as a last resort. Connect once, and a background refresh manager keeps the token alive — auto strategy by default, with time-based and expiration-based modes if you want to tune it.
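That priority order is a first-match-wins chain. Below is a simplified sketch of the idea, not the SDK's actual resolver: the environment variable names are hypothetical, the file and auto-login providers are stubbed out, and everything is synchronous here even though a real Playwright login would be async.

```typescript
interface Credentials {
  authToken: string;
  cookies: string;
}

// Each provider returns credentials, or undefined when it has nothing to offer.
type Provider = { label: string; load: () => Credentials | undefined };

function resolveCredentials(providers: Provider[]): { from: string; creds: Credentials } {
  for (const provider of providers) {
    const creds = provider.load();
    if (creds) return { from: provider.label, creds }; // first hit wins
  }
  throw new Error("no credentials available from any provider");
}

// Hypothetical variable names, chosen for this example only.
function fromEnv(env: Record<string, string | undefined>): Credentials | undefined {
  const authToken = env["NOTEBOOKLM_AUTH_TOKEN"];
  const cookies = env["NOTEBOOKLM_COOKIES"];
  return authToken && cookies ? { authToken, cookies } : undefined;
}

const fakeEnv = { NOTEBOOKLM_AUTH_TOKEN: "token-from-env", NOTEBOOKLM_COOKIES: "SID=placeholder" };

const resolved = resolveCredentials([
  { label: "explicit config", load: () => undefined }, // nothing passed in code
  { label: "environment", load: () => fromEnv(fakeEnv) },
  { label: "credentials.json", load: () => undefined }, // file read omitted in this sketch
  { label: "auto-login", load: () => undefined }, // browser flow omitted in this sketch
]);
console.log(resolved.from); // environment
```

Keeping the order in a plain array makes the priority visible in one place and easy to test without a browser.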
The surface area
Once the protocol was cracked, the SDK grew into a full client. Five services, organized the way you actually think about a notebook:
- sdk.notebooks: create, list, get, update, delete, and (experimentally) share with viewer/editor roles. Deletes fan out one call at a time, in parallel or sequential mode, because Google has no batch delete.
- sdk.sources: add from URL, raw text, local files, YouTube, or Google Drive, individually or in batches. There's even a web-search source: hand it a query and a research mode (FAST or DEEP) and it discovers sources, waits for them, and adds the ones you choose.
- sdk.artifacts: create, list, get, download, rename, delete, and share every artifact type NotebookLM can produce.
- sdk.generation: chat with a notebook, streaming or buffered, with multi-turn history, configurable response length, and custom prompts.
- sdk.notes: create, list, update, and delete the notes attached to a notebook.
Before any of these touch the wire, the client resolves credentials, checks your remaining quota, and retries on transient failures. Roughly forty methods, all typed, all grounded in real notebook state.
Here's the exam workflow that started all of this, now collapsed into a script:
const sdk = new NotebookLMClient();
await sdk.connect();
const nb = await sdk.notebooks.create({ title: "OS Final", emoji: "📚" });
await sdk.sources.add.batch(nb.projectId, {
  sources: pdfPaths.map((path) => ({
    type: "file",
    content: fs.readFileSync(path),
    fileName: path,
  })),
  waitForProcessing: true, // block until NotebookLM finishes ingesting
});
const quiz = await sdk.artifacts.create(nb.projectId, ArtifactType.QUIZ, {
  customization: { numberOfQuestions: 20, difficulty: 3 },
});
await sdk.artifacts.download(nb.projectId, quiz.id, "./os-final-quiz.json");
Five clicks, gone. The thing I used to do by hand before every exam now runs while I make coffee.
Sources don't even have to be files you already have. Point the SDK at the open web and let NotebookLM do the research first:
const { sessionId, web } = await sdk.sources.add.web.searchAndWait(nb.projectId, {
  query: "transformer attention mechanism, explained simply",
  mode: ResearchMode.DEEP,
});
// Keep the five most relevant hits and add them as sources.
await sdk.sources.add.web.addDiscovered(nb.projectId, {
  sessionId,
  webSources: web.slice(0, 5),
});
Auto-Chunking
NotebookLM has limits. 500,000 words per source. 200MB per file. Users don't care about limits. They want to upload their entire research corpus and have it work.
When you try to add a text source with 1.2 million words, the SDK silently splits it into three 400K-word chunks and uploads them as separate sources. Each chunk gets metadata — chunk index, word boundaries, byte ranges — so you can reconstruct the original if needed. Small sources still return a simple string ID. Large ones return an AddSourceResult with wasChunked: true and all the chunk IDs.
const result = await sdk.sources.add.text(notebookId, {
  title: 'My Massive Research Paper',
  content: aMillionWordString,
});
if (typeof result === 'string') {
  // Small source, single ID
} else if (result.wasChunked) {
  console.log(`Split into ${result.chunks.length} chunks`);
}
The feature that made the SDK feel production-ready.
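The splitting itself is mostly arithmetic. Here is a rough sketch of one balanced strategy that reproduces the 1.2M-words-into-three-400K-chunks behavior described above. It is an assumption about the approach, not the SDK's code: `chunkByWords` and `TextChunk` are made-up names, byte ranges are left out, and whitespace is normalized to single spaces for brevity.

```typescript
interface TextChunk {
  index: number;
  wordStart: number; // inclusive
  wordEnd: number; // exclusive
  text: string;
}

// Uses the fewest chunks that fit under maxWords, then evens out their sizes.
function chunkByWords(content: string, maxWords: number): TextChunk[] {
  if (!Number.isInteger(maxWords) || maxWords < 1) {
    throw new Error("maxWords must be a positive integer");
  }
  const words = content.split(/\s+/).filter((word) => word.length > 0);
  if (words.length === 0) return [];
  const chunkCount = Math.ceil(words.length / maxWords);
  const perChunk = Math.ceil(words.length / chunkCount);
  const chunks: TextChunk[] = [];
  for (let start = 0; start < words.length; start += perChunk) {
    const end = Math.min(start + perChunk, words.length);
    chunks.push({
      index: chunks.length,
      wordStart: start,
      wordEnd: end,
      text: words.slice(start, end).join(" "),
    });
  }
  return chunks;
}

// Scaled-down version of the example above: 12 words against a 5-word limit.
const demoChunks = chunkByWords("a b c d e f g h i j k l", 5);
console.log(demoChunks.map((chunk) => chunk.wordEnd - chunk.wordStart)); // 4, 4, 4
```

Balancing avoids the lopsided 500K + 500K + 200K split a greedy cut would produce, which keeps each source comfortably under the limit.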
The Artifact Engine
NotebookLM's artifacts are its killer feature. From the same set of sources: quizzes, flashcards, mind maps, reports, infographics, slide decks, audio overviews (two-person podcast discussions), video overviews (animated explainers).
Each artifact type uses a different RPC method with a different payload structure. I unified all of it behind a single interface:
const quiz = await sdk.artifacts.create(notebookId, ArtifactType.QUIZ, {
  customization: { numberOfQuestions: 3, difficulty: 3 },
});
const audio = await sdk.artifacts.audio.create(notebookId, {
  customization: { format: 0, language: 'en', length: 2 },
});
Same pattern, every time: create, poll until ready, download. What comes back depends on the type: quizzes and flashcards as structured JSON, audio overviews as audio files, video overviews as MP4, slide decks as PDF or PNG. The SDK normalizes the create call and the polling; the download method knows how to unpack each format.
Supporting 80+ languages was its own thing. Each artifact type handles language differently. Audio and video accept a language code. Reports infer from the notebook's default. Quizzes and flashcards need the language instruction baked into the text prompt. I built a NotebookLanguageService to manage all of this — set it once at the notebook level and everything downstream respects it.
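The per-type differences can be pictured as one dispatch function. This is purely illustrative: the `CreateRequest` field names and `applyLanguage` are hypothetical and are not the NotebookLanguageService API. Only the three behaviors (language code, inherit the default, prompt instruction) come from the description above.

```typescript
type ArtifactKind = "audio" | "video" | "report" | "quiz" | "flashcards";

// Hypothetical request shape for this sketch.
interface CreateRequest {
  language?: string;
  instructions?: string;
}

function applyLanguage(
  kind: ArtifactKind,
  languageCode: string,
  languageName: string,
  request: CreateRequest,
): CreateRequest {
  switch (kind) {
    case "audio":
    case "video":
      // These accept a language code directly.
      return { ...request, language: languageCode };
    case "report":
      // Inferred from the notebook's default, so nothing is added per request.
      return request;
    case "quiz":
    case "flashcards": {
      // No language field: the instruction has to travel inside the text prompt.
      const hint = `Write everything in ${languageName}.`;
      const instructions = request.instructions ? `${request.instructions} ${hint}` : hint;
      return { ...request, instructions };
    }
  }
}

console.log(applyLanguage("audio", "de", "German", {}));
console.log(applyLanguage("quiz", "de", "German", { instructions: "Focus on chapter 3." }));
```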
Killing Features
One of the hardest parts of building against an undocumented API is knowing when to kill things.
I built a chat-with-source example — then realized chat-basic already supported source selection. Redundant. Killed it. Built a document guideline generator — over-engineered, achievable with a chat prompt. Killed it. Built loadContent() to extract source text — worked for a while, then started returning "Service unavailable" after a Google update. Deprecated it.
Every dead feature is a maintenance burden. Every deprecated method is a promise you'll eventually break. Getting comfortable with removal made the SDK better than getting comfortable with accumulation.
Limits and quotas
A consumer product you're not supposed to automate still enforces hard limits — and they scale with your Google account's plan. The SDK bakes these numbers in, so you can check sdk.getRemaining('chats') before a long job instead of discovering the ceiling halfway through one:
| Limit | Standard | Plus | Pro | Ultra |
|---|---|---|---|---|
| Notebooks | 100 | 200 | 500 | 500 |
| Sources / notebook | 50 | 100 | 300 | 600 |
| Words / source | 500K | 500K | 500K | 500K |
| File size | 200 MB | 200 MB | 200 MB | 200 MB |
| Chats / day | 50 | 200 | 500 | 5,000 |
| Audio + video / day | 3 | 6 | 20 | 200 |
| Reports / day | 10 | 20 | 100 | 1,000 |
| Deep research / month | 10 | 90 | 600 | 6,000 |
Enforcement is server-side no matter what; the client-side check is opt-in (enforceQuotas: true). A few other rules the SDK simply has to work around:
- Sources cap at 500,000 words or 200 MB — the exact reason the auto-chunker exists.
- Copy-protected PDFs refuse to import. Nothing to do about it client-side, so the SDK surfaces the failure instead of hanging on it.
- No batch delete. Google removes notebooks one call at a time, so the SDK fans the requests out itself and reports which IDs failed.
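A client-side quota check is little more than the table above plus a counter. The sketch below uses the daily rows of that table; `QuotaTracker` and its resource keys are illustrative names, and only `getRemaining` echoes a method mentioned in this post. Real tracking would also need a daily reset and persistence, which are omitted here.

```typescript
type Plan = "standard" | "plus" | "pro" | "ultra";
type DailyResource = "chats" | "audioVideo" | "reports";

// Daily limits copied from the table above.
const DAILY_LIMITS: Record<Plan, Record<DailyResource, number>> = {
  standard: { chats: 50, audioVideo: 3, reports: 10 },
  plus: { chats: 200, audioVideo: 6, reports: 20 },
  pro: { chats: 500, audioVideo: 20, reports: 100 },
  ultra: { chats: 5000, audioVideo: 200, reports: 1000 },
};

class QuotaTracker {
  private plan: Plan;
  private used: Record<DailyResource, number> = { chats: 0, audioVideo: 0, reports: 0 };

  constructor(plan: Plan) {
    this.plan = plan;
  }

  getRemaining(resource: DailyResource): number {
    return DAILY_LIMITS[this.plan][resource] - this.used[resource];
  }

  // Call before issuing a request; throws locally instead of waiting for the server to refuse.
  consume(resource: DailyResource): void {
    if (this.getRemaining(resource) <= 0) {
      throw new Error(`daily ${resource} quota exhausted on the ${this.plan} plan`);
    }
    this.used[resource] += 1;
  }
}

const tracker = new QuotaTracker("standard");
tracker.consume("audioVideo");
tracker.consume("audioVideo");
console.log(tracker.getRemaining("audioVideo")); // 1 of the 3 daily overviews left
```

Because the server enforces the limits regardless, the local counter is only an early warning; it can drift if the same account is used from the web UI.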
What you can build on it
A programmatic interface turns NotebookLM from a website into a primitive you can wire into anything. A few things that go from tedious to trivial:
- A study tool on top of NotebookLM. Point it at a folder of course material and generate a full deck per chapter — quiz, flashcards, mind map, audio overview — on a schedule, with nobody clicking through the UI. (This is the one I actually wanted.)
- Automated audio briefs. Feed it an RSS feed, a newsletter, or a week of meeting notes, and produce a two-person podcast overview every morning. The audio overview was NotebookLM's killer demo; here it's a cron job.
- Bulk repurposing pipelines. Drop 500 podcast episodes or support tickets in as sources and batch-generate summaries, study guides, and blog drafts — each grounded in the real transcript, with citations.
- An internal knowledge base that cites itself. Sync a team's docs into notebooks and answer questions through generation.chat(), anchored to actual sources instead of a model's best guess.
And the one I keep coming back to: NotebookLM as a tool an AI agent can call. Grounded retrieval is the hard part of every agent, and people stand up entire RAG stacks to get it. With the SDK, an agent can create a notebook, add sources, chat to synthesize, and pull structured artifacts back out — all programmatically. Wrap it in an MCP server or a single tool definition and any agent framework gets a research-and-grounding skill for free.
Google built an incredible product. I built the bridge to it.
The bridge needs maintenance — every Google deploy is a potential break, and I expect to be patching obfuscated method names for as long as NotebookLM keeps shipping. But that's the deal you sign when you build against a product that never meant to be built on. Worth it.
notebooklm-kit is MIT-licensed on GitHub and on npm, where it pulls roughly 1.7K downloads a week. Issues and PRs welcome.
Footnotes
[1] Google does ship a real API, but only for NotebookLM Enterprise, locked behind Google Cloud, Gemini Enterprise licensing, and IAM roles. The consumer product at notebooklm.google.com, the one everyone actually uses, has nothing.