Google's New File Limits: What the 2MB HTML Cap Actually Means for Your Site
Google quietly updated its Googlebot documentation in early February 2026, and the SEO world collectively panicked. The headlines screamed about a massive reduction in crawl limits, from 15MB down to just 2MB for HTML files. But here's the thing: most of that panic was completely unnecessary.
Let's cut through the noise and look at what actually changed, who needs to worry, and what you should do about it.
What Actually Changed
On 3 February 2026, Google clarified its file size limits for different types of content. This wasn't a sudden policy shift. It was Google finally bothering to document what Googlebot has been doing for ages.
Here's what Googlebot will now crawl for Google Search specifically:
- HTML and text-based files: 2MB
- PDF files: 64MB
- Other file types: 2MB
The 15MB limit still exists, but that's for Google's broader crawling infrastructure. Think Google Shopping, Google News, and other products. For organic search results, it's now officially 2MB for HTML.
Before You Panic: Check Your Actual File Sizes
The median HTML page on the web is about 33KB. Even at the 90th percentile, pages only hit around 151KB. To reach the 2MB threshold, your HTML would need to be roughly 60 times larger than that median page.
Unless you're embedding entire novels inline or dumping massive JSON objects directly into your HTML, you're probably fine.
The Real Problem: Silent Truncation
Here's where it gets concerning. When your HTML exceeds 2MB, Googlebot doesn't tell you. There's no warning in Search Console. No helpful error message. It just quietly stops reading after 2MB and indexes whatever it managed to grab.
Testing by Spotibo confirmed this. Pages over 2MB were silently truncated. Files over 16MB got a generic error with zero useful information. Google Search Console stayed completely silent about the whole thing.
This means if you've got bloated pages, you might be losing content, structured data, or important links without ever knowing.
What Gets Cut Off
If your HTML hits that 2MB wall, everything after that point is invisible to Google. That could include:
- Schema markup sitting in your footer
- Internal links to other pages
- Product descriptions if you're running e-commerce
- Any content embedded in inline scripts for later rendering
The good news? External resources don't count. Your CSS files, JavaScript bundles, and images are fetched separately. They have their own limits and won't eat into your 2MB budget.
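You can approximate the cutoff yourself by truncating a saved copy of a page at the 2MB mark. This is a rough sketch of the behaviour, not Googlebot itself, and `page.html` here is a generated stand-in, so point the commands at your own saved HTML instead:

```shell
# Simulate the 2MB cutoff on a local copy of a page.
# page.html is a generated stand-in; replace it with your own saved HTML.
LIMIT=$((2 * 1024 * 1024))                    # 2,097,152 bytes
head -c 3000000 /dev/zero > page.html         # fake 3MB page for the demo
head -c "$LIMIT" page.html > truncated.html   # roughly what Googlebot keeps
wc -c < truncated.html                        # 2097152
```

Diff `truncated.html` against the original to see exactly which schema markup, links, or content would fall past the cutoff.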
PDFs Get the VIP Treatment
Interestingly, Google gives PDF files much more breathing room at 64MB. That's 32 times more than HTML files get.
Why the difference? PDFs are typically standalone documents, white papers, or reports. They're meant to be comprehensive. HTML pages, on the other hand, should be reasonably sized and link to other resources rather than cramming everything into one file.
Who Actually Needs to Worry
Most sites can ignore this completely. But there are a few scenarios where this matters:
Single-Page Applications
If you're running a heavy SPA that dumps tons of JavaScript and data directly into the initial HTML, you might be pushing limits. Check your actual HTML file sizes, not your total page weight.
Sites with Massive Inline Scripts
Some developers love embedding entire libraries inline to save HTTP requests. If you're doing this excessively, it could be a problem.
Pages with Embedded Data Objects
E-commerce sites sometimes embed product catalogues or massive JSON-LD objects directly into HTML. If you're doing this on category or listing pages, double-check your file sizes.
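One quick way to gauge this is to measure how many bytes of inline JSON-LD a saved page carries. This is a crude regex heuristic rather than a proper HTML parser, and `product.html` is a made-up example file standing in for your own page:

```shell
# Rough audit: bytes of inline JSON-LD in a saved page.
# product.html is a hypothetical example; use your own saved HTML.
cat > product.html <<'EOF'
<html><head>
<script type="application/ld+json">{"@type":"Product","name":"Widget"}</script>
</head><body>listing page</body></html>
EOF
# Flatten newlines, pull out ld+json blocks, count their bytes.
tr -d '\n' < product.html \
  | grep -o '<script type="application/ld+json">[^<]*</script>' \
  | wc -c
```

If that number is a meaningful fraction of 2MB on category or listing pages, trim the embedded data.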
Legacy Sites with Bloated Code
Old sites that have accumulated years of inline styles, scripts, and forgotten code might be heavier than you think.
How to Check Your Pages
You've got a few options for testing:
Chrome DevTools
Open DevTools, go to the Network tab, and reload your page. Look at the Size column for your HTML document, but remember it shows the transferred size, which is usually gzip or Brotli compressed; the raw HTML can be several times larger. If even the uncompressed size is comfortably under 2MB, you're golden.
Tame the Bots Simulator
On 6 February, Tame the Bots added a 2MB truncation feature to their fetch and render tool. John Mueller from Google even gave it a nod of approval. It'll show you exactly what Googlebot sees.
Command Line
If you're technical, note that a HEAD request is unreliable here: many servers omit Content-Length on chunked responses, and with compression the header won't match the raw HTML anyway. Measure what curl actually downloads instead:
curl -so /dev/null -w '%{size_download}\n' https://yoursite.com/page
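To sweep a whole list of pages, the same `size_download` trick works in a loop. A minimal sketch, using a locally generated `file://` URL as a stand-in for your real URLs:

```shell
# Flag any page whose downloaded body exceeds the 2MB cap.
# The file:// URL is a local stand-in; substitute your real URLs.
LIMIT=$((2 * 1024 * 1024))
head -c 3000000 /dev/zero > /tmp/big.html     # fake oversized page
for url in "file:///tmp/big.html"; do
  size=$(curl -so /dev/null -w '%{size_download}' "$url")
  if [ "$size" -gt "$LIMIT" ]; then
    echo "OVER 2MB: $url ($size bytes)"
  fi
done
```

Run it against your templated page types (home, category, product, article) rather than every URL; pages built from the same template tend to share the same weight.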
What to Do If You're Over the Limit
If you discover pages exceeding 2MB, here's your action plan:
1. Move Inline JavaScript to External Files
Stop embedding entire libraries in your HTML. External JS files don't count towards the 2MB limit and can be cached by browsers anyway.
2. Externalise CSS
Same logic applies. Move inline styles to external stylesheets.
3. Clean Up JSON-LD and Schema
If you're embedding massive structured data objects, consider whether you really need all that data inline. Keep schema markup focused and relevant.
4. Split Large Pages
If you've got genuinely huge content, consider pagination or splitting into multiple pages. Not only does this help with crawling, but it's probably better for users too.
5. Remove Dead Code
Old commented-out sections, unused scripts, forgotten tracking pixels. All that cruft adds up. Do some spring cleaning.
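If your site builds to static files, you can audit the output directory directly before and after these cleanups. A sketch assuming a hypothetical `site/` build folder, with generated stand-in files for the demo:

```shell
# List HTML files in a build directory that exceed 2MiB.
# site/ and its contents are stand-ins; point find at your own build output.
mkdir -p site
head -c 3000000 /dev/zero > site/big.html   # oversized page
head -c 40000   /dev/zero > site/ok.html    # typical page
find site -name '*.html' -size +2M          # prints site/big.html
```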
The PDF Advantage
Here's an interesting angle: if you're publishing long-form content or resources, consider offering a PDF version alongside your HTML.
With a 64MB limit, PDFs can contain significantly more content whilst still being fully indexed. For white papers, research reports, or comprehensive guides, this could be a smart move.
Just make sure you're not creating duplicate content issues. Use canonical tags properly if you're serving the same content in multiple formats.
Why Google Made This Change
Google hasn't explicitly said why they documented this 2MB limit now, but it's not hard to guess.
Crawling costs money. Every byte Google downloads costs bandwidth and processing power. With the web growing exponentially, Google needs to be more efficient about what it crawls.
The 2MB limit encourages better web development practices. Sites should be lean, fast, and well-structured rather than massive single-page monoliths.
It's also worth noting this isn't really new. The limit has existed for a while, Google just never bothered documenting it clearly. They're finally being upfront about it.
Looking Forward
Will these limits get stricter? Probably not in the near term. 2MB is already generous for HTML content. But it's a reminder that Google rewards efficient, well-optimised sites.
The bigger lesson here is that web performance matters. Not just for user experience or Core Web Vitals, but for basic crawlability too. If your HTML files are anywhere near 2MB, you've got bigger problems than just Googlebot limits.
The Bottom Line
For 95% of sites, this change means absolutely nothing. Your pages are nowhere near 2MB, and you can carry on as normal.
If you're in that 5% with genuinely large HTML files, now's the time to audit and optimise. Not because Google's being mean, but because bloated HTML was probably hurting your site anyway.
The real frustration here isn't the limit itself. It's the silent truncation with no Search Console warnings. Google should really fix that. Until they do, regular audits with tools like Tame the Bots are your best defence.
And if you're publishing long-form content? Maybe it's time to reconsider the humble PDF. At 64MB, you've got plenty of room to work with.


