Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x (4 minute read)
Google's TurboQuant is a compression algorithm that reduces the memory footprint of large language models while boosting speed and maintaining accuracy. It shrinks the key-value cache, the store of intermediate attention values that lets a model avoid recomputing them for every new token. Early testing shows TurboQuant delivers an 8x performance increase and a 6x reduction in memory usage without a loss of quality. Compression techniques like TurboQuant could let edge devices run higher-quality models locally without sending data to the cloud.
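A rough sketch of the two ideas in this blurb, with illustrative shapes and a naive per-row int8 scheme that is not TurboQuant's actual method: a KV cache grows by one row per decoded token instead of recomputing past keys, and quantizing that cache halves its size versus fp16.

```python
import numpy as np

d = 64                                         # head dimension (illustrative)
cache_k = np.empty((0, d), dtype=np.float16)   # grows one row per token

def decode_step(new_k):
    """Append this token's key instead of recomputing all past keys."""
    global cache_k
    cache_k = np.vstack([cache_k, new_k.astype(np.float16)])
    return cache_k

for _ in range(10):                            # decode 10 tokens
    decode_step(np.random.randn(1, d))

# Naive int8 quantization of the cache: one scale per cached row.
scales = np.abs(cache_k).max(axis=1, keepdims=True).astype(np.float32) / 127
cache_k_int8 = np.round(cache_k.astype(np.float32) / scales).astype(np.int8)

print(cache_k.nbytes, "->", cache_k_int8.nbytes)  # int8 copy is half the fp16 size
```

The cache trades memory for compute, which is why shrinking it (rather than recomputing it) is the lever compression schemes like TurboQuant pull.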
|
ARC-AGI-3 is out (2 minute read)
ARC-AGI-3 was designed to evaluate agentic intelligence via interactive reasoning environments. Beating it will mean an AI system matches or exceeds human-level efficiency on all environments upon seeing them for the first time. 100% of the environments are solvable by humans on first contact with no prior training or instruction. All frontier AI reasoning models currently solve under 1%.
|
Nvidia-Backed Startup Seeking to Counter Chinese AI Eyes $25 Billion Valuation (3 minute read)
Reflection is a startup leading an effort to create freely available US AI systems. It is one of a handful of Nvidia-linked startups seeking to build a network of open source AI models. The startup is in talks to raise $2.5 billion at a valuation of $25 billion. Investors describe Reflection as the 'DeepSeek of the West' as it offers an alternative to the open source models offered by Chinese companies.
|
Leaders of AI Firm Bought by Meta Are Restricted From Leaving China (4 minute read)
Manus' co-founders, Xiao Hong and Ji Yichao, have been told not to leave China while authorities review the company's $2.5 billion sale to Meta. Early versions of Manus were created by engineers from a Chinese company. A Singapore-based entity then took over Manus' operations and relocated most of its China-based employees to Singapore, which made it possible for Meta to purchase it. Authorities are concerned that Manus' moves could encourage other Chinese companies to follow suit and move out of the country without vetting.
|
Closed Source vs Open Source AI: A Cage Fight Few People Understand (13 minute read)
Open source models are reaching parity with frontier labs' models, which makes those labs' equity look overpriced if they are simply utilities. The frontier labs have enterprise agreements, safety certifications, distribution, research talent, and regulatory positioning, but none of that fully explains their moat. People focus on capability, yet the number that actually matters for valuations is the monetizable spread: the subset of the capability delta that customers will actually pay a premium for. That monetizable spread is declining faster than the capability spread.
|
Quantization from the ground up (35 minute read)
Quantized models are actually pretty good. Going from 16-bit to 8-bit quantization carries almost no quality penalty; the difference is more noticeable at 4 bits, where a model performs only about 90% as well as the original. These models are worth experimenting with because they are much smaller and can run on far more systems. This article explains how model parameters work, what quantization is, how it is applied in practice, and its effects on model accuracy.
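The 16-bit-to-8-bit step the article builds up to can be sketched as symmetric quantization: map weights onto the int8 range via a single scale, then multiply back at runtime. The layer size and names below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0, 0.02, size=1024).astype(np.float32)  # a fake fp32 "layer"

scale = np.abs(weights).max() / 127           # map [-max, max] onto [-127, 127]
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
dequant = q.astype(np.float32) * scale        # what the model uses at runtime

print("memory:", weights.nbytes, "->", q.nbytes)       # 4096 -> 1024 bytes
print("max error:", np.abs(weights - dequant).max())   # bounded by scale / 2
```

The per-element error is bounded by half the scale, which is why 8 bits is nearly lossless while 4 bits (16 levels instead of 255) starts to show.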
|
Final training runs account for a minority of R&D compute spending (12 minute read)
The final training run for a model is only the last step in a long, expensive process. Before the run, companies burn compute on running experiments at various scales, generating synthetic data, testing ideas, and training unreleased models. The full cost of developing a model is much higher than the cost of the final training run of a frontier model. Most of the spend is on exploration rather than execution. Companies that learn from their competition can replicate their results for a fraction of the original cost.
|
What Happened When I Applied Karpathy's Autoresearch Idea to LLM Inference (3 minute read)
Manthan Gupta built Auto-Inference-Optimiser to let an AI agent hill-climb on LLM inference speed on Apple Silicon while holding output quality fixed. Argmax sampling and simplifying the inference code gave the largest throughput gains, while most tuning knobs and KV cache quantization hurt or had no effect. The project highlights that a disciplined, observable harness is critical for distinguishing real performance wins from noise or benchmark illusions.
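To see why argmax sampling speeds up decoding, compare it with standard temperature sampling; greedy decoding skips the softmax normalization and random draw entirely. The logits and function names below are illustrative, not from the project.

```python
import numpy as np

logits = np.array([2.0, 0.5, -1.0, 1.2], dtype=np.float32)

def sample_temperature(logits, temperature=0.8, rng=np.random.default_rng()):
    """Standard sampling: softmax over scaled logits, then a random draw."""
    scaled = logits / temperature
    p = np.exp(scaled - np.max(scaled))   # subtract max for numerical stability
    p /= p.sum()
    return rng.choice(len(logits), p=p)

def sample_argmax(logits):
    """Greedy: no exp, no normalization, no RNG - just the max logit."""
    return int(np.argmax(logits))

print(sample_argmax(logits))  # always token 0, the highest logit
```

Greedy decoding is also deterministic, which makes before/after benchmark runs directly comparable — useful in a harness trying to separate real wins from noise.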
|
Inside the grind: The SF startup racing to build an AI software engineer (14 minute read)
Cognition's Devin is an AI software engineer that can build software from start to finish without human involvement. When it launched in 2024, it was considered a step toward a long-held Silicon Valley dream of a machine that codes for you. Cognition's CEO, Scott Wu, believes that the technology doesn't mean the end of software engineering. Rather than eliminate engineers, Cognition's tools will allow them to focus on the best parts of the job while sparing them from the grunt work that traditionally consumes most of their time.