Project Glasswing: Securing critical software for the AI era (10 minute read)
Anthropic's Claude Mythos Preview autonomously identified thousands of zero-day vulnerabilities across major operating systems and browsers. Project Glasswing, in partnership with major tech companies, uses these capabilities to enhance cybersecurity by detecting and fixing vulnerabilities at scale. Anthropic plans to develop safeguards and broaden industry cooperation to address security challenges in the AI era.
|
GLM-5.1: Towards Long-Horizon Tasks (14 minute read)
GLM-5.1 is a flagship model for agentic engineering created by Z.ai. It achieves state-of-the-art performance on SWE-Bench Pro. The model is built to stay effective on agent tasks over much longer horizons than previous generations. It can sustain optimization over hundreds of rounds and thousands of tool calls. The model breaks complex problems down, runs experiments, reads results, and identifies blockers with real precision.
|
|
My picture of the present in AI (11 minute read)
Ryan Greenblatt is the chief scientist at Redwood Research, a research organization whose mission is aligning superhuman AI. This post walks through some of his best guesses about the current state of AI. The scenario forecast discusses R&D access regulations, engineering capabilities and qualitative abilities, misalignment and misalignment-related properties, cyber, bioweapons, and economic effects. Some of the claims are highly speculative, while others are better grounded.
|
Claude Mythos (31 minute read)
Anthropic detailed early evaluations of Claude Mythos Preview, highlighting strong performance in discovering zero-day vulnerabilities and reverse-engineering exploits, prompting a coordinated security initiative called Project Glasswing.
|
AI Can't Read an Investor Deck (6 minute read)
Current AI models struggle to interpret complex financial documents, especially when they must extract data from visuals. GPT-5.4, Gemini 3.1 Pro, and Claude Opus 4.6 consistently falter on dense charts and images, achieving only 56% to 64% accuracy compared with 72% to 80% on text-only inputs. These findings highlight significant gaps in AI's ability to perform real-world financial reasoning tasks, which makes replacing financial analysts with AI look premature.
|
|
TorchTPU: Running PyTorch Natively on TPUs at Google Scale (10 minute read)
Google's Tensor Processing Units (TPUs) are foundational to the company's supercomputing infrastructure. The company's custom ASICs power training and serving for Google and its Cloud customers. TorchTPU is a stack that makes it easy for the AI community to access the full capabilities of TPUs. It provides the APIs and tools needed to extract every ounce of compute from Google's hardware. This post takes a look under the hood at the engineering principles behind TorchTPU.
|
TriAttention for KV Cache Compression (GitHub Repo)
TriAttention estimates KV importance in pre-RoPE space using stable Q/K centers and distance-based scoring, preserving long-context reasoning quality while sharply reducing KV memory use and improving throughput.
|
|
We're actually running out of benchmarks to upper bound AI capabilities (7 minute read)
METR's Time Horizon suite is being saturated. Frontier AI models can reliably do all but maybe a dozen or so tasks in the suite, making it hard to upper bound their time horizon. New benchmarks are becoming more expensive to grade and create. The situation will likely get worse as AI progress continues. It is likely that, by mid-2027, no benchmark score from a 2026 or earlier benchmark will be able to rule out dangerous capabilities from frontier AI systems.
|
Elon Musk Asks for OpenAI's Nonprofit to Get Any Damages From His Lawsuit (3 minute read)
Elon Musk's lawsuit against OpenAI is expected to go to trial later this month in Oakland, California. Musk has amended the lawsuit to ask that any damages he might win be awarded to OpenAI's charitable arm rather than to himself. The amendment also asks that OpenAI CEO Sam Altman be removed from the OpenAI nonprofit's board. Musk is seeking more than $150 billion in damages from both OpenAI and Microsoft. He believes OpenAI strayed from its nonprofit mission and defrauded him as a donor in seeking to convert to a for-profit company.
|