This Week in Turing Post:
- Wednesday, AI 101, Model: New and old models of Liquid AI
- Friday, Interview: discussing coding with Amjad Masad from Replit
Our news digest is always free. Upgrade to receive our deep dives in full, directly into your inbox. If you want to support us without getting a subscription – do it here.
Last week, a lot of interesting reports, articles, and discussions were about coding.

Anthropic's new Economic Index shows how fast AI is reshaping software development. By analyzing 500,000 coding-related interactions across Claude.ai and Claude Code, they found that automation is moving faster – and deeper – than most realize.

Claude Code users hit 79% automation rates, compared to Claude.ai's 49%. The distinction between coding with AI and coding by AI is starting to evaporate. Today, developers still hover over the process, validating outputs. But it's not clear how long that safety net will last.

Surprisingly, it's not just backend grunt work that's being automated. Most of the action is in building user-facing web and mobile apps – JavaScript, TypeScript, HTML, CSS. Which means simple UI jobs, not the deep backend, might be the first to collapse under AI pressure.

Startups are sprinting ahead: 33% of Claude Code conversations revolve around startup projects. Only 13% come from enterprises. Corporate inertia, risk aversion, security bottlenecks – all giving startups an opening that could soon look like a canyon.

Sure, these are Anthropic users – early adopters by definition. But the trend is clear. Coding is turning into an AI-first economic function. Developers are the first wave. If you're still thinking in terms of lines of code instead of workflows and systems management, the future's already moving past you.

Microsoft, meanwhile, published its 2025 Work Trend Index Annual Report, where it introduces the "Frontier Firm": a new breed of organization where AI agents work alongside humans at scale. With intelligence "on tap," org charts will collapse, humans will become "agent bosses," and every workflow gets rewritten. In Microsoft's view, this is the next step after AI's reshaping of software development.

Now, about the "vibes."

If you noticed, we never wrote about "vibe coding." Now's the time. Because even Andrej Karpathy is now confused.
When he coined "vibe coding" back in February 2025, he meant something very specific: fast, experimental, throwaway projects built with AI. Not serious production work. Not responsible software engineering. "Forget the code exists," he said. Accept all suggestions, don't read the diffs, fix bugs by guessing, move fast. Perfect for prototypes and weekend hacks – and fun.

Simon Willison, who has cataloged dozens of experiments in this style, stresses that vibe coding is not the same as professional AI-assisted development. Vibe coding is playful and low-stakes. It's about seeing how far you can get without worrying about long-term consequences. It has its place – especially for beginners or for sparking new ideas.

But even Karpathy now seems uneasy with how the term has ballooned. In recent posts, he has been careful to separate "real coding" and "AI-assisted coding" from pure vibe experiments. It's clear he is looking for better language – something that captures the idea of working conversationally with AI while still doing serious engineering. (I suggested "co-coding" – a term that hints at both collaboration and conversation – as a possible middle ground.)

The bigger problem is that "vibe coding" is already bleeding into broader use. People casually label all AI-assisted programming as "vibe coding," flattening important distinctions. If executives, investors, or policymakers hear that AI coding is just vibes, it risks undermining real progress – and misrepresenting the actual demands of building secure, maintainable, responsible software with AI in the loop.

Responsible AI-assisted coding still demands rigor: reviewing code, testing it, understanding it, explaining it. If you don't do that, you're vibe coding. If you do, you're engineering – AI or no AI.

So while AI is truly eating software development, we are floundering about in vibes, which – historically! – is something completely different.

the evolution of actual vibes
The real vibes were noticed in the article The Hidden Cost of AI Coding – a quiet grief for the fading joy of flow in programming, as AI shifts creation from an act of immersion to one of curation.

Much more cheerful was Andrew Ng, who argues that AI is making programming languages less of a barrier: if you understand core coding concepts, you can prompt LLMs to generate working code across many languages – becoming simply a "developer," not just a "Python" or "Java" one. He doesn't use any "vibe" language – understanding ideas like arrays, memory, and system design still matters to guide the AI and debug effectively.

Welcome to Monday. Where staying precise with how we talk about AI is part of keeping control.

We've noticed our open rate falling. Please move us to your Primary inbox!
Curated Collections

We are reading/watching

News from The Usual Suspects ©

AI in Politics
At a recent Politburo session, President Xi Jinping called for accelerated AI self-reliance, urging deeper investment in chips, infrastructure software, and risk management systems. President Trump signed an executive order establishing a White House Task Force on AI Education, aiming to embed AI literacy across K–12 education.
Nous Labs

Cognition
Cognition, the minds behind Devin, unveiled DeepWiki – a free Wikipedia-style guide for GitHub repositories. Swap a GitHub URL for DeepWiki's, and you're greeted with a clean breakdown of the code plus a Devin-powered chatbot to navigate it.
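The URL swap really is just a string substitution. A minimal sketch, assuming the deepwiki.com domain mirrors GitHub's owner/repo path structure (the helper name here is our own invention):

```python
def deepwiki_url(github_url: str) -> str:
    """Convert a GitHub repository URL to its DeepWiki counterpart.

    Assumption: deepwiki.com mirrors GitHub's /owner/repo paths,
    so only the domain needs to change.
    """
    return github_url.replace("github.com", "deepwiki.com", 1)

print(deepwiki_url("https://github.com/huggingface/transformers"))
# https://deepwiki.com/huggingface/transformers
```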
Microsoft
Copilot goes full agentic: AI-powered Search, Create, project-style Notebooks, and an Agent Store – positioning Copilot as a front-end for a fleet of enterprise agents.
OpenAI
The latest GPT-4o update improves STEM reasoning, memory handling, and conversational goal alignment. A notable addition: subtle "fuzzy" upgrades make the model better at intuiting implied intent – an important step toward building more natural, less rigid AI systems. Leaked investor decks show OpenAI projecting $125 billion in revenue for 2029 and $174 billion for 2030, driven by agent subscriptions, "free-user monetization," and new products beyond ChatGPT.
Anthropic
Anthropic outlines its evolving framework for assessing and mitigating AI harms – from catastrophic risks to everyday misuse. Important insights into balancing helpfulness, privacy, and risk management. Anthropic announces a council of leading economists, including Tyler Cowen and John List, to guide research on AI's economic impact, especially on labor markets and broader socioeconomic systems. Anthropic detects and counters malicious use of its models – including influence operations, credential stuffing, recruitment fraud, and malware development.
Meta
Models to pay attention to:
- 🌟 Hyena Edge: design a convolution-based hybrid architecture to outperform Transformer models in speed, memory, and quality on smartphones and other edge devices.
- 🌟 Tina: Tiny Reasoning Models via LoRA: achieve strong reasoning capabilities with tiny models by applying cost-efficient low-rank adaptation and reinforcement learning.
- 🌟 Kimi-Audio: build a universal audio foundation model for understanding, generating, and conversing in audio and text, achieving SOTA across diverse benchmarks.
- AIMO-2 winning solution: build state-of-the-art mathematical reasoning models with the OpenMathReasoning dataset.
- Eagle 2.5: expand vision-language models to handle long-context video and image comprehension with specialized training tricks and efficient scaling.
- Trillion-7B: develop a highly token-efficient multilingual LLM using specialized cross-lingual techniques for Korean, Japanese, and more.
- Surya OCR: release an open-source, high-speed OCR model supporting 90+ languages with LaTeX formatting and structured output for real-world document processing.
- 🌟 Process Reward Models That Think: introduces ThinkPRM, a generative verifier that scales step-wise reward modeling with minimal supervision.
- Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning: advances multimodal reasoning with a hybrid RL paradigm balancing reward guidance and rule-based strategies.
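Tina's headline trick, low-rank adaptation (LoRA), is easy to see in miniature: freeze the pretrained weight matrix W and learn only a rank-r update BA, which cuts trainable parameters from d*d to 2*d*r. A toy numpy sketch of the idea (dimensions are illustrative, not Tina's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4                          # hidden size, adapter rank (r << d)

W = rng.normal(size=(d, d))           # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01    # trainable down-projection
B = np.zeros((d, r))                  # trainable up-projection, zero-init

def adapted_forward(x):
    # y = x W^T + x (BA)^T : in LoRA only A and B receive gradient updates
    return x @ W.T + x @ (B @ A).T

x = rng.normal(size=(1, d))
# Zero-initializing B makes the adapter an exact no-op before training starts.
assert np.allclose(adapted_forward(x), x @ W.T)

# Trainable parameters: 2*d*r for the adapter vs d*d for full fine-tuning.
print(2 * d * r, "adapter params vs", d * d, "full params")
```

The cost savings scale with d: at transformer sizes (d in the thousands), the rank-r adapter is orders of magnitude smaller than the frozen weight it modifies.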
The freshest research papers, categorized for your convenience

There were quite a few TOP research papers this week; we will mark them with 🌟 in each section.

Reasoning, Decision-Making, and Agents
- Does RL Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? critiques RL's ability to create genuinely new reasoning abilities in LLMs, showing it mostly reweights existing paths → read the paper
- 🌟 TTRL: Test-Time Reinforcement Learning introduces a method for self-evolving LLMs at test time using reward signals without labeled data → read the paper
- Learning to Reason under Off-Policy Guidance proposes LUFFY, a method that mixes on-policy and off-policy traces to generalize reasoning better → read the paper
- FlowReasoner: Reinforcing Query-Level Meta-Agents builds a meta-agent that dynamically designs query-specific multi-agent systems via reasoning and RL → read the paper
- Learning Adaptive Parallel Reasoning with Language Models develops APR to orchestrate serialized and parallel computations adaptively for faster and more efficient reasoning → read the paper
- 🌟 LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities analyzes how RL fine-tuning improves exploration and decision-making abilities of LLMs → read the paper
- Causal-Copilot: An Autonomous Causal Analysis Agent builds a domain-agnostic autonomous agent for full causal analysis pipelines → read the paper
- 🌟 Paper2Code automates end-to-end ML paper-to-code translation with a multi-agent framework → read the paper
Pretraining, Data Selection, and Scaling
- 🌟 Efficient Pretraining Length Scaling presents PHD-Transformer to enable efficient long-context pretraining without inflating memory costs → read the paper
- QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining proposes a unified data sampling strategy that balances quality and diversity during pretraining → read the paper
Architecture, Optimization, and Efficiency
- 🌟 The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs investigates sparse attention trade-offs and proposes scaling laws for long-context LLMs → read the paper
- BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs enables native 4-bit activation quantization in ultra-low-bit LLMs using Hadamard transformations → read the paper
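The Hadamard idea behind BitNet v2 can be sketched in a few lines: an orthonormal rotation spreads activation outliers across all dimensions before quantization, then rotates back, so a single large value no longer dominates the 4-bit scale. A toy numpy sketch of the general rotate-quantize pattern (not BitNet v2's actual kernel):

```python
import numpy as np

def hadamard(n):
    # Sylvester construction: H_{2k} = [[H, H], [H, -H]]; n must be a power of 2.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)             # orthonormal: H @ H.T == I

def quantize4(x):
    # Naive symmetric 4-bit quantization (scale set by the largest magnitude).
    s = np.abs(x).max() / 7
    return np.clip(np.round(x / s), -8, 7) * s

rng = np.random.default_rng(0)
x = rng.normal(size=8)
x[0] = 20.0                           # one outlier dominates the quantization scale
H = hadamard(8)

direct_err = np.abs(quantize4(x) - x).mean()
# Rotate, quantize, rotate back: the outlier's energy is spread across all
# coordinates, so the 4-bit grid typically covers typical values more finely.
rotated_err = np.abs(H.T @ quantize4(H @ x) - x).mean()
print("direct:", direct_err, "rotated:", rotated_err)
```

Because the Hadamard matrix is orthonormal, the inverse rotation is just its transpose, and the round trip adds no error beyond quantization itself.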
Learning Frameworks and Representations
- 🌟 Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction highlights limitations of next-token prediction and proposes noise-injection strategies for open-ended creativity → read the paper
- I-Con: A Unifying Framework for Representation Learning presents a theoretical framework unifying clustering, contrastive learning, and other methods via information geometry → read the paper
- Interpretable Non-Linear Dimensionality Reduction Using Gaussian Weighted Linear Transformation combines interpretability and non-linearity for dimensionality reduction with Gaussian-weighted transformations → read the paper
Safety, Evaluation, and Societal Impact
- A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment delivers a full-lifecycle safety survey from data generation to LLM commercialization → read the paper
- 🌟 Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions maps AI value expressions across real-world interactions to inform grounded AI value alignment → read the paper
That's all for today. Thank you for reading! Please send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.

Leave a review!