This Week in Turing Post:
|
Our news digest is always free. Upgrade to receive our deep dives in full, directly into your inbox.
|
|
Apologies for the delay – the whole internet was not super functional yesterday.
|
|
My path into machine learning started with the book The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World, written by Pedro Domingos in 2015. If you read Pedro Domingos on Twitter, you might hate him – he's famously provocative and doesn't care much about hurting feelings.
Recently, I went to the IA Summit, organized by Madrona, one of the most forward-thinking VC funds focused on AI. And there he was, Pedro Domingos himself, challenging speakers with sharp questions. "He might tell me off right on the spot," I thought – but curiosity overpowered my assumptions, and I asked if I could sit next to him. "You know," I said, "my ML journey started with your book." It might have melted his heart – I'm not sure – but we had a great conversation, discussing reasoning machines, reinforcement learning, and of course, I asked if he was finally working on the Master Algorithm I'd been waiting for since reading his book.
"I'm actually very close to publishing it," he said.
And so, last week, the world was quietly introduced to Tensor Logic – the paper Domingos believes is the closest realization of the Master Algorithm yet. It slipped under the radar, and that's exactly why I want to draw attention to it.
While the title sounds abstract, Tensor Logic is a serious attempt to do what Domingos has promised for a decade: to find a common language for all of AI. His argument is simple and radical – neural networks, symbolic logic, and probabilistic reasoning are not different fields at all. They are the same operation written in different notations. Logical rules, he shows, can be expressed as Einstein summations over tensors. In this view, everything from transformers to Prolog programs to Bayesian networks can be built from a single primitive: the tensor equation.
If that sounds theoretical, it's not. Domingos is proposing a new programming language for AI (and promises to open a repo soon) – one where both learning and reasoning live in the same algebra and run directly on GPUs. In Tensor Logic, a neural layer, a logical rule, and a probabilistic inference step all compile to the same structure. No Python scaffolding, no glue code between symbolic and neural components – just equations. It's mathematically elegant and potentially a foundation for the next generation of AI infrastructure.
Image credit: Slides on Tensor-logic.org
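To make the "logic as einsum" claim concrete, here is a minimal sketch in NumPy of how a Datalog-style rule compiles to a tensor equation. The toy relations, the domain size, and the hard thresholding step are my illustrative assumptions, not code from the paper:

```python
import numpy as np

# Hypothetical toy domain: 4 people, relations stored as Boolean adjacency matrices.
# Sister[x, y] = 1 means "x is a sister of y"; Parent[y, z] = 1 means "y is a parent of z".
n = 4
Sister = np.zeros((n, n), dtype=np.float32)
Parent = np.zeros((n, n), dtype=np.float32)
Sister[0, 1] = 1.0          # person 0 is a sister of person 1
Parent[1, 2] = 1.0          # person 1 is a parent of person 2
Parent[1, 3] = 1.0          # person 1 is a parent of person 3

# Logical rule:  Aunt(x, z) <- Sister(x, y) AND Parent(y, z)
# As a tensor equation: join on the shared index y via einsum, then project back to {0, 1}.
Aunt = (np.einsum("xy,yz->xz", Sister, Parent) > 0).astype(np.float32)

print(np.argwhere(Aunt == 1))   # [[0 2], [0 3]]: person 0 is an aunt of persons 2 and 3
```

The same einsum, with real-valued tensors and without the hard threshold, is just an ordinary neural-network layer, which is exactly the unification the paper is pointing at.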
|
The most intriguing part is what he calls reasoning in embedding space, and it matters right now because it hits the nerve of where the field is stuck. In Tensor Logic, facts and rules live inside vector embeddings. At low "temperature," the system behaves like pure logic – provable, deterministic, and free of hallucinations. As the temperature rises, reasoning becomes analogical: similar concepts borrow inferences from each other. This "logic-to-analogy" continuum could bridge the gap between the reliability of symbolic reasoning and the pattern recognition of LLMs.
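A rough way to see the dial in code (my illustration of the general idea, not the paper's formulation): concepts are embeddings, a softmax over similarities decides how much a new query borrows stored facts from its neighbors, and the temperature controls whether that borrowing is hard and logic-like or soft and analogical.

```python
import numpy as np

def softmax(scores, temperature):
    # Low temperature -> winner-take-all (hard, logic-like inference);
    # higher temperature -> mass spreads to similar concepts (analogy).
    z = scores / max(temperature, 1e-8)
    z = z - z.max()
    weights = np.exp(z)
    return weights / weights.sum()

# Hypothetical concept embeddings and one stored fact per concept ("can_fly").
names = ["sparrow", "penguin", "airplane"]
vecs = np.array([[1.0, 0.1], [0.9, 0.9], [0.0, 1.0]])
can_fly = np.array([1.0, 0.0, 1.0])

query = np.array([0.95, 0.15])   # an unseen, bird-like concept
sims = vecs @ query / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(query))

for temperature in (0.01, 0.5):
    weights = softmax(sims, temperature)
    inferred = float(weights @ can_fly)   # borrow the fact from weighted neighbors
    print(f"T={temperature}: weights={np.round(weights, 2)}, can_fly ~ {inferred:.2f}")
```

At T=0.01 the weights collapse onto the nearest stored concept and the answer is effectively a deterministic lookup; at T=0.5 the penguin and airplane start to pull on the inference, which is the analogical regime the paper describes.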
That's why this paper might become a big deal. Today's LLMs are fluent imitators but clumsy reasoners. They produce text without a formal system of truth. Tensor Logic offers a way to give them one – a mathematical substrate for reasoning, not just stochastic pattern completion. Can it become to AI what calculus was to physics? That remains to be seen. But it is certainly worth exploring.
|
Want to scale your AV data workflows 10x faster?
|
Most AV teams can segment LiDAR and camera data, but struggle to iterate quickly, detect rare events, and scale workflows reliably. The teams that succeed combine curation, annotation, and model evaluation all in one place. |
Encord is the universal data layer trusted by the world's leading ADAS & AV teams, like Woven by Toyota and Zipline.
We recommend joining Encord's LiDAR experts on Oct. 28 for a masterclass on how to:
- Visualize and curate multimodal data, including LiDAR and radar
- Automate 3D segmentation of obstacles with single-shot labeling and object tracking
- Create robust, scalable pipelines for model training and evaluation
|
Join live or sign up below to catch the replay →
|
|
Topic 2: With so much out there, attention gets stretched too thin. This time, we are focusing on the overlooked topics in the conversation between Andrej Karpathy and Dwarkesh Patel: the AGI bubble and the AI aristocracy. Watch it here →
Andrej Karpathy and Dwarkesh Patel – Popping the AGI Bubble, Building the AI Aristocracy
|
|
|
|
Links from the editorial: |
|
We are also reading/watching: |
|
|
|
Curated Collections – Learning is power
|
|
News from The Usual Suspects ©
Elon Musk (@elonmusk) on X:
"The X recommendation system is evolving very rapidly. We are aiming for deletion of all heuristics within 4 to 6 weeks. Grok will literally read every post and watch every video (100M+ per day) to match users with content they're most likely to find interesting. This should…"
Quoting DogeDesigner (@cb_doge):
"How the X algorithm really works? The algorithm doesn't boost or hide posts randomly, it just tries to figure out if your post will be interesting to people. That means: If you post a plain link with no words, the algorithm doesn't have much to judge, so it won't show it to…"
4:07 PM, Oct 17, 2025
|
|
Hugging Face launches its Omni Chat. The initial version of HF Chat was somewhat disappointing – when I tried it, it didn't work as expected. Since then, they've completely overhauled the design and functionality, turning it into a powerful router that intelligently selects the most suitable open-source model for each prompt. The new iteration looks great and performs impressively.

Anthropic has introduced Agent Skills, a modular way to enhance Claude's capabilities for specific tasks like Excel work or adhering to brand guidelines. Skills are lightweight, portable folders containing code, resources, and instructions that Claude loads only when needed. Developers and teams can now create and manage custom skills across Claude apps, the API, and Claude Code – bringing more structure and precision to AI workflows. Related blog by Simon Willison: Claude Skills are awesome, maybe a bigger deal than MCP
|
Nanochat is out – it's a fully open-source, end-to-end LLM pipeline that lets you build a working ChatGPT-style model from scratch in just a few hours and for around $100, making the entire system readable, hackable, and personally owned.
|
|
Amazing tutorial: Robot Learning – A Tutorial
|
|
Models to pay attention to |
World Labs (@theworldlabs) on X:
"Introducing RTFM (Real-Time Frame Model): a highly efficient World Model that generates video frames in real time as you interact with it, powered by a single H100 GPU. RTFM renders persistent and 3D consistent worlds, both real and imaginary. Try our demo of RTFM today!"
3:03 PM, Oct 16, 2025
|
|
DeepSeek-OCR: Contexts Optical Compression
Researchers from DeepSeek AI released DeepSeek-OCR, a vision-text model designed to integrate document understanding into LLMs efficiently. It introduces Contexts Optical Compression, supporting native resolutions from 512×512 to 1280×1280 and a dynamic "Gundam" mode. Prompts enable markdown conversion, layout parsing, OCR, and visual grounding. Running at ~2500 tokens/sec on A100-40G GPUs, it supports both vLLM and Transformers. The model emphasizes efficient visual-token compression and supports flash attention 2.0 for acceleration. It is open-sourced under the MIT license and optimized for OCR-rich tasks across diverse visual layouts → GitHub

Fantastic (small) retrievers and how to train them: mxbai-edge-colbert-v0 tech report
Researchers from Mixedbread AI and Waseda University introduced mxbai-edge-colbert-v0, two late-interaction ColBERT models with 17M and 32M parameters. These outperform ColBERTv2 on BEIR despite lower embedding dimensions (48/64). Using ModernBERT backbones, multi-stage training (contrastive pre-training, fine-tuning, distillation), and optimized ablations, the 17M model supports 32k contexts, runs efficiently on CPU, and stores vectors with 2.5× less memory. It achieves 0.6405 NDCG@10 on NanoBEIR (a toy sketch of late-interaction scoring follows this list) → read the paper

Qwen3Guard technical report
Researchers from Qwen introduced Qwen3Guard, a multilingual safety moderation model available in 0.6B, 4B, and 8B sizes, supporting 119 languages. It includes Generative Qwen3Guard for tri-class safety classification (safe, controversial, unsafe) and Stream Qwen3Guard for token-level real-time moderation. Qwen3Guard-Gen achieves state-of-the-art F1 scores on 8 of 14 English benchmarks, surpasses larger models on multilingual tasks, and supports response refusal detection. Stream Qwen3Guard achieves near real-time latency with only a ~2-point performance drop, enabling efficient streaming safety interventions → read the paper

A2FM: An adaptive agent foundation model for tool-aware hybrid reasoning
Researchers from OPPO developed A2FM, a 32B model integrating three execution modes: agentic (tool-using), reasoning (chain-of-thought), and instant (direct answers). It uses a route-then-align strategy and introduces Adaptive Policy Optimization (APO) for efficiency-accuracy trade-offs. A2FM achieves 13.4% on BrowseComp, 70.4% on AIME25, and 16.7% on HLE. It surpasses 32B peers in cost efficiency at $0.00487 per correct answer, cutting cost by 45.2% vs. reasoning mode. It ranks top across agentic, reasoning, and general benchmarks → read the paper
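Since late interaction is the core mechanism behind the mxbai-edge-colbert models above, here is a minimal, framework-free sketch of ColBERT-style MaxSim scoring. The shapes and the 48-dimensional random vectors are assumptions for illustration; this is the generic technique, not the mxbai code:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style late interaction: sum over query tokens of the max
    cosine similarity against all document token vectors."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T                      # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

# Hypothetical token embeddings (in the real models these come from the encoder,
# with per-token dimensions as low as 48 or 64 to keep the index small).
rng = np.random.default_rng(0)
query = rng.normal(size=(5, 48))       # 5 query tokens
doc_a = rng.normal(size=(120, 48))     # 120 document tokens
doc_b = rng.normal(size=(80, 48))

scores = {"doc_a": maxsim_score(query, doc_a), "doc_b": maxsim_score(query, doc_b)}
print(max(scores, key=scores.get), scores)
```

Because each query token only needs its single best match, the per-document work stays cheap even when token vectors are stored at very low dimension, which is what makes edge-sized ColBERT models practical.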
|
|
The freshest research papers, categorized for your convenience |
We organize research papers by goal-oriented or functional categories to make it easier to explore related developments and compare approaches. As always, papers we particularly recommend are marked with 🌟
Reinforcement Learning for Reasoning & Agents |
🌟🌟🌟 The Art of Scaling Reinforcement Learning Compute for LLMs (by Meta et al.) – model compute-performance curves, isolate which choices shift asymptotes vs. efficiency, and propose a predictable, scalable RL recipe. This paper marks the turning point where reinforcement learning becomes an engineering science – transforming opaque, trial-and-error reward tuning into a predictable, scalable process that can guide the next generation of reasoning and alignment in large models → read the paper
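The paper's framing suggests a simple mental model: fit a saturating curve to pass rate versus RL compute, then read the asymptote (what a recipe can ever reach) separately from the efficiency (how fast it gets there). Below is a hedged sketch of that idea with made-up numbers and an assumed sigmoidal form, not the paper's exact parameterization:

```python
import numpy as np
from scipy.optimize import curve_fit

def saturating_curve(compute, asymptote, efficiency, midpoint):
    # Pass rate rises toward `asymptote`; `efficiency` and `midpoint` control
    # how quickly and at what compute scale the gains arrive.
    return asymptote / (1.0 + (midpoint / compute) ** efficiency)

# Made-up observations: (GPU-hours of RL, pass rate on a held-out benchmark).
compute = np.array([1e2, 3e2, 1e3, 3e3, 1e4, 3e4])
pass_rate = np.array([0.12, 0.21, 0.33, 0.42, 0.47, 0.49])

params, _ = curve_fit(saturating_curve, compute, pass_rate,
                      p0=[0.5, 1.0, 1e3], maxfev=10000)
asymptote, efficiency, midpoint = params
print(f"asymptote={asymptote:.2f}, efficiency={efficiency:.2f}, midpoint={midpoint:.0f}")
print("predicted at 1e5 GPU-hours:", round(saturating_curve(1e5, *params), 3))
```

The useful consequence of such a fit is that two recipes can be compared from small pilot runs: one may only change how fast the curve rises, while another genuinely moves the ceiling.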
|
|
QeRL: Beyond Efficiency – Quantization-enhanced Reinforcement Learning for LLMs – combine low-precision NVFP4 + LoRA with adaptive quantization noise to speed rollouts, raise exploration entropy, and match full-tune reasoning accuracy at far lower compute → read the paper
Agentic Entropy-Balanced Policy Optimization – balance entropy during rollouts and updates with pre-monitoring and entropy-aware advantages to stabilize long-horizon tool use and improve pass rates on web-agent tasks → read the paper
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization – identify attention patterns that mark critical tokens and assign targeted RL credit to them for consistent reasoning gains → read the paper
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding – replace costly verifier passes by aligning a last-token self-reward with reasoning rewards, improving RLVR training and test-time scaling at minimal extra inference cost → read the paper
Information Gain-based Policy Optimization – compute dense, intrinsic turn-level rewards from belief updates to overcome reward sparsity in multi-turn agents and improve sample efficiency → read the paper
Demystifying Reinforcement Learning in Agentic Reasoning – distill practical recipes across data, algorithms, and reasoning modes that let small models rival larger ones on agentic benchmarks → read the paper
Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs – adapt grouped RL to multi-agent roles and turns, scaling cooperative planning, coding, and math accuracy → read the paper
|
Architectures, Efficiency & Compression |
Diffusion Transformers with Representation Autoencoders – replace VAEs with pretrained representation encoders plus trained decoders to yield richer latents, faster convergence, and state-of-the-art DiT image generation → read the paper
Attention Is All You Need for KV Cache in Diffusion LLMs – refresh KV caches adaptively by attention-aware drift tests and depth-aware schedules to accelerate diffusion decoding without quality loss → read the paper
Dr.LLM: Dynamic Layer Routing in LLMs – learn per-layer routers (skip/execute/repeat) from MCTS-discovered paths to cut compute while improving or preserving accuracy across tasks → read the paper
🌟 BitNet Distillation (by Microsoft) – distill full-precision LLMs into ternary-weight task models with SubLN, attention distillation, and warm-start pretraining to deliver large memory savings and faster CPU inference (a toy sketch of ternary quantization follows this list) → read the paper
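For intuition on what "ternary-weight" means in the BitNet line of work, here is a small sketch of absmean-style quantization to {-1, 0, +1}. It is my illustration of the general idea, under the assumption of a per-tensor absmean scale, not Microsoft's distillation code:

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to {-1, 0, +1} with a per-tensor scale.
    The scale is the mean absolute value (absmean), so w ~= scale * w_ternary."""
    scale = np.abs(w).mean() + eps
    w_ternary = np.clip(np.round(w / scale), -1, 1)
    return w_ternary.astype(np.int8), scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4, 8)).astype(np.float32)
w_t, scale = ternary_quantize(w)

print(w_t)                                               # entries in {-1, 0, 1}
print("reconstruction error:", float(np.abs(w - scale * w_t).mean()))
```

Storing only the int8 (or packed 1.58-bit) ternary matrix plus one float scale is where the memory savings and fast CPU matmuls come from; the distillation recipe is about recovering the accuracy lost in that rounding.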
|
Retrieval & Knowledge Access |
🌟 RAG-Anything: All-in-One RAG Framework (by Hong Kong University) – unify multimodal documents via dual-graph structure and cross-modal hybrid retrieval to reason over long, heterogeneous evidence → read the paper
LLM-guided Hierarchical Retrieval – impose a semantic tree over large corpora and traverse with calibrated relevance to achieve logarithmic-complexity, zero-shot retrieval on reasoning-heavy datasets (a toy traversal sketch follows this list) → read the paper
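To see why a semantic tree buys logarithmic cost, here is a toy sketch of greedy root-to-leaf traversal. Cosine similarity stands in for the LLM's calibrated relevance judgment, and the two-level tree is a made-up example, not the paper's method:

```python
import numpy as np

# A toy semantic tree: internal nodes hold a summary embedding, leaves hold documents.
class Node:
    def __init__(self, embedding, children=None, doc=None):
        self.embedding = np.asarray(embedding, dtype=np.float32)
        self.children = children or []
        self.doc = doc

def score(query_vec, node):
    # Stand-in for the LLM's calibrated relevance judgment: cosine similarity.
    return float(query_vec @ node.embedding /
                 (np.linalg.norm(query_vec) * np.linalg.norm(node.embedding) + 1e-8))

def hierarchical_retrieve(root, query_vec):
    """Greedy root-to-leaf descent: score only the children of the current node,
    so the number of scoring calls grows with tree depth, not corpus size."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: score(query_vec, child))
    return node.doc

leaves = [Node([1, 0], doc="doc about GPUs"), Node([0.9, 0.2], doc="doc about TPUs"),
          Node([0, 1], doc="doc about sourdough"), Node([0.1, 0.9], doc="doc about pastry")]
hardware = Node([0.95, 0.1], children=leaves[:2])
baking = Node([0.05, 0.95], children=leaves[2:])
root = Node([0.5, 0.5], children=[hardware, baking])

print(hierarchical_retrieve(root, np.array([1.0, 0.05])))   # -> "doc about GPUs"
```

With branching factor b and N documents, each query costs roughly b·log_b(N) relevance calls instead of N, which is the source of the logarithmic-complexity claim.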
|
Multimodal & Representation Learning |
Scaling Language-Centric Omnimodal Representation Learning – leverage generative pretraining's latent cross-modal alignment and refine with contrastive learning, revealing a generation-representation scaling law → read the paper
🌟 OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM (by NVIDIA) – align audio-vision embeddings and curate omni-modal conversations to outperform larger omni models with far fewer tokens → read the paper
|
Theory, Evaluation & Practice |
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning – analyze sampling-based test-time scaling and propose a hybrid that improves confidence reliability while halving sampling cost → read the paper
🌟 The Role of Computing Resources in Publishing Foundation Model Research (by UT Austin, UCLA, Google) – quantify how compute access correlates with citations and advocate shared infrastructure to broaden participation → read the paper
|
Applications & Systems |
|
Philosophy & Framing |
|
That's all for today. Thank you for reading! Please send this newsletter to colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.
How did you like it? |
|
|