This Week in Turing Post:
Wednesday / AI 101 series: Kimi K2 Thinking and why all the hype
Friday / Interview: Great conversation with Anneka Gupta, CPO at Rubrik
Free Expert Guide to Multi-Agent Systems
Multi-agent systems can handle more complex tasks, but are they worth the orchestration overhead, and how can they be made reliable in production?
Read Galileo's new 165-page guide for a comprehensive exploration of multi-agent systems, a true treasure trove for learning.
Our news digest is always free. Click on the partner's link to support us, or Upgrade to receive our deep dives in full, directly in your inbox →
Editorial: Today I have two things to share with you.
First, Kosmos AI. This week, many people in research circles are talking about Kosmos AI, a new "AI scientist" built by Edison Scientific in collaboration with many great research institutions and supported by Eric Schmidt (our reader and supporter as well).
Edison Scientific, a spinout from FutureHouse, designed Kosmos as a system for autonomous discovery. In its first phase, Kosmos produced seven findings across neuroscience, materials science, and clinical genetics. Three reproduced unpublished human work; four are new contributions now being validated with academic partners. About 79 percent of its results replicate, roughly the same rate as early-stage human research. Each discovery can be traced back to the exact code and papers that informed it, giving a rare level of auditability to AI-generated science. That alone is remarkable.
But what I found most interesting is that anyone can use it, not just scientists. And here's an example that probably resonates with everyone: medical advice. It's a bit personal, but I'm doing a long fast right now (three days coffee + water, then eight days water only). It's not my first time, and I often use ChatGPT to guide me, but its answers can change a lot from one day to another. That's fine when you know the subject and can read between the lines, but when it comes to your health, you need something you can actually trust: something that shows what the entire research field says, with a list of sources you can verify and explore if needed.
The model runs on a credit system: free users get 10 credits, and each conversation with a specialized agent costs one. Follow-ups count too.
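As a back-of-the-envelope illustration, that accounting can be sketched in a few lines. The class and method names below are mine, not Kosmos's actual API; only the numbers (10 free credits, one per message, follow-ups included) come from the text above.

```python
class CreditWallet:
    """Toy model of a per-message credit system (illustrative only)."""

    def __init__(self, credits: int = 10):  # free tier starts with 10
        self.credits = credits

    def send_message(self) -> bool:
        """Spend one credit per message to a specialized agent.
        Follow-ups in the same conversation cost a credit too."""
        if self.credits <= 0:
            return False  # out of credits: time to upgrade
        self.credits -= 1
        return True


wallet = CreditWallet()
for _ in range(3):          # one question plus two follow-ups
    wallet.send_message()
print(wallet.credits)       # 7 credits left
```

So a free account buys roughly ten exchanges, not ten conversations, which is worth knowing before you start a long back-and-forth.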
The only agent that requires a full subscription (about $200 per month) is Kosmos itself, used for serious work like "Mechanism of entorhinal cortex vulnerability in aging" or "Nucleotide metabolism as the dominant pathway altered under hypothermic conditions in brain." And this agent can really do science.
I feel a little sorry for those scientists who feverishly deny AI and say they'll never use it because it's "always wrong," "it steals their job," or whatever other reason. Tools like this accelerate a human scientist's work to an unbelievable degree. Go grab it and make new breakthroughs with it. Seriously, refusing AI is like refusing a microscope and saying you're fine looking at molecules through your grandma's magnifying glass → read their Kosmos tech report here and play with it here.
Second, Fei-Fei Li and her new blog. She started it 10 hours ago, and I believe it will be one of the most interesting reads on Spatial Intelligence. We covered the topic once here: "What is Spatial Intelligence" (free to read).
She writes that spatial intelligence depends on world models built on three core principles: they must be generative (able to create coherent, physics-consistent simulated worlds), multimodal (able to understand and respond through any combination of inputs like images, text, or actions), and interactive (able to predict how the world changes in response to actions or goals). Together, these define the foundation for truly spatially intelligent AI. She also says: "The scope of this challenge exceeds anything AI has faced before."
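Her three principles can be read as an interface contract. Here is a minimal, purely illustrative sketch; none of these class or method names come from her post, they just map one method to each principle.

```python
from abc import ABC, abstractmethod
from typing import Any

class WorldModel(ABC):
    """Hypothetical interface for the three principles of a world model.
    Names are illustrative assumptions, not from Fei-Fei Li's blog."""

    @abstractmethod
    def generate(self, prompt: Any) -> Any:
        """Generative: create a coherent, physics-consistent simulated world."""

    @abstractmethod
    def observe(self, *inputs: Any) -> None:
        """Multimodal: ingest any mix of images, text, or actions."""

    @abstractmethod
    def predict(self, action: Any) -> Any:
        """Interactive: predict how the world changes given an action or goal."""
```

The point of the sketch is that the three principles are not independent features but one loop: generate a world, observe in any modality, predict the consequences of acting in it.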
I'm very happy that thinkers and builders like Fei-Fei Li are sharing their insights with the public, helping more people understand where the next frontier of AI is truly heading → read her blog here
Note: There are a few new papers on Spatial Intelligence in the Research Papers section today. Check them out.
Attention Span: Sam Altman published a long clarification about OpenAI's rumored government backing, denying bailout plans while outlining $1.4 trillion in infrastructure spending through 2033. That's a big number, considering that they are going to make only $30 billion this year. What does he actually mean? Let's discuss. Watch it here →
Sam Altman: "We do not want government guarantees for our datacenters." What is he actually saying?
Curated Collections: Let's talk a bit more about precision
What are we reading/watching:
News from The Usual Suspects ©
Deepnote with a new notebook
After seven years in stealthy development, Deepnote has gone open-source. The team's new .deepnote format replaces the aging .ipynb with something future-proof: human-readable YAML, AI-native design, multi-language support, and a project-based structure fit for real-world teams and AI agents alike. As they say: "It's a big step toward making data tools fully open and community-driven."

Memories.ai brings memory to your device
Memories.ai has unveiled LVMM 2.0, a model designed to give machines something humans take for granted: persistent visual memory. In partnership with Qualcomm, this next-gen tech will run on-device across phones, cameras, and wearables by 2026, bringing sub-second video search, privacy-preserving inference, and real-time visual recall to the edge. Goodbye scrubbing through footage; hello semantic memory for machines. We are going to interview their CEO in December, so send us your questions.

Webflow uncovers the tug-of-war behind the homepage
Webflow's State of the Website 2026 reveals a digital battleground: marketing and engineering are at odds over strategy, governance, and control. 92% see cross-functional friction, 97% feel the weight of technical debt, and developers are increasingly frustrated, some to the point of quitting. Meanwhile, AI is knocking, but half the teams aren't sure it's safe to let it in.

Google shoots for the stars, literally
Project Suncatcher is Google Research's latest moonshot: solar-powered satellite constellations armed with TPUs, linked by high-speed optical comms, and designed to scale AI compute in space. Think cloud infrastructure, just in orbit. With bench-tested bandwidths of 1.6 Tbps and radiation-tolerant TPUs, it's a wild yet plausible vision of off-planet machine learning. If this flies, "cloud computing" may soon be a literal term.

Google Cloud sharpens its silicon for the AI era
Google Cloud has officially launched Ironwood TPUs, its most powerful and efficient custom chips yet, delivering 10x the peak performance of v5p and redefining inference at scale. Alongside, new Arm-based Axion VMs promise serious price-performance gains for general-purpose workloads. Welcome to the AI Hypercomputer age: optimized from chip to cluster, and built to scale like never before.

OpenAI warns: we're not ready
In its latest dispatch, OpenAI reflects on how AI has quietly passed historic milestones, outthinking humans in elite domains, while most still see it as a fancy chatbot. Their concern is that the tech is racing ahead of public understanding and governance. OpenAI urges new safety norms, government coordination, and a full-blown AI resilience ecosystem before superintelligence arrives. Peculiar people, this OpenAI.
This emoji 🔓 means open-source.
Interesting Datasets for Robotics:
PHUMA: Physically-Grounded Humanoid Locomotion Dataset – curate large-scale video-based humanoid motion data with physics-constrained retargeting that enforces joint limits and contact fidelity, producing physically reliable motions for robust imitation learning → read the paper
TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System – develop a mocap-free, VR-based teleoperation system for fast, low-cost humanoid data collection and whole-body visuomotor control, enabling dexterous manipulation and dynamic locomotion → read the paper
VSI-590K: Spatially-Focused Instruction-Tuning Dataset – build a large-scale dataset centered on spatial reasoning, aggregating diverse sources with fine-grained spatial annotations to improve models' spatial understanding → read the paper
Models to pay attention to:
🌟🔓 Kimi K2 Thinking – the model that blew everyone's mind. We will cover it on Wednesday, including researchers' opinions about it. In short, it is an open-source long-horizon reasoning agent supporting hundreds of sequential tool calls, optimized for INT4 inference, achieving frontier performance on reasoning, coding, and web-agent benchmarks through deep tool integration → read the paper
🌟🔓 NVIDIA Nemotron Nano V2 VL – advance document and video understanding with a hybrid Mamba-Transformer vision-language architecture using token reduction for efficient long-context reasoning; released in multiple precision formats with open datasets and recipes → read the paper
iFlyBot-VLA – integrate language, vision, and action through dual-level supervision that aligns high-level latent actions with low-level control tokens, producing a unified VLA model capable of precise 3D reasoning and real-world manipulation → read the paper
The freshest research papers, categorized for your convenience
We organize research papers by goal-oriented or functional categories to make it easier to explore related developments and compare approaches. As always, papers we particularly recommend are marked with 🌟.
Highlight:
🌟 Cambrian-S: Towards spatial supersensing in video – Researchers from New York University and Stanford University introduce Cambrian-S, a family of spatially grounded multimodal models trained on a 590K-sample video dataset (VSI-590K) for spatial reasoning. They propose VSI-SUPER, a benchmark with two tasks: spatial recall (VSR) and spatial counting (VSC), using up to 240-minute videos. Cambrian-S achieves 30%+ gains on VSI-Bench but fails on VSI-SUPER, revealing scale limitations. A predictive sensing prototype using latent frame prediction and surprise-based memory outperforms Gemini-2.5-Flash on VSI-SUPER tasks, showing prediction aids long-horizon spatial understanding → read the paper
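The surprise-based memory idea is simple enough to sketch: keep a frame only when it deviates sharply from what the model predicted, so long videos compress into the few moments that matter. Everything below (1-D "frames," a naive last-frame predictor, a fixed threshold) is my illustrative assumption; the real Cambrian-S prototype predicts in latent space.

```python
def surprise_filter(frames, threshold=1.0):
    """Toy surprise-based memory: store a frame only when it defies
    a naive last-frame prediction. Illustrates the selection principle,
    not the paper's actual latent-frame-prediction design."""
    memory = []
    prediction = frames[0]              # naive predictor: expect the last frame again
    for t, frame in enumerate(frames):
        surprise = abs(frame - prediction)
        if t == 0 or surprise > threshold:
            memory.append((t, frame))   # keep only surprising frames
        prediction = frame              # update the prediction
    return memory

stream = [0.0, 0.1, 0.05, 5.0, 5.1, 5.05, -2.0]
print(surprise_filter(stream))  # [(0, 0.0), (3, 5.0), (6, -2.0)]
```

On this toy stream, seven frames collapse to three memories: the start, the jump to 5.0, and the drop to -2.0, which is exactly the kind of compression you need to reason over 240-minute videos.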
Continual learning paradigms
Agent training, simulation & experience synthesis
🌟 Scaling Agent Learning via Experience Synthesis (by Meta) – distill environment dynamics into a reasoning-based experience model to generate scalable synthetic rollouts, warm-start RL, and match PPO/GRPO with far fewer real interactions → read the paper
🌟🔓 Magentic Marketplace: An open-source simulation environment for studying agentic markets (by Microsoft) – simulate two-sided markets of assistant and service agents to evaluate welfare, bias, prompt-injection risks, and search design under realistic competition → read the paper
Simulating Environments with Reasoning Models for Agent Training – synthesize SFT trajectories (Simia-SFT) and RL feedback (Simia-RL) with LLM-simulated environments to train agents without real APIs, surpassing strong baselines on τ²-Bench → read the paper
Spatial cognition, multimodal reasoning & grounding
🌟 Visual Spatial Tuning – construct VST-P (4.1M) and VST-R (135k) and train VLMs with SFT→RL to enhance spatial perception and reasoning without hurting general abilities → read the paper
🌟 Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings (by University of Maryland, Dolby Laboratories, Hilabs, Capital One) – integrate average-pooled visual features into textual embeddings to rebalance modalities, improving grounding and reducing hallucinations → read the paper
Actial: Activate Spatial Reasoning Ability of Multimodal LLMs – build Viewpoint-100K and train via SFT + GRPO to enforce cross-view consistency, improving 3D/spatial reasoning in- and out-of-domain → read the paper
Tabular ICL & retrieval systems
🌟 Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning (by Lexsi Labs) – combine multi-scale processing, block-sparse attention, and Perceiver-style memory to capture hierarchical feature interactions and scale to wide tables → read the paper
🔓 Trove: A Flexible Toolkit for Dense Retrieval – provide low-code, on-the-fly dataset filtering/combination, unified evaluation and hard-negative mining, and multi-node scaling for customizable retrieval research → read the paper
Efficient reasoning training & decoding behavior
Theory & foundations
🌟 Diffusion Language Models are Super Data Learners (by National University of Singapore) – show DLMs outperform AR models under data scarcity via any-order modeling, denoising compute, and MC augmentation, achieving strong accuracy with repeated data → read the paper
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms – prove existence conditions for strong lottery tickets in MHA and extend theory to transformers without normalization, with exponential error decay empirics → read the paper
AI Scientists
That's all for today. Thank you for reading! Please send this newsletter to colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.
How did you like it?