This Week in Turing Post:
Wednesday / AI 101 series: Kimi K2 Thinking and why all the hype
Friday / Interview: Great conversation with Anneka Gupta, CPO at Rubrik

👉 Free Expert Guide to Multi-Agent Systems
Multi-agent systems can handle more complex tasks, but are they worth the orchestration overhead, and how can they be made reliable in production?

Read Galileo’s new 165-page guide for a comprehensive exploration of multi-agent systems – a true treasure trove for learning.

Our news digest is always free. Click on the partner’s link to support us or Upgrade to receive our deep dives in full, directly into your inbox →

Editorial: Today I have two things to share with you.
First, Kosmos AI. This week, many people in research circles are talking about it – a new “AI scientist” built by Edison Scientific in collaboration with many great research institutions and supported by Eric Schmidt (our reader and supporter as well).
Edison Scientific, a spinout from FutureHouse, designed Kosmos as a system for autonomous discovery. In its first phase, Kosmos produced seven findings across neuroscience, materials science, and clinical genetics. Three reproduced unpublished human work; four are new contributions now being validated with academic partners. About 79 percent of its results replicate – roughly the same rate as early-stage human research. Each discovery can be traced back to the exact code and papers that informed it, giving a rare level of auditability to AI-generated science. That alone is remarkable.
But what I found most interesting is that anyone can use it, not just scientists. And here’s an example that probably resonates with everyone – medical advice. It’s a bit personal, but I’m doing a long fast right now (three days coffee + water, then eight days water only). It’s not my first time, and I often use ChatGPT to guide me, but its answers can change a lot from one day to another. That’s fine when you know the subject and can read between the lines, but when it comes to your health, you need something you can actually trust – something that shows what the entire research field says, with a list of sources you can verify and explore if needed.
The platform runs on a credit system: free users get 10 credits, and each conversation with a specialized agent costs one – follow-ups included.

The only agent that requires a full subscription – about $200 per month – is Kosmos itself, used for serious work like “Mechanism of entorhinal cortex vulnerability in aging” or “Nucleotide metabolism as the dominant pathway altered under hypothermic conditions in brain.” This agent can really do science.
I feel a little bit sorry for those scientists who feverishly deny AI and say they’ll never use it because it’s “always wrong,” “it steals their job,” or whatever other reason. Tools like this accelerate a human scientist’s work to an unbelievable degree. Go, grab it, and make new breakthroughs with it. Seriously, refusing AI is like refusing a microscope and saying you’re fine looking at molecules through your grandma’s magnifying glass →read their Kosmos tech report here and play with it here. |
Second, Fei-Fei Li and her new blog. She started it 10 hours ago, and I believe it will be one of the most interesting reads about Spatial Intelligence. We covered the topic once here: “What is Spatial Intelligence” (free to read).

She writes that spatial intelligence depends on world models built on three core principles – they must be generative (able to create coherent, physics-consistent simulated worlds), multimodal (able to understand and respond through any combination of inputs like images, text, or actions), and interactive (able to predict how the world changes in response to actions or goals). Together, these define the foundation for truly spatially intelligent AI. She also says: “The scope of this challenge exceeds anything AI has faced before.”
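To make those three principles concrete, here is a minimal Python sketch of what a world-model interface along these lines could look like. Everything below – class names, method names, signatures – is our own illustration of the idea, not an API from Li’s blog or any existing system.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class WorldState:
    """A latent snapshot of a simulated world (illustrative placeholder)."""
    latent: Any


class WorldModel:
    """Hypothetical interface mirroring the three principles above."""

    def generate(self, prompt: str) -> WorldState:
        """Generative: create a coherent, physics-consistent simulated world."""
        raise NotImplementedError

    def perceive(self, state: WorldState, inputs: Sequence[Any]) -> str:
        """Multimodal: respond to any mix of images, text, or actions."""
        raise NotImplementedError

    def step(self, state: WorldState, action: Any) -> WorldState:
        """Interactive: predict how the world changes in response to an action."""
        raise NotImplementedError
```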
I’m very happy that thinkers and builders like Fei-Fei Li are sharing their insights with the public, helping more people understand where the next frontier of AI is truly heading →read her blog here |
Note: There are a few new papers on Spatial Intelligence in the Research Papers section today. Check them out.

Attention Span: Sam Altman published a long clarification about OpenAI’s rumored government backing – denying bailout plans while outlining $1.4 trillion in infrastructure spending through 2033. That’s a big number, considering they are going to make only $30 billion this year. What does he actually mean? Let’s discuss. Watch it here →

Sam Altman: “We do not want government guarantees for our datacenters.” What is he actually saying?

Curated Collections – Let’s talk a bit more about precision

What are we reading/watching:

News from The Usual Suspects © |
Deepnote with a new notebook
After seven years in stealthy development, Deepnote has gone open-source. The team’s new .deepnote format replaces the aging .ipynb with something future-proof: human-readable YAML, AI-native design, multi-language support, and a project-based structure fit for real-world teams and AI agents alike. As they say: “It’s a big step toward making data tools fully open and community-driven.”

Memories.ai brings memory to your device
Memories.ai has unveiled LVMM 2.0, a model designed to give machines something humans take for granted: persistent visual memory. In partnership with Qualcomm, this next-gen tech will run on-device across phones, cameras, and wearables by 2026 – bringing sub-second video search, privacy-preserving inference, and real-time visual recall to the edge. Goodbye scrubbing through footage; hello semantic memory for machines. We are going to interview their CEO in December – send us your questions.

Webflow uncovers the tug-of-war behind the homepage
Webflow’s State of the Website 2026 reveals a digital battleground: marketing and engineering are at odds over strategy, governance, and control. 92% see cross-functional friction, 97% feel the weight of technical debt, and developers are increasingly frustrated – some to the point of quitting. Meanwhile, AI is knocking, but half the teams aren’t sure it’s safe to let it in.

Google shoots for the stars – literally
Project Suncatcher is Google Research’s latest moonshot: solar-powered satellite constellations armed with TPUs, linked by high-speed optical comms, and designed to scale AI compute in space. Think cloud infrastructure – just in orbit. With bench-tested bandwidths of 1.6 Tbps and radiation-tolerant TPUs, it's a wild yet plausible vision of off-planet machine learning. If this flies, “cloud computing” may soon be a literal term.

Google Cloud sharpens its silicon for the AI era
Google Cloud has officially launched Ironwood TPUs – its most powerful and efficient custom chips yet – delivering 10x the peak performance of v5p and redefining inference at scale. Alongside, new Arm-based Axion VMs promise serious price-performance gains for general-purpose workloads. Welcome to the AI Hypercomputer age: optimized from chip to cluster, and built to scale like never before.

OpenAI warns: we’re not ready
In its latest dispatch, OpenAI reflects on how AI has quietly passed historic milestones – outthinking humans in elite domains – while most still see it as a fancy chatbot. Their concern is that tech is racing ahead of public understanding and governance. OpenAI urges new safety norms, government coordination, and a full-blown AI resilience ecosystem before superintelligence arrives. They are peculiar people, this OpenAI.

This emoji 🦋 means open-source. |
Interesting Datasets for Robotics: |
PHUMA: Physically-Grounded Humanoid Locomotion Dataset – curate large-scale video-based humanoid motion data with physics-constrained retargeting that enforces joint limits and contact fidelity, producing physically reliable motions for robust imitation learning (a toy sketch of the joint-limit constraint follows this list) →read the paper
TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System – develop a mocap-free, VR-based teleoperation system for fast, low-cost humanoid data collection and whole-body visuomotor control, enabling dexterous manipulation and dynamic locomotion →read the paper
VSI-590K: Spatially-Focused Instruction-Tuning Dataset – build a large-scale dataset centered on spatial reasoning, aggregating diverse sources with fine-grained spatial annotations to improve models’ spatial understanding →read the paper
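On the PHUMA item: “physics-constrained retargeting” at its simplest means never emitting a pose the robot cannot physically hold. Here is a toy NumPy sketch of just the joint-limit part – all names are our own, and the actual pipeline also enforces contact fidelity and is far more involved.

```python
import numpy as np


def clamp_to_joint_limits(pose: np.ndarray, lo: np.ndarray, hi: np.ndarray) -> np.ndarray:
    """Project a retargeted pose onto the robot's feasible joint range.

    pose, lo, hi: (J,) arrays of joint angles and per-joint limits (radians).
    Toy illustration only; PHUMA's retargeting also handles contacts.
    """
    return np.clip(pose, lo, hi)


# Example: a knee angle beyond its limit gets pulled back into range.
pose = np.array([0.1, 2.9, -0.4])
lo, hi = np.array([-1.0, -2.5, -2.5]), np.array([1.0, 2.5, 2.5])
print(clamp_to_joint_limits(pose, lo, hi))  # -> [ 0.1  2.5 -0.4]
```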
Models to pay attention to: |
🌟🌟 🦋 Kimi K2 Thinking – the model that blew everyone’s mind. We will cover it on Wednesday, including researchers’ opinions about it. In short, it is an open-source long-horizon reasoning agent supporting hundreds of sequential tool calls, optimized for INT4 inference, achieving frontier performance on reasoning, coding, and web-agent benchmarks through deep tool integration (a toy sketch of such a tool loop follows this list) →read the paper
🌟🦋 NVIDIA Nemotron Nano V2 VL – advance document and video understanding with a hybrid Mamba-Transformer vision-language architecture using token reduction for efficient long-context reasoning; released in multiple precision formats with open datasets and recipes →read the paper
iFlyBot-VLA – integrate language, vision, and action through dual-level supervision that aligns high-level latent actions with low-level control tokens, producing a unified VLA model capable of precise 3D reasoning and real-world manipulation →read the paper
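On the K2 item: a long-horizon tool-calling agent ultimately reduces to a loop that keeps executing model-requested tool calls until the model stops asking. A minimal Python sketch of that control flow follows; `model_step` and the tool registry are stand-ins of our own, not Kimi’s actual API.

```python
def run_agent(task: str, tools: dict, model_step, max_calls: int = 300) -> str:
    """Minimal long-horizon tool loop (illustrative, not Kimi's API).

    model_step: stand-in for one model inference; takes the transcript and
    returns {"content": ..., "tool": name_or_None, "args": {...}}.
    """
    transcript = [{"role": "user", "content": task}]
    for _ in range(max_calls):  # hundreds of sequential calls, but bounded
        reply = model_step(transcript)
        transcript.append({"role": "assistant", "content": reply["content"]})
        if reply.get("tool") is None:  # no tool requested: final answer
            return reply["content"]
        result = tools[reply["tool"]](**reply.get("args", {}))
        transcript.append({"role": "tool", "content": str(result)})
    return "stopped: tool-call budget exhausted"
```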
The freshest research papers, categorized for your convenience |
We organize research papers by goal-oriented or functional categories to make it easier to explore related developments and compare approaches. As always, papers we particularly recommend are marked with 🌟.
Highlight:
🌟🌟 Cambrian-S: Towards Spatial Supersensing in Video – researchers from New York University and Stanford University introduce Cambrian-S, a family of spatially grounded multimodal models trained on a 590K-sample video dataset (VSI-590K) for spatial reasoning. They propose VSI-SUPER, a benchmark with two tasks: spatial recall (VSR) and spatial counting (VSC), using up to 240-minute videos. Cambrian-S achieves 30%+ gains on VSI-Bench but fails on VSI-SUPER, revealing scale limitations. A predictive sensing prototype using latent frame prediction and surprise-based memory outperforms Gemini-2.5-Flash on VSI-SUPER tasks, showing prediction aids long-horizon spatial understanding →read the paper
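The prototype’s core trick – keep a frame in memory only when the model failed to predict it – is easy to sketch. Below is a toy NumPy version under our own assumptions (cosine distance as the surprise signal, a fixed threshold); the paper’s actual latent predictor and memory design are more involved.

```python
import numpy as np


def surprise_filter(latents: np.ndarray, predict, threshold: float = 0.3) -> list:
    """Return indices of 'surprising' frames worth remembering.

    latents: (T, D) array of per-frame latent features.
    predict: stand-in for the latent-frame predictor; maps the previous
             frame's latent to a guess for the next one.
    """
    memory = [0]  # always keep the first frame
    for t in range(1, len(latents)):
        guess = predict(latents[t - 1])
        cos = latents[t] @ guess / (
            np.linalg.norm(latents[t]) * np.linalg.norm(guess) + 1e-8
        )
        if 1.0 - cos > threshold:  # prediction was wrong: frame is surprising
            memory.append(t)
    return memory
```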
Continual learning paradigms |
Agent training, simulation & experience synthesis |
🌟 Scaling Agent Learning via Experience Synthesis (by Meta) – distill environment dynamics into a reasoning-based experience model to generate scalable synthetic rollouts, warm-start RL, and match PPO/GRPO with far fewer real interactions →read the paper
🌟🦋 Magentic Marketplace: An open-source simulation environment for studying agentic markets (by Microsoft) – simulate two-sided markets of assistant and service agents to evaluate welfare, bias, prompt-injection risks, and search design under realistic competition →read the paper
Simulating Environments with Reasoning Models for Agent Training – synthesize SFT trajectories (Simia-SFT) and RL feedback (Simia-RL) with LLM-simulated environments to train agents without real APIs, surpassing strong baselines on τ²-Bench →read the paper

Spatial cognition, multimodal reasoning & grounding |
🌟 Visual Spatial Tuning – construct VST-P (4.1M) and VST-R (135k) and train VLMs with SFT→RL to enhance spatial perception→reasoning without hurting general abilities →read the paper
🌟 Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings (by University of Maryland, Dolby Laboratories, Hilabs, Capital One) – integrate average-pooled visual features into textual embeddings to rebalance modalities, improving grounding and reducing hallucinations (a toy sketch follows this list) →read the paper
Actial: Activate Spatial Reasoning Ability of Multimodal LLMs – build Viewpoint-100K and train via SFT + GRPO to enforce cross-view consistency, improving 3D/spatial reasoning in- and out-of-domain →read the paper
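The Maryland/Dolby recipe reduces to one operation: average-pool the visual tokens and fold the result into every textual embedding so text no longer dominates. A toy PyTorch sketch follows; the linear projection and the 0.5 mixing weight are our own illustrative choices, not the paper’s exact formulation.

```python
import torch
import torch.nn as nn


class VisualTextRebalancer(nn.Module):
    """Toy sketch: fold pooled visual features into textual embeddings."""

    def __init__(self, vis_dim: int, txt_dim: int):
        super().__init__()
        self.proj = nn.Linear(vis_dim, txt_dim)  # map visual -> text space

    def forward(self, vis_tokens: torch.Tensor, txt_emb: torch.Tensor) -> torch.Tensor:
        # vis_tokens: (B, N_vis, vis_dim); txt_emb: (B, N_txt, txt_dim)
        pooled = vis_tokens.mean(dim=1, keepdim=True)  # average-pool visual tokens
        return txt_emb + 0.5 * self.proj(pooled)       # broadcast over all text tokens
```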
Tabular ICL & retrieval systems |
🌟 Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning (by Lexsi Labs) – combine multi-scale processing, block-sparse attention, and Perceiver-style memory to capture hierarchical feature interactions and scale to wide tables →read the paper
🦋 Trove: A Flexible Toolkit for Dense Retrieval – provide low-code, on-the-fly dataset filtering/combination, unified evaluation and hard-negative mining, and multi-node scaling for customizable retrieval research →read the paper

Efficient reasoning training & decoding behavior |
Theory & foundations |
🌟 Diffusion Language Models are Super Data Learners (by National University of Singapore) – show DLMs outperform AR models under data scarcity via any-order modeling, denoising compute, and MC augmentation, achieving strong accuracy with repeated data →read the paper
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms – prove existence conditions for strong lottery tickets in MHA and extend theory to transformers without normalization, with exponential error decay empirics →read the paper

AI Scientists |
That’s all for today. Thank you for reading! Please send this newsletter to colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.
How did you like it?