This month, we're spotlighting the most essential AI topics of 2025, the ones that keep showing up no matter where the tech goes. We're organizing our AI 101 series recaps into three parts: 1) methods and techniques, 2) models, and 3) core concepts.
Today we're refreshing 9 techniques from AI 101 that shaped the way we think about AI from January to June 2025. There's a lot to explore. Happy learning!
|
1. What is HtmlRAG, Multimodal RAG and Agentic RAG? |
These three RAG methods do what the original RAG can't: 1) HtmlRAG works directly with the HTML version of retrieved text; 2) Multimodal RAG can retrieve image information; and 3) Agentic RAG incorporates agentic capabilities into the RAG pipeline. Here's everything about them and why they're special.
 | What is HtmlRAG, Multimodal RAG and Agentic RAG? | We explore in detail three RAG methods that address the limitations of the original RAG and meet the upcoming trends of the new year | www.turingpost.com/p/html-multimodal-agentic-rag
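To make the agentic flavor concrete, here's a minimal sketch of an agentic RAG loop under our own assumptions: the agent checks whether the retrieved context is sufficient and rewrites the query if it isn't. `retrieve` and `llm` are toy stand-ins, not the pipeline from the article.

```python
def retrieve(query: str, k: int = 3) -> list[str]:
    # Toy stand-in: replace with a real vector-store or web lookup.
    return [f"[chunk {i} about: {query}]" for i in range(k)]

def llm(prompt: str) -> str:
    # Toy stand-in: replace with a real chat/completion model call.
    return "ANSWER"

def agentic_rag(question: str, max_rounds: int = 3) -> str:
    context: list[str] = []
    query = question
    for _ in range(max_rounds):
        context += retrieve(query)
        verdict = llm(
            "Context:\n" + "\n".join(context)
            + f"\n\nIs this enough to answer '{question}'?"
            + " Reply ANSWER, or reply with a refined search query."
        )
        if verdict.strip().upper().startswith("ANSWER"):
            break
        query = verdict  # the agent rewrites the query and retrieves again
    return llm("Context:\n" + "\n".join(context) + f"\n\nAnswer: {question}")

print(agentic_rag("Why is HTML structure useful for RAG?"))
```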
|
|
|
2. Everything You Need to Know about Knowledge Distillation |
Proposed a decade ago, knowledge distillation is still a cornerstone technique for transferring knowledge from a larger model (the teacher) to a smaller one (the student). With this training style, smaller models inherit their larger counterparts' capabilities. Here's a closer look at knowledge distillation, covering the main aspects you need to know.
 | AI101: Everything You Need to Know about Knowledge Distillation | This is one of the hottest topics thanks to DeepSeek. Learn with us: the core idea, its types, scaling laws, real-world cases and useful resources to dive deeper | www.turingpost.com/p/kd |
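The core recipe has barely changed since Hinton et al. (2015): train the student on a blend of the teacher's temperature-softened logits and the hard labels. A minimal PyTorch sketch of that loss:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.5):
    """Blend of soft (teacher) and hard (ground-truth) targets."""
    # Soft targets: KL between temperature-scaled distributions.
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

A higher temperature T exposes more of the teacher's "dark knowledge" (the relative probabilities of wrong classes); alpha balances imitation against fitting the labels.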
|
|
|
3. The Keys to Prompt Optimization |
Practical insights on more effective prompting from Isabel González, uncovering the four main pillars of query optimization: expansion, decomposition, disambiguation, and abstraction.
 | Topic 25: The Keys to Prompt Optimization | Practical Insights for Large Language Models | www.turingpost.com/p/topic25 |
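As a quick illustration of the four pillars, here are hypothetical prompt templates (the wording is ours, not from the article) that turn each pillar into a one-call transform:

```python
# Hypothetical templates for the four pillars of query optimization.
PILLARS = {
    "expansion":      "Add related terms and synonyms to this query: {q}",
    "decomposition":  "Break this question into 2-4 simpler sub-questions: {q}",
    "disambiguation": "List possible readings of this query and pick the most likely: {q}",
    "abstraction":    "Restate this question at a higher, more general level: {q}",
}

def optimize_query(q: str, pillar: str, llm) -> str:
    """One optimization pass; `llm` is any text-in/text-out callable."""
    return llm(PILLARS[pillar].format(q=q))

# e.g. optimize_query("Who won in 2019?", "disambiguation", llm)
```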
|
|
|
4. What is GRPO and Flow-GRPO? |
DeepSeek's Group Relative Policy Optimization (GRPO) is a twist on traditional reinforcement learning methods like PPO (Proximal Policy Optimization). It skips the critic model in the workflow, letting a model learn from its own outputs. This makes it especially efficient for reasoning models that need to solve hard math and coding tasks and perform long Chain-of-Thought (CoT) reasoning. We also dive into Flow-GRPO, the adaptation of GRPO for flow models.
 | Topic 36: What is GRPO and Flow-GRPO? | we explore the clever algorithm behind DeepSeek-R1's breakthrough and how its latest adaptation, Flow-GRPO, makes RL real for flow models | www.turingpost.com/p/gpro
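The group-relative part fits in a few lines. Instead of a learned critic estimating a baseline, GRPO samples a group of responses per prompt and normalizes each reward against that group's own statistics. Here's a sketch of the advantage step (not the full clipped policy update):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """rewards: (num_prompts, group_size), one scalar reward per sampled response."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)  # group-relative advantage

# One prompt, 4 sampled answers, only the 3rd was correct:
print(grpo_advantages(torch.tensor([[0.0, 0.0, 1.0, 0.0]])))
# ≈ tensor([[-0.5, -0.5, 1.5, -0.5]])
```

These advantages then weight a PPO-style clipped objective, with no value network to train or store.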
|
|
|
5. What are Chain-of-Agents and Chain-of-RAG? |
Chain-of-… methods and agents are defining trends of 2025, and RAG is always a hot topic. This episode highlights two fascinating advances: Google's Chain-of-Agents (CoA) uses a structured multi-agent chain for long-context tasks, while Microsoft's Chain-of-Retrieval-Augmented Generation (CoRAG) enables strong multi-hop reasoning via iterative retrieval. A must-read on current trends.
 | Topic 27: What are Chain-of-Agents and Chain-of-RAG? | We explore Google's and Microsoft's advancements that implement "chain" approaches for long context and multi-hop reasoning | www.turingpost.com/p/coa-and-co-rag/ |
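Here's a minimal sketch of the CoA pattern as we read it: worker agents walk the long input chunk by chunk, each passing running notes to the next, and a manager agent answers from the final notes. `llm` is a hypothetical text-in/text-out call, and the prompts are our own.

```python
def chain_of_agents(document: str, question: str, llm, chunk_size: int = 4000) -> str:
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    notes = ""
    for chunk in chunks:  # worker chain: sequential, each sees bounded context
        notes = llm(
            f"Previous notes: {notes}\nNew text: {chunk}\n"
            f"Update the notes with anything relevant to: {question}"
        )
    # Manager agent: answers only from the accumulated notes.
    return llm(f"Notes: {notes}\nAnswer the question: {question}")
```

CoRAG has a similar iterative shape on the retrieval side: retrieve, reason, reformulate, and retrieve again until the multi-hop chain is complete.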
|
|
Upgrade to receive our deep dives in full, directly in your inbox. Join Premium members from top companies like Hugging Face, Microsoft, Google, a16z, and Datadog, plus AI labs such as Ai2, MIT, Berkeley, .gov, and thousands of others to really understand what's going on with AI.
|
|
|
6. How to Reduce Memory Use in Reasoning Models |
With the rise of reasoning models that outline their thought process step by step during inference, the challenge of managing memory use has become even more noticeable. Here we discuss two approaches to mitigate it (a toy sketch of the MLA idea follows the link below):
- LightThinker, which teaches models to summarize their own "thoughts" and solve tasks from these short, meaningful summaries.
- DeepSeek's Multi-head Latent Attention (MLA) mechanism, which compresses the KV cache into a much smaller form.

Plus, we propose an idea for combining them.
 | Topic 31: How to Reduce Memory Use in Reasoning Models | we explore how combining LightThinker and Multi-Head Latent Attention cuts memory and boosts performance | www.turingpost.com/p/mlalightthinker |
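Here's a toy version of MLA's compression skeleton. Real MLA adds more structure (e.g. decoupled RoPE keys), so treat the shapes and names as our assumptions: the point is that the cache holds one small latent per token and expands it to keys and values at attention time.

```python
import torch
import torch.nn as nn

d_model, d_latent, n_heads, d_head = 1024, 128, 8, 64

down = nn.Linear(d_model, d_latent, bias=False)           # shared compressor
up_k = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to keys
up_v = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to values

x = torch.randn(1, 10, d_model)   # 10 new tokens
kv_cache = down(x)                # cache only (1, 10, 128) per layer
k = up_k(kv_cache)                # reconstructed on the fly at attention time
v = up_v(kv_cache)

# Per token: 128 cached floats vs. 2 * 8 * 64 = 1024 for a plain
# per-head KV cache -- an 8x reduction in this toy configuration.
```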
|
|
|
7. Slim Attention, KArAt, and XAttention Explained: What's Really Changing in Transformers?
These three attention mechanisms take the models we use daily to the next level (a toy demo of the Slim Attention trick follows the link below):
- Slim Attention processes long context faster and cuts memory use by up to 32 times.
- XAttention improves the effectiveness of sparse attention on long sequences, including text and video.
- Kolmogorov-Arnold Attention (KArAt and Fourier-KArAt) is a completely different approach that focuses on making attention learnable and adaptable.

In this episode we discuss their specific features, strengths, weaknesses, and future potential.
 | Topic 33: Slim Attention, KArAt, and XAttention Explained: What's Really Changing in Transformers? | We explore three advanced attention mechanisms that improve how models handle long sequences, cut memory use, and make attention learnable | www.turingpost.com/p/attentions
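For flavor, here's the Slim Attention trick as we understand it: since K = XW_k and V = XW_v, the values can be rebuilt as V = K(W_k^-1 W_v) whenever W_k is square and invertible, so the cache can store K alone. That alone halves KV memory; the larger headline savings combine this with further optimizations.

```python
import torch
torch.manual_seed(0)

d = 512
W_k = torch.randn(d, d, dtype=torch.float64)  # toy projections (float64
W_v = torch.randn(d, d, dtype=torch.float64)  # for a clean numerical check)
X = torch.randn(16, d, dtype=torch.float64)   # 16 cached tokens

K = X @ W_k                           # only K is stored in the cache
W_kv = torch.linalg.solve(W_k, W_v)   # precomputed once: W_k^-1 @ W_v
V_rebuilt = K @ W_kv                  # values recomputed on the fly

assert torch.allclose(V_rebuilt, X @ W_v, atol=1e-6)  # identical output
```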
|
|
|
8. What is MoE 2.0? Update Your Knowledge about Mixture-of-Experts
Mixture-of-Experts (MoE) is a classical approach, but it keeps evolving rapidly. Here we highlight the latest developments: Structural Mixture of Residual Experts (S'MoRE), Symbolic-MoE that works in pure language space, eMoE, MoEShard, Speculative-MoE, and MoE-Gen. It's a truly fresh angle on current MoE.
 | What is MoE 2.0? Update Your Knowledge about Mixture-of-Experts | A fresh angle on current Mixture-of-Experts. We discuss what new MoE techniques like S'MoRE, Symbolic-MoE, and others mean for next-generation AI | www.turingpost.com/p/moe2
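All of these variants build on the same skeleton: a learned router sends each token to its top-k experts and mixes their outputs by the routing weights. A minimal sketch (with a dense loop for clarity; real systems batch tokens by expert):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d: int = 256, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e      # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

Each technique above then tweaks a different part of this skeleton: the router, the expert structure, or how experts are placed and served at inference.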
|
|
|
9. The Human Touch: How HITL is Saving AI from Itself with Synthetic Data |
Explore how AI teams are using human-in-the-loop (HITL) systems to make synthetic data useful and safe. We break down how humans guide and validate the process, with real-world examples of how this is being implemented today.
 | Topic 44: The Human Touch: How HITL is Saving AI from Itself with Synthetic Data | we explore how human-in-the-loop systems are keeping synthetic data grounded, useful, and safe in the age of AI self-training | www.turingpost.com/p/hit |
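As a sketch of how such a gate might look in practice (all names and thresholds here are illustrative, not from the article): automatic checks pass the confident cases, and uncertain samples go to a human review queue instead of silently entering the training set.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    text: str
    auto_score: float  # e.g. judge-model confidence in [0, 1]

def triage(samples: list[Sample], accept: float = 0.9, reject: float = 0.3):
    accepted, review_queue = [], []
    for s in samples:
        if s.auto_score >= accept:
            accepted.append(s)        # safe to train on as-is
        elif s.auto_score > reject:
            review_queue.append(s)    # a human validates or edits
        # at or below `reject`: discarded outright
    return accepted, review_queue
```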
|
|
|