This month, we're spotlighting the most essential AI topics of 2025, the ones that keep showing up no matter where the tech goes. We're organizing our AI 101 series recaps into three parts: 1) methods and techniques, 2) models, and 3) core concepts.
Today we're refreshing 9 techniques from AI 101 that shaped the way we think about AI from January to June 2025. There's a lot to explore. Happy learning!
|
1. What is HtmlRAG, Multimodal RAG and Agentic RAG? |
These three RAG methods do what the original RAG can't: 1) HtmlRAG works directly with the HTML version of retrieved text; 2) Multimodal RAG can retrieve image information; and 3) Agentic RAG incorporates agentic capabilities into the RAG pipeline. Here's everything about them and why they're special.
 | What is HtmlRAG, Multimodal RAG and Agentic RAG? | We explore in detail three RAG methods that address the limitations of the original RAG and meet the upcoming trends of the new year | www.turingpost.com/p/html-multimodal-agentic-rag
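To make the agentic flavor concrete, here's a minimal sketch of an agentic RAG loop under our own assumptions: the agent checks whether the retrieved context is sufficient and rewrites the query if it isn't. `retrieve` and `llm` are toy stand-ins, not the pipeline from the article.

```python
def retrieve(query: str, k: int = 3) -> list[str]:
    # Toy stand-in: replace with a real vector-store or web lookup.
    return [f"[chunk {i} about: {query}]" for i in range(k)]

def llm(prompt: str) -> str:
    # Toy stand-in: replace with a real chat/completion model call.
    return "ANSWER"

def agentic_rag(question: str, max_rounds: int = 3) -> str:
    context: list[str] = []
    query = question
    for _ in range(max_rounds):
        context += retrieve(query)
        verdict = llm(
            "Context:\n" + "\n".join(context)
            + f"\n\nIs this enough to answer '{question}'?"
            + " Reply ANSWER, or reply with a refined search query."
        )
        if verdict.strip().upper().startswith("ANSWER"):
            break
        query = verdict  # the agent rewrites the query and retrieves again
    return llm("Context:\n" + "\n".join(context) + f"\n\nAnswer: {question}")

print(agentic_rag("Why is HTML structure useful for RAG?"))
```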
|
|
|
2. Everything You Need to Know about Knowledge Distillation |
Proposed a decade ago, knowledge distillation is still a cornerstone technique for transferring knowledge from a larger model (the teacher) to a smaller one (the student). With this training style, smaller models inherit their larger counterparts' capabilities. Here's a closer look at knowledge distillation, covering the main aspects you need to know.
 | AI101: Everything You Need to Know about Knowledge Distillation | This is one of the hottest topics thanks to DeepSeek. Learn with us: the core idea, its types, scaling laws, real-world cases and useful resources to dive deeper | www.turingpost.com/p/kd |
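The core recipe has barely changed since Hinton et al. (2015): train the student on a blend of the teacher's temperature-softened logits and the hard labels. A minimal PyTorch sketch of that loss:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.5):
    """Blend of soft (teacher) and hard (ground-truth) targets."""
    # Soft targets: KL between temperature-scaled distributions.
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

A higher temperature T exposes more of the teacher's "dark knowledge" (the relative probabilities of wrong classes); alpha balances imitation against fitting the labels.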
|
|
|
3. The Keys to Prompt Optimization |
Practical insights on more effective prompting from Isabel González, uncovering the four main pillars of query optimization: expansion, decomposition, disambiguation, and abstraction.
 | Topic 25: The Keys to Prompt Optimization | Practical Insights for Large Language Models | www.turingpost.com/p/topic25 |
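As a quick illustration of the four pillars, here are hypothetical prompt templates (the wording is ours, not from the article) that turn each pillar into a one-call transform:

```python
# Hypothetical templates for the four pillars of query optimization.
PILLARS = {
    "expansion":      "Add related terms and synonyms to this query: {q}",
    "decomposition":  "Break this question into 2-4 simpler sub-questions: {q}",
    "disambiguation": "List possible readings of this query and pick the most likely: {q}",
    "abstraction":    "Restate this question at a higher, more general level: {q}",
}

def optimize_query(q: str, pillar: str, llm) -> str:
    """One optimization pass; `llm` is any text-in/text-out callable."""
    return llm(PILLARS[pillar].format(q=q))

# e.g. optimize_query("Who won in 2019?", "disambiguation", llm)
```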
|
|
|
4. What is GRPO and Flow-GRPO? |
DeepSeek's Group Relative Policy Optimization (GRPO) is a twist on traditional reinforcement learning methods like PPO (Proximal Policy Optimization). It skips the critic model in the workflow, letting a model learn from its own outputs. This makes it especially efficient for reasoning models that need to solve hard math and coding tasks and perform long Chain-of-Thought (CoT) reasoning. We also dive into Flow-GRPO, the adaptation of GRPO for flow models.
 | Topic 36: What is GRPO and Flow-GRPO? | we explore the clever algorithm behind DeepSeek-R1's breakthrough and how its latest adaptation, Flow-GRPO, makes RL real for flow models | www.turingpost.com/p/gpro
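The group-relative part fits in a few lines. Instead of a learned critic estimating a baseline, GRPO samples a group of responses per prompt and normalizes each reward against that group's own statistics. Here's a sketch of the advantage step (not the full clipped policy update):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """rewards: (num_prompts, group_size), one scalar reward per sampled response."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)  # group-relative advantage

# One prompt, 4 sampled answers, only the 3rd was correct:
print(grpo_advantages(torch.tensor([[0.0, 0.0, 1.0, 0.0]])))
# ≈ tensor([[-0.5, -0.5, 1.5, -0.5]])
```

These advantages then weight a PPO-style clipped objective, with no value network to train or store.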
|
|
|
5. What are Chain-of-Agents and Chain-of-RAG? |
Chain-of-… methods and agents are defining trends of 2025, and RAG is always a hot topic. This episode highlights two fascinating advances: Google's Chain-of-Agents (CoA) uses a structured multi-agent chain for long-context tasks, while Microsoft's Chain-of-Retrieval-Augmented Generation (CoRAG) enables strong multi-hop reasoning via iterative retrieval. A must-read on current trends.
 | Topic 27: What are Chain-of-Agents and Chain-of-RAG? | We explore Google's and Microsoft's advancements that implement "chain" approaches for long context and multi-hop reasoning | www.turingpost.com/p/coa-and-co-rag/ |
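Here's a minimal sketch of the CoA pattern as we read it: worker agents walk the long input chunk by chunk, each passing running notes to the next, and a manager agent answers from the final notes. `llm` is a hypothetical text-in/text-out call, and the prompts are our own.

```python
def chain_of_agents(document: str, question: str, llm, chunk_size: int = 4000) -> str:
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    notes = ""
    for chunk in chunks:  # worker chain: sequential, each sees bounded context
        notes = llm(
            f"Previous notes: {notes}\nNew text: {chunk}\n"
            f"Update the notes with anything relevant to: {question}"
        )
    # Manager agent: answers only from the accumulated notes.
    return llm(f"Notes: {notes}\nAnswer the question: {question}")
```

CoRAG has a similar iterative shape on the retrieval side: retrieve, reason, reformulate, and retrieve again until the multi-hop chain is complete.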
|
|
Upgrade to receive our deep dives in full, directly in your inbox. Join Premium members from top companies like Hugging Face, Microsoft, Google, a16z, and Datadog, plus AI labs such as Ai2, MIT, Berkeley, .gov, and thousands of others to really understand what's going on with AI.
|
|
|
6. How to Reduce Memory Use in Reasoning Models |
With the rise of reasoning models that outline their thought process step by step during inference, the challenge of managing memory use has become even more noticeable. Here we discuss two approaches to mitigate it (a toy sketch of the MLA idea follows the link below):
- LightThinker, which teaches models to summarize their own "thoughts" and solve tasks from these short, meaningful summaries.
- DeepSeek's Multi-head Latent Attention (MLA) mechanism, which compresses the KV cache into a much smaller form.

Plus, we propose an idea for combining them.
 | Topic 31: How to Reduce Memory Use in Reasoning Models | we explore how combining LightThinker and Multi-Head Latent Attention cuts memory and boosts performance | www.turingpost.com/p/mlalightthinker |
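Here's a toy version of MLA's compression skeleton. Real MLA adds more structure (e.g. decoupled RoPE keys), so treat the shapes and names as our assumptions: the point is that the cache holds one small latent per token and expands it to keys and values at attention time.

```python
import torch
import torch.nn as nn

d_model, d_latent, n_heads, d_head = 1024, 128, 8, 64

down = nn.Linear(d_model, d_latent, bias=False)           # shared compressor
up_k = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to keys
up_v = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to values

x = torch.randn(1, 10, d_model)   # 10 new tokens
kv_cache = down(x)                # cache only (1, 10, 128) per layer
k = up_k(kv_cache)                # reconstructed on the fly at attention time
v = up_v(kv_cache)

# Per token: 128 cached floats vs. 2 * 8 * 64 = 1024 for a plain
# per-head KV cache -- an 8x reduction in this toy configuration.
```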
|
|
|
7. Slim Attention, KArAt, and XAttention Explained: What's Really Changing in Transformers?
These three attention mechanisms take the models we use daily to the next level (a toy demo of the Slim Attention trick follows the link below):
- Slim Attention processes long context faster and cuts memory use by up to 32 times.
- XAttention improves the effectiveness of sparse attention on long sequences, including text and video.
- Kolmogorov-Arnold Attention (KArAt and Fourier-KArAt) is a completely different approach that focuses on making attention learnable and adaptable.

In this episode we discuss their specific features, strengths, weaknesses, and future potential.
 | Topic 33: Slim Attention, KArAt, and XAttention Explained: What's Really Changing in Transformers? | We explore three advanced attention mechanisms that improve how models handle long sequences, cut memory use, and make attention learnable | www.turingpost.com/p/attentions
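For flavor, here's the Slim Attention trick as we understand it: since K = XW_k and V = XW_v, the values can be rebuilt as V = K(W_k^-1 W_v) whenever W_k is square and invertible, so the cache can store K alone. That alone halves KV memory; the larger headline savings combine this with further optimizations.

```python
import torch
torch.manual_seed(0)

d = 512
W_k = torch.randn(d, d, dtype=torch.float64)  # toy projections (float64
W_v = torch.randn(d, d, dtype=torch.float64)  # for a clean numerical check)
X = torch.randn(16, d, dtype=torch.float64)   # 16 cached tokens

K = X @ W_k                           # only K is stored in the cache
W_kv = torch.linalg.solve(W_k, W_v)   # precomputed once: W_k^-1 @ W_v
V_rebuilt = K @ W_kv                  # values recomputed on the fly

assert torch.allclose(V_rebuilt, X @ W_v, atol=1e-6)  # identical output
```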
|
|
|
8. What is MoE 2.0? Update Your Knowledge about Mixture-of-Experts
Mixture-of-Experts (MoE) is a classical approach, but it keeps evolving rapidly. Here we highlight the latest developments: Structural Mixture of Residual Experts (S'MoRE), Symbolic-MoE that works in pure language space, eMoE, MoEShard, Speculative-MoE, and MoE-Gen. It's a truly fresh angle on current MoE.
 | What is MoE 2.0? Update Your Knowledge about Mixture-of-Experts | A fresh angle on current Mixture-of-Experts. We discuss what new MoE techniques like S'MoRE, Symbolic-MoE, and others mean for next-generation AI | www.turingpost.com/p/moe2
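All of these variants build on the same skeleton: a learned router sends each token to its top-k experts and mixes their outputs by the routing weights. A minimal sketch (with a dense loop for clarity; real systems batch tokens by expert):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d: int = 256, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e      # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

Each technique above then tweaks a different part of this skeleton: the router, the expert structure, or how experts are placed and served at inference.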
|
|
|
9. The Human Touch: How HITL is Saving AI from Itself with Synthetic Data |
Explore how AI teams are using human-in-the-loop (HITL) systems to make synthetic data useful and safe. We break down how humans guide and validate the process, with real-world examples of how this is being implemented today.
 | Topic 44: The Human Touch: How HITL is Saving AI from Itself with Synthetic Data | we explore how human-in-the-loop systems are keeping synthetic data grounded, useful, and safe in the age of AI self-training | www.turingpost.com/p/hit |
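As a sketch of how such a gate might look in practice (all names and thresholds here are illustrative, not from the article): automatic checks pass the confident cases, and uncertain samples go to a human review queue instead of silently entering the training set.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    text: str
    auto_score: float  # e.g. judge-model confidence in [0, 1]

def triage(samples: list[Sample], accept: float = 0.9, reject: float = 0.3):
    accepted, review_queue = [], []
    for s in samples:
        if s.auto_score >= accept:
            accepted.append(s)        # safe to train on as-is
        elif s.auto_score > reject:
            review_queue.append(s)    # a human validates or edits
        # at or below `reject`: discarded outright
    return accepted, review_queue
```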
|
|
|