Table of Contents

  • Why I ran this test (short backstory)
  • Quick technical primer: FLOPS vs HBM
  • The prompt I used (a learning template)
  • What I tested (the tools)
  • Comparison notes and what each model did well
  • Key lessons from the experiment
  • Practical prompt improvements (copy / paste)
  • Tips to get the best results
  • Conclusion — which tool won?

Why I ran this test (short backstory)

There were headlines recently about U.S. policy on importing NVIDIA “H20” chips into China, and I wanted to understand the technical and policy tradeoffs quickly. Are these chips powerful? What’s the difference between FLOPS and HBM, and which matters for pretraining vs inference? Instead of digging through dozens of articles, I used the same prompt across several AI models to accelerate learning and compare outputs.

Quick technical primer: FLOPS vs HBM

Two short concepts you’ll hear a lot when talking about chips and model compute:

  • FLOPS — floating point operations per second. Important for pretraining: training huge models requires massive floating-point throughput.
  • HBM — high-bandwidth memory. More important for inference (runtime): when a model answers your prompt, memory bandwidth can be the limiting factor for latency and handling large context windows.
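A rough back-of-envelope way to see when each matters is the "ridge point" from the roofline model: divide a chip's peak FLOPS by its memory bandwidth to get the arithmetic intensity (FLOPs per byte moved) at which a workload stops being memory-bound and becomes compute-bound. A minimal sketch, using made-up placeholder spec numbers (not real H20 figures):

```python
# Back-of-envelope: is a workload compute-bound (FLOPS) or memory-bound (HBM)?
# The spec numbers below are illustrative placeholders, not real H20 figures.
peak_flops = 150e12      # hypothetical peak throughput: 150 TFLOPS
hbm_bandwidth = 4.0e12   # hypothetical memory bandwidth: 4 TB/s

# Roofline "ridge point": FLOPs per byte needed to saturate the compute units.
ridge = peak_flops / hbm_bandwidth  # 37.5 FLOPs per byte here

def bound(flops_per_byte):
    """Classify a workload by its arithmetic intensity (FLOPs per byte)."""
    if flops_per_byte >= ridge:
        return "compute-bound (FLOPS matters)"
    return "memory-bound (HBM matters)"

# Pretraining runs large batched matrix multiplies: high arithmetic intensity.
print(bound(300))   # compute-bound (FLOPS matters)
# Autoregressive inference re-reads the weights for every token: low intensity.
print(bound(2))     # memory-bound (HBM matters)
```

This is why the same chip can look weak for pretraining yet perfectly serviceable for inference: below the ridge point, extra FLOPS sit idle waiting on memory.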

The prompt I used (a learning template)

This is the core prompt I used across models — think of it as a universal “personal subject guide” template you can adapt to any topic:

I want you to be my personal subject guide for: understanding the pros and cons of allowing China to import NVIDIA H20 chips. Provide the following: (1) a simple, relatable analogy; (2) a detailed breakdown with a worked example and visual description; (3) a short knowledge check; (4) next steps and suggested resources; (5) an interactive or shareable artifact (one-page brief, infographic, web app); (6) sources and citations. (Later I added a seventh item: a bibliography including URLs.)

[Screenshot: the prompt template asking for an analogy, breakdown, knowledge check, resources, and bibliography]

What I tested (the tools)

  • Anthropic’s Claude
  • Google’s Gemini (Pro)
  • OpenAI’s ChatGPT / GPT-5
  • Perplexity (AI search engine)
  • NotebookLM (not directly a chat LLM but useful for audio/video overviews and multimodal study)
  • I briefly tried Chinese tools (Manus, Genspark) but used only small free tests and did not evaluate them deeply.

Comparison notes and what each model did well

Claude (Anthropic)

Claude produced a clean, readable breakdown and sections (analogy, explanation, etc.). It was good at structure, but when I asked for more visual, shareable artifacts (like a PDF or infographic), Claude struggled to produce a polished export. Iterating produced a decent interactive mockup and web code — useful if you’re okay refining prompts — but the PDF/export workflow was clunky.

[Screenshot: the interactive web app mockup Claude built, shown on screen]

Gemini (Google) — my top pick for this task

With Gemini Pro, the experience was noticeably smoother and multimodal. Gemini produced:

  • A solid one-page brief
  • A shareable infographic (graphical output that looked miles better than Claude’s)
  • Easy requests for alternate artifacts (infographic, web app, downloadable material), demonstrating strong UI/UX and coding capabilities
  • An audio overview generated from the content
  • An interactive web application built from the content

Gemini was, IMHO, the clear winner for this multimodal learning task, especially for people who want quick, polished deliverables (infographics, interactive apps, etc.).

[Screenshot: NotebookLM audio overview playback, summarizing the policy brief]

ChatGPT / GPT-5

GPT-5 produced a structured text breakdown similar to the others and, importantly, it generated a downloadable PDF brief I could actually save. That’s a practical advantage when you need an immediate file to share. There were some session hiccups (session expiration) during the demo, but when it worked, GPT-5 delivered a compact brief and a bibliography list (I later refined the prompt to ask for direct URLs in the bibliography).

[Screenshot: GPT-5 text breakdown and source listing, resembling Perplexity-style references]

Perplexity

Perplexity is an AI search engine that did well as a source aggregator. It clearly listed sources and gave a text synthesis, and when I asked it for technical detail (FLOPS vs HBM) it provided solid pointers and links. It didn’t produce polished documents or interactive artifacts as cleanly as Gemini, or downloadable PDFs like GPT-5, but it’s excellent for quick source-backed exploration and follow-up reading.

[Screenshot: Perplexity UI showing source links and text synthesis]

Key lessons from the experiment

  1. The same prompt across models will yield different strengths: structure (Claude), multimodal/polish (Gemini), downloadable file (GPT-5), source aggregation (Perplexity).
  2. Multimodal outputs improve retention: combining text + visual + audio is powerful for learning (NotebookLM-style audio + Gemini infographic + GPT-5 PDF = robust absorption).
  3. LLMs are predictive—not deterministic. Expect occasional hallucinations; always ask for citations/URLs and verify critical facts.
  4. Refine prompts iteratively. Add instructions like “include URLs in the bibliography” or “format the one-page brief to be printable as A4 PDF.” Small prompt edits can unlock big improvements.

[Screenshot: interactive web app created by the model, showing security calculation and risk outlook]

“We are dealing with, almost like, I don’t know, an alien intelligence in the sense of alien being foreign.”

Practical prompt improvements (copy / paste)

Use this refined learning prompt as a template:

  1. I want you to be my personal subject guide on: [TOPIC].
  2. Provide: (a) a simple analogy; (b) a detailed breakdown with a worked example; (c) a short knowledge check (3 Q&A); (d) next steps and recommended resources; (e) one-page shareable brief and an infographic; (f) an interactive outline or web app mockup if possible; (g) a bibliography with full URLs for all sources.
  3. Format the one-page brief so I can save as PDF and include clickable links in the bibliography.

Tips to get the best results

  • Tell the model what final artifact you want (PDF, infographic, web app) before asking it to research.
  • Ask explicitly for URLs and inline citations to minimize hallucinations.
  • Combine tools: use Perplexity or GPT-5 for sourcing, Gemini for visuals and polished artifacts, and NotebookLM for audio/video summaries.
  • Iterate — don’t expect perfection on the first pass. Use follow-ups like “Make this more visual” or “Add technical metrics (FLOPS/HBM) to the breakdown.”

Conclusion — which tool won?

For this specific multimodal learning task, I concluded that Google’s Gemini (Pro) was the best overall: it nailed the polished infographic, the interactive artifacts, and a strong UI for requesting different deliverables. GPT-5 scored points for producing an actual downloadable PDF brief. Claude was solid for structured text and iterative interactive mockups. Perplexity remains a very useful source-focused engine for verifying claims and gathering links. NotebookLM adds excellent audio/video overviews that aid retention.