Visualizing Vision-Language Models


Published: November 02, 2025

This post covers the pixel and token data flow and the tensor transformations along the way, the context window, and three attention variants: Multi-Head Attention, Grouped-Query Attention, and Sliding-Window Attention. It also explores the autoregressive nature of these models and the limitations of their spatial reasoning.
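As a rough illustration of two of the attention variants named above, the sketch below builds a causal mask with an optional sliding window and shows how query heads share key/value heads under grouped-query attention. The function names `causal_mask` and `kv_head_for_query_head` and all sizes are hypothetical choices for this example and are not taken from any particular model or library.

```python
def causal_mask(seq_len, window=None):
    """Return mask[i][j] == True when query position i may attend to key position j.

    window=None gives a full causal mask (each token sees itself and all
    earlier tokens). window=w keeps only the w most recent positions,
    the current one included, which is the sliding-window variant.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            visible = j <= i  # autoregressive: never look at future tokens
            if window is not None:
                visible = visible and (i - j) < window
            row.append(visible)
        mask.append(row)
    return mask


def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    """Map a query head index to the key/value head it shares.

    num_kv_heads == num_q_heads behaves like multi-head attention;
    num_kv_heads == 1 means every query head shares one key/value head.
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be divisible by num_kv_heads")
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size


if __name__ == "__main__":
    # Print a 5-token mask with a window of 3 ('x' = may attend).
    for row in causal_mask(5, window=3):
        print("".join("x" if v else "." for v in row))
    # Assumed sizes for illustration: 8 query heads sharing 2 key/value heads.
    print([kv_head_for_query_head(h, 8, 2) for h in range(8)])
```

With a smaller window each token attends to fewer keys, and with fewer key/value heads there are fewer key/value tensors to cache, which is the usual motivation for both variants when the context window is long.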

