ByteDance Open-Sources Lance: 3B Parameters, Six Tasks, VBench Top Among Unified Models

ByteDance's Intelligent Creation Lab has open-sourced Lance, a unified model that handles six multimodal tasks with only 3B activated parameters.

📌 GitHub：https://github.com/bytedance/Lance
HuggingFace：https://huggingface.co/bytedance-research/Lance
Paper: http://arxiv.org/abs/2605.18678
Project page: https://lance-project.github.io
License: Apache 2.0

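Six Capabilities, One Model

Lance packs the following six tasks into a single set of weights:

Task	Description
Image understanding	Captioning, VQA, visual reasoning
Image generation	Text-to-image, up to 768×768
Image editing	Background replacement, content changes, watermark removal
Video understanding	Video captioning, video question answering
Video generation	Text/image-to-video, 480p / 12 fps / up to 121 frames
Video editing	Scene changes, character replacement, added effects

Work that used to require six specialized models is now handled by one Lance checkpoint.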
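The video generation limits in the table translate into a clip length by simple arithmetic. The following is a minimal sketch, assuming the 121-frame cap and the 12 fps rate apply to the same clip; the function name `clip_seconds` is illustrative and not part of Lance.

```python
def clip_seconds(num_frames: int, fps: float) -> float:
    """Return the playback duration in seconds of num_frames frames shown at fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Limits quoted in the table: up to 121 frames at 12 fps.
print(f"{clip_seconds(121, 12):.2f} s")  # prints "10.08 s"
```

At these settings the longest clip therefore runs for roughly ten seconds.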

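Benchmark Highlights

GenEval (image generation): 0.90, tied for first among unified models alongside 7B models
VBench (video generation): 85.11, first among unified models
GEdit-Bench (image editing): 7.30, highest among unified models
MVBench (video understanding): 62.0, highest among unified models, more than ten points ahead of second place
DPG-Bench (image generation): 84.67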

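Technical Highlights

Lance uses a dual-stream MoE architecture together with its own MaPE (Modality-Aware Rotary Positional Encoding), which decouples the understanding and generation pathways while letting them share an interleaved multimodal context. A staged multi-task training strategy lets the six capabilities reinforce rather than interfere with each other.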
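For readers unfamiliar with rotary positional encoding, the sketch below shows the standard, modality-agnostic version that MaPE's name refers to. It is not Lance's implementation: this article does not specify how MaPE makes the encoding modality-aware, so that part is deliberately left out. The helper names `rope` and `dot` and the base of 10000 are assumptions taken from the common formulation.

```python
import math


def rope(vec, position, base=10000.0):
    """Apply standard rotary positional encoding to an even-length vector.

    Each consecutive pair of elements is rotated by an angle proportional
    to the token position, with a lower frequency for later pairs.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# The attention score depends only on the relative offset (2 in both cases).
near = dot(rope(q, 5), rope(k, 3))
far = dot(rope(q, 12), rope(k, 10))
print(abs(near - far) < 1e-9)  # prints True
```

The property shown here, that scores depend on relative rather than absolute position, is what makes rotary encodings attractive for long interleaved sequences.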

Only 3B parameters are activated, and training used up to 128 A100 GPUs. Deployment requirements: Python 3.10+, CUDA 12.4+, and at least 40GB of VRAM.
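The stated requirements can be checked before downloading any weights. The following is a minimal sketch, not an official Lance tool: the function names `parse_version` and `unmet_requirements` are hypothetical, and the thresholds are simply the three numbers quoted above.

```python
def parse_version(text):
    """Turn a version string such as '3.10.4' or '12.4' into (major, minor)."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def unmet_requirements(python_version, cuda_version, vram_gb):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if parse_version(python_version) < (3, 10):
        problems.append(f"Python {python_version} is older than 3.10")
    if cuda_version is None:
        problems.append("no CUDA version detected")
    elif parse_version(cuda_version) < (12, 4):
        problems.append(f"CUDA {cuda_version} is older than 12.4")
    if vram_gb < 40:
        problems.append(f"{vram_gb} GB of VRAM is below the 40 GB minimum")
    return problems


print(unmet_requirements("3.11.6", "12.4", 80))  # prints []
```

The inputs can come from `platform.python_version()` and, assuming a PyTorch-based setup (this article does not say which framework Lance uses), from `torch.version.cuda` and `torch.cuda.get_device_properties(0).total_memory`.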

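Lance is currently available as a research preview. Fine-tuning code and image-to-video support are on the roadmap but have not yet been released.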

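© 2026 Author: Mycelium Protocol. Licensed under CC BY 4.0: free to share and adapt with attribution. You must credit the author and link to the original; removing attribution and republishing as original is not permitted.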


