CVPR 2025

CVPR 2025 Accepted Papers

CVPR25 Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Video-MME

https://github.com/BradyFU/Video-MME?tab=readme-ov-file

1. Awesome-LLMs-for-Video-Understanding

https://github.com/yunlong10/Awesome-LLMs-for-Video-Understanding

https://arxiv.org/pdf/2312.17432v4 (July 2024 revision)

From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding

https://arxiv.org/pdf/2409.18938

https://github.com/Vincent-ZHQ/LV-LLMs

Papers with Code: Video Understanding

https://paperswithcode.com/task/video-understanding/latest

Awesome-Multimodal-Large-Language-Models

https://github.com/yfzhang114/Awesome-Multimodal-Large-Language-Models?tab=readme-ov-file

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

https://arxiv.org/pdf/2411.15296

A long-form summary of the latest progress in multimodal large model evaluation - article by yearn - Zhihu

https://zhuanlan.zhihu.com/p/16815782175

Another Awesome-Multimodal-Large-Language-Models

https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models?tab=readme-ov-file

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

What are good research directions in multimodal learning? - answer by 梦想成真 - Zhihu

https://www.zhihu.com/question/332876504/answer/130142183129
