For the first time, your AI assistant can understand videos, find relevant scenes, and generate summaries for you
SAN FRANCISCO, Sept. 24, 2025 /PRNewswire-PRWeb/ -- TwelveLabs, the leader in multimodal video intelligence, today announced the launch of its Model Context Protocol (MCP) Server. With the server, AI assistants and agents can understand and interact with video data at scale for the first time.
The TwelveLabs MCP Server bridges the company's industry-leading video understanding models with popular AI clients, such as Claude Desktop, Cursor, and Goose. Built on the open MCP standard, the server acts as a universal adapter, allowing developers to give their AI applications video "superpowers" through a plug-and-play interface.
"With MCP, video becomes a first-class capability inside any AI workflow," said Jae Lee, CEO at TwelveLabs. "Developers no longer need to stitch together APIs or build custom integrations. Our view for a long time has been that multi-modal shouldn't mean multi-model. Now, agents can instantly search, summarize, and reason over hours of video, just by spinning up our MCP server."
Unlocking New Use Cases
The TwelveLabs MCP Server gives an AI agent eyes on video content by adding a standardized tool to its toolbox. This can unlock a new wave of multimodal applications, from smarter virtual assistants that understand meeting recordings to creative generative agents that mix video context into their outputs.
By exposing TwelveLabs' video-native models, Marengo for multimodal embeddings and Pegasus for video-to-text reasoning, through MCP, the server enables:
- Semantic search: Find exact moments across hours of footage with natural language.
- Automatic summaries & Q&A: Turn long content or events into concise reports.
- RAG-style chaining: Combine search and analysis tools to build multi-step video workflows.
- Interactive assistants: Build AI agents that collaborate with users in real time to explore video.
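For developers, the chaining pattern in the list above can be pictured with a minimal sketch. It is illustrative only: the `Clip` type, the `search` and `analyze` callables, and the stub functions are hypothetical stand-ins, not the TwelveLabs API or the tool names exposed by the MCP Server. The sketch assumes a search step that returns scored clips and an analysis step that answers a question about a single clip.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Clip:
    """Hypothetical search hit: a time range inside a video plus a relevance score."""
    video_id: str
    start: float  # seconds
    end: float    # seconds
    score: float


# Hypothetical tool signatures, for illustration only.
SearchTool = Callable[[str], List[Clip]]
AnalyzeTool = Callable[[Clip, str], str]


def chain_search_then_analyze(
    query: str,
    question: str,
    search: SearchTool,
    analyze: AnalyzeTool,
    top_k: int = 3,
) -> List[str]:
    """Retrieve the top_k highest-scoring clips, then analyze each one."""
    clips = sorted(search(query), key=lambda c: c.score, reverse=True)[:top_k]
    return [analyze(clip, question) for clip in clips]


# Stub tools so the sketch runs without any network access or API key.
def fake_search(query: str) -> List[Clip]:
    return [
        Clip("vid-1", 12.0, 30.0, 0.91),
        Clip("vid-2", 5.0, 9.5, 0.42),
        Clip("vid-1", 100.0, 130.0, 0.77),
    ]


def fake_analyze(clip: Clip, question: str) -> str:
    return f"{clip.video_id} [{clip.start:.0f}s-{clip.end:.0f}s]: answer to '{question}'"


if __name__ == "__main__":
    for line in chain_search_then_analyze(
        "goal celebration", "Who scored?", fake_search, fake_analyze, top_k=2
    ):
        print(line)
```

In a real deployment, an MCP client would invoke the server's own tools in place of these stubs; the retrieve-then-analyze structure is the only part this sketch is meant to convey.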
The TwelveLabs MCP Server has been verified with Claude Desktop, Cursor, and Goose, with more integrations coming. Developers can get started in minutes by following the Installation Guide and connecting with their TwelveLabs API key.
To learn more, read the announcement on the TwelveLabs blog.
About TwelveLabs
TwelveLabs is the world's most powerful video intelligence platform, enabling machines to see, hear, and reason about video like humans do. From semantic search to automated summaries and multimodal embeddings, TwelveLabs empowers developers and enterprises to unlock the full potential of video data across industries including media, advertising, security, and automotive. For more information, visit www.twelvelabs.io.
Media Contact
Amber Moore, Moore Communications, 1 503-943-9381
SOURCE Moore Communications
