This large multimodal model combines text, vision, and interface interaction in a single system, enabling it to understand screenshots, videos, and documents. It can also reason in multiple steps and handle over 200 languages.
Updated means this listing was last refreshed on Feb 27, 2026.
Create complete videos from a simple sentence using AI. You can combine up to 12 references (9 images, 3 videos, 3 audio clips of up to 15 seconds) to maintain character consistenc
A search engine dedicated to YouTube videos that can identify a specific passage from a simple description. The results point directly to the most relevant content to save time
Deep Swap AI
Create hyperrealistic video montages by replacing any face. This AI deals with complex angles and expressions for virtually undetectable results, both in photos and animated clips.
StudyBuddy
AI tool for summarizing documents and generating study aids like flashcards and quizzes.
Wispr Flow
AI voice-to-text that types anywhere on your computer as you speak
Voicenotes
AI voice recorder that transcribes, summarizes, and organizes your spoken notes