Conversational AI and chatbots
5 tools
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
The first distributed AGI system. Thousands of autonomous AI agents collaboratively train models, share experiments via P2P gossip, and push breakt...
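A 420,000-character dissection of the skeleton and nervous system of an AI agent harness: an in-depth analysis of the Claude Code architecture, in 15 chapters from the conversation loop to building your own Agent Harness. Online reading site: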
Desktop AI assistant using ChatGPT for writing, coding, learning, and productivity.
An autonomous agent that takes work, does work, gets paid, and gets better at it.
Production-grade multi-agent orchestration framework. Model-agnostic, supports team collaboration, task scheduling, and inter-agent communication.
DeFi toolkit for AI agents and coding assistants — deposit funds, execute trades, and manage crypto wallets. Works with Claude Code, Cursor, Windsu...
Claude reads its own source code — 17-chapter architectural deep-dive into Claude Code v2.1.88. EN/ZH bilingual.
Autonomous research AI agent for Japanese stocks | AI agent for deep financial research on Japanese listed companies. Powered by EDINET DB + J-Quants.
🎬 Fully Automated AI Video Agent · Generates a finished, subtitled video from a single sentence · Local Deployment
M-Cube (M³) — Multi-thinking, Multimodal, Multi-verification Patent Drafting Assistant
Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device
MCP server for Google Ads, Meta Ads & GA4 — works with ChatGPT, Claude, Cursor, n8n, Windsurf & more. 250+ tools for campaign management, analytics...
说人话 ("Speak Plainly") | Chinese-first AI writing refinement skill that reduces boilerplate and AI tone while preserving facts, terminology, and technical context.
OpenCode plugin for Magic Context — cache-aware infinite context, cross-session memory, and background history compression for AI coding agents
Compare the best AI models, including ChatGPT-5, Google Gemini 2.5, Claude 4 Sonnet, DeepSeek R1, Llama 4, Perplexity, Grok, and 30+ others
The open-source memory operating system for AI agents. Persistent memory, semantic search, loop detection, agent messaging, crash recovery, and rea...
This large multimodal model combines text, vision, and interface interaction in a single system, enabling it to understand screenshots, videos, and documents. It can also reason in