Discover AI models for every task
by ZAI
ZAI's frontier 744B MoE model (40B activated) with 203K context. Excels at agentic engineering, coding (SWE-bench 77.8%), reasoning, and tool use. Built with asynchronous RL and released under the MIT license.
ZAI GLM 5.1 is a 744B parameter Mixture-of-Experts language model built with the GLM‑MoE DSA architecture. It excels at agentic engineering, achieving state-of-the-art performance on benchmarks such as HLE with tools (52.3), SWE‑Bench Pro (58.4) and AIME 2026 (95.3). The model supports extensive tool use and long‑horizon reasoning, with a large context window of up to 128K tokens. It is released under the MIT license.
GLM-4.5 Air is a compact 106B parameter Mixture-of-Experts model with 12B active parameters, optimized for efficiency while maintaining strong performance. Scores 59.8 across 12 industry benchmarks with better resource efficiency than the full GLM-4.5. Features a hybrid reasoning mode with 128K context and supports intelligent agent functions and tool calling. Released under the MIT license with commercial use allowed. Ideal for deployment scenarios requiring a balance between capability and computational cost.
GLM-4.5 is a 355B parameter Mixture-of-Experts foundation model with 32B active parameters, designed for intelligent agents. Features a hybrid reasoning mode with configurable thinking, enabled by default. Ranks 3rd among all proprietary and open-source models with a score of 63.2 across 12 industry benchmarks. Released under the MIT license with 128K context. Supports reasoning, coding, and intelligent agent functions including OpenAI-style tool calling. Incorporates MTP (Multi-Token Prediction) layers with speculative decoding for efficient inference.
GLM-4.6 is a frontier-scale 355B parameter Mixture-of-Experts model with a 200K context window and 128K output capability. MIT licensed, making it the only model in its class that enterprises can self-host and deeply customize. Ranks #1 on LiveCodeBench v6 (82.8%) and HLE, and #3 on AIME 2025 (93.9%) and Terminal-Bench (40.5%). Near parity with Claude Sonnet 4 (48.6% win rate) while dramatically outperforming other open-source baselines. Purpose-built for agentic workflows, real-world coding, and tool-augmented problem-solving. Supports native tool calling during inference for complex multi-step tasks.
GLM-4.7 is Z.ai's latest large language model with enhanced reasoning capabilities. Excels at mathematical problem solving, coding, and complex logical tasks. Features improved context understanding and multilingual support.
by Qwen
Qwen 3.5 9B is a 9B‑parameter multimodal large language model with a gated‑delta mixture‑of‑experts architecture and a vision encoder. It supports a native context window of 262,144 tokens and operates in a default thinking mode that can be disabled. The model achieves strong results such as 82.5% on MMLU‑Pro, 88.2% on C‑Eval, and 78.4% on MMMU benchmarks. It is released under the Apache 2.0 license.
Qwen 3.5 397B A17B is a 397B-parameter mixture-of-experts vision-language foundation model with a gated delta network architecture and a vision encoder. It supports a native context window of 262,144 tokens (extendable to over 1 million) and operates in a default thinking mode that can be disabled. The model achieves strong results such as 87.8% on MMLU‑Pro, 85.0% on MMMU, and 88.6% on MathVision benchmarks. It is released under the Apache 2.0 license.
High-end multimodal model delivering strong vision-language reasoning with long-context support.
by Moonshot
Moonshot AI's most powerful native multimodal agentic model. Features 1T parameters (32B activated), 256K context, vision capabilities, and advanced reasoning with agent swarm support.
by MiniMax
MiniMax M2.5 is a state-of-the-art reasoning MoE model with 229B total / 10B active parameters. Extensively trained with reinforcement learning across 200,000+ real-world environments, achieving SOTA performance in coding (80.2% SWE-Bench Verified), agentic tool use, search, and office productivity tasks. Features a 197K context window, efficient MoE inference, and strong multilingual support.
by BAAI
BAAI BGE Large EN V1.5 is a state-of-the-art English dense retrieval embedding model with 1024-dimensional embeddings and a 512-token maximum sequence length. Achieves a 64.23 average on the MTEB leaderboard across 56 tasks, with 54.29 on retrieval. Pre-trained with RetroMAE and fine-tuned on large-scale contrastive learning data. V1.5 improvements include a better similarity distribution and flexible usage without query instructions. Ideal for semantic search, document retrieval, re-ranking pipelines, and sentence similarity tasks. Production-ready with 3.4M+ downloads/month.
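As a rough illustration of the semantic search use case mentioned above, the sketch below ranks documents by cosine similarity between a query embedding and document embeddings. The vectors are made-up 4-dimensional stand-ins for the model's 1024-dimensional output, and `cosine_similarity` and `rank_documents` are hypothetical helper names, not part of any BGE API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed values, not real model output).
query = [1.0, 0.0, 1.0, 0.0]
docs = {
    "doc_a": [1.0, 0.0, 1.0, 0.0],  # same direction as the query
    "doc_b": [1.0, 1.0, 0.0, 0.0],  # partial overlap with the query
    "doc_c": [0.0, 1.0, 0.0, 1.0],  # orthogonal to the query
}

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a real pipeline the toy vectors would be replaced by embeddings produced by the model, and the ranking step would typically be delegated to a vector index rather than a Python loop.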
Qwen3.5-122B-A10B is Alibaba Cloud's native multimodal agent model with 122B total parameters (10B activated). Features 240K context, vision capabilities, hybrid reasoning with extended thinking, function calling, and support for 201 languages. Apache 2.0 licensed.
Qwen2.5 72B Instruct is Alibaba's instruction-tuned large language model with 72B parameters. Excels at following complex instructions, coding, mathematical reasoning, and multilingual tasks. Features 128K context window.
by Google
Larger Gemma model delivering high-quality chat and coding with efficient inference.
Gemma 3 27B IT is a cutting-edge multimodal vision-language model with 27 billion parameters, built on Gemini technology. Trained on 14 trillion tokens, it handles both text and image inputs with a 128K context window and supports 140+ languages. Excels at visual understanding, code generation, mathematical reasoning, and multilingual tasks. Achieves 78.6 on MMLU, 82.6 on GSM8K, 85.6 on DocVQA, and 76.3 on ChartQA. Lightweight enough for laptop deployment with strong safety improvements over previous Gemma versions.
by Meta
Moderation model providing robust safety classification and policy enforcement.
Google Gemma 4 31B is a 31B parameter dense multimodal language model with a 256K context window. It processes text, images, and video inputs and generates text output, featuring a configurable thinking mode for step‑by‑step reasoning. The model achieves 85.2% on MMLU‑Pro, 80.0% on LiveCodeBench v6, and 88.4% on MMMLU, demonstrating strong performance across reasoning and multimodal benchmarks. Available under the Apache 2.0 license.
by OpenAI
GPT-OSS 120B is a powerful 117B parameter Mixture-of-Experts reasoning model with 5.1B active parameters, released under Apache 2.0. Features configurable reasoning effort (low/medium/high) and full chain-of-thought visibility, and runs on a single 80GB GPU thanks to MXFP4 quantization. Native support for function calling, web browsing, Python code execution, and structured outputs. Designed for agentic tasks and complex reasoning with production-grade performance. Fully customizable for specialized use cases on a single H100 or MI300X.
GPT-OSS 20B is a compact 21B parameter Mixture-of-Experts model with 3.6B active parameters, designed for lower latency and local deployment. Runs within 16GB of memory with configurable reasoning effort, full chain-of-thought access, and native agentic capabilities including function calling and structured outputs. Released under the Apache 2.0 license and well suited to specialized fine-tuning on consumer hardware. Companion model to GPT-OSS 120B, optimized for speed while maintaining strong reasoning capabilities.
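The single-GPU claim above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is a simplification that assumes every one of the 117B parameters is stored in MXFP4 (4-bit elements plus one shared 8-bit scale per block of 32 elements, so 4.25 bits per parameter) and ignores activations, KV cache, and any layers kept at higher precision. The helper `weight_footprint_gb` is a hypothetical name used only for illustration.

```python
def weight_footprint_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# MXFP4: 4-bit elements plus one shared 8-bit scale per 32-element block.
MXFP4_BITS = 4 + 8 / 32  # 4.25 bits per parameter
BF16_BITS = 16           # unquantized baseline for comparison

TOTAL_PARAMS = 117e9     # parameter count quoted in the description
GPU_MEMORY_GB = 80       # single-GPU budget quoted in the description

mxfp4_gb = weight_footprint_gb(TOTAL_PARAMS, MXFP4_BITS)
bf16_gb = weight_footprint_gb(TOTAL_PARAMS, BF16_BITS)

print(f"MXFP4 weights: {mxfp4_gb:.1f} GB (fits in {GPU_MEMORY_GB} GB: {mxfp4_gb < GPU_MEMORY_GB})")
print(f"BF16 weights:  {bf16_gb:.1f} GB (fits in {GPU_MEMORY_GB} GB: {bf16_gb < GPU_MEMORY_GB})")
```

Under these assumptions the quantized weights come to roughly 62 GB, leaving headroom on an 80GB device, whereas the same parameter count at 16 bits would need about 234 GB.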
BAAI BGE Multilingual Gemma2 is a multilingual dense retrieval embedding model built on the Gemma 2 architecture, supporting 100+ languages for cross-lingual semantic search and retrieval. Delivers strong performance across diverse language families, including English, Chinese, Spanish, Arabic, and Hindi. Ideal for multilingual search systems, cross-lingual document retrieval, international content recommendation, and global knowledge bases. Trained on large-scale multilingual data with balanced language representation.
Qwen2-based text embedding model optimized for semantic similarity and retrieval tasks.