Discover AI models for every task
by Qwen
Qwen 3.5 397B A17B is a 397B-parameter mixture-of-experts vision-language foundation model with a gated delta network architecture and a vision encoder. It supports a native context window of 262,144 tokens (extendable to over 1 million) and operates in a default thinking mode that can be disabled. The model achieves strong results such as 87.8% on MMLU‑Pro, 85.0% on MMMU, and 88.6% on MathVision benchmarks. It is released under the Apache 2.0 license.
Qwen3.5-122B-A10B is Alibaba Cloud's native multimodal agent model with 122B total parameters (10B activated). Features 240K context, vision capabilities, hybrid reasoning with extended thinking, function calling, and support for 201 languages. Apache 2.0 licensed.
Qwen3 30B A3B Thinking is the reasoning-focused MoE variant with 30B total / 3B activated parameters. Features explicit thinking mode for complex problem-solving with 262K native context extending to 1M tokens. Optimized for mathematical reasoning, logical inference, and multi-step problem decomposition while maintaining computational efficiency. Provides strong reasoning capabilities at a fraction of the compute cost of larger thinking models, ideal for resource-conscious deployments requiring deep reasoning.
Qwen3 30B A3B Instruct is a compact Mixture-of-Experts model with 30B total parameters and 3B activated per token, offering excellent efficiency for general-purpose tasks. Features 262K native context with extension to 1M tokens, strong multilingual capabilities, and enhanced instruction following. Balances performance and computational efficiency with support for tool calling, code generation, and logical reasoning. Ideal for deployment scenarios requiring lower resource usage while maintaining quality across diverse task types.
Qwen3 32B is a base foundation model with 32 billion parameters and 262K native context, designed for fine-tuning and custom adaptations. Pre-trained on diverse multilingual data, providing strong general capabilities across text understanding, code, mathematics, and reasoning. Serves as the foundation for specialized models and custom fine-tuning projects requiring a powerful mid-sized base. Ideal starting point for domain-specific adaptations and research applications.
Qwen 3.5 9B is a 9B‑parameter multimodal large language model with a gated‑delta mixture‑of‑experts architecture and a vision encoder. It supports a native context window of 262,144 tokens and operates in a default thinking mode that can be disabled. The model achieves strong results such as 82.5% on MMLU‑Pro, 88.2% on C‑Eval, and 78.4% on MMMU benchmarks. It is released under the Apache 2.0 license.
Qwen's "thinking-optimized" 80B model designed for sustained multi-step reasoning, structured deliberation, and high-precision problem-solving across math, code, and complex planning tasks.
Qwen3 Coder 30B A3B Instruct is an efficient Mixture-of-Experts coding model with 30B total parameters and 3B activated per token. Specialized for code generation, debugging, and software engineering with excellent computational efficiency. Features 262K native context for processing large codebases, strong multi-language programming support, and optimized for practical coding tasks. Balances coding performance with lower resource requirements, ideal for development environments and real-time code assistance.
Qwen3 Coder 480B A35B Instruct is a specialized Mixture-of-Experts coding model with 480B total parameters and 35B activated. Optimized specifically for code generation, code understanding, debugging, and software engineering tasks. Features 262K native context for handling large codebases, strong performance on coding benchmarks including LiveCodeBench and HumanEval, and support for multiple programming languages. Excels at complex algorithmic problems, code refactoring, and technical documentation generation.
Qwen3 235B A22B Thinking is the reasoning-enhanced MoE variant with 235B total / 22B activated parameters and 128 experts. Features explicit thinking mode for complex problem-solving with native 262K context extending to 1M tokens. Excels at deep reasoning tasks requiring multi-step deliberation including advanced mathematics, logical inference, and complex coding challenges. Built on same architecture as Instruct version but optimized for reasoning-heavy workloads with tool integration and agentic capabilities.
Qwen3 235B A22B Instruct is a Mixture-of-Experts model with 235B total parameters and 22B activated, featuring 128 experts with 8 activated per token. Native 262K context, extended to 1M tokens via Dual Chunk Attention. Achieves state-of-the-art results: 83.0 MMLU-Pro, 70.3 AIME25, 41.8 ARC-AGI, 79.2 Arena-Hard v2, 51.8 LiveCodeBench, 70.9 BFCL-v3. Non-thinking mode focuses on direct task execution with enhanced instruction following, logical reasoning, and long-tail knowledge across multiple languages. Dramatically more efficient than a dense model of comparable size.
Qwen3 Embedding 8B is a dense retrieval embedding model with 8 billion parameters, optimized for semantic search, text similarity, and feature extraction. Trained on diverse multilingual data providing strong cross-lingual retrieval capabilities. Supports 262K context for embedding long documents and extensive text passages. Excels at document retrieval, semantic search, clustering, and recommendation systems. Compatible with standard embedding frameworks and optimized for production deployment with efficient inference.
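For retrieval with an embedding model like this, documents are ranked by cosine similarity between vectors. A minimal sketch follows; the core ranking logic runs on toy 3-d vectors, and the commented loading path (sentence-transformers with the "Qwen/Qwen3-Embedding-8B" model id) is an assumption about how the weights would be served, not a confirmed integration.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# In a real deployment the vectors would come from the embedding model,
# e.g. (hypothetical usage, assuming sentence-transformers support):
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("Qwen/Qwen3-Embedding-8B")
#   doc_vecs = model.encode(documents)
# Toy 3-d vectors stand in for real embeddings here:
query = [0.1, 0.9, 0.2]
docs = {
    "doc_a": [0.1, 0.8, 0.3],
    "doc_b": [0.9, 0.1, 0.0],
}
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]),
                reverse=True)
print(ranked[0])  # doc_a, the vector closest to the query
```

The same scoring works for clustering and recommendation: embed once, then compare vectors instead of re-running the model per query.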
Qwen3-Coder-Next is an open-weight coding model optimized for agents and local development, delivering high performance, long-context reasoning, and efficient IDE integration with minimal active parameters.
Qwen3 VL 235B A22B Instruct is Alibaba's vision-language MoE model with 235B total / 22B active parameters. Combines state-of-the-art text and vision understanding with excellent performance on multimodal reasoning tasks.
High-end multimodal model delivering strong vision-language reasoning with long-context support.
Qwen2.5 72B Instruct is Alibaba's instruction-tuned large language model with 72B parameters. Excels at following complex instructions, coding, mathematical reasoning, and multilingual tasks. Features 128K context window.
Qwen2-based text embedding model optimized for semantic similarity and retrieval tasks.
Image generation model from the Qwen series with advanced text rendering and precise image editing capabilities.
by ZAI
GLM-4.5 Air is a compact 106B parameter Mixture-of-Experts model with 12B active parameters, optimized for efficiency while maintaining strong performance. Scores 59.8 across 12 industry benchmarks with superior resource efficiency compared to full GLM-4.5. Features hybrid reasoning mode with 128K context, supports intelligent agent functions and tool calling. Released under MIT license with commercial use allowed. Ideal for deployment scenarios requiring balance between capability and computational cost.
ZAI GLM 5.1 is a 744B parameter Mixture-of-Experts language model built with the GLM‑MoE DSA architecture. It excels at agentic engineering, achieving state-of-the-art performance on benchmarks such as HLE with tools (52.3), SWE‑Bench Pro (58.4) and AIME 2026 (95.3). The model supports extensive tool use and long‑horizon reasoning, with a large context window of up to 128K tokens. It is released under the MIT license.
by NVIDIA
NVIDIA Nemotron 3 Super 120B A12B FP8 is a 120B-parameter (12B active) latent Mixture-of-Experts hybrid model with Mamba-2, MoE, and Multi-Token Prediction layers, supporting up to 1M tokens of context. It achieves 94.73% on HMMT Feb25 (with tools), 83.73% on MMLU‑Pro, and 73.88% on Arena‑Hard‑V2 (Hard Prompt). The model supports configurable reasoning via an enable_thinking flag, tool use, and structured output. It is available under the NVIDIA Nemotron Open Model License.
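Reasoning flags like enable_thinking are typically passed through the chat template at request time. The sketch below builds an OpenAI-compatible request body with the flag off; the model id and the "chat_template_kwargs" pass-through field are assumptions about a vLLM-style deployment, not a documented NVIDIA API, and no request is actually sent.

```python
import json

# Hypothetical OpenAI-compatible request body. The model id and the
# "chat_template_kwargs" field are assumptions about a vLLM-style server;
# check your serving stack's docs for the actual pass-through mechanism.
payload = {
    "model": "nvidia/nemotron-3-super-120b",  # placeholder model id
    "messages": [
        {"role": "user", "content": "Summarize the attention mechanism."}
    ],
    # Toggle the reasoning trace off for lower-latency direct answers:
    "chat_template_kwargs": {"enable_thinking": False},
}
body = json.dumps(payload)
print(json.loads(body)["chat_template_kwargs"]["enable_thinking"])  # False
```

With the flag set to True (often the default for thinking-mode models), responses would include the intermediate reasoning trace before the final answer.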
by MiniMax
MiniMax M2.5 is a state-of-the-art reasoning MoE model with 229B total / 10B active parameters. Extensively trained with reinforcement learning across 200,000+ real-world environments, achieving SOTA performance in coding (80.2% SWE-Bench Verified), agentic tool use, search, and office productivity tasks. Features 197K context window, efficient MoE inference, and strong multilingual support.
by ZAI
GLM-4.5 is a 355B parameter Mixture-of-Experts foundation model with 32B active parameters, designed for intelligent agents. Features hybrid reasoning mode with configurable thinking enabled by default. Ranks 3rd at 63.2 across 12 industry benchmarks among all proprietary and open-source models. Released under MIT license with 128K context, supports reasoning, coding, and intelligent agent functions including OpenAI-style tool calling. Incorporates MTP (Multi-Token Prediction) layers with speculative decoding for efficient inference.
ZAI's frontier 744B MoE model (40B activated) with 203K context. Excels at agentic engineering, coding (SWE-bench 77.8%), reasoning, and tool use. Built with asynchronous RL and MIT licensed.
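Several models above advertise OpenAI-style tool calling. A minimal sketch of the client side follows; the function name and parameters ("get_weather", "city") and the model id are illustrative placeholders, and no request is actually sent.

```python
import json

# Minimal OpenAI-style tool declaration. The function and its schema are
# hypothetical examples, not part of any model's or provider's API.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

request = {
    "model": "zai/glm-4.5",  # placeholder model id
    "messages": [{"role": "user", "content": "Weather in Oslo?"}],
    "tools": tools,
    "tool_choice": "auto",
}
# A tool-calling model replies with a tool_calls entry naming the function
# and its JSON-encoded arguments; the client executes the call and feeds
# the result back as a "tool" role message for the final answer.
body = json.dumps(request)
print(json.loads(body)["tools"][0]["function"]["name"])  # get_weather
```

The same schema shape is what benchmarks like BFCL-v3 exercise: the model must pick the right function and emit arguments matching the declared JSON Schema.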