Discover AI models for every task
by ZAI
ZAI GLM 5.1 is a 744B parameter Mixture-of-Experts language model built with the GLM‑MoE DSA architecture. It excels at agentic engineering, achieving state-of-the-art performance on benchmarks such as HLE with tools (52.3), SWE‑Bench Pro (58.4) and AIME 2026 (95.3). The model supports extensive tool use and long‑horizon reasoning, with a large context window of up to 128K tokens. It is released under the MIT license.
ZAI's frontier 744B MoE model (40B activated) with 203K context. Excels at agentic engineering, coding (SWE-bench 77.8%), reasoning, and tool use. Built with asynchronous RL and MIT licensed.
GLM-4.5 Air is a compact 106B parameter Mixture-of-Experts model with 12B active parameters, optimized for efficiency while maintaining strong performance. Scores 59.8 across 12 industry benchmarks with superior resource efficiency compared to full GLM-4.5. Features hybrid reasoning mode with 128K context, supports intelligent agent functions and tool calling. Released under MIT license with commercial use allowed. Ideal for deployment scenarios requiring balance between capability and computational cost.
GLM-4.5 is a 355B parameter Mixture-of-Experts foundation model with 32B active parameters, designed for intelligent agents. Features hybrid reasoning mode with configurable thinking, enabled by default. Ranks 3rd with a score of 63.2 across 12 industry benchmarks among all proprietary and open-source models. Released under MIT license with 128K context; supports reasoning, coding, and intelligent agent functions including OpenAI-style tool calling. Incorporates MTP (Multi-Token Prediction) layers with speculative decoding for efficient inference.
GLM-4.6 is a frontier-scale 355B parameter Mixture-of-Experts model with a 200K context window and 128K output capability. MIT licensed, making it the only model in its class that enterprises can self-host and deeply customize. Ranks #1 on LiveCodeBench v6 (82.8%) and HLE, and #3 on AIME 2025 (93.9%) and Terminal-Bench (40.5%). Near parity with Claude Sonnet 4 (48.6% win rate) while dramatically outperforming other open-source baselines. Purpose-built for agentic workflows, real-world coding, and tool-augmented problem-solving. Supports native tool calling during inference for complex multi-step tasks.
GLM-4.7 is Z.ai's latest large language model with enhanced reasoning capabilities. Excels at mathematical problem solving, coding, and complex logical tasks. Features improved context understanding and multilingual support.
by Qwen
Qwen3.5-122B-A10B is Alibaba Cloud's native multimodal agent model with 122B total parameters (10B activated). Features 240K context, vision capabilities, hybrid reasoning with extended thinking, function calling, and support for 201 languages. Apache 2.0 licensed.
Qwen 3.5 9B is a 9B‑parameter multimodal large language model with a gated‑delta mixture‑of‑experts architecture and a vision encoder. It supports a native context window of 262,144 tokens and operates in a default thinking mode that can be disabled. The model achieves strong results such as 82.5% on MMLU‑Pro, 88.2% on C‑Eval, and 78.4% on MMMU benchmarks. It is released under the Apache 2.0 license.
Qwen 3.5 397B A17B is a 397B-parameter mixture-of-experts vision-language foundation model with a gated delta network architecture and a vision encoder. It supports a native context window of 262,144 tokens (extendable to over 1 million) and operates in a default thinking mode that can be disabled. The model achieves strong results such as 87.8% on MMLU‑Pro, 85.0% on MMMU, and 88.6% on MathVision benchmarks. It is released under the Apache 2.0 license.
High-end multimodal model delivering strong vision-language reasoning with long-context support.
by Meta
Meta Llama 3.1 8B Instruct is an efficient multilingual instruction-tuned model optimized for dialogue and assistant use cases. With 8 billion parameters and 128K context length, it provides strong performance across general tasks, code generation, and multilingual understanding. Supports function calling and tool use with Grouped-Query Attention architecture. Ideal for deployment scenarios requiring lower compute resources while maintaining quality across English and 7 additional languages including German, French, Spanish, and Hindi.
by Black Forest Labs
Black Forest Labs FLUX.1 [dev] is a cutting-edge 12 billion parameter rectified flow transformer for text-to-image generation. Second only to FLUX.1 [pro] with strong prompt following matching closed-source alternatives. Features guidance distillation for efficient inference, high-resolution generation (1024x1024), accurate text rendering, and detailed composition. Supports both text-to-image and image-to-image generation. Open weights enable scientific research and innovative workflows.
Black Forest Labs FLUX.1 [dev] with LoRA adapter support. This variant enables fine-tuned generation with custom trained LoRA weights for specialized styles, characters, or concepts. Based on the full 12B parameter FLUX.1 [dev] model with all its capabilities including high-resolution generation, accurate text rendering, and detailed composition. Perfect for custom workflows and specialized image generation tasks.
Black Forest Labs FLUX.1 [schnell] is the fastest variant of the FLUX.1 family, optimized for rapid text-to-image generation with fewer inference steps. Built on the same 12B parameter rectified flow transformer architecture as FLUX.1 [dev] but distilled for maximum speed. Generates high-quality 1024x1024 images in 1-4 steps compared to 20-50 steps for standard models. Ideal for real-time applications, interactive tools, and high-throughput image generation scenarios. Apache 2.0 licensed for unrestricted use including commercial applications.
by MiniMax
MiniMax M2.5 is a state-of-the-art reasoning MoE model with 229B total / 10B active parameters. Extensively trained with reinforcement learning across 200,000+ real-world environments, achieving SOTA performance in coding (80.2% SWE-Bench Verified), agentic tool use, search, and office productivity tasks. Features 197K context window, efficient MoE inference, and strong multilingual support.
by Moonshot
Moonshot AI's most powerful native multimodal agentic model. Features 1T parameters (32B activated), 256K context, vision capabilities, and advanced reasoning with agent swarm support.
Meta's flagship 405B parameter model representing the pinnacle of open-source AI. Exceptional reasoning and comprehensive knowledge for demanding applications.
by BAAI
BAAI BGE Large EN V1.5 is a state-of-the-art English dense retrieval embedding model with 1024-dimensional embeddings and 512 token sequence length. Achieves 64.23 average on MTEB leaderboard across 56 tasks with 54.29 on retrieval. Pre-trained with RetroMAE and fine-tuned on large-scale contrastive learning data. V1.5 improvements include better similarity distribution and flexible usage without query instructions. Ideal for semantic search, document retrieval, re-ranking pipelines, and sentence similarity tasks. Production-ready with 3.4M+ downloads/month.
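As a rough illustration of the semantic search use case above, the sketch below ranks documents by cosine similarity between embedding vectors. Cosine similarity is assumed here as the scoring function, and the short toy vectors stand in for real 1024-dimensional embeddings; `cosine_similarity` and `rank_documents` are hypothetical helper names, not part of any BGE API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 4-dimensional vectors (assumption: real embeddings would come
    # from the model and have 1024 dimensions).
    query = [1.0, 0.0, 0.0, 0.0]
    docs = [
        [0.0, 1.0, 0.0, 0.0],  # orthogonal to the query
        [1.0, 0.0, 0.0, 0.0],  # identical direction
        [1.0, 1.0, 0.0, 0.0],  # partially aligned
    ]
    for index, score in rank_documents(query, docs):
        print(index, round(score, 3))
```

In a real pipeline the query and document vectors would be produced by the embedding model, and a vector index would typically replace the linear scan.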
Qwen2.5 72B Instruct is Alibaba's instruction-tuned large language model with 72B parameters. Excels at following complex instructions, coding, mathematical reasoning, and multilingual tasks. Features 128K context window.
MiniMax M2.1 is a state-of-the-art MoE model with 230B total / 10B active parameters, optimized for agentic coding and complex multi-step workflows. Excels at multilingual programming, tool use, and long-horizon planning. Matches Claude Sonnet 4.5 on code benchmarks and exceeds it in multilingual scenarios. Features 196K context window with FP8 efficiency. Released under Modified-MIT license for commercial use.
by DeepSeek
DeepSeek V3.1 is an optimized variant of DeepSeek V3 with enhanced chat capabilities. Offers excellent cost-efficiency with 685B MoE architecture and improved response quality for conversational tasks.
by OpenAI
GPT-OSS 120B is a powerful 117B parameter Mixture-of-Experts reasoning model with 5.1B active parameters, released under Apache 2.0. Features configurable reasoning effort (low/medium/high), full chain-of-thought visibility, and runs on a single 80GB GPU thanks to MXFP4 quantization. Native support for function calling, web browsing, Python code execution, and structured outputs. Designed for agentic tasks and complex reasoning with production-grade performance. Fully customizable for specialized use cases on a single H100 or MI300X GPU.
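The single-80GB-GPU claim can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes, as a simplification, that all 117B weights are stored in MXFP4 (4-bit values plus one shared 8-bit scale per block of 32, so 4.25 bits per parameter). It ignores any layers kept at higher precision, the KV cache, and activations, so the real footprint will differ. The function names are hypothetical.

```python
# Rough weight-memory estimate under MXFP4.
# Assumption: every weight is MXFP4; real checkpoints may keep some
# layers in higher precision, and runtime memory also includes the
# KV cache and activations.

MXFP4_ELEMENT_BITS = 4   # each value is a 4-bit float
MXFP4_SCALE_BITS = 8     # one shared 8-bit scale per block
MXFP4_BLOCK_SIZE = 32    # values per block


def mxfp4_bits_per_param() -> float:
    """Effective storage cost per parameter, including the shared scale."""
    return MXFP4_ELEMENT_BITS + MXFP4_SCALE_BITS / MXFP4_BLOCK_SIZE


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    total_params = 117e9  # parameter count from the listing
    estimate = weight_memory_gb(total_params, mxfp4_bits_per_param())
    print(f"Estimated weight memory: {estimate:.1f} GB")
```

Under these assumptions the weights come to roughly 62 GB, which leaves some headroom on an 80GB device; the same parameter count at 16 bits per weight would need about 234 GB.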
by Google
Gemma 3 27B IT is a cutting-edge multimodal vision-language model with 27 billion parameters, built on Gemini technology. Trained on 14 trillion tokens, it handles both text and image inputs with a 128K context window and supports 140+ languages. Excels at visual understanding, code generation, mathematical reasoning, and multilingual tasks. Achieves 78.6 on MMLU, 82.6 on GSM8K, 85.6 on DocVQA, and 76.3 on ChartQA. Lightweight enough for laptop deployment with strong safety improvements over previous Gemma versions.
Larger Gemma model delivering high-quality chat and coding with efficient inference.