Discover AI models for every task
Showing 1-14 of 14 models
by MiniMax
MiniMax M2.1 is a state-of-the-art MoE model with 230B total / 10B active parameters, optimized for agentic coding and complex multi-step workflows. Excels at multilingual programming, tool use, and long-horizon planning. Matches Claude Sonnet 4.5 on code benchmarks and exceeds it in multilingual scenarios. Features 196K context window with FP8 efficiency. Released under Modified-MIT license for commercial use.
MiniMax M2.7 is built on the MiniMax M2 architecture with an undisclosed parameter count, optimized for advanced reasoning and agentic workflows. It excels at complex tool use, self-evolution, and professional software engineering tasks, achieving a 66.6% medal rate on MLE-Bench Lite and a 56.2% score on SWE-Bench Pro. The model also attains an Elo of 1495 on GDPval-AA, surpassing other open-weight models. Available under an unspecified ("Other") license.
MiniMax M2 is a powerful MoE model with 230B total / 10B active parameters. Optimized for reasoning and coding tasks with excellent performance in multilingual scenarios. Features 128K context window and efficient inference.
MiniMax M2.5 is a state-of-the-art reasoning MoE model with 229B total / 10B active parameters. Extensively trained with reinforcement learning across 200,000+ real-world environments, achieving SOTA performance in coding (80.2% SWE-Bench Verified), agentic tool use, search, and office productivity tasks. Features 197K context window, efficient MoE inference, and strong multilingual support.
by ZAI
ZAI GLM 5.1 is a 744B parameter Mixture-of-Experts language model built with the GLM‑MoE DSA architecture. It excels at agentic engineering, achieving state-of-the-art performance on benchmarks such as HLE with tools (52.3), SWE‑Bench Pro (58.4) and AIME 2026 (95.3). The model supports extensive tool use and long‑horizon reasoning, with a large context window of up to 128K tokens. It is released under the MIT license.
by Mistral
Mistral Small 3.2 24B Instruct is a multimodal instruction-tuned model supporting both vision and text, with 24B parameters and 128K context. Major improvements over 3.1 include better instruction following (84.78%), a 2x reduction in repetition errors, and more robust function calling. Achieves 65.33% on Wildbench v2, 43.1% on Arena Hard v2, and 92.90% on HumanEval Plus Pass@5. Vision benchmarks: 87.4% ChartQA, 94.86% DocVQA, 62.50% MMMU. Supports up to 10 images per prompt with integrated vision-based function calling.
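A minimal sketch of exercising the card's vision-plus-function-calling combination, assuming the model is served behind an OpenAI-compatible endpoint (e.g., vLLM). The base URL, the image URL, and the get_chart_value tool are illustrative assumptions, not documented specifics:

```python
# Hypothetical sketch: image input plus a tool definition through an
# OpenAI-compatible server. Endpoint and tool are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_chart_value",  # made-up tool for illustration
        "description": "Look up a numeric value referenced in a chart.",
        "parameters": {
            "type": "object",
            "properties": {"label": {"type": "string"}},
            "required": ["label"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.2-24B-Instruct-2506",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is the highest bar in this chart?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
        ],
    }],
    tools=tools,
)
# The reply may be plain text or a tool call, depending on the model's choice.
print(response.choices[0].message)
```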
by DeepSeek
DeepSeek V3.1 is an optimized variant of DeepSeek V3 with enhanced chat capabilities. Offers excellent cost-efficiency with a 685B-parameter MoE architecture and improved response quality for conversational tasks.
by Meta
Meta Llama 3.1 8B Instruct is an efficient multilingual instruction-tuned model optimized for dialogue and assistant use cases. With 8 billion parameters and 128K context length, it provides strong performance across general tasks, code generation, and multilingual understanding. Supports function calling and tool use with Grouped-Query Attention architecture. Ideal for deployment scenarios requiring lower compute resources while maintaining quality across English and 7 additional languages including German, French, Spanish, and Hindi.
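A minimal sketch of the tool-use flow this card describes, using the tool support in Hugging Face transformers' chat templating; the get_weather function is a made-up example, not part of the model:

```python
# Sketch of Llama 3.1 tool use via the transformers chat template.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    ...  # illustrative stub; a real tool would call a weather API

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# The template turns get_weather's signature and docstring into a JSON schema
# in the prompt; the model is trained to emit a matching tool call when useful.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
```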
by intfloat
intfloat E5-Mistral-7B-Instruct is a state-of-the-art instruction-following embedding model built on Mistral 7B architecture. Combines strong language understanding from Mistral with specialized embedding training for retrieval tasks. Features instruction-based embedding generation allowing natural language queries to guide semantic search. Excels at complex retrieval scenarios, multi-hop reasoning in document search, and instruction-guided similarity tasks. Provides significantly improved zero-shot retrieval performance compared to traditional embedding models.
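A minimal sketch of the instruction-guided retrieval described above, via sentence-transformers, following the instruction-prefixed query format the E5 family uses; the task string and toy documents are illustrative:

```python
# Instruction-guided embedding retrieval with E5-Mistral.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-mistral-7b-instruct")

# The natural-language task instruction steers what "similar" means.
task = "Given a web search query, retrieve relevant passages that answer the query"
queries = [f"Instruct: {task}\nQuery: how do rectified flow models generate images"]
documents = [
    "Rectified flow transformers learn straight probability paths for sampling.",
    "The recipe calls for two cups of flour and a pinch of salt.",
]

q_emb = model.encode(queries, normalize_embeddings=True)
d_emb = model.encode(documents, normalize_embeddings=True)  # no instruction on docs
print(q_emb @ d_emb.T)  # cosine similarities; the on-topic passage should score higher
```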
by Black Forest Labs
Black Forest Labs FLUX.1 [dev] is a cutting-edge 12 billion parameter rectified flow transformer for text-to-image generation. Second only to FLUX.1 [pro] in output quality, with prompt following that matches closed-source alternatives. Features guidance distillation for efficient inference, high-resolution generation (1024x1024), accurate text rendering, and detailed composition. Supports both text-to-image and image-to-image generation. Open weights enable scientific research and innovative workflows.
by Mistral
Mistral Small 4 is a 119B-parameter Mixture-of-Experts model (128 experts, 4 active per token, 6.5B active parameters) that unifies instruct, reasoning, and coding capabilities into a single multimodal model. It accepts text and image inputs, supports function calling, structured outputs, and configurable reasoning effort (none for fast responses, high for deep step-by-step reasoning). With a 256K context window and Apache 2.0 license, it delivers 40% lower latency and 3x higher throughput compared to Mistral Small 3.
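A hypothetical sketch of toggling the reasoning-effort setting mentioned above, assuming an OpenAI-compatible server; the endpoint, the model id, and the reasoning_effort field name are all assumptions, not confirmed API details:

```python
# Hypothetical sketch: switching between fast and deep-reasoning responses.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

for effort in ("none", "high"):
    response = client.chat.completions.create(
        model="mistral-small-4",  # placeholder id; not a confirmed checkpoint name
        messages=[{"role": "user", "content": "How many primes are below 30?"}],
        # Assumed server-side knob; field name and accepted values are not confirmed.
        extra_body={"reasoning_effort": effort},
    )
    print(effort, "->", response.choices[0].message.content)
```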
Black Forest Labs FLUX.1 [dev] with LoRA adapter support. This variant enables fine-tuned generation with custom trained LoRA weights for specialized styles, characters, or concepts. Based on the full 12B parameter FLUX.1 [dev] model with all its capabilities including high-resolution generation, accurate text rendering, and detailed composition. Perfect for custom workflows and specialized image generation tasks.
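A minimal sketch of attaching LoRA weights on top of FLUX.1 [dev] with diffusers; the LoRA repo id is a placeholder for custom-trained weights:

```python
# FLUX.1 [dev] with a custom LoRA adapter via diffusers.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Attach custom-trained LoRA weights on top of the 12B base model.
pipe.load_lora_weights("your-org/your-flux-lora")  # placeholder repo id

image = pipe(
    "a watercolor fox in the trained LoRA style",
    height=1024,
    width=1024,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("fox.png")
```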
Black Forest Labs FLUX.1 [schnell] is the fastest variant of the FLUX.1 family, optimized for rapid text-to-image generation with fewer inference steps. Built on the same 12B parameter rectified flow transformer architecture as FLUX.1 [dev] but distilled for maximum speed. Generates high-quality 1024x1024 images in 1-4 steps compared to 20-50 steps for standard models. Ideal for real-time applications, interactive tools, and high-throughput image generation scenarios. Apache 2.0 licensed for unrestricted use including commercial applications.
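A minimal sketch of the few-step generation this card describes, via diffusers, using 4 steps and no classifier-free guidance:

```python
# Few-step generation with the timestep-distilled FLUX.1 [schnell].
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    "a neon-lit street market at night, detailed signage",
    height=1024,
    width=1024,
    num_inference_steps=4,  # 1-4 steps vs. 20-50 for undistilled models
    guidance_scale=0.0,     # the distilled model runs without guidance
).images[0]
image.save("market.png")
```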
by Meta
Meta Llama 3.1 405B Instruct is Meta's flagship open-weights model, with 405 billion parameters and a 128K context window. Exceptional reasoning and comprehensive knowledge for demanding applications.