Discover AI models for every task
by NousResearch
NousResearch Hermes 4 405B is the flagship hybrid-mode reasoning model, based on Meta's Llama-3.1-405B architecture. Trained on a ~60B-token corpus with explicit <think> deliberation segments, it delivers frontier-level performance in math, code, STEM, logic, and creative tasks. It achieves state-of-the-art (SOTA) results on RefusalBench for helpful, uncensored responses aligned to user values. It supports function calling, structured JSON outputs, and tool use, with extreme steerability and reduced refusal rates.
NousResearch Hermes 4 70B is a frontier hybrid-mode reasoning model based on Llama-3.1-70B, trained on ~60B tokens across ~5M samples. It features explicit <think> deliberation segments and improves substantially in math, code, STEM, logic, and creativity. It achieves state-of-the-art (SOTA) results on RefusalBench for helpful, uncensored responses while maintaining alignment to user values. It supports schema-adherent structured JSON outputs, function calling, and tool use, and is trained for extreme steerability with lower refusal rates than previous Hermes versions.
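For readers consuming output from models that emit explicit <think> deliberation segments, the reasoning usually has to be separated from the final answer before display. The following is a minimal sketch, assuming the segments are delimited as <think>...</think> and appear inline in the returned text; the closing tag, the helper name split_think, and the sample string are illustrative assumptions, not documented Hermes 4 behavior.

```python
import re

# Assumption: deliberation is wrapped in <think>...</think> inside the raw text.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> tuple[list[str], str]:
    """Return (deliberation segments, answer text with the segments removed)."""
    segments = [segment.strip() for segment in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return segments, answer


# Hypothetical raw completion, used only to exercise the helper.
raw = "<think>2 + 2 is 4, then double it.</think>The result is 8."
segments, answer = split_think(raw)
print(segments)
print(answer)
```

Text without any <think> tags passes through unchanged, with an empty list of segments.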
by Meta
Meta's flagship 405B-parameter open-source model. It offers exceptional reasoning and comprehensive knowledge for demanding applications.
by Google
Google Gemma 4 31B is a 31B-parameter dense multimodal language model with a 256K context window. It processes text, image, and video inputs and generates text output, with a configurable thinking mode for step-by-step reasoning. The model scores 85.2% on MMLU Pro, 80.0% on LiveCodeBench v6, and 88.4% on MMMLU. Available under the Apache 2.0 license.
by ZAI
GLM-4.5 is a 355B-parameter Mixture-of-Experts foundation model with 32B active parameters, designed for intelligent agents. It features a hybrid reasoning mode with configurable thinking, enabled by default. With a score of 63.2 across 12 industry benchmarks, it ranks 3rd among all proprietary and open-source models. Released under the MIT license with a 128K context window, it supports reasoning, coding, and intelligent agent functions, including OpenAI-style tool calling. It incorporates MTP (Multi-Token Prediction) layers with speculative decoding for efficient inference.
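As a rough illustration of what OpenAI-style tool calling involves on the client side, the sketch below declares one tool in the chat-completions "tools" format and dispatches a tool call to a local function. It makes no network request. The get_weather tool, the REGISTRY mapping, and the sample call are hypothetical; the exact request and response fields a given provider accepts for GLM-4.5 should be checked against that provider's documentation.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Tool declaration in the OpenAI-style format: a JSON Schema describes the arguments.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return a short weather summary for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names to the Python callables that implement them (illustrative).
REGISTRY = {"get_weather": get_weather}


def dispatch(tool_call: dict) -> dict:
    """Run the requested tool and build the 'tool' message sent back to the model."""
    function = tool_call["function"]
    arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
    result = REGISTRY[function["name"]](**arguments)
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": result}


# Hypothetical tool call, shaped like an entry of an assistant message's tool_calls.
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})},
}
reply = dispatch(example_call)
print(reply)
```

Keeping the registry separate from the schema list lets the same dispatcher serve several tools; production code would also validate the arguments and handle unknown tool names.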
GLM-4.5 Air is a compact 106B-parameter Mixture-of-Experts model with 12B active parameters, optimized for efficiency while maintaining strong performance. It scores 59.8 across 12 industry benchmarks while using fewer resources than the full GLM-4.5. It features a hybrid reasoning mode with a 128K context window and supports intelligent agent functions and tool calling. Released under the MIT license, with commercial use allowed. Suited to deployments that need to balance capability against computational cost.
GLM-4.6 is a frontier-scale 355B-parameter Mixture-of-Experts model with a 200K context window and 128K output capability. It is MIT licensed, making it the only model in its class that enterprises can self-host and deeply customize. It ranks #1 on LiveCodeBench v6 (82.8%) and HLE, and #3 on AIME 2025 (93.9%) and Terminal-Bench (40.5%). It reaches near parity with Claude Sonnet 4 (48.6% win rate) while dramatically outperforming other open-source baselines. Purpose-built for agentic workflows, real-world coding, and tool-augmented problem-solving, it supports native tool calling during inference for complex multi-step tasks.
GLM-4.7 is Z.ai's latest large language model, with enhanced reasoning capabilities. It excels at mathematical problem solving, coding, and complex logical tasks, and features improved context understanding and multilingual support.
by Mistral
Mistral Small 4 is a 119B-parameter Mixture-of-Experts model (128 experts, 4 active per token, 6.5B active parameters) that unifies instruct, reasoning, and coding capabilities into a single multimodal model. It accepts text and image inputs, supports function calling, structured outputs, and configurable reasoning effort (none for fast responses, high for deep step-by-step reasoning). With a 256K context window and Apache 2.0 license, it delivers 40% lower latency and 3x higher throughput compared to Mistral Small 3.
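Several Mixture-of-Experts entries above list both total and active parameter counts. The short calculation below shows what fraction of each model's weights is active per token, using only the figures quoted in this listing (in billions). The names MODELS and active_fraction are illustrative, and the ratio is a rough proxy for per-token compute, not a measured latency or cost figure.

```python
# (total parameters, active parameters) in billions, as quoted in the listing above.
MODELS = {
    "GLM-4.5": (355, 32),
    "GLM-4.5 Air": (106, 12),
    "Mistral Small 4": (119, 6.5),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B active ({share:.1%})")
```

On these figures, roughly 9.0% of GLM-4.5, 11.3% of GLM-4.5 Air, and 5.5% of Mistral Small 4 is active per token. The full parameter set still has to be held in memory.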