Models

All models available on Lambda Chat.

DeepSeek R1 Distill Llama 3.3 70B (Active)
A distilled large language model based on Llama-3.3-70B-Instruct, trained using outputs from DeepSeek R1.
DeepSeek R1 671B
Performance on par with OpenAI o1, but open source and with fully open reasoning tokens. 671B parameters in total, with 37B active in an inference pass.
DeepSeek V3-0324 (Login Required)
A state-of-the-art Mixture-of-Experts model with coding performance on par with Claude 3.7 Sonnet, while remaining fast and cost-effective.
DeepSeek R1 0528 (Login Required)
An FP8-quantized model based on the DeepSeek-V3 architecture, designed for complex computations and demanding workloads. It has demonstrated outstanding performance across benchmark evaluations in mathematics, programming, and general logic, approaching leading models such as OpenAI o3 and Gemini 2.5 Pro.
Apriel 5B Instruct
The Apriel model family offers high efficiency across diverse tasks through a multi-stage training process that includes continual pretraining, supervised finetuning, and alignment techniques.
Hermes 3 Llama 3.1 405B FP8
Hermes 3 is a generalist finetune of the Llama-3.1 405B foundation model, adding advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long-context coherence, and improvements across the board.
Llama 3.1 Nemotron 70B Instruct
NVIDIA's latest Llama fine-tune, topping alignment benchmarks and optimized for instruction following.
Liquid AI 40B
LFM-40B offers a new balance between model size and output quality. Its performance is comparable to that of larger models, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
Llama 3.3 70B
An instruction-tuned, text-only model optimized for multilingual dialogue that outperforms many available open-source and closed chat models on common industry benchmarks.
Llama 4 Maverick 17B (Login Required)
Meta's most intelligent multimodal open-source model in its class. It features 17B active parameters with 128 experts (400B total), beating GPT-4o and Gemini 2.0 Flash on coding, reasoning, and image-understanding benchmarks. The first Llama model with native multimodality and a mixture-of-experts architecture.
Llama 4 Scout 17B (Login Required)
Meta's efficient multimodal model with an industry-leading 10M-token context window. It features 17B active parameters with 16 experts (109B total), optimized for long-context tasks such as multi-document analysis, extensive codebase reasoning, and personalized AI experiences.
Qwen 2.5 Coder 32B Instruct
Qwen2.5-Coder-32B-Instruct is a code-specific large language model offering state-of-the-art open-source performance in code generation, code reasoning, and code repair.
Qwen3-32B-FP8
Part of Alibaba's latest generation of hybrid reasoning models, featuring seamless switching between thinking and non-thinking modes. With 32.8B parameters and FP8 quantization, it offers strong reasoning capabilities for mathematics, coding, and logical tasks while maintaining cost efficiency. It supports 119 languages and dialects, with MCP support for advanced agent workflows.