Meta / Llama 3.2 11B Vision (Preview)
Modality: image input, text output
Price (per 1M tokens): $0.18 input / $0.18 output

A multimodal model that processes both text and image inputs and supports multilingual, multi-turn conversations, tool use, and JSON mode.
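
As a rough illustration of what the listed rate of $0.18 per 1M tokens (input and output alike) means for a single request, here is a minimal sketch. The function name `estimate_cost` and the token counts are hypothetical, and the page does not say how image inputs are converted to tokens, so the counts are simply taken as given.

```python
from decimal import Decimal

# Listed rates for this model, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.18")
OUTPUT_PRICE_PER_M = Decimal("0.18")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes billing is strictly linear in token count, with no minimum
    charge or rounding; the actual billing rules may differ.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# Example: a request with 12,000 input tokens and 800 output tokens.
cost = estimate_cost(12_000, 800)
print(f"${cost:.6f}")  # prints "$0.002304"
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift when summed over many requests.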

Meta models available on Oxen.ai
Inference provider   Modality (input / output)   Price per 1M tokens (input / output)
Baseten              text / text                 N/A / N/A
Together.ai          text / text                 $3.50 / $3.50
Lambda Labs          text / text                 $0.20 / $0.20
Groq                 image / text                $0.18 / $0.18
Lambda Labs          text / text                 $0.02 / $0.02
Groq                 image / text                $0.90 / $0.90
Groq                 text / text                 $0.59 / $0.59
Fireworks AI         text / text                 $0.20 / $0.20
Cerebras             text / text                 $0.10 / $0.10
Baseten              text / text                 N/A / N/A
Together.ai          text / text                 $0.18 / $0.18