Run Llama 3.2 11B Vision (Preview) on your data

Models/Llama 3.2 11B Vision (Preview)

Meta / Llama 3.2 11B Vision (Preview)

imagetext

Price (per 1M tokens)

Input: $0.18 / Output: $0.18

A powerful multimodal model capable of processing both text and image inputs that supports multilingual, multi-turn conversations, tool use, and JSON mode.

Meta models available on Oxen.ai

		Modality		Price (1M tokens)
Model	Inference provider	Input	Output	Input	Output
Llama 3.1 405B Instruct Turbo	Together.ai	text	text	$3.50	$3.50
Llama 3.1 8B Instruct	Lambda Labs	text	text	$0.20	$0.20
Llama 3.2 11B Vision (Preview)	Groq	image	text	$0.18	$0.18
Llama 3.2 90B Vision (Preview)	Groq	image	text	$0.90	$0.90
Llama 3.3 70B Speculative Decoding	Groq	text	text	$0.59	$0.59
Llama-3 8B-instruct	Fireworks AI	text	text	$0.20	$0.20
Llama3.1 8B	Cerebras	text	text	$0.10	$0.10
meta-llama/Meta-Llama-3-8B-Instruct-Turbo	Together.ai	text	text	$0.18	$0.18

See all models available on Oxen.ai