Cost to run

All MiniLM L6 v2

sentence-transformers/all-minilm-l6-v2

Family: MiniLM
Context: 256 tokens

All MiniLM L6 v2 can be run self-hosted (rent a GPU and run an inference server such as vLLM or TGI) or through a serverless API (pay per token). Live pricing comparisons:

Need to benchmark this model against another? Try the calculator or see where it ranks on the InferenceScore leaderboard.
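
For a rough sense of how the self-hosted and pay-per-token options compare, the arithmetic can be sketched in a few lines of Python. Everything here is illustrative: the function names, the GPU hourly rate, the throughput, the utilization, and the API price are placeholder assumptions, not figures from this page or from any provider.

```python
def self_hosted_cost_per_million(gpu_usd_per_hour: float,
                                 tokens_per_second: float,
                                 utilization: float = 1.0) -> float:
    """Effective USD per one million tokens on a GPU rented by the hour.

    utilization is the fraction of each rented hour spent doing useful work;
    idle time is still billed, so lower utilization raises the effective cost.
    """
    if tokens_per_second <= 0 or not 0 < utilization <= 1:
        raise ValueError("throughput must be positive and utilization in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000


def break_even_tokens_per_hour(gpu_usd_per_hour: float,
                               api_usd_per_million: float) -> float:
    """Hourly token volume at which the rented GPU costs the same as the API."""
    if api_usd_per_million <= 0:
        raise ValueError("API price must be positive")
    return gpu_usd_per_hour / api_usd_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs; substitute real quotes and measured throughput.
    gpu_rate = 0.50      # USD per GPU-hour (assumed)
    throughput = 20_000  # tokens per second (assumed)
    api_price = 0.02     # USD per million tokens (assumed)

    cost = self_hosted_cost_per_million(gpu_rate, throughput, utilization=0.5)
    volume = break_even_tokens_per_hour(gpu_rate, api_price)
    print(f"Self-hosted: ${cost:.4f} per million tokens at 50% utilization")
    print(f"Break-even volume: {volume:,.0f} tokens per hour")
```

Below the break-even volume, the pay-per-token option costs less under these assumptions. Above it, the rented GPU does, provided it can actually sustain that throughput.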