bomonike

Let’s get to know the benchmarks AI companies use to compare each others’ versions.

US (English)   Norsk (Norwegian)   Español (Spanish)   Français (French)   Deutsch (German)   Italiano   Português   Estonian   اَلْعَرَبِيَّةُ (Egypt Arabic)   Napali   中文 (简体) Chinese (Simplified)   日本語 Japanese   한국어 Korean

Overview

AI (Artificial Intelligence) is the latest tech Gold Rush. The richest countries, richest billionaires, and largest companies in the world are all investing heavily toward a “winner take all” dominance.

AI Vendors and their LLM Brands

CountryVendorLLM brandClients
ChinaAlibabaQwen
USai21 (Allen AI)Olmo
-aion-labsaion-2.0
USAmazonNova
USAnthropicClaude
USAppleMM1, ReALM
ChinaBytedanceseed
ChinaCerebras-
FranceCohere-
-Contexual-
ChinaDeepSeekR1,R2,V3
USElevenLabs-
-FAL-
USFireworks.aiKwaiKAT-Coder
USGoogleGeminiRemy, Antigravity IDE
-IBMgranite
-Jina-
-Leonardo.ai-
USMetaLlama
USMicrosoftPhi
SingaporeMiniMaxM2, Hailuo, Speech
FranceMistralMedium, Large
-MoonshotKimi
-MorphMorph
USNVIDIANemotron
-nousresearchhermes-llama
USOpenAIGPTChatGPT, OpenClawLargest context length of 2m for highest price.
-Perplexity.aiSonarComet browser
US, Parispoolside.ailagunapoolFree on OpenRouter
-prime-intellectintellect-3-
-reka.aireka-edge/flash-
ChinaTencentHy3
-Together.ai-
USxAIGrok
USXiaomimimo</a>
ChinaZ.Ai (Zhipu)glmchat

Devstral

Clients to Access

PROTIP: Model API Availability

This is an analysis of major AI models and techniques to programmatically make API calls from your local machine.

ai-providers-flow.png/pptx

The first to market in 2023 was the OpenAI API accessing its Codex model in it own cloud service. Today, some use OpenAI’s model to evaluate code generated by other models.

The OpenAI API client can also be used to access other clouds simply by changing the API and endpoint URL such as to NVIDIA’s NIM cloud’s Nemotron models.

NVIDIA also hosts in its AI cloud services, other provider’s models, such as IBM, Meta, and others. Some are offered free, albeit for limited rates.

OpenAI’s API client can also emulate xAI’s API as if Grok models are called using xAI’s own API client. Training with conversations on Twitter make Grok the most conversational and up-to-date, as well as least sychophantic on sensitive subjects. But it’s not available free nor locally.

OpenAI’s API client can also emulate the Claude API client accessing the Anthropic AI cloud. But remember that using API cloud emulation eats token the same rate BUT adds latency from compatibility layer overhead and loses Anthropic-native capabilities such as top_k, metadata, etc. That may cause subtle behavioral differences. The very latest model may not be available.

Claude’s models are currently recognized as best for prose and coding. Clude’s cloud and client tools require a $100/month subscription, but also allow access to other models, some for free. This strategy has resulted in Anthropic making billions.

The DeepSeek model on DeepSeek’s cloud is 30 times less expensive than Claude, so someone created a Proxy service that routes Claude API calls to DeepSeek’s cloud service. Yes, that is a security concern so I recommend blocking it. BTW, AFAIK, none of these services provide for two-way client certificates to ensure that services are who they say they are.

DeepSeek and other models from China, such as Alibaba’s Qwen, have been accused of being based on data distilled from Claude. So it’s smaller. There is still doubt about whether one can trust model providers with proprietary data. So for privacy, organizations created a shortage of Mac Minis to run behind an in-house firewall.

Google’s models were trained from all the books it has been scanning for decades, along with YouTube and searches. Google has one of the first APIs to their top-ranked LLMs. Google also open-sourced the Gemini models in its cloud as Gemma4 models for being pulled inside the firewall to as local models run, at no cost, via the Ollama service.

You can individually go to the websites of DeepSeek, Qwen, Kimi, Mistral, and others to download models to run in privacy offline, and program API calls to each, separately.

OpenAI’s API can simplify access to the OpenRouter.ai gateway service enables a single API (and chat) interface to use LLMs from 60+ authors, including free models and even AWS and its vast cloud Bedrock SageMaker ecosystem.

OpenRouter provides pass-through billing to abstract away the complexity of managing separate accounts, authentication, and billing to 370 models from 60+ providers. It can automatically route requests to the fastest or cheapest provider, including 30 free models. It provides a common security, observability, and tracing interface, which allows for easy A/B testing and comparison between different models. For that, it takes a 5.5% fee when you buy credits. However, this may limit the speed of access and impose usage limits to free LLMs.

OpenAI was hosted exclusively in Microsoft’s Azure cloud until April 2026 when it also appeared among models Amazon makes available on its AWS cloud.

There are now several LLM routers:


Capabilities

Anthropic’s Claude

I have an entire section to Anthropic and its Claude technologies.

OpenRouter models

This GitHub Issue lists 65 free models and 254 paid models available on OpenRouter.ai.

The list is available as a JSON file and webpage.

PROTIP: I generated a Python program to create a CSV or JSON file at openrouter-models.py - last run on 2026-05-07 found 370 models among 60 providers.

Column Description

PROVIDER                            MODELS
==================================================
ai21                                     1
aion-labs                                4
alfredpros                               1
alibaba                                  1
allenai                                  2
alpindale                                1
amazon                                   5
anthracite-org                           1
anthropic                               14
arcee-ai                                 7
baidu                                    7
bytedance                                1
bytedance-seed                           4
cognitivecomputations                    1
cohere                                   4
deepcogito                               1
deepseek                                13
essentialai                              1
google                                  26
gryphe                                   1
ibm-granite                              2
inception                                1
inclusionai                              2
inflection                               2
kwaipilot                                1
liquid                                   3
mancer                                   1
meta-llama                              14
microsoft                                3
minimax                                  8
mistralai                               25
moonshotai                               5
morph                                    2
nex-agi                                  1
nousresearch                             6
nvidia                                  11
openai                                  65
openrouter                               5
perplexity                               5
poolside                                 2
prime-intellect                          1
qwen                                    51
rekaai                                   2
relace                                   2
sao10k                                   5
stepfun                                  1
switchpoint                              1
tencent                                  2
thedrummer                               4
tngtech                                  1
undi95                                   1
upstage                                  1
writer                                   1
x-ai                                    11
xiaomi                                   5
z-ai                                    13
~anthropic                               3
~google                                  2
~moonshotai                              1
~openai                                  2

”~” in front of model names (such as “~google/gemini-flash-latest”) ???

orq.ai models & providers

Count of models by provider (alphabetically):

==================================================
PROVIDER                            MODELS
==================================================
alibaba                                 41
anthropic                               15
aws                                     36
azure                                   25
bytedance                                4
cerebras                                 7
cohere                                  22
contextualai                             3
deepseek                                 2
elevenlabs                               5
fal                                      4
google                                  38
google-ai                               22
groq                                    16
jina                                    12
leonardoai                               4
minimax                                  7
mistral                                 35
moonshotai                               6
openai                                  67
orq                                      1
perplexity                               4
togetherai                               7
xai                                     19
zai                                     11
==================================================

The .csv file adds location and model description.

================================================================================================
PROVIDER        TYPE         MODEL ID                                     IN $/M     OUT $/M   FREE
================================================================================================
openai          chat         o1-pro                                      $150.00     $600.00       
openai          chat         gpt-5.4-pro                                  $30.00     $180.00       
openai          chat         gpt-5.2-pro                                  $21.00     $168.00       
openai          chat         gpt-5-pro                                    $15.00     $120.00       
leonardoai      image        leonardo-diffusion-xl                        $80.00      $80.00       
leonardoai      image        leonardo-kino-xl                             $80.00      $80.00       
leonardoai      image        leonardo-lightning-xl                        $80.00      $80.00       
leonardoai      image        leonardo-vision-xl                           $80.00      $80.00       
openai          chat         o3-pro                                       $20.00      $80.00       
anthropic       chat         claude-opus-4-0                              $15.00      $75.00       
anthropic       chat         claude-opus-4-1                              $15.00      $75.00       
anthropic       chat         claude-opus-4-20250514                       $15.00      $75.00       
anthropic       chat         claude-opus-4.1-20250805                     $15.00      $75.00       
aws             chat         anthropic/claude-opus-4.1 (US)               $15.00      $75.00       
google          chat         anthropic/claude-opus-4-1@20250805           $15.00      $75.00       
google          chat         anthropic/claude-opus-4@20250514             $15.00      $75.00       
azure           chat         o1                                           $15.00      $60.00   
...

Poolside

poolside.ai Founded in 2023

Eiso Kent, CTO

Jason Warner, CEO

XAI’s Grok models

Elon Musk’s XAI Grok series of LLMs do not have free usage.

It can be accessed using OpenAI’s API.

ppm = price_per_million prices, as of 2026-05-07:

model_id context
tokens
features input_ppm output_ppm
grok-4.20-0309-reasoning 2m reasoning, tool calls, structured output $1.25 $2.50
grok-4.20-0309-non-reasoning 2m tool calls, structured output $1.25 $2.50
grok-4.20-multi-agent-0309 2m multi-agent collaboration $2.00 $6.00
grok-4-1-fast-reasoning 2m reasoning, vision, tool calls, structured output $0.20 $0.50
grok-4-1-fast-non-reasoning 2m vision, tool calls, structured output $0.20 $0.50
grok-code-fast-1 256k code optimization, reasoning, tool calls $0.20 $1.50
grok-4-0709 256k reasoning, tool calls, structured output $3.00 $15.00
grok-3 131k tool calls, structured output $3.00 $15.00
grok-3-mini 131k reasoning, tool calls, structured output $0.30 $0.50
grok-4.3 1 million general purpose $1.25 $2.50
grok-2-image-1212 image generation
grok-2-vision-1212 8k image understanding $5.00 $15.00
grok-2-vision-latest 32k image understanding
model_id context_tokens features price
grok-imagine-image image generation $0.02 / image
grok-imagine-image-pro high-quality image generation $0.07 / image
grok-imagine-video video generation $0.05 / second

PROTIP: I generated a Python program to create a CSV or JSON file at deepseek-rates.py

References

https://www.cybergym.io/ found that OpenAI’s GPT5.5 found 82% vs 83% by Mythos. Evaluating AI Agents’ Real-World Cybersecurity Capabilities at Scale

https://www.armorcode.com/blog/the-mythos-moment-is-real-the-fix-it-faster-response-is-not The Mythos Moment is Real. The Fix-It-Faster Response isn’t.

https://www.youtube.com/watch?v=T7bQ86m5AEk Warp open source office hours: why open source?


26-05-07 v040 poolside @ai-providers.md created 2024-12-28