SUTRA is a family of ultrafast, multilingual, online generative AI models that operate in 50+ languages with conversation, search, and visual capabilities.

Multilingual

Supports ultrafast text generation and instruction following in multiple languages.

Cost-efficient

Energy- and cost-efficient, and easier to maintain and extend with new language capabilities.

Ultrafast

A lightweight mixture-of-experts (MoE) architecture and a purpose-built tokenizer enable ultra-low latency.

Online

Internet-connected, hallucination-free models that provide factual responses in a conversational tone.

SUTRA models are fine-tuned and aligned with proprietary conversational datasets, ensuring coherent and consistent dialogs.

50+ languages with accuracy & efficiency

SUTRA surpasses leading models by 20-30% on the MMLU benchmark in comprehending and generating responses across numerous languages.

SUTRA models are fine-tuned and aligned on 150M+ conversational data points, ensuring coherent and consistent dialogs.

MMLU scores by language (higher is better)

HINDI (हिंदी): GPT 3.5 39, LLAMA 3 64, SUTRA 65

KOREAN (한국어): HyperClova 54, GPT 3.5 51, SUTRA 67

GUJARATI (ગુજરાતી): GPT 4 61, GPT 3.5 35, SUTRA 67

JAPANESE (日本語): LLAMA 3 70, SAKANA 62, SUTRA 74

ARABIC (العربية): LLAMA 3 60, GPT 3.5 49, SUTRA 66

Cost-efficient tokenization for non-English languages

SUTRA's purpose-built tokenizer is 5-8x more cost-efficient for non-English languages.
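As a rough illustration of why tokenization drives cost, the sketch below counts tokens for the same question in English and Hindi using an open, English-centric tokenizer (tiktoken's cl100k_base). This only demonstrates the general effect; the 5-8x figure above is SUTRA's own claim and is not reproduced here.

```python
# Minimal sketch: count tokens for the same question in English and Hindi
# with an English-centric BPE tokenizer (tiktoken's cl100k_base). Non-Latin
# scripts typically split into several times more tokens, which multiplies
# per-token API costs; a script-aware tokenizer narrows that gap.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "What is the weather like in Mumbai today?",
    "Hindi": "आज मुंबई में मौसम कैसा है?",  # the same question in Hindi
}

for language, text in samples.items():
    tokens = enc.encode(text)
    print(f"{language}: {len(tokens)} tokens for {len(text)} characters")

# With per-token pricing, a prompt that tokenizes into 3-4x more tokens costs
# 3-4x more to process and leaves less room in the context window.
```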

Up-to-date & hallucination-free

SUTRA-Online models are internet-connected and hallucination-free: they understand queries, browse the web, and summarize information to provide current answers.


They can answer queries like “Who won the game last night?” or “What’s the current stock price?” accurately, whereas offline models are limited by their knowledge cut-off dates.

Models

SUTRA-Pro

SUTRA-Pro is adept at executing instructions in 50+ languages for conversational use cases and complex tasks. Developed with a rich diversity of instructions from both proprietary and leading open-access datasets, it excels across a spectrum of multilingual benchmarks, offering unparalleled proficiency in over 50 languages spanning the Latin, Indic, and Far Eastern language families.

Parameters: 150b
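A minimal sketch of sending SUTRA-Pro a Hindi instruction is shown below, assuming an OpenAI-compatible chat-completions interface; the base URL and the "sutra-pro" model identifier are placeholders, so consult the official SUTRA API reference for the actual values.

```python
# Hypothetical sketch: send SUTRA-Pro a Hindi instruction through an
# OpenAI-compatible chat-completions API. The base URL and model name are
# placeholders, not confirmed identifiers.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-sutra.ai/v1",  # placeholder endpoint
    api_key=os.environ["SUTRA_API_KEY"],
)

response = client.chat.completions.create(
    model="sutra-pro",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        # Hindi: "Explain the difference between machine learning and deep learning in simple words."
        {"role": "user", "content": "मशीन लर्निंग और डीप लर्निंग के बीच का अंतर सरल शब्दों में समझाइए।"},
    ],
)

print(response.choices[0].message.content)
```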

SUTRA-Light

SUTRA-Light is designed for conversation, summarization, and other tasks in 50+ languages, making it an ideal choice for multilingual conversations at scale by balancing performance, cost, and speed. SUTRA-Light is highly efficient and optimized for ultra-low-latency applications.

Parameters: 56b (13b active)

Time-to-first-token (TTFT): 100-200ms
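To make the latency figure concrete, the sketch below streams a SUTRA-Light response and measures time-to-first-token on the client, again assuming an OpenAI-compatible endpoint; the base URL and the "sutra-light" model identifier are placeholders.

```python
# Hypothetical sketch: stream a SUTRA-Light response and measure
# time-to-first-token (TTFT) on the client. Endpoint and model name are
# placeholders; network latency adds to the server-side TTFT.
import os
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-sutra.ai/v1",  # placeholder endpoint
    api_key=os.environ["SUTRA_API_KEY"],
)

start = time.perf_counter()
first_token_time = None

stream = client.chat.completions.create(
    model="sutra-light",  # placeholder model name
    # Hindi: "Write three sentences about the capital of India."
    messages=[{"role": "user", "content": "भारत की राजधानी के बारे में तीन वाक्य लिखिए।"}],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content or ""
    if delta and first_token_time is None:
        first_token_time = time.perf_counter()
        print(f"TTFT: {(first_token_time - start) * 1000:.0f} ms")
    print(delta, end="", flush=True)
print()
```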

SUTRA-Online

SUTRA-Online can use knowledge from the internet and thus leverage the most up-to-date information when forming responses. By understanding queries, browsing the web, and summarizing information, SUTRA-Online can accurately respond to time-sensitive queries, unlocking knowledge beyond its training corpus. This means that SUTRA models can respond to queries like “Who won yesterday’s India vs England cricket match?” accurately, whereas offline models suffer from knowledge cut-off dates.

Parameters: 56b (13b active)

Knowledge cutoff: Up-to-date
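A minimal sketch of a time-sensitive query against SUTRA-Online follows, assuming the same OpenAI-compatible interface; the endpoint and the "sutra-online" model identifier are placeholders.

```python
# Hypothetical sketch: ask SUTRA-Online a time-sensitive question that an
# offline model could not answer past its knowledge cut-off. Endpoint and
# model name are placeholders.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-sutra.ai/v1",  # placeholder endpoint
    api_key=os.environ["SUTRA_API_KEY"],
)

response = client.chat.completions.create(
    model="sutra-online",  # placeholder model name
    messages=[
        {"role": "user", "content": "Who won yesterday's India vs England cricket match?"}
    ],
)

print(response.choices[0].message.content)
```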



Detailed SUTRA model spec and sample code are available.

SUTRA API available now

SUTRA models are the technology backbone of TWO’s products and services and are available as Model as a Service (MaaS) to other apps and services via usage-based pricing and simple-to-integrate APIs.