25 languages, one prompt: how config-driven extraction beats per-language code
Most multilingual chatbots ship a separate codepath per language. Typelessity ships one prompt template, one config, and 25+ languages work the same day. Here is the architecture and the 80 strings of localization that remain.
Typelessity supports 25+ languages with one English-language prompt template, one per-industry config, and approximately 80 hand-translated UI strings per locale. There is no per-language codepath, no language-detection branch, and no regex grammar. GPT-4.1-nano handles translation, normalization, and extraction in a single round-trip. Adding a new language is a translation of UI strings; the extraction works the day the language is added.
When engineers hear that Typelessity supports 25+ languages, the next question is always the same: "How big is your i18n team?" Zero. There is no i18n team. There is one prompt template and per-industry configs. Languages are an emergent property of GPT-4.1-nano, not a feature built per market.
What is the trap of per-language code?
Most teams' first instinct is to detect the user's language, route to a per-language handler, and maintain a regex, grammar, or intent set per locale. This works at small scale and collapses at large scale. Every new language is a new codebase to maintain. Every new field in the schema is N coordinated changes, one per language. Because each added language also raises the cost of every future schema change, the 26th language costs more than the 25th did.
Typelessity made the opposite bet. The prompt is in English. The user input is in any language. The output is structured JSON in a stable schema. The model handles translation, normalization, and extraction in one shot.
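A minimal sketch of that one-shot contract, under stated assumptions: the prompt wording, the `{ fields, _meta }` response shape, and the helper names `buildPrompt` and `parseExtraction` are hypothetical, since the article does not show the real template. The model call itself is left out; only prompt assembly and defensive parsing of the reply are shown.

```typescript
// Hypothetical response shape; the real Typelessity schema is not shown in the article.
interface ExtractionResult {
  fields: Record<string, string | null>;
  _meta: { lang: string };
}

// One English template for every language. The wording is an assumption.
const PROMPT_TEMPLATE = [
  "You extract booking details from a user message written in any language.",
  'Return only JSON shaped as {"fields": {...}, "_meta": {"lang": "<language code>"}}.',
  "Normalize dates to ISO 8601 and use null for anything the user did not mention.",
  "Fields to extract: {{FIELDS}}",
].join("\n");

function buildPrompt(fieldNames: string[]): string {
  // split/join avoids the special "$" patterns of String.prototype.replace.
  return PROMPT_TEMPLATE.split("{{FIELDS}}").join(fieldNames.join(", "));
}

// Parse the raw model reply and coerce it into the stable schema.
// Unknown fields are dropped; missing or empty fields become null.
function parseExtraction(raw: string, fieldNames: string[]): ExtractionResult {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Model output is not a JSON object");
  }
  const obj = parsed as {
    fields?: Record<string, unknown>;
    _meta?: { lang?: unknown };
  };
  const fields: Record<string, string | null> = {};
  for (const name of fieldNames) {
    const value = obj.fields?.[name];
    fields[name] = typeof value === "string" && value.length > 0 ? value : null;
  }
  // "und" (undetermined) is used when the model did not report a language.
  const lang = typeof obj._meta?.lang === "string" ? obj._meta.lang : "und";
  return { fields, _meta: { lang } };
}
```

Nothing in this sketch inspects the language of the user message: the same `buildPrompt` output would be sent whether the input is German, Japanese, or English, and `parseExtraction` only checks the shape of what comes back.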
Bottom line: lean harder on the model than feels comfortable. The places where you still need code are smaller than you think.
What is actually in the config?
A config is a JSON file per client and industry. It defines:
- Required fields: what must be extracted (e.g. specialty, date, time_window).
- Optional fields: preferences (doctor_gender, language_of_consultation).
- Enrichment APIs: endpoints to call when a field is set (e.g. specialty=dentistry → GET /doctors?specialty=dentistry).
- Cascade rules: when field A changes, which downstream fields to clear. See /blog/cascade-corrections.
- Confirmation copy: per-locale strings for the review step. This is per-language, but it is just labels.
A typical medical-clinic config is around 120 lines. Adding a new vertical takes a day, not a sprint.
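As an illustration of the five config sections listed above, here is a heavily trimmed sketch. The key names (`requiredFields`, `cascadeRules`, and so on), the `doctor` field, the specific cascade rules, and the helper functions are all assumptions made for the example; only the field names and the `/doctors?specialty=...` endpoint come from the text.

```typescript
type Fields = Record<string, string | null>;

interface EnrichmentApi {
  whenField: string;    // field whose value triggers the call
  method: "GET";
  pathTemplate: string; // "{value}" is replaced by the field value
}

// Hypothetical config shape mirroring the five sections in the list.
interface IndustryConfig {
  requiredFields: string[];
  optionalFields: string[];
  enrichmentApis: EnrichmentApi[];
  cascadeRules: Record<string, string[]>; // changed field -> fields to clear
  confirmationCopy: Record<string, Record<string, string>>; // locale -> key -> text
}

const doctorsBySpecialty: EnrichmentApi = {
  whenField: "specialty",
  method: "GET",
  pathTemplate: "/doctors?specialty={value}",
};

const clinicConfig: IndustryConfig = {
  requiredFields: ["specialty", "date", "time_window"],
  optionalFields: ["doctor_gender", "language_of_consultation"],
  enrichmentApis: [doctorsBySpecialty],
  // Assumed rules: a new specialty invalidates the chosen doctor,
  // and a new date invalidates the chosen time window.
  cascadeRules: { specialty: ["doctor"], date: ["time_window"] },
  confirmationCopy: {
    en: { title: "Please review your booking" },
    de: { title: "Bitte überprüfen Sie Ihre Buchung" },
  },
};

// Required fields that are still unset.
function missingRequired(config: IndustryConfig, fields: Fields): string[] {
  return config.requiredFields.filter((name) => (fields[name] ?? null) === null);
}

// Set one field and clear its direct dependents (one level only in this sketch).
function applyCascade(
  config: IndustryConfig,
  fields: Fields,
  changedField: string,
  newValue: string,
): Fields {
  const next: Fields = { ...fields, [changedField]: newValue };
  for (const dependent of config.cascadeRules[changedField] ?? []) {
    next[dependent] = null;
  }
  return next;
}

// Build the request path for an enrichment call.
function enrichmentPath(api: EnrichmentApi, value: string): string {
  return api.pathTemplate.replace("{value}", encodeURIComponent(value));
}
```

Because the rules live in data rather than in handlers, a new vertical in this sketch would mean writing a new `IndustryConfig` object, not new control flow.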
What is the language-detection step you do not write?
There is no explicit language-detection step. The model sees the user message, the prompt instructs it to respond in the same language, and it does. The detected language is logged in _meta.lang for analytics — but the orchestration layer does not branch on it.
The one place language matters is the review screen, where extracted fields are shown back to the user. Field labels (Date, Doctor, Time) need translation, and Typelessity keeps those in standard messages.{locale}.json files. Approximately 80 short strings per locale. That is the entire translation effort.
Bottom line: the model is the language layer. Code is the labels layer.
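The labels layer can be as small as a lookup with a fallback chain. The sketch below assumes hypothetical message keys (`field.date` and so on) and inlines two catalogs that would normally be loaded from the messages.{locale}.json files; the fallback order (exact locale, base language, English, then the key itself) is an illustrative choice, not something the article specifies.

```typescript
type Messages = Record<string, string>;

// Stand-ins for messages.en.json and messages.de.json; contents are hypothetical.
const catalogs: Record<string, Messages> = {
  en: { "field.date": "Date", "field.doctor": "Doctor", "field.time": "Time" },
  de: { "field.date": "Datum", "field.doctor": "Arzt" }, // "field.time" left untranslated on purpose
};

// Resolve a UI label: exact locale, then base language, then English, then the key.
function label(locale: string, key: string): string {
  const base = locale.split("-")[0] ?? locale;
  for (const candidate of [locale, base, "en"]) {
    const text = catalogs[candidate]?.[key];
    if (text !== undefined) return text;
  }
  return key;
}
```

For example, `label("de-AT", "field.date")` resolves through the `de` catalog, while `label("de", "field.time")` falls back to the English string because the German catalog in this sketch lacks that key.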
How much does latency vary across languages?
Token counts vary across languages: Japanese takes roughly 2x as many tokens as English for the same meaning, largely because of how the tokenizer splits Japanese script. User-perceived latency, however, is dominated by model time-to-first-token, which is largely independent of source language. The Typelessity 1-second p95 budget holds across supported locales.
The per-language p50/p95 distribution is tracked in production telemetry; high-friction or low-resource languages occasionally need prompt tuning, but no language has required a separate codepath. The full latency budget is detailed in /blog/latency-budgets.
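One way to compute a per-language p50/p95 view from raw samples is sketched below. The `LatencySample` shape, the nearest-rank percentile method, and the function names are assumptions for illustration; the article does not describe how the production telemetry is actually aggregated.

```typescript
interface LatencySample {
  lang: string; // e.g. the value logged in _meta.lang
  ms: number;   // end-to-end latency in milliseconds
}

interface LatencySummary {
  p50: number;
  p95: number;
  withinBudget: boolean; // true when p95 <= budgetMs
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  return sortedAsc[index] as number;
}

// Group samples by language and summarize each group against the budget.
function latencyByLanguage(
  samples: LatencySample[],
  budgetMs = 1000,
): Map<string, LatencySummary> {
  const grouped = new Map<string, number[]>();
  for (const sample of samples) {
    const bucket = grouped.get(sample.lang) ?? [];
    bucket.push(sample.ms);
    grouped.set(sample.lang, bucket);
  }
  const summary = new Map<string, LatencySummary>();
  for (const [lang, values] of grouped) {
    const sorted = [...values].sort((a, b) => a - b);
    const p95 = percentile(sorted, 95);
    summary.set(lang, { p50: percentile(sorted, 50), p95, withinBudget: p95 <= budgetMs });
  }
  return summary;
}
```

A language whose summary comes back with `withinBudget: false` would be a candidate for the prompt tuning mentioned above; the grouping key is only used for reporting, not for routing.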
What still needs language-specific code?
The list is short and stable:
- Phone number formats: libphonenumber per region. Phone numbers are not a language problem; they are a regional formatting problem.
- Date parsing edge cases: chrono-node for locales where the model occasionally hallucinates a date. The model handles 95% of cases; chrono-node covers the long tail.
- Currency display: Intl.NumberFormat. Same logic as phone numbers: regional, not linguistic.
- Honorifics: Japanese keigo, Korean speech levels. The prompt asks the model to mirror the user's register. Works most of the time; the residual is acceptable for booking surfaces.
Everything else is the model.
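Two of the items in the list above are small enough to sketch with built-ins only. `formatPrice` wraps the standard `Intl.NumberFormat` API. `isRealIsoDate` is a hypothetical guard, not something the article describes: it assumes the model returns dates as `YYYY-MM-DD` and flags impossible ones, which is the point where a system like this might hand the original text to chrono-node (not called here).

```typescript
// Regional currency display using the built-in Intl API.
function formatPrice(amount: number, currency: string, locale: string): string {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

// True only for strings of the form YYYY-MM-DD that name a real calendar day.
// Date.UTC silently rolls over out-of-range values (Feb 30 becomes Mar 2),
// so the components are compared after a round trip through Date.
function isRealIsoDate(value: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (match === null) return false;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}
```

For instance, `formatPrice(1234.5, "USD", "en-US")` yields `$1,234.50`, and `isRealIsoDate("2025-02-30")` is false, so that value would be rejected rather than shown on the review screen.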
When per-language code is the right choice
Typelessity's bet pays off when:
- The output schema is stable across languages (booking fields are universal: who, what, when, where).
- The extraction surface is bounded (intake forms, not open-ended dialogue).
- Latency and cost can absorb a slightly larger model than a regex.
If the use case is open-ended advisory dialogue with deep cultural nuance — therapy, customer support across dramatically different markets, multilingual education — then per-language tuning, glossaries, and human review still beat the single-prompt approach. For structured booking, single-prompt extraction wins.
FAQ
How does Typelessity support 25+ languages without a per-language codepath?
One English-language prompt template plus per-industry config. The model handles translation, normalization, and extraction in one round-trip and returns structured JSON.
What is actually translated by hand?
Approximately 80 short UI strings per locale — labels, confirmation copy, error messages. Stored in standard messages.{locale}.json files.
Does Typelessity detect the user's language?
No explicit detection step. The model responds in the user's language; the detected language is logged in _meta.lang for analytics but does not branch the orchestration layer.
How much does latency vary across languages?
Token counts vary, but user-perceived latency stays inside the 1-second p95 budget across supported locales. No language requires a separate codepath.
What still needs language-specific code in Typelessity?
Phone formatting (libphonenumber), date parsing edge cases (chrono-node), currency display (Intl.NumberFormat), and honorific handling via prompt instruction.
For the single-call extraction architecture, see Why we replaced the booking form with a single GPT call. For voice input across the same 25+ languages, see Whisper vs Web Speech. For the latency budget that constrains all of it, see Latency budgets.
— Alex Isa, founder of Typelessity. Also founder of Webappski and TypelessForm.