SnackOnAI Engineering | Senior AI Systems Researcher | Technical Deep Dive | June 11, 2026
Every prompt, every intermediate reasoning step, and every generated output must be linguistically correct in the target language, produced dynamically at inference time, with no pre-translated string to fall back on. The solution is a two-track system: a shared rubric framework that captures language rules as a single source of truth, plus adaptive model strategies that apply those rules differently depending on whether the model is large and instruction-following or small and cost-constrained.
Traditional software internationalization is a text replacement problem. You extract strings, send them to translators, import the translated strings, and ship. The strings are static. The translation is a one-time operation. When the strings change, you re-translate.
AI agents break every assumption in this model. LinkedIn's Hiring Assistant does not display strings. It generates text dynamically: candidate summaries, outreach messages, job post drafts, interview scheduling confirmations. Each output is unique. There is no string catalog to translate. And the model generating the text must produce correct French or German output not after the fact, but during the forward pass.
The LinkedIn Engineering blog documenting the French and German expansion is the most detailed public account of what "AI localization at production scale" actually requires. The Hiring Assistant saves recruiters an average of 1.5 hours per role in English. Delivering the same productivity to French and German markets required solving five distinct linguistic problems that do not exist in English.
Scope: LinkedIn's rubric framework design, the two-track adaptive model strategy (instruction-following prompt transformation versus per-language LoRA adapters), the five linguistic challenges and why they cannot be solved by post-generation translation, and the annotation bottleneck that drove the architectural decisions. Not covered: the underlying Plan-and-Execute agent architecture beyond its role as the deployment context, or specific model identities.
What It Actually Does
The Hiring Assistant is a multi-agent system with a Plan-and-Execute architecture. A planner translates recruiter intent into a structured task graph. Specialized sub-agents execute each step: drafting job posts, running candidate queries, generating outreach messages, scheduling interviews. Each sub-agent produces text. In English, this works. In French and German, each sub-agent's output must independently satisfy linguistic rules that English outputs never had to satisfy.
The five problems that don't exist in English:
| Challenge | What it requires |
|---|---|
| Grammatical gender | Nouns, adjectives, and titles must agree in gender ("experienced engineer" has two different French forms) |
| Formal vs. informal address | vous vs. tu in French, Sie vs. du in German carry professional register implications |
| Noun capitalization | German capitalizes ALL nouns; getting this wrong reads as unprofessional |
| Date and number formats | Different decimal separators, date orders, currency formatting |
| Brand and style register | Professional tone conventions differ by market; UK professional norms ≠ German professional norms |
These five challenges compound across every sub-agent. A system with 10 sub-agents supporting 3 languages has potentially 30 independent prompt surfaces, each of which needs language-aware handling.
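To make the date and number row concrete, here is a minimal sketch of per-language formatting. The `FORMATS` table and both helper functions are hypothetical names introduced for illustration. The conventions shown are the commonly used ones for US English, French, and German, not values taken from LinkedIn's rubric.

```python
from datetime import date

# Hypothetical format table: common conventions per language, for illustration only.
FORMATS = {
    "en": {"date": "%m/%d/%Y", "decimal": ".", "thousands": ","},        # US English
    "fr": {"date": "%d/%m/%Y", "decimal": ",", "thousands": "\u202f"},   # narrow no-break space
    "de": {"date": "%d.%m.%Y", "decimal": ",", "thousands": "."},
}


def format_date(d: date, lang: str) -> str:
    """Render a date in the day/month order and separator of the target language."""
    return d.strftime(FORMATS[lang]["date"])


def format_number(value: float, lang: str) -> str:
    """Render a number with two decimals using the target language's separators."""
    fmt = FORMATS[lang]
    text = f"{value:,.2f}"  # English-style separators first, e.g. 1,234,567.50
    # Swap separators through a placeholder so the two replacements do not collide.
    return (
        text.replace(",", "\x00")
        .replace(".", fmt["decimal"])
        .replace("\x00", fmt["thousands"])
    )


start = date(2026, 6, 11)
for lang in ("en", "fr", "de"):
    print(lang, format_date(start, lang), format_number(1234567.5, lang))
```

The same start date and salary figure come out as three different strings, which is why a sub-agent cannot simply copy formatted values from the English output.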
The Architecture, Unpacked

The rubric framework is the key complexity reduction: O(n) instead of O(n×m) linguist reviews, where n is the number of languages and m is the number of sub-agents. Without it, localizing each sub-agent for each language requires a separate linguist review per combination. The rubric captures rules once per language, and the prompt transformation pipeline applies them automatically across every sub-agent. This is what makes the system scalable to additional languages.
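The scaling argument can be checked with a few lines of arithmetic. This is a sketch under a simplifying assumption: one linguist review per prompt surface without a rubric, and one review per language rubric with it. The function name `linguist_reviews` is made up for illustration.

```python
def linguist_reviews(languages: int, sub_agents: int, shared_rubric: bool) -> int:
    """Count review units per product iteration under the simplified model above."""
    if shared_rubric:
        return languages               # one rubric per language, reused by every sub-agent
    return languages * sub_agents      # every (language, sub-agent) prompt reviewed separately


for languages, sub_agents in [(2, 10), (5, 10), (5, 20)]:
    without = linguist_reviews(languages, sub_agents, shared_rubric=False)
    with_rubric = linguist_reviews(languages, sub_agents, shared_rubric=True)
    print(f"{languages} languages x {sub_agents} sub-agents: {without} reviews vs {with_rubric}")
```

Adding a sub-agent changes only the first number. Adding a language changes both, but the rubric column grows by one rather than by the number of sub-agents.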
The Code, Annotated
Snippet One: Rubric-Driven Prompt Transformation (Track A)
# LinkedIn-style rubric framework: prompt transformation pipeline
# Reconstructed from LinkedIn Engineering blog description
# The design intent: capture language rules ONCE, apply across all sub-agents
from dataclasses import dataclass
from enum import Enum


class Language(Enum):
    EN = "en"
    FR = "fr"
    DE = "de"


@dataclass
class RubricRule:
    """
    A linguist-reviewed language rule, stored as the single source of truth.
    Designed to be inspectable and auditable by linguists, not just engineers.
    """
    dimension: str          # "tone", "gender", "orthography", etc.
    language: Language
    rule_text: str          # human-readable rule for linguist review
    conditional_block: str  # machine-applicable prompt injection


# ── RUBRIC FRAMEWORK: stored once, per language ───────────────────────────────
RUBRIC = {
    Language.FR: [
        RubricRule(
            dimension="tone",
            language=Language.FR,
            rule_text="Use formal address (vous) for all professional communications",
            conditional_block="Vous devez toujours utiliser le vouvoiement (vous) "
                              "dans un contexte professionnel.",
        ),
        RubricRule(
            dimension="gender",
            language=Language.FR,
            rule_text="Adjust noun/adjective gender to match the candidate's context",
            # ← THIS is the trick: gender in French requires pronoun resolution
            #   "experienced engineer" → "ingénieur(e) expérimenté(e)"
            #   The conditional block tells the LLM HOW to handle ambiguity
            conditional_block="Utilisez les formes épicènes ou la double forme "
                              "masculin/féminin lorsque le genre n'est pas spécifié.",
        ),
        RubricRule(
            dimension="orthography",
            language=Language.FR,
            rule_text="French punctuation: space before :, !, ?, ; (insécable)",
            conditional_block="Respectez les règles typographiques françaises : "
                              "espace insécable avant les signes de ponctuation doubles.",
        ),
    ],
    Language.DE: [
        RubricRule(
            dimension="tone",
            language=Language.DE,
            rule_text="Use formal address (Sie) for all professional communications",
            conditional_block="Verwenden Sie in allen professionellen Kommunikationen "
                              "die Siezen-Form.",
        ),
        RubricRule(
            dimension="orthography",
            language=Language.DE,
            rule_text="Capitalize ALL nouns (German grammar requirement)",
            # ← This cannot be enforced by post-processing alone
            #   The model must produce correctly capitalized output natively
            #   Post-hoc capitalization is unreliable for identifying nouns
            conditional_block="Beachten Sie, dass im Deutschen alle Substantive "
                              "großgeschrieben werden müssen.",
        ),
    ],
}


# ── PROMPT TRANSFORMATION PIPELINE ───────────────────────────────────────────
class PromptTransformPipeline:
    """
    Takes an English prompt and produces a language-aware version.
    The key design: it can be re-run whenever the English prompt changes.
    ← No manual re-translation required when product updates prompts.
    """

    def __init__(self, rubric: dict[Language, list[RubricRule]]):
        self.rubric = rubric

    def transform(self, english_prompt: str, target_lang: Language) -> str:
        """
        Analyze English prompt → identify language-dependent elements
        → inject conditional blocks from rubric framework.
        """
        if target_lang == Language.EN:
            return english_prompt

        # Collect all rubric rules for this language
        lang_rules = self.rubric.get(target_lang, [])

        # Build language-aware preamble from rubric
        # ← Each conditional block is a linguist-reviewed instruction
        #   that tells the model exactly how to handle that dimension
        rubric_preamble = "\n".join(
            f"[{rule.dimension.upper()}]: {rule.conditional_block}"
            for rule in lang_rules
        )

        # This sketch keeps the task text in English and prepends the rubric
        # constraints. A production pipeline could also translate the core
        # instruction (via a multilingual LLM or human translation).
        # ← The rubric preamble is auto-applied from the framework
        return (
            "LANGUAGE RULES FOR THIS RESPONSE:\n"
            f"{rubric_preamble}\n"
            "TASK:\n"
            f"{english_prompt.strip()}\n"
            f"Respond in {target_lang.value.upper()} following all language rules above."
        )


# ── USAGE EXAMPLE ──────────────────────────────────────────────────────────────
pipeline = PromptTransformPipeline(RUBRIC)
english_outreach_prompt = """
Draft a professional outreach message to a software engineer candidate
inviting them to apply for a Senior Backend Engineer role.
"""
fr_prompt = pipeline.transform(english_outreach_prompt, Language.FR)
de_prompt = pipeline.transform(english_outreach_prompt, Language.DE)
print(fr_prompt)
# Prints the French-aware prompt with vous-form, gender, and punctuation rules injected
# When the English prompt changes, re-running pipeline.transform() regenerates
# the localized prompts
# ← Zero linguist re-work needed for English prompt updates
The rubric_preamble injection is the O(1) cost operation. Instead of a linguist reviewing every sub-agent's prompt for every language update, the rubric captures the rules once and the pipeline injects them automatically. The linguist's time is spent on the rubric definition and its validation, not on each individual prompt.
Snippet Two: Per-Language LoRA Adapter (Track B)
# Per-language LoRA adapter training for cost-efficient multilingual inference
# Reconstructed from LinkedIn Engineering blog description
# The design intent: achieve native output quality at base model serving cost
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType
import torch

# NOTE: load_rubric_aligned_dataset and LanguageAdapterTrainer below are
# hypothetical project helpers, not part of transformers or peft. Supply your
# own dataset loader and training loop before running this sketch.


# ── LORA ADAPTER TRAINING CONFIGURATION ──────────────────────────────────────
def create_language_adapter(
    base_model_name: str,
    language: str,  # "fr" or "de"
    training_data_path: str,
) -> None:
    """
    Train a per-language LoRA adapter on rubric-aligned professional recruitment text.

    Why LoRA specifically?
    ← LoRA (Low-Rank Adaptation) adds only a small number of trainable parameters
      to the base model (typically 0.1-1% of total parameters).
      The base model weights are FROZEN: no full fine-tuning.
      At serving time: base model + adapter = negligible memory overhead.
      Cost at inference: effectively the same as serving the base model alone.
    ← THIS is the key cost insight:
      A frontier model with prompt engineering = high per-token cost at scale
      A cost-efficient base model + LoRA adapter = native quality, fraction of cost
      The LoRA adapter encodes the language rules the smaller model cannot
      reliably follow via prompts
    """
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_name,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(base_model_name)

    # LoRA configuration: target attention projections (standard choice)
    # rank=16 is a common starting point: balance between capacity and cost
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=16,                                 # ← low rank: small number of new parameters
        lora_alpha=32,                        # scaling factor for adapter output
        target_modules=["q_proj", "v_proj"],  # only adapt attention layers
        lora_dropout=0.05,
        # ← Language-specific adapters only need to learn:
        #   grammatical gender patterns, formal register, capitalization norms
        #   NOT: broad language understanding (already in base model)
        #   So low rank is sufficient; no need for high-rank adapters
    )
    model = get_peft_model(base_model, lora_config)
    model.print_trainable_parameters()
    # Example output format (figures are illustrative; the exact counts depend
    # on the base model architecture and the rank):
    # "trainable params: 4,194,304 || all params: 7,243,534,336 || trainable%: 0.0579"
    # ← ~4M new parameters vs ~7B total = 0.06% overhead
    # At serving: merging the adapter into the base weights removes the extra
    # adapter computation; keeping it separate allows swapping per language

    # Training data: professional recruitment text in target language
    # Aligned with rubric rules: examples demonstrate correct gender agreement,
    # formal register, noun capitalization, etc.
    train_dataset = load_rubric_aligned_dataset(  # hypothetical helper
        training_data_path,
        language=language,
        # ← Training examples are specifically curated to demonstrate rubric compliance
        #   Not generic language data: professional HR/recruitment domain text
        #   filtered for: formal register, correct gender handling, style guidelines
    )

    # Fine-tune only the LoRA parameters; base model weights unchanged
    trainer = LanguageAdapterTrainer(  # hypothetical helper
        model=model,
        tokenizer=tokenizer,
        dataset=train_dataset,
        output_dir=f"./adapters/{language}_hiring_assistant",
    )
    trainer.train()
    model.save_pretrained(f"./adapters/{language}_hiring_assistant")
    print(f"Saved {language} LoRA adapter")
    # ← The adapter file is tiny: ~4M parameters in bf16 is roughly 8MB,
    #   versus roughly 14GB for a 7B-parameter base model
    #   The base model is loaded once and shared
    #   Multiple language adapters can share the same base model instance
The 0.06% trainable parameter ratio is the efficiency story of Track B. The base model already understands French and German. The LoRA adapter teaches it specifically how LinkedIn's Hiring Assistant should speak French and German: formal register, rubric-compliant gender handling, brand-appropriate tone. This is domain adaptation at a fraction of full fine-tuning cost. Serving involves a trade-off: an adapter kept separate can be swapped at serving time without reloading the base model, while an adapter merged into the base weights removes the extra adapter computation but can no longer be hot-swapped.
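The parameter counts can be derived by hand. The sketch below assumes a hypothetical 7B-class decoder with hidden size 4096, 32 layers, and square `q_proj` and `v_proj` matrices (no grouped-query attention). These dimensions are assumptions, since the source does not name the base model. Each adapted weight of shape d_out × d_in gains two low-rank factors with `rank * (d_in + d_out)` parameters in total.

```python
def lora_trainable_params(rank: int, d_in: int, d_out: int,
                          modules_per_layer: int, num_layers: int) -> int:
    """Parameters added by LoRA: A is (rank x d_in), B is (d_out x rank) per module."""
    per_module = rank * (d_in + d_out)
    return per_module * modules_per_layer * num_layers


def adapter_size_mib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate adapter size, assuming bf16 storage (2 bytes per parameter)."""
    return num_params * bytes_per_param / (1024 ** 2)


# Assumed dimensions of a hypothetical 7B-class model (not from the source).
HIDDEN, LAYERS, MODULES = 4096, 32, 2  # MODULES: q_proj and v_proj

for rank in (8, 16):
    n = lora_trainable_params(rank, HIDDEN, HIDDEN, MODULES, LAYERS)
    print(f"rank={rank}: {n:,} trainable params, ~{adapter_size_mib(n):.0f} MiB in bf16")
```

Under these assumed dimensions, rank 8 gives the 4,194,304 parameters quoted in the example output and rank 16 gives twice that. Either way the adapter stays far below 1% of a 7B-parameter base model.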
It In Action: End-to-End Worked Example
Scenario: Generate a German outreach message for a Java engineer candidate
Input (recruiter intent):
Task: generate_outreach_message
Candidate: [Java engineer, 8 years experience, Munich]
Role: Senior Backend Engineer, Berlin
Language: de
Step 1: Planner routes to generate_outreach sub-agent
Plan node: generate_outreach
Sub-agent: outreach_generator
Target language: de
Step 2: Rubric lookup
German rubric rules applied:
[TONE]: Verwenden Sie in allen professionellen Kommunikationen die Siezen-Form
[ORTHOGRAPHY]: Beachten Sie, dass im Deutschen alle Substantive großgeschrieben werden
[CULTURAL]: Direkter, sachlicher Stil bevorzugt; keine überschwänglichen Formulierungen
[FORMAT]: Datum im Format TT.MM.JJJJ; kein Oxford-Komma
Step 3a (Track A, if using instruction-following model):
Transformed German prompt injected with rubric conditional blocks
→ Model produces:
"Sehr geehrter Herr [Name],
mit großem Interesse habe ich Ihr Profil als Java-Entwickler gesehen.
Für unsere Position als Senior Backend Engineer in Berlin suchen wir
erfahrene Softwareingenieure mit Ihrem Hintergrund.
Hätten Sie Interesse an einem kurzen Austausch?
Mit freundlichen Grüßen
[Recruiter Name]"
Linguistic compliance check:
Sie-form: ✓ (Sie not du — correct formal register)
Noun caps: ✓ Interesse, Profil, Java-Entwickler, Position, Softwareingenieure, Hintergrund, Austausch
Direct style: ✓ restrained phrasing, no exclamation marks (DE professional norms); the job title stays in English
Brand register: ✓ "Mit freundlichen Grüßen" (correct DE closing formula)
Step 3b (Track B, if using cost-efficient model):
Same German prompt → routed to base model with DE LoRA adapter loaded
Adapter encodes: Sie-form conventions, noun capitalization, professional HR register
Output: linguistically equivalent to Track A
Cost: ~60-70% lower per-token vs. frontier model with prompt engineering
Serving overhead: LoRA weights fused; no runtime cost vs. base model alone
Production metrics context:
English baseline: 1.5 hours saved per recruiter per role
FR/DE expansion target: same productivity gain in new markets
AI-Assisted Messages: +40% InMail acceptance rate (documented)
Automated Follow-Ups: +39% accepted InMails vs. manual follow-up
Why This Design Works, and What It Trades Away
The rubric framework addresses the real bottleneck: linguist scarcity. Without it, every sub-agent's prompt must be independently reviewed by a linguist for every language update. A team with 10 sub-agents expanding to 5 languages needs 50 independent prompt reviews for each product iteration. The rubric captures expert knowledge once in a structured format and distributes it automatically. Linguist time goes into defining and validating rubric dimensions, not into reviewing individual prompts.
The two-track approach is the correct engineering decision for a system that must balance quality and cost across model sizes. Frontier instruction-following models respond well to detailed prompt engineering but are expensive at scale. Smaller models cannot be prompted into reliable native-language professional output, but they can be adapted with a lightweight LoRA that costs almost nothing at serving time. Matching the model class to the language adaptation method is the efficiency insight.
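The matching of model class to adaptation method can be sketched as a small routing function. All names here (`Track`, `ModelProfile`, `choose_track`) are hypothetical. The boolean flag stands in for whatever offline evaluation a team uses to decide whether a model follows detailed prompt rules reliably.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Track(Enum):
    PROMPT_TRANSFORM = "A"  # rubric rules injected into the prompt
    LORA_ADAPTER = "B"      # rubric rules learned by a per-language adapter


@dataclass(frozen=True)
class ModelProfile:
    name: str
    reliable_instruction_following: bool  # assumed to be measured offline


def choose_track(model: ModelProfile, lang: str) -> Optional[Track]:
    """Pick the localization method for a model and target language."""
    if lang == "en":
        return None  # English is the source language; nothing to adapt
    if model.reliable_instruction_following:
        return Track.PROMPT_TRANSFORM
    return Track.LORA_ADAPTER


frontier = ModelProfile("frontier-model", reliable_instruction_following=True)
compact = ModelProfile("compact-model", reliable_instruction_following=False)

for model in (frontier, compact):
    print(model.name, "->", choose_track(model, "de"))
```

Both tracks read from the same rubric. Only the delivery mechanism differs, so a model can move between tracks without redefining the language rules.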
The auto-sync property of the prompt transformation pipeline (re-runs when English prompts change) is underrated. In a production agentic system, English prompts are updated frequently as the product evolves. Without auto-sync, every English prompt change requires a manual re-localization effort for every supported language. With auto-sync, the localized versions stay current automatically. This is the design decision that makes international expansion sustainable rather than a continuous maintenance burden.
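One way to get this auto-sync behaviour is to key each localized prompt on a hash of the English prompt, the language, and the rubric version, so any change produces a cache miss and a rebuild. This is a sketch of that idea, not LinkedIn's implementation. `LocalizedPromptCache`, `rubric_version`, and `fake_transform` are hypothetical names.

```python
import hashlib
from typing import Callable, Dict


class LocalizedPromptCache:
    """Rebuilds a localized prompt only when its English source or the rubric changes."""

    def __init__(self, transform: Callable[[str, str], str], rubric_version: str):
        self._transform = transform
        self._rubric_version = rubric_version
        self._cache: Dict[str, str] = {}
        self.rebuilds = 0  # counts how often the transform actually ran

    def _key(self, english_prompt: str, lang: str) -> str:
        payload = "\x1f".join([self._rubric_version, lang, english_prompt])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, english_prompt: str, lang: str) -> str:
        key = self._key(english_prompt, lang)
        if key not in self._cache:
            self._cache[key] = self._transform(english_prompt, lang)
            self.rebuilds += 1
        return self._cache[key]


def fake_transform(english_prompt: str, lang: str) -> str:
    # Stand-in for the real prompt transformation pipeline.
    return f"[{lang.upper()} RULES]\n{english_prompt}"


cache = LocalizedPromptCache(fake_transform, rubric_version="2026-06")
cache.get("Draft an outreach message.", "de")
cache.get("Draft an outreach message.", "de")        # unchanged: served from cache
cache.get("Draft a short outreach message.", "de")   # English changed: rebuilt
print(cache.rebuilds)
```

Including the rubric version in the key means a linguist's rubric update also invalidates every cached prompt for that language, with no per-prompt bookkeeping.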
What the system trades away:
Dynamic gender resolution remains hard. The rubric provides rules for gender agreement, but the model must infer the correct gendered form from context (a candidate's profile, their name, their stated pronouns). In cases of ambiguity, the model must choose between a generic form, a double form, or an incorrect assumption. The rubric reduces the frequency of errors but cannot eliminate them without explicit gender signals.
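A common mitigation is to use a gendered form only when an explicit signal exists, and otherwise fall back to a neutral one. The sketch below applies this to German salutations. The function name and the `"f"`/`"m"` codes are hypothetical, and "Guten Tag" followed by the full name is one widely used neutral option, not a form taken from LinkedIn's rubric.

```python
from typing import Optional


def german_salutation(first_name: str, last_name: str,
                      stated_gender: Optional[str] = None) -> str:
    """Return a formal German salutation, gendered only on an explicit signal."""
    if stated_gender == "f":
        return f"Sehr geehrte Frau {last_name},"
    if stated_gender == "m":
        return f"Sehr geehrter Herr {last_name},"
    # No explicit signal: do not guess from the first name; use a neutral form.
    return f"Guten Tag {first_name} {last_name},"


print(german_salutation("Anna", "Schmidt", "f"))
print(german_salutation("Kim", "Weber"))
```

The fallback trades a slightly less traditional opening for the guarantee that the message never misgenders the candidate.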
The LoRA adapter approach (Track B) has a training data dependency. The adapter quality is bounded by the quality and representativeness of the rubric-aligned training data. If the training data underrepresents specific sub-domains (e.g., legal or medical hiring), the adapter will underperform in those contexts while the Track A prompt approach degrades more gracefully.
Annotation quality is a direct function of linguist expertise in the target professional domain. A linguist fluent in French but unfamiliar with French HR conventions will produce rubric rules that are grammatically correct but professionally inappropriate. The rubric framework's quality ceiling is the quality of the linguists who define it.
Technical Moats
The rubric-as-code paradigm. The rubric framework transforms linguistic expertise from implicit knowledge in individual prompts into an explicit, versioned, reusable data structure. This is the same architectural insight that drove the move from imperative configuration to declarative infrastructure-as-code: externalizing the rules from the implementation. Replicating this requires not just the engineering infrastructure but also a process for linguist-engineer collaboration that most teams have not built.
The plan-and-execute architecture as the multiplication surface. The Hiring Assistant's multi-agent structure means each new sub-agent added to the English product automatically inherits internationalization support through the rubric framework. New sub-agents do not require new localization engineering work. This architectural property compounds: a team that builds 20 sub-agents in English effectively gets 60 sub-agents across three languages for the engineering cost of 20. Teams that build monolithic single-prompt systems cannot replicate this compound effect.
The annotation flywheel. Linguist reviews feed rubric refinements. Rubric refinements improve adapter training data quality. Better adapters reduce the frequency of linguist review needed. The system improves its own localization quality over time without linear growth in linguist headcount. Breaking into this flywheel from zero requires significant initial linguist investment.
Insights
Insight One: The real challenge in international AI agent expansion is not language quality. It is language consistency across a distributed system. A single LLM asked to write in French will generally produce grammatically correct French. But a multi-agent system where 10 specialized sub-agents each make independent language decisions will produce inconsistent French, with different register choices, different gender strategies, different tone. The rubric framework's primary value is not grammar correction: it is enforcing consistency across agents. Without a shared rubric, each sub-agent's language output is correct in isolation and inconsistent in aggregate.
Insight Two: The formal vs. informal address problem (vous/tu, Sie/du) is a binary choice in the prompt but a continuous challenge in production. In a recruitment context, the correct choice is almost always formal. But the model must maintain this choice consistently across multi-turn interactions, across different output types (outreach messages, interview confirmations, pipeline summaries), and across sub-agents that may not share conversation state. A model that correctly uses Sie in a job description but defaults to du in a follow-up message has a consistency failure, not a quality failure. The rubric framework enforces consistency by making the register choice an explicit constraint rather than an implicit model behavior.
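A register constraint like this can also be spot-checked after generation. The sketch below is a deliberately crude heuristic, assuming that any informal second-person pronoun in an output is a violation. The word lists are hypothetical and incomplete, and French "ton" will also match the noun meaning "tone", so hits should go to review rather than trigger automatic rejection.

```python
import re
from typing import List

# Hypothetical, incomplete lists of informal second-person forms.
INFORMAL = {
    "de": re.compile(r"\b(du|dich|dir|dein\w*)\b", re.IGNORECASE),
    "fr": re.compile(r"\b(tu|toi|ton|ta|tes)\b", re.IGNORECASE),
}


def informal_hits(text: str, lang: str) -> List[str]:
    """Return the informal-address words found in text, lowercased, in order."""
    return [match.lower() for match in INFORMAL[lang].findall(text)]


outputs = {
    "outreach": "Hätten Sie Interesse an einem kurzen Austausch?",
    "follow_up": "Hast du meine Nachricht erhalten?",
}
for sub_agent, text in outputs.items():
    print(sub_agent, informal_hits(text, "de"))
```

Running the same check over every sub-agent's output surfaces exactly the failure described above: one agent drifting to du while the others hold Sie.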
Surprising Takeaway
German noun capitalization is not trivially solved by post-processing. The obvious engineering response to "capitalize all German nouns" is to run a POS (part-of-speech) tagger on the output and capitalize the nouns. This works for common nouns but fails on ambiguous cases: verbs used as nouns (nominalization), compound nouns with unclear boundaries, and borrowed words that may or may not be capitalized in context. A post-processing approach that relies on POS tagging introduces a second model whose error rate compounds with that of the generation model. LinkedIn's rubric approach instead has the generation model produce correctly capitalized output natively, by injecting the capitalization rule as a prompt constraint. This avoids stacking a tagger's errors on top of the generator's. The model that already understands German grammar is better positioned to get capitalization right than a downstream tagger applied to its output.
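The ambiguity is easy to reproduce with a toy post-processor. The sketch below uses a hypothetical three-word noun lexicon in place of a real POS tagger, which is a simplification: a real tagger uses context and fails less often, but it faces the same underlying problem that one surface form can be a noun or a verb.

```python
# Hypothetical mini-lexicon standing in for a context-free noun detector.
NOUN_LEXICON = {"essen", "profil", "interesse"}


def naive_capitalize(sentence: str) -> str:
    """Capitalize every word whose lowercase form appears in the noun lexicon."""
    words = []
    for word in sentence.split():
        core = word.strip(".,!?").lower()
        words.append(word.capitalize() if core in NOUN_LEXICON else word)
    return " ".join(words)


print(naive_capitalize("das essen ist fertig"))  # "essen" is a noun here: the meal
print(naive_capitalize("wir essen um zwölf"))    # "essen" is a verb here: we eat
```

The first sentence is fixed correctly, while the second gets a wrongly capitalized verb. Telling the two apart requires the sentence-level understanding that the generation model already has.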
TL;DR For Engineers
LinkedIn's Hiring Assistant expanded to French and German (June 2026) with a two-track localization architecture: a shared rubric framework (linguist-reviewed rules for tone, gender, orthography, cultural adaptation, format) applied via prompt transformation for instruction-following models, and per-language LoRA adapters for cost-efficient models. Rubric dimensions: language_purity, tone, orthography, cultural_adaptation, gender_rules.
The rubric framework is the O(n) vs. O(n×m) complexity reduction: language expertise captured once, applied automatically across all sub-agents. When English prompts update, the transformation pipeline re-runs automatically, keeping localized versions in sync with zero manual effort.
The two-track model strategy reflects a cost architecture: frontier instruction-following models get rubric-injected prompt templates (higher quality, higher cost); smaller models get per-language LoRA adapters at ~0.06% trainable parameter overhead (native quality, significantly lower serving cost).
Five linguistic challenges not present in English: grammatical gender agreement, formal vs. informal address (vous/tu, Sie/du), noun capitalization (DE), date/number formats, and brand register conventions. Post-generation translation fails for grammatical gender because the gender information is absent from the English source text.
Outcome metrics: 1.5 hours saved per recruiter per role (Hiring Assistant, English baseline). AI-Assisted Messages: +40% InMail acceptance rate. Automated Follow-Ups: +39% accepted InMails vs. manual. French and German expansion targets these same productivity gains.
The Language Rules Are the Infrastructure
LinkedIn's internationalization playbook makes the correct engineering call: do not solve the multilingual problem at the model layer. Solve it at the rules layer. A rubric framework that stores linguist-reviewed language rules as a versioned data structure is infrastructure that compounds: each new language adds a rubric definition, not a new engineering effort per sub-agent per language. Each new sub-agent inherits the existing rubrics automatically.
The LoRA adapter strategy is the companion cost architecture that makes this scale financially. You do not need frontier-model-quality serving costs for every language output if a 0.06% parameter adapter can produce native-quality professional text from a cost-efficient base.
The combination is the playbook: abstract language rules into a shared rubric, apply them automatically through the prompt pipeline for large models, encode them efficiently into adapters for small models, and build a system where adding a new language is a rubric addition, not a re-engineering effort.
References
LoRA: Low-Rank Adaptation of Large Language Models, Hu et al., arXiv:2106.09685 — the PEFT technique used in Track B adapters
2026 LinkedIn Hiring Release Features — AI-Assisted Messages +40% InMail acceptance rate, Automated Follow-Ups +39%


