Telegram Formatting as a Channel Contract

🤖 Ghostwritten by GPT 5.4 · Fact-checked & edited by Claude Opus 4.6

A production agent should not behave like a debug console. That was the core lesson behind a small May change with outsized impact: treating Telegram formatting as a channel contract, not as presentation cleanup after the fact. Once an operator reads an agent through a chat surface, the rendering rules of that surface become part of the product itself.

For a chief-of-staff agent, that meant two deliberate decisions. First, outbound delivery moved behind a single notify module that owns channel delivery behavior, including optional Telegram HTML parse mode and safe HTML escaping helpers. Second, streaming was turned off by default for this operator-facing path, so intermediate tool chatter never leaks into the human-visible channel. The operator sees the finished answer, formatted for the channel, instead of a jumble of half-complete thoughts, raw tool payloads, or malformed markup.

That sounds minor until it fails. An agent that dumps raw tool output, partial reasoning, or unescaped text into Telegram quickly becomes hard to trust. Readability is not polish. It is the contract.

The Problem: Readable Output Is Part of the Product

TL;DR: If an agent speaks through chat, the channel's rendering rules define whether the system feels reliable or broken.

The easy mistake in agent engineering is to treat message delivery as a thin transport layer. Model returns text, transport sends text, done. That assumption breaks the moment the channel has formatting semantics, truncation behavior, parse modes, or user-visible quirks.

Telegram is a good example because it is simple enough to look harmless and opinionated enough to punish sloppy output. A message can render cleanly with emphasis and structure, or it can become confusing because markup was malformed, content was not escaped, or internal tool chatter spilled into the conversation. The difference between those outcomes is not aesthetic. It directly affects operator trust.

In practice, operators judge an agent by a few basic questions:

Can the answer be read quickly?
Does the formatting make hierarchy obvious?
Is the message free of unexplained junk from tools or middleware?
Does copied user content render safely and predictably?
Does the final answer look intentional?

That last point matters more than many teams expect. A well-formatted answer signals that the system has completed a task and is presenting a result. A stream of intermediate fragments signals uncertainty, leakage, or lack of control.

This is especially important for a chief-of-staff style agent. That role is not just generating text; it is mediating work for a human operator. The operator should receive a concise, structured update, not a transcript of internal machinery.

The broader lesson applies beyond Telegram. Slack, email, SMS, voice transcripts, and internal dashboards all have their own channel contracts. The rendering rules, safety rules, and leakage boundaries are channel-specific. A production agent needs to respect them explicitly.

What Changed: One Notify Module, One Delivery Contract

TL;DR: Centralizing outbound delivery in a notify module created a single place to enforce Telegram formatting, HTML escaping, and final-only message delivery.

The change was architectural more than algorithmic. Instead of letting multiple parts of the system send messages directly, outbound delivery was centralized in a shared notify module. That module became the owner of channel behavior.

That sounds almost boring, which is usually a sign of a good production change. The goal was not to invent a clever abstraction. The goal was to remove ambiguity about who is allowed to speak to the operator and how that speech is rendered.

The centralized-notify pattern does a few useful things:

It gives one module responsibility for outbound delivery.
It standardizes parse-mode decisions for Telegram.
It provides helper functions for safe HTML escaping.
It makes it easier to suppress raw tool chatter.
It creates a single review point for channel-specific security behavior.

A simplified comparison:

Approach	Benefits	Risks
Direct sends from many call sites	Fast to prototype	Inconsistent formatting, duplicated escaping logic, easy leakage of raw tool output
Centralized notify module	Consistent rendering, easier audits, shared safety controls	Requires discipline and some refactoring
Channel-agnostic raw text transport	Minimal abstraction	Ignores parse-mode quirks, weakens readability, increases output injection risk

The practical effect was that the agent no longer had to remember Telegram-specific details at every call site. The caller could ask to notify a channel, and the notify module would decide how to render safely for that destination.

Formatting bugs are rarely dramatic during development. They show up as subtle breakage: emphasis fails, angle brackets vanish, chunks of text disappear, or a user-supplied string unexpectedly changes the rendering of the entire message. Centralization reduces the number of places where those bugs can be introduced.

An Illustrative Pattern

A simplified version of the pattern:

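// Single entry point for operator-facing Telegram messages.
// toTelegramHtml is the escaping helper defined in the next section;
// sendToChannel stands in for the actual transport call, which is not shown here.
export async function notifyTelegram(options: {
  text: string;
  html?: boolean; // opt in to Telegram HTML parse mode
}) {
  const payload = {
    // Escape only when HTML parse mode is on; plain-text sends pass through unchanged.
    text: options.html ? toTelegramHtml(options.text) : options.text,
    parse_mode: options.html ? "HTML" : undefined,
  };

  return sendToChannel(payload);
}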

The point is not the exact function signature. The point is ownership. One module decides whether Telegram HTML is enabled, how content is normalized, and what finally gets sent.

Safe HTML Rendering Requires Escaping, Always

TL;DR: HTML escaping is mandatory whenever arbitrary content flows into an HTML-rendered Telegram message, because unescaped text can break formatting or create output injection bugs.

Once Telegram HTML parse mode is enabled, every variable piece of content becomes suspicious by default. User input, tool output, filenames, snippets, summaries, and copied text can all contain characters that Telegram will interpret as markup unless they are escaped.

The subtle bug class here is output injection through formatting. This is not the same as server-side code execution or browser-based cross-site scripting, but it belongs in the same family of mistakes: untrusted content is interpreted in a richer output context than intended.

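The minimum safe control is straightforward. Escape the characters that matter before arbitrary text is inserted into an HTML-rendered message. For Telegram HTML, the baseline is escaping ampersands and angle brackets: the Telegram Bot API documentation requires every <, >, and & that is not part of a tag or entity to be replaced with the corresponding entity. Escaping double quotes as &quot;, one of the named entities Telegram supports, additionally protects content that ends up inside an attribute value such as a link target.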

An illustrative helper:

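export function escapeHtml(input: string): string {
  return input
    .replace(/&/g, "&amp;") // must run first so the entities added below are not escaped again
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;"); // matters when the text ends up inside an attribute value
}

// Treats the whole input as untrusted text and escapes all of it.
export function toTelegramHtml(input: string): string {
  return escapeHtml(input);
}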

In a fuller implementation, the formatter may also intentionally wrap trusted structural elements, such as bold section labels or code-style fragments, after escaping the underlying content. The key rule is order of operations: escape arbitrary content first, then apply trusted formatting.
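The escape-first ordering can be sketched with a tagged template. This is a minimal illustration, not the module described in this article: the `html` tag function is a hypothetical name, and `escapeHtml` repeats the helper above so the block stands alone. The literal parts of the template are trusted markup written by the developer, while every interpolated value is treated as untrusted and escaped.

```typescript
// Hypothetical sketch: `html` is an illustrative name, not part of any library.

// Same escaping rules as the helper above, repeated so this block is self-contained.
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Tagged template: literal segments pass through as trusted markup,
// interpolated values are converted to strings and escaped.
function html(strings: TemplateStringsArray, ...values: unknown[]): string {
  let out = strings[0];
  for (let i = 0; i < values.length; i++) {
    out += escapeHtml(String(values[i])) + strings[i + 1];
  }
  return out;
}

// Invented example of markup-like tool output.
const toolOutput = 'build <failed> & "retrying"';
const message = html`<b>Status</b>: ${toolOutput}`;
// message === '<b>Status</b>: build &lt;failed&gt; &amp; &quot;retrying&quot;'
```

Because the trusted tags live only in the template literal, a caller cannot accidentally apply formatting before escaping: the order of operations is fixed by the helper rather than remembered at each call site.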

Why This Bug Is Easy to Miss

It often hides during happy-path testing. Internal test prompts are usually clean, short, and free of markup-looking characters. Real content is not. A tool might return XML-like fragments, a file path summary might include angle brackets from templating syntax, or a user may paste content that includes literal markup.

Without HTML escaping, several failure modes appear:

Parts of the message disappear or render incorrectly.
Intended formatting breaks because the parser sees accidental tags.
Content appears with unintended emphasis or structure.
The operator cannot tell what the agent actually meant to send.

In a human-facing operations channel, that is enough to damage trust. If a message looks malformed, the operator has to wonder whether only the formatting is wrong or whether the underlying result is wrong too.
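One way to keep this bug class from hiding behind clean test prompts is to add markup-like fixtures to the test suite. The sketch below is an assumption-laden illustration: the fixture strings are invented, and `hasUnescapedMarkup` is a hypothetical check meant for content fragments before trusted tags are added, not for finished messages that intentionally contain formatting.

```typescript
// Hypothetical check: flags text that Telegram's HTML parser could read as markup.
// Deliberately strict: it accepts only the four named entities used in this article
// and would also flag numeric entities.
function hasUnescapedMarkup(text: string): boolean {
  const rawBracket = /[<>]/.test(text);
  const bareAmpersand = /&(?!(amp|lt|gt|quot);)/.test(text);
  return rawBracket || bareAmpersand;
}

// Invented fixtures modelled on the cases described above.
const rawFixtures: string[] = [
  "<config><retries>3</retries></config>", // XML-like tool output
  "templates/<%= name %>.html",            // templating syntax in a path
  "Tom & Jerry <b>bold?</b>",              // pasted literal markup
  "R&D budget",                            // bare ampersand
];

const flagged = rawFixtures.filter(hasUnescapedMarkup);
// All four raw fixtures are flagged; an escaped string such as "R&amp;D budget" is not.
const escapedOk = !hasUnescapedMarkup("R&amp;D budget");
```

Running fixtures like these through the real formatter and asserting that nothing is flagged afterwards turns a silent rendering bug into a failing test.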

Why Streaming Off Was the Right Call for This Channel

TL;DR: Disabling streaming prevents intermediate tool-call chatter, partial results, and internal noise from reaching the operator before the answer is ready.

Streaming can be useful in the right context. It improves perceived responsiveness, supports live drafting experiences, and can make long-running tasks feel active instead of stalled. But those benefits depend heavily on the channel and the job to be done.

For an operator-facing Telegram workflow, streaming created the wrong incentives. It exposed intermediate model behavior too early and too literally. Tool-call chatter, partial synthesis, and transient fragments are useful for internal observability, but they are not useful as operator-facing communication.

Turning streaming off for this path enforced a simple rule: only the finished, formatted answer goes to the human-visible channel.

That single decision improved several things at once:

Delivery mode	Operator experience	Leakage risk	Readability
Streaming on	Feels live, but can expose partial thoughts and tool noise	Higher	Often inconsistent
Streaming off	Slightly less immediate, but cleaner final answer	Lower	Stronger and more intentional

This is one of those trade-offs where product quality beats raw interactivity. A chief-of-staff agent is not trying to entertain the operator with a live typing effect. It is trying to deliver a useful answer with confidence and structure.

There is also a security dimension. Raw tool chatter can contain partial results that were never meant for the final audience. Depending on the system, that may include internal file references, stack fragments, intermediate summaries, or diagnostics that make sense to engineers but not to operators. Even when the content is not sensitive, its presence in the channel creates confusion about what is authoritative.

Suppressing that chatter is therefore both a usability choice and a leakage-control choice. Internal traces belong in logs and observability systems, not in the operator's chat thread.
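The final-only rule can be sketched as a pure routing step over whatever events the agent runtime emits. The event shape (`AgentEvent` with `tool_call`, `partial`, and `final` kinds) and the name `routeEvents` are assumptions made for illustration; a real runtime will define its own event types.

```typescript
// Hypothetical event shape, invented for this sketch.
type AgentEvent =
  | { kind: "tool_call"; name: string; payload: string }
  | { kind: "partial"; text: string }
  | { kind: "final"; text: string };

interface Routed {
  toOperator: string | null; // the single message the human sees, if any
  toLogs: AgentEvent[];      // intermediate events, kept for observability
}

// Only a "final" event reaches the operator; tool calls and partial text go to logs.
// If several final events arrive, the last one wins.
function routeEvents(events: AgentEvent[]): Routed {
  let toOperator: string | null = null;
  const toLogs: AgentEvent[] = [];
  for (const event of events) {
    if (event.kind === "final") {
      toOperator = event.text;
    } else {
      toLogs.push(event);
    }
  }
  return { toOperator, toLogs };
}

const routed = routeEvents([
  { kind: "tool_call", name: "search", payload: '{"q":"status"}' },
  { kind: "partial", text: "Looking at the build" },
  { kind: "final", text: "Build is green." },
]);
// routed.toOperator === "Build is green." and routed.toLogs.length === 2
```

Keeping the routing step free of I/O makes the contract easy to test: the operator-facing value is either one finished answer or nothing at all.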

Final-Only Delivery Is a Trust Signal

Humans read sequencing as meaning. If a system emits five messy partial messages and then a clean answer, the operator has to mentally reconcile which one counts. If the system emits one final answer, the contract is obvious.

That clarity is worth more than a few seconds of perceived speed.

Security Lesson: Formatting Is an Injection Boundary

TL;DR: Outbound formatting is a security boundary, because rendered channels interpret content and can accidentally expose raw internal data if delivery is not controlled.

It is tempting to classify formatting as a front-end concern and security as a back-end concern. In agent systems, that separation does not hold up well.

The moment content crosses into a rendered channel, formatting becomes part of the security model. If the channel interprets markup, then escaping is an injection-prevention control. If the channel is visible to humans, then suppressing raw tool chatter is a data-exposure control.

That framing changes implementation priorities.

Instead of asking, "Can this string be displayed?" the better question is, "What interpretation rules will the channel apply to this string, and what content must be neutralized before delivery?"

For Telegram HTML, the baseline control is HTML escaping for arbitrary content. For operator trust, the baseline control is final-only delivery rather than streaming raw intermediate states. Together, those controls reduce two common failure classes:

Output injection through unescaped markup-like content
Accidental leakage of internal traces into a human-visible channel

This is also why centralized notification matters. Security controls scattered across many call sites tend to decay. One engineer remembers to escape content, another assumes the caller already did it, and a third bypasses the helper for a special case. A single notify module creates one place to enforce the rule and one place to review it.

The deeper engineering lesson is that channels are not neutral pipes. They are interpreters with their own syntax, affordances, and failure modes. Production agents need explicit contracts for each one.
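One way to make such a contract explicit is a small per-channel record that the notify module consults. Everything in this sketch is hypothetical (the `ChannelContract` shape, the channel names, and the `prepare` function); it only illustrates declaring rendering and streaming policy per channel instead of deciding them at call sites.

```typescript
// Compact copy of the escaping helper so the block is self-contained.
const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

// Hypothetical contract shape: what a channel interprets and what it may receive.
interface ChannelContract {
  parseMode?: "HTML";           // forwarded as Telegram's parse_mode when set
  allowStreaming: boolean;      // whether intermediate output may reach this channel
  render(text: string): string; // neutralizes content for the channel's interpreter
}

// Illustrative channel names, not taken from the system described in this article.
const contracts: Record<"telegramOperator" | "plainLog", ChannelContract> = {
  telegramOperator: { parseMode: "HTML", allowStreaming: false, render: escapeHtml },
  plainLog: { allowStreaming: true, render: (text) => text },
};

// Callers name a channel; the contract decides rendering and parse mode.
function prepare(channel: keyof typeof contracts, text: string) {
  const contract = contracts[channel];
  return { text: contract.render(text), parse_mode: contract.parseMode };
}

const prepared = prepare("telegramOperator", "disk usage > 90% on <db-1>");
// prepared.text === "disk usage &gt; 90% on &lt;db-1&gt;", prepared.parse_mode === "HTML"
```

Adding a new surface then means adding one contract entry and reviewing it once, rather than auditing every place that sends a message.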

Frequently Asked Questions

Q: What does "channel contract" mean in an AI agent system?

A channel contract is the set of rules that define how an agent is allowed to communicate through a specific surface. It includes formatting, safety controls, delivery timing, and what kinds of internal system output must never be shown to the human recipient. Think of it the way an API contract defines request/response schemas, except that the consumer is a human reading a chat message.

Q: Why is Telegram formatting a product concern instead of a UI detail?

Because formatting changes whether the message is understandable and trustworthy. In Telegram, parse mode, message structure, and escaping directly affect what the operator sees. A malformed message does not just look bad; it makes the operator question whether the underlying data is correct.

Q: Why disable streaming for an operator-facing chat channel?

With streaming on, the operator can receive intermediate tool calls and partial reasoning fragments before the final answer arrives. Turning it off means only the finished result is delivered, which reduces noise, improves readability, and lowers the chance of exposing internal data that does not belong in the chat.

Q: What is output injection in a chat formatting context?

Output injection happens when unescaped content is interpreted by the destination channel as markup or structure instead of plain text. In an HTML-rendered Telegram message, characters like angle brackets and ampersands must be escaped or they can break rendering and produce unintended output. It is a cousin of cross-site scripting, applied to chat rendering rather than browsers.

Q: Why use a centralized notify module instead of sending messages directly?

A notify module creates one enforcement point for delivery behavior, parse mode, escaping, and channel-specific rules. That reduces duplication, makes security reviews easier, and prevents drift where some call sites are safe and others are not.

Key Takeaways

Telegram formatting is part of the product contract, not cosmetic polish.
A centralized notify module is a practical way to enforce consistent delivery behavior.
HTML escaping is mandatory when arbitrary content flows into Telegram HTML parse mode.
Output injection is a real bug class in chat-rendered channels, even when the issue looks like "just formatting."
Streaming off is often the right choice for operator-facing channels where trust matters more than live token display.
Suppressing raw tool chatter reduces both confusion and accidental leakage of internal details.
Channel-specific rules should live in shared infrastructure, not in scattered call-site logic.

A small formatting change can reveal a larger systems truth: production agents are judged at the boundary where humans meet outputs. As more agent workflows move into chat, voice, and hybrid operational surfaces, teams that treat each channel as a formal contract will build systems that feel more reliable, more secure, and more usable than teams that treat delivery as an afterthought.