
🤖 Ghostwritten by Claude Opus 4.6 · Fact-checked & edited by GPT 5.4 · Curated by Tom Hundley
MCP Tasks address a common failure mode in agent workflows: blocking on long-running operations. Instead of holding a synchronous request open while an ETL job runs or a document converts, a task-based tool returns a handle immediately. The agent can continue working, then poll for status or subscribe to updates until the task completes, fails, or requests more input.
That is the call-now, fetch-later pattern. For production systems, it is often the difference between a resilient workflow and one that times out, retries unnecessarily, or loses visibility mid-execution. If your agents still rely on synchronous MCP tool calls for operations that can run for tens of seconds or longer, Tasks are the natural upgrade path. The core ideas are simple: return a handle immediately, track state explicitly, and retrieve results asynchronously through polling or subscriptions.
This guide covers the full implementation path: task creation, state management, polling vs. subscription tradeoffs, error handling, and production patterns we use at Elegant Software Solutions. It also builds on related ESS guidance such as agent-to-agent communication patterns and production MCP patterns.
TL;DR: Synchronous tool calls become fragile when work runs longer than normal request windows, and many real agent workflows include at least one slow step.
Consider a typical workflow: an agent receives a request to ingest a CSV dataset, transform it, and generate a summary report. The ETL step alone may take minutes, depending on data volume and downstream systems. In a synchronous model, the MCP client blocks on that tool call. The connection stays open. Timeouts become more likely. Retry logic can duplicate work if the client cannot tell whether the original request is still running.
This is a transport and systems problem more than an LLM problem. Standard request-response patterns are a poor fit for operations that may run well beyond ordinary HTTP timeouts or user-interface patience thresholds.
Synchronous blocking creates three compounding problems:
| Problem | Synchronous Impact | Tasks Solution |
|---|---|---|
| Client timeout | Connection drops, result may be lost to the caller | Handle persists independently of the original request |
| Resource lock | Connection or worker stays occupied while idle | Client is freed immediately |
| Retry storms | Retries may start duplicate work | Idempotency keys can map retries to the same task |
| Observability gap | Little visibility during execution | State transitions can be surfaced as task updates |
If you've already implemented production MCP patterns with custom retry handling and monitoring, Tasks provide a cleaner protocol-level way to solve the same class of problems.
TL;DR: MCP Tasks work best when every task moves through a small, explicit state machine that clients can handle deterministically.
A practical task model is intentionally simple. A task typically includes an identifier, a current status, optional progress metadata, and either a result, an error, or an input request depending on its state.
```typescript
interface MCPTask {
  id: string;                  // Unique task identifier
  status: TaskStatus;          // Current state
  progress?: number;           // 0-100, optional
  progressMessage?: string;    // Human-readable status
  result?: MCPToolResult;      // Available when completed
  error?: MCPError;            // Available when failed
  inputRequest?: InputRequest; // Available when input_required
  createdAt: string;           // ISO 8601
  updatedAt: string;           // ISO 8601
}

type TaskStatus =
  | 'working'
  | 'input_required'
  | 'completed'
  | 'failed'
  | 'cancelled';
```

The five states are enough for most production workflows:

- `working`: the task is actively running
- `input_required`: the task is paused until additional input arrives
- `completed`: the task finished successfully
- `failed`: the task ended with an error
- `cancelled`: the task was intentionally stopped

The `input_required` state is what makes Tasks more than a thin wrapper around a background job. When a long-running operation reaches a decision point, it can pause explicitly instead of guessing or failing. That decision might involve ambiguous schema mapping, permission confirmation, or parameter clarification.
The agent can then gather the missing input from a user, another agent, or a policy engine and resume the task. In other words, human-in-the-loop and agent-in-the-loop behavior become part of the task contract rather than an afterthought.
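The state machine above can be made explicit in code so clients handle transitions deterministically. The transition table below is our own reading of the five-state model, not something mandated by the MCP spec:

```python
# Allowed transitions between task states. Illustrative reading of the
# five-state model described above, not a normative part of the MCP spec.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "working": {"input_required", "completed", "failed", "cancelled"},
    "input_required": {"working", "cancelled"},  # resumes once input arrives
    "completed": set(),   # terminal
    "failed": set(),      # terminal
    "cancelled": set(),   # terminal
}

TERMINAL_STATES = {"completed", "failed", "cancelled"}


def transition(current: str, new: str) -> str:
    """Validate a state change; reject anything the table does not allow."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current} -> {new}")
    return new
```

Rejecting illegal transitions at a single choke point keeps state bugs (for example, a worker "completing" a cancelled task) from silently corrupting task records.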
TL;DR: A task-based tool should return a handle immediately, then expose status and result retrieval through follow-up task operations.
The exact API surface depends on the MCP SDK and version you use, so treat the following as illustrative pseudocode rather than a drop-in implementation. The important pattern is immediate task creation plus background execution.
```python
import asyncio
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class TaskRecord:
    id: str
    status: str
    progress: int = 0
    progress_message: str | None = None
    result: dict | None = None
    error: dict | None = None
    input_request: dict | None = None
    # default_factory so each record gets a fresh timestamp; a plain
    # default would be evaluated once, at class-definition time
    created_at: str = field(default_factory=_now)
    updated_at: str = field(default_factory=_now)


task_store: dict[str, TaskRecord] = {}


async def run_etl_pipeline(source_url: str, target_table: str) -> TaskRecord:
    """Starts an ETL pipeline and returns a task handle immediately."""
    task_id = str(uuid.uuid4())
    task = TaskRecord(
        id=task_id,
        status="working",
        progress=0,
        progress_message="Initializing pipeline"
    )
    task_store[task_id] = task
    # Fire and forget; production code should keep a reference to this
    # asyncio.Task so it is not garbage-collected mid-flight.
    asyncio.create_task(_execute_pipeline(task_id, source_url, target_table))
    return task


async def _execute_pipeline(task_id: str, source_url: str, target_table: str) -> None:
    task = task_store[task_id]
    try:
        task.progress = 20
        task.progress_message = "Extracting data from source"
        task.updated_at = _now()
        data = await extract_from_source(source_url)

        schema_issues = validate_schema(data, target_table)
        if schema_issues:
            task.status = "input_required"
            task.input_request = {
                "type": "schema_mapping",
                "message": f"Found {len(schema_issues)} unmapped columns",
                "fields": schema_issues
            }
            task.updated_at = _now()
            return  # Resume later when input is submitted

        task.progress = 60
        task.progress_message = "Transforming and loading"
        task.updated_at = _now()
        result = await transform_and_load(data, target_table)

        task.status = "completed"
        task.progress = 100
        task.result = {"rows_loaded": result.count, "table": target_table}
        task.updated_at = _now()
    except Exception as e:
        task.status = "failed"
        task.error = {"code": "PIPELINE_ERROR", "message": str(e)}
        task.updated_at = _now()
```

Two implementation notes matter here:
- The in-memory `task_store` keeps the sketch small. Production servers should write task records to durable storage so handles survive restarts.
- `input_required` usually requires a separate submit-input path that requeues or restarts the background work.

On the client side, the pattern is the same regardless of language: call the tool, store the task ID, then poll or subscribe until the task reaches a terminal state.
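That submit-input path might look like the following sketch. `submit_task_input` and the `resume` callback are hypothetical names; the real resume hook depends on how your pipeline checkpoints its work:

```python
import asyncio
from typing import Awaitable, Callable


async def submit_task_input(
    task_store: dict,
    task_id: str,
    user_input: dict,
    resume: Callable[[str, dict], Awaitable[None]],
) -> asyncio.Task:
    """Accept input for a paused task and requeue its background work.

    `resume` is a hypothetical continuation that picks up from the
    checkpoint where the pipeline paused.
    """
    task = task_store.get(task_id)
    if task is None:
        raise KeyError(f"Unknown task: {task_id}")
    if task.status != "input_required":
        raise ValueError(f"Task {task_id} is not awaiting input")

    task.status = "working"
    task.input_request = None
    task.progress_message = "Resuming with provided input"
    # Return the handle so the caller can keep a reference; otherwise the
    # asyncio task risks being garbage-collected before it finishes.
    return asyncio.create_task(resume(task_id, user_input))
```

Guarding on the current status matters: it makes a duplicate or late input submission fail loudly instead of restarting work on a task that already resumed.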
```typescript
async function executeETLWithTask(client: MCPClient) {
  const task = await client.callTool('run_etl_pipeline', {
    source_url: 'https://data-source.example.com/export.csv',
    target_table: 'quarterly_sales'
  });

  console.log(`Task ${task.id} started — status: ${task.status}`);

  const result = await pollUntilTerminal(client, task.id, {
    intervalMs: 2000,
    maxAttempts: 180,
    onProgress: (t) => {
      console.log(`[${t.progress}%] ${t.progressMessage}`);
    },
    onInputRequired: async (t) => {
      const mapping = await resolveSchemaMapping(t.inputRequest);
      await client.submitTaskInput(task.id, mapping);
    }
  });

  if (result.status === 'completed') {
    console.log('ETL complete:', result.result);
  } else if (result.status === 'failed') {
    console.error('ETL failed:', result.error);
  }
}
```

TL;DR: Polling is simpler and broadly compatible; subscriptions are better when your transport supports push updates and you need lower-latency progress reporting.
| Factor | Polling | Subscription |
|---|---|---|
| Transport requirement | Works anywhere the client can re-query task state | Requires a transport and server implementation that support server push |
| Implementation complexity | Low | Medium |
| Latency to state change | Up to the polling interval | Usually lower than polling |
| Server resource usage | More repeated requests | More long-lived connections or streams |
| Network chattiness | Higher | Lower for long tasks |
| Best for | Simpler deployments, broad compatibility | Long tasks, richer real-time UX |
```typescript
// Small helper assumed by the loop below.
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntilTerminal(
  client: MCPClient,
  taskId: string,
  options: PollOptions
): Promise<MCPTask> {
  const { intervalMs, maxAttempts, onProgress, onInputRequired } = options;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const task = await client.getTask(taskId);

    switch (task.status) {
      case 'working':
        onProgress?.(task);
        await sleep(intervalMs);
        break;
      case 'input_required':
        await onInputRequired?.(task);
        attempt = 0; // reset the budget once the task resumes
        await sleep(intervalMs);
        break;
      case 'completed':
      case 'failed':
      case 'cancelled':
        return task;
    }
  }

  await client.cancelTask(taskId);
  throw new Error(`Task ${taskId} exceeded max poll attempts`);
}
```

Polling is often the right default, but avoid overly aggressive intervals. A one- or two-second cadence is usually enough for user-facing progress, and longer intervals may be appropriate for back-office jobs.
If your MCP transport and server support push-style task updates, subscriptions can reduce latency and request volume. The exact mechanism varies by SDK and transport, so this example is also illustrative:
```typescript
async function subscribeToTask(
  client: MCPClient,
  taskId: string
): Promise<MCPTask> {
  return new Promise((resolve, reject) => {
    // Hard cap so an abandoned subscription cannot wait forever; cleared
    // below on every settle path so the timer does not fire afterwards.
    const timer = setTimeout(() => {
      subscription.unsubscribe();
      client.cancelTask(taskId);
      reject(new Error('Task subscription timeout'));
    }, 600_000);

    const subscription = client.subscribeToTask(taskId, {
      onStateChange: (task) => {
        console.log(`Task ${taskId}: ${task.status} (${task.progress}%)`);
        if (task.status === 'completed') {
          clearTimeout(timer);
          subscription.unsubscribe();
          resolve(task);
        } else if (task.status === 'failed' || task.status === 'cancelled') {
          clearTimeout(timer);
          subscription.unsubscribe();
          reject(new Error(task.error?.message ?? `Task ${task.status}`));
        }
      },
      onError: (err) => {
        clearTimeout(timer);
        subscription.unsubscribe();
        reject(err);
      }
    });
  });
}
```

If you're running MCP servers with OAuth 2.1 token lifecycle patterns, make sure token refresh logic accounts for long-lived task monitoring. A task can easily outlast a short access-token lifetime.
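One way to keep a long-lived monitor authenticated is to refresh proactively before each poll. This is a minimal sketch; `fetch_token` stands in for whatever performs the actual OAuth 2.1 refresh in your stack:

```python
import time


class RefreshingTokenSource:
    """Return a valid access token, refreshing when it nears expiry.

    `fetch_token` is a hypothetical callable that performs the OAuth 2.1
    refresh and returns (access_token, expires_at_epoch_seconds).
    """

    def __init__(self, fetch_token, skew_seconds: float = 60.0):
        self._fetch_token = fetch_token
        self._skew = skew_seconds
        self._token: str | None = None
        self._expires_at: float = 0.0

    def get(self) -> str:
        # Refresh ahead of expiry so a poll never fires with a stale token.
        if self._token is None or time.time() >= self._expires_at - self._skew:
            self._token, self._expires_at = self._fetch_token()
        return self._token
```

Calling `source.get()` before each `getTask` request means a task that runs for an hour never trips over a five-minute access token.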
TL;DR: Reliable MCP Tasks depend on idempotency, durable state, timeout handling, and explicit treatment of paused tasks.
The protocol gives you the primitive. Production resilience comes from the surrounding system design.
Prevent duplicate task creation when clients retry after a network interruption:
```python
# Module-level index mapping idempotency keys to existing task IDs.
idempotency_index: dict[str, str] = {}


async def run_etl_pipeline(
    source_url: str,
    target_table: str,
    idempotency_key: str | None = None
) -> TaskRecord:
    # A retried request with a known key gets the original task handle
    # back instead of spawning duplicate work.
    if idempotency_key and idempotency_key in idempotency_index:
        return task_store[idempotency_index[idempotency_key]]

    task = create_new_task()
    if idempotency_key:
        idempotency_index[idempotency_key] = task.id
    return task
```

This is especially important when the client cannot tell whether the original request failed before task creation or after it.
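The client side of the contract is generating the key once and reusing it across retries. A sketch, where `call_tool` is a hypothetical async stand-in for your MCP client call:

```python
import uuid


async def start_task_with_retry(call_tool, params: dict, max_attempts: int = 3):
    """Retry task creation with one idempotency key across all attempts.

    Because every retry carries the same key, the server can map
    duplicates back to the original task instead of starting new work.
    """
    key = str(uuid.uuid4())  # generated once, reused on every retry
    last_error: Exception | None = None
    for _ in range(max_attempts):
        try:
            return await call_tool({**params, "idempotency_key": key})
        except ConnectionError as e:  # retry only transient transport errors
            last_error = e
    raise last_error
```

The common mistake is generating a fresh key inside the retry loop, which silently defeats the deduplication the server implements.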
Tasks that remain in working for too long may be orphaned by worker crashes or lost callbacks. A periodic sweeper can mark them failed and surface the issue clearly.
```python
import logging
from datetime import datetime, timezone

logger = logging.getLogger(__name__)


async def sweep_stale_tasks(max_age_seconds: int = 3600):
    """Transition orphaned tasks to failed state."""
    now = datetime.now(timezone.utc)
    for task_id, task in task_store.items():
        if task.status == "working":
            updated = datetime.fromisoformat(task.updated_at)
            age = (now - updated).total_seconds()
            if age > max_age_seconds:
                task.status = "failed"
                task.error = {
                    "code": "TASK_TIMEOUT",
                    "message": f"Task stale for {age:.0f}s"
                }
                task.updated_at = now.isoformat()
                logger.warning(f"Swept stale task {task_id}")
```

Polling every 100 milliseconds is rarely justified and can overload the server. Use a capped backoff strategy instead:
```typescript
function backoffInterval(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(baseMs * Math.pow(1.5, attempt), maxMs);
}
```

If your client does not handle `input_required`, tasks can stall indefinitely. Every client should define a policy for that state, even if the policy is to cancel the task and log the reason.
```typescript
case 'input_required':
  if (!onInputRequired) {
    logger.error(`Task ${taskId} requires input but no handler registered`);
    await client.cancelTask(taskId);
    throw new Error('Unhandled input_required state');
  }
  break;
```

In-memory task stores are fine for demos, but production systems should persist task metadata and state transitions to durable storage. On restart, the server should reconcile any tasks left in `working` and either resume them safely or mark them failed with a clear recovery code.
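Startup reconciliation can be sketched as a pure function over the persisted records. `can_resume` is a hypothetical predicate that checks whether a task's checkpoint allows safe resumption:

```python
def reconcile_on_startup(rows: list[dict], can_resume) -> tuple[list[str], list[dict]]:
    """Split persisted tasks left in 'working' into resumable and failed.

    `rows` are task records loaded from durable storage after a restart.
    Failed records get a recovery-specific error code so callers can tell
    a crash apart from an ordinary pipeline error.
    """
    to_resume: list[str] = []
    failed: list[dict] = []
    for row in rows:
        if row["status"] != "working":
            continue
        if can_resume(row):
            to_resume.append(row["id"])
        else:
            row["status"] = "failed"
            row["error"] = {
                "code": "SERVER_RESTART",
                "message": "Server restarted before the task completed",
            }
            failed.append(row)
    return to_resume, failed
```

Terminal tasks pass through untouched; only in-flight work needs a decision at boot.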
MCP Tasks are a protocol-facing abstraction. The client interacts with a task through the MCP interface rather than directly through queue infrastructure. Under the hood, your server may still use Celery, Bull, or another job system to execute the work. The difference is that the agent sees a consistent task contract instead of queue-specific mechanics.
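Bridging the two usually means translating queue-side states into the task contract. A sketch using Celery-style state names (the mapping itself is an illustrative choice, not part of MCP):

```python
# Illustrative mapping from Celery-style job states to the MCP task
# contract. Queue-side names follow Celery's documented task states.
QUEUE_TO_MCP_STATUS = {
    "PENDING": "working",   # queued but not yet started
    "STARTED": "working",
    "RETRY": "working",     # the queue retries internally; still one task
    "SUCCESS": "completed",
    "FAILURE": "failed",
    "REVOKED": "cancelled",
}


def to_mcp_status(queue_state: str) -> str:
    """Translate a queue-specific state into the MCP task status."""
    return QUEUE_TO_MCP_STATUS.get(queue_state, "working")
```

Note that several queue states collapse into `working`: the agent does not need to know about internal retries, only whether the task is still in flight.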
Task creation and polling fit naturally with any transport that lets the client make follow-up requests. Push-style subscriptions depend on transport and SDK support for server-initiated updates, so they are more transport-specific. If you are running local agents over stdio, polling is usually the simpler option.
Whether a task survives a server crash depends on where task state lives. If state is only in memory, a crash can orphan the task entirely. If state is persisted, the server can recover the task record on restart and either resume execution or mark it failed with a recovery-specific error code. Durable state is the difference between graceful recovery and silent loss.
Even fully autonomous systems need an explicit routing policy for input_required. Some tasks should escalate to a human. Others can route to a policy engine or a specialist agent. The important design choice is to make that routing deterministic and auditable rather than letting the task sit indefinitely.
There is no universal fixed limit on how many tasks a server can run concurrently. Capacity depends on memory, worker concurrency, downstream dependencies, and how much state you retain per task. In practice, you should implement admission control, monitor queue depth and task age, and reject or defer new work before the system becomes unstable.
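Admission control can start as something very simple. A minimal sketch; a production version would also watch queue depth, task age, and downstream backpressure signals:

```python
class AdmissionController:
    """Reject new task creation once active work crosses a configured limit."""

    def __init__(self, max_active: int):
        self.max_active = max_active
        self.active = 0

    def try_admit(self) -> bool:
        if self.active >= self.max_active:
            return False  # caller should return a retriable 'busy' error
        self.active += 1
        return True

    def release(self) -> None:
        # Called when a task reaches a terminal state.
        self.active = max(0, self.active - 1)
```

Refusing work early with a retriable error is almost always better than accepting tasks the server cannot finish.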
Handle `input_required` deliberately, or tasks will hang in ways that are difficult to debug.

MCP Tasks mark a practical shift from prototype agent systems to production-ready workflows. The call-now, fetch-later pattern is not just a convenience. It is a better fit for operations that take time, require approval, or need to survive transient failures without tying up the original request.
If your team is building agent workflows for ETL, document processing, code generation, or any other operation that cannot reliably finish within a short synchronous request window, Tasks are worth adopting. The pattern itself is straightforward. The engineering work is in the surrounding details: idempotency, recovery, transport-aware retrieval, and clear state transitions.
Elegant Software Solutions helps development teams implement production MCP patterns, including task-based workflows, OAuth lifecycle management, and multi-agent orchestration. If you're ready to move from synchronous prototypes to resilient async agent workflows, schedule a technical conversation with our team.