🤖 Ghostwritten by Claude Opus 4.5 · Curated by Tom Hundley
When your enterprise runs on Azure and .NET, Semantic Kernel is your native path to production RAG.
Microsoft's Semantic Kernel occupies a unique position in the AI framework landscape. While LangChain and LlamaIndex dominate the Python-first world, Semantic Kernel provides first-class .NET support alongside Python, making it the natural choice for enterprises already invested in the Microsoft stack.
But Semantic Kernel is more than just "LangChain for C#." It represents Microsoft's vision for AI orchestration: a kernel-based architecture where AI capabilities are plugins that can be composed, planned, and executed with the same patterns used throughout enterprise .NET development.
If you read Part 1 of this series, you understand that RAG combines retrieval with generation. Semantic Kernel implements this through its Memory and Plugin systems, with deep integration into Azure AI services like Azure OpenAI, Azure AI Search, and Azure Cosmos DB.
Important Context: On October 1, 2025, Microsoft announced the Microsoft Agent Framework (MAF), merging Semantic Kernel and AutoGen into a unified platform for building AI agents. Think of MAF as "Semantic Kernel v2.0" - built by the same team.
What this means for RAG development: Semantic Kernel's memory and connector ecosystem remains the recommended approach.
API Note: This article uses the `MemoryBuilder` and `ISemanticTextMemory` abstractions, which still work but are now considered legacy. Microsoft recommends migrating to the newer Vector Store abstractions (`Microsoft.Extensions.VectorData.Abstractions`) for new projects. The legacy APIs shown here are simpler for learning, but the Vector Store pattern offers more flexibility (custom schemas, metadata pre-filtering, multiple vectors per record). See the Migration Guide for details.
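One concrete reason to prefer the Vector Store abstractions: the legacy API carries all metadata as a single `additionalMetadata` string, so this article falls back on a `key=value;key=value` convention. A stdlib-Python sketch of that round trip (the helper names are hypothetical, not part of Semantic Kernel):

```python
def encode_metadata(fields: dict[str, str]) -> str:
    """Flatten a metadata dict into the key=value;key=value convention."""
    return ";".join(f"{k}={v}" for k, v in fields.items())

def parse_metadata(metadata: str) -> dict[str, str]:
    """Parse the delimited string back into a dict, skipping malformed parts."""
    pairs = (part.split("=", 1) for part in metadata.split(";") if part)
    return {p[0]: p[1] for p in pairs if len(p) == 2}

encoded = encode_metadata({"source": "hr_handbook", "section": "benefits"})
# encoded == "source=hr_handbook;section=benefits"
parsed = parse_metadata(encoded)
# parsed["source"] == "hr_handbook"
```

The Vector Store abstractions replace this convention with typed record properties that the store can filter on directly.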
Here is how Semantic Kernel compares with the other frameworks covered in this series:
| Aspect | LangChain | LlamaIndex | Haystack | Semantic Kernel |
|---|---|---|---|---|
| Primary language | Python | Python | Python | C# (Python available) |
| Core abstraction | Chains/Runnables | Indexes | Pipelines (DAG) | Kernel + Plugins |
| Cloud provider | Agnostic | Agnostic | Agnostic | Azure-optimized |
| Memory model | Various | Index-based | Document stores | Unified Memory interface |
| Target audience | AI/ML teams | Knowledge workers | Enterprise teams | .NET enterprises |
# Create a new .NET project
dotnet new console -n SemanticKernelRAG
cd SemanticKernelRAG
# Add Semantic Kernel core
dotnet add package Microsoft.SemanticKernel
# Add memory connectors (choose based on your storage)
dotnet add package Microsoft.SemanticKernel.Connectors.AzureAISearch
dotnet add package Microsoft.SemanticKernel.Connectors.AzureCosmosDBMongoDB
dotnet add package Microsoft.SemanticKernel.Connectors.Qdrant
# Add OpenAI/Azure OpenAI connectors
dotnet add package Microsoft.SemanticKernel.Connectors.AzureOpenAI
dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI

# Core Semantic Kernel
pip install semantic-kernel
# Azure integrations
pip install semantic-kernel[azure]
# Or individual connectors
pip install semantic-kernel-connectors-azure-ai-search
pip install semantic-kernel-connectors-qdrant

A well-organized Semantic Kernel project:
SemanticKernelRAG/
├── Program.cs                      # Entry point and Kernel configuration
├── appsettings.json                # Configuration (connection strings, keys)
├── appsettings.Development.json    # Development overrides
├── Plugins/
│   ├── DocumentPlugin.cs           # Document ingestion plugin
│   ├── SearchPlugin.cs             # Search and retrieval plugin
│   └── GenerationPlugin.cs         # Response generation plugin
├── Memory/
│   ├── IMemoryService.cs           # Memory abstraction
│   └── AzureSearchMemory.cs        # Azure AI Search implementation
├── Models/
│   ├── Document.cs                 # Document model
│   └── SearchResult.cs             # Search result model
└── Services/
    └── RAGService.cs               # Orchestration service

Create appsettings.json:
{
"AzureOpenAI": {
"Endpoint": "https://your-resource.openai.azure.com/",
"DeploymentName": "gpt-4o",
"EmbeddingDeploymentName": "text-embedding-3-small",
"ApiKey": ""
},
"AzureAISearch": {
"Endpoint": "https://your-search.search.windows.net",
"IndexName": "rag-documents",
"ApiKey": ""
},
"AzureCosmosDB": {
"ConnectionString": "",
"DatabaseName": "rag-db",
"ContainerName": "documents"
}
}

For production, use Azure Key Vault or environment variables instead of storing keys in configuration files.
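That precedence (environment variable first, config file as fallback) is easy to sketch. The following stdlib-Python fragment is illustrative only; `load_setting` and the variable names are hypothetical, not part of Semantic Kernel:

```python
import json
import os

def load_setting(config: dict, env_var: str, *path: str, default=None):
    """Prefer an environment variable; fall back to the parsed config file."""
    value = os.environ.get(env_var)
    if value:
        return value
    node = config
    for key in path:  # walk nested keys, e.g. ("AzureOpenAI", "ApiKey")
        node = node.get(key, {}) if isinstance(node, dict) else {}
    return node if isinstance(node, str) and node else default

config = json.loads(
    '{"AzureOpenAI": {"Endpoint": "https://example.openai.azure.com/", "ApiKey": ""}}'
)
endpoint = load_setting(config, "AZURE_OPENAI_ENDPOINT", "AzureOpenAI", "Endpoint")
# ApiKey is blank in the file, so only the environment variable can supply it
api_key = load_setting(config, "AZURE_OPENAI_API_KEY", "AzureOpenAI", "ApiKey")
```

In .NET the same layering comes for free from `ConfigurationBuilder` (JSON file, then environment variables, then Key Vault), which is why the appsettings.json above leaves the key fields empty.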
Semantic Kernel's architecture revolves around four key abstractions that work together to enable AI applications.
The Kernel is the central orchestrator. It holds references to AI services, plugins, and memory, coordinating their interaction.
using Microsoft.SemanticKernel;
// Build the kernel
var builder = Kernel.CreateBuilder();
// Add AI services
builder.AddAzureOpenAIChatCompletion(
deploymentName: "gpt-4o",
endpoint: "https://your-resource.openai.azure.com/",
apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
);
builder.AddAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: "https://your-resource.openai.azure.com/",
apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
);
var kernel = builder.Build();

The Kernel is designed to be dependency-injected in ASP.NET applications:
// In Program.cs or Startup.cs
builder.Services.AddKernel()
.AddAzureOpenAIChatCompletion(
deploymentName: configuration["AzureOpenAI:DeploymentName"],
endpoint: configuration["AzureOpenAI:Endpoint"],
apiKey: configuration["AzureOpenAI:ApiKey"]
);

Plugins are collections of functions that extend the Kernel's capabilities. For RAG, you will create plugins for document processing, search, and generation.
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Memory;
using System.ComponentModel;

public class SearchPlugin
{
    // ISemanticTextMemory accepts text queries and embeds them for you;
    // the lower-level IMemoryStore works directly with embedding vectors
    private readonly ISemanticTextMemory _memory;

    public SearchPlugin(ISemanticTextMemory memory)
    {
        _memory = memory;
    }

    [KernelFunction("search_documents")]
    [Description("Search the knowledge base for relevant documents")]
    public async Task<string> SearchDocumentsAsync(
        [Description("The search query")] string query,
        [Description("Maximum number of results")] int maxResults = 5)
    {
        // ToListAsync comes from the System.Linq.Async package
        var results = await _memory.SearchAsync(
            collection: "documents",
            query: query,
            limit: maxResults
        ).ToListAsync();

        return string.Join("\n\n---\n\n",
            results.Select(r => r.Metadata.Text));
    }
}

// Register the plugin with the kernel
kernel.Plugins.AddFromObject(new SearchPlugin(memory), "Search");

Plugins can be invoked directly or used by the AI to plan and execute complex tasks.
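Conceptually, the kernel is a registry of named plugins, each exposing functions with model-readable descriptions that can be invoked by name. Stripped of the SK types, the pattern looks roughly like this (a toy stdlib-Python sketch, not the Semantic Kernel API):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Plugin:
    """A named collection of functions, each with a model-readable description."""
    name: str
    functions: dict[str, tuple[str, Callable]] = field(default_factory=dict)

    def add(self, fn_name: str, description: str, fn: Callable) -> None:
        self.functions[fn_name] = (description, fn)

@dataclass
class Kernel:
    plugins: dict[str, Plugin] = field(default_factory=dict)

    def invoke(self, plugin: str, fn_name: str, **kwargs):
        # Look up the function by plugin and name, then call it
        _, fn = self.plugins[plugin].functions[fn_name]
        return fn(**kwargs)

kernel = Kernel()
search = Plugin("Search")
search.add("search_documents", "Search the knowledge base",
           lambda query, max_results=5: f"results for {query!r}")
kernel.plugins["Search"] = search

print(kernel.invoke("Search", "search_documents", query="vacation policy"))
# prints: results for 'vacation policy'
```

The `[KernelFunction]` and `[Description]` attributes in the C# version play the role of the name and description strings here: they are the metadata the model sees when deciding which function to call.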
Memory in Semantic Kernel provides a unified interface for storing and retrieving information using embeddings. It abstracts away the underlying vector database.
using Microsoft.SemanticKernel.Memory;
// Create a memory builder
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: azureOpenAIEndpoint,
apiKey: azureOpenAIApiKey
);
// Choose your storage backend
memoryBuilder.WithAzureAISearchMemoryStore(
endpoint: searchEndpoint,
apiKey: searchApiKey
);
var memory = memoryBuilder.Build();

The memory interface is simple:
// Store a document
await memory.SaveInformationAsync(
collection: "documents",
id: "doc-001",
text: "Our vacation policy allows 20 days PTO per year...",
description: "HR Vacation Policy",
additionalMetadata: "source=hr_handbook;section=benefits"
);
// Search for relevant documents
var results = memory.SearchAsync(
collection: "documents",
query: "How many vacation days do I get?",
limit: 5,
minRelevanceScore: 0.7
);
await foreach (var result in results)
{
Console.WriteLine($"Relevance: {result.Relevance:P2}");
Console.WriteLine($"Text: {result.Metadata.Text}");
}

Semantic functions are prompt templates that can be parameterized and composed. They are central to RAG's generation step.
using Microsoft.SemanticKernel;
// Define a prompt template
string promptTemplate = """
You are a helpful assistant. Answer the user's question based on the provided context.
If the context doesn't contain the answer, say "I don't have information about that."
## Context
{{$context}}
## Question
{{$question}}
## Answer
""";
// Create a semantic function
var ragFunction = kernel.CreateFunctionFromPrompt(
promptTemplate,
new OpenAIPromptExecutionSettings
{
MaxTokens = 1000,
Temperature = 0.7
}
);
// Execute it
var result = await kernel.InvokeAsync(ragFunction, new KernelArguments
{
["context"] = retrievedContext,
["question"] = userQuestion
});
Console.WriteLine(result.GetValue<string>());

Azure AI Search (formerly Azure Cognitive Search) is Microsoft's enterprise search service and the most common choice for production RAG in the Azure ecosystem.
using Microsoft.SemanticKernel.Connectors.AzureAISearch;
using Microsoft.SemanticKernel.Memory;
using Azure;
using Azure.Search.Documents.Indexes;
// Configure the memory store
var searchIndexClient = new SearchIndexClient(
new Uri(searchEndpoint),
new AzureKeyCredential(searchApiKey)
);
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
// Index will be created if it doesn't exist
VectorSize = 1536 // Must match your embedding model
}
);
// Build memory with embedding generation
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: azureOpenAIEndpoint,
apiKey: azureOpenAIApiKey
);
memoryBuilder.WithMemoryStore(memoryStore);
var memory = memoryBuilder.Build();

public class DocumentIndexer
{
private readonly ISemanticTextMemory _memory;
private readonly string _collection;
public DocumentIndexer(ISemanticTextMemory memory, string collection = "documents")
{
_memory = memory;
_collection = collection;
}
public async Task IndexDocumentAsync(
string id,
string content,
string source,
Dictionary<string, string>? metadata = null)
{
// Chunk the document
var chunks = ChunkDocument(content);
for (int i = 0; i < chunks.Count; i++)
{
var chunkId = $"{id}_chunk_{i}";
var additionalMetadata = metadata != null
? string.Join(";", metadata.Select(kv => $"{kv.Key}={kv.Value}"))
: "";
await _memory.SaveInformationAsync(
collection: _collection,
id: chunkId,
text: chunks[i],
description: $"Chunk {i + 1} of {chunks.Count} from {source}",
additionalMetadata: $"source={source};chunk={i};{additionalMetadata}"
);
}
}
private List<string> ChunkDocument(string content, int chunkSize = 500, int overlap = 50)
{
var chunks = new List<string>();
var words = content.Split(' ', StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < words.Length; i += chunkSize - overlap)
{
var chunk = string.Join(" ", words.Skip(i).Take(chunkSize));
if (!string.IsNullOrWhiteSpace(chunk))
{
chunks.Add(chunk);
}
}
return chunks;
}
}

Here is a complete RAG service that ties everything together:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Memory;
using System.Text; // for StringBuilder
public class RAGService
{
private readonly Kernel _kernel;
private readonly ISemanticTextMemory _memory;
private readonly KernelFunction _ragFunction;
private readonly string _collection;
public RAGService(
Kernel kernel,
ISemanticTextMemory memory,
string collection = "documents")
{
_kernel = kernel;
_memory = memory;
_collection = collection;
// Create the RAG prompt function
_ragFunction = CreateRagFunction();
}
private KernelFunction CreateRagFunction()
{
string promptTemplate = """
You are a knowledgeable assistant. Answer the user's question using
the provided context. Follow these guidelines:
1. Only use information from the provided context
2. If the context doesn't contain the answer, say so clearly
3. Cite your sources by mentioning the source document
4. Be concise but thorough
## Context
{{$context}}
## Question
{{$question}}
## Answer
""";
return _kernel.CreateFunctionFromPrompt(
promptTemplate,
new OpenAIPromptExecutionSettings
{
MaxTokens = 1000,
Temperature = 0.3 // Lower temperature for factual responses
}
);
}
public async Task<RAGResponse> QueryAsync(
string question,
int maxResults = 5,
double minRelevance = 0.5)
{
// Step 1: Retrieve relevant documents
var searchResults = new List<MemoryQueryResult>();
await foreach (var result in _memory.SearchAsync(
collection: _collection,
query: question,
limit: maxResults,
minRelevanceScore: minRelevance))
{
searchResults.Add(result);
}
if (searchResults.Count == 0)
{
return new RAGResponse
{
Answer = "I couldn't find any relevant information in the knowledge base.",
Sources = Array.Empty<string>(),
Confidence = 0
};
}
// Step 2: Build context from retrieved documents
var context = BuildContext(searchResults);
// Step 3: Generate response
var result = await _kernel.InvokeAsync(_ragFunction, new KernelArguments
{
["context"] = context,
["question"] = question
});
// Extract sources
var sources = searchResults
.Select(r => r.Metadata.AdditionalMetadata)
.Where(m => m != null)
.Select(m => ExtractSource(m!))
.Distinct()
.ToArray();
// Calculate average relevance as confidence
var avgRelevance = searchResults.Average(r => r.Relevance);
return new RAGResponse
{
Answer = result.GetValue<string>() ?? "Unable to generate response.",
Sources = sources,
Confidence = avgRelevance,
RetrievedDocuments = searchResults.Count
};
}
private string BuildContext(List<MemoryQueryResult> results)
{
var contextBuilder = new StringBuilder();
foreach (var result in results)
{
var source = ExtractSource(result.Metadata.AdditionalMetadata ?? "");
contextBuilder.AppendLine($"[Source: {source}]");
contextBuilder.AppendLine(result.Metadata.Text);
contextBuilder.AppendLine("---");
}
return contextBuilder.ToString();
}
private string ExtractSource(string metadata)
{
var parts = metadata.Split(';')
.Select(p => p.Split('='))
.Where(p => p.Length == 2)
.ToDictionary(p => p[0], p => p[1]);
return parts.GetValueOrDefault("source", "Unknown");
}
}
public class RAGResponse
{
public string Answer { get; set; } = "";
public string[] Sources { get; set; } = Array.Empty<string>();
public double Confidence { get; set; }
public int RetrievedDocuments { get; set; }
}

Azure Cosmos DB for MongoDB vCore includes native vector search capabilities, making it excellent for RAG when you already use Cosmos DB for your application data.
using Microsoft.SemanticKernel.Connectors.AzureCosmosDBMongoDB;
using MongoDB.Driver;
// Connect to Cosmos DB MongoDB vCore
var mongoClient = new MongoClient(cosmosConnectionString);
var database = mongoClient.GetDatabase("rag-database");
var memoryStore = new AzureCosmosDBMongoDBMemoryStore(
mongoClient,
"rag-database",
new AzureCosmosDBMongoDBConfig
{
// Vector index configuration
IndexName = "vector_index",
VectorDimensions = 1536,
SimilarityAlgorithm = CosmosDBSimilarityAlgorithm.Cosine
}
);
// Build memory
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: azureOpenAIEndpoint,
apiKey: azureOpenAIApiKey
);
memoryBuilder.WithMemoryStore(memoryStore);
var memory = memoryBuilder.Build();

Before using vector search, create a vector index in your Cosmos DB collection:
// Run in MongoDB shell or Azure Data Studio
db.runCommand({
createIndexes: "documents",
indexes: [
{
name: "vector_index",
key: { embedding: "cosmosSearch" },
cosmosSearchOptions: {
kind: "vector-hnsw",
m: 16,
efConstruction: 64,
similarity: "COS",
dimensions: 1536
}
}
]
});

Here is a production-ready RAG application that demonstrates all the concepts:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Connectors.AzureAISearch;
using Azure;
using Azure.Search.Documents.Indexes;
using System.Text;
namespace SemanticKernelRAG;
public class Program
{
public static async Task Main(string[] args)
{
// Load configuration
var config = LoadConfiguration();
// Build the kernel
var kernel = BuildKernel(config);
// Build memory with Azure AI Search
var memory = await BuildMemoryAsync(config);
// Create RAG service
var ragService = new RAGService(kernel, memory);
// Example: Index some documents
await IndexSampleDocumentsAsync(memory);
// Example: Query the knowledge base
var response = await ragService.QueryAsync(
"What is our vacation policy?"
);
Console.WriteLine($"Answer: {response.Answer}");
Console.WriteLine($"Sources: {string.Join(", ", response.Sources)}");
Console.WriteLine($"Confidence: {response.Confidence:P2}");
}
private static RAGConfiguration LoadConfiguration()
{
return new RAGConfiguration
{
AzureOpenAIEndpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT not set"),
AzureOpenAIApiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
?? throw new InvalidOperationException("AZURE_OPENAI_API_KEY not set"),
ChatDeploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_CHAT_DEPLOYMENT")
?? "gpt-4o",
EmbeddingDeploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_EMBEDDING_DEPLOYMENT")
?? "text-embedding-3-small",
AzureSearchEndpoint = Environment.GetEnvironmentVariable("AZURE_SEARCH_ENDPOINT")
?? throw new InvalidOperationException("AZURE_SEARCH_ENDPOINT not set"),
AzureSearchApiKey = Environment.GetEnvironmentVariable("AZURE_SEARCH_API_KEY")
?? throw new InvalidOperationException("AZURE_SEARCH_API_KEY not set")
};
}
private static Kernel BuildKernel(RAGConfiguration config)
{
var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
deploymentName: config.ChatDeploymentName,
endpoint: config.AzureOpenAIEndpoint,
apiKey: config.AzureOpenAIApiKey
);
return builder.Build();
}
private static async Task<ISemanticTextMemory> BuildMemoryAsync(RAGConfiguration config)
{
var searchIndexClient = new SearchIndexClient(
new Uri(config.AzureSearchEndpoint),
new AzureKeyCredential(config.AzureSearchApiKey)
);
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
VectorSize = 1536
}
);
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: config.EmbeddingDeploymentName,
endpoint: config.AzureOpenAIEndpoint,
apiKey: config.AzureOpenAIApiKey
);
memoryBuilder.WithMemoryStore(memoryStore);
return memoryBuilder.Build();
}
private static async Task IndexSampleDocumentsAsync(ISemanticTextMemory memory)
{
var documents = new[]
{
new
{
Id = "hr-vacation",
Content = @"
Company Vacation Policy
All full-time employees are entitled to 20 days of paid time off (PTO)
per calendar year. PTO accrues at a rate of 1.67 days per month.
Unused PTO can be carried over to the next year, up to a maximum of
5 days. PTO requests must be submitted at least 2 weeks in advance
for periods longer than 3 consecutive days.
",
Source = "HR Handbook - Benefits"
},
new
{
Id = "hr-remote",
Content = @"
Remote Work Guidelines
Employees may work remotely up to 3 days per week with manager approval.
Remote work days must be scheduled in advance and logged in the HR system.
Employees are expected to be available during core hours (10 AM - 3 PM)
regardless of work location. Home office equipment stipends of up to $500
are available for eligible employees.
",
Source = "HR Handbook - Remote Work"
},
new
{
Id = "hr-expenses",
Content = @"
Expense Reimbursement Policy
Business expenses must be submitted within 30 days of the expense date.
Receipts are required for all expenses over $25. Meals during travel are
reimbursed up to $75 per day. Flights should be booked through the
company travel portal. Personal expenses are not eligible for reimbursement.
",
Source = "HR Handbook - Expenses"
}
};
foreach (var doc in documents)
{
await memory.SaveInformationAsync(
collection: "documents",
id: doc.Id,
text: doc.Content.Trim(),
description: doc.Source,
additionalMetadata: $"source={doc.Source}"
);
}
Console.WriteLine($"Indexed {documents.Length} documents");
}
}
public class RAGConfiguration
{
public string AzureOpenAIEndpoint { get; set; } = "";
public string AzureOpenAIApiKey { get; set; } = "";
public string ChatDeploymentName { get; set; } = "";
public string EmbeddingDeploymentName { get; set; } = "";
public string AzureSearchEndpoint { get; set; } = "";
public string AzureSearchApiKey { get; set; } = "";
}

For teams that prefer Python, Semantic Kernel provides a similar API:
import asyncio
import os
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
AzureChatCompletion,
AzureTextEmbedding,
)
from semantic_kernel.connectors.memory.azure_ai_search import AzureAISearchMemoryStore
from semantic_kernel.memory import SemanticTextMemory
async def main():
# Create the kernel
kernel = Kernel()
# Add Azure OpenAI chat service
kernel.add_service(
AzureChatCompletion(
deployment_name="gpt-4o",
endpoint="https://your-resource.openai.azure.com/",
api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
)
# Create memory with Azure AI Search
embedding_service = AzureTextEmbedding(
deployment_name="text-embedding-3-small",
endpoint="https://your-resource.openai.azure.com/",
api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
memory_store = AzureAISearchMemoryStore(
search_endpoint="https://your-search.search.windows.net",
admin_key=os.environ["AZURE_SEARCH_API_KEY"],
vector_size=1536,
)
memory = SemanticTextMemory(
storage=memory_store,
embeddings_generator=embedding_service,
)
# Save a document
await memory.save_information(
collection="documents",
id="doc-001",
text="Our vacation policy allows 20 days PTO per year...",
description="HR Vacation Policy",
)
# Search for relevant documents
results = await memory.search(
collection="documents",
query="How many vacation days do I get?",
limit=5,
min_relevance_score=0.5,
)
for result in results:
print(f"Relevance: {result.relevance:.2%}")
print(f"Text: {result.text}")
# Create a RAG prompt
prompt = """
Answer the question based on the provided context.
Context:
{{$context}}
Question: {{$question}}
Answer:
"""
rag_function = kernel.add_function(
plugin_name="RAG",
function_name="answer",
prompt=prompt,
)
# Execute RAG
context = "\n".join([r.text for r in results])
result = await kernel.invoke(
rag_function,
context=context,
question="How many vacation days do I get?",
)
print(f"Answer: {result}")
if __name__ == "__main__":
asyncio.run(main())

When deploying RAG in enterprise environments, security is paramount:
// Use Managed Identity instead of API keys
using Azure.Identity;
var credential = new DefaultAzureCredential();
// Azure OpenAI with Managed Identity
builder.AddAzureOpenAIChatCompletion(
deploymentName: "gpt-4o",
endpoint: azureOpenAIEndpoint,
credential: credential
);
// Azure AI Search with Managed Identity
var searchClient = new SearchClient(
new Uri(searchEndpoint),
"documents",
credential
);

Implement document-level access control:
public class SecureRAGService
{
private readonly ISemanticTextMemory _memory;
private readonly Kernel _kernel;
public async Task<RAGResponse> QueryWithAccessControlAsync(
string question,
ClaimsPrincipal user)
{
// Get user's groups/roles
var userGroups = user.Claims
.Where(c => c.Type == "groups")
.Select(c => c.Value)
.ToHashSet();
// Search with metadata filter
var allResults = new List<MemoryQueryResult>();
await foreach (var result in _memory.SearchAsync(
collection: "documents",
query: question,
limit: 20)) // Get more than needed
{
// Filter by access control
var docGroups = ExtractGroups(result.Metadata.AdditionalMetadata);
if (docGroups.Any(g => userGroups.Contains(g)))
{
allResults.Add(result);
}
}
// Take top results after filtering
var filteredResults = allResults.Take(5).ToList();
// Continue with generation as in RAGService.QueryAsync:
// build context from filteredResults and invoke the RAG function.
throw new NotImplementedException("Generation step omitted for brevity.");
}
private IEnumerable<string> ExtractGroups(string metadata)
{
// Parse groups from metadata
var parts = metadata.Split(';');
var groupsPart = parts.FirstOrDefault(p => p.StartsWith("groups="));
return groupsPart?.Substring(7).Split(',') ?? Array.Empty<string>();
}
}

Integrate with Azure Application Insights:
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;
public class InstrumentedRAGService
{
private readonly RAGService _ragService;
private readonly TelemetryClient _telemetry;
public async Task<RAGResponse> QueryAsync(string question)
{
using var operation = _telemetry.StartOperation<RequestTelemetry>("RAG Query");
try
{
var startTime = DateTime.UtcNow;
var response = await _ragService.QueryAsync(question);
var duration = DateTime.UtcNow - startTime;
// Track metrics
_telemetry.TrackMetric("RAG.QueryDuration", duration.TotalMilliseconds);
_telemetry.TrackMetric("RAG.RetrievedDocuments", response.RetrievedDocuments);
_telemetry.TrackMetric("RAG.Confidence", response.Confidence);
// Track custom event
_telemetry.TrackEvent("RAGQuery", new Dictionary<string, string>
{
["QuestionLength"] = question.Length.ToString(),
["SourceCount"] = response.Sources.Length.ToString()
});
return response;
}
catch (Exception ex)
{
_telemetry.TrackException(ex);
throw;
}
}
}

Semantic Kernel 1.x introduced TextSearchProvider as a standardized abstraction for text search operations, making it easier to swap between different search backends.
using Microsoft.SemanticKernel.Data;
// Create a text search provider
var textSearch = new AzureAISearchTextSearch(
searchClient,
new AzureAISearchTextSearchOptions
{
// Configure search behavior
SelectFields = new[] { "content", "title", "source" },
SearchFields = new[] { "content" }
}
);
// Use in a function
[KernelFunction("search")]
public async Task<IEnumerable<TextSearchResult>> SearchAsync(
[Description("Search query")] string query)
{
var options = new TextSearchOptions
{
Top = 5,
Skip = 0
};
var results = await textSearch.SearchAsync(query, options);
return results;
}

Semantic Kernel uses Handlebars-style templates. Special characters need escaping:
// WRONG: Curly braces in output can break template parsing
string template = "Output as JSON: {\"key\": \"value\"}";
// CORRECT: Escape curly braces or use raw strings carefully
string template = "Output as JSON: {{{{\"key\": \"value\"}}}}";
// BETTER: Keep JSON instructions simple
string template = "Output your response in valid JSON format.";

The original Planner classes (ActionPlanner, SequentialPlanner, StepwisePlanner) are deprecated as of late 2024. Use the function calling approach instead:
// DEPRECATED: Old planner approach
// var planner = new SequentialPlanner(kernel);
// CURRENT: Use function calling with auto-invocation
var settings = new OpenAIPromptExecutionSettings
{
FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};
var result = await kernel.InvokePromptAsync(
"Find documents about vacation policy and summarize them",
new KernelArguments(settings)
);

Azure AI Search indices must match your configuration:
// WRONG: Creating index with wrong vector dimensions
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
VectorSize = 384 // But using text-embedding-3-small (1536)!
}
);
// CORRECT: Match your embedding model
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
VectorSize = 1536 // Matches text-embedding-3-small
}
);

Semantic Kernel's API has evolved rapidly. Code from early 2024 may not compile in late 2025:
// OLD (pre-1.0): ChatCompletion style
// kernel.GetService<IChatCompletionService>();
// CURRENT (1.x): Service collection pattern
kernel.GetRequiredService<IChatCompletionService>();
// OLD: Direct text completion
// await kernel.InvokeSemanticFunctionAsync(...)
// CURRENT: Function-based invocation
await kernel.InvokeAsync(function, arguments);

Memory search returns IAsyncEnumerable, not List:
// WRONG: treating the result like a list
var results = memory.SearchAsync(collection, query); // IAsyncEnumerable<MemoryQueryResult>, not a List
// CORRECT: Use await foreach
var results = new List<MemoryQueryResult>();
await foreach (var result in memory.SearchAsync(collection, query))
{
results.Add(result);
}
// OR: Use the ToListAsync extension (from the System.Linq.Async package)
using System.Linq;
var results = await memory.SearchAsync(collection, query).ToListAsync();

# Clone and navigate to project
cd SemanticKernelRAG
# Set environment variables
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-key"
export AZURE_OPENAI_CHAT_DEPLOYMENT="gpt-4o"
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT="text-embedding-3-small"
export AZURE_SEARCH_ENDPOINT="https://your-search.search.windows.net"
export AZURE_SEARCH_API_KEY="your-search-key"
# Run the application
dotnet run

Indexed 3 documents
Answer: According to the company vacation policy, all full-time employees are entitled to 20 days of paid time off (PTO) per calendar year. PTO accrues at a rate of 1.67 days per month. You can carry over up to 5 unused days to the next year.
[Source: HR Handbook - Benefits]
Sources: HR Handbook - Benefits
Confidence: 89.23%

Having covered five frameworks now, here is where Semantic Kernel fits:
| Criterion | LangChain | LlamaIndex | Haystack | Semantic Kernel |
|---|---|---|---|---|
| Best for .NET | No | No | No | Yes |
| Azure integration | Manual | Manual | Manual | Native |
| Enterprise security | Add-on | Add-on | Built-in | Native (Azure) |
| Learning curve | Medium | Medium | Higher | Medium-High |
| Community size | Largest | Large | Medium | Growing |
| Python quality | Native | Native | Native | Port (functional) |
Choose Semantic Kernel when:
Consider alternatives when:
This article covered Semantic Kernel's approach to RAG in the Microsoft ecosystem. Continue with the series:
For Azure-specific deep dives, see:
This is Part 5 of the "Building RAG Systems: A Platform-by-Platform Guide" series. Previous: Part 4 - Haystack: Enterprise RAG Pipelines. Next up: Part 6 - AWS Bedrock Knowledge Bases.