🤖 Ghostwritten by Claude Opus 4.5 · Curated by Tom Hundley
When your enterprise runs on Azure and .NET, Semantic Kernel is your native path to production RAG.
Microsoft's Semantic Kernel occupies a unique position in the AI framework landscape. While LangChain and LlamaIndex dominate the Python-first world, Semantic Kernel provides first-class .NET support alongside Python, making it the natural choice for enterprises already invested in the Microsoft stack.
But Semantic Kernel is more than just "LangChain for C#." It represents Microsoft's vision for AI orchestration: a kernel-based architecture where AI capabilities are plugins that can be composed, planned, and executed with the same patterns used throughout enterprise .NET development.
If you read Part 1 of this series, you understand that RAG combines retrieval with generation. Semantic Kernel implements this through its Memory and Plugin systems, with deep integration into Azure AI services like Azure OpenAI, Azure AI Search, and Azure Cosmos DB.
Important Context: On October 1, 2025, Microsoft announced the Microsoft Agent Framework (MAF), merging Semantic Kernel and AutoGen into a unified platform for building AI agents. Think of MAF as "Semantic Kernel v2.0" - built by the same team.
What this means for RAG development: Semantic Kernel's memory and connector ecosystem remains the recommended approach.
API Note: This article uses the `MemoryBuilder` and `ISemanticTextMemory` abstractions, which still work but are now considered legacy. Microsoft recommends migrating to the newer Vector Store abstractions (`Microsoft.Extensions.VectorData.Abstractions`) for new projects. The legacy APIs shown here are simpler for learning, but the Vector Store pattern offers more flexibility (custom schemas, metadata pre-filtering, multiple vectors per record). See the Migration Guide for details.
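One concrete reason to prefer the Vector Store abstractions: the legacy API carries all metadata as a single `additionalMetadata` string, so this article falls back on a `key=value;key=value` convention. A stdlib-Python sketch of that round trip (the helper names are hypothetical, not part of Semantic Kernel):

```python
def encode_metadata(fields: dict[str, str]) -> str:
    """Flatten a metadata dict into the key=value;key=value convention."""
    return ";".join(f"{k}={v}" for k, v in fields.items())

def parse_metadata(metadata: str) -> dict[str, str]:
    """Parse the delimited string back into a dict, skipping malformed parts."""
    pairs = (part.split("=", 1) for part in metadata.split(";") if part)
    return {p[0]: p[1] for p in pairs if len(p) == 2}

encoded = encode_metadata({"source": "hr_handbook", "section": "benefits"})
# encoded == "source=hr_handbook;section=benefits"
parsed = parse_metadata(encoded)
# parsed["source"] == "hr_handbook"
```

The Vector Store abstractions replace this convention with typed record properties that the store can filter on directly.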
Here is how Semantic Kernel compares with the other frameworks covered in this series:
| Aspect | LangChain | LlamaIndex | Haystack | Semantic Kernel |
|---|---|---|---|---|
| Primary language | Python | Python | Python | C# (Python available) |
| Core abstraction | Chains/Runnables | Indexes | Pipelines (DAG) | Kernel + Plugins |
| Cloud provider | Agnostic | Agnostic | Agnostic | Azure-optimized |
| Memory model | Various | Index-based | Document stores | Unified Memory interface |
| Target audience | AI/ML teams | Knowledge workers | Enterprise teams | .NET enterprises |
# Create a new .NET project
dotnet new console -n SemanticKernelRAG
cd SemanticKernelRAG
# Add Semantic Kernel core
dotnet add package Microsoft.SemanticKernel
# Add memory connectors (choose based on your storage)
dotnet add package Microsoft.SemanticKernel.Connectors.AzureAISearch
dotnet add package Microsoft.SemanticKernel.Connectors.AzureCosmosDBMongoDB
dotnet add package Microsoft.SemanticKernel.Connectors.Qdrant
# Add OpenAI/Azure OpenAI connectors
dotnet add package Microsoft.SemanticKernel.Connectors.AzureOpenAI
dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI

# Core Semantic Kernel
pip install semantic-kernel
# Azure integrations
pip install semantic-kernel[azure]
# Or individual connectors
pip install semantic-kernel-connectors-azure-ai-search
pip install semantic-kernel-connectors-qdrant

A well-organized Semantic Kernel project:
SemanticKernelRAG/
├── Program.cs                      # Entry point and Kernel configuration
├── appsettings.json                # Configuration (connection strings, keys)
├── appsettings.Development.json    # Development overrides
├── Plugins/
│   ├── DocumentPlugin.cs           # Document ingestion plugin
│   ├── SearchPlugin.cs             # Search and retrieval plugin
│   └── GenerationPlugin.cs         # Response generation plugin
├── Memory/
│   ├── IMemoryService.cs           # Memory abstraction
│   └── AzureSearchMemory.cs        # Azure AI Search implementation
├── Models/
│   ├── Document.cs                 # Document model
│   └── SearchResult.cs             # Search result model
└── Services/
    └── RAGService.cs               # Orchestration service

Create appsettings.json:
{
"AzureOpenAI": {
"Endpoint": "https://your-resource.openai.azure.com/",
"DeploymentName": "gpt-4o",
"EmbeddingDeploymentName": "text-embedding-3-small",
"ApiKey": ""
},
"AzureAISearch": {
"Endpoint": "https://your-search.search.windows.net",
"IndexName": "rag-documents",
"ApiKey": ""
},
"AzureCosmosDB": {
"ConnectionString": "",
"DatabaseName": "rag-db",
"ContainerName": "documents"
}
}

For production, use Azure Key Vault or environment variables instead of storing keys in configuration files.
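That precedence (environment variable first, config file as fallback) is easy to sketch. The following stdlib-Python fragment is illustrative only; `load_setting` and the variable names are hypothetical, not part of Semantic Kernel:

```python
import json
import os

def load_setting(config: dict, env_var: str, *path: str, default=None):
    """Prefer an environment variable; fall back to the parsed config file."""
    value = os.environ.get(env_var)
    if value:
        return value
    node = config
    for key in path:  # walk nested keys, e.g. ("AzureOpenAI", "ApiKey")
        node = node.get(key, {}) if isinstance(node, dict) else {}
    return node if isinstance(node, str) and node else default

config = json.loads(
    '{"AzureOpenAI": {"Endpoint": "https://example.openai.azure.com/", "ApiKey": ""}}'
)
endpoint = load_setting(config, "AZURE_OPENAI_ENDPOINT", "AzureOpenAI", "Endpoint")
# ApiKey is blank in the file, so only the environment variable can supply it
api_key = load_setting(config, "AZURE_OPENAI_API_KEY", "AzureOpenAI", "ApiKey")
```

In .NET the same layering comes for free from `ConfigurationBuilder` (JSON file, then environment variables, then Key Vault), which is why the appsettings.json above leaves the key fields empty.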
Semantic Kernel's architecture revolves around four key abstractions that work together to enable AI applications.
The Kernel is the central orchestrator. It holds references to AI services, plugins, and memory, coordinating their interaction.
using Microsoft.SemanticKernel;
// Build the kernel
var builder = Kernel.CreateBuilder();
// Add AI services
builder.AddAzureOpenAIChatCompletion(
deploymentName: "gpt-4o",
endpoint: "https://your-resource.openai.azure.com/",
apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
);
builder.AddAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: "https://your-resource.openai.azure.com/",
apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
);
var kernel = builder.Build();

The Kernel is designed to be dependency-injected in ASP.NET applications:
// In Program.cs or Startup.cs
builder.Services.AddKernel()
.AddAzureOpenAIChatCompletion(
deploymentName: configuration["AzureOpenAI:DeploymentName"],
endpoint: configuration["AzureOpenAI:Endpoint"],
apiKey: configuration["AzureOpenAI:ApiKey"]
);

Plugins are collections of functions that extend the Kernel's capabilities. For RAG, you will create plugins for document processing, search, and generation.
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Memory;
using System.ComponentModel;

public class SearchPlugin
{
    // ISemanticTextMemory accepts text queries and embeds them for you;
    // the lower-level IMemoryStore works directly with embedding vectors
    private readonly ISemanticTextMemory _memory;

    public SearchPlugin(ISemanticTextMemory memory)
    {
        _memory = memory;
    }

    [KernelFunction("search_documents")]
    [Description("Search the knowledge base for relevant documents")]
    public async Task<string> SearchDocumentsAsync(
        [Description("The search query")] string query,
        [Description("Maximum number of results")] int maxResults = 5)
    {
        // ToListAsync comes from the System.Linq.Async package
        var results = await _memory.SearchAsync(
            collection: "documents",
            query: query,
            limit: maxResults
        ).ToListAsync();

        return string.Join("\n\n---\n\n",
            results.Select(r => r.Metadata.Text));
    }
}

// Register the plugin with the kernel
kernel.Plugins.AddFromObject(new SearchPlugin(memory), "Search");

Plugins can be invoked directly or used by the AI to plan and execute complex tasks.
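Conceptually, the kernel is a registry of named plugins, each exposing functions with model-readable descriptions that can be invoked by name. Stripped of the SK types, the pattern looks roughly like this (a toy stdlib-Python sketch, not the Semantic Kernel API):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Plugin:
    """A named collection of functions, each with a model-readable description."""
    name: str
    functions: dict[str, tuple[str, Callable]] = field(default_factory=dict)

    def add(self, fn_name: str, description: str, fn: Callable) -> None:
        self.functions[fn_name] = (description, fn)

@dataclass
class Kernel:
    plugins: dict[str, Plugin] = field(default_factory=dict)

    def invoke(self, plugin: str, fn_name: str, **kwargs):
        # Look up the function by plugin and name, then call it
        _, fn = self.plugins[plugin].functions[fn_name]
        return fn(**kwargs)

kernel = Kernel()
search = Plugin("Search")
search.add("search_documents", "Search the knowledge base",
           lambda query, max_results=5: f"results for {query!r}")
kernel.plugins["Search"] = search

print(kernel.invoke("Search", "search_documents", query="vacation policy"))
# prints: results for 'vacation policy'
```

The `[KernelFunction]` and `[Description]` attributes in the C# version play the role of the name and description strings here: they are the metadata the model sees when deciding which function to call.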
Memory in Semantic Kernel provides a unified interface for storing and retrieving information using embeddings. It abstracts away the underlying vector database.
using Microsoft.SemanticKernel.Memory;
// Create a memory builder
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: azureOpenAIEndpoint,
apiKey: azureOpenAIApiKey
);
// Choose your storage backend
memoryBuilder.WithAzureAISearchMemoryStore(
endpoint: searchEndpoint,
apiKey: searchApiKey
);
var memory = memoryBuilder.Build();

The memory interface is simple:
// Store a document
await memory.SaveInformationAsync(
collection: "documents",
id: "doc-001",
text: "Our vacation policy allows 20 days PTO per year...",
description: "HR Vacation Policy",
additionalMetadata: "source=hr_handbook;section=benefits"
);
// Search for relevant documents
var results = memory.SearchAsync(
collection: "documents",
query: "How many vacation days do I get?",
limit: 5,
minRelevanceScore: 0.7
);
await foreach (var result in results)
{
Console.WriteLine($"Relevance: {result.Relevance:P2}");
Console.WriteLine($"Text: {result.Metadata.Text}");
}

Semantic functions are prompt templates that can be parameterized and composed. They are central to RAG's generation step.
using Microsoft.SemanticKernel;
// Define a prompt template
string promptTemplate = """
You are a helpful assistant. Answer the user's question based on the provided context.
If the context doesn't contain the answer, say "I don't have information about that."
## Context
{{$context}}
## Question
{{$question}}
## Answer
""";
// Create a semantic function
var ragFunction = kernel.CreateFunctionFromPrompt(
promptTemplate,
new OpenAIPromptExecutionSettings
{
MaxTokens = 1000,
Temperature = 0.7
}
);
// Execute it
var result = await kernel.InvokeAsync(ragFunction, new KernelArguments
{
["context"] = retrievedContext,
["question"] = userQuestion
});
Console.WriteLine(result.GetValue<string>());

Azure AI Search (formerly Azure Cognitive Search) is Microsoft's enterprise search service and the most common choice for production RAG in the Azure ecosystem.
using Microsoft.SemanticKernel.Connectors.AzureAISearch;
using Microsoft.SemanticKernel.Memory;
using Azure;
using Azure.Search.Documents.Indexes;
// Configure the memory store
var searchIndexClient = new SearchIndexClient(
new Uri(searchEndpoint),
new AzureKeyCredential(searchApiKey)
);
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
// Index will be created if it doesn't exist
VectorSize = 1536 // Must match your embedding model
}
);
// Build memory with embedding generation
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: azureOpenAIEndpoint,
apiKey: azureOpenAIApiKey
);
memoryBuilder.WithMemoryStore(memoryStore);
var memory = memoryBuilder.Build();

public class DocumentIndexer
{
private readonly ISemanticTextMemory _memory;
private readonly string _collection;
public DocumentIndexer(ISemanticTextMemory memory, string collection = "documents")
{
_memory = memory;
_collection = collection;
}
public async Task IndexDocumentAsync(
string id,
string content,
string source,
Dictionary<string, string>? metadata = null)
{
// Chunk the document
var chunks = ChunkDocument(content);
for (int i = 0; i < chunks.Count; i++)
{
var chunkId = $"{id}_chunk_{i}";
var additionalMetadata = metadata != null
? string.Join(";", metadata.Select(kv => $"{kv.Key}={kv.Value}"))
: "";
await _memory.SaveInformationAsync(
collection: _collection,
id: chunkId,
text: chunks[i],
description: $"Chunk {i + 1} of {chunks.Count} from {source}",
additionalMetadata: $"source={source};chunk={i};{additionalMetadata}"
);
}
}
private List<string> ChunkDocument(string content, int chunkSize = 500, int overlap = 50)
{
var chunks = new List<string>();
var words = content.Split(' ', StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < words.Length; i += chunkSize - overlap)
{
var chunk = string.Join(" ", words.Skip(i).Take(chunkSize));
if (!string.IsNullOrWhiteSpace(chunk))
{
chunks.Add(chunk);
}
}
return chunks;
}
}

Here is a complete RAG service that ties everything together:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Memory;
using System.Text; // for StringBuilder
public class RAGService
{
private readonly Kernel _kernel;
private readonly ISemanticTextMemory _memory;
private readonly KernelFunction _ragFunction;
private readonly string _collection;
public RAGService(
Kernel kernel,
ISemanticTextMemory memory,
string collection = "documents")
{
_kernel = kernel;
_memory = memory;
_collection = collection;
// Create the RAG prompt function
_ragFunction = CreateRagFunction();
}
private KernelFunction CreateRagFunction()
{
string promptTemplate = """
You are a knowledgeable assistant. Answer the user's question using
the provided context. Follow these guidelines:
1. Only use information from the provided context
2. If the context doesn't contain the answer, say so clearly
3. Cite your sources by mentioning the source document
4. Be concise but thorough
## Context
{{$context}}
## Question
{{$question}}
## Answer
""";
return _kernel.CreateFunctionFromPrompt(
promptTemplate,
new OpenAIPromptExecutionSettings
{
MaxTokens = 1000,
Temperature = 0.3 // Lower temperature for factual responses
}
);
}
public async Task<RAGResponse> QueryAsync(
string question,
int maxResults = 5,
double minRelevance = 0.5)
{
// Step 1: Retrieve relevant documents
var searchResults = new List<MemoryQueryResult>();
await foreach (var result in _memory.SearchAsync(
collection: _collection,
query: question,
limit: maxResults,
minRelevanceScore: minRelevance))
{
searchResults.Add(result);
}
if (searchResults.Count == 0)
{
return new RAGResponse
{
Answer = "I couldn't find any relevant information in the knowledge base.",
Sources = Array.Empty<string>(),
Confidence = 0
};
}
// Step 2: Build context from retrieved documents
var context = BuildContext(searchResults);
// Step 3: Generate response
var result = await _kernel.InvokeAsync(_ragFunction, new KernelArguments
{
["context"] = context,
["question"] = question
});
// Extract sources
var sources = searchResults
.Select(r => r.Metadata.AdditionalMetadata)
.Where(m => m != null)
.Select(m => ExtractSource(m!))
.Distinct()
.ToArray();
// Calculate average relevance as confidence
var avgRelevance = searchResults.Average(r => r.Relevance);
return new RAGResponse
{
Answer = result.GetValue<string>() ?? "Unable to generate response.",
Sources = sources,
Confidence = avgRelevance,
RetrievedDocuments = searchResults.Count
};
}
private string BuildContext(List<MemoryQueryResult> results)
{
var contextBuilder = new StringBuilder();
foreach (var result in results)
{
var source = ExtractSource(result.Metadata.AdditionalMetadata ?? "");
contextBuilder.AppendLine($"[Source: {source}]");
contextBuilder.AppendLine(result.Metadata.Text);
contextBuilder.AppendLine("---");
}
return contextBuilder.ToString();
}
private string ExtractSource(string metadata)
{
var parts = metadata.Split(';')
.Select(p => p.Split('='))
.Where(p => p.Length == 2)
.ToDictionary(p => p[0], p => p[1]);
return parts.GetValueOrDefault("source", "Unknown");
}
}
public class RAGResponse
{
public string Answer { get; set; } = "";
public string[] Sources { get; set; } = Array.Empty<string>();
public double Confidence { get; set; }
public int RetrievedDocuments { get; set; }
}

Azure Cosmos DB for MongoDB vCore includes native vector search capabilities, making it excellent for RAG when you already use Cosmos DB for your application data.
using Microsoft.SemanticKernel.Connectors.AzureCosmosDBMongoDB;
using MongoDB.Driver;
// Connect to Cosmos DB MongoDB vCore
var mongoClient = new MongoClient(cosmosConnectionString);
var database = mongoClient.GetDatabase("rag-database");
var memoryStore = new AzureCosmosDBMongoDBMemoryStore(
mongoClient,
"rag-database",
new AzureCosmosDBMongoDBConfig
{
// Vector index configuration
IndexName = "vector_index",
VectorDimensions = 1536,
SimilarityAlgorithm = CosmosDBSimilarityAlgorithm.Cosine
}
);
// Build memory
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: "text-embedding-3-small",
endpoint: azureOpenAIEndpoint,
apiKey: azureOpenAIApiKey
);
memoryBuilder.WithMemoryStore(memoryStore);
var memory = memoryBuilder.Build();

Before using vector search, create a vector index in your Cosmos DB collection:
// Run in MongoDB shell or Azure Data Studio
db.runCommand({
createIndexes: "documents",
indexes: [
{
name: "vector_index",
key: { embedding: "cosmosSearch" },
cosmosSearchOptions: {
kind: "vector-hnsw",
m: 16,
efConstruction: 64,
similarity: "COS",
dimensions: 1536
}
}
]
});

Here is a production-ready RAG application that demonstrates all the concepts:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Connectors.AzureAISearch;
using Azure;
using Azure.Search.Documents.Indexes;
using System.Text;
namespace SemanticKernelRAG;
public class Program
{
public static async Task Main(string[] args)
{
// Load configuration
var config = LoadConfiguration();
// Build the kernel
var kernel = BuildKernel(config);
// Build memory with Azure AI Search
var memory = await BuildMemoryAsync(config);
// Create RAG service
var ragService = new RAGService(kernel, memory);
// Example: Index some documents
await IndexSampleDocumentsAsync(memory);
// Example: Query the knowledge base
var response = await ragService.QueryAsync(
"What is our vacation policy?"
);
Console.WriteLine($"Answer: {response.Answer}");
Console.WriteLine($"Sources: {string.Join(", ", response.Sources)}");
Console.WriteLine($"Confidence: {response.Confidence:P2}");
}
private static RAGConfiguration LoadConfiguration()
{
return new RAGConfiguration
{
AzureOpenAIEndpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT not set"),
AzureOpenAIApiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
?? throw new InvalidOperationException("AZURE_OPENAI_API_KEY not set"),
ChatDeploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_CHAT_DEPLOYMENT")
?? "gpt-4o",
EmbeddingDeploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_EMBEDDING_DEPLOYMENT")
?? "text-embedding-3-small",
AzureSearchEndpoint = Environment.GetEnvironmentVariable("AZURE_SEARCH_ENDPOINT")
?? throw new InvalidOperationException("AZURE_SEARCH_ENDPOINT not set"),
AzureSearchApiKey = Environment.GetEnvironmentVariable("AZURE_SEARCH_API_KEY")
?? throw new InvalidOperationException("AZURE_SEARCH_API_KEY not set")
};
}
private static Kernel BuildKernel(RAGConfiguration config)
{
var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
deploymentName: config.ChatDeploymentName,
endpoint: config.AzureOpenAIEndpoint,
apiKey: config.AzureOpenAIApiKey
);
return builder.Build();
}
private static async Task<ISemanticTextMemory> BuildMemoryAsync(RAGConfiguration config)
{
var searchIndexClient = new SearchIndexClient(
new Uri(config.AzureSearchEndpoint),
new AzureKeyCredential(config.AzureSearchApiKey)
);
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
VectorSize = 1536
}
);
var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
deploymentName: config.EmbeddingDeploymentName,
endpoint: config.AzureOpenAIEndpoint,
apiKey: config.AzureOpenAIApiKey
);
memoryBuilder.WithMemoryStore(memoryStore);
return memoryBuilder.Build();
}
private static async Task IndexSampleDocumentsAsync(ISemanticTextMemory memory)
{
var documents = new[]
{
new
{
Id = "hr-vacation",
Content = @"
Company Vacation Policy
All full-time employees are entitled to 20 days of paid time off (PTO)
per calendar year. PTO accrues at a rate of 1.67 days per month.
Unused PTO can be carried over to the next year, up to a maximum of
5 days. PTO requests must be submitted at least 2 weeks in advance
for periods longer than 3 consecutive days.
",
Source = "HR Handbook - Benefits"
},
new
{
Id = "hr-remote",
Content = @"
Remote Work Guidelines
Employees may work remotely up to 3 days per week with manager approval.
Remote work days must be scheduled in advance and logged in the HR system.
Employees are expected to be available during core hours (10 AM - 3 PM)
regardless of work location. Home office equipment stipends of up to $500
are available for eligible employees.
",
Source = "HR Handbook - Remote Work"
},
new
{
Id = "hr-expenses",
Content = @"
Expense Reimbursement Policy
Business expenses must be submitted within 30 days of the expense date.
Receipts are required for all expenses over $25. Meals during travel are
reimbursed up to $75 per day. Flights should be booked through the
company travel portal. Personal expenses are not eligible for reimbursement.
",
Source = "HR Handbook - Expenses"
}
};
foreach (var doc in documents)
{
await memory.SaveInformationAsync(
collection: "documents",
id: doc.Id,
text: doc.Content.Trim(),
description: doc.Source,
additionalMetadata: $"source={doc.Source}"
);
}
Console.WriteLine($"Indexed {documents.Length} documents");
}
}
public class RAGConfiguration
{
public string AzureOpenAIEndpoint { get; set; } = "";
public string AzureOpenAIApiKey { get; set; } = "";
public string ChatDeploymentName { get; set; } = "";
public string EmbeddingDeploymentName { get; set; } = "";
public string AzureSearchEndpoint { get; set; } = "";
public string AzureSearchApiKey { get; set; } = "";
}

For teams that prefer Python, Semantic Kernel provides a similar API:
import asyncio
import os
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
AzureChatCompletion,
AzureTextEmbedding,
)
from semantic_kernel.connectors.memory.azure_ai_search import AzureAISearchMemoryStore
from semantic_kernel.memory import SemanticTextMemory
async def main():
# Create the kernel
kernel = Kernel()
# Add Azure OpenAI chat service
kernel.add_service(
AzureChatCompletion(
deployment_name="gpt-4o",
endpoint="https://your-resource.openai.azure.com/",
api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
)
# Create memory with Azure AI Search
embedding_service = AzureTextEmbedding(
deployment_name="text-embedding-3-small",
endpoint="https://your-resource.openai.azure.com/",
api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
memory_store = AzureAISearchMemoryStore(
search_endpoint="https://your-search.search.windows.net",
admin_key=os.environ["AZURE_SEARCH_API_KEY"],
vector_size=1536,
)
memory = SemanticTextMemory(
storage=memory_store,
embeddings_generator=embedding_service,
)
# Save a document
await memory.save_information(
collection="documents",
id="doc-001",
text="Our vacation policy allows 20 days PTO per year...",
description="HR Vacation Policy",
)
# Search for relevant documents
results = await memory.search(
collection="documents",
query="How many vacation days do I get?",
limit=5,
min_relevance_score=0.5,
)
for result in results:
print(f"Relevance: {result.relevance:.2%}")
print(f"Text: {result.text}")
# Create a RAG prompt
prompt = """
Answer the question based on the provided context.
Context:
{{$context}}
Question: {{$question}}
Answer:
"""
rag_function = kernel.add_function(
plugin_name="RAG",
function_name="answer",
prompt=prompt,
)
# Execute RAG
context = "\n".join([r.text for r in results])
result = await kernel.invoke(
rag_function,
context=context,
question="How many vacation days do I get?",
)
print(f"Answer: {result}")
if __name__ == "__main__":
asyncio.run(main())

When deploying RAG in enterprise environments, security is paramount:
// Use Managed Identity instead of API keys
using Azure.Identity;
var credential = new DefaultAzureCredential();
// Azure OpenAI with Managed Identity
builder.AddAzureOpenAIChatCompletion(
deploymentName: "gpt-4o",
endpoint: azureOpenAIEndpoint,
credential: credential
);
// Azure AI Search with Managed Identity
var searchClient = new SearchClient(
new Uri(searchEndpoint),
"documents",
credential
);

Implement document-level access control:
public class SecureRAGService
{
private readonly ISemanticTextMemory _memory;
private readonly Kernel _kernel;
public async Task<RAGResponse> QueryWithAccessControlAsync(
string question,
ClaimsPrincipal user)
{
// Get user's groups/roles
var userGroups = user.Claims
.Where(c => c.Type == "groups")
.Select(c => c.Value)
.ToHashSet();
// Search with metadata filter
var allResults = new List<MemoryQueryResult>();
await foreach (var result in _memory.SearchAsync(
collection: "documents",
query: question,
limit: 20)) // Get more than needed
{
// Filter by access control
var docGroups = ExtractGroups(result.Metadata.AdditionalMetadata);
if (docGroups.Any(g => userGroups.Contains(g)))
{
allResults.Add(result);
}
}
// Take top results after filtering
var filteredResults = allResults.Take(5).ToList();
// Continue with generation as in RAGService.QueryAsync:
// build context from filteredResults and invoke the RAG function.
throw new NotImplementedException("Generation step omitted for brevity.");
}
private IEnumerable<string> ExtractGroups(string metadata)
{
// Parse groups from metadata
var parts = metadata.Split(';');
var groupsPart = parts.FirstOrDefault(p => p.StartsWith("groups="));
return groupsPart?.Substring(7).Split(',') ?? Array.Empty<string>();
}
}

Integrate with Azure Application Insights:
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;
public class InstrumentedRAGService
{
private readonly RAGService _ragService;
private readonly TelemetryClient _telemetry;
public async Task<RAGResponse> QueryAsync(string question)
{
using var operation = _telemetry.StartOperation<RequestTelemetry>("RAG Query");
try
{
var startTime = DateTime.UtcNow;
var response = await _ragService.QueryAsync(question);
var duration = DateTime.UtcNow - startTime;
// Track metrics
_telemetry.TrackMetric("RAG.QueryDuration", duration.TotalMilliseconds);
_telemetry.TrackMetric("RAG.RetrievedDocuments", response.RetrievedDocuments);
_telemetry.TrackMetric("RAG.Confidence", response.Confidence);
// Track custom event
_telemetry.TrackEvent("RAGQuery", new Dictionary<string, string>
{
["QuestionLength"] = question.Length.ToString(),
["SourceCount"] = response.Sources.Length.ToString()
});
return response;
}
catch (Exception ex)
{
_telemetry.TrackException(ex);
throw;
}
}
}

Semantic Kernel 1.x introduced TextSearchProvider as a standardized abstraction for text search operations, making it easier to swap between different search backends.
using Microsoft.SemanticKernel.Data;
// Create a text search provider
var textSearch = new AzureAISearchTextSearch(
searchClient,
new AzureAISearchTextSearchOptions
{
// Configure search behavior
SelectFields = new[] { "content", "title", "source" },
SearchFields = new[] { "content" }
}
);
// Use in a function
[KernelFunction("search")]
public async Task<IEnumerable<TextSearchResult>> SearchAsync(
[Description("Search query")] string query)
{
var options = new TextSearchOptions
{
Top = 5,
Skip = 0
};
var results = await textSearch.SearchAsync(query, options);
return results;
}

Semantic Kernel uses Handlebars-style templates. Special characters need escaping:
// WRONG: Curly braces in output can break template parsing
string template = "Output as JSON: {\"key\": \"value\"}";
// CORRECT: Escape curly braces or use raw strings carefully
string template = "Output as JSON: {{{{\"key\": \"value\"}}}}";
// BETTER: Keep JSON instructions simple
string template = "Output your response in valid JSON format.";

The original Planner classes (ActionPlanner, SequentialPlanner, StepwisePlanner) are deprecated as of late 2024. Use the function calling approach instead:
// DEPRECATED: Old planner approach
// var planner = new SequentialPlanner(kernel);
// CURRENT: Use function calling with auto-invocation
var settings = new OpenAIPromptExecutionSettings
{
FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};
var result = await kernel.InvokePromptAsync(
"Find documents about vacation policy and summarize them",
new KernelArguments(settings)
);

Azure AI Search indices must match your configuration:
// WRONG: Creating index with wrong vector dimensions
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
VectorSize = 384 // But using text-embedding-3-small (1536)!
}
);
// CORRECT: Match your embedding model
var memoryStore = new AzureAISearchMemoryStore(
searchIndexClient,
new AzureAISearchMemoryStoreOptions
{
VectorSize = 1536 // Matches text-embedding-3-small
}
);

Semantic Kernel's API has evolved rapidly. Code from early 2024 may not compile in late 2025:
// OLD (pre-1.0): ChatCompletion style
// kernel.GetService<IChatCompletionService>();
// CURRENT (1.x): Service collection pattern
kernel.GetRequiredService<IChatCompletionService>();
// OLD: Direct text completion
// await kernel.InvokeSemanticFunctionAsync(...)
// CURRENT: Function-based invocation
await kernel.InvokeAsync(function, arguments);

Memory search returns IAsyncEnumerable, not List:
// WRONG: treating the result like a list
var results = memory.SearchAsync(collection, query); // IAsyncEnumerable<MemoryQueryResult>, not a List
// CORRECT: Use await foreach
var results = new List<MemoryQueryResult>();
await foreach (var result in memory.SearchAsync(collection, query))
{
results.Add(result);
}
// OR: Use the ToListAsync extension (from the System.Linq.Async package)
using System.Linq;
var results = await memory.SearchAsync(collection, query).ToListAsync();

# Clone and navigate to project
cd SemanticKernelRAG
# Set environment variables
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-key"
export AZURE_OPENAI_CHAT_DEPLOYMENT="gpt-4o"
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT="text-embedding-3-small"
export AZURE_SEARCH_ENDPOINT="https://your-search.search.windows.net"
export AZURE_SEARCH_API_KEY="your-search-key"
# Run the application
dotnet run

Indexed 3 documents
Answer: According to the company vacation policy, all full-time employees are entitled to 20 days of paid time off (PTO) per calendar year. PTO accrues at a rate of 1.67 days per month. You can carry over up to 5 unused days to the next year.
[Source: HR Handbook - Benefits]
Sources: HR Handbook - Benefits
Confidence: 89.23%

Having covered five frameworks now, here is where Semantic Kernel fits:
| Criterion | LangChain | LlamaIndex | Haystack | Semantic Kernel |
|---|---|---|---|---|
| Best for .NET | No | No | No | Yes |
| Azure integration | Manual | Manual | Manual | Native |
| Enterprise security | Add-on | Add-on | Built-in | Native (Azure) |
| Learning curve | Medium | Medium | Higher | Medium-High |
| Community size | Largest | Large | Medium | Growing |
| Python quality | Native | Native | Native | Port (functional) |
Choose Semantic Kernel when:
Consider alternatives when:
This article covered Semantic Kernel's approach to RAG in the Microsoft ecosystem. Continue with the series:
For Azure-specific deep dives, see:
This is Part 5 of the "Building RAG Systems: A Platform-by-Platform Guide" series. Previous: Part 4 - Haystack: Enterprise RAG Pipelines. Next up: Part 6 - AWS Bedrock Knowledge Bases.