Build a Document Search Engine in C#

Build a full-text search engine in C# with keyword search, semantic search, hybrid ranking, and reranking. Index files and query them in 10 lines of code. No Elasticsearch, no external services.

Index local files and search them by keyword, by meaning, or both — in about 10 lines of C#. No Elasticsearch. No external services. No API keys.

dotnet add package Kjarni

using Kjarni;

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });

using var searcher = new Searcher(
    model: "minilm-l6-v2",
    rerankerModel: "minilm-l6-v2-cross-encoder");

var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);

foreach (var r in results)
    Console.WriteLine($"  {r.Score:F4}: {r.Text}");

The indexer reads your files, splits them into chunks, encodes each chunk as a vector, and builds a BM25 keyword index. The searcher queries both indexes and combines the results. Everything runs locally on CPU.

The Usual Options — and Why They're Heavy

Most search implementations fall into one of two camps: spin up an Elasticsearch cluster, or call a managed search API like Azure AI Search or Algolia. Both work. Both add infrastructure, configuration, and ongoing cost.

Kjarni is a third option: a NuGet package that gives you keyword search, semantic search, hybrid ranking, and reranking in a single library. No cluster. No API key. No data leaving your machine.

Setup

Create a few text files to search over:

mkdir -p docs

docs/returns.txt:

Our return policy allows customers to return any unused item within 30 days
of purchase for a full refund. Items must be in their original packaging.
Shipping costs are non-refundable.

docs/shipping.txt:

We ship to all 50 US states and internationally to over 40 countries.
Standard shipping takes 5-7 business days. Express shipping is available
for an additional fee.

docs/account.txt:

To reset your password, click "Forgot Password" on the login page.
You will receive an email with a reset link. The link expires after 24 hours.

Three short documents. In practice these could be product manuals, support articles, internal wikis, or any text files.

Indexing

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });

The indexer does three things:

  1. Reads all files in the given directories
  2. Chunks each file into passages (for long documents)
  3. Encodes each chunk into a 384-dimension vector using the embedding model

It also builds a BM25 keyword index over the same chunks. The result is a local index on disk that you can query repeatedly without re-indexing.
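The chunking step is easier to picture with code. The article doesn't describe how Kjarni actually splits files, so treat this as a sketch of one common approach, fixed-size word windows with some overlap. The `Chunker` class and its `maxWords` and `overlap` parameters are hypothetical and not part of the Kjarni API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Kjarni): splits text into overlapping word windows.
public static class Chunker
{
    public static List<string> Split(string text, int maxWords = 100, int overlap = 20)
    {
        if (maxWords <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxWords));
        if (overlap < 0 || overlap >= maxWords)
            throw new ArgumentOutOfRangeException(nameof(overlap));

        // Split on common whitespace characters and drop empty entries.
        var words = text.Split(new[] { ' ', '\t', '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);

        var chunks = new List<string>();
        int step = maxWords - overlap; // how far each window advances

        for (int start = 0; start < words.Length; start += step)
        {
            int count = Math.Min(maxWords, words.Length - start);
            chunks.Add(string.Join(" ", words, start, count));
            if (start + count >= words.Length) break; // this window reached the end
        }
        return chunks;
    }
}
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of indexing some words twice.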

Three Search Modes

Keyword Search (BM25)

Matches documents that contain the query words. BM25 is the default ranking function in Elasticsearch and Solr.

var results = searcher.Search("my_index", "return policy refund",
    mode: SearchMode.Keyword);

Output:

  7.8795: Our return policy allows customers to return any unused item
          within 30 days of purchase for a full refund...

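This works because the query words ("return", "policy", "refund") appear in the document. If you searched for "send items back and get money" instead, keyword search would struggle: "send", "back", and "money" appear in none of the documents, so BM25 has only incidental matches such as "items" to rank on.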
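To see why exact word overlap matters so much, here is a minimal sketch of BM25 scoring. It uses the textbook formula with the conventional defaults k1 = 1.2 and b = 0.75 and a deliberately naive tokenizer. All of these are assumptions: Kjarni's tokenizer and parameters aren't documented here, so the numbers won't match the score shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative BM25 scorer (not Kjarni's implementation).
public static class Bm25
{
    // Naive tokenizer (assumption): lowercase, split on anything that is not a letter or digit.
    public static string[] Tokenize(string text) =>
        new string(text.ToLowerInvariant()
                .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
                .ToArray())
            .Split(' ', StringSplitOptions.RemoveEmptyEntries);

    // Returns one score per document, in the same order as docs.
    public static double[] Score(string query, IReadOnlyList<string> docs,
        double k1 = 1.2, double b = 0.75)
    {
        if (docs.Count == 0) return Array.Empty<double>();

        var tokenized = docs.Select(d => Tokenize(d)).ToList();
        double avgLen = tokenized.Average(d => (double)d.Length);
        var scores = new double[docs.Count];

        foreach (var term in Tokenize(query).Distinct())
        {
            // Document frequency: how many documents contain the term at all.
            int df = tokenized.Count(d => Array.IndexOf(d, term) >= 0);
            if (df == 0) continue; // term appears nowhere, contributes nothing

            // Rare terms get a higher inverse document frequency.
            double idf = Math.Log(1 + (docs.Count - df + 0.5) / (df + 0.5));

            for (int i = 0; i < tokenized.Count; i++)
            {
                int tf = tokenized[i].Count(t => t == term);
                // Length normalization: long documents are penalized.
                double norm = 1 - b + b * tokenized[i].Length / avgLen;
                // Term frequency saturation: repeats help, with diminishing returns.
                scores[i] += idf * tf * (k1 + 1) / (tf + k1 * norm);
            }
        }
        return scores;
    }
}
```

A query whose words appear in no document scores zero everywhere, which is exactly the gap semantic search fills.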

Semantic Search

Matches documents by meaning, regardless of the exact words used.

var results = searcher.Search("my_index", "can I send items back and get money?",
    mode: SearchMode.Semantic);

This finds the returns document even though the key query words ("send", "back", "money") never appear in it. The embedding model places "send items back" close to "return" and "get money" close to "refund."
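Under the hood, "close" means the query vector and the chunk vector point in a similar direction. A minimal sketch of cosine similarity, the comparison the vector index is described as using, looks like this. The `VectorMath` class is a hypothetical name, and the code works on any pair of equal-length vectors rather than calling the embedding model.

```csharp
using System;

// Illustrative helper (hypothetical name, not part of Kjarni).
public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), in the range -1 to 1.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        // A zero vector has no direction, so define its similarity as 0.
        if (normA == 0 || normB == 0) return 0;
        return dot / Math.Sqrt(normA * normB);
    }
}
```

Vectors pointing the same way score 1, unrelated (orthogonal) vectors score 0, and opposite vectors score -1, regardless of their lengths.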

For more on how embeddings and similarity work, see Semantic Search in C#.

Hybrid Search

Combines keyword and semantic results. This is usually the best default.

var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);

Output:

   1.3282: Our return policy allows customers to return any unused item
           within 30 days of purchase for a full refund. Items must be in
           their original packaging. Shipping costs are non-refundable.

 -10.5874: To reset your password, click "Forgot Password" on the login
           page. You will receive an email with a reset link. The link
           expires after 24 hours.

 -11.0939: We ship to all 50 US states and internationally to over 40
           countries. Standard shipping takes 5-7 business days. Express
           shipping is available for an additional fee.

Hybrid search catches both exact keyword matches and semantically related content. The scores are from the reranker (more on that below), which is why the gap between relevant and irrelevant results is so large. The returns document scores 1.3, while the other two are deep in the negatives.

Reranking

The results above use a cross-encoder reranker, a second-stage model that re-scores the retrieved candidates. It is the step that turns a roughly right candidate list into a precisely ordered one.

The Problem with Embeddings Alone

Embedding models are fast because they encode the query and each document independently. But this means they can't model the interaction between query and document directly. They're comparing summaries, not reading both texts together.

How Reranking Fixes This

A cross-encoder takes the query and a document as a single input and outputs a relevance score. It reads both texts at the same time, so it can attend to specific words in the document that answer the specific question.

Bi-encoder (embedding):     Query -> Vector    Document -> Vector    Compare
Cross-encoder (reranker):   [Query + Document] -> Relevance Score

The cross-encoder is slower because it processes each query-document pair individually. That's why it's used as a second stage: the embedding model retrieves candidates quickly, then the cross-encoder reranks the top results precisely.
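The two-stage idea is independent of any particular model, so it can be sketched with plain delegates. In the code below, `cheapScore` stands in for the embedding comparison and `preciseScore` for the cross-encoder. Both names, the `TwoStageSearch` class, and the `candidates` and `topK` defaults are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative retrieve-then-rerank pipeline (hypothetical, not Kjarni's code).
public static class TwoStageSearch
{
    public static List<(string Doc, double Score)> Search(
        string query,
        IEnumerable<string> docs,
        Func<string, string, double> cheapScore,   // stand-in for embedding similarity
        Func<string, string, double> preciseScore, // stand-in for a cross-encoder
        int candidates = 50,
        int topK = 5)
    {
        return docs
            // Stage 1: score every document with the cheap function, keep the best few.
            .OrderByDescending(d => cheapScore(query, d))
            .Take(candidates)
            // Stage 2: run the expensive function only on those candidates.
            .Select(d => (Doc: d, Score: preciseScore(query, d)))
            .OrderByDescending(x => x.Score)
            .Take(topK)
            .ToList();
    }
}
```

The expensive scorer runs at most `candidates` times per query no matter how large the corpus is, which is what keeps the slow model affordable.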

Using the Reranker Directly

You can also use the reranker on its own:

using var reranker = new Reranker();

var results = reranker.Rerank(
    "What is machine learning?",
    new[] {
        "Machine learning is a subset of artificial intelligence.",
        "Deep learning uses neural networks with many layers.",
        "The weather today is sunny.",
    });

foreach (var r in results)
    Console.WriteLine($"  {r.Score:F4}: {r.Document}");

Output:

  10.5139: Machine learning is a subset of artificial intelligence.
  -5.5301: Deep learning uses neural networks with many layers.
 -11.1001: The weather today is sunny.

The scores are logits, not probabilities. What matters is the relative ordering and the gap between scores. A positive score means the cross-encoder thinks the document is relevant to the query. A negative score means it's not.
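If you'd rather show users a value between 0 and 1, a logistic sigmoid is a common way to squash a single-logit score. This is a sketch under an assumption: that the reranker emits one logit per pair, and that you only need a monotonic display value rather than a calibrated probability. The `Scores` class is a hypothetical helper.

```csharp
using System;

// Illustrative helper (hypothetical, not part of Kjarni).
public static class Scores
{
    // Maps any real-valued logit into the open interval (0, 1).
    // A logit of 0 maps to exactly 0.5, so the sign of the logit
    // decides which side of 0.5 the result falls on.
    public static double Sigmoid(double logit) => 1.0 / (1.0 + Math.Exp(-logit));
}
```

Because the sigmoid is monotonic, it never changes the ranking. It only changes how the scores read.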

The Full Pipeline

Here's how the pieces fit together:

Query
  |
  +-- BM25 Keyword Index ----> Top N candidates by word match
  |
  +-- Vector Index ----------> Top N candidates by meaning
  |
  v
  Merge candidates (union or intersection)
  |
  v
  Cross-Encoder Reranker ----> Final ranked results
  |
  v
  Return to user

Each stage filters and refines. BM25 is cheap and catches exact matches. The vector index catches semantic matches that keywords miss. The reranker reads both query and document together to produce a precise ranking.

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });

using var searcher = new Searcher(
    model: "minilm-l6-v2",
    rerankerModel: "minilm-l6-v2-cross-encoder");

// Hybrid = BM25 + Semantic + Reranker
var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);
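The merge step in the diagram deserves a closer look. The article doesn't say how Kjarni combines the two candidate lists, so the sketch below shows one widely used option, reciprocal rank fusion (RRF), purely as an illustration. The `RankFusion` class is hypothetical, and k = 60 is the conventional default for RRF rather than a Kjarni setting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative reciprocal rank fusion (an assumption, not Kjarni's documented merge).
public static class RankFusion
{
    // Each input list holds document IDs ordered best-first.
    public static List<(string Id, double Score)> ReciprocalRankFusion(
        IEnumerable<IReadOnlyList<string>> rankedLists, int k = 60)
    {
        var fused = new Dictionary<string, double>();

        foreach (var list in rankedLists)
        {
            for (int rank = 0; rank < list.Count; rank++)
            {
                // A document earns 1 / (k + position) from each list it appears in,
                // where position starts at 1 for the top result.
                fused.TryGetValue(list[rank], out double current);
                fused[list[rank]] = current + 1.0 / (k + rank + 1);
            }
        }

        return fused
            .OrderByDescending(p => p.Value)
            .Select(p => (Id: p.Key, Score: p.Value))
            .ToList();
    }
}
```

RRF only looks at ranks, so it sidesteps the problem that BM25 scores and cosine similarities live on different scales. A document that both lists rank highly rises to the top.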

When to Use Each Mode

Mode      Best for                        Misses
Keyword   Exact terms, error codes, IDs   Synonyms, rephrased queries
Semantic  Intent matching, fuzzy queries  Exact phrases, rare terms
Hybrid    General purpose (recommended)   Slightly slower

Start with Hybrid. Switch to Keyword if your users search for exact identifiers. Switch to Semantic if your users describe what they want in natural language.

Practical Patterns

Searching Different File Types

The indexer reads text files from disk. For other formats, extract the text first and write it to a file:

// Extract text from PDFs, HTML, etc. into a directory.
// ExtractTextFromPdfs is a placeholder for your own extraction step.
ExtractTextFromPdfs("input/", "docs/");

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });

Re-indexing

When documents change, re-create the index:

indexer.Create("my_index", new[] { "docs/" });

This rebuilds the full index. For large corpora where incremental updates matter, you'd manage the vector storage separately.
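Since a rebuild is all-or-nothing, it helps to know whether anything actually changed before paying for one. One way to do that, sketched here with standard .NET APIs only, is to fingerprint the document directory and compare it with the value saved after the last rebuild. The `CorpusFingerprint` class is a hypothetical helper, not something Kjarni provides.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Illustrative helper (hypothetical, not part of Kjarni).
public static class CorpusFingerprint
{
    // Returns a hex SHA-256 digest over every file's relative path, length, and bytes.
    public static string Compute(string directory)
    {
        using var hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

        // Sort paths so the result does not depend on enumeration order.
        var files = Directory
            .EnumerateFiles(directory, "*", SearchOption.AllDirectories)
            .OrderBy(p => p, StringComparer.Ordinal);

        foreach (var path in files)
        {
            byte[] content = File.ReadAllBytes(path);
            hash.AppendData(Encoding.UTF8.GetBytes(Path.GetRelativePath(directory, path)));
            hash.AppendData(BitConverter.GetBytes(content.Length));
            hash.AppendData(content);
        }

        return Convert.ToHexString(hash.GetHashAndReset());
    }
}
```

Store the returned string next to the index and call indexer.Create only when a fresh fingerprint differs. This reads every file in full, so for very large corpora you may prefer to hash file sizes and modification times instead.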

Filtering Results

Apply a score threshold to filter out irrelevant results:

var results = searcher.Search("my_index", query, mode: SearchMode.Hybrid);

// Where() is a LINQ extension; add "using System.Linq;" if implicit usings are disabled.
var relevant = results.Where(r => r.Score > 0.0);

With reranking, a score above 0 is a reasonable default threshold for "probably relevant." Adjust based on your precision/recall needs.

Search + Classification

Find relevant documents, then classify their sentiment:

using var searcher = new Searcher(model: "minilm-l6-v2");
using var classifier = new Classifier("roberta-sentiment");

var results = searcher.Search("reviews_index", "battery life",
    mode: SearchMode.Hybrid);

foreach (var r in results.Take(10))
{
    var sentiment = classifier.Classify(r.Text);
    Console.WriteLine($"  {sentiment}  \"{r.Text}\"");
}

See Sentiment Analysis in C# for more on classification.

How It Compares

Approach         Setup                Latency  Cost                Offline
Elasticsearch    Cluster + config     Low      Server costs        No
Azure AI Search  Portal + API key     Low      Per-query pricing   No
Algolia          Dashboard + API key  Low      Per-search pricing  No
Kjarni           dotnet add package   Low      Free                Yes

The tradeoff: Kjarni runs in-process on a single machine. If you need distributed search across billions of documents, use Elasticsearch. If you need search over thousands to millions of documents on a single server, a local engine works well and eliminates a dependency.

How It Works Under the Hood

Kjarni builds two indexes per collection:

  1. BM25 index — inverted index over tokenized text, with term frequency saturation and document length normalization
  2. Vector index — encoded embeddings for each chunk, queried by cosine similarity

At search time, both indexes return candidates. The results are merged and optionally reranked by a cross-encoder model that reads the query and each candidate together.

The engine is written in Rust with SIMD-optimized kernels (AVX2/FMA on x86, NEON on ARM). The C# package wraps a single native library. No Python, no JVM, no external service.

For the full story on why Kjarni exists and how it compares to Python and ONNX Runtime, see Why I Built a Native ML Inference Engine in Rust.

Install:  dotnet add package Kjarni
NuGet:    https://www.nuget.org/packages/Kjarni
GitHub:   https://github.com/olafurjohannsson/kjarni

Next Steps