Vector search enables applications to find results based on semantic similarity rather than exact keyword matches. For example, a query like “funny cat videos” can return content about playful kittens or humorous pet moments, even without identical wording. This works through embeddings, numerical representations that capture the meaning of text, images, or other data.
Previously, vector search often required dedicated vector databases or cloud services. Now, using Rails with SQLite and the sqlite-vec extension, you can implement efficient, private, local semantic search without external dependencies. This approach keeps everything in a single lightweight database file.
This article explains how to set up Rails SQLite vector search using sqlite-vec. It covers installation, generating embeddings, performing queries, practical examples including a movie search and a RAG (Retrieval-Augmented Generation) system, and important considerations for performance and production use.
Why Use Local Vector Search with Rails and SQLite?
SQLite is the default database for Rails development and has seen significant improvements in Rails 8, making it more suitable for production in small to medium-scale applications. Rails 8 enhances SQLite support with better defaults for concurrency (via WAL mode), caching, queuing, and more.
The sqlite-vec extension, developed by Alex Garcia, adds vector storage and search capabilities directly to SQLite. It is written in pure C, has no external dependencies, and runs on most platforms (macOS, Linux, Windows, ARM devices, even browsers via WASM). It introduces the vec0 virtual table type and supports fast k-nearest neighbors (KNN) queries using distance metrics such as cosine or Euclidean.
Key advantages include:
- Full data privacy — no data leaves your server or device.
- Zero infrastructure cost (and fully free when using local embedding models).
- Simplicity — one database file, no separate service to manage.
- Good performance for many use cases, especially datasets up to tens of thousands of vectors.
This setup suits internal tools, personal knowledge bases, offline-capable apps, or privacy-sensitive features.
Setting Up sqlite-vec in a Rails Application
Start with a Rails 8 (or later) application using SQLite.
Step 1: Add Required Gems
In your Gemfile:
gem "sqlite-vec"
gem "neighbor" # Rails-friendly interface for vector operations
gem "ruby-openai" # For generating embeddings (can swap for local alternatives)
Run bundle install.
Note: Some older examples used platform: :ruby_33, but this is not standard. The gem typically installs platform-specific binaries automatically (e.g., for x86_64-linux, arm64-darwin). Omit the platform specifier unless addressing a specific compatibility issue.
Step 2: Initialize the Neighbor Gem for SQLite
Create or edit config/initializers/neighbor.rb:
Neighbor::SQLite.initialize!
This loads sqlite-vec into SQLite connections.
You can also run:
rails generate neighbor:sqlite
This creates the initializer if needed.
Defining the Vector Schema
Use vec0 virtual tables for best performance with KNN queries.
Example migration for storing movie embeddings:
class CreateMovieVectors < ActiveRecord::Migration[8.0]
  def change
    create_virtual_table :movie_vectors, :vec0, [
      "movie_id integer primary key",
      "embedding float[1536] distance_metric=cosine"
    ]
  end
end
The dimension (1536 here) matches models like OpenAI’s text-embedding-3-small. Use cosine distance for most text similarity tasks.
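To make the choice of metric concrete, the cosine distance that sqlite-vec computes for each stored vector can be illustrated in plain Ruby:

```ruby
# Cosine distance: 1 - (a . b) / (|a| * |b|)
# 0.0 means identical direction; 1.0 means orthogonal (unrelated).
def cosine_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  dot   = a.zip(b).sum { |x, y| x * y }
  mag_a = Math.sqrt(a.sum { |x| x * x })
  mag_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (mag_a * mag_b)
end

cosine_distance([1.0, 0.0], [1.0, 0.0]) # => 0.0 (same direction)
cosine_distance([1.0, 0.0], [0.0, 1.0]) # => 1.0 (orthogonal)
```

Because cosine distance ignores vector magnitude, it compares only direction, which is what makes it a good default for text embeddings.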
Create the model:
class MovieVector < ApplicationRecord
  self.primary_key = "movie_id"
  has_neighbors :embedding, dimensions: 1536
end
The has_neighbors declaration (from the neighbor gem) provides a convenient abstraction over vector operations.
Associate with your main model:
class Movie < ApplicationRecord
  has_one :movie_vector
  # Include any search concern if desired
end
Generating Embeddings
Embeddings convert text to vectors. Options include:
Cloud-based (quick start) — OpenAI:
class EmbeddingService
  def self.generate(text)
    client = OpenAI::Client.new
    response = client.embeddings(
      parameters: { model: "text-embedding-3-small", input: text }
    )
    response.dig("data", 0, "embedding")
  end
end
Local/offline — Use Ollama with models like nomic-embed-text or mxbai-embed-large. Call the local API (http://localhost:11434) and use the returned embedding array in the same pipeline.
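As a sketch, a local variant of the embedding service could call Ollama's REST API with Net::HTTP. This assumes an Ollama server running on localhost:11434 with the nomic-embed-text model pulled; newer Ollama versions also expose an /api/embed endpoint, so adjust the path for your installation:

```ruby
require "net/http"
require "json"
require "uri"

# Sketch: local embeddings via Ollama (assumes a local Ollama server
# with the nomic-embed-text model available).
class LocalEmbeddingService
  ENDPOINT = URI("http://localhost:11434/api/embeddings")

  # Builds the JSON request body Ollama's embeddings endpoint expects.
  def self.build_payload(text, model: "nomic-embed-text")
    { model: model, prompt: text }.to_json
  end

  def self.generate(text)
    response = Net::HTTP.post(ENDPOINT, build_payload(text),
                              "Content-Type" => "application/json")
    JSON.parse(response.body).fetch("embedding")
  end
end
```

Swap this class in wherever EmbeddingService is used; the returned array of floats feeds the same neighbor/sqlite-vec pipeline.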
Generate and store embeddings (ideally in a background job):
# Example in model
def create_vector!
  return if movie_vector.present?

  text = [title, overview].join(" ")
  embedding = EmbeddingService.generate(text)
  MovieVector.create!(movie_id: id, embedding: embedding)
end
Process existing records: Movie.find_each(&:create_vector!)
Running Vector Search Queries
With neighbor, queries are straightforward:
query_embedding = EmbeddingService.generate("funny space adventure movies")
results = MovieVector
  .nearest_neighbors(:embedding, query_embedding, distance: "cosine")
  .first(10)
  .map(&:movie)
In a controller:
def search
  if params[:query].present?
    embedding = EmbeddingService.generate(params[:query])
    @results = MovieVector
      .nearest_neighbors(:embedding, embedding, distance: "cosine")
      .limit(20)
      .map(&:movie)
  end
end
Combine with filters:
# Assumes movie_vectors carries a release_year metadata column
# (see the production tips below on adding metadata columns to vec0 tables)
MovieVector
  .nearest_neighbors(:embedding, embedding)
  .where("release_year > ?", 2015)
  .limit(10)
For raw SQL (more control, especially in RAG):
Use the virtual table KNN syntax:
SELECT movie_id, distance
FROM movie_vectors
WHERE embedding MATCH ?
ORDER BY distance
LIMIT 10
Bind the serialized query vector (neighbor handles this internally in most cases).
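For raw SQL, the query vector must be serialized into a form sqlite-vec accepts: either a JSON array string or a compact blob of contiguous little-endian 32-bit floats. A sketch of the blob form in plain Ruby (the neighbor gem does the equivalent for you):

```ruby
# Serialize a Ruby float array into the compact binary format sqlite-vec
# accepts for vec0 embedding columns: contiguous little-endian float32s.
def serialize_vector(floats)
  floats.pack("e*") # "e" = little-endian single-precision float
end

def deserialize_vector(blob)
  blob.unpack("e*")
end

blob = serialize_vector([0.1, 0.2, 0.3])
blob.bytesize # => 12 (3 floats x 4 bytes each)
```

Bind the blob as a normal SQL parameter, e.g. `db.execute(sql, [blob])` with the sqlite3 gem.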
Practical Example: Semantic Movie Search
In a movie database:
- Store title and overview in movies.
- Embed overview + title in movie_vectors.
- Enable natural-language search: “romantic films involving time travel” finds relevant matches.
- Implement “similar movies”:
def similar
  vector = movie_vector.embedding
  MovieVector
    .nearest_neighbors(:embedding, vector, distance: "cosine")
    .where.not(movie_id: id)
    .limit(8)
    .map(&:movie)
end
Practical Example: Local RAG for Document Q&A
Build a private Q&A system over your documents.
- Split documents into chunks (~400–800 tokens, with overlap).
- Generate embeddings per chunk.
- Store in two tables: chunks (content, metadata) and vec_chunks (a vec0 virtual table holding the embedding).
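The chunking step can be sketched as a naive word-based splitter with overlap. Word counts stand in for tokens here; a real tokenizer for your embedding model would be more accurate:

```ruby
# Sketch: split text into overlapping word-based chunks.
# chunk_size and overlap are in words, a rough proxy for tokens.
def chunk_text(text, chunk_size: 200, overlap: 40)
  words = text.split
  step = chunk_size - overlap
  chunks = []
  (0...words.size).step(step) do |start|
    chunks << words[start, chunk_size].join(" ")
  end
  chunks
end
```

The overlap ensures a sentence straddling a chunk boundary still appears whole in at least one chunk, which noticeably improves retrieval quality.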
Query example (idiomatic vec0 style):
SELECT c.id, c.content, distance
FROM vec_chunks v
JOIN chunks c ON v.rowid = c.id
WHERE v.embedding MATCH ?
ORDER BY distance
LIMIT 5
Pass top chunks as context to an LLM (OpenAI or local via Ollama) with a prompt restricting answers to provided context.
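A minimal sketch of such a context-restricted prompt, with the retrieved chunk texts and the user's question as inputs (the wording is illustrative, not prescriptive):

```ruby
# Sketch: assemble a RAG prompt that restricts the LLM to the
# retrieved context. `chunks` is an array of chunk content strings.
def build_rag_prompt(question, chunks)
  context = chunks.map.with_index(1) { |c, i| "[#{i}] #{c}" }.join("\n\n")
  <<~PROMPT
    Answer the question using ONLY the context below.
    If the answer is not in the context, say you don't know.

    Context:
    #{context}

    Question: #{question}
  PROMPT
end
```

Numbering the chunks lets you ask the model to cite which passage supported its answer.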
This creates a fully local chatbot over your notes, docs, or knowledge base.
Performance, Scaling, and Best Practices
sqlite-vec uses brute-force search on vec0 tables but performs well for local workloads. Benchmarks show:
- Queries on 100,000 vectors (384 dimensions) often under 100 ms on modern hardware.
- Larger dimensions (1536) or datasets (1M+) slow down significantly (seconds possible).
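The brute-force scan behind these numbers is easy to picture: every query compares the query vector against every stored vector, so cost grows as O(vectors × dimensions). A plain-Ruby illustration of the same idea (sqlite-vec does this in optimized C):

```ruby
# Illustration: brute-force KNN, the strategy vec0 tables use.
# Each query scans all vectors, so cost is O(n * d) per query.
def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

def brute_force_knn(query, vectors, k: 3)
  vectors.each_with_index
         .map { |v, i| [i, euclidean(query, v)] }
         .min_by(k) { |_, dist| dist }
end
```

This is why halving the embedding dimension roughly halves query time, and why the approach eventually loses to index-based systems like pgvector's HNSW at large scale.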
Performance varies by:
- Embedding dimension (smaller = faster).
- Hardware (CPU cores, RAM).
- Query load.
It suits tens of thousands of vectors reliably; beyond that, consider partitioning, hybrid search (vector + FTS5), or alternatives like pgvector.
Tips:
- Use cosine for text embeddings.
- Enforce consistent dimensions.
- Generate embeddings asynchronously (e.g., via Solid Queue).
- For production: verify extension loading (SELECT vec_version();).
- Add metadata columns to vec0 tables for filtering.
- Test concurrency — SQLite excels at read-heavy loads but has single-writer limits; Rails 8 mitigates this for moderate traffic.
Fully Local Embeddings
Replace OpenAI with Ollama or similar for true zero external cost and offline capability. The rest of the pipeline remains unchanged.
Comparison Table
| Solution | Complexity | Cost | Privacy | Rails Fit (Production) |
|---|---|---|---|---|
| sqlite-vec + SQLite | Low | Free | Full | Good for small/medium apps |
| pgvector + Postgres | Medium | Low/Free | High | Excellent, scales higher |
| Managed (Pinecone) | High | Paid | Lower | Easy but external |
For many internal or privacy-focused Rails apps, sqlite-vec offers the best balance of simplicity and capability.