Vector search enables applications to find results based on semantic similarity rather than exact keyword matches. For example, a query like “funny cat videos” can return content about playful kittens or humorous pet moments, even without identical wording. This works through embeddings, numerical representations that capture the meaning of text, images, or other data.
Previously, vector search often required dedicated vector databases or cloud services. Now, using Rails with SQLite and the sqlite-vec extension, you can implement efficient, private, local semantic search without external dependencies. This approach keeps everything in a single lightweight database file.
This article explains how to set up Rails SQLite vector search using sqlite-vec. It covers installation, generating embeddings, performing queries, practical examples including a movie search and a RAG (Retrieval-Augmented Generation) system, and important considerations for performance and production use.
Why Use Local Vector Search with Rails and SQLite?
SQLite is the default database for Rails development and has seen significant improvements in Rails 8, making it more suitable for production in small to medium-scale applications. Rails 8 enhances SQLite support with better defaults for concurrency (via WAL mode), caching, queuing, and more.
The sqlite-vec extension, developed by Alex Garcia, adds vector storage and search capabilities directly to SQLite. It is written in pure C, has no external dependencies, and runs on most platforms (macOS, Linux, Windows, ARM devices, even browsers via WASM). It introduces the vec0 virtual table type and supports fast k-nearest neighbors (KNN) queries using distance metrics such as cosine or Euclidean.
Key advantages include:
- Full data privacy — no data leaves your server or device.
- Zero infrastructure cost (and fully free when using local embedding models).
- Simplicity — one database file, no separate service to manage.
- Good performance for many use cases, especially datasets up to tens of thousands of vectors.
This setup suits internal tools, personal knowledge bases, offline-capable apps, or privacy-sensitive features.
Setting Up sqlite-vec in a Rails Application
Start with a Rails 8 (or later) application using SQLite.
Step 1: Add Required Gems
In your Gemfile:
gem "sqlite-vec"
gem "neighbor" # Rails-friendly interface for vector operations
gem "ruby-openai" # For generating embeddings (can swap for local alternatives)
Run bundle install.
Note: Some older examples used platform: :ruby_33, but this is not standard. The gem typically installs platform-specific binaries automatically (e.g., for x86_64-linux, arm64-darwin). Omit the platform specifier unless addressing a specific compatibility issue.
Step 2: Initialize the Neighbor Gem for SQLite
Create or edit config/initializers/neighbor.rb:
Neighbor::SQLite.initialize!
This loads sqlite-vec into SQLite connections.
You can also run:
rails generate neighbor:sqlite
This creates the initializer if needed.
Defining the Vector Schema
Use vec0 virtual tables for best performance with KNN queries.
Example migration for storing movie embeddings:
class CreateMovieVectors < ActiveRecord::Migration[8.0]
  def change
    create_virtual_table :movie_vectors, :vec0, [
      "movie_id integer primary key",
      "embedding float[1536] distance_metric=cosine"
    ]
  end
end
The dimension (1536 here) matches models like OpenAI’s text-embedding-3-small. Use cosine distance for most text similarity tasks.
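To make the choice of metric concrete, the cosine distance that sqlite-vec computes for each stored vector can be illustrated in plain Ruby:

```ruby
# Cosine distance: 1 - (a . b) / (|a| * |b|)
# 0.0 means identical direction; 1.0 means orthogonal (unrelated).
def cosine_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  dot   = a.zip(b).sum { |x, y| x * y }
  mag_a = Math.sqrt(a.sum { |x| x * x })
  mag_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (mag_a * mag_b)
end

cosine_distance([1.0, 0.0], [1.0, 0.0]) # => 0.0 (same direction)
cosine_distance([1.0, 0.0], [0.0, 1.0]) # => 1.0 (orthogonal)
```

Because cosine distance ignores vector magnitude, it compares only direction, which is what makes it a good default for text embeddings.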
Create the model:
class MovieVector < ApplicationRecord
  self.primary_key = "movie_id"
  has_neighbors :embedding, dimensions: 1536
end
The has_neighbors declaration (from the neighbor gem) provides a convenient abstraction over vector operations.
Associate with your main model:
class Movie < ApplicationRecord
  has_one :movie_vector
  # Include any search concern if desired
end
Generating Embeddings
Embeddings convert text to vectors. Options include:
Cloud-based (quick start) — OpenAI:
class EmbeddingService
  def self.generate(text)
    client = OpenAI::Client.new
    response = client.embeddings(
      parameters: { model: "text-embedding-3-small", input: text }
    )
    response.dig("data", 0, "embedding")
  end
end
Local/offline — Use Ollama with models like nomic-embed-text or mxbai-embed-large. Call the local API (http://localhost:11434) and use the returned embedding array in the same pipeline.
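As a sketch, a local variant of the embedding service could call Ollama's REST API with Net::HTTP. This assumes an Ollama server running on localhost:11434 with the nomic-embed-text model pulled; newer Ollama versions also expose an /api/embed endpoint, so adjust the path for your installation:

```ruby
require "net/http"
require "json"
require "uri"

# Sketch: local embeddings via Ollama (assumes a local Ollama server
# with the nomic-embed-text model available).
class LocalEmbeddingService
  ENDPOINT = URI("http://localhost:11434/api/embeddings")

  # Builds the JSON request body Ollama's embeddings endpoint expects.
  def self.build_payload(text, model: "nomic-embed-text")
    { model: model, prompt: text }.to_json
  end

  def self.generate(text)
    response = Net::HTTP.post(ENDPOINT, build_payload(text),
                              "Content-Type" => "application/json")
    JSON.parse(response.body).fetch("embedding")
  end
end
```

Swap this class in wherever EmbeddingService is used; the returned array of floats feeds the same neighbor/sqlite-vec pipeline.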
Generate and store embeddings (ideally in a background job):
# Example in model
def create_vector!
  return if movie_vector.present?

  text = [title, overview].join(" ")
  embedding = EmbeddingService.generate(text)
  MovieVector.create!(movie_id: id, embedding: embedding)
end
Process existing records: Movie.find_each(&:create_vector!)
Running Vector Search Queries
With neighbor, queries are straightforward:
query_embedding = EmbeddingService.generate("funny space adventure movies")
results = MovieVector
  .nearest_neighbors(:embedding, query_embedding, distance: "cosine")
  .first(10)
  .map(&:movie)
In a controller:
def search
  if params[:query].present?
    embedding = EmbeddingService.generate(params[:query])
    @results = MovieVector
      .nearest_neighbors(:embedding, embedding, distance: "cosine")
      .limit(20)
      .map(&:movie)
  end
end
Combine with filters:
# Assumes movie_vectors carries a release_year metadata column
# (see the production tips below on adding metadata columns to vec0 tables)
MovieVector
  .nearest_neighbors(:embedding, embedding)
  .where("release_year > ?", 2015)
  .limit(10)
For raw SQL (more control, especially in RAG):
Use the virtual table KNN syntax:
SELECT movie_id, distance
FROM movie_vectors
WHERE embedding MATCH ?
ORDER BY distance
LIMIT 10
Bind the serialized query vector (neighbor handles this internally in most cases).
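For raw SQL, the query vector must be serialized into a form sqlite-vec accepts: either a JSON array string or a compact blob of contiguous little-endian 32-bit floats. A sketch of the blob form in plain Ruby (the neighbor gem does the equivalent for you):

```ruby
# Serialize a Ruby float array into the compact binary format sqlite-vec
# accepts for vec0 embedding columns: contiguous little-endian float32s.
def serialize_vector(floats)
  floats.pack("e*") # "e" = little-endian single-precision float
end

def deserialize_vector(blob)
  blob.unpack("e*")
end

blob = serialize_vector([0.1, 0.2, 0.3])
blob.bytesize # => 12 (3 floats x 4 bytes each)
```

Bind the blob as a normal SQL parameter, e.g. `db.execute(sql, [blob])` with the sqlite3 gem.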
Practical Example: Semantic Movie Search
In a movie database:
- Store title and overview in movies.
- Embed overview + title in movie_vectors.
- Enable natural-language search: “romantic films involving time travel” finds relevant matches.
- Implement “similar movies”:
def similar
  vector = movie_vector.embedding
  MovieVector
    .nearest_neighbors(:embedding, vector, distance: "cosine")
    .where.not(movie_id: id)
    .limit(8)
    .map(&:movie)
end
Practical Example: Local RAG for Document Q&A
Build a private Q&A system over your documents.
- Split documents into chunks (~400–800 tokens, with overlap).
- Generate embeddings per chunk.
- Store in two tables: chunks (content, metadata) and vec_chunks (a vec0 virtual table holding the embedding).
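The chunking step can be sketched as a naive word-based splitter with overlap. Word counts stand in for tokens here; a real tokenizer for your embedding model would be more accurate:

```ruby
# Sketch: split text into overlapping word-based chunks.
# chunk_size and overlap are in words, a rough proxy for tokens.
def chunk_text(text, chunk_size: 200, overlap: 40)
  words = text.split
  step = chunk_size - overlap
  chunks = []
  (0...words.size).step(step) do |start|
    chunks << words[start, chunk_size].join(" ")
  end
  chunks
end
```

The overlap ensures a sentence straddling a chunk boundary still appears whole in at least one chunk, which noticeably improves retrieval quality.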
Query example (idiomatic vec0 style):
SELECT c.id, c.content, distance
FROM vec_chunks v
JOIN chunks c ON v.rowid = c.id
WHERE v.embedding MATCH ?
ORDER BY distance
LIMIT 5
Pass top chunks as context to an LLM (OpenAI or local via Ollama) with a prompt restricting answers to provided context.
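A minimal sketch of such a context-restricted prompt, with the retrieved chunk texts and the user's question as inputs (the wording is illustrative, not prescriptive):

```ruby
# Sketch: assemble a RAG prompt that restricts the LLM to the
# retrieved context. `chunks` is an array of chunk content strings.
def build_rag_prompt(question, chunks)
  context = chunks.map.with_index(1) { |c, i| "[#{i}] #{c}" }.join("\n\n")
  <<~PROMPT
    Answer the question using ONLY the context below.
    If the answer is not in the context, say you don't know.

    Context:
    #{context}

    Question: #{question}
  PROMPT
end
```

Numbering the chunks lets you ask the model to cite which passage supported its answer.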
This creates a fully local chatbot over your notes, docs, or knowledge base.
Performance, Scaling, and Best Practices
sqlite-vec uses brute-force search on vec0 tables but performs well for local workloads. Benchmarks show:
- Queries on 100,000 vectors (384 dimensions) often under 100 ms on modern hardware.
- Larger dimensions (1536) or datasets (1M+) slow down significantly (seconds possible).
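The brute-force scan behind these numbers is easy to picture: every query compares the query vector against every stored vector, so cost grows as O(vectors × dimensions). A plain-Ruby illustration of the same idea (sqlite-vec does this in optimized C):

```ruby
# Illustration: brute-force KNN, the strategy vec0 tables use.
# Each query scans all vectors, so cost is O(n * d) per query.
def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

def brute_force_knn(query, vectors, k: 3)
  vectors.each_with_index
         .map { |v, i| [i, euclidean(query, v)] }
         .min_by(k) { |_, dist| dist }
end
```

This is why halving the embedding dimension roughly halves query time, and why the approach eventually loses to index-based systems like pgvector's HNSW at large scale.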
Performance varies by:
- Embedding dimension (smaller = faster).
- Hardware (CPU cores, RAM).
- Query load.
It suits tens of thousands of vectors reliably; beyond that, consider partitioning, hybrid search (vector + FTS5), or alternatives like pgvector.
Tips:
- Use cosine for text embeddings.
- Enforce consistent dimensions.
- Generate embeddings asynchronously (e.g., via Solid Queue).
- For production: verify extension loading (SELECT vec_version();).
- Add metadata columns to vec0 tables for filtering.
- Test concurrency — SQLite excels at read-heavy loads but has single-writer limits; Rails 8 mitigates this for moderate traffic.
Fully Local Embeddings
Replace OpenAI with Ollama or similar for true zero external cost and offline capability. The rest of the pipeline remains unchanged.
Comparison Table
| Solution | Complexity | Cost | Privacy | Rails Fit (Production) |
|---|---|---|---|---|
| sqlite-vec + SQLite | Low | Free | Full | Good for small/medium apps |
| pgvector + Postgres | Medium | Low/Free | High | Excellent, scales higher |
| Managed (Pinecone) | High | Paid | Lower | Easy but external |
For many internal or privacy-focused Rails apps, sqlite-vec offers the best balance of simplicity and capability.