Transformers in the browser? A concrete example: semantic search over 3,000+ Odoo modules

How we built a fully client-side semantic search engine for 3,000+ Odoo modules: no backend, no API key, no internet required after the first load.

Finding the right module used to mean browsing GitHub, reading README files, and relying on the knowledge of experienced team members. We wanted something better: type what you need, get relevant results instantly. So at Trobz, we built a database of all modules from Odoo and the OCA, our own Trobz modules (both generic and project-specific), and a selection of partner repositories.

The result is odoo-modules.trobz.com: a semantic search engine that runs entirely in your browser, with no backend, no API key, and no internet connection required after the first load. This post explains how it works and why we built it this way.

Traditional search is exact: searching for “bank statement import” won’t find a module described as “reconcile transactions from OFX files”. Synonyms, paraphrases, and domain vocabulary all trip it up.

Semantic search solves this. Instead of matching words, it matches meaning by converting text into numerical vectors (embeddings) and measuring how close two vectors are in space. Queries and documents that mean similar things end up near each other, even if they share no words.
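
As a minimal sketch of the idea (not our production code), "how close two vectors are" is usually measured with cosine similarity:

function cosineSimilarity(a, b) {
  // Cosine similarity: dot product divided by the product of magnitudes.
  // Values near 1 mean "semantically close"; values near 0 mean unrelated.
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}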

The catch: generating embeddings usually requires a server or an external API (OpenAI, Cohere, etc.). We wanted to see whether we could do without either: no dependency on an external service, and no privacy concerns.

Transformers.js: ML Models in the Browser

Transformers.js is a JavaScript port of the Hugging Face Transformers library. It runs models compiled to the ONNX format via ONNX Runtime Web (WebAssembly), directly in the browser: no Python, no GPU, no server.

Model Used

We use all-MiniLM-L6-v2 (quantized), a sentence embedding model that produces 384-dimensional vectors. It’s fast (~50–200ms per query in-browser), compact (~23 MB), and produces high-quality semantic similarity scores.

Example Code

import { pipeline, env } from "./lib/transformers.min.js";

// Serve the model and the WASM runtime from our own static host;
// never fetch anything from the Hugging Face Hub at runtime.
env.allowLocalModels = true;
env.allowRemoteModels = false;
env.localModelPath = "./model/";
env.backends.onnx.wasm.wasmPaths = "./lib/";

// Load the quantized MiniLM sentence-embedding model (~23 MB).
const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2", {
  quantized: true,
});

// Embed a query, entirely in the browser. Mean pooling plus L2
// normalization yields one 384-dimensional unit vector per input.
const output = await extractor(["expense management"], {
  pooling: "mean",
  normalize: true,
});
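
The call returns a tensor of shape [1, 384]; in Transformers.js (v2), the underlying values are exposed as a typed array via .data, so the query vector can be pulled out like this (a sketch, assuming that Tensor API):

// Copy the pooled embedding out of the tensor: a Float32Array of
// length 384 with unit L2 norm, ready for similarity comparisons.
const queryVector = new Float32Array(output.data);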

The Architecture

Offline Build (Once, at Deploy Time)

  1. Data: Each module is described in a JSON file with purpose and features fields, written by the team or generated with AI assistance.
  2. Embedding: We run generate_embeddings.js (Node.js + the same MiniLM model) to embed every module, producing three vectors per module: combined, purpose-only, and features-only (a sketch of this step follows the list).
  3. Storage: Embeddings are stored as raw BLOB columns in a SQLite database (sqlite_public.db).
  4. Deploy: The database, model weights, and JS dependencies are uploaded to a static file host. No server-side code whatsoever.
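
Here is a condensed sketch of steps 2–3, assuming Node.js with the @xenova/transformers and better-sqlite3 packages; the schema, file layout, and helper names are illustrative, and the real generate_embeddings.js may differ:

import { readFileSync, readdirSync } from "node:fs";
import { pipeline } from "@xenova/transformers";
import Database from "better-sqlite3";

const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2", {
  quantized: true,
});

// One JSON file per module, each with `purpose` and `features` fields.
const modules = readdirSync("./data").map((f) =>
  JSON.parse(readFileSync(`./data/${f}`, "utf8"))
);

const db = new Database("sqlite_public.db");
db.exec(`CREATE TABLE IF NOT EXISTS modules (
  name TEXT PRIMARY KEY,
  emb_combined BLOB, emb_purpose BLOB, emb_features BLOB
)`);
const insert = db.prepare("INSERT OR REPLACE INTO modules VALUES (?, ?, ?, ?)");

// Embed one text and serialize the 384 floats as a raw BLOB.
async function embed(text) {
  const out = await extractor([text], { pooling: "mean", normalize: true });
  return Buffer.from(new Float32Array(out.data).buffer);
}

for (const mod of modules) {
  insert.run(
    mod.name,
    await embed(`${mod.purpose}\n${mod.features.join("\n")}`),
    await embed(mod.purpose),
    await embed(mod.features.join("\n"))
  );
}
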
Online Search (In the Browser)

  1. Load: On first visit, the browser downloads the database (~28 MB) and the model weights (~23 MB). Both are cached by the browser; subsequent visits are instant.
  2. Read: sql.js (SQLite compiled to WASM) reads the database and loads all embedding vectors into Float32Array buffers in memory.
  3. Embed query: The user’s query is embedded in-browser using the same MiniLM model.
  4. Search: Cosine similarity is computed in pure JavaScript as a dot-product loop over all vectors (L2-normalized vectors make cosine similarity equivalent to the dot product); see the sketch after this list.
  5. Rank: Top 50 results are returned, sorted by score, filtered by org if selected.
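
A condensed sketch of steps 2–4, assuming sql.js is already loaded (its initSqlJs entry point is available) and that `extractor` is the pipeline created earlier; the table and column names are illustrative:

// Fetch the database once and open it with sql.js (SQLite in WASM).
const SQL = await initSqlJs({ locateFile: (f) => `./lib/${f}` });
const bytes = new Uint8Array(await (await fetch("./sqlite_public.db")).arrayBuffer());
const db = new SQL.Database(bytes);

// Load every embedding BLOB into a Float32Array, once, at startup.
const rows = [];
const stmt = db.prepare("SELECT name, emb_combined FROM modules");
while (stmt.step()) {
  const r = stmt.getAsObject();
  const blob = r.emb_combined; // Uint8Array holding 384 raw floats
  rows.push({
    name: r.name,
    vec: new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4),
  });
}
stmt.free();

// Per query: embed, then score every module with a plain dot-product
// loop (vectors are L2-normalized, so this equals cosine similarity).
async function search(query) {
  const out = await extractor([query], { pooling: "mean", normalize: true });
  const q = out.data;
  const scored = rows.map(({ name, vec }) => {
    let score = 0;
    for (let i = 0; i < q.length; i++) score += q[i] * vec[i];
    return { name, score };
  });
  return scored.sort((a, b) => b.score - a.score).slice(0, 50);
}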

The entire search pipeline (embedding, similarity computation, ranking) runs client-side in under 300ms.

Two Separate Search Fields

One insight that improved result quality: modules have two distinct types of text: a short purpose (what the module does) and a list of features (how it does it).

We store separate embeddings for each and expose two search inputs:

  • Purpose: “What kind of module are you looking for?” e.g. expense management
  • Features: “What specific behaviour do you need?” e.g. cancel validated expense reports

When both fields are filled, scores are averaged. This lets users progressively narrow results without re-running a single monolithic query.
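
In code terms, the combination is trivial; a sketch (the production scoring may differ in details):

// Combine per-field similarity scores: use whichever field is filled,
// or the average when both are.
function combinedScore(purposeScore, featuresScore) {
  if (purposeScore == null) return featuresScore;
  if (featuresScore == null) return purposeScore;
  return (purposeScore + featuresScore) / 2;
}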

Pros and Cons

Pros

  • No backend costs: The search runs entirely in the browser. Hosting is a static file server with no compute, no database server, no API quotas.
  • No tracking: Queries never leave the user’s device. There are no logs, no analytics on what people search for.
  • Works offline: After the first load, the tool works with no internet connection, useful in client environments with restricted access.
  • Scales freely: Every user runs their own search. More users don’t mean more server load.
  • Reproducible: The model weights are pinned and served locally; results don’t change because an API updated its model.

Cons

  • First load is heavy: Downloading ~50 MB on first visit (model + database) takes a few seconds on a slow connection. We mitigate this with a step-by-step progress indicator (see the sketch after this list).
  • Memory usage: Loading all 3,289 module vectors into Float32Array uses ~5 MB of RAM. Acceptable for a desktop browser, but worth monitoring as the catalog grows.
  • Linear scan: Similarity is computed over all vectors on every search. At 3,289 modules this is fast (~5ms). At 100,000 modules it would need approximate nearest-neighbour indexing (e.g. HNSW). For our scale, brute force is fine.
  • No incremental updates: Adding a module requires regenerating and redeploying the database. We automate this with GitHub Actions; a push to data/ triggers regeneration and deployment automatically.
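
For the progress indicator, Transformers.js accepts a progress_callback option on pipeline(), which reports download progress per file. A sketch of the wiring, where updateProgress is a hypothetical UI helper:

const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2", {
  quantized: true,
  // Called repeatedly while the model files download.
  progress_callback: (p) => {
    if (p.status === "progress") {
      updateProgress(p.file, p.progress); // p.progress is a percentage
    }
  },
});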

What We Learned

The most surprising thing: a 23 MB quantized model running in WASM is good enough for production use. The quality of MiniLM-L6-v2 embeddings on short technical descriptions is comparable to much larger models for this specific use case.

The second insight: separating purpose and features embeddings matters. A combined embedding averages over both, which dilutes specificity. Two separate fields let the model focus on what the user actually cares about.

Try It

The public version is at odoo-modules.trobz.com/search/: 3,289 OCA and Odoo modules, fully offline after the first load.
