Key Takeaways: A RAG agent over your product catalogue is useful, but only if the retrieval layer is built properly — naive keyword matching misses too much. This tutorial covers indexing product.template data with pgvector, using a cross-encoder for reranking, prompting the model to abstain when confidence is low, and wiring everything to Odoo’s chatbot widget. The hardest part isn’t the AI. It’s deciding what goes into the index and what the fallback path looks like.
A sales rep in the middle of a customer call shouldn’t have to search three different places to answer “does this come in a 5-litre drum, and is there a price break for orders over 50 units?” That question sits at the intersection of product variant data, pricing rules, and possibly archived quote notes. The answer exists in Odoo. Getting it fast is the problem.
A retrieval-augmented agent over your product data solves this — but only if the retrieval layer is tight. This tutorial builds it end-to-end: indexing, retrieval, answer generation, and the Odoo chatbot integration. It also covers the part most tutorials skip — what happens when the agent doesn’t have enough context to answer confidently.
The stack: Python, pgvector running in your existing Odoo PostgreSQL database, sentence-transformers for embeddings, a cross-encoder for reranking, and Odoo’s built-in chatbot for the UI layer. No external vector database required.
Step 1: Index Your Product Data
The index defines what the agent can answer. Build it poorly and you’ll get accurate retrieval on the easy cases and noise on the hard ones.
What to index
Pull from three sources: product.template (base product data), product.pricelist.item (pricing rules), and sale.order.line descriptions from historical quotes. Each needs a different chunking strategy because they have different structures and different staleness profiles.
For product.template, combine the product name, sales description (description_sale), internal notes (description), category path, and variant attribute values into a single text chunk. Variants matter: if “Widget A” comes in three sizes, each variant’s attributes should be in the chunk so a question about a specific size retrieves the right record.
For pricelists, index each product.pricelist.item as a separate chunk that includes the product name, rule type (fixed, percentage, formula), minimum quantity, and the computed price where it’s deterministic. Formula-based rules are a special case — covered in the gotcha below.
For quote history, the useful signal sits in sale.order.line.name — the description field that sales reps edit manually during quoting. These carry institutional knowledge about how products have been positioned in real deals.
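As a sketch of the pricelist chunking strategy above, a small formatter can turn one rule row into indexable text. The dict keys mirror real product.pricelist.item fields (compute_price, min_quantity, fixed_price), but the row shape itself is an assumption of this example, not code from the post’s pipeline:

```python
def pricelist_chunk(row: dict) -> str:
    """Format one product.pricelist.item row as an indexable text chunk.

    `row` is a plain dict (e.g. from a DictCursor); the keys mirror
    product.pricelist.item fields but are an assumption of this sketch.
    """
    lines = [f"Product: {row['product_name']}"]
    # compute_price is one of 'fixed', 'percentage', 'formula' in Odoo.
    lines.append(f"Pricing rule: {row['compute_price']}")
    if row.get("min_quantity"):
        lines.append(f"Minimum quantity: {row['min_quantity']}")
    if row["compute_price"] == "fixed" and row.get("fixed_price") is not None:
        # Only deterministic prices get embedded.
        lines.append(f"Price: {row['fixed_price']}")
    else:
        # Formula/percentage rules stay descriptive; see the gotcha below
        # about stale computed values.
        lines.append("Price: computed at query time (non-fixed rule)")
    return "\n".join(lines)
```

A fixed rule yields a chunk with the concrete price; a formula rule yields only the rule description, which keeps the index from going stale when costs move.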
Generating embeddings
from sentence_transformers import SentenceTransformer
import psycopg2

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def build_product_chunks(conn):
    cur = conn.cursor()
    # NOTE: on Odoo 16+ translated fields (pt.name, the descriptions,
    # pav.name) are stored as JSONB; extract one language with
    # e.g. pt.name->>'en_US'.
    cur.execute("""
        SELECT
            pt.id,
            pt.name,
            pt.description_sale,
            pt.description,
            pc.complete_name AS category,
            STRING_AGG(DISTINCT pav.name, ', ') AS variants
        FROM product_template pt
        LEFT JOIN product_category pc ON pc.id = pt.categ_id
        -- Join through product_template_attribute_value so only the values
        -- actually enabled on this template are listed, not every value
        -- of the attribute.
        LEFT JOIN product_template_attribute_value ptav
            ON ptav.product_tmpl_id = pt.id
        LEFT JOIN product_attribute_value pav
            ON pav.id = ptav.product_attribute_value_id
        WHERE pt.active = true AND pt.sale_ok = true
        GROUP BY pt.id, pt.name, pt.description_sale, pt.description, pc.complete_name
    """)
    chunks = []
    for row in cur.fetchall():
        tmpl_id, name, desc_sale, desc, category, variants = row
        text = f"Product: {name}\nCategory: {category or 'Uncategorized'}\n"
        if desc_sale:
            text += f"Sales description: {desc_sale}\n"
        if desc:
            text += f"Internal notes: {desc}\n"
        if variants:
            text += f"Available variants: {variants}\n"
        chunks.append({"id": tmpl_id, "type": "product", "text": text})
    return chunks

def embed_and_store(conn, chunks):
    cur = conn.cursor()
    texts = [c["text"] for c in chunks]
    embeddings = model.encode(texts, batch_size=64, show_progress_bar=True)
    for chunk, embedding in zip(chunks, embeddings):
        cur.execute("""
            INSERT INTO ai_product_embeddings (source_id, source_type, content, embedding)
            VALUES (%s, %s, %s, %s::vector)
            ON CONFLICT (source_id, source_type) DO UPDATE
            SET content = EXCLUDED.content, embedding = EXCLUDED.embedding
        """, (chunk["id"], chunk["type"], chunk["text"], embedding.tolist()))
    conn.commit()
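Because the three sources have different staleness profiles, you will re-run the indexer regularly, and a naive re-run re-embeds everything. One possible optimization, assuming you persist a content hash alongside each embedding (an extra column on ai_product_embeddings, which this post’s schema does not include), is to compare hashes and skip unchanged chunks:

```python
import hashlib

def chunks_to_reembed(chunks: list[dict], stored_hashes: dict) -> list[dict]:
    """Return only the chunks whose text changed since the last run.

    stored_hashes maps (source_id, source_type) -> sha256 hex digest of the
    previously embedded text. Persisting those digests is an assumption of
    this sketch, not part of the original schema.
    """
    changed = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk["text"].encode("utf-8")).hexdigest()
        if stored_hashes.get((chunk["id"], chunk["type"])) != digest:
            # Carry the new digest along so the caller can store it
            # after embedding succeeds.
            changed.append({**chunk, "hash": digest})
    return changed
```

On a mostly stable catalogue this cuts nightly embedding work to the handful of records that actually changed.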
The table schema
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE ai_product_embeddings (
    id SERIAL PRIMARY KEY,
    source_id INTEGER NOT NULL,
    source_type VARCHAR(50) NOT NULL,
    content TEXT NOT NULL,
    embedding vector(384),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE (source_id, source_type)
);

CREATE INDEX ON ai_product_embeddings
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
all-MiniLM-L6-v2 produces 384-dimensional vectors. If you switch to all-mpnet-base-v2 (768 dimensions), update the schema. The IVFFlat index works well up to around 100k products. Beyond that, consider HNSW.
⚠️ Gotcha: Pricelist rules that use formula pricing (e.g., “15% above cost”) are not static. If you embed a computed price, it goes stale when the cost changes. For formula-based rules, embed the rule description, not the computed value. Calculate the actual price at query time using Odoo’s product.pricelist._get_product_price() method.
For an alternative approach using the OCA module to store embeddings directly on Odoo records, see Introduction to the field_vector OCA Module for Odoo.
Step 2: Retrieval with Vector Search and Reranking
Cosine similarity on a single query embedding is fast. It’s also too coarse for product questions, where the user’s phrasing rarely matches the product description precisely. A question like “something for sealing industrial pipes under 10 bar pressure” might embed near “pipe fittings” without surfacing the specific gasket material that actually matches.
The fix is two-stage retrieval: fetch 20 candidates with pgvector, then rerank with a cross-encoder that scores each candidate against the actual question text.
from sentence_transformers import SentenceTransformer, CrossEncoder

bi_encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def retrieve(conn, question: str, top_k: int = 5, candidate_pool: int = 20) -> list[dict]:
    query_embedding = bi_encoder.encode(question).tolist()
    cur = conn.cursor()
    cur.execute("""
        SELECT source_id, source_type, content,
               1 - (embedding <=> %s::vector) AS cosine_similarity
        FROM ai_product_embeddings
        ORDER BY embedding <=> %s::vector
        LIMIT %s
    """, (query_embedding, query_embedding, candidate_pool))
    candidates = [
        {"source_id": row[0], "source_type": row[1], "content": row[2], "score": row[3]}
        for row in cur.fetchall()
    ]
    # Rerank with cross-encoder
    pairs = [(question, c["content"]) for c in candidates]
    rerank_scores = cross_encoder.predict(pairs)
    for candidate, score in zip(candidates, rerank_scores):
        candidate["rerank_score"] = float(score)
    candidates.sort(key=lambda x: x["rerank_score"], reverse=True)
    return candidates[:top_k]
The cross-encoder (ms-marco-MiniLM-L-6-v2) scores each question-passage pair jointly rather than independently. It runs in roughly 20-50ms for a batch of 20 candidates on CPU — slow enough that you don’t want to run it on more candidates than necessary. In testing on a 3,000-product catalogue, reranking moved the correct answer from position 4-7 to position 1-2 on roughly 30% of product-specific queries. That’s the difference between a useful agent and a frustrating one.
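The rerank scores can do one more job: gating generation. A possible pre-check, with illustrative thresholds (ms-marco cross-encoders emit uncalibrated logits, so both values must be tuned on your own logged queries rather than copied from here):

```python
def retrieval_confidence(candidates: list[dict],
                         min_score: float = 0.0,
                         min_margin: float = 0.5) -> str:
    """Classify retrieval quality before spending an LLM call.

    Returns 'abstain' (skip generation, escalate immediately),
    'ambiguous' (generate, but surface multiple sources), or 'ok'.
    Both thresholds are illustrative assumptions, not calibrated values.
    """
    if not candidates:
        return "abstain"
    scores = sorted((c["rerank_score"] for c in candidates), reverse=True)
    if scores[0] < min_score:
        # Nothing scored well: the index likely has a gap here.
        return "abstain"
    if len(scores) > 1 and scores[0] - scores[1] < min_margin:
        # Top two are nearly tied: the ranking is effectively a coin flip.
        return "ambiguous"
    return "ok"
```

Abstaining before generation saves a model call on the clearly hopeless cases, and the 'ambiguous' signal pairs naturally with the conflicting-pricelist check described in the next step.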
Step 3: Answer Generation with Faithful Prompting
The model should only answer from retrieved context. When the context doesn’t support a confident answer, the agent should say so explicitly. A sales agent that confidently confabulates pricing information is worse than one that escalates — because the sales rep trusts it.
import anthropic

client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a product knowledge assistant for a sales team. Answer questions about products using only the context provided below.

Rules:
1. Answer only from the provided context. Do not add information from your training knowledge.
2. If the context doesn't contain enough information to answer confidently, say exactly: "I don't have enough information to answer this. I'll escalate to a product specialist."
3. For pricing, state the exact figure from the context. If it's a formula-based rule, describe the rule — do not calculate a number.
4. Keep answers short. Two to four sentences. Sales reps are usually on a call.
5. If the question involves complex configuration or multiple interacting products, flag it for human review."""

def generate_answer(question: str, context_chunks: list[dict]) -> dict:
    context = "\n\n---\n\n".join(
        f"[{c['source_type'].upper()} #{c['source_id']}]\n{c['content']}"
        for c in context_chunks
    )
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}"
        }]
    )
    answer = response.content[0].text
    needs_escalation = (
        "I don't have enough information" in answer
        or "escalate" in answer.lower()
    )
    return {
        "answer": answer,
        "needs_escalation": needs_escalation,
        "sources": [{"id": c["source_id"], "type": c["source_type"]} for c in context_chunks]
    }
The needs_escalation flag is structural, not cosmetic. The next step uses it to route the conversation.
One failure mode worth flagging: when the context contains conflicting information — two pricelist rules that apply to the same product at the same quantity threshold — the model tends to pick one and present it as definitive. The safer design is to surface both rules in the answer and let the sales rep interpret them. You can add a check for this before the generation step by looking for multiple product.pricelist.item chunks for the same product in the top-k results.
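That pre-generation check fits in a few lines. It assumes each pricelist chunk carries a product_id key, which the chunk builder in Step 1 does not store; you would add it when indexing pricelist items:

```python
from collections import Counter

def conflicting_pricelist_products(top_chunks: list[dict]) -> list:
    """Return product ids that appear in more than one pricelist chunk
    among the top-k results: a signal that rules may conflict and the
    answer should surface all of them rather than pick one.

    Assumes each pricelist chunk carries a 'product_id' key, which is
    an addition to the chunk format shown earlier in this post.
    """
    counts = Counter(
        c.get("product_id") for c in top_chunks
        if c["source_type"] == "pricelist" and c.get("product_id") is not None
    )
    return [pid for pid, n in counts.items() if n > 1]
```

If the returned list is non-empty, prepend a note to the prompt asking the model to present every matching rule, or skip generation and hand the rep the raw rules directly.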
Step 4: Wiring to Odoo’s Chatbot
Odoo’s im_livechat module has a built-in chatbot system. It’s the right integration point for an internal agent — the UI is already in Discuss, access control works through existing Odoo users, and message history persists in mail.message.
The minimal module structure:
product_support_agent/
├── __manifest__.py
├── models/
│ └── product_agent.py
└── views/
└── chatbot_script.xml
__manifest__.py
{
    "name": "Product Support Agent",
    "version": "17.0.1.0.0",
    "depends": ["product", "im_livechat", "mail"],
    "data": ["views/chatbot_script.xml"],
    "installable": True,
}
models/product_agent.py
import requests

from odoo import api, models

class ProductAgentStep(models.Model):
    # im_livechat's chatbot step model is named "chatbot.script.step"
    _inherit = "chatbot.script.step"

    @api.model
    def _call_product_agent(self, question: str) -> dict:
        """
        POST the question to an external FastAPI microservice running the
        retrieval pipeline. Returns the structured response dict.
        """
        response = requests.post(
            "http://localhost:8001/ask",
            json={"question": question},
            timeout=15,
        )
        response.raise_for_status()
        return response.json()
⚠️ Gotcha: Don’t call the agent synchronously from within an Odoo RPC request if you can avoid it. The cross-encoder adds latency that can stall Odoo workers. Run the retrieval pipeline as a separate HTTP microservice (FastAPI works well here) and call it asynchronously, or use Odoo’s bus.bus for non-blocking message delivery. The timeout=15 above is a hard ceiling — tune it based on your hardware.
views/chatbot_script.xml
<odoo>
    <record id="product_support_chatbot" model="chatbot.script">
        <field name="title">Product Support Agent</field>
    </record>
    <record id="product_agent_step_greet" model="chatbot.script.step">
        <field name="message">Hi! Ask me anything about our product catalogue — pricing, variants, availability.</field>
        <field name="step_type">free_input_multi</field>
        <field name="sequence">1</field>
        <field name="chatbot_script_id" ref="product_support_chatbot"/>
    </record>
</odoo>
The dynamic response injection — taking the agent’s answer and posting it back into the chat thread — requires a JavaScript override of im_livechat’s frontend chatbot service. That’s a static/src/js/ customization beyond the scope of this post. The hook to override is _sendMessageChatbot in the livechat service.
Step 5: Escalation and the Human Handoff
The escalation path needs to be as deliberate as the retrieval path. When needs_escalation is true, the agent should do three things:
1. Log the failure. Store the question, the retrieved context that wasn’t sufficient, and a timestamp. These logged failures are your index improvement backlog — each one represents a gap in product descriptions or pricing coverage.
2. Send an honest response. Something like: “I found some related products but I’m not confident this fully answers your question. Let me connect you with a product specialist.”
3. Route to a human. Either through im_livechat’s built-in operator escalation, or by creating a project.task assigned to the product team with the question and context attached.
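The first of those three steps can start from a plain log-record builder; where the record is persisted (a dedicated Odoo model, a plain table, a JSONL file) is left open in this sketch:

```python
from datetime import datetime, timezone

def escalation_record(question: str, context_chunks: list[dict]) -> dict:
    """Build the failure-log entry for an escalated question: the question,
    the retrieved-but-insufficient context, and a timestamp."""
    return {
        "question": question,
        "retrieved": [
            {"id": c["source_id"], "type": c["source_type"],
             "score": c.get("rerank_score")}
            for c in context_chunks
        ],
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

Reviewing these records weekly turns the escalation log into a concrete index-improvement backlog.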
The logging step is what most implementations skip. Those unanswered questions are the most valuable data you’ll collect. A question the agent couldn’t answer confidently today is a data gap to close tomorrow — usually by adding better descriptions to product.template.description_sale, not by changing the model.
Track the escalation rate explicitly. If it’s above 15-20% in the first two weeks, the index isn’t covering the actual questions sales reps ask. In our experience, the fix is almost always content, not the model. Better product descriptions outperform better models.
Key Takeaways
- Index product.template, pricelist items, and historical quote descriptions separately — they have different structures and different staleness profiles.
- Two-stage retrieval (bi-encoder candidates + cross-encoder reranking) meaningfully improves precision over single-pass cosine similarity, especially for product-specific terminology.
- Prompt the model to abstain when the context is insufficient. A confident wrong answer in a sales context causes real damage.
- For formula-based pricing rules, embed the rule description and compute the price at query time. Never embed a stale computed value.
- Escalation rate is the most useful early metric. If it’s high, fix the index content before touching anything else.
At Trobz, we build these agents as part of AI PoC sprints — defining retrieval quality metrics before integrating into Odoo, testing against real product catalogues, and measuring escalation rates in the first week. If you’re building this for your own catalogue, reach out and we can share the evaluation setup we use.