Key Takeaways: The most popular AI integration tools — cloud orchestration platforms, hosted vector databases, API-only inference — all introduce dependencies that erode control at exactly the wrong moments: during a compliance audit, when pricing changes, or when a vendor sunsets a feature. We chose open-source alternatives — n8n over Zapier/Make, pgvector over Pinecone, Transformers.js over cloud embedding APIs — because control and auditability were non-negotiable. Each choice came with real trade-offs. This post explains both sides.
Every AI project we have deployed in production has hit the same inflection point: the moment the convenient cloud service becomes a liability. Pricing doubles. A vendor deprecates an endpoint. A client’s data governance team realises that customer records are being sent to a third-party API without an explicit processing agreement.
These are not edge cases. They are predictable outcomes of building AI workflows on infrastructure you do not control.
The “Just Use the API” Problem
Cloud-based AI tools have excellent developer experience. Zapier connects anything in minutes. Pinecone has a clean SDK. OpenAI’s API is genuinely impressive. We use all three in prototyping.
The problem shows up later. Per-execution pricing scales badly once you have real volume. Hosted vector databases hold your embeddings on infrastructure you cannot audit. SaaS automation tools have terms of service that allow them to change pricing, deprecate workflows, or go offline — and your business logic goes with them.
For a solo side project, none of this matters. For an enterprise running Odoo with 200+ users, customer financial data, and compliance obligations, it matters considerably.
This is not a blanket argument that open source is always better. It is an argument that for AI workflows embedded in ERP operations, the costs of dependency are higher than most teams model upfront.
n8n: Why Self-Hostable Matters More Than Convenience
The standard pitch for Zapier and Make is that they are faster to build with. That is true for the first workflow. By workflow ten, you are paying per execution, hitting rate limits you did not anticipate, and debugging flows in a UI that has no real diff view.
n8n runs on your own infrastructure — a Docker container on the same server as your Odoo instance if you want. Every workflow is a JSON file you can version in git. Run it behind your VPN and sensitive data from account.move or crm.lead never leaves your network to reach the automation layer.
The practical differences worth naming:
- No per-execution pricing. A workflow polling stock.quant every 15 minutes for low-inventory alerts does not accumulate a monthly bill. A client running 50 active automation workflows would pay thousands per month on Zapier's Business plan. On self-hosted n8n, the cost is the compute.
- Auditable execution logs. When a client's finance team asks whether the invoice reminder actually sent on a specific date, you can show them the exact payload, timestamp, and HTTP response code. With cloud automation tools, that data is in someone else's system.
- Code-escapable. When a visual node is not enough, n8n lets you drop into JavaScript (see the sketch after this list). You do not hit a wall at the point where your workflow gets interesting.
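To make that escape hatch concrete, here is a minimal sketch of an n8n Code node (in its "Run Once for All Items" mode) that filters low-stock records. It assumes the incoming items are stock.quant records fetched by a previous Odoo node; the field names come from Odoo's stock module, and the threshold is a placeholder you would tune per warehouse.

```javascript
// Sketch of an n8n Code node body. Assumes input items are stock.quant
// records from a preceding Odoo JSON-RPC node.
const LOW_STOCK_THRESHOLD = 10; // placeholder: tune per warehouse

const alerts = [];
for (const item of $input.all()) {
  const quant = item.json;
  const available = quant.quantity - quant.reserved_quantity;
  if (available < LOW_STOCK_THRESHOLD) {
    alerts.push({
      json: {
        productId: quant.product_id[0],   // Odoo many2one comes back as [id, display_name]
        productName: quant.product_id[1],
        available,
      },
    });
  }
}
return alerts; // downstream nodes (email, chat alert) see only the low-stock items
```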
The honest trade-off: you are now responsible for hosting, upgrades, and uptime. We run n8n as a Docker container with automatic restarts and a weekly backup job. That is not zero effort. But it is predictable effort — the kind you can budget for, unlike the kind that arrives in a vendor pricing email.
We have covered the n8n and Odoo JSON-RPC integration pattern in depth in an earlier tutorial on automating Odoo finance alerts, if you want to see the concrete wiring.
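For orientation, the core of that wiring is a single JSON-RPC endpoint. Below is a minimal Node.js sketch of the calls n8n makes under the hood; the URL, database name, and credentials are placeholders, and the account.move filter is just an example query, so adapt all of it to your instance.

```javascript
// Minimal Odoo JSON-RPC client sketch. Assumes Node 18+ (global fetch)
// and an ESM context (top-level await).
const ODOO_URL = 'https://odoo.internal.example.com/jsonrpc'; // placeholder

async function odooCall(service, method, args) {
  const res = await fetch(ODOO_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      jsonrpc: '2.0',
      method: 'call',
      params: { service, method, args },
      id: Date.now(),
    }),
  });
  const { result, error } = await res.json();
  if (error) throw new Error(error.message);
  return result;
}

// Authenticate, then read a few unpaid customer invoices from account.move.
const uid = await odooCall('common', 'authenticate', ['mydb', 'user@example.com', 'secret', {}]);
const invoices = await odooCall('object', 'execute_kw', [
  'mydb', uid, 'secret',
  'account.move', 'search_read',
  [[['move_type', '=', 'out_invoice'], ['payment_state', '=', 'not_paid']]],
  { fields: ['name', 'invoice_date_due', 'amount_residual'], limit: 10 },
]);
console.log(invoices);
```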
pgvector: The Vector Database You Already Have
The vector database market is crowded. Pinecone, Weaviate, Qdrant, Milvus — all purpose-built for storing and querying embeddings at scale, all with good documentation and impressive benchmarks.
They are also separate infrastructure. That means another service to provision, another authentication scheme, another monitoring setup, and another bill. For most enterprise Odoo deployments, it is infrastructure you do not need.
pgvector is a PostgreSQL extension. Odoo already runs on PostgreSQL. Installing pgvector means running CREATE EXTENSION vector; and adding a vector column to the table you care about. Your embeddings live in the same database as your product.template records, subject to the same backup schedule, the same access controls, and the same transaction semantics.
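A sketch of that setup from application code, using node-postgres. The side table, its name, and the 384-dimension column are our illustrative choices (384 matches the small embedding model used later in this post), not an Odoo schema; the HNSW index assumes pgvector 0.5 or newer.

```javascript
// One-time pgvector setup on the Odoo database, sketched with node-postgres.
import pg from 'pg';

const client = new pg.Client({ connectionString: process.env.ODOO_DB_URL });
await client.connect();

await client.query('CREATE EXTENSION IF NOT EXISTS vector');

// A side table keyed to product_template, rather than a column on Odoo's
// own table, keeps the core schema untouched while the vectors still ride
// along with every backup.
await client.query(`
  CREATE TABLE IF NOT EXISTS product_template_embedding (
    product_tmpl_id integer PRIMARY KEY REFERENCES product_template(id),
    embedding vector(384)
  )
`);

// Optional: an HNSW index for approximate nearest-neighbour search.
// Unnecessary at tens of thousands of vectors, cheap insurance beyond that.
await client.query(`
  CREATE INDEX IF NOT EXISTS product_embedding_hnsw
  ON product_template_embedding USING hnsw (embedding vector_cosine_ops)
`);
await client.end();
```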
The performance question is legitimate. pgvector will not outperform Pinecone at 100 million vectors with millisecond latency requirements. But most enterprise use cases — semantic search over a product catalogue, similarity matching on supplier records, document classification for vendor invoices — involve tens of thousands of vectors, not hundreds of millions. pgvector handles that without strain, and the query speed difference is imperceptible to end users.
More importantly: you can join vectors with your operational data in a single SQL query. Finding the three most semantically similar product.template records to a search string, filtered by available stock and supplier country, is one query with pgvector. With a separate vector database, it is two round trips and a join in application code.
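A hedged sketch of that single query, continuing with the node-postgres client from the setup snippet. The joins are deliberately simplified: in Odoo's real schema, stock levels hang off product_product rather than product_template, and supplier links go through product_supplierinfo, so treat the table wiring as illustrative.

```javascript
// Top 3 semantically similar products, filtered by stock and supplier
// country, in one round trip. `queryEmbedding` is a number[] produced by
// whatever embedding step you use.
const { rows } = await client.query(
  `
  SELECT pt.id,
         pt.name,
         e.embedding <=> $1::vector AS distance  -- <=> is pgvector's cosine distance
  FROM product_template pt
  JOIN product_template_embedding e ON e.product_tmpl_id = pt.id
  JOIN stock_quant sq ON sq.product_id = pt.id              -- simplified join
  JOIN res_partner supplier ON supplier.id = pt.supplier_id -- simplified link
  JOIN res_country c ON c.id = supplier.country_id
  WHERE sq.quantity > 0
    AND c.code = $2
  ORDER BY distance
  LIMIT 3
  `,
  [JSON.stringify(queryEmbedding), 'VN'], // pgvector accepts '[0.1, 0.2, ...]' strings
);
```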
We have documented how to wire pgvector into Odoo through the field_vector OCA module, which gives you a declarative way to attach vector fields to any Odoo model without touching core.
Transformers.js: When the Compute Belongs in the Browser
The default assumption for AI inference is server-side: send the text to an API, receive the result. This works. It also means every query goes to an external server, which adds latency, costs money per call, and puts data in transit.
Transformers.js is Hugging Face's JavaScript port of the Transformers library. It runs quantized models — small enough to download once and cache in the browser — using WebAssembly and, where available, WebGPU. The model runs locally. No API call. No data leaves the device.
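The basic usage pattern looks like this. The package and pipeline API are as documented by Hugging Face; the model name is one common choice of small embedding model, not a requirement.

```javascript
// Client-side embeddings with Transformers.js.
import { pipeline } from '@huggingface/transformers';

// Downloads the quantized model once, then serves it from the browser cache.
const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

const output = await embed('stainless steel hex bolt M8', {
  pooling: 'mean',   // average token embeddings into one sentence vector
  normalize: true,   // unit-length vectors make cosine similarity a dot product
});
const vector = Array.from(output.data); // 384-dim Float32Array -> number[]
```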
For enterprise applications, this matters in specific scenarios:
- Semantic search over private data. A salesperson searching through 3,000 product descriptions should not be sending those descriptions to an external API on every keystroke.
- Offline capability. Field operations in manufacturing or logistics often have unreliable connectivity. Client-side inference keeps working when the network does not.
- Latency. For interactive search, round-trip latency to a cloud API adds perceptible delay. A quantized embedding model running in the browser responds in under 50 milliseconds after the initial load.
The limitation is real and worth being explicit about: Transformers.js works well for embedding models and small classification models. For generation tasks that require large models — long-document summarisation, answer synthesis, structured extraction from complex unstructured text — server-side inference is still the right choice. We do not use Transformers.js for generation.
We ran a proof of concept doing semantic search over 3,000+ Odoo modules entirely in the browser. The model download was under 30MB. Search was instant after that first load. The result convinced us this approach was viable for production, not just demos.
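The ranking step in that proof of concept is short enough to sketch. This is not the production code, just the shape of it: it reuses the embed pipeline from the earlier snippet and assumes the module embeddings were pre-computed, normalized, and shipped to the browser as a static index.

```javascript
// In-browser ranking over pre-computed, normalized embeddings.
// moduleIndex: [{ name: 'account_invoice_ubl', embedding: Float32Array }, ...]
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum; // equals cosine similarity when both vectors are unit length
}

async function searchModules(query, moduleIndex, topK = 10) {
  const out = await embed(query, { pooling: 'mean', normalize: true });
  const q = out.data;
  return moduleIndex
    .map((m) => ({ name: m.name, score: dot(q, m.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```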
The Common Thread: Control, Auditability, No Lock-In
These three choices follow from a consistent set of constraints.
Control. No vendor can unilaterally raise prices on a self-hosted tool, deprecate the endpoints you depend on, or take it offline under you. You upgrade on your schedule, not theirs. You can inspect the source. You can run the stack in an environment that meets your compliance requirements.
Auditability. When a client’s external auditor asks for a log of every automated action that touched a customer record, we can produce it from our own systems. With cloud automation tools, the execution logs are on someone else’s infrastructure, exported in formats not designed for audit workflows.
No lock-in. n8n exports to JSON. pgvector is standard SQL with an extension. Transformers.js uses the same model weights as the Python Transformers library. If any of these stops being the right tool, migrating away is a data problem — not a hostage negotiation.
None of this means we are opposed to cloud services as a category. We use managed PostgreSQL hosting. We call cloud LLM APIs for generation tasks that require large-model capability. We deploy on Odoo.sh for clients where managed infrastructure makes sense. The question is always: which parts of the system need to be under your control, and which parts can you afford to delegate?
For the AI workflow layer — the part that touches customer data, financial records, and core business logic — delegating control consistently costs more than it saves.
What This Looks Like in a Real Deployment
A typical AI stack for an Odoo production deployment:
- n8n in Docker, connected to Odoo via JSON-RPC, handling workflow automation: reminders, low-stock alerts, approval routing, data sync across systems
- pgvector on the Odoo PostgreSQL instance, used for semantic search, similarity matching, and embedding storage for document AI workflows
- Transformers.js in custom Odoo web client components for client-side embedding and interactive search
- A cloud LLM API (Claude or GPT-4) for generation tasks requiring large-model capability — the one part of the stack where a hosted API is genuinely the right tool
That last point deserves emphasis: we do use cloud LLMs. The argument is not that everything should run locally. It is that the data movement, execution logs, and workflow orchestration should be under your control — and the LLM call should be a well-defined, auditable step within that workflow, not the entire architecture.
At Trobz, this is the stack we have deployed across manufacturing, distribution, and services companies running Odoo in Vietnam and Southeast Asia. If you are evaluating the same choices for your environment, we are happy to walk through the trade-offs for your specific constraints — reach out here.