AI Platform Architecture

Intent

This document turns the AI handoff into a working architecture direction for the Souza Hub AI platform under the current Tentacles business segment.

It is intentionally staged and prototype-first.

Core Architectural Goals

  • separate tenants from day one
  • support multiple agents per client
  • support multiple channels per client
  • keep the model backend swappable
  • keep the operational blast radius controlled

Application Layer

  • Dify
      • client-facing agent apps
      • knowledge bases
      • workspace separation
      • future web embedding

Workflow Layer

  • n8n
      • orchestrations
      • channel integrations
      • message routing
      • enrichment and automation steps

Observability Layer

  • Langfuse
      • prompt/version tracking
      • traceability
      • evaluation support

Data Layer

  • PostgreSQL
      • app state
      • workflow state
      • tenant metadata
  • Qdrant or aligned retrieval backend
      • document retrieval
      • embeddings-backed search

Model Layer

  • start with hosted APIs
  • design behind a provider abstraction from the beginning

That means the model backend should remain replaceable across:

  • OpenAI
  • Google
  • Anthropic
  • future local GPU inference
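
One way to keep the backend replaceable is to have application code depend on a small interface rather than on any provider SDK. The sketch below is illustrative only: `ModelBackend`, `ChatMessage`, `EchoBackend`, and `BackendRegistry` are hypothetical names that do not come from this document or from any provider library, and the echo backend stands in for a real hosted API call.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ChatMessage:
    role: str      # "system", "user", or "assistant"
    content: str


class ModelBackend(Protocol):
    """Hypothetical interface: anything that turns a conversation into a reply."""

    name: str

    def complete(self, messages: list[ChatMessage]) -> str: ...


class EchoBackend:
    """Stand-in for local testing; a real backend would call a hosted API here."""

    name = "echo"

    def complete(self, messages: list[ChatMessage]) -> str:
        # Reply with the most recent user message, or an empty string if none.
        last_user = next(
            (m.content for m in reversed(messages) if m.role == "user"), ""
        )
        return f"[echo] {last_user}"


class BackendRegistry:
    """Maps a configured backend name to an implementation."""

    def __init__(self) -> None:
        self._backends: dict[str, ModelBackend] = {}

    def register(self, backend: ModelBackend) -> None:
        self._backends[backend.name] = backend

    def get(self, name: str) -> ModelBackend:
        try:
            return self._backends[name]
        except KeyError:
            raise LookupError(f"no model backend registered as {name!r}") from None


registry = BackendRegistry()
registry.register(EchoBackend())

# Callers select a backend by name, so swapping providers is a config change.
reply = registry.get("echo").complete([ChatMessage("user", "Hello")])
print(reply)  # [echo] Hello
```

Under this assumption, adding a hosted provider or a future local GPU backend means writing one more class with a `complete` method and registering it; calling code does not change.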

Channel Plan

V1

  • WhatsApp
  • email

Interpretation:

  • WhatsApp is mandatory
  • email is included only if it does not materially slow the first working prototype

Later

  • web chat
  • website embedding
  • telephony / voice
  • internal operator consoles

Tenant Model

Each client should have:

  • a client workspace
  • isolated prompts and knowledge
  • isolated automation paths
  • one or more virtual employee personas

Each client may also have multiple channels and multiple business functions.
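
The tenant model above can be sketched as a small data structure. This is a minimal illustration, not a schema from the platform: `ClientWorkspace`, `Persona`, `Channel`, and every field name are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Channel(Enum):
    WHATSAPP = "whatsapp"
    EMAIL = "email"


@dataclass(frozen=True)
class Persona:
    """A virtual employee persona serving one business function."""

    name: str
    business_function: str
    channels: frozenset[Channel]


@dataclass
class ClientWorkspace:
    """One workspace per client, holding that client's isolated resources."""

    tenant_id: str
    prompt_namespace: str      # hypothetical: where this client's prompts live
    knowledge_base_id: str     # hypothetical: this client's knowledge base
    personas: list[Persona] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The tenant model calls for one or more personas per client.
        if not self.personas:
            raise ValueError("a client workspace needs at least one persona")

    def personas_on(self, channel: Channel) -> list[Persona]:
        """Return the personas reachable on the given channel."""
        return [p for p in self.personas if channel in p.channels]


acme = ClientWorkspace(
    tenant_id="acme",
    prompt_namespace="acme/prompts",
    knowledge_base_id="acme-kb",
    personas=[
        Persona("Ana", "sales", frozenset({Channel.WHATSAPP, Channel.EMAIL})),
        Persona("Bruno", "support", frozenset({Channel.WHATSAPP})),
    ],
)

print([p.name for p in acme.personas_on(Channel.EMAIL)])  # ['Ana']
```

Keeping prompts, knowledge, and personas as fields of a single per-client object makes it harder to create a resource that belongs to no tenant.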

Do not treat high-risk production infrastructure as the first place to experiment.

Prefer one of:

  • isolated business-tagged infrastructure
  • a dedicated pilot VM or service boundary

Security and Safety Principles

  • keep tenant isolation explicit
  • separate secrets from normal documentation
  • record operational decisions in Markdown
  • avoid direct mutation of critical production systems without explicit approval
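
As one possible reading of "explicit" tenant isolation, every data access can require a tenant identifier, so no query can run unscoped. The sketch below uses an in-memory store purely for illustration; `TenantScopedStore` and `Document` are hypothetical names, and a real retrieval backend would apply the same rule through its own metadata filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    tenant_id: str
    doc_id: str
    text: str


class TenantScopedStore:
    """In-memory stand-in for a knowledge store; all reads are tenant-scoped."""

    def __init__(self) -> None:
        self._docs: list[Document] = []

    def add(self, doc: Document) -> None:
        if not doc.tenant_id:
            raise ValueError("documents must carry a tenant_id")
        self._docs.append(doc)

    def search(self, tenant_id: str, term: str) -> list[Document]:
        # tenant_id is a required argument: there is no "search everything" path.
        if not tenant_id:
            raise ValueError("tenant_id is required for every search")
        needle = term.lower()
        return [
            d for d in self._docs
            if d.tenant_id == tenant_id and needle in d.text.lower()
        ]


store = TenantScopedStore()
store.add(Document("acme", "a1", "Refund policy: 30 days"))
store.add(Document("globex", "g1", "Refund policy: 14 days"))

hits = store.search("acme", "refund")
print([d.doc_id for d in hits])  # ['a1']
```

Both tenants hold a document matching "refund", but the search for `acme` returns only that tenant's document.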

V1 Outcome Definition

A useful first prototype is not a full AI operating system.

A useful v1 should prove:

  • one client can interact with a real agent
  • the interaction works over WhatsApp
  • email, if included, can be used for supporting communication or knowledge intake
  • the agent can use curated knowledge
  • the workflow can handle inbound and outbound channel events
  • the operator can trace what happened
  • the pricing and support model is understandable