Oare Arene

Tessari

AI-powered knowledge-base helper that integrates Confluence, Google Docs, Notion, and more (WIP)

Role
Solo dev — product direction, system design, retrieval architecture, UX
Timeframe
2025 — Early WIP (concept + prototyping)
[Tessari screenshot]

Overview

Tessari is an open-source, self-hosted knowledge retrieval engine designed to let teams search across Notion, Confluence, Google Drive, Slack, and more with a single query. It is retrieval-first: hybrid keyword + vector search with permission-aware results, plus an optional AI layer for answer synthesis with citations.

Problem

Teams waste hours each week searching for information scattered across tools. Existing solutions are often cloud-only, expensive at scale, and treat the LLM as the product rather than making retrieval quality the core.

Goals

  • Ship a self-hostable alternative to Glean/Guru with strong default retrieval quality
  • Combine BM25 keyword search and vector search, fused with Reciprocal Rank Fusion (RRF), so ranking holds up for both exact-term and semantic queries
  • Enforce source-system permissions so users only see what they can access
  • Keep the AI synthesis layer fully optional, with a config switch to turn it off
  • Deploy via Docker Compose in minutes with clear, self-serve docs
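
As a rough illustration of the "optional AI layer" goal, the sketch below gates answer synthesis behind a single switch. The environment variable name `TESSARI_AI_ENABLED`, the `respond` function, and the response shape are all hypothetical, chosen for this example only; they are not taken from the project.

```python
import os

# Hypothetical flag name; the real config key is not specified in the project notes.
AI_FLAG = "TESSARI_AI_ENABLED"


def ai_synthesis_enabled(env=None):
    """Return True only when the flag is explicitly set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(AI_FLAG, "false").strip().lower() in {"1", "true", "yes"}


def respond(query, results, synthesize=None, env=None):
    """Always return retrieval results; attach an answer only if AI is enabled."""
    response = {"query": query, "results": results, "answer": None}
    if ai_synthesis_enabled(env) and synthesize is not None:
        response["answer"] = synthesize(query, results)
    return response


# With the flag unset, the caller still gets the ranked results and no answer.
plain = respond("vacation policy", ["doc-1"], synthesize=lambda q, r: "summary", env={})
```

Defaulting the flag to off means a fresh deployment behaves as a pure search engine until someone opts in.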

Solution

Tessari combines a pluggable connector framework, a chunking + indexing pipeline, and a hybrid search engine (keyword + vector) to return grounded results across sources. When enabled, an AI layer can synthesize a concise answer with citations, but the system remains valuable with AI completely disabled.
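
To make the "pluggable connector" idea concrete, here is a minimal sketch of what a shared connector interface might look like. Every name in it (`Document`, `Connector`, `InMemoryConnector`, `sync`, the `readers` field) is an assumption made for illustration; the project is still at the concept stage and does not publish an interface.

```python
from dataclasses import dataclass, field
from typing import Iterable, Protocol


@dataclass
class Document:
    """Normalized record shared by all sources (hypothetical shape)."""
    source: str
    doc_id: str
    title: str
    text: str
    allowed_principals: set = field(default_factory=set)


class Connector(Protocol):
    """Operations each source plugin would implement (assumed interface)."""
    def fetch(self) -> Iterable[dict]: ...
    def normalize(self, raw: dict) -> Document: ...
    def permissions(self, raw: dict) -> set: ...


class InMemoryConnector:
    """Toy connector backed by a list of dicts, standing in for a real source API."""

    def __init__(self, source, records):
        self.source = source
        self.records = records

    def fetch(self):
        return iter(self.records)

    def permissions(self, raw):
        # Who may read this record, as reported by the source system.
        return set(raw.get("readers", []))

    def normalize(self, raw):
        return Document(
            source=self.source,
            doc_id=str(raw["id"]),
            title=raw["title"],
            text=raw["body"],
            allowed_principals=self.permissions(raw),
        )


def sync(connector: Connector) -> list:
    """Pull every raw record and convert it to the shared Document shape."""
    return [connector.normalize(raw) for raw in connector.fetch()]


docs = sync(InMemoryConnector("notion", [
    {"id": 1, "title": "Onboarding", "body": "Welcome", "readers": ["group:eng"]},
]))
```

Keeping the ACL on the normalized document is one way to let the indexing and search layers stay source-agnostic.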

Architecture / Approach

  • Hybrid search: BM25 keyword search (Typesense) + vector search (Qdrant) fused with Reciprocal Rank Fusion (RRF)
  • Connector framework with a standardized interface (fetch, normalize, sync, permissions)
  • Indexing pipeline: extract → semantic chunk → embed → store across Postgres/Typesense/Qdrant
  • Permission-aware retrieval: filter results server-side using source-derived ACLs
  • Optional AI synthesis layer (Claude/GPT/local) whose answers must cite retrieved sources; can be turned off with a single config flag
  • Self-hosted deployment via Docker Compose (fast setup, minimal operational overhead)
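
The fusion step in the hybrid search bullet can be sketched in a few lines. Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranked list it appears in. The function name `rrf_fuse`, the sample document IDs, and the choice of k = 60 (a commonly used default) are assumptions for this example, not settings taken from Tessari.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; a higher position contributes a larger share.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["doc-a", "doc-b", "doc-c"]  # e.g. from the BM25 index
vector_hits = ["doc-b", "doc-d", "doc-a"]   # e.g. from the vector index
fused = rrf_fuse([keyword_hits, vector_hits])
# fused == ["doc-b", "doc-a", "doc-d", "doc-c"]
```

Because RRF uses only ranks, the BM25 and vector similarity scores never need to be normalized onto a common scale.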

Outcomes

  • PRD established: retrieval-first, self-hosted, OSS (Apache 2.0) direction
  • Architecture defined: Typesense + Qdrant hybrid search fused with RRF and permission-aware filtering
  • Concept UI explored via mock screenshots (not production UI yet)

Next Steps

  • Build the connector framework + first-party connectors (Notion/Confluence/Google Drive/Slack)
  • Implement indexing pipeline (chunking, embeddings, storage) and hybrid retrieval + RRF
  • Add permission enforcement end-to-end (source ACL extraction + query-time filtering)
  • Ship Docker Compose quickstart and self-serve documentation
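
For the permission-enforcement step, query-time filtering could look roughly like the sketch below: each indexed hit carries the principals allowed to read it, and the server drops anything the requesting user cannot access. The hit shape, the `allowed_principals` field, and the `user:` / `group:` naming are hypothetical.

```python
def filter_by_acl(hits, user_principals):
    """Keep only hits that share at least one principal with the user."""
    allowed = set(user_principals)
    return [h for h in hits if allowed & set(h["allowed_principals"])]


hits = [
    {"doc_id": "doc-a", "allowed_principals": ["group:eng"]},
    {"doc_id": "doc-b", "allowed_principals": ["group:finance"]},
    {"doc_id": "doc-c", "allowed_principals": ["user:ana", "group:hr"]},
]

visible = filter_by_acl(hits, ["user:ana", "group:eng"])
visible_ids = [h["doc_id"] for h in visible]
```

A production version would more likely push this condition into the search engines themselves as a filter clause, so restricted documents never leave the index; the post-filter above only illustrates the rule being enforced.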

Tech Stack

Next.js, RAG, Embeddings, Multi-source