AI Research & Automation

Canvas: a trusted AI workspace for source-heavy teams.

Not another chat box: an AI research and requirements tool that turns messy source material into evidence-backed artefacts a team can review, own and ship.

Role
AI Engineer / Developer
Stack
Next.js, Vercel AI SDK, Claude, OpenAI, Tavily, pgvector
Platform
Enterprise AI research and requirements workspace
Status
Deployed at multiple Fortune 500 companies

01 Overview

An AI workspace for product work that has to be defended.

Canvas helps teams move from messy source material to reviewed product artefacts: LVCs, BRDs, PRDs, research, stories and delivery plans.

The design problem was not whether AI could draft a document. It was whether product leads could trust, challenge and own the output before it shaped delivery.

02 Research

Use research to turn AI output into a reviewable workflow.

The research focused on real product planning behaviour: how teams gather context, validate claims, divide ownership and decide when an AI-generated artefact is safe enough to use.

Selected method

Contextual inquiry

Where does confidence break down when a lead turns source material into a PRD?

Method
Observed product leads and BAs rebuilding requirements from decks, research notes, spreadsheets and old Jira tickets.
Research output
A breakdown of where users paused, cross-checked evidence, asked for owner input or marked assumptions.
UI/UX decision
Canvas fields became reviewable objects with sources, confidence, status and history instead of plain document paragraphs.
Interface pattern
Source chips, evidence drawers and field-level review states.

03 Product

The product is a workspace, not a chat wrapper.

Live demo

Workspaces

The workspace view shows how source packs, artefacts and hierarchy sit together before anyone opens the canvas itself.

Preview

Canvas

Delivery

Jira and Confluence sync

Reviewed stories can move into Jira, while approved artefacts export into Confluence without losing evidence links.

Export to Jira modal showing selected epics and features ready to sync.

Governance

Custom artefacts and hierarchy

Teams can define artefact types, connect parent-child work, and keep strategy, requirements and delivery output in one chain.

Custom artefact cards showing Requirements Canvas, Lean Value Case, BRD and PRD types.

04 UX challenges

The hard parts were trust, ownership and review.

Users treated confidence as decoration.

Challenge

Early designs buried confidence in metadata, so people ignored it.

Solution

I made confidence a visible field state and paired low confidence with required inputs or follow-up questions.

The agent felt too powerful.

Challenge

Direct edits looked impressive in demos but made stakeholders nervous.

Solution

The agent now returns proposals. Users accept, reject or inspect the evidence before anything changes.

Generated documents became walls of text.

Challenge

A complete BRD looked useful but was hard to scan and challenge.

Solution

I broke artefacts into reviewable fields with status, evidence and ownership on each field.
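As a rough illustration of what a reviewable field could look like in code, here is a minimal sketch. It is not the Canvas schema: `ReviewableField`, `SourceRef`, the status names and the `nextStatus` rule are all assumptions chosen to show status, evidence and ownership sitting on each field, with low confidence blocking review until someone adds input.

```typescript
// Hypothetical shapes: every name here is illustrative, not Canvas's real schema.
type Confidence = "low" | "medium" | "high";
type ReviewStatus = "draft" | "needs_input" | "in_review" | "approved";

interface SourceRef {
  documentId: string; // upload or research item the claim came from
  excerpt: string; // quoted passage a reviewer can inspect
}

interface ReviewableField {
  key: string;
  value: string;
  confidence: Confidence;
  status: ReviewStatus;
  owner: string | null; // person accountable for this field, if assigned
  sources: SourceRef[];
}

// Assumed rule: a field only becomes reviewable once it has an owner,
// at least one source, and better-than-low confidence.
function nextStatus(field: ReviewableField): ReviewStatus {
  if (field.status === "approved") return "approved";
  const unsupported = field.sources.length === 0;
  if (field.confidence === "low" || unsupported || field.owner === null) {
    return "needs_input";
  }
  return "in_review";
}
```

Keeping the rule in one small function means the UI and the agent can both ask the same question about a field before it is offered for review.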

Collaboration created ownership ambiguity.

Challenge

When several people edited the same artefact, nobody knew which version was the source of truth.

Solution

Workspaces, an audit log and live presence made ownership explicit without locking the document down.

05 Engineering

Agentic capability, designed with guardrails.

The agent is an orchestration layer inside the product, not a chat widget bolted onto the side. A single turn can resolve the right model tier, load prior session history, compact long conversations, stream the response, emit tool lifecycle events, record usage and persist the full session.
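Two of those steps, model-tier resolution and history compaction, are easy to picture in code. The sketch below is only an illustration under stated assumptions: the tier names, the routing signals, the four-characters-per-token estimate and the `compact` signature are hypothetical, not the production runtime.

```typescript
// Illustrative only: tier names, signals and the token estimate are assumptions.
type Tier = "fast" | "standard" | "deep";

interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Rough estimate (about 4 characters per token); a real runtime would use a tokenizer.
const estimateTokens = (messages: Message[]): number =>
  messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

// Pick a model tier from simple signals on the turn.
function resolveTier(turn: { toolCount: number; needsResearch: boolean }): Tier {
  if (turn.needsResearch) return "deep";
  return turn.toolCount > 0 ? "standard" : "fast";
}

// Keep the most recent messages verbatim and fold older ones into one summary.
function compact(
  history: Message[],
  budget: number,
  keepRecent: number,
  summarize: (older: Message[]) => string,
): Message[] {
  if (estimateTokens(history) <= budget || history.length <= keepRecent) {
    return history; // already within budget, or nothing old enough to fold
  }
  const older = history.slice(0, history.length - keepRecent);
  const recent = history.slice(history.length - keepRecent);
  return [{ role: "system", content: summarize(older) }, ...recent];
}
```

In this sketch `summarize` is injected, so the summary could come from a cheaper model call without the compaction logic knowing about it.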

The capability surface is typed and product-aware. The agent can read individual or batched canvas fields, inspect readiness and the backlog hierarchy, search canvas-scoped uploads through pgvector (falling back to direct document chunks), search generated research, run parallel research tracks and bridge enabled MCP tools into the same tool interface.
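The scoped search and its fallback can be sketched without a database. In the product the vector search runs in Postgres through pgvector; the in-memory version below only illustrates one reading of the logic: filter to the current canvas, rank by cosine similarity, and read raw chunks by term overlap when no vector hit clears the bar. `Chunk`, `Hit`, `searchUploads`, `topK` and `minScore` are hypothetical names and values.

```typescript
// Illustrative only: an in-memory stand-in for the pgvector query and its fallback.
interface Chunk {
  canvasId: string;
  documentId: string;
  text: string;
  embedding?: number[]; // missing when the document has not been embedded yet
}

interface Hit {
  documentId: string;
  text: string;
  score: number;
  via: "vector" | "chunks";
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

function searchUploads(
  chunks: Chunk[],
  canvasId: string,
  queryEmbedding: number[],
  queryText: string,
  opts: { topK: number; minScore: number } = { topK: 3, minScore: 0.5 },
): Hit[] {
  // Scope first, so another canvas's uploads can never be returned.
  const scoped = chunks.filter((c) => c.canvasId === canvasId);

  const vectorHits: Hit[] = scoped
    .filter((c): c is Chunk & { embedding: number[] } => c.embedding !== undefined)
    .map((c) => ({
      documentId: c.documentId,
      text: c.text,
      score: cosine(c.embedding, queryEmbedding),
      via: "vector" as const,
    }))
    .filter((h) => h.score >= opts.minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, opts.topK);
  if (vectorHits.length > 0) return vectorHits;

  // Fallback: score raw chunks by the share of query terms they contain.
  const terms = queryText.toLowerCase().split(/\s+/).filter(Boolean);
  return scoped
    .map((c) => {
      const text = c.text.toLowerCase();
      const matched = terms.filter((t) => text.includes(t)).length;
      const score = terms.length === 0 ? 0 : matched / terms.length;
      return { documentId: c.documentId, text: c.text, score, via: "chunks" as const };
    })
    .filter((h) => h.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, opts.topK);
}
```

Tagging each hit with `via` lets the interface show whether evidence came from semantic search or from a plain read of the document.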

The powerful part is where the autonomy stops. The model can gather evidence and propose structured changes, but it cannot silently edit the canvas. Accepting a proposal re-checks access, validates the configured field, preserves metadata, appends field history and saves an audit entry.
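The acceptance path is the part most worth making concrete. The following is a minimal sketch, assuming an in-memory canvas and hypothetical names (`Canvas`, `Proposal`, `acceptProposal`, `AuditEntry`); it is not the production code, but it follows the same order of checks: access first, then the configured field, then a write that keeps metadata and appends history.

```typescript
// Illustrative only: shapes and names are assumptions, not the Canvas data model.
interface FieldState {
  value: string;
  metadata: Record<string, unknown>;
  history: { value: string; changedBy: string; at: string }[];
}

interface Canvas {
  id: string;
  editors: string[]; // users currently allowed to change this canvas
  configuredFields: string[]; // field keys defined for this artefact type
  fields: Record<string, FieldState>;
}

interface Proposal {
  canvasId: string;
  fieldKey: string;
  proposedValue: string;
  evidence: string[]; // ids of the sources the agent cited
}

interface AuditEntry {
  canvasId: string;
  fieldKey: string;
  actor: string;
  action: "proposal_accepted";
  evidence: string[];
  at: string;
}

type AcceptResult =
  | { ok: true; canvas: Canvas; audit: AuditEntry }
  | { ok: false; reason: "forbidden" | "unknown_field" };

function acceptProposal(
  canvas: Canvas,
  proposal: Proposal,
  userId: string,
  now: Date = new Date(),
): AcceptResult {
  // Re-check access at accept time; it may have changed since the proposal was made.
  if (!canvas.editors.includes(userId)) return { ok: false, reason: "forbidden" };

  // The target must be a field configured on this canvas.
  const known = canvas.configuredFields.includes(proposal.fieldKey);
  if (proposal.canvasId !== canvas.id || !known) {
    return { ok: false, reason: "unknown_field" };
  }

  const at = now.toISOString();
  const current = canvas.fields[proposal.fieldKey] ?? { value: "", metadata: {}, history: [] };
  const updated: FieldState = {
    value: proposal.proposedValue,
    metadata: current.metadata, // preserved untouched
    history: [...current.history, { value: current.value, changedBy: userId, at }],
  };

  return {
    ok: true,
    canvas: { ...canvas, fields: { ...canvas.fields, [proposal.fieldKey]: updated } },
    audit: {
      canvasId: canvas.id,
      fieldKey: proposal.fieldKey,
      actor: userId,
      action: "proposal_accepted",
      evidence: proposal.evidence,
      at,
    },
  };
}
```

Returning a new canvas and an audit entry together, rather than mutating in place, is a design choice of this sketch: the caller can persist both in one transaction or neither.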

Agent loop

A constrained loop, not an open-ended chatbot.

The agent reads the current work, gathers evidence, runs a typed capability, then returns a proposal for review. Nothing lands in the canvas until the user accepts it.

01

Turn runtime

Runs each turn through a shared agent runtime: model-tier routing, streamed output, tool events, session history, compaction and usage/cost logging.

02

Product context

Builds the prompt from artefact type, active tab, configured fields, field confidence, readiness gaps, research state and story hierarchy.

03

Retrieval and research

Searches canvas-scoped uploads with pgvector, falls back to direct document chunks, searches generated research and can run parallel market, internal and benchmark research tracks.

04

Tool bridge

Combines native canvas tools with enabled MCP tools converted into the Vercel AI SDK tool interface, so the agent can query product state and external systems through one capability layer.

05

Proposal gate

Every write becomes a proposal against a configured field. Accepting it re-checks access, validates the target field, preserves metadata and writes field history.
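Step 04, the tool bridge, can be pictured as a merge of two tool sources into one registry. The sketch below uses simplified stand-in interfaces rather than the real Vercel AI SDK or MCP client types; `AgentTool`, `McpServer`, `buildToolset` and the `mcp_` name prefix are all assumptions made for illustration.

```typescript
// Illustrative only: simplified stand-ins, not the Vercel AI SDK or MCP types.
interface AgentTool {
  description: string;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

interface McpToolDescriptor {
  name: string;
  description: string;
}

interface McpServer {
  id: string;
  enabled: boolean; // only enabled servers are bridged
  tools: McpToolDescriptor[];
  call: (toolName: string, args: Record<string, unknown>) => Promise<unknown>;
}

// Merge native canvas tools with tools exposed by enabled MCP servers.
function buildToolset(
  native: Record<string, AgentTool>,
  servers: McpServer[],
): Record<string, AgentTool> {
  const toolset: Record<string, AgentTool> = { ...native };
  for (const server of servers) {
    if (!server.enabled) continue;
    for (const tool of server.tools) {
      // Namespace bridged tools so they cannot shadow a native canvas tool.
      const name = `mcp_${server.id}_${tool.name}`;
      toolset[name] = {
        description: tool.description,
        execute: (args) => server.call(tool.name, args),
      };
    }
  }
  return toolset;
}
```

Because every entry ends up with the same `execute` shape, the agent loop can dispatch a tool call without caring whether it reads canvas state or reaches an external system.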

01

Frontend

Next.js canvas UI, review states, live demo mode

02

Collaboration

Yjs + Hocuspocus presence and shared editing

03

Agent runtime

Tool router, proposal generation, tab-aware context

04

Backend

Fastify APIs, document processing, export and integrations

05

Data

Postgres, pgvector retrieval, source-linked artefacts

06 Next time

What I would do differently.

01

Show the workspace around the canvas much earlier. People understood the document, but they needed to see where sources, drafts and exports lived.

02

Test the writing separately from the interface. A screen can be easy to use while the generated PRD still feels too vague to trust.

03

Create a repeatable way to review agent suggestions sooner. Good proposals, weak proposals and risky proposals needed to be judged with the same care as the UI.
