CASCADE

Prompt-Driven Generation of Visual Node Graphs in Volumetric Space
via Agentic Chain-of-Influence Synthesis

A mixed-initiative AI design assistant that transforms a natural-language brief directly into a fully executable, browser-rendered volumetric node graph — no code required, no GPU shader knowledge necessary.

View on GitHub ▶  Watch Demos

Abstract

Professional node-graph environments for creative visual coding — used for motion graphics, live performance, and interactive installations — impose prohibitive cognitive barriers: users must master node definitions, GPU shader authoring, signal routing, and engine-specific APIs before producing any meaningful output. We present Cascade, an AI design assistant that removes this barrier by transforming a natural-language brief (and optional reference images) directly into a fully executable, browser-based stack of volumetric compositing nodes.


Rather than replacing the user's creative agency, Cascade acts as a co-designer: it interprets intent, proposes structure, generates code, and sustains a live repair loop, while the user retains full control over the resulting graph through an interactive grid editor. Cascade operates through a two-phase agentic pipeline. Phase 1 applies CLIP-based multimodal analysis to extract brand emotions, computes per-level creative divergence scores, and retrieves node archetypes via a knowledge-graph RAG system. Phase 2 runs the Chain-of-Influence pipeline: a Reasoner agent designs a typed influence graph, a deterministic Compiler instantiates a volumetric grid, and the Mason agent generates polyglot code (GLSL, p5.js, Three.js, WebAudio) for every node with headless validation. A RuntimeInspector agent closes an autonomous repair loop by routing all browser-side errors to a large language model for hot-reload patching.

Creative Coding · Visual Programming · Generative AI · Node Graph · Code Generation · Agentic Pipeline · Co-Creative AI · GLSL · p5.js · WebGL2 · Three.js · WebAudio · CLIP · RAG · Democratization

Live Output Demos

All outputs below were generated directly by Cascade from natural-language prompts — no code was written by hand. Click any thumbnail to expand.

Interface & Editor Screenshots

Cascade editor screenshot 1
Cascade editor screenshot 2
Cascade editor screenshot 3

Two-Phase Agentic Pipeline

A natural-language brief flows through a semantic divergence engine and a chain-of-influence synthesiser, producing a live, repairable node graph in the browser — with zero code authored by the user.

Phase 1

Semantic Divergence Engine

  • CLIP Brand Extractor — zero-shot emotion and brand attribute classification over reference images via ViT-B/32 cosine similarity across 24 emotion + 18 brand labels
  • Creative Level Estimator — computes per-level divergence weights (surface, flow, narrative) modulated by creative mode (functional / aesthetic / flow)
  • LLM Divergence Cascade — per-level D values collapse to D_agg, controlling LLM temperature (0.2 → 1.3), top-p, and presence penalty via softmax regime mixing
  • RAG Archetype Retriever — Sentence-BERT knowledge graph spanning Book of Shaders, p5.js docs, design theory (Albers, Itten, Kandinsky) grounds Phase 2 BuildSheets
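The divergence cascade above can be sketched as follows. This is a minimal illustration, assuming a softmax over the three per-level D values for regime mixing and a linear map from D_agg onto the stated 0.2 → 1.3 temperature range; the top-p and presence-penalty scalings are assumptions, not Cascade's exact code.

```javascript
// Softmax over the per-level divergence scores (regime mixing).
function softmax(xs) {
  const m = Math.max(...xs);
  const exps = xs.map(x => Math.exp(x - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// Collapse per-level D values (surface / flow / narrative, each in [0, 1])
// into D_agg, then into LLM sampling parameters.
function samplingParams(dLevels) {
  const values = [dLevels.surface, dLevels.flow, dLevels.narrative];
  const weights = softmax(values);
  const dAgg = values.reduce((acc, d, i) => acc + weights[i] * d, 0);
  return {
    temperature: 0.2 + dAgg * (1.3 - 0.2), // 0.2 → 1.3, as stated above
    topP: 0.7 + 0.3 * dAgg,                // assumed range
    presencePenalty: dAgg,                 // assumed scaling
  };
}
```

With all three levels at 0 the model samples conservatively at temperature 0.2; fully divergent briefs push it to 1.3.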

Phase 2

Chain-of-Influence Pipeline

  • Reasoner Agent — single structured LLM call producing a typed DAG (InfluenceGraphIR) of nodes linked by typed edges: mask_and_emit, warp_only, color_grade, composite, feedback_trails, data_drive
  • Influence Compiler — deterministic (no LLM): Z-depth assignment, Sugiyama-style X placement, DAG quality scoring, orphan node auto-connection
  • Mason Agent — generates per-node polyglot code (GLSL / p5.js / Three.js / Canvas2D / WebAudio) validated by a three-stage filter: structural → syntax → performance
  • Target node count — N = clamp(5 + 6·w_s + 6·w_f + 5·w_n + 5·D_n + ε, 5, 30)
  • RuntimeInspector — autonomous hot-reload repair loop: intercepts console errors, deduplicates by node+code+message hash, re-prompts Mason up to 5× per node, and patches the session JSON on success
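The target-node-count formula quoted above reads directly as code. In this sketch `ws`, `wf`, `wn` are the surface/flow/narrative weights, `dn` is the narrative D value, and `eps` a small jitter term; rounding to an integer count is an assumption.

```javascript
// N = clamp(5 + 6·w_s + 6·w_f + 5·w_n + 5·D_n + ε, 5, 30)
function targetNodeCount(ws, wf, wn, dn, eps = 0) {
  const raw = 5 + 6 * ws + 6 * wf + 5 * wn + 5 * dn + eps;
  return Math.min(30, Math.max(5, Math.round(raw)));
}
```

A fully conservative brief (all weights 0) yields the 5-node floor; a maximally divergent one saturates toward the 30-node ceiling.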

Polyglot Node Executors

All executors share a central WebGL2 texture hub for GPU allocation and Z-layer compositing.

Executor | Engines | Output
ShaderExecutor | glsl, regl | RGBA render texture
JSModuleExecutor | p5, three_js, js_module | Canvas → WebGL texture
Canvas2DExecutor | canvas2d | Canvas → WebGL texture
VideoExecutor | html_video | Video → WebGL texture
WebAudioExecutor | webaudio | FFT data texture
EventExecutor | events | Data texture
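The shared contract implied above — every engine renders into a texture owned by a central WebGL2 hub, so the compositor treats all six executor types uniformly — can be sketched like this. Class and method names are illustrative assumptions; the texture record stands in for a real WebGL2 texture allocation.

```javascript
// Central hub: owns one texture slot per node, keyed by node id,
// with Z layer assigned in allocation order.
class TextureHub {
  constructor() { this.textures = new Map(); }
  allocate(nodeId) {
    const tex = { nodeId, zLayer: this.textures.size };
    this.textures.set(nodeId, tex);
    return tex;
  }
}

// Base executor: every engine writes into a hub-owned texture.
class NodeExecutor {
  constructor(hub, nodeId) {
    this.texture = hub.allocate(nodeId);
  }
  renderFrame(timeMs) { throw new Error('implemented per engine'); }
}

// One concrete engine: draws to an offscreen canvas, then the canvas
// is uploaded as a WebGL texture (canvas → WebGL texture, as above).
class Canvas2DExecutor extends NodeExecutor {
  renderFrame(timeMs) {
    return { target: this.texture, source: 'canvas2d', timeMs };
  }
}
```

The point of the shared hub is that Z-layer compositing never needs to know which engine produced a layer — only which texture it lives in.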

Key Technical Contributions

🎛

Creative Divergence Model

A single dial D_global ∈ [0,1] propagates through the semantic model to affect graph topology, LLM sampling temperature (0.2–1.3), and parameter ranges — giving users expressive control without technical knowledge.

Typed Code Synthesis

Mason generates and validates polyglot GPU shader, particle, and audio code per node. A Triple Filter (structural → syntax → performance) enforces BuildSheet contracts before any code runs in the browser.

🔄

Autonomous Repair Loop

The RuntimeInspector intercepts browser-side failures — GLSL precision bugs, ESM export issues, hallucinated APIs — and hot-reloads LLM-generated fixes autonomously. 97% of the 4,510 generated nodes reach the approved state.
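The deduplication and retry budget described above can be sketched as follows, assuming a node+code+message key hashed with a simple string hash and the stated cap of 5 repair attempts per node; the hash choice and bookkeeping are illustrative, not the real implementation.

```javascript
// Stable key for an error: node id + failing code + error message.
// djb2 string hash; the real system may use any stable digest.
function errorKey(nodeId, code, message) {
  let h = 5381;
  for (const ch of `${nodeId}|${code}|${message}`) {
    h = ((h * 33) ^ ch.charCodeAt(0)) >>> 0;
  }
  return h;
}

class RepairLoop {
  constructor(maxAttempts = 5) {
    this.attempts = new Map(); // errorKey → attempt count
    this.maxAttempts = maxAttempts;
  }
  // Returns true if Mason should be re-prompted for this failure.
  shouldRepair(nodeId, code, message) {
    const key = errorKey(nodeId, code, message);
    const n = this.attempts.get(key) ?? 0;
    if (n >= this.maxAttempts) return false; // budget exhausted
    this.attempts.set(key, n + 1);
    return true;
  }
}
```

Keying on the full triple means a node that fails with a *different* error after a patch gets a fresh budget, while the identical failure is never retried past the cap.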

🗂

Session JSON Persistence

All edits, repairs, and predefined node additions write back to a canonical Session JSON automatically. Drag the file back into the editor to restore the exact design state across sessions.
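A hedged sketch of what such a Session JSON might contain, based only on the features described above (nodes with generated code, typed edges, repair history). The schema is not documented here, so every field name below is an assumption.

```javascript
// Hypothetical session shape: a plain JSON-serializable record that the
// editor could write out and later re-ingest to restore the design state.
const session = {
  prompt: 'neon jellyfish drifting through volumetric fog',
  nodes: [
    { id: 'noise_01', engine: 'glsl', code: '/* GLSL source */', z: 0 },
    { id: 'blend_01', engine: 'glsl', code: '/* GLSL source */', z: 1 },
  ],
  edges: [{ from: 'noise_01', to: 'blend_01', type: 'composite' }],
  repairs: [{ nodeId: 'noise_01', attempts: 1, status: 'approved' }],
};
```

Because everything round-trips through JSON, restoring a session is a pure data load — no regeneration is needed.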

🤖

Design Copilot

A chat-based post-generation copilot lets users edit the live graph in plain language via 7 tool calls: create_node, duplicate_node, regenerate_node, update_parameters, delete_node, move_node, replace_concept.
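The seven tools above suggest a standard tool-calling dispatch. This sketch assumes a JSON tool-call convention; the tool names are from the list above, but the argument field names and the dispatcher are illustrative assumptions.

```javascript
// Example tool call the copilot might emit for "make the noise slower".
const toolCall = {
  tool: 'update_parameters',
  arguments: { nodeId: 'noise_01', parameters: { frequency: 0.5 } },
};

// Route a call to the matching handler; unknown tools are rejected.
function dispatch(call, handlers) {
  const handler = handlers[call.tool];
  if (!handler) throw new Error(`unknown tool: ${call.tool}`);
  return handler(call.arguments);
}
```

A dispatcher like this keeps the LLM's output surface small and auditable: the model can only touch the graph through the seven named operations.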

🧱

Predefined Node Library

Seven hardened, guaranteed-compile nodes (Video Input, Audio Input, Tracking, Color, Noise, Blend, Particles) provide a zero-code on-ramp for beginners to augment any generated pipeline instantly.

Predefined Zero-Code Nodes

Seven hardened nodes ship with guaranteed-compile templates, available as one-click augmentation to any live pipeline — no prompt required.

Node | Engine | Role | Key Parameters
Video Input | html_video | Source | Webcam / file / URL, flip
Audio Input | webaudio | Source | Mic / file, FFT bins, gain
Tracking | canvas2d | Process | Face / hand / body (ml5.js)
Color Node | glsl | Process | Hue, exposure, saturation
Noise Generator | glsl | Source | Perlin / Simplex / FBM, frequency
Blend Node | glsl | Process | 9 blend modes, opacity
Particle Node | p5 | Source | Count, shape, speed, links

Tech Stack

Cascade generates and executes code across six polyglot engine types, all sharing a single WebGL2 compositing context.

GLSL / WebGL2
p5.js
Three.js
WebAudio API
Canvas 2D
CLIP ViT-B/32
Sentence-BERT
Knowledge Graph RAG
ml5.js (Tracking)
Agentic LLM Pipeline
Sugiyama Layout
Session JSON
Headless Node.js Validation
Hot-Reload Repair Loop

Open Source

The full Cascade system — agentic pipeline, browser runtime, node library, and session editor — is available on GitHub.

github.com/debayanmkrj/Cascade
