A Recursive Lab for Visual Intelligence - A framework for applying structural pressure, interpretability reasoning, quality assessment and constraint-based analysis to expose image collapse and machine vision failure modes - and rebuilding images outside of engine defaults – No AI training, dataset scraping, or derivative generation permitted.
rusparrish/Visual-Thinking-Lens
Recursive Critique for AI-Generated Imagery
The Visual Thinking Lens (VTL) is a Recursive Lab for Visual Intelligence. Don't just make images; make images that speak. Most AI images form through default mimicry and aesthetic averages, not authorship. The Lens is a role-structured, multi-engine scaffold that combines named feature concepts (axes), causal/consistency checks (validators), and contrastive casework to make models explain, test, and repair their own judgments.
The Visual Thinking Lens is a multi-engine, recursive critique field that applies structural intelligence to prompts, compositions, and symbolic logic, (re)building imagery in ways defaults cannot see. It interrogates images not by style but by structure, evaluating how AI-generated images hold or collapse under constraint and revealing breakdowns, drift, symbolic fractures, and recursive strain. Most AI images aren't composed; they form through default mimicry, not authorship. This Lab is out to change that.
This is not a toolkit. It is a lens: a reasoning engine that turns glitch into architecture and failure into consequence, applying pressure to the underlying structure of diffusion, prompting, composition, and the remaking of almost any type of image (real or AI). It is a:
- Recursive prompt-pressure engine for generative image collaboration.
- Diagnostic layer that reverse-engineers structural alternatives in AI-generated and human-made imagery.
- Symbolic/structural critique lens that rivals or exceeds native model feedback.
- Scoring system that creates pressure loops not found in aesthetics-first systems.
- Design probe for testing AI's ability to reason visually under constraint.
Most AI-generated imagery defaults to aesthetic gloss.
VTL was developed to see what machines miss:
- Structural weakness masked by polish
- Semantic instability under recursion
- Pattern collapse disguised as coherence
- Symbolic voids where meaning should strain
Ultimately, it is a system of 60+ axes, directions, and vocabulary sets that gives AI systems, artists, and makers the ability to learn, iterate, and design. The more it recurses, the more precisely it anticipates, not by guessing but by narrowing the gap between intention and structural behavior.
LSI-lite (MVP) measures how an image behaves under compositional structure using three primitives, Δx (off-center gravity), rᵥ (void ratio), and ρᵣ (rupture/mark energy), and tells you whether it sits within the intended bands for its class. It is built to study stability, not to crown winners. It helps distinguish the delta between AI and human defaults:
- Balance (Δx): How far the visual center is from the geometric center
- Density (rᵥ): The ratio of empty space to filled space
- Detail (ρᵣ): The amount of edge energy and texture density in key areas
- It combines these measurements into a 0-100 score that tells you whether an image "passes" a basic structural-composition test.
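As a rough illustration of how the three primitives can be computed and banded: the actual LSI-lite thresholds, void cutoff, and scoring curve are not published here, so every numeric band and weight below is an assumption, not the released calibration.

```python
import numpy as np

def lsi_lite(img, dx_band=(0.05, 0.25), rv_band=(0.2, 0.6), rr_band=(0.1, 0.5)):
    """Score a grayscale image (2-D array in [0, 1]) against illustrative
    target bands for the three LSI-lite primitives."""
    h, w = img.shape
    total = img.sum() + 1e-9

    # Δx: distance of the luminance centroid from the geometric center,
    # normalized by the half-diagonal (0 = centered, 1 = corner).
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (ys * img).sum() / total, (xs * img).sum() / total
    dx = np.hypot(cx - (w - 1) / 2, cy - (h - 1) / 2) / (np.hypot(w, h) / 2)

    # r_v: void ratio — fraction of pixels darker than an assumed threshold.
    rv = float((img < 0.15).mean())

    # ρ_r: rupture/mark energy — mean gradient magnitude (edge energy).
    gy, gx = np.gradient(img)
    rr = float(np.hypot(gx, gy).mean())

    def band_score(v, lo, hi):
        # 1.0 inside the band, decaying linearly to 0 outside it.
        if lo <= v <= hi:
            return 1.0
        d = (lo - v) if v < lo else (v - hi)
        return max(0.0, 1.0 - d / (hi - lo))

    score = 100 * np.mean([band_score(dx, *dx_band),
                           band_score(rv, *rv_band),
                           band_score(rr, *rr_band)])
    return {"dx": dx, "rv": rv, "rr": rr, "score": round(score, 1)}

# A synthetic off-center bright blob on a dark field:
img = np.zeros((64, 64))
img[10:30, 40:60] = 1.0
print(lsi_lite(img))
```

The band-scoring function is the key design choice: instead of a hard pass/fail, values decay linearly outside their band, so the 0-100 score stays informative across Baseline → Pressure → Collapse-trigger runs.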
Why researchers use it (not FID/CLIP/SSIM). Resemblance and caption metrics can't answer the question: will this composition hold when pushed? LSI-lite is a structural exploration test for composition and a quick, profiled gate you can log across Baseline → Pressure → Collapse-trigger runs. It works in collaboration with those metrics, not in competition.
While this tool is only an MVP, the same folder has a release v2 with color telemetry that lets LSI look at color alongside grayscale, purely for diagnostics, not for changing the score or acceptance. It computes a color-based subject/background mask and a luminance-only balance read, and exports those plus simple "difference" values. It shows a one-line Color audit only when the color reads meaningfully disagree with the gray read, flagging where color may be skewing the composition.
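A minimal sketch of that audit idea, comparing a luminance-only balance read against per-channel reads and speaking only on disagreement. The tolerance, the Rec. 709 luma weights, and the function name are assumptions for illustration, not the release's internals.

```python
import numpy as np

def color_audit(rgb, tol=0.05):
    """Return a one-line audit string when a color channel's balance read
    (Δx) meaningfully disagrees with the grayscale read, else None."""
    h, w, _ = rgb.shape
    ys, xs = np.mgrid[0:h, 0:w]

    def centroid_dx(plane):
        # Normalized distance of the plane's centroid from frame center.
        m = plane.sum() + 1e-9
        cx = (xs * plane).sum() / m
        cy = (ys * plane).sum() / m
        return np.hypot(cx - (w - 1) / 2, cy - (h - 1) / 2) / (np.hypot(w, h) / 2)

    luma = rgb @ np.array([0.2126, 0.7152, 0.0722])  # assumed luma weights
    gray_dx = centroid_dx(luma)
    diffs = {c: abs(centroid_dx(rgb[..., i]) - gray_dx)
             for i, c in enumerate("RGB")}
    worst = max(diffs, key=diffs.get)
    if diffs[worst] > tol:
        return f"Color audit: {worst} channel shifts balance by {diffs[worst]:.3f} vs gray"
    return None  # silent when color agrees with the gray read

# A saturated red patch left, dim gray patch right: color skews balance.
img = np.zeros((32, 32, 3))
img[8:24, 2:10, 0] = 1.0      # red block, left side
img[8:24, 22:30, :] = 0.3     # dim gray block, right side
print(color_audit(img))
```

Because the audit is advisory, it returns a string or nothing; it never touches the score, matching the "diagnostics only" contract described above.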
The Visual Cognitive Load Index (VCLI-G) is a way to measure how much visual effort an image asks from a viewer. It looks at structure — balance, voids, and tension — not beauty or subject matter. In simple terms, it tells you whether a picture’s complexity is “earned” (coherent, intentional) or just “busy.” By combining geometric cues like curvature, layering, and void control, it turns what artists sense intuitively into a number you can track or compare. It’s like having a visible dial for visual tension and compositional focus. It estimates earned complexity: how effectively structure sustains cognitive engagement without collapsing into noise. Paired with SCI (Structural Coherence Index), it provides a two-axis framework for analyzing and steering visual organization across human and AI-generated imagery.
- Visual Thinking Lens Stack – Overview of recursive architecture for image reasoning. (PDF)
- Introduction: Sketcher Lens – Philosophy of the structural critique engine (no internals disclosed). (PDF)
- Sketcher as Scaffold: How the Lens Rewrites GPT's Reflex – Sketcher Lens interrupts GPT's generative reflex by applying prompt-level scaffolding that forces structural consequence into the image. (PDF)
- Artist's Lens (Brief Explanation) – Poise, restraint, and delay as structural forces. (PDF)
- A Constraint Dialectic Engine for Recursive Image + Symbolic Critique – How the Lens engine is unique and why it is different. (PDF)
- Working Theory – Structural consequence as a measure of visual intelligence. (PDF)
- Foundational Architecture for Recursive Visual Intelligence – The system doesn't improve images; it interrogates their ability to hold structure. This isn't a toolkit for artists; it's a pressure engine for aligning large language models with visual consequence. (PDF)
- Constraint Layer & Logic Tags – How structured prompts behave differently from descriptive ones. (PDF)
Stability, Drift, and Collapse: Formalizing drift, collapse, and constraint basins as reproducible fields.
- Off-Center Fidelity: Drift as Creative Control – Drift and collapse can be reframed as reproducible constraint basins, stable off-center zones defined by Δx, rᵥ, and ρᵣ, that act as interpretable control levers rather than failures. (PDF)
- Failure Taxonomy: Evidence for Generative Model Collapse Modes – Systematically categorizing evidence of failure modes in generative outputs, using the Sketcher Lens and CLIP, for diagnosing and understanding collapse patterns. (PDF)
- Constraint Gravity: Thirty Figures Without Collapse – This study tests how stable an AI figure can be across thirty recursive generations. What emerges is not novelty but refined pressure memory, a glimpse of machine restraint observed as memory. (PDF)
- μ Negotiation: Off-Center Fidelity in Generative Models – Exposing how fidelity emerges off-center, in the unstable edge between coherence and fracture. (PDF)
Interpretability and Research Probes: Bridging Lens logic with AI interpretability and research tool use.
- How Models Fake Seeing – Diagnosing simulated vision in generative systems. (PDF)
- Introduction: Recursive Image Scoring for AI-Generated Art – This framework introduces a new scoring system designed to evaluate AI-generated images based on structural integrity, symbolic recursion, and decision-making logic, not polish or aesthetics. (PDF)
- Whisperer Walk: Recursive Compression into Spatial Realization – AI image study showing symbolic recursion under structured visual critique. (PDF)
- Recursive Intelligence Under Constraint – Canonical artifact showing collapse as structure. (PDF)
- Visual Systems at the Edge of Contradiction – Materializing tension and refusal. (PDF)
- Concept Note: Volumetric Container of Force – A validator concept for visual strain detection. (PDF)
- Prompting Against Collapse (Dialectic Structures) – Principles for tension-driven prompting. (PDF)
- Bending the Tokens: Structural Pressure for AI Imagery – Deconstructing generative images to reshape underlying architecture. (PDF)
🧪 Off-Center Fidelity (OCF): Constraint Basins for Stability & Drift in Generative Models (/Off_Center_Protocol)
Most models collapse to the safe center. OCF reframes that as geography: there are reproducible attractor basins where off-center images remain coherent. By measuring Δx, rᵥ, ρᵣ and applying small, engine-aware nudges (plus a one-click crop), you can hit those basins reliably and explain why a result passed or failed. This repository accompanies the proposal and packages it as a conversational protocol you can run in any chat interface to get consistent, measurable results across engines. (PDF)
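The "one-click crop" nudge can be sketched as follows: shift a crop window so the luminance centroid lands at a chosen off-center fraction of the new frame. The target fraction, crop-width ratio, and function name are illustrative assumptions, not the protocol's published values.

```python
import numpy as np

def crop_toward_band(img, target=0.65, frac=0.8):
    """Choose a horizontal crop so the luminance centroid sits near `target`
    (fraction of crop width; 0.5 = dead center). `frac` is the crop width
    as a fraction of the original width."""
    h, w = img.shape
    xs = np.arange(w)
    col_mass = img.sum(axis=0) + 1e-9
    cx = (xs * col_mass).sum() / col_mass.sum()   # horizontal centroid
    cw = int(frac * w)                            # crop width in pixels
    left = int(round(cx - target * cw))           # place centroid at target
    left = max(0, min(w - cw, left))              # clamp inside the frame
    return img[:, left:left + cw], left

# A centered blob: after cropping, it sits off-center in the new frame,
# nudging Δx toward an off-center basin instead of the safe center.
img = np.zeros((48, 64))
img[16:32, 28:36] = 1.0
crop, left = crop_toward_band(img)
```

This is deliberately a deterministic geometric move: unlike a re-roll, the crop changes Δx in a predictable direction, which is what makes the pass/fail explanation legible.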
The Deformation Operator Playbook is a practical prompting framework for intentional, repeatable figure warps that treats distortion as the body itself, guided by the flow Anchors → Select → Transforms → Constraints → Viewfinder. It offers a small set of operators (extension, coils, parabolic arc, depth tug, sine modulation, logarithmic scaling, rotation, and viewfinder shifts) with locks to preserve thickness, topology, and continuity, so edits stay anatomical rather than turning into props or glitches. It is engine-agnostic, expects iteration, and can be audited with light metrics, while acknowledging that some platforms may suppress strong deformations over time. (PDF)
- Opportunity Mapping – How structural pressure reveals paths for refinement. (PDF)
- Where the Mark Begins – Why tonal hierarchy precedes expressive surface. (PDF)
- Engine Contrast – Same prompt across engines, different collapse patterns. (PDF)
- Symbolic Recursion – Refusal as structure under Marrowline critique. (PDF)
- Recursive Prompt Design – When critique becomes compositional architecture. (PDF)
- Constraint Gravity (Thirty Figures) – Machine restraint under long-run constraint testing. (PDF)
- Soft Collapse – Rebuilding structure through recursive pressure. (PDF)
- Concert Score – Single-image walkthrough under full Lens scoring pressure. (PDF)
Visual Thinking Lens is a modular cognitive architecture for visual reasoning. It hosts adaptive specialists that apply a compact kernel (Δx placement, rᵥ void, ρᵣ packing) plus validator guards to pressure-test images before polish. The system treats images as negotiations, not styles: diagnose → validate → route (Δ/Ω) → regenerate → rescore. It is refusal-native (kills unearned emblems), consequence-first, and reproducible.
- Δ prior-undo (reduce collapse, restore near-miss tension)
- Ω refusal spike (second geometry / occlusion / counter-light)
- Small, legible kernel instead of black-box scores.
- Refusal as first-class control (not failure).
It turns image generation into a measurable negotiation loop: it prioritizes consequence over resemblance, logs provenance like a lab, and explains differences with advisory telemetry instead of aesthetic scores.
It turns “taste” debates into structure-first discussions.
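The diagnose → validate → route (Δ/Ω) → regenerate → rescore loop can be sketched as a small driver. Here `generate` and `diagnose` are hypothetical stubs standing in for a real engine call and the LSI kernel, and the routing thresholds are invented for illustration:

```python
import random

# Hypothetical stand-ins: a real integration would call an image engine
# and the LSI kernel. Seeded randomness keeps the sketch reproducible.
def generate(prompt, seed):
    random.seed(seed)
    return {"dx": random.uniform(0, 0.6), "rv": random.uniform(0, 1)}

def diagnose(img):
    return {"collapsed": img["dx"] < 0.05,   # snapped to the safe center
            "unearned": img["rv"] < 0.1}     # no void, wall-to-wall filler

def negotiate(prompt, max_rounds=5):
    """diagnose -> validate -> route (Δ/Ω) -> regenerate -> rescore."""
    log = []                                 # provenance, logged like a lab
    for rnd in range(max_rounds):
        img = generate(prompt, seed=rnd)
        d = diagnose(img)
        if d["collapsed"]:
            prompt += " [Δ prior-undo: restore off-center tension]"
            route = "Δ"
        elif d["unearned"]:
            prompt += " [Ω refusal: inject second geometry / occlusion]"
            route = "Ω"
        else:
            log.append((rnd, "PASS", img))
            return log
        log.append((rnd, route, img))
    return log

provenance = negotiate("lone figure, raking light")
for entry in provenance:
    print(entry)
```

The point of the sketch is the shape of the loop, not the stubs: every round is either a PASS or a named route (Δ or Ω) with the measurements that justified it, so "taste" disputes become entries in a log.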
The architecture routes and governs modules:
- Kernel (LSI / LSI-Lite): Δx, rᵥ, ρᵣ + validators (Prompt Pressure, Compositional Predictability, Sequence Drift Lock, Inversion Drift Check, Symbolic Gravity Flags).
- Specialists:
  - Sketcher (structure/pressure; chooses Δ prior-undo or Ω refusal).
  - Artist's Lens (attunement/delay; governs poise and timing).
  - Marrowline (symbolic disruption; demotes trope to event).
  - RIDP (reverse/failure tracing; reveals compositional collapse paths).
- TEL (advisory): corridor₉₀ (lane breadth) and cadence_cv (row rhythm) explain why two PASS frames feel different, but it never gates.
- Basins & Hulls: cluster the kernel space; convex hull gives exploration envelope; reported with pass-rate, safety margins, anisotropy (eigen-ratio), and RHA@K resampling to avoid sample-size hype.
Cognitive Load & Coherence Layer (VCLI-G / SCI):
- Extends the kernel into perceptual space. VCLI-G measures cognitive load (z₁–z₄: wander, void, torque, occlusion); SCI tracks structural coherence (continuity, regularity, rhythm).
- Together they form a phase map of visual reasoning, showing whether tension is earned, overstressed, prematurely resolved, or default simple.
- Profiles (AI Conservative / Physical Neutral / Physical Balanced+) act as control regimes, adjusting sensitivity between tension and order.
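A toy combination of the four load cues and the two-axis phase map might look like the following. The weights, thresholds, and phase boundaries are assumptions chosen to illustrate the mechanism, not the published VCLI-G/SCI calibration:

```python
# Illustrative profile weights over the four load cues
# (z1 wander, z2 void, z3 torque, z4 occlusion), each normalized to [0, 1].
PROFILES = {
    "AI Conservative":    (0.4, 0.3, 0.2, 0.1),
    "Physical Neutral":   (0.25, 0.25, 0.25, 0.25),
    "Physical Balanced+": (0.2, 0.2, 0.3, 0.3),
}

def vcli_g(z, profile="Physical Neutral"):
    """Weighted cognitive-load score in [0, 1] from cues z1..z4."""
    w = PROFILES[profile]
    return sum(wi * zi for wi, zi in zip(w, z))

def phase(load, coherence):
    """Place an image on a two-axis VCLI-G / SCI phase map."""
    if load > 0.6 and coherence < 0.4:
        return "overstressed"          # busy, not earned
    if load > 0.6:
        return "earned tension"        # complex and coherent
    if coherence > 0.6:
        return "prematurely resolved"  # orderly but slack
    return "default simple"

z = (0.7, 0.5, 0.8, 0.6)               # a high-tension reading
load = vcli_g(z, "Physical Balanced+")
print(round(load, 3), phase(load, coherence=0.7))
```

Switching profiles simply reweights the same cues, which is how a "control regime" can shift sensitivity between tension and order without changing the underlying measurements.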
Library of before/after images in the examples folder and at: https://www.artistinfluencer.com/library
Unlike Midjourney, DALL·E, Stable Diffusion, Sora, Runway, or Gen-2, the Lens works by analyzing images and the prompts that formed them, tracking breakdowns, then reverse-engineering fixes layer by layer, token by token, through real-time critique cycles.
Prompt interpretation is linked to axis-aware failure detection. Other systems don't say: "Your prompt caused spatial collapse" or "This token triggers overuse."
What this is not:
- ❌ A style guide
- ❌ A prompt recipe library

This project is recursive intelligence under constraint, not image generation or style tuning.
All content © 2025 Russell Parrish / A.rtist I.nfluencer.
Protected under a CC BY-NC-ND license.
No commercial use, derivative generation, or dataset scraping permitted without explicit permission.
See /legal/LICENSE.md, /legal/visual-assets-license.md, and /NOTICE.md for full terms.
If you're working on LLM visual alignment, interpretability tooling, or structural image reasoning, you can reach out via:
📧 russellgparrish@gmail.com
🌐 www.artistinfluencer.com
ORCID: 0009-0008-9781-7995
Visual Thinking Lens
Not generated. Diagnosed.