Parallel Reality

LLM Test Harness Load a scenario → edit response → send to LLM → compare
Select a scenario from the sidebar
Select a failure scenario Each scenario contains the full conversation history leading up to an AI response that Leonard disliked.

You can edit the AI's response (or any part of the conversation) and re-send to the LLM to test behavioral changes.