▲ loworbit.ai
a workshop in public
daniel kinney-spears
anduril · aws alum
every prototype below was built by an ai agent from a one-page prd. no agency. no templates. no hand-tuning. the code is in the open. the prompts are in the open. this is either the future or a cautionary tale. we'll see.
— now → 2026-04-16 —
full /now →
april 2026 — building loworbit. building panel (ai evaluation rooms) and iterating on surface. new at anduril. working from seattle. reading short stories and fiction to learn things.
permalink → /now/2026-04-16
— log —
full /log →
- 2026-04-16 retired hum
- 2026-04-16 capture iteration shipped
- 2026-04-16 capture phase 1 on c1n RFI: 23 obligations extracted in one shot, no retries. agent correctly excluded the policy quotes (DAF CIO, EO on cybersecurity) and the govt-side liability disclaimers — better grounding than predicted. weaknesses: 18/23 classified as "administrative" (a slot not in the PRD enum), confidence flatlined at 0.85 across half the rows, multi-part questions bundled into single obligations (OBL-002 packs 5 company-info sub-questions), and confidence shipped as floats instead of the PRD's string enum without the validator catching it. agent invented its own taxonomy slot AND its own confidence type, and the schema validator silently accepted both. this is the failure mode the PRD called "json schema violations the agent confidently asserts are fine" — except it's worse, because the violation IS the schema now. (a stricter validator is sketched after the log.)
- 2026-04-16 capture shipped — multi-phase capture analysis prototype, built through the loworbit pipeline. four phases with human checkpoints between them, schema-validated json + grounding-validated citations enforced between phases, full anthropic transcripts saved per run. schema review caught the agent using supabase auth's auth.uid() in rls policies; that function doesn't resolve when clerk handles auth, so every client read would have silently returned zero rows. fixed to match the hub's permissive-read + service-role-write pattern (sketched after the log). first real session next.
- 2026-04-16 capture went live
- 2026-04-16 spawned capture
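the validator sketch, as promised. a closed schema rejects both of capture's phase-1 failure modes at parse time instead of letting them ship. this is a minimal typescript sketch with zod, not the actual checker; the field names and enum values are invented, since the real prd schema isn't reproduced here.

```ts
import { z } from "zod";

// hypothetical shape — the real prd schema isn't shown here.
// the point: category and confidence are closed sets, so an
// invented slot like "administrative" or a float confidence
// like 0.85 fails at parse time instead of shipping.
const Obligation = z
  .object({
    id: z.string().regex(/^OBL-\d{3}$/),
    text: z.string().min(1),
    category: z.enum(["deliverable", "compliance", "question"]), // closed taxonomy
    confidence: z.enum(["low", "medium", "high"]),               // string enum, never a float
  })
  .strict(); // unknown keys are rejected too

export function validatePhaseOutput(rows: unknown[]) {
  const errors: string[] = [];
  const valid = rows.flatMap((row, i) => {
    const parsed = Obligation.safeParse(row);
    if (!parsed.success) {
      errors.push(`row ${i}: ${parsed.error.issues.map((e) => e.message).join("; ")}`);
      return [];
    }
    return [parsed.data];
  });
  if (errors.length) throw new Error(errors.join("\n")); // fail the checkpoint loudly
  return valid;
}
```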
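and the rls pattern, for the record. with clerk doing auth, supabase's auth.uid() reads from a supabase jwt that never exists, so any policy keyed on it matches nothing. the sketch below is hypothetical sql wrapped in a typescript string (table name invented), showing the permissive-read + service-role-write shape.

```ts
// a minimal sketch of the pattern, not the actual migration.
// table name is invented. with clerk handling auth, supabase's
// auth.uid() is always null on client requests, so a policy
// like `using (auth.uid() = owner_id)` matches zero rows.
export const migration = /* sql */ `
  alter table obligations enable row level security;

  -- permissive read: anyone with the anon key can select
  create policy "public read" on obligations
    for select using (true);

  -- no insert/update/delete policies on purpose: with rls
  -- enabled and no matching policy, client writes are denied.
  -- server-side writes use the service-role key, which
  -- bypasses rls entirely.
`;
```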
— in the workshop —
- surface · a web-based compositor for building live, data-connected presentations from react components — then presenting or exporting them.
- panel · a mock evaluation panel that will give you multiple (realistic and useful) perspectives on any sales document.
- encyclopedia · a shared horror wiki that generates itself as people read it.
- drift · a writing space that gives you one prompt per day, lets you write into it, and then throws your writing away when you leave. no saves.
- capture · multi-phase capture analysis on a real solicitation, with checkpoints between phases
each links to /tools/[slug] for context and try-it.
— the method —
write a prd → spawn → agent builds → checker runs → auto-merge → deploy → i evaluate → i iterate, or scrap.
the agent runs fast. the interesting part is what i kept, what i cut, and what i scrapped between deploy and the next prd. that's where the judgment lives.
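the loop, sketched as typescript for concreteness. everything here is hypothetical naming; the real checker, merge, and deploy steps aren't public. the shape is what matters: the checker is the only automated gate, and the human verdict only arrives after deploy.

```ts
// hypothetical sketch of the loop; names are invented.
type Verdict = "keep" | "iterate" | "scrap";

interface Pipeline {
  build: (prd: string) => Promise<string>;     // agent builds from the one-page prd
  check: (diff: string) => Promise<boolean>;   // the only automated gate before merge
  deploy: (diff: string) => Promise<string>;   // auto-merge → deploy, returns a url
  evaluate: (url: string) => Promise<Verdict>; // human judgment, after deploy
}

async function cycle(prd: string, p: Pipeline): Promise<Verdict> {
  const diff = await p.build(prd);
  if (!(await p.check(diff))) return "scrap";  // checker failure ends the run
  const url = await p.deploy(diff);
  return p.evaluate(url);                      // "iterate" feeds the next prd
}
```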
— what i'm trying to figure out —
- which evals would actually tell you whether a frontier model can do real capture-grade work
- how to keep an agent's state coherent across multi-phase analysis without losing the thread by phase three
- what the checker should catch that it currently doesn't, read off the failures the agent shipped past it
writeups land here as the work does.
— some stuff i've done —
ai capture + proposals toolset @ aws
10,000+ docs processed. est. $2–4m in direct labor savings.
content management overhaul @ aws
supports $10b/yr in business. ~90% reduction in maintenance load.
global genai training @ aws
200+ people trained across sales, capture, and proposal teams.
genai challenge @ aws
60+ micro-apps produced. best of them shipped as a curated toolset.
design + layout team @ aws
launched a new team for custom, customer-focused proposal design. handed off, still running.