▲ loworbit.ai
a workshop in public
daniel kinney-spears
anduril · aws alum
every prototype below was built by an ai agent from a one-page prd. no agency. no templates. no hand-tuning. the code is in the open. the prompts are in the open. this is either the future or a cautionary tale. we'll see.
— now → 2026-04-16 —
full /now →
april 2026 — building loworbit. building panel (ai evaluation rooms) and iterating on surface. new at anduril. working from seattle. reading short stories and fiction to learn things.
permalink → /now/2026-04-16
— log —
full /log →
- 2026-04-16 retired hum
- 2026-04-16 capture iteration shipped
- 2026-04-16 capture phase 1 on c1n RFI: 23 obligations extracted in one shot, no retries. agent correctly excluded the policy quotes (DAF CIO, EO on cybersecurity) and the govt-side liability disclaimers — better grounding than predicted. weaknesses: 18/23 classified as "administrative" (a slot not in the PRD enum), confidence flatlined at 0.85 across half the rows, multi-part questions bundled into single obligations (OBL-002 packs 5 company-info sub-questions), and confidence shipped as floats instead of the PRD's string enum without the validator catching it. agent invented its own taxonomy slot AND its own confidence type, and the schema validator silently accepted both. this is the failure mode the PRD called "json schema violations the agent confidently asserts are fine" — except it's worse, because the violation IS the schema now. (a stricter validator is sketched after the log.)
- 2026-04-16 capture shipped — multi-phase capture analysis prototype, built through the loworbit pipeline. four phases with human checkpoints between them, schema-validated json + grounding-validated citations enforced between phases, full anthropic transcripts saved per run. schema review caught the agent using supabase auth's auth.uid() in rls policies; that function doesn't resolve when clerk handles auth, so every client read would have silently returned zero rows. fixed to match the hub's permissive-read + service-role-write pattern (sketched after the log). first real session next.
- 2026-04-16 capture went live
- 2026-04-16 spawned capture
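the validator sketch, as promised. a closed schema rejects both of capture's phase-1 failure modes at parse time instead of letting them ship. this is a minimal typescript sketch with zod, not the actual checker; the field names and enum values are invented, since the real prd schema isn't reproduced here.

```ts
import { z } from "zod";

// hypothetical shape — the real prd schema isn't shown here.
// the point: category and confidence are closed sets, so an
// invented slot like "administrative" or a float confidence
// like 0.85 fails at parse time instead of shipping.
const Obligation = z
  .object({
    id: z.string().regex(/^OBL-\d{3}$/),
    text: z.string().min(1),
    category: z.enum(["deliverable", "compliance", "question"]), // closed taxonomy
    confidence: z.enum(["low", "medium", "high"]),               // string enum, never a float
  })
  .strict(); // unknown keys are rejected too

export function validatePhaseOutput(rows: unknown[]) {
  const errors: string[] = [];
  const valid = rows.flatMap((row, i) => {
    const parsed = Obligation.safeParse(row);
    if (!parsed.success) {
      errors.push(`row ${i}: ${parsed.error.issues.map((e) => e.message).join("; ")}`);
      return [];
    }
    return [parsed.data];
  });
  if (errors.length) throw new Error(errors.join("\n")); // fail the checkpoint loudly
  return valid;
}
```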
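and the rls pattern, for the record. with clerk doing auth, supabase's auth.uid() reads from a supabase jwt that never exists, so any policy keyed on it matches nothing. the sketch below is hypothetical sql wrapped in a typescript string (table name invented), showing the permissive-read + service-role-write shape.

```ts
// a minimal sketch of the pattern, not the actual migration.
// table name is invented. with clerk handling auth, supabase's
// auth.uid() is always null on client requests, so a policy
// like `using (auth.uid() = owner_id)` matches zero rows.
export const migration = /* sql */ `
  alter table obligations enable row level security;

  -- permissive read: anyone with the anon key can select
  create policy "public read" on obligations
    for select using (true);

  -- no insert/update/delete policies on purpose: with rls
  -- enabled and no matching policy, client writes are denied.
  -- server-side writes use the service-role key, which
  -- bypasses rls entirely.
`;
```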
— in the workshop —
- surface · a web-based compositor for building live, data-connected presentations from react components — then presenting or exporting them.
- panel · a mock evaluation panel that will give you multiple (realistic and useful) perspectives on any sales document.
- encyclopedia · a shared horror wiki that generates itself as people read it.
- drift · a writing space that gives you one prompt per day, lets you write into it, and then throws your writing away when you leave. no saves.
- capture · multi-phase capture analysis on a real solicitation, with checkpoints between phases
each links to /tools/[slug] for context and try-it.
— the method —
write a prd → spawn → agent builds → checker runs → auto-merge → deploy → i evaluate → i iterate, or scrap.
the agent runs fast. the interesting part is what i kept, what i cut, and what i scrapped between deploy and the next prd. that's where the judgment lives.
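the loop, sketched as typescript for concreteness. everything here is hypothetical naming; the real checker, merge, and deploy steps aren't public. the shape is what matters: the checker is the only automated gate, and the human verdict only arrives after deploy.

```ts
// hypothetical sketch of the loop; names are invented.
type Verdict = "keep" | "iterate" | "scrap";

interface Pipeline {
  build: (prd: string) => Promise<string>;     // agent builds from the one-page prd
  check: (diff: string) => Promise<boolean>;   // the only automated gate before merge
  deploy: (diff: string) => Promise<string>;   // auto-merge → deploy, returns a url
  evaluate: (url: string) => Promise<Verdict>; // human judgment, after deploy
}

async function cycle(prd: string, p: Pipeline): Promise<Verdict> {
  const diff = await p.build(prd);
  if (!(await p.check(diff))) return "scrap";  // checker failure ends the run
  const url = await p.deploy(diff);
  return p.evaluate(url);                      // "iterate" feeds the next prd
}
```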
— what i'm trying to figure out —
- which evals would actually tell you whether a frontier model can do real capture-grade work
- how to keep an agent's state coherent across multi-phase analysis without losing the thread by phase three
- what the checker should catch that it currently doesn't, read off the failures the agent shipped past it
writeups land here as the work does.
— some stuff i've done —
ai capture + proposals toolset @ aws
10,000+ docs processed. est. $2–4m in direct labor savings.
content management overhaul @ aws
supports $10b/yr in business. ~90% reduction in maintenance load.
global genai training @ aws
200+ people trained across sales, capture, and proposal teams.
genai challenge @ aws
60+ micro-apps produced. best of them shipped as a curated toolset.
design + layout team @ aws
launched a new team for custom, customer-focused proposal design. handed off, still running.