I'd been tweaking my augmented coding setup for months - adjusting CLAUDE.md rules, adding instructions for testing discipline, complexity management, incremental delivery. Things I've repeated to every team I've worked with, now repeated to AI agents. It worked, but it felt like writing the same email over and over.
Then I found Lada Kesseler's skill-factory.
What Skills Are (And Why They Matter)
If you use Claude Code, you already know about CLAUDE.md - a file where you put instructions that the agent reads at the start of every conversation. It works. But it has a problem: everything is always loaded. Your TDD guidelines, your Docker best practices, your refactoring workflow - all of it competing for the agent's limited context window, whether it's relevant or not.
Skills solve this differently. They're packaged knowledge that activates only when relevant. You type /mutation-testing and the agent gains deep expertise about finding weak tests through mutation analysis. You type /complexity-review and it becomes a technical reviewer that challenges your proposals against 30 dimensions of complexity. The rest of the time, that knowledge stays out of the way.
Think of it as progressive disclosure for AI context. The agent gets what it needs, when it needs it.
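Concretely, a skill is just a directory containing a SKILL.md file: a few lines of metadata telling the agent what the skill is for and when to load it, followed by the knowledge itself. A minimal sketch (the exact wording and body here are illustrative, not copied from the repo):

```markdown
---
name: mutation-testing
description: Evaluate whether tests would actually catch bugs, using
  mutation analysis. Use when reviewing or improving test quality.
---

# Mutation Testing

Coverage shows what tests execute; mutation testing shows what they detect.
When reviewing tests, introduce small deliberate bugs (mutants) and check
that at least one test fails for each...
```

Only the metadata is scanned up front; the body loads when the skill activates. That's the progressive disclosure.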
The Discovery: Lada Kesseler's Skill Factory
Lada Kesseler built the skill-factory - a repository with 315 commits of carefully crafted skills covering serious engineering ground: TDD, Nullables (James Shore's pattern for testing without mocks), approval tests, refactoring (using Llewellyn Falco's approach), hexagonal architecture, event modeling, collaborative design, and more.
These aren't toy prompts. The Nullables skill alone includes reference material for infrastructure wrappers, embedded stubs, output tracking, and three different architectural patterns. The approval-tests skill covers Java, Python, and Node.js with scrubbers, reporters, and inline patterns. This is deep, carefully structured knowledge.
Lada also co-created augmented-coding-patterns - a catalog of 43 patterns, 14 obstacles, and 9 anti-patterns for working effectively with AI coding tools. It's a collaboration between Lada Kesseler, Ivett Ordog, and Nitsan Avni. If you're doing augmented coding and haven't seen it, stop reading this and go look.
What I found wasn't just a collection of skills. It was an approach to sharing engineering knowledge with AI agents that I hadn't seen anywhere else.
The Fork as Extension
The natural next step wasn't to start from scratch - it was to fork and extend. Lada's skills already covered testing fundamentals, design patterns, and AI-specific workflows. What I noticed missing were the practices I kept explaining repeatedly: how to manage complexity, how to deliver incrementally, how to make sure tests actually catch bugs.
So I added 11 skills. Not because 16 wasn't enough, but because my particular set of problems needed particular solutions.
Testing rigor
test-desiderata - Kent Beck's 12 properties that make tests valuable. Not "does this test pass?" but "is this test isolated? composable? predictive? inspiring?" I was tired of AI generating tests that had coverage but no diagnostic power. This skill makes the agent evaluate tests against each property and suggest concrete improvements.
mutation-testing - The question code coverage can't answer: "Would my tests catch this bug?" Coverage tells you what your tests execute. Mutation testing tells you what they'd detect. I'd already written a blog post about this - now it's a reusable skill. The examples are in Python and JavaScript, but I'm also using it successfully with Go.
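To see the gap between coverage and detection, here's a hand-rolled miniature of the question mutation testing asks (real tools like mutmut or Stryker automate the mutant generation; the function names here are made up):

```python
# Original code and a "mutant" with one small deliberate bug.
def is_adult(age):
    return age >= 18          # original: boundary included

def is_adult_mutant(age):
    return age > 18           # mutant: >= flipped to >

def weak_test(fn):
    # Executes every line (100% coverage) but never probes the boundary.
    return fn(30) is True and fn(5) is False

def strong_test(fn):
    # Adds the boundary case, which is exactly what the mutant breaks.
    return weak_test(fn) and fn(18) is True

# The weak test passes for both versions: the mutant "survives",
# revealing a test suite with coverage but no diagnostic power.
assert weak_test(is_adult) and weak_test(is_adult_mutant)

# The strong test passes the original and fails the mutant: mutant killed.
assert strong_test(is_adult) and not strong_test(is_adult_mutant)
```

A surviving mutant is the skill's cue to suggest a concrete missing test case.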
Delivering incrementally and managing complexity
This is where the skills chain together, and where things get interesting.
story-splitting - Detects linguistic red flags in requirements ("and", "or", "manage", "handle", "including") and applies splitting heuristics. It's the first pass: is this story actually three stories wearing a trenchcoat?
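The first pass is mechanical enough to sketch. A toy version of the red-flag scan (the actual skill applies much richer heuristics than word matching):

```python
# Toy illustration of the linguistic red flags the skill scans for.
RED_FLAGS = {"and", "or", "manage", "handle", "including"}

def red_flags(story):
    # Normalize each word and intersect with the flag list.
    words = {w.strip(",.").lower() for w in story.split()}
    return sorted(words & RED_FLAGS)

story = ("As a user, I want to manage my notification preferences "
         "including email, SMS, and push notifications")
print(red_flags(story))  # three flags: likely several stories in one
```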
hamburger-method - When a story doesn't have obvious split points but still feels too big, this skill applies Gojko Adzic's Hamburger Method: slice the feature into layers, generate 4-5 implementation options per layer, then compose the thinnest possible vertical slices.
small-safe-steps - The implementation planner. Takes any piece of work and breaks it into 1-3 hour increments using the expand-contract pattern for migrations, schema changes, API changes. Core belief: risk grows faster than the size of the change.
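The expand-contract idea is easiest to see on a field rename. A minimal sketch, assuming an in-memory store (class and field names are invented for illustration):

```python
# Hypothetical expand-contract migration: rename "mail" to "email"
# as a sequence of independently deployable steps.
class UserStore:
    def __init__(self):
        self.rows = {}

    # Step 1 (expand): write both the old and the new field, so old
    # and new readers coexist during the rollout.
    def save(self, user_id, email):
        self.rows[user_id] = {"mail": email, "email": email}

    # Step 2: switch reads to the new field, falling back to the old
    # one for rows written before the expand step.
    def load_email(self, user_id):
        row = self.rows[user_id]
        return row.get("email", row.get("mail"))

    # Step 3 (contract): once every reader uses "email" and old rows
    # are backfilled, stop writing "mail" and drop it.

store = UserStore()
store.save(1, "ada@example.com")
print(store.load_email(1))
```

Each step is small, deployable on its own, and reversible, which is the point: risk grows faster than the size of the change.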
complexity-review - My inner skeptic, encoded. Reviews technical proposals against 30 dimensions of complexity across 6 categories (data volume, interaction frequency, consistency requirements, resilience, team topology, operational burden). Pushes for the simplest viable approach. Use it when someone says "Kafka" and you want to ask "why not a queue?"
code-simplifier - Reduces complexity in existing code without changing behavior. The cleanup crew after a feature is done.
These five skills work as a pipeline: story-splitting -> hamburger-method -> small-safe-steps for delivery planning, with complexity-review as a gate before implementation and code-simplifier as a sweep after.
Practical tools and team workflows
thinkies - Kent Beck's creative thinking habits, turned into a skill. When you're stuck, it applies patterns like "What would I do if I had infinite resources?", "What's the opposite of my current approach?", "What would make this problem trivial?" It's less about code and more about unsticking your thinking.
traductor-bilingue - Technical translation between English and Spanish that keeps terms like "deploy", "pull request", "pipeline", and "staging" in English (because that's how Spanish-speaking dev teams actually talk). Small thing, but it saves constant corrections.
dockerfile-review - Reviews Dockerfiles for build performance, image size, and security issues.
modern-cli-design - Principles for building scalable CLIs: object-command architecture (noun-verb), LLM-optimized help text, JSON output, concurrency patterns.
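The object-command shape is worth making concrete. A minimal sketch using argparse, with an invented `acme user list` command and a JSON output flag for machine consumers:

```python
import argparse
import json

def build_parser():
    # Noun-verb layout: the first subcommand is an object ("user"),
    # the second is an action on it ("list").
    parser = argparse.ArgumentParser(prog="acme")
    nouns = parser.add_subparsers(dest="noun", required=True)

    user = nouns.add_parser("user").add_subparsers(dest="verb", required=True)
    user_list = user.add_parser("list")
    user_list.add_argument("--output", choices=["text", "json"],
                           default="text")
    return parser

def run(argv):
    args = build_parser().parse_args(argv)
    users = ["ada", "lin"]  # stand-in data for the sketch
    if args.output == "json":
        return json.dumps({"users": users})  # scriptable output
    return "\n".join(users)                  # human-readable output

print(run(["user", "list", "--output", "json"]))
```

The noun-verb structure scales by adding nouns rather than growing one flat flag namespace, and the `--output json` path is what makes the CLI friendly to scripts and LLM agents alike.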
A Skill in Action
To make this concrete, here's what the delivery planning pipeline looks like in practice.
Say you have a story: "As a user, I want to manage my notification preferences including email, SMS, and push notifications with scheduling and quiet hours."
Step 1 - You invoke /story-splitting. The agent immediately flags "manage", "including", and the conjunction "and" joining three notification types plus scheduling. It suggests splitting into at least 4 stories: one per notification channel plus quiet hours as a separate slice.
Step 2 - You take the first slice ("email notification preferences") and invoke /hamburger-method. It breaks the feature into layers (UI, API, business logic, persistence) and generates options for each. For the UI layer: (a) full settings page, (b) single toggle, (c) link to email with confirmation, (d) inline in profile. It composes the thinnest vertical slice: a single toggle with an API endpoint and a database flag.
Step 3 - You invoke /small-safe-steps on that thin slice. It produces a sequence of 1-3 hour steps: add the database column with a migration, add the API endpoint with tests, add the UI toggle, wire it together. Each step deployable independently.
No single skill does everything. They compose. That's the point.
How to Get Started
If you want to try these:
- Fork the repo: github.com/lexler/skill-factory (or my fork if you want the extra skills)
- Install skills: The repo includes a skills CLI tool. Run ./skills toggle to browse and select which skills to install into your Claude Code setup.
- Use them: Type /skill-name in Claude Code. /mutation-testing to check your tests. /complexity-review to challenge a design. /small-safe-steps to plan your next implementation.
- Make your own: The repo includes documentation and tooling for creating new skills. Fork it, add what you need, share it back.
Standing on Shoulders
The fork now totals 329 commits and 27 skills across 6 categories. But the number that matters most is that Lada built 315 of those commits; I added the other 14. The original structure, the skill manager, the testing and design skills that form the foundation - that's all her work. What I did was extend it with the practices I personally find myself repeating.
This is how open source has always worked: someone builds something good, others extend it, and the whole thing becomes more useful than any individual could make it. With AI skills, the effect compounds differently - every skill that gets shared becomes available to every person using it, making good practices almost free.
Lada's augmented-coding-patterns site (with Ivett Ordog and Nitsan Avni) takes this even further - it's not just tooling but a shared vocabulary for how we work with AI. Skills, patterns, obstacles, anti-patterns: a growing body of community knowledge.
What knowledge do you find yourself repeating to your AI agents? What practices would you encode as skills?
The barrier to sharing isn't technical anymore. It's deciding to do it.