CLI Reference

Complete reference for Skillet command-line tools.

eval

Run evaluations against a baseline or with a skill loaded.

bash

skillet eval <name> [skill] [options]

Arguments

Argument	Required	Description
`name`	Yes	Eval set name, directory path, or single `.yaml` file
`skill`	No	Path to skill directory (omit for baseline)

The name argument accepts:

A name: looks in ~/.skillet/evals/<name>/
A directory path: loads all .yaml files recursively
A single .yaml file path

Options

Flag	Short	Type	Default	Description
`--samples`	`-s`	int	3	Number of iterations per eval
`--max-evals`	`-m`	int	all	Maximum evals to run (randomly sampled)
`--tools`		str	all	Comma-separated list of allowed tools
`--parallel`	`-p`	int	3	Number of parallel workers
`--skip-cache`		bool	false	Skip reading from cache (still writes)
`--trust`		bool	false	Skip confirmation for setup/teardown scripts
`--no-summary`		bool	false	Skip the failure summary LLM call

Examples

bash

# Baseline eval (no skill)
skillet eval browser-fallback

# Eval with skill
skillet eval browser-fallback ~/.claude/skills/browser-fallback

# Single file
skillet eval ./evals/my-skill/001.yaml skill/

# More samples for statistical confidence
skillet eval my-skill -s 5

# Random sample of 5 evals
skillet eval my-skill -m 5

# More parallelism
skillet eval my-skill -p 5

# Force fresh runs (ignore cache)
skillet eval my-skill --skip-cache

# Skip script confirmation prompts
skillet eval my-skill --trust

# Skip failure summary (saves one LLM call)
skillet eval my-skill --no-summary

# Restrict available tools
skillet eval my-skill --tools "Read,Write,Bash"

tune

Iteratively improve a skill until target pass rate or max rounds.

bash

skillet tune <name> <skill> [options]

Arguments

Argument	Required	Description
`name`	Yes	Eval set name or path
`skill`	Yes	Path to skill directory

Options

Flag	Short	Type	Default	Description
`--rounds`	`-r`	int	5	Maximum tuning rounds
`--target`	`-t`	float	100.0	Target pass rate percentage
`--samples`	`-s`	int	1	Samples per eval per round
`--parallel`	`-p`	int	3	Number of parallel workers
`--output`	`-o`	path	auto	Custom output path for results JSON

How It Works

Runs all evals against the current skill
Records pass rate and failure details
If pass rate < target, uses DSPy optimization to generate improved skill
Writes improved skill to disk
Repeats until target reached or max rounds

Results are saved to ~/.skillet/tunes/<name>/<timestamp>.json (or custom path with -o).

Examples

bash

# Default tuning (5 rounds, 100% target)
skillet tune conventional-comments ~/.claude/skills/conventional-comments

# Aim for 80% pass rate
skillet tune conventional-comments skill/ -t 80

# More rounds
skillet tune conventional-comments skill/ -r 10

# Custom output file
skillet tune conventional-comments skill/ -o results.json

# Multiple samples for more reliable pass rates
skillet tune conventional-comments skill/ -s 3

create

Create a new skill from captured evals.

bash

skillet create <name> [options]

Arguments

Argument	Required	Description
`name`	Yes	Eval set name (from `~/.skillet/evals/<name>/`)

Options

Flag	Short	Type	Default	Description
`--dir`	`-d`	path	`~`	Base directory for output
`--prompt`		str	none	Extra prompt to customize generation

Output is written to <dir>/.claude/skills/<name>/SKILL.md.

Examples

bash

# Create in home directory (default)
skillet create browser-fallback
# Creates ~/.claude/skills/browser-fallback/SKILL.md

# Create in current directory
skillet create browser-fallback -d .
# Creates ./.claude/skills/browser-fallback/SKILL.md

# Create in specific project
skillet create browser-fallback -d /path/to/project

# Add extra instructions
skillet create browser-fallback --prompt "Be concise, max 20 lines"

generate-evals

Generate candidate eval files from a SKILL.md.

bash

skillet generate-evals <skill> [options]

Arguments

Argument	Required	Description
`skill`	Yes	Path to skill directory or SKILL.md file

Options

Flag	Short	Type	Default	Description
`--output`	`-o`	path	auto	Output directory for candidate files
`--max`		int	5	Max evals per category
`--domain`	`-d`	str	all	Filter to specific domain(s): `triggering`, `functional`, `performance`

Domains

Each generated eval is tagged with a domain indicating what aspect of the skill it tests:

Domain	Description
`triggering`	Does the skill activate when it should (and stay silent when it shouldn't)?
`functional`	Does the skill produce correct output once triggered?
`performance`	Does the skill meet quality, latency, or efficiency expectations?

By default all domains are generated. Use --domain to filter to specific ones. The flag can be repeated to select multiple domains.

Examples

bash

# Generate from skill directory
skillet generate-evals ~/.claude/skills/browser-fallback

# Custom output location
skillet generate-evals skill/ -o ./my-evals/

# Limit to 3 per category
skillet generate-evals skill/ --max 3

# Only triggering evals
skillet generate-evals skill/ -d triggering

# Triggering and functional evals
skillet generate-evals skill/ -d triggering -d functional

lint

Lint a SKILL.md file for common issues.

bash

skillet lint <path> [options]

Arguments

Argument	Required	Description
`path`	Yes*	Path to a SKILL.md file (*not required with `--list-rules`)

Options

Flag	Type	Default	Description
`--no-llm`	bool	false	Skip LLM-assisted lint rules
`--list-rules`	bool	false	List all available rules and exit

Rules

Skillet ships with 14 built-in rules across four categories:

Naming

Rule	Severity	Description
`filename-case`	warning	Skill file must be named exactly `SKILL.md`
`folder-kebab-case`	warning	Skill folder name must be kebab-case
`name-kebab-case`	warning	Name field must be kebab-case
`name-matches-folder`	warning	Name field must match the parent folder name
`name-no-reserved`	error	Name must not contain `claude` or `anthropic`

Structure

Rule	Severity	Description
`frontmatter-valid`	warning	Frontmatter must include `name` and `description`
`frontmatter-delimiters`	error	YAML frontmatter must have `---` delimiters
`frontmatter-no-xml`	error	No XML angle brackets (`< >`) in frontmatter
`description-length`	warning	Description must be under 1,024 characters
`body-word-count`	warning	Skill body should be under 5,000 words
`no-readme`	warning	No `README.md` inside skill folder

Fields

Rule	Severity	Description
`field-license`	warning	Recommended `license` field should be present
`field-compatibility`	warning	Recommended `compatibility` field should be present
`field-metadata`	warning	Recommended `metadata` field should be present

Exit Codes

Code	Meaning
0	No issues found
1	One or more findings
2	Error (file not found, invalid input)

Examples

bash

# Lint a skill
skillet lint ~/.claude/skills/my-skill/SKILL.md

# Lint without LLM-assisted checks
skillet lint --no-llm path/to/SKILL.md

# List all available rules
skillet lint --list-rules

compare

Compare baseline vs skill results from cache.

bash

skillet compare <name> <skill>

Arguments

Argument	Required	Description
`name`	Yes	Eval set name or path
`skill`	Yes	Path to skill directory

Run skillet eval <name> and skillet eval <name> <skill> first to populate the cache.

Examples

bash

skillet compare browser-fallback ~/.claude/skills/browser-fallback

show

Inspect cached eval results without re-running any evals.

bash

skillet show <name> [options]

Arguments

Argument	Required	Description
`name`	Yes	Eval set name or path

Options

Flag	Short	Type	Default	Description
`--eval`	`-e`	str	none	Show detailed results for a specific eval file
`--skill`	`-s`	path	none	Show results with a skill loaded instead of baseline

Examples

bash

# Show cached baseline results
skillet show browser-fallback

# Show results with a skill
skillet show browser-fallback --skill ~/.claude/skills/browser-fallback

# Show detailed results for a specific eval
skillet show browser-fallback --eval 001.yaml

# Combine: specific eval with skill
skillet show browser-fallback --skill path/to/skill --eval 001.yaml

Environment Variables

Variable	Default	Description
`SKILLET_DIR`	`~/.skillet`	Base directory for evals, cache, and tune results

Exit Codes

Code	Meaning
0	Success
1	Error (invalid arguments, missing files, etc.)

CLI Reference ​

eval ​

Arguments ​

Options ​

Examples ​

tune ​

Arguments ​

Options ​

How It Works ​

Examples ​

create ​

Arguments ​

Options ​

Examples ​

generate-evals ​

Arguments ​

Options ​

Domains ​

Examples ​

lint ​

Arguments ​

Options ​

Rules ​

Naming ​

Structure ​

Fields ​

Exit Codes ​

Examples ​

compare ​

Arguments ​

Examples ​

show ​

Arguments ​

Options ​

Examples ​

Environment Variables ​

Exit Codes ​

CLI Reference

eval

Arguments

Options

Examples

tune

Arguments

Options

How It Works

Examples

create

Arguments

Options

Examples

generate-evals

Arguments

Options

Domains

Examples

lint

Arguments

Options

Rules

Naming

Structure

Fields

Exit Codes

Examples

compare

Arguments

Examples

show

Arguments

Options

Examples

Environment Variables

Exit Codes