# Contributing to Velais Agent Skills
## Quick Setup

```bash
git clone <repo-url> && cd velais-skills
uv sync                    # Core dependency (pyyaml)
uv sync --group dev        # Optional: pre-commit + ruff
uv run pre-commit install  # Optional: install git hooks
```
## Before You Start
- Have a real use case. Skills solve repeatable problems. If you can’t name 2-3 concrete scenarios where this skill saves time, it’s not ready.
- Test it yourself first. Iterate on a single challenging task until Claude succeeds, then extract the winning approach into a skill.
- Check for overlap. Run `uv run scripts/detect_overlaps.py` to see if similar triggers exist. If something overlaps, consider improving the existing skill instead.
## Submission Process
### 1. Scaffold Your Skill

Use the scaffolder to create a properly structured skeleton:

```bash
bash scripts/new_skill.sh your-skill-name --owner your-name --category workflow-automation
```
This creates:

```
skills/your-skill-name/
├── SKILL.md              # Required — edit the TODO placeholders
└── tests/
    ├── test-cases.yml    # Required — add real test cases
    └── evals.json        # Optional — LLM eval definitions
```

You can also add optional directories: `scripts/`, `references/`, `assets/`.
### 2. Write Test Cases

Every skill must include `tests/test-cases.yml`. This file defines how the skill should trigger, what it should produce, and what it should NOT do.
Tip: Avoid vague assertions like “works correctly” or “functions as expected” — the validator flags these. Be specific about what the skill should do.
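As a rough illustration of "specific, not vague," a test case file might look like the sketch below. The field names (`cases`, `should`, `should_not`) are assumptions for illustration only; treat the skeleton generated by `scripts/new_skill.sh` as the authoritative schema.

```yaml
# Hypothetical sketch: field names are illustrative, not the real schema.
skill: your-skill-name
cases:
  - name: triggers-on-release-notes-request
    prompt: "Draft release notes for v1.4 from these merged PRs"
    should:
      - "Groups changes into Added / Fixed / Changed sections"  # specific, checkable
      - "Links each entry to its PR number"
    should_not:
      - "Invents changes that are not in the input"
  - name: does-not-trigger-on-unrelated-request
    prompt: "What's the weather like today?"
    should_not:
      - "Activates the skill"
```

Note how each assertion names an observable behavior, rather than "works correctly."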
### 3. (Recommended) Add LLM Evals

Create `tests/evals.json` to enable automated LLM-based evaluation. Evals are optional but strongly recommended for complex skills.
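The exact `evals.json` schema is defined by the repo's tooling; as a sketch only (every field name here is an assumption), an eval definition could look like:

```json
{
  "skill": "your-skill-name",
  "evals": [
    {
      "name": "release-notes-quality",
      "prompt": "Draft release notes for v1.4 from these merged PRs",
      "rubric": "Output groups changes by category and cites PR numbers; no fabricated entries.",
      "pass_threshold": 0.8
    }
  ]
}
```

The idea is that a grading model scores the skill's output against the rubric, so write rubrics as specifically as your test-case assertions.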
### 4. Validate Locally

```bash
uv run scripts/validate_skill.py skills/your-skill-name
```

The score must be >= 70 to pass. Fix any errors before opening a PR.
### 5. Open a Pull Request
Use the PR template (auto-loaded). The template requires:
- Use case descriptions
- Evidence of testing
- Before/after comparison
- Owner assignment
### 6. Automated Review

CI runs automatically on every PR touching `skills/**`:
- Structural validation — file naming, YAML parsing, required files
- Content quality analysis — description quality, instruction clarity, trigger coverage
- Anti-slop checks — instruction coherence, trigger diversity, assertion specificity
- Security scan — secrets detection, injection patterns, reserved names
- Overlap detection — checks for trigger conflicts with existing skills
- Score tracking — compares against previous score if updating an existing skill
- Quality score — composite 0-100 score with per-dimension breakdown
Skills scoring below 70 are blocked from merge.
## Updating an Existing Skill

- Bump `metadata.version` in the frontmatter
- Update test cases if behavior changed
- Note changes in the PR description
- CI re-runs the full quality gate and shows the score diff
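For example, bumping the version in the SKILL.md frontmatter might look like this (the surrounding fields are illustrative; keep whatever your skill's frontmatter already contains):

```yaml
---
name: your-skill-name
metadata:
  version: 1.1.0  # was 1.0.0; bump on any behavior or trigger change
---
```

Bumping the version makes the score diff in CI easy to attribute to a specific revision of the skill.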