The Makefile is one of the most durable tools in software. It has survived 50 years because the model is right. But the implementation is broken for how we build today.
Every project has a Makefile — or a pile of shell scripts, or a package.json
overloaded with scripts, or a justfile, or a Taskfile.yml.
They all share the same fundamental problem: they describe implementation,
not intent.
You come back to a project after six months and find a deploy target
that is 40 lines of bash. You cannot tell what environment it targets, whether
it runs tests first, or what happens if the build fails. The commands are the documentation,
and the documentation is useless.
We use Makefiles as a record of what we figured out once. We encode the how because it was hard to figure out, and we don't want to do it again. But the how goes stale. The intent doesn't.
The other problem: in a world where AI writes most of the implementation code, why are we still hand-crafting shell pipelines for our own workflows? The most brittle part of the codebase — the glue scripts — is still written by hand.
A Vibefile looks and works like a Makefile. The dependency model is identical. Variables work. Targets work. The only thing that changes is the recipe — instead of shell commands, you write a plain-English description of what you want to happen.
```
# The Makefile for the AI era.
# Same model. Plain English recipes.

PROJECT = my-saas-app
ENV = production

seed: build
    "populate the database with realistic fake data for 10 users"

ship: test build
    "run tests, build, and deploy to $(ENV) on fly.io"
    @require clean tests

test:
    "run the full test suite and surface any failures"

build:
    "compile and bundle the project for $(ENV)"
```
When you run vibe run ship, the CLI reads the Vibefile, collects
relevant context from your repo — file tree, package.json, configs, existing
scripts — and sends it to an LLM alongside the task description. The model
generates the correct shell commands for your specific repo and
executes them with live stdout streaming.
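The shape of that flow can be sketched in a few lines. This is a hypothetical illustration of the codegen pipeline described above, not the CLI's actual implementation: `collect_context` and `build_prompt` are assumed names, and the real collector is far more selective.

```python
# Illustrative sketch of the codegen flow: gather a slice of repo
# context, wrap it in a prompt, and (in the real CLI) hand it to an
# LLM that returns shell commands. All names here are assumptions.
from pathlib import Path

def collect_context(repo: Path) -> dict[str, str]:
    """Grab a minimal context bundle: file tree plus common config files."""
    tree = "\n".join(sorted(p.name for p in repo.iterdir()))
    context = {"file_tree": tree}
    for name in ("package.json", "pyproject.toml", "fly.toml", "Makefile"):
        f = repo / name
        if f.is_file():
            context[name] = f.read_text()
    return context

def build_prompt(task: str, context: dict[str, str]) -> str:
    """Assemble the prompt the model sees: task first, then repo context."""
    sections = [f"TASK: {task}", "REPO CONTEXT:"]
    for key, value in context.items():
        sections.append(f"--- {key} ---\n{value}")
    sections.append("Respond with only the shell commands to run.")
    return "\n\n".join(sections)
```

The interesting design constraint is in `collect_context`: the prompt stays small because only configs and the file tree are sent, never source code wholesale.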
A Vibefile is the runbook. A new teammate can read it and understand everything the project does without running a single command.
Fifty years of practice. Targets, prerequisites, variables. The Vibefile keeps all of this — just replaces the brittle shell recipes.
Not every task needs the same thing. A simple build task just needs correct shell commands. A deployment might need to query live infrastructure and adapt. A shared workflow might already be solved. Vibefile supports all three.
- Codegen ("quoted intent"): AI reads the repo, generates the correct shell commands, executes them. Fast, stateless, works anywhere. The default mode for most tasks.
- Agent (@mcp fly-mcp, gh-mcp): AI becomes the executor — with real tool access to services like Fly.io, GitHub, or Postgres. Can inspect state, retry, and adapt mid-task.
- Skills (skill: python-test): Community-published or local task definitions. Import a skill for common workflows and skip writing the prompt entirely.

The trust model is deliberately different for each mode. Codegen is auditable — you can inspect the generated shell before it runs. Agent mode is more powerful but requires explicit MCP declarations. The Vibefile makes the execution model visible at a glance.
The real unlock is treating MCP servers as first-class citizens in the task graph.
A target that declares @mcp doesn't generate a shell script —
it spins up an agent with actual tool access to external services.
```
# @mcp turns a task description into an agentic loop
# with real tool access — not just shell output

deploy: build
    "deploy the current build to production, verify it came up"
    @mcp fly-mcp

review:
    "open a PR, write a summary of changes, assign the team"
    @mcp github-mcp, linear-mcp

db-backup:
    "snapshot production database and upload to the S3 bucket"
    @mcp aws-mcp, postgres-mcp

# without @mcp — just generate and run shell
migrate: build
    "run any pending database migrations safely"
```
When generated shell fails on line 3, you're stuck. An MCP-backed agent can inspect the error, retry with a different approach, and report back. The task description becomes a goal, not just a prompt.
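The "goal, not a prompt" distinction is essentially a retry loop with feedback. Here is a minimal sketch of that loop under stated assumptions: `attempt_fn` stands in for a model-proposed tool call, and in a real agent the previous error would be fed back into the model before the next attempt.

```python
# Illustrative agentic loop: try, inspect the failure, adapt, retry.
# A plain generated shell script has no equivalent of this feedback.
from typing import Callable

def run_with_retries(goal: str,
                     attempt_fn: Callable[[str, int], tuple[bool, str]],
                     max_attempts: int = 3) -> str:
    """Drive attempts toward `goal`; each retry can see the prior error."""
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        ok, output = attempt_fn(goal, attempt)
        if ok:
            return output
        last_error = output  # an agent would hand this back to the model
    raise RuntimeError(f"gave up on {goal!r}: {last_error}")
```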
Some workflows are solved problems. Deploying a Python package to PyPI,
running a Django migration, publishing a Docker image — the intent is always
the same, only the repo details change. A Skill is a directory containing
a SKILL.md file: structured instructions that tell the AI
exactly how to execute a class of task.
When a Vibefile target declares skill:, the CLI loads that
SKILL.md and uses it as the instruction set for the LLM call —
instead of sending a plain prompt you wrote. The skill author has already done
the prompt engineering. You just reference the skill by name.
```
# use a skill instead of writing a prompt
test:
    skill: python-test

deploy: test build
    skill: fly-deploy
    @mcp fly-mcp

# mix a skill with extra instructions for this repo
release: test
    skill: python-publish
    "also update the changelog and tag the commit"
```
```
---
name: python-test
description: Run a Python test suite with pytest, surface failures,
  and report coverage. Use for any Python project with tests.
---

1. Detect the test runner (pytest, unittest) from pyproject.toml
2. Run tests with coverage enabled
3. Surface failures with file + line context
4. Exit non-zero if any tests fail
```
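A SKILL.md like the one above is just frontmatter plus an instruction body, so loading one is a small parsing job. This sketch assumes the simple `key: value` frontmatter shown in the example; a real loader would use a proper YAML parser.

```python
# Hypothetical SKILL.md loader: split the file into frontmatter
# metadata and the instruction body that becomes the LLM's task spec.
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    # The file starts with ---, so the first split element is empty.
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()
```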
Skills are resolved in order: a skills/ folder in the repo
(project-specific), then ~/.vibe/skills/ (your personal library),
then a community registry at vibefile.dev/skills. A skill is just
a directory — readable, forkable, and improvable by anyone.
Every team that figures out "how do we deploy a Next.js app to Fly.io" can publish that as a skill. The next team skips the prompt engineering entirely. Skills get better as more people use and improve them.
Vibefile is model-agnostic. You declare which model to use in the Vibefile itself — it's not a secret, so it belongs in the file alongside the tasks. The API key is a different story: that never goes in the Vibefile, which lives in git.
```
# declare the model in the Vibefile — it's not a secret
model: claude-sonnet-4-6
project: my-saas-app

ship: test build
    "deploy to production on fly.io"
```
The API key is resolved from the environment, in this order: the
VIBE_API_KEY env var, then the provider-specific key
(ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) inferred
from the model name, then a ~/.vibeconfig global file.
The Vibefile itself never touches a key.
```
# global config — never committed
default_model: claude-sonnet-4-6
anthropic_key: sk-ant-...
openai_key: sk-...
```
Resolution order for everything is: CLI flag → env var → Vibefile → ~/.vibeconfig → built-in default.
This means you can always override at runtime without touching the file —
useful for CI, where you'd set ANTHROPIC_API_KEY as a secret
and let the Vibefile's declared model do the rest.
Set ANTHROPIC_API_KEY as a repo secret. The Vibefile's
declared model is picked up automatically. No extra config needed.
You can also mix models: a cheap, fast one for codegen tasks and a more capable one for complex agentic tasks. Each target can declare its own model: override.
The AI generates correct commands on the first try because it understands your specific repo. That's the magic — and it only works if the context collector is smart about what to send.
Sending the entire codebase is slow and expensive. Sending too little means the model generates generic commands that don't match your stack. The collector needs to know what's relevant for each task.
```
# What the context collector sends for `vibe run deploy`
file_tree:      top-level structure (always)
package_json:   existing scripts, deps, engines
fly_toml:       detected because task = deploy
dockerfile:     detected because task = deploy
existing_make:  if present — steal known-good patterns
git_status:     uncommitted changes, current branch

# What it does NOT send
src/            not relevant to a deploy command
tests/          only relevant for `vibe run test`
node_modules/   never
```
The collector is task-aware: a seed task pulls in schema files and
seed script patterns. A test task pulls in the test config. A
deploy task pulls in infrastructure configs. This keeps prompts
small and results accurate.
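Task-awareness boils down to a lookup table: a baseline context that always ships, plus a per-task slice. The table below is an illustrative guess at what such a map could look like, not the CLI's actual rules.

```python
# Sketch of a task-aware context map: every task gets the baseline,
# and each task class pulls in its own extra files.
ALWAYS = ["file_tree", "package.json", "git_status"]

TASK_CONTEXT = {
    "deploy": ["fly.toml", "Dockerfile"],
    "test":   ["pytest.ini", "jest.config.js"],
    "seed":   ["schema.sql", "prisma/schema.prisma"],
}

def context_sources(task: str) -> list[str]:
    """Baseline context plus whatever this task class needs."""
    return ALWAYS + TASK_CONTEXT.get(task, [])
```

An unknown task simply falls back to the baseline, which keeps prompts small by default.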
The repo is live on GitHub. Join the Discord to shape the spec, discuss the design, and be first to test the CLI.