Why do bad LLM prompts waste tokens?

A vague prompt forces the model to explore the repo, ask clarifying questions, and repeat context from earlier messages. Each turn re-sends chat history and file contents. Most of that input would never have entered the window if the ticket, decisions, and docs had been attached up front.

What is implementation.md from Lem AI?

implementation.md is a markdown file Lem AI writes on your git branch when the branch name matches a Jira or ClickUp ticket ID. It summarizes the ticket, related Slack threads, meeting notes, and linked documents so coding agents like Cursor or Claude can implement with full context.

How much can token waste cost a team?

Costs depend on model and volume, but clarification loops, irrelevant file reads, and restated requirements can add hundreds of thousands of input tokens per feature. At $3 per million input tokens, five extra turns of 80k tokens each come to roughly $1.20 per feature, or about $48 a week for ten engineers shipping four features each. That excludes output tokens and any wrong outputs you discard.

Does a better prompt replace implementation.md?

No. A better prompt helps, but it cannot include what you do not remember. implementation.md pulls live context from tools your team already uses, with citations. You still write a short prompt; the file carries the depth.

How do I get implementation.md on my branch?

Install the Lem AI CLI from getlem.ai, connect Slack, Jira, GitHub, and Confluence, then run git checkout -b with a branch name that includes your ticket ID (for example feature/ENG-412-billing). Lem AI generates implementation.md on that branch. See the Implementation Agent feature page for the full workflow.

Your LLM can only be as good as your prompt

A one-line prompt like “fix the billing webhook” is not cheap. It is an invitation for your coding agent to burn thousands of tokens guessing scope, re-opening Slack, and re-reading files that were never tied to the ticket. Your LLM can only be as good as the context behind your prompt—and most prompts ship without that context.

Why do bad LLM prompts waste so many tokens?

Coding agents bill on input and output tokens. Input grows with every message, every file you @-mention, and every tool schema loaded into the session. When the prompt omits ticket scope, the model fills the gap by searching—and search in a large repo is one of the most expensive operations you can trigger.

Clarification loops: “Which Jira ticket?” “What were the acceptance criteria?” Each reply re-sends the whole thread.
Blind repo exploration: the agent opens dozens of files to find what your team already wrote in Confluence or Slack.
Repeated requirements: you paste the same Slack summary three times because it was never in the initial context.
Context rot: very long windows reduce recall; irrelevant code dilutes attention on the files that matter.
Verbose outputs: the model explains its plan at length before writing code—output tokens cost more than input on many APIs.
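
To make the re-sending effect concrete, here is a minimal sketch of how billed input grows over a session. It assumes a deliberately simple model in which every turn re-sends a fixed base context plus everything added in earlier turns; the function name and the numbers are illustrative, not measurements from any particular agent.

```python
def cumulative_input_tokens(base_context: int, added_per_turn: int, turns: int) -> int:
    """Total input tokens billed across a session.

    Simplifying assumption: turn i (0-based) re-sends the base context
    plus everything that was added during the i earlier turns.
    """
    total = 0
    for turn in range(turns):
        total += base_context + added_per_turn * turn
    return total


# Illustrative values only: 20k tokens of base context, 5k added per turn.
one_turn = cumulative_input_tokens(20_000, 5_000, turns=1)
five_turns = cumulative_input_tokens(20_000, 5_000, turns=5)
print(one_turn, five_turns)  # 20000 150000
```

Under these assumptions, five turns bill 7.5 times the input of one, which is why removing clarification rounds tends to matter more than trimming any single message.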

Industry write-ups on AI-assisted development describe a large share of agent input as low-signal for the task at hand. You do not need a perfect benchmark to feel it: watch a session where the model reads half the monorepo before touching the right package.

How much can vague coding prompts cost a team?

Think in sessions, not single requests. A feature that should take one grounded conversation often becomes five: discovery, clarification, wrong path, reset, retry. If each turn carries 50k–150k input tokens because history and files accumulate, you are paying for the same story repeatedly.

5 extra turns × 80k input tokens ≈ 400k tokens of repeated context per feature.
At $3 per million input tokens (typical mid-tier API list price), that is roughly $1.20 per feature before output—per engineer.
Ten engineers shipping four features each per week: about $48 a week in avoidable input alone (40 × $1.20).
Wrong-first-attempt output: you pay twice—once for code you throw away, once for the fix.
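
The arithmetic above is easy to rerun with your own numbers. The sketch below is a back-of-the-envelope calculator, not a billing tool: the function name is made up for illustration, the $3 per million price is the figure used in this article, and it assumes "four features a week" means four per engineer.

```python
def wasted_input_cost(
    extra_turns: int,
    tokens_per_turn: int,
    price_per_million: float,
    features_per_week: int = 1,
) -> float:
    """Dollar cost of repeated input tokens, ignoring output tokens."""
    wasted_tokens = extra_turns * tokens_per_turn * features_per_week
    return wasted_tokens * price_per_million / 1_000_000


# Figures from the list above: 5 extra turns of 80k tokens at $3 per million.
per_feature = wasted_input_cost(5, 80_000, 3.0)
# Assumption: ten engineers, four features each per week = 40 features.
team_week = wasted_input_cost(5, 80_000, 3.0, features_per_week=10 * 4)
print(f"${per_feature:.2f} per feature, ${team_week:.2f} per team-week")
```

Swap in your own model's price and your observed tokens per turn; the structure of the estimate stays the same.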

The point is not to scare you with arithmetic. It is to treat the context window like a budget. Every token you add should earn its place. A thin prompt externalizes that cost to the model’s guessing.

What makes a good prompt for Cursor, Claude, or Copilot?

A good coding prompt is not clever wording. It is a pointer to verified context: what the ticket says, what was decided in Slack, what changed in related PRs, and what must not break. The model should start from evidence, not memory.

Ticket scope in plain language

Include the ticket ID, acceptance criteria, edge cases, and links to designs or API specs. If the work spans services, name them. If the ticket is wrong, fix the ticket first—LLMs amplify bad requirements efficiently.

Decisions and constraints

Paste or reference the Slack thread where someone said “we must keep backward compatibility for v1 clients.” Note security or compliance constraints. These lines are cheap in tokens and expensive when missing.

Stable, scoped file context

Prefer a short list of files and docs over “read the whole repo.” Better: attach one file that already merged ticket + threads + docs. That is the role of implementation.md.
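
As a rough illustration of those three ingredients, the sketch below assembles a short prompt from a ticket ID, a list of constraints, and a scoped file list. The helper name and the example file path are hypothetical; the only fixed idea is that the prompt points at implementation.md rather than restating it.

```python
def build_prompt(
    ticket_id: str,
    task: str,
    constraints: list[str],
    files: list[str],
    context_file: str = "implementation.md",
) -> str:
    """Assemble a short prompt that points at verified context."""
    lines = [
        f"Ticket: {ticket_id}",
        f"Task: {task}",
        f"Read {context_file} first, then propose a plan in bullet points before editing code.",
    ]
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {item}" for item in constraints)
    if files:
        lines.append("Files in scope:")
        lines.extend(f"- {path}" for path in files)
    return "\n".join(lines)


prompt = build_prompt(
    ticket_id="ENG-412",
    task="Fix the billing webhook",
    constraints=["Keep backward compatibility for v1 clients"],
    files=["services/billing/webhook.py"],  # hypothetical path
)
print(prompt)
```

The prompt stays a handful of lines; the depth lives in the attached file.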

What is implementation.md on a git branch?

implementation.md is a single markdown file Lem AI (getlem.ai) places on your branch when you check out work tied to a Jira or ClickUp ticket. It is not a replacement for your prompt—it is the attachment your prompt should reference.

Ticket title, description, and comments
Relevant Slack threads and decisions
Google Meet or other meeting notes linked to the work
Confluence, Google Drive, or Document Hub pages referenced in the ticket
Optional links to related GitHub PRs or prior changes

You prompt: “Implement according to implementation.md on this branch.” The model starts with the why, not a repo-wide treasure hunt.

How does Lem AI build implementation.md from Jira or ClickUp?

Lem AI watches git branch creation. When the branch name includes a ticket key your workspace recognizes (for example feature/ENG-412-billing or eng-412-fix-webhook), the Implementation Agent pulls context from connected tools and writes implementation.md into the branch workspace.
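
How Lem AI matches branch names internally is not documented here, so the following is only a sketch of the general idea: find a PROJECT-123 style key in a branch name and accept it only if the project is one your workspace knows. The regular expression, the function name, and the known_projects parameter are all assumptions for illustration.

```python
import re
from typing import Optional

# A letter-led project key, a hyphen, then digits (e.g. ENG-412 or eng-412).
# The lookbehind stops the match from starting in the middle of a word.
TICKET_PATTERN = re.compile(r"(?<![A-Za-z0-9])([A-Za-z][A-Za-z0-9]*)-(\d+)")


def extract_ticket_key(branch: str, known_projects: set[str]) -> Optional[str]:
    """Return the first ticket key in the branch name, normalized to upper case."""
    for match in TICKET_PATTERN.finditer(branch):
        project = match.group(1).upper()
        if project in known_projects:
            return f"{project}-{match.group(2)}"
    return None


projects = {"ENG"}  # hypothetical project keys for one workspace
print(extract_ticket_key("feature/ENG-412-billing", projects))  # ENG-412
print(extract_ticket_key("eng-412-fix-webhook", projects))      # ENG-412
print(extract_ticket_key("release-2024", projects))             # None
```

Filtering by known project keys is what keeps a branch like release-2024 from being mistaken for a ticket.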

Connect GitHub, Slack, Jira or ClickUp, Confluence, and Google Meet in getlem.ai.
Install the Lem AI CLI and link your repository to your organization workspace.
Create a branch whose name contains the ticket ID you are implementing.
Open implementation.md—review citations, then attach it to your coding agent.
Ship code with fewer clarification rounds because scope was loaded once, upfront.

The file also feeds Lem AI's institutional memory graph, so onboarding search and compliance workflows on the same platform can reuse that context later. For this article, the focus is token savings on implementation: one grounded file beats five vague prompts.

How to save tokens on your next feature branch

Treat these as habits, not hacks.

Name branches after ticket IDs so Lem AI can auto-build implementation.md.
Start the agent session with: read implementation.md, then propose a plan in bullet points before editing code.
Do not @-mention entire folders unless the ticket truly spans them.
Compact or start a new session after scope changes—stale history is token debt.
Compare one feature with and without implementation.md; track turns until merge.
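
For the last habit, a few lines of bookkeeping are enough. The sketch below is one possible way to record sessions and compare averages; the SessionLog fields and the compare helper are hypothetical, and you would fill them from your own agent's usage logs.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SessionLog:
    feature: str             # ticket ID or branch name
    used_context_file: bool  # was implementation.md attached?
    turns: int               # agent turns until merge
    input_tokens: int        # total input tokens billed


def compare(logs: list[SessionLog]) -> dict[bool, dict[str, float]]:
    """Average turns and input tokens, grouped by whether the file was used."""
    summary: dict[bool, dict[str, float]] = {}
    for flag in (True, False):
        group = [log for log in logs if log.used_context_file == flag]
        if group:  # skip a group with no sessions rather than divide by zero
            summary[flag] = {
                "features": len(group),
                "avg_turns": mean(log.turns for log in group),
                "avg_input_tokens": mean(log.input_tokens for log in group),
            }
    return summary
```

Record one SessionLog per merged feature and call compare on the list. A handful of features will not be statistically meaningful, but it is enough to see whether turns and input tokens move in the direction you expect.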

Teams that adopt a single context file per branch report fewer “what did we mean?” loops and shorter agent sessions—not because the model got smarter, but because the prompt finally pointed at the right information.

Bottom line

Your LLM can only be as good as your prompt. For engineering work, that means your prompt must surface ticket truth, team decisions, and docs—not a vibe. Lem AI’s implementation.md on getlem.ai does that assembly when you branch, so you spend tokens once on context instead of thousands on guesswork.

Read the Implementation Agent feature page for workflows and integrations, try a ticket-named branch on your next task, and measure the difference in turns and token usage. Better prompts help; grounded implementation files help more.