Precision: Define the Answer

Concept. Step one: precision. Runs ≠ correct, so before the model writes a line, you pin the precision somewhere: in the prompt, or in the SQL you write yourself. A vague prompt pins nothing, so the model picks one reading of your question and you cannot tell which. The spec decides which query is right.

Intuition. "Users who don't listen to Taylor Swift" is not one question. It has three reasonable readings that return different users, and the model commits to one without telling you which. Reading the result back takes knowing the SQL.

The LLMs and SQL page named three steps. This is the first: precision. The request looks exact. It has three readings:

"Find the users who don't listen to Taylor Swift."

  • Reading A: never played a Taylor song. Drop anyone who heard her even once. Mickey, Minnie, and Daffy each played Taylor, so all three are out, even though they also play the Beatles. Returns only Pluto.

  • Reading B: plays anything besides Taylor. Keep anyone with a single non-Taylor listen, so Mickey stays in, even though he streams Taylor daily. Returns all four users.

  • The Pluto question: a user with no listens at all. Both readings above sweep Pluto in. But should someone who plays nothing count as "doesn't listen to Taylor"? That is a third decision the sentence never settles.

Readings A and B split on Mickey, the user who plays both Taylor and other artists. Pluto is a separate fork. The English settles none of it; the spec has to.

In the Prompt, or in the SQL

Take the prompt first. Here is the same request written two ways:

The same request as two prompts. Left, grey: a vague one-liner, so the model can write Query A (LEFT JOIN, all four users) or Query B (NOT IN, only Pluto) and you can't tell which. Right, blue: a precise prompt with five numbered sections (engine, schema with join paths, definitions, edge cases, show your work) whose Definitions line fixes exactly Query B and {Pluto}.

Figure 1. A vague prompt (grey) says almost nothing, so the model can write Query A (LEFT JOIN, all four users) or Query B (NOT IN, only Pluto) and you cannot tell which. The precise prompt (blue) numbers what the model must not guess: engine, schema and join paths, definition, edge case, and a show-your-work trace. Its [Definitions] line fixes Query B and {Pluto}.

Every numbered line blocks a guess the model would otherwise make. Drop one and you hand it room to hallucinate. That keeps the prompt minimal and sufficient, not merely long:

  1. [Engine] pins the SQL dialect (!= vs <>, NULL and string behavior). Unspecified, the model picks one.
  2. [Schema] gives exact table and column names so it cannot invent one. The join paths matter most: the key is often implicit, and the model will assume a foreign key or join on the wrong column. Writing ON Listens.user_id = Users.user_id removes that guess.
  3. [Definitions] is the actual ask, the one line only you can supply.
  4. [Edge cases] keep the no-listen user (Pluto), whom a careless query silently drops.
  5. [Show your work] forces the step-by-step trace you check the result against.

Two is the easy case. One ambiguous phrase over two tables gives two readings. A real query joins four or five tables and carries several independent choices: inner vs left join, whether a NULL counts, dedup, what a COUNT counts, date boundaries. Each is its own fork, so k independent ambiguities give on the order of 2^k queries that all run and disagree.

Resolving those forks is the whole of precision. Three ways to do it:

  • Push. Write every answer into an English spec, as above.

  • Pull. Let the model interview you, one clarifying question per fork: inner join or left? Does a NULL count? Count rows or distinct users?

  • Write the SQL. Skip the prose. Every fork is a syntactic choice the query already makes (LEFT JOIN vs JOIN, COUNT(*) vs COUNT(DISTINCT user_id)), so it is precise by construction.

The spec and the interview only reconstruct in English what the SQL states outright. The long paragraph is the price of staying in English.

What you cannot do is leave the forks open. A vague prompt does not erase them, it just hands all k answers to the model to fill in silently. Something collapses the 2^k tree to one query no matter what. Your only choice is whether that something is you, in SQL or in English, or the model guessing alone.

For this request, that one decision is Query B over Query A. State it in the prompt, draw it out through the model's questions, or just write Query B. The answer comes from you. The model still has to turn it into SQL that runs, which is the next step.

Takeaways

  1. A vague prompt picks for you. One phrase, several queries that disagree, and the model chooses one silently.

  2. Precision lives in the prompt or the SQL. Push it into an English spec, let the model interview you, or just write the query, which is precise by construction.

  3. Define the answer before the model writes. Running the SQL and proving the result come next.


Next

Execution: Run It on Real Data → You defined the spec, Query B. Second step: execution. The model writes the SQL and iterates against a real database until it works.