Compact Schemas vs JSON Schema in LLM Prompts
When you need to describe a JSON structure inside an LLM prompt, JSON Schema is the obvious choice. It is a well-known spec with broad tooling support. But obvious does not mean optimal. JSON Schema was designed for validation, not for fitting inside a token budget.
What JSON Schema looks like in a prompt
Take a simple user object with a name, email, and optional phone number. In JSON Schema:
{
  "type": "object",
  "required": ["name", "email"],
  "properties": {
    "name": { "type": "string" },
    "email": { "type": "string" },
    "phone": { "type": "string" }
  }
}
That is 30+ tokens for three fields. Now imagine a response with 40 fields, nested objects, and arrays. JSON Schema for a real API payload regularly exceeds 500 tokens. The "type", "properties", and "required" boilerplate repeats at every level of nesting.
The compact alternative
A compact schema represents the same structure by replacing values with types directly:
{
  name: string,
  email: string,
  phone?: string
}
Same information. Around 12 tokens. The ? suffix marks optional fields, so there is no separate required array. Nesting works the same way: objects contain objects, arrays show a single representative item.
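Inferring a compact schema from a sample value is mechanical: replace each leaf with its type name, recurse into objects, and summarize arrays by their first element. A minimal sketch (the function name and the `unknown` placeholder for empty arrays are illustrative choices, not part of any library):

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Infer a compact schema string from one sample JSON value.
function compactSchema(value: Json, indent = ""): string {
  if (value === null) return "null";
  if (Array.isArray(value)) {
    // Arrays are summarized by a single representative item.
    return value.length > 0 ? `[${compactSchema(value[0], indent)}]` : "[unknown]";
  }
  if (typeof value === "object") {
    const inner = indent + "  ";
    const fields = Object.entries(value)
      .map(([key, v]) => `${inner}${key}: ${compactSchema(v, inner)}`)
      .join(",\n");
    return `{\n${fields}\n${indent}}`;
  }
  return typeof value; // "string", "number", or "boolean"
}
```

Note the limitation: a single sample cannot reveal optional fields (`phone?`) or unions (`string | null`); those require inspecting multiple samples or a human pass over the output.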
For a more realistic example, consider a paginated API response:
{
  data: [
    {
      id: number,
      title: string,
      author: {
        name: string,
        avatar_url?: string
      },
      tags: [string],
      published_at: string | null
    }
  ],
  meta: {
    page: number,
    total: number,
    per_page: number
  }
}
The equivalent JSON Schema for this structure is roughly 80 lines. The compact version above is 18 lines. Both describe exactly the same shape, but one costs roughly four times as many tokens as the other.
When JSON Schema still wins
JSON Schema carries information that compact schemas do not. Patterns, enums, min/max constraints, $ref for reuse, and detailed format annotations like "format": "email" or "format": "date-time". If your LLM prompt requires the model to validate data against precise rules, JSON Schema gives you that precision.
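For example, constraints like these have no compact-schema equivalent (the field names are illustrative):

```json
{
  "type": "object",
  "properties": {
    "email": { "type": "string", "format": "email" },
    "status": { "type": "string", "enum": ["active", "suspended"] },
    "age": { "type": "integer", "minimum": 0, "maximum": 150 }
  }
}
```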
But most prompts do not ask for validation. They ask the model to understand a shape: "Here is the structure of the API response; extract the relevant fields," or "Generate a response matching this format." For comprehension tasks, the compact format carries everything the model needs.
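In practice the compact schema is just a string dropped into the prompt. A hypothetical prompt assembly, with illustrative instruction text, might look like:

```typescript
// The compact schema tells the model what shape to produce,
// without any validation machinery.
const userShape = `{
  name: string,
  email: string,
  phone?: string
}`;

const prompt = [
  "Extract the user's contact details from the message below.",
  `Respond with JSON matching this structure:\n${userShape}`,
  `Message: "Hi, I'm Ada Lovelace, reach me at ada@example.com."`,
].join("\n\n");

console.log(prompt);
```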
Token cost in practice
A real-world REST API response (a GitHub issue, a Stripe invoice, a Shopify order) typically produces:
| Format | Token count |
|--------|-------------|
| Raw JSON with values | 600-2000 |
| JSON Schema | 300-800 |
| Compact schema | 80-200 |
The compact schema is not just smaller than raw JSON. It is significantly smaller than JSON Schema itself. When you are working within a fixed context window, that difference means more room for instructions, examples, and conversation history.
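As a rough sanity check, you can estimate the gap with the common heuristic of about four characters per token for English and JSON text (exact counts depend on the model's tokenizer, so treat this as an approximation only):

```typescript
// Heuristic token estimate: ~4 characters per token.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const jsonSchema = `{
  "type": "object",
  "required": ["name", "email"],
  "properties": {
    "name": { "type": "string" },
    "email": { "type": "string" },
    "phone": { "type": "string" }
  }
}`;

const compact = `{
  name: string,
  email: string,
  phone?: string
}`;

console.log(approxTokens(jsonSchema), approxTokens(compact));
```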
Generating compact schemas
You can paste any JSON into the converter and get a compact schema in one click. For automation, the @maisondigital/jsontoschema npm package provides both a CLI and programmatic API:
curl -s https://api.example.com/users/1 | npx @maisondigital/jsontoschema
The output is ready to drop into a prompt template, a documentation file, or a CI snapshot.
Choosing the right format
Use JSON Schema when you need validation rules, tool interoperability, or spec compliance. Use compact schemas when you need a human or an LLM to quickly understand a JSON structure without burning tokens on boilerplate. They solve different problems, and the right choice depends on whether you are validating or communicating.
Try pasting a JSON payload into the converter to see how compact the output gets.