JSON output from Deepseek R1 and distills with llama.cpp

We're evaluating Deepseek R1 and its distills for our project on identifying and extracting evidence from literature. One part of our pipeline needs a strong reasoning model with JSON structured output.

Using llama-server's OpenAI compatible completion endpoint with response_format and JSON schema conflicts with the model's reasoning in <think> tags. Here's a workaround using the grammar functionality.

First, grab json_schema_to_grammar from the llama.cpp repo.

Next you can use the following:

def convert_schema_to_grammar(json_schema: dict) -> str:
    converter = json_schema_to_grammar.SchemaConverter(
        prop_order={},
        allow_fetch=False,
        dotall=False,
        raw_pattern=False
    )

    converter.visit(json_schema, 'json-schema')
    json_grammar = converter.format_grammar()

    # do the root gbnf grammar that handles the <think> and </think> tags
    # lack of less-than didn't hinder performance for us but ymmv!
    base_rules = """
root ::= "<think>" [^<]+ "</think>" [\\n]* json-schema
"""

    return base_rules + json_grammar

Then use the 'extra_body' parameter in your request:

client = OpenAI()

params = {
    "model": "...",
    "extra_body": {
        "grammar": grammar
    }
}

response = client.chat.completions.create(
    **params
)

Example below:

Provide a fictional name and address using the following JSON schema:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "age": {
      "type": "integer"
    },
    "address": {
      "type": "object",
      "properties": {
        "street": {
          "type": "string"
        },
        "city": {
          "type": "string"
        },
        "zip": {
          "type": "string"
        }
      }
    }
  }
}

<think>
Okay, so the user wants me to provide a fictional name and address following a specific JSON schema. Let me break this down.

First, the schema has an object with name, age, and address. The address itself is another object containing street, city, and zip. I need to make sure each field is of the correct type: strings for name, street, city, zip, and an integer for age.

I should come up with a plausible name. Maybe something common like Emily Carter. Age should be an adult, say 32. For the address, I'll pick a street name, perhaps Maple Street, a number like 147. The city could be something like Riverton, and the zip code needs to be a 5-digit number, maybe 12345.

Wait, is Riverton a real city? I think there are places named Riverton in various states, but since it's fictional, it doesn't matter. Alternatively, I could make up a city name, but using a real one might be easier.

Putting it all together, I'll structure the JSON with these values. I need to ensure that the syntax is correct, with proper commas and brackets. Also, the keys should match exactly: name, age, address, street, city, zip.

I should double-check that all the types are correct. Name is a string, age is an integer, and each part of the address is a string. That should satisfy the schema requirements.

I think that's all. Time to put it together in the response.
</think>

{
  "name": "Emily Carter",
  "age": 32,
  "address": {
    "street": "147 Maple Street",
    "city": "Riverton",
    "zip": "12345"
  }
}

This approach lets you preserve model reasoning while enforcing JSON output structure. Enjoy!

Sadiq Jaffer

Blog

Sadiq Jaffer's blog

Posted Thu 30 January 2025

JSON output from Deepseek R1 and distills with llama.cpp