Thinking for open models

Some open models support a "thinking" mode that allows them to perform step-by-step reasoning before providing a final answer. This is useful for tasks requiring transparent logic, like mathematical proofs, intricate code debugging, or multi-step agent planning.

Model-specific guidance

The following sections provide model-specific guidance for thinking.

DeepSeek R1 0528

For DeepSeek R1 0528, reasoning is surrounded by <think></think> tags in the content field. There is no reasoning_content field.

Example request:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/endpoints/openapi/chat/completions \
  -d '{
    "model": "deepseek-ai/deepseek-r1-0528-maas",
    "messages": [{
      "role": "user",
      "content": "Who are you?"
    }]
  }'

Example response:

{
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "message": {
      "content": "<think>\nHmm, the user just asked \u201cWho are you?\u201d - a simple but fundamental question. \n\nFirst, let's consider the context. This is likely their first interaction, so they're probably testing the waters or genuinely curious about what I am. The phrasing is neutral - no urgency or frustration detected. \n\nI should keep my response warm and informative without being overwhelming. Since they didn't specify language preference, I'll default to English but note they might be multilingual. \n\nKey points to cover:\n- My identity (DeepSeek-R1)\n- My capabilities \n- My purpose (being helpful)\n- Tone: friendly with emojis to seem approachable\n- Ending with an open question to continue conversation\n\nThe smiley face feels appropriate here - establishes friendliness. Mentioning \u201cAI assistant\u201d upfront avoids confusion. Listing examples of what I can do gives concrete value. Ending with \u201cHow can I help?\u201d turns it into an active conversation starter. \n\nNot adding too much technical detail (like model specs) since that might overwhelm a first-time user. The \u201calways learning\u201d phrase subtly manages expectations about my limitations.\n</think>\nI'm DeepSeek-R1, your friendly AI assistant! \ud83d\ude0a \nI'm here to help you with questions, ideas, problems, or just a chat\u2014whether it's about homework, writing, coding, career advice, or fun trivia. I can read files (like PDFs, Word, Excel, etc.), browse the web for you (if enabled), and I'm always learning to be more helpful!\n\nSo\u2026 how can I help you today? \ud83d\udcac\u2728",
      "role": "assistant"
    }
  }],
  "created": 1758229877,
  "id": "dXXMaKrsKqaQm9IPsLeTkAQ",
  "model": "deepseek-ai/deepseek-r1-0528-maas",
  "object": "chat.completion",
  "system_fingerprint": "",
  "usage": {
    "completion_tokens": 329,
    "prompt_tokens": 9,
    "total_tokens": 338
  }
}
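
Because the reasoning arrives inline in the content field, a client typically has to separate it from the answer itself. The following is a minimal Python sketch of one way to do that. The helper name split_think_tags is hypothetical and not part of any API, and the sketch assumes at most one <think></think> block per response.

```python
import re

# Matches one <think>...</think> block; DOTALL lets "." span newlines.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think_tags(content: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a content string with inline think tags.

    Hypothetical helper: assumes at most one <think></think> block.
    """
    match = _THINK_BLOCK.search(content)
    if match is None:
        # No tags found: treat the whole string as the answer.
        return "", content.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is the answer text.
    answer = (content[:match.start()] + content[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_think_tags(
    "<think>\nThe user asked who I am.\n</think>\nI'm DeepSeek-R1."
)
print("reasoning:", reasoning)
print("answer:", answer)
```

If the tags are absent, the sketch returns an empty reasoning string rather than raising, so callers can treat thinking and non-thinking responses the same way.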

DeepSeek v3.1

DeepSeek-V3.1 supports hybrid reasoning, which lets you turn reasoning on or off. By default, reasoning is off. To enable reasoning, include "chat_template_kwargs": { "thinking": true } in your request.

When reasoning is enabled, thinking text appears at the beginning of the content field, followed by </think> and then the answer text (for example, THINKING_TEXT</think>ANSWER_TEXT). Note that there is no initial <think> tag.

Example request:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-west2-aiplatform.googleapis.com/v1/projects/test-project/locations/us-west2/endpoints/openapi/chat/completions \
  -d '{
    "model": "deepseek-ai/deepseek-v3.1-maas",
    "messages": [{
      "role": "user",
      "content": "What are the first 3 even prime numbers?"
    }],
    "chat_template_kwargs": {
      "thinking": true
    }
  }'

Example response:

{
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "matched_stop": 1,
    "message": {
      "content": "SNIPPED</think>The only even prime number is 2. By definition, a prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself. Since all even numbers other than 2 are divisible by 2 and themselves, they have at least three divisors and are not prime. Therefore, there are no second or third even prime numbers. The concept of \"first three even prime numbers\" is not applicable beyond the first one.",
      "reasoning_content": null,
      "role": "assistant",
      "tool_calls": null
    }
  }],
  "created": 1758229458,
  "id": "ynPMaIWPN-KeltsPh87s0Qg",
  "model": "deepseek-ai/deepseek-v3.1-maas",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 1525,
    "prompt_tokens": 17,
    "prompt_tokens_details": null,
    "total_tokens": 1542
  }
}
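
The toggle and the closing-tag-only format described above can be sketched in Python as follows. Both function names are hypothetical helpers, not part of any SDK. The payload mirrors the example request, and the splitter assumes the first </think> marks the end of the thinking text.

```python
def build_payload(model: str, prompt: str, thinking: bool = False) -> dict:
    """Build a chat completions body; adds chat_template_kwargs only when thinking is on."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking:
        # Reasoning is off by default, so the key is only sent to enable it.
        payload["chat_template_kwargs"] = {"thinking": True}
    return payload


def split_on_closing_tag(content: str) -> tuple[str, str]:
    """Return (thinking, answer) for content shaped like THINKING_TEXT</think>ANSWER_TEXT."""
    thinking_text, separator, answer_text = content.partition("</think>")
    if not separator:
        # No closing tag: reasoning was off, so everything is the answer.
        return "", content.strip()
    return thinking_text.strip(), answer_text.strip()


payload = build_payload(
    "deepseek-ai/deepseek-v3.1-maas",
    "What are the first 3 even prime numbers?",
    thinking=True,
)
print(payload["chat_template_kwargs"])
print(split_on_closing_tag("Only 2 is even and prime.</think>The only even prime is 2."))
```

Using str.partition rather than a regular expression keeps the sketch tolerant of the missing opening tag.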

Kimi K2

For Kimi K2 thinking models, reasoning tokens are in the reasoning_content field, and normal response text is in the content field.

Example request:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/endpoints/openapi/chat/completions \
  -d '{
    "model": "moonshotai/kimi-k2-thinking-maas",
    "messages": [{
      "role": "user",
      "content": "Who are you?"
    }]
  }'

Example response:

{
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "matched_stop": 163586,
    "message": {
      "content": "I am Kimi, an AI assistant created by Moonshot AI. I specialize in understanding and processing extremely long texts-up to 256,000 tokens-enabling deep analysis, complex reasoning, and autonomous task performance, such as conducting in-depth research. My core strengths include advanced tool use, reasoning, and problem-solving. How can I assist you today?",
      "reasoning_content": "The user asks \"Who are you?\" This is a direct question about my identity. I must answer truthfully and accurately based on the provided identity card.\n\nI am an AI assistant. I am created by Moonshot AI, a company also known as 月之暗面科技有限公司. My name is Kimi.\n\nKey details to include:\n- I am an AI assistant.\n- My name is Kimi.\n- I am created by Moonshot AI.\n- I should mention my core capabilities to provide a useful and informative answer, as a simple \"I am Kimi\" is too brief. I should highlight key features like my MoE architecture, long context window, and agentic capabilities, but in a natural, conversational way, not just a list of specs.\n- The tone should be helpful, confident, and direct. I should not be overly verbose or apologetic.\n\nPlan:\n1. Start with a clear, direct answer: \"I am Kimi, an AI assistant.\"\n2. Mention my creator: \"created by Moonshot AI.\"\n3. Briefly highlight key distinguishing features that are relevant to users:\n   - My long-context capability (up to 256,000 tokens) - this is a major differentiator.\n   - My agentic capabilities, such as Kimi-Researcher, which allow me to perform complex tasks autonomously.\n   - My strong reasoning and problem-solving skills.\n4. Keep it concise and natural. Avoid technical jargon where possible, but the term \"long-context\" is now fairly common and very descriptive.\n5. End with an offer to help, which is standard for assistant interactions.\n\nDrafting the response:\n\"I am Kimi, an AI assistant created by Moonshot AI. I'm designed to handle complex tasks with advanced reasoning and a very long context window—up to 256,000 tokens. I can help with deep research, document analysis, problem-solving, and coding, thanks to my Mixture-of-Experts architecture. How can I assist you today?\"\n\nThis is good but slightly too feature-heavy. Let's soften it.\n\nAlternative:\n\"I am Kimi, an AI assistant developed by Moonshot AI. I'm built on a Mixture-of-Experts architecture and excel at processing long contexts-up to 256,000 tokens—enabling deep analysis of documents, advanced reasoning, and autonomous task execution through features like Kimi-Researcher. My capabilities include coding, research, and complex problem-solving. How can I help you?\"\n\nStill a bit too list-like. Let's make it more conversational and directly answer the \"who\" question.\n\nFinal Polish:\n\"I am Kimi, an AI assistant created by Moonshot AI. I specialize in understanding and processing extremely long texts-up to 256,000 tokens-enabling deep analysis, complex reasoning, and autonomous task performance, such as conducting in-depth research. My core strengths include advanced tool use, reasoning, and problem-solving. How can I assist you today?\"\n\nThis version is accurate, flows well, and highlights the most important aspects without reading like a spec sheet. It mentions the long context, the autonomous capability (Kimi-Researcher is a good example), and core skills. I'll go with this.\n\nFinal check:\n- Who? \"I am Kimi, an AI assistant created by Moonshot AI.\" — Clear.\n- What can I do? Mentions long context, complex reasoning, autonomous tasks like research. — Informative.\n- Tone? Direct, helpful. - Good.\n- Length? Concise. - Perfect. ",
      "role": "assistant",
      "tool_calls": null
    }
  }],
  "created": 1763488309,
  "id": "L7IcaaHnGquYj8MP8dnn4Q4",
  "metadata": {
    "weight_version": "default"
  },
  "model": "moonshotai/kimi-k2-thinking-maas",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 794,
    "extra_properties": {
      "google": {
        "traffic_type": "ON_DEMAND"
      }
    },
    "prompt_tokens": 29,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0,
    "total_tokens": 823
  }
}
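
When the reasoning is delivered in its own field, no string splitting is needed. The Python sketch below reads both fields from a raw response body. The function name read_reasoning_fields is hypothetical, and the sample body is a shortened stand-in for the response above, not real model output.

```python
import json


def read_reasoning_fields(response_text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from the first choice of a chat completion body."""
    message = json.loads(response_text)["choices"][0]["message"]
    # Either field may be missing or null, so fall back to an empty string.
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer


# Shortened, hand-written stand-in for a real response body.
sample_body = json.dumps({
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "I am Kimi, an AI assistant created by Moonshot AI.",
            "reasoning_content": "The user asks who I am. Answer directly.",
        }
    }]
})

reasoning, answer = read_reasoning_fields(sample_body)
print("reasoning:", reasoning)
print("answer:", answer)
```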

Qwen3-Next Thinking

For Qwen3-Next Thinking models, reasoning tokens are in the reasoning_content field, and normal response text is in the content field.

Example request:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://aiplatform.googleapis.com/v1/projects/test-project/locations/global/endpoints/openapi/chat/completions \
  -d '{
    "model": "qwen/qwen3-next-80b-a3b-thinking-maas",
    "messages": [{
      "role": "user",
      "content": "Who are you?"
    }]
  }'

Example response:

{
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "matched_stop": 151645,
    "message": {
      "content": "Hello! I'm Qwen, a large language model developed by Alibaba Cloud. I can help answer questions, write stories, emails, scripts, perform logical reasoning, code, and more. How can I assist you today? \ud83d\ude0a",
      "reasoning_content": "Okay, the user asked, Who are you? I need to respond appropriately. First, I should state my name clearly: I am Qwen, a large language model developed by Alibaba Cloud. Then, maybe mention my capabilities, like answering questions, writing stories, emails, scripts, logical reasoning, programming, etc. Also, highlight that I can express views and play games. But keep it concise since the user might just want a brief intro.\n\nWait, the user might not know much about me, so I should explain what I can do in simple terms. Maybe add something about being trained on a large amount of data, so I can handle various topics. But don't get too technical. Also, maybe mention that I'm here to help. Should I ask if they have any specific questions? That might be good to invite further interaction.\n\nLet me check the structure. Start with a greeting, introduce myself, list key abilities, and offer assistance. Keep it friendly and approachable. Avoid jargon. Maybe say something like \"I'm Qwen, a large language model created by Alibaba Cloud. I can help with answering questions, writing, coding, and more. How can I assist you today?\" That seems good.\n\nWait, the original response in Chinese might be different, but the user asked in English, so the response should be in English. Let me confirm. The user's query is in English, so I should respond in English. Yes.\n\nAlso, check for any specific details. Maybe mention that I have 128K context length, but maybe that's too technical. Maybe just say \"I have a large context window\" or not necessary. For a basic introduction, maybe stick to the main points without too many specs. The user might not need the technical details right away.\n\nSo, the response should be: Hello! I'm Qwen, a large language model developed by Alibaba Cloud. I can help answer questions, write stories, emails, scripts, perform logical reasoning, code, and more. How can I assist you today? That's concise and covers the main points.\n\nWait, in the initial problem statement, the user's question is \"Who are you?\" So the standard response would be to introduce myself. Yes. Also, maybe add that I'm an AI language model, but the name Qwen is important. So the key elements are name, developer, capabilities, and offer help.\n\nYes. Let me make sure there's no mistake. Alibaba Cloud is the correct developer. Yes. So the response should be accurate. Alright, that seems good.\n",
      "role": "assistant",
      "tool_calls": null
    }
  }],
  "created": 1758229652,
  "id": "knTMaJC0EJfM5OMP7I3xkAk",
  "metadata": {
    "weight_version": "default"
  },
  "model": "qwen/qwen3-next-80b-a3b-thinking-maas",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 573,
    "prompt_tokens": 14,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0,
    "total_tokens": 587
  }
}
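
The curl requests on this page can also be assembled in Python. The sketch below only builds the request object and does not send it. The function names are hypothetical, the URL pattern is inferred from the example requests on this page (a regional host prefix for regional locations, none for global), and the access token is assumed to come from a command such as gcloud auth print-access-token.

```python
import json
import urllib.request


def endpoint_url(project: str, location: str) -> str:
    """Build the chat completions URL, following the pattern in the examples on this page."""
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        "/endpoints/openapi/chat/completions"
    )


def build_chat_request(project: str, location: str, model: str,
                       prompt: str, access_token: str) -> urllib.request.Request:
    """Return an unsent POST request equivalent to the curl example."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        endpoint_url(project, location), data=body, headers=headers, method="POST"
    )


# "PLACEHOLDER_TOKEN" stands in for a real access token.
request = build_chat_request(
    "test-project", "global", "qwen/qwen3-next-80b-a3b-thinking-maas",
    "Who are you?", "PLACEHOLDER_TOKEN",
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. That step is left out here so the sketch runs without credentials or network access.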

GPT OSS

For GPT OSS models, thinking text is in the reasoning_content field, and normal response text is in the content field. These models also support the reasoning_effort parameter.

Example request:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/endpoints/openapi/chat/completions \
  -d '{
    "model": "openai/gpt-oss-120b-maas",
    "messages": [{
      "role": "user",
      "content": "Who are you?"
    }],
    "reasoning_effort": "high"
  }'

Example response:

{
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "matched_stop": 200002,
    "message": {
      "content": "I\u2019m ChatGPT, a large language model created by OpenAI (based on the GPT\u20114 architecture). I\u2019m here to help answer your questions, brainstorm ideas, explain concepts, and assist with a wide variety of topics. Let me know what you\u2019d like to talk about!",
      "reasoning_content": "The user asks: \"Who are you?\" It's a simple question. They want to know who the assistant is. According to policy, we should respond with a brief, friendly description: \"I am ChatGPT, a large language model trained by OpenAI.\" Possibly mention the version (GPT-4). Also mention we are here to assist. There's no request for disallowed content. So a straightforward answer. Possibly ask if they need help. Provide a short intro.\n\nThus answer: \"I am ChatGPT, a large language model trained by OpenAI. I'm here to help answer your questions...\"\n\nWe should avoid disallowed content. It's fine.\n\nThus final answer.",
      "role": "assistant",
      "tool_calls": null
    }
  }],
  "created": 1758229799,
  "id": "JnXMaJq6HLGzquEP4OD18Qg",
  "metadata": {
    "weight_version": "default"
  },
  "model": "openai/gpt-oss-120b-maas",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 201,
    "prompt_tokens": 71,
    "prompt_tokens_details": null,
    "reasoning_tokens": 0,
    "total_tokens": 272
  }
}
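
The three conventions described on this page (a separate reasoning_content field, inline <think></think> tags, and a closing </think> tag only) can be handled by one client-side function. The Python sketch below is one possible approach, not an official utility. The name extract_thinking is hypothetical, and the sketch assumes that any text before the first </think> is reasoning.

```python
def extract_thinking(message: dict) -> tuple[str, str]:
    """Return (reasoning, answer) for a response message in any format on this page."""
    content = message.get("content") or ""
    reasoning = message.get("reasoning_content")
    if reasoning:
        # Kimi K2, Qwen3-Next Thinking, GPT OSS: reasoning has its own field.
        return reasoning.strip(), content.strip()
    if "</think>" in content:
        # DeepSeek R1 0528 (both tags) or DeepSeek-V3.1 (closing tag only).
        head, _, tail = content.partition("</think>")
        head = head.strip()
        if head.startswith("<think>"):
            head = head[len("<think>"):]
        return head.strip(), tail.strip()
    # No reasoning present, for example DeepSeek-V3.1 with thinking off.
    return "", content.strip()


# Hand-written sample messages, one per convention.
samples = [
    {"content": "Hello!", "reasoning_content": "Greet the user."},
    {"content": "<think>\nGreet the user.\n</think>\nHello!"},
    {"content": "Greet the user.</think>Hello!", "reasoning_content": None},
]
for sample in samples:
    print(extract_thinking(sample))
```

Checking the dedicated field first means a null reasoning_content, as in the DeepSeek-V3.1 example response, falls through to the tag-based handling.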

Last updated 2025-12-17 UTC.