OpenAI Completion Client

Refer to the trtllm-serve documentation for starting a server.

Source: NVIDIA/TensorRT-LLM.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="tensorrt_llm",
)

response = client.completions.create(
    model="TinyLlama-1.1B-Chat-v1.0",
    prompt="Where is New York?",
    max_tokens=20,
)
print(response)
```