OpenAI Completion Client
Refer to the trtllm-serve documentation for starting a server.

Source: NVIDIA/TensorRT-LLM.
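Before running the client below, the server has to be up. A minimal sketch of launching it with trtllm-serve, assuming the same TinyLlama checkpoint, host, and port that the client example uses (the exact model ID and flags may differ in your setup):

```shell
# Serve the TinyLlama model over an OpenAI-compatible HTTP endpoint.
# Model ID, host, and port are assumptions chosen to match the client
# example below; adjust them to your deployment.
trtllm-serve "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --host localhost --port 8000
```

Once the server reports it is ready, the client can connect to http://localhost:8000/v1.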
from openai import OpenAI

# Point the OpenAI client at the local trtllm-serve endpoint.
# The api_key value is a placeholder; trtllm-serve does not validate it.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="tensorrt_llm",
)

# Issue a plain text completion request against the served model.
response = client.completions.create(
    model="TinyLlama-1.1B-Chat-v1.0",
    prompt="Where is New York?",
    max_tokens=20,
)
print(response)