Python client for Replicate
replicate/replicate-python
This is a Python client for Replicate. It lets you run models from your Python code or Jupyter notebook, and do various other things on Replicate.
The 1.0.0 release contains breaking changes:
- The `replicate.run()` method now returns `FileOutput` objects instead of URL strings by default for models that output files. `FileOutput` implements an iterable interface similar to `httpx.Response`, making it easier to work with files efficiently.

To revert to the previous behavior, you can opt out of `FileOutput` by passing `use_file_output=False` to `replicate.run()`:

    output = replicate.run("acmecorp/acme-model", use_file_output=False)

In most cases, updating existing applications to call `output.url` should resolve any issues. But we recommend using the `FileOutput` objects directly, as we have further improvements planned to this API and this approach is guaranteed to give the fastest results.
- Python 3.8+
pip install replicate
Before running any Python scripts that use the API, you need to set your Replicate API token in your environment.
Grab your token from replicate.com/account and set it as an environment variable:

    export REPLICATE_API_TOKEN=<your token>

We recommend not adding the token directly to your source code, because you don't want to put your credentials in source control. If anyone used your API key, their usage would be charged to your account.
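One way to catch a missing token early is to check the environment before making any API calls. The sketch below is illustrative only: `require_api_token` is a hypothetical helper, not part of the `replicate` package, and all it does is read the `REPLICATE_API_TOKEN` variable described above.

```python
import os


def require_api_token(env=None):
    """Return the token from REPLICATE_API_TOKEN, or raise a clear error.

    Hypothetical helper for illustration; not part of the replicate package.
    """
    # Default to the real process environment; a plain dict can be passed in tests.
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it before running this script."
        )
    return token
```

Calling `require_api_token()` at the top of a script turns a confusing authentication failure later on into an immediate, readable error.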
Alternative authentication
As of replicate 1.0.7 and cog 0.14.11, it is possible to pass a `REPLICATE_API_TOKEN` via the `context` as part of a prediction request.
The `Replicate()` constructor will now use this context when available. This grants cog models the ability to use the Replicate client libraries, scoped to a user on a per-request basis.
Create a new Python file and add the following code, replacing the model identifier and input with your own:

    >>> import replicate
    >>> outputs = replicate.run(
    ...     "black-forest-labs/flux-schnell",
    ...     input={"prompt": "astronaut riding a rocket like a horse"}
    ... )
    >>> outputs
    [<replicate.helpers.FileOutput object at 0x107179b50>]
    >>> for index, output in enumerate(outputs):
    ...     with open(f"output_{index}.webp", "wb") as file:
    ...         file.write(output.read())

`replicate.run` raises `ModelError` if the prediction fails. You can access the exception's `prediction` property to get more information about the failure.

    import replicate
    from replicate.exceptions import ModelError

    try:
        output = replicate.run("stability-ai/stable-diffusion-3", {"prompt": "An astronaut riding a rainbow unicorn"})
    except ModelError as e:
        if "(some known issue)" in e.prediction.logs:
            pass

        print("Failed prediction: " + e.prediction.id)

Note
By default the Replicate client will hold the connection open for up to 60 seconds while waiting for the prediction to complete. This is designed to optimize getting the model output back to the client as quickly as possible.
The timeout can be configured by passing `wait=x` to `replicate.run()`, where `x` is a timeout in seconds between 1 and 60. To disable the sync mode you can pass `wait=False`.
You can also use the Replicate client asynchronously by prepending `async_` to the method name.
Here's an example of how to run several predictions concurrently and wait for them all to complete:

    import asyncio
    import replicate

    # https://replicate.com/stability-ai/sdxl
    model_version = "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b"
    prompts = [
        f"A chariot pulled by a team of {count} rainbow unicorns"
        for count in ["two", "four", "six", "eight"]
    ]

    async def main():
        # asyncio.TaskGroup requires Python 3.11+
        async with asyncio.TaskGroup() as tg:
            tasks = [
                tg.create_task(replicate.async_run(model_version, input={"prompt": prompt}))
                for prompt in prompts
            ]

        # Every task has finished by the time the TaskGroup block exits
        results = await asyncio.gather(*tasks)
        print(results)

    # `await` is only valid inside a coroutine, so run everything from main()
    asyncio.run(main())

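When there are many prompts, it can help to cap how many predictions are in flight at once. The following is a minimal sketch of that idea using `asyncio.Semaphore`. `run_all` and `fake_run` are hypothetical names introduced here; `fake_run` stands in for a real call such as `replicate.async_run` so the example runs without network access.

```python
import asyncio


async def run_all(prompts, run_one, limit=2):
    """Run run_one(prompt) for every prompt, at most `limit` at a time.

    `run_one` is any coroutine function. Results are returned in the same
    order as `prompts`.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(prompt):
        # Only `limit` coroutines can hold the semaphore at the same time.
        async with semaphore:
            return await run_one(prompt)

    return await asyncio.gather(*(guarded(p) for p in prompts))


async def fake_run(prompt):
    # Stand-in for a real prediction call (assumption for illustration).
    await asyncio.sleep(0)
    return f"output for {prompt}"


results = asyncio.run(run_all(["two", "four", "six"], fake_run))
print(results)
```

Because `asyncio.gather` preserves argument order, each result lines up with the prompt that produced it even when predictions finish out of order.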
To run a model that takes a file input, you can pass either a URL to a publicly accessible file on the Internet or a handle to a file on your local device.

    >>> output = replicate.run(
    ...     "andreasjansson/blip-2:f677695e5e89f8b236e52ecd1d3f01beb44c34606419bcc19345e046d8f786f9",
    ...     input={"image": open("path/to/mystery.jpg", "rb")}
    ... )
    >>> output
    "an astronaut riding a horse"

Replicate's API supports server-sent event streams (SSEs) for language models. Use the `stream` method to consume tokens as they're produced by the model.

    import replicate

    for event in replicate.stream(
        "meta/meta-llama-3-70b-instruct",
        input={
            "prompt": "Please write a haiku about llamas.",
        },
    ):
        print(str(event), end="")

Tip
Some models, like meta/meta-llama-3-70b-instruct, don't require a version string. You can always refer to the API documentation on the model page for specifics.
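The identifiers used throughout this README take the form `owner/name` or `owner/name:version`. As a rough illustration of that shape, here is a small sketch that splits an identifier into its parts. `parse_identifier` is a hypothetical helper written for this example, not a function provided by the `replicate` package.

```python
def parse_identifier(ref):
    """Split "owner/name" or "owner/name:version" into its parts.

    Hypothetical helper; version is None when there is no ":version" suffix.
    """
    name_part, _, version = ref.partition(":")
    owner, sep, name = name_part.partition("/")
    if not sep or not owner or not name:
        raise ValueError(
            f"expected 'owner/name' or 'owner/name:version', got {ref!r}"
        )
    return owner, name, version or None
```

A version of `None` corresponds to the case in the tip above, where the model can be run without a version string.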
You can also stream the output of a prediction you create. This is helpful when you want the ID of the prediction separate from its output.

    prediction = replicate.predictions.create(
        model="meta/meta-llama-3-70b-instruct",
        input={"prompt": "Please write a haiku about llamas."},
        stream=True,
    )

    for event in prediction.stream():
        print(str(event), end="")

For more information, see "Streaming output" in Replicate's docs.
You can start a model and run it in the background using async mode:

    >>> model = replicate.models.get("kvfrans/clipdraw")
    >>> version = model.versions.get("5797a99edc939ea0e9242d5e8c9cb3bc7d125b1eac21bda852e5cb79ede2cd9b")
    >>> prediction = replicate.predictions.create(
    ...     version=version,
    ...     input={"prompt": "Watercolor painting of an underwater submarine"})
    >>> prediction
    Prediction(...)
    >>> prediction.status
    'starting'
    >>> dict(prediction)
    {"id": "...", "status": "starting", ...}
    >>> prediction.reload()
    >>> prediction.status
    'processing'
    >>> print(prediction.logs)
    iteration: 0, render:loss: -0.6171875
    iteration: 10, render:loss: -0.92236328125
    iteration: 20, render:loss: -1.197265625
    iteration: 30, render:loss: -1.3994140625
    >>> prediction.wait()
    >>> prediction.status
    'succeeded'
    >>> prediction.output
    <replicate.helpers.FileOutput object at 0x107179b50>
    >>> with open("output.png", "wb") as file:
    ...     file.write(prediction.output.read())

You can run a model and get a webhook when it completes, instead of waiting for it to finish:

    model = replicate.models.get("ai-forever/kandinsky-2.2")
    version = model.versions.get("ea1addaab376f4dc227f5368bbd8eff901820fd1cc14ed8cad63b29249e9d463")
    prediction = replicate.predictions.create(
        version=version,
        input={"prompt": "Watercolor painting of an underwater submarine"},
        webhook="https://example.com/your-webhook",
        webhook_events_filter=["completed"]
    )

For details on receiving webhooks, see replicate.com/docs/webhooks.
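On the receiving side, a handler typically needs to decode the request body and decide whether the prediction has finished. The sketch below assumes the webhook body is a JSON object carrying the prediction's `id`, `status`, and `output` fields; that payload shape, the `"failed"` status, and the `summarize_webhook` name are assumptions made for illustration, so check the webhook documentation before relying on them.

```python
import json

# "succeeded" and "canceled" appear elsewhere in this README;
# "failed" is an assumption made for this sketch.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def summarize_webhook(body):
    """Parse a webhook request body (str or bytes) into a small summary dict."""
    payload = json.loads(body)
    status = payload.get("status")
    return {
        "id": payload.get("id"),
        "status": status,
        "done": status in TERMINAL_STATUSES,
        "output": payload.get("output"),
    }
```

The function is framework-agnostic: pass it the raw body from whichever web framework receives the request.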
You can run a model and feed the output into another model:

    laionide = replicate.models.get("afiaka87/laionide-v4").versions.get("b21cbe271e65c1718f2999b038c18b45e21e4fba961181fbfae9342fc53b9e05")
    swinir = replicate.models.get("jingyunliang/swinir").versions.get("660d922d33153019e8c263a3bba265de882e7f4f70396546b6c9c8f9d47a021a")
    image = laionide.predict(prompt="avocado armchair")
    upscaled_image = swinir.predict(image=image)

Run a model and get its output while it's running:

    iterator = replicate.run(
        "pixray/text2image:5c347a4bfa1d4523a58ae614c2194e15f2ae682b57e3797a5bb468920aa70ebf",
        input={"prompts": "san francisco sunset"}
    )

    for index, image in enumerate(iterator):
        with open(f"file_{index}.png", "wb") as file:
            file.write(image.read())

You can cancel a running prediction:

    >>> model = replicate.models.get("kvfrans/clipdraw")
    >>> version = model.versions.get("5797a99edc939ea0e9242d5e8c9cb3bc7d125b1eac21bda852e5cb79ede2cd9b")
    >>> prediction = replicate.predictions.create(
    ...     version=version,
    ...     input={"prompt": "Watercolor painting of an underwater submarine"}
    ... )
    >>> prediction.status
    'starting'
    >>> prediction.cancel()
    >>> prediction.reload()
    >>> prediction.status
    'canceled'

You can list all the predictions you've run:

    replicate.predictions.list()
    # [<Prediction: 8b0ba5ab4d85>, <Prediction: 494900564e8c>]

Lists of predictions are paginated. You can get the next page of predictions by passing the `next` property as an argument to the `list` method:

    page1 = replicate.predictions.list()

    if page1.next:
        page2 = replicate.predictions.list(page1.next)

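The same cursor-following pattern can be wrapped in a generator so that callers do not have to track `next` themselves. This is a minimal sketch under stated assumptions: `iter_pages` and `fake_list` are hypothetical names, pages are assumed to expose `results` and `next` attributes, and `fake_list` stands in for a function like `replicate.predictions.list` so the example runs offline.

```python
from types import SimpleNamespace


def iter_pages(list_fn):
    """Yield every page from a cursor-paginated list function."""
    page = list_fn()
    while True:
        yield page
        if not page.next:
            break
        # Pass the cursor from the current page to fetch the following one.
        page = list_fn(page.next)


# Fake two-page listing so the sketch runs without network access.
_pages = {
    None: SimpleNamespace(results=["p1", "p2"], next="cursor-2"),
    "cursor-2": SimpleNamespace(results=["p3"], next=None),
}


def fake_list(cursor=None):
    return _pages[cursor]


all_results = [item for page in iter_pages(fake_list) for item in page.results]
print(all_results)
```

Because `iter_pages` is lazy, a caller can stop iterating early and no further pages are requested.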
Output files are returned as `FileOutput` objects:

    import replicate
    from PIL import Image  # pip install pillow

    output = replicate.run(
        "stability-ai/stable-diffusion:27b93a2413e7f36cd83da926f3656280b2931564ff050bf9575f1fdf9bcd7478",
        input={"prompt": "wavy colorful abstract patterns, oceans"}
    )

    # This has a .read() method that returns the binary data.
    with open("my_output.png", "wb") as file:
        file.write(output[0].read())

    # It also implements the iterator protocol to stream the data.
    background = Image.open(output[0])

`FileOutput` is a file-like object returned from the `replicate.run()` method that makes it easier to work with models that output files. It implements `Iterator` and `AsyncIterator` for reading the file data in chunks, as well as `read()` and `aread()` to read the entire file into memory.
Note
At this time, `read()` and `aread()` do not accept a `size` argument to read up to `size` bytes.
Lastly, the URL of the underlying data source is available on the `url` attribute, though we recommend you use the object as an iterator or use its `read()` or `aread()` methods, as the `url` property may not always return HTTP URLs in the future.

    print(output.url)  #=> "data:image/png;base64,xyz123..." or "https://delivery.replicate.com/..."

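Since `url` may hold either a `data:` URI or an HTTP(S) address, code that does inspect it should check the scheme rather than assume one form. A minimal sketch using only the standard library follows; `url_kind` is a hypothetical helper introduced for illustration.

```python
from urllib.parse import urlparse


def url_kind(url):
    """Classify a URL string by scheme: 'data', 'http', or 'other'."""
    scheme = urlparse(url).scheme
    if scheme == "data":
        return "data"
    if scheme in ("http", "https"):
        return "http"
    return "other"
```

Anything classified as `'other'` is a reason to fall back to reading the object itself rather than fetching the URL.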
To consume the file directly:

    with open('output.bin', 'wb') as file:
        file.write(output.read())

Very large files can be streamed instead:

    with open(file_path, 'wb') as file:
        for chunk in output:
            file.write(chunk)

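The streaming loop can be wrapped in a small function that also reports how many bytes were written, which is handy for logging or sanity checks. This sketch is an illustration rather than part of the client: `save_chunks` is a hypothetical helper, and a list of byte strings stands in for a `FileOutput` so it runs without network access.

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of bytes chunks to path; return total bytes written."""
    total = 0
    with open(path, "wb") as file:
        for chunk in chunks:
            file.write(chunk)
            total += len(chunk)
    return total


# Stand-in chunks; any iterable of bytes objects works the same way.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "output.bin")
    written = save_chunks([b"abc", b"de"], target)
    size_on_disk = os.path.getsize(target)
```

Only one chunk is held in memory at a time, which is the point of streaming for very large files.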
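Each of these methods has an equivalent `asyncio` API.

    # Read the whole file, then write it out in binary mode
    async with aiofiles.open(filename, 'wb') as file:
        await file.write(await output.aread())

    # Or stream it chunk by chunk
    async with aiofiles.open(filename, 'wb') as file:
        async for chunk in output:
            await file.write(chunk)
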
Common web frameworks all support streaming responses from `Iterator` types:
Django

    from django.http import HttpResponse
    from django.views.decorators.http import condition

    @condition(etag_func=None)
    def stream_response(request):
        output = replicate.run("black-forest-labs/flux-schnell", input={...}, use_file_output=True)
        return HttpResponse(output, content_type='image/webp')

FastAPI

    from fastapi.responses import StreamingResponse

    @app.get("/")
    async def main():
        output = replicate.run("black-forest-labs/flux-schnell", input={...}, use_file_output=True)
        return StreamingResponse(output)

Flask

    from flask import stream_with_context

    @app.route('/stream')
    def streamed_response():
        output = replicate.run("black-forest-labs/flux-schnell", input={...}, use_file_output=True)
        return app.response_class(stream_with_context(output))

You can opt out of `FileOutput` by passing `use_file_output=False` to the `replicate.run()` method.

    output = replicate.run("acmecorp/acme-model", use_file_output=False)

You can list the models you've created:

    replicate.models.list()

Lists of models are paginated. You can get the next page of models by passing the `next` property as an argument to the `list` method, or you can use the `paginate` method to fetch pages automatically.

    # Automatic pagination using `replicate.paginate` (recommended)
    models = []
    for page in replicate.paginate(replicate.models.list):
        models.extend(page.results)
        if len(models) > 100:
            break

    # Manual pagination using `next` cursors
    page = replicate.models.list()
    while page:
        models.extend(page.results)
        if len(models) > 100:
            break
        page = replicate.models.list(page.next) if page.next else None

You can also find collections of featured models on Replicate:

    >>> collections = [collection for page in replicate.paginate(replicate.collections.list) for collection in page]
    >>> collections[0].slug
    "vision-models"
    >>> collections[0].description
    "Multimodal large language models with vision capabilities like object detection and optical character recognition (OCR)"
    >>> replicate.collections.get("text-to-image").models
    [<Model: stability-ai/sdxl>, ...]

You can create a model for a user or organization with a given name, visibility, and hardware SKU:

    import replicate

    model = replicate.models.create(
        owner="your-username",
        name="my-model",
        visibility="public",
        hardware="gpu-a40-large"
    )

Here's how to list all the available hardware for running models on Replicate:

    >>> [hw.sku for hw in replicate.hardware.list()]
    ['cpu', 'gpu-t4', 'gpu-a40-small', 'gpu-a40-large']

Use the training API to fine-tune models to make them better at a particular task. To see what language models currently support fine-tuning, check out Replicate's collection of trainable language models.
If you're looking to fine-tune image models, check out Replicate's guide to fine-tuning image models.
Here's how to fine-tune a model on Replicate:

    training = replicate.trainings.create(
        model="stability-ai/sdxl",
        version="39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
        input={
            "input_images": "https://my-domain/training-images.zip",
            "token_string": "TOK",
            "caption_prefix": "a photo of TOK",
            "max_train_steps": 1000,
            "use_face_detection_instead": False
        },
        # You need to create a model on Replicate that will be the destination for the trained version.
        destination="your-username/model-name"
    )

The `replicate` package exports a default shared client. This client is initialized with an API token set by the `REPLICATE_API_TOKEN` environment variable.
You can create your own client instance to pass a different API token value, add custom headers to requests, or control the behavior of the underlying HTTPX client:

    import os
    from replicate.client import Client

    replicate = Client(
        api_token=os.environ["SOME_OTHER_REPLICATE_API_TOKEN"],
        headers={
            "User-Agent": "my-app/1.0"
        }
    )

Warning
Never hardcode authentication credentials like API tokens into your code. Instead, pass them as environment variables when running your program.