How does my agent generate images and video?

Through fal.ai, an inference platform that runs generative media models, including the FLUX image family and a range of video, audio, and editing models, behind one API. Connect it by creating an API key with the API scope at fal.ai/dashboard/keys and adding it to Operator's Environment as FAL_KEY. The exact name matters, since the fal skill and client both look for FAL_KEY by default.

What kinds of media can the agent make?

Text to image, image to image, inpainting that edits one region while leaving the rest alone, upscaling, background removal, short video, and text to speech. So it can produce a header image for a post, a few variations of a product photo, a logo cleaned up and enlarged, or a short clip, then hand you the result or attach it to whatever it is working on.

Why does my generated image link stop working?

Results come back as links to files hosted on fal, and those links expire, so to keep an image, tell the agent to save it to its files rather than leaving it in the chat. If a generation fails with an authorization error, the key was almost certainly created with the wrong scope or under a different team, both of which you fix on the keys page. Choose the API scope, not ADMIN.

Operator.io | Operator Guide: Generate Media with OpenClaw and fal.ai

How much does fal.ai cost?

Generation is billed by the output, and you only pay for successful results, never for time spent waiting in the queue or for a request that errors out. As a rough anchor, FLUX Schnell runs around $0.003 per megapixel and FLUX Dev around $0.025 per image, while premium and video models cost more. Image models bill per image or per megapixel and video bills per second or per clip. Current rates are on the fal pricing page.

When OpenClaw is drafting a post or putting together a project, it helps if it can make the visuals too. fal.ai is an inference platform that runs hundreds of generative media models behind one API, including FLUX Schnell for fast text to image, FLUX Dev for higher quality stills, and a range of video, audio, and editing models from providers such as Kling and MiniMax. Connecting it gives your Operator.io agent a way to produce media on demand instead of asking you to go find a picture. The same platform hosts models from Black Forest Labs, Stability AI, and open weight alternatives, so you are not locked to one image generator.

Text to image models take a natural language prompt and return a picture. fal wraps each model as a REST endpoint the agent calls with a JSON payload. Fast models like FLUX Schnell respond synchronously over fal.run. Heavier models and video go through fal's queue at queue.fal.run, and the agent polls until the result is ready. You rarely need to think about which path a request takes, but it explains why a quick logo comes back in a breath and a video clip takes longer.
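To make the two paths concrete, here is a minimal sketch of the request shape, not the fal skill's actual code. It builds the request without sending it. The model IDs (`fal-ai/flux/schnell`, `fal-ai/flux/dev`), the `Authorization: Key ...` header, and the `COMPLETED` status string are assumptions based on fal's public documentation. The helper names `build_fal_request` and `poll_until_done` are hypothetical.

```python
import json
import urllib.request

SYNC_BASE = "https://fal.run"         # synchronous path named in the text
QUEUE_BASE = "https://queue.fal.run"  # queued path named in the text


def build_fal_request(model_id, prompt, api_key, use_queue=False):
    """Build (but do not send) a POST request for a fal model endpoint."""
    base = QUEUE_BASE if use_queue else SYNC_BASE
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base}/{model_id}",
        data=body,
        method="POST",
        headers={
            # Assumed auth scheme: the word "Key" followed by the API key.
            "Authorization": f"Key {api_key}",
            "Content-Type": "application/json",
        },
    )


def poll_until_done(fetch_status, max_attempts=30, wait=lambda: None):
    """Call fetch_status() until it reports COMPLETED or attempts run out.

    fetch_status is a caller-supplied function returning a dict with a
    "status" field; wait is called between attempts (for example a sleep).
    Returns the number of attempts it took.
    """
    for attempt in range(1, max_attempts + 1):
        if fetch_status().get("status") == "COMPLETED":
            return attempt
        wait()
    raise TimeoutError("generation did not finish in time")
```

A quick still would use `build_fal_request("fal-ai/flux/schnell", prompt, key)`. A heavier job would pass `use_queue=True` and then poll with something like `wait=lambda: time.sleep(2)`.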

Cole Fortman's walkthrough runs real image, video, and 3D jobs on fal.ai, a quick way to see the model gallery and the request shape the agent uses before you create a key.

What you get

The fal skill lets OpenClaw run generative media models on fal.ai across the categories you would expect:

Text to image, turning a description into a picture.
Image to image, reworking a photo you already have along a direction you give.
Inpainting, editing one region while leaving the rest alone.
Upscaling a small image, and stripping the background off a product shot.
Short video, and text to speech.

So the agent can produce a header image for a post, a few variations of a product photo, a logo cleaned up and enlarged, or a short clip, then hand you the result or attach it to whatever it is working on.

Two things shape how a request runs. Fast models like FLUX Schnell return the image inline while you wait through the synchronous endpoint. Heavier ones, including FLUX Dev and anything that makes video, go to the queue, and the agent checks back until the result is ready rather than holding the line open. Cost follows the model and the kind of output:

Model	Billing unit	Rough cost
FLUX Schnell	per megapixel, rounded up	about $0.003, so a 1024 by 1024 image is a fraction of a cent
FLUX Dev	per image	about $0.025
Video models	per second of output	higher by an order of magnitude
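For a rough feel of what a batch might cost, the arithmetic behind the table can be sketched as below. The rates are the approximate figures quoted above, not live prices. Treating a megapixel as 1,000,000 pixels and rounding up per image are assumptions. The function names are hypothetical.

```python
import math

# Rough rates copied from the table above; check fal's pricing page for current values.
SCHNELL_USD_PER_MEGAPIXEL = 0.003
DEV_USD_PER_IMAGE = 0.025


def schnell_cost(width, height, images=1):
    """Estimate FLUX Schnell cost in USD for a batch of same-sized images."""
    # Assumption: one megapixel = 1,000,000 pixels, rounded up per image.
    megapixels = math.ceil(width * height / 1_000_000)
    return megapixels * SCHNELL_USD_PER_MEGAPIXEL * images


def dev_cost(images=1):
    """Estimate FLUX Dev cost in USD at the flat per-image rate."""
    return DEV_USD_PER_IMAGE * images
```

Under these assumptions a 1024 by 1024 Schnell image counts as two megapixels, which is still a fraction of a cent, and three Dev images come to about seven and a half cents.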

Before you start

You need a fal.ai account. Generation is billed by the output you produce, so add a payment method if you plan to do more than experiment with the included credit. The useful part of fal's billing model is that you only pay for successful results, never for time spent waiting in the queue or for a request that errors out. The pricing documentation explains billing units per model type and includes a programmatic pricing API if you want to estimate costs before running a batch.

Operator also ships a Replicate skill for models hosted on Replicate's platform. fal is the default for media generation in this guide because its model gallery covers image, video, and audio in one place with consistent API shapes, but either platform works for the same workflows if you prefer Replicate's catalog.

Step 1: Create an API key

Go to fal.ai/dashboard/keys and generate a key. When it asks for a scope, choose API, which is the right level for running models. The ADMIN scope is only needed for deploying your own models, which you do not need here. Copy the key when it appears. The full reference is in the fal authentication docs.

Step 2: Add the key to Operator

In your Operator dashboard, open Environment and add a variable named FAL_KEY with your key as the value. That exact name matters, because the fal skill and the fal client both look for FAL_KEY by default. The value is encrypted and shown only once, and the skill is already installed, so you are finished as soon as you save.
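If you ever script against fal yourself, a small check like the one below shows why the exact variable name matters. It is an illustrative sketch, not part of the fal skill, and `read_fal_key` is a hypothetical helper. Environment variable names are case sensitive here, so `fal_key` would not be found.

```python
import os


def read_fal_key(env=None):
    """Return the key stored under FAL_KEY, or raise with a hint."""
    env = os.environ if env is None else env
    key = env.get("FAL_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FAL_KEY is not set; add it under Environment with that exact name"
        )
    return key
```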

Step 3: Ask for something

Connect Telegram on the channels page if you have not, then describe what you want:

Make me three square images of a minimalist desk setup for the blog post you just drafted.

Here's a photo of our product. Put it on a clean white background and upscale it for the store page.

The agent picks a model that fits, runs it on fal.ai, and sends the result back in your channel or saves it to its files for later. It can write a post and produce the header image in the same thread, or draft a social update and attach a fresh graphic before it goes out.

Two things are worth keeping in mind once that becomes routine. First, generation runs on fal's infrastructure, so the prompt you write and any reference image you hand the agent leave your Operator instance and travel to the model the way any API call does. That is fine for most work, but pause before you feed it a private photo or a prompt that carries details you would not paste into someone else's tool.

Second, a generated image may be labeled as AI when it goes out to a social account. Instagram, TikTok, and YouTube now read the provenance metadata that many generators embed and can attach an AI label on their own, whatever you mark yourself. If a post calls for a real photograph, say so and give the agent the picture rather than having it generate one.

If you care about speed over quality, say "use FLUX Schnell." If you need a polished still for a landing page, say "use FLUX Dev" or name the resolution you want. Leaving the choice to the agent works fine for drafts, but naming the model saves money when Schnell is enough.

Good to know

Pricing depends on the model and the size of what you ask for. Image models bill per image or per megapixel of output, so a larger picture costs proportionally more, and video bills per second or per clip. As a rough anchor, FLUX Schnell runs around $0.003 per megapixel and FLUX Dev around $0.025 per image, while premium and video models cost more. The current rates for any model are on the fal pricing page and on each model's page in the gallery. Because the heavier models cost more and take longer, it is worth naming the one you want when you have a preference rather than leaving the choice to the agent.

Two practical notes. Results come back as links to files hosted on fal, and those links expire, so if you want to keep an image, tell the agent to save it to its files rather than leaving it in the chat. And if a generation fails with an authorization error, the key was almost certainly created with the wrong scope or under a different team, both of which you fix on the keys page.
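Saving a result before its link expires amounts to downloading the file while the URL is still valid. The sketch below shows one way to do that with the Python standard library. It is an assumption about the general approach, not a description of how the agent stores files. `save_result` and its `opener` parameter are hypothetical names, and `opener` exists only so the function can be exercised without a network call.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path


def save_result(url, dest_dir, opener=urllib.request.urlopen):
    """Download a generated file into dest_dir and return its local path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Use the last path segment of the URL as the file name, ignoring any query string.
    name = Path(urllib.parse.urlparse(url).path).name or "result.bin"
    target = dest_dir / name
    with opener(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```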

Generate images and video with fal.ai

What you get

Before you start

Step 1: Create an API key

Step 2: Add the key to Operator

Step 3: Ask for something

Good to know

Frequently asked questions

How does my agent generate images and video?

What kinds of media can the agent make?

How much does fal.ai cost?

Why does my generated image link stop working?

Keep reading

Connect Pipedream to your agent and reach the apps you use

Set up your Telegram channel on Operator

How to use Composio with OpenClaw to automate anything