Gradio Deployment Guide#

📖 Overview#

Lightx2v is a lightweight video inference and generation engine that provides a web interface based on Gradio, supporting both Image-to-Video and Text-to-Video generation modes.

For Windows systems, we provide a convenient one-click deployment solution with automatic environment configuration and intelligent parameter optimization. Please refer to the One-Click Gradio Startup (Recommended) section for detailed instructions.

Gradio English Interface

📁 File Structure#

LightX2V/app/
├── gradio_demo.py          # English interface demo
├── gradio_demo_zh.py       # Chinese interface demo
├── run_gradio.sh          # Startup script
├── README.md              # Documentation
├── outputs/               # Generated video save directory
└── inference_logs.log     # Inference logs

This project contains two main demo files:

gradio_demo.py - English interface version
gradio_demo_zh.py - Chinese interface version

🚀 Quick Start#

Environment Requirements#

Follow the Quick Start Guide to install the environment

Recommended Optimization Library Configuration#

✅ Flash attention
✅ Sage attention
✅ vllm-kernel
✅ sglang-kernel
✅ q8-kernel (only supports ADA architecture GPUs)

Install according to the project homepage tutorials for each operator as needed.

📥 Model Download#

Models can be downloaded with one click through the frontend interface, with two download sources provided: HuggingFace and ModelScope. You can choose according to your situation. You can also refer to the Model Structure Documentation to download complete models (including quantized and non-quantized versions) or download only quantized/non-quantized versions.

wan2.1 Model Directory Structure#

models/
├── wan2.1_i2v_720p_lightx2v_4step.safetensors                   # Original precision
├── wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors   # FP8 quantization
├── wan2.1_i2v_720p_int8_lightx2v_4step.safetensors              # INT8 quantization
├── wan2.1_i2v_720p_int8_lightx2v_4step_split                    # INT8 quantization block storage directory
├── wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step_split         # FP8 quantization block storage directory
├── Other weights (e.g., t2v)
├── t5/clip/xlm-roberta-large/google    # text and image encoder
├── vae/lightvae/lighttae               # vae
└── config.json                         # Model configuration file

wan2.2 Model Directory Structure#

models/
├── wan2.2_i2v_A14b_high_noise_lightx2v_4step_1030.safetensors        # high noise original precision
├── wan2.2_i2v_A14b_high_noise_fp8_e4m3_lightx2v_4step_1030.safetensors    # high noise FP8 quantization
├── wan2.2_i2v_A14b_high_noise_int8_lightx2v_4step_1030.safetensors   # high noise INT8 quantization
├── wan2.2_i2v_A14b_high_noise_int8_lightx2v_4step_1030_split         # high noise INT8 quantization block storage directory
├── wan2.2_i2v_A14b_low_noise_lightx2v_4step.safetensors         # low noise original precision
├── wan2.2_i2v_A14b_low_noise_fp8_e4m3_lightx2v_4step.safetensors     # low noise FP8 quantization
├── wan2.2_i2v_A14b_low_noise_int8_lightx2v_4step.safetensors    # low noise INT8 quantization
├── wan2.2_i2v_A14b_low_noise_int8_lightx2v_4step_split          # low noise INT8 quantization block storage directory
├── t5/clip/xlm-roberta-large/google    # text and image encoder
├── vae/lightvae/lighttae               # vae
└── config.json                         # Model configuration file

📝 Download Instructions:

Model weights can be downloaded from HuggingFace:
- Wan2.1-Distill-Models
- Wan2.2-Distill-Models
Text and Image Encoders can be downloaded from Encoders
VAE can be downloaded from Autoencoders
For xxx_split directories (e.g., wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step_split), which store multiple safetensors by block, suitable for devices with insufficient memory. For example, devices with 16GB or less memory should download according to their own situation.

Startup Methods#

Method 1: Using Startup Script (Recommended)#

Linux Environment:

# 1. Edit the startup script to configure relevant paths
cd app/
vim run_gradio.sh

# Configuration items that need to be modified:
# - lightx2v_path: Lightx2v project root directory path
# - model_path: Model root directory path (contains all model files)

# 💾 Important note: Recommend pointing model paths to SSD storage locations
# Example: /mnt/ssd/models/ or /data/ssd/models/

# 2. Run the startup script
bash run_gradio.sh

# 3. Or start with parameters
bash run_gradio.sh --lang en --port 8032
bash run_gradio.sh --lang zh --port 7862

Windows Environment:

# 1. Edit the startup script to configure relevant paths
cd app\
notepad run_gradio_win.bat

# Configuration items that need to be modified:
# - lightx2v_path: Lightx2v project root directory path
# - model_path: Model root directory path (contains all model files)

# 💾 Important note: Recommend pointing model paths to SSD storage locations
# Example: D:\models\ or E:\models\

# 2. Run the startup script
run_gradio_win.bat

# 3. Or start with parameters
run_gradio_win.bat --lang en --port 8032
run_gradio_win.bat --lang zh --port 7862

Method 2: Direct Command Line Startup#

pip install -v git+https://github.com/ModelTC/LightX2V.git

Linux Environment:

English Interface Version:

python gradio_demo.py \
    --model_path /path/to/models \
    --server_name 0.0.0.0 \
    --server_port 7862

Chinese Interface Version:

python gradio_demo_zh.py \
    --model_path /path/to/models \
    --server_name 0.0.0.0 \
    --server_port 7862

Windows Environment:

English Interface Version:

python gradio_demo.py ^
    --model_path D:\models ^
    --server_name 127.0.0.1 ^
    --server_port 7862

Chinese Interface Version:

python gradio_demo_zh.py ^
    --model_path D:\models ^
    --server_name 127.0.0.1 ^
    --server_port 7862

💡 Tip: Model type (wan2.1/wan2.2), task type (i2v/t2v), and specific model file selection are all configured in the Web interface.

📋 Command Line Parameters#

Parameter	Type	Required	Default	Description
`--model_path`	str	✅	-	Model root directory path (directory containing all model files)
`--server_port`	int	❌	7862	Server port
`--server_name`	str	❌	0.0.0.0	Server IP address
`--output_dir`	str	❌	./outputs	Output video save directory

💡 Note: Model type (wan2.1/wan2.2), task type (i2v/t2v), and specific model file selection are all configured in the Web interface.

🎯 Features#

Model Configuration#

Model Type: Supports wan2.1 and wan2.2 model architectures
Task Type: Supports Image-to-Video (i2v) and Text-to-Video (t2v) generation modes
Model Selection: Frontend automatically identifies and filters available model files, supports automatic quantization precision detection
Encoder Configuration: Supports selection of T5 text encoder, CLIP image encoder, and VAE decoder
Operator Selection: Supports multiple attention operators and quantization matrix multiplication operators, system automatically sorts by installation status

Input Parameters#

Prompt: Describe the expected video content
Negative Prompt: Specify elements you don’t want to appear
Input Image: Upload input image required in i2v mode
Resolution: Supports multiple preset resolutions (480p/540p/720p)
Random Seed: Controls the randomness of generation results
Inference Steps: Affects the balance between generation quality and speed (defaults to 4 steps for distilled models)

Video Parameters#

FPS: Frames per second
Total Frames: Video length
CFG Scale Factor: Controls prompt influence strength (1-10, defaults to 1 for distilled models)
Distribution Shift: Controls generation style deviation degree (0-10)

🔧 Auto-Configuration Feature#

The system automatically configures optimal inference options based on your hardware configuration (GPU VRAM and CPU memory) without manual adjustment. The best configuration is automatically applied on startup, including:

GPU Memory Optimization: Automatically enables CPU offloading, VAE tiling inference, etc. based on VRAM size
CPU Memory Optimization: Automatically enables lazy loading, module unloading, etc. based on system memory
Operator Selection: Automatically selects the best installed operators (sorted by priority)
Quantization Configuration: Automatically detects and applies quantization precision based on model file names

Log Viewing#

# View inference logs
tail -f inference_logs.log

# View GPU usage
nvidia-smi

# View system resources
htop

Welcome to submit Issues and Pull Requests to improve this project!

Note: Please comply with relevant laws and regulations when using videos generated by this tool, and do not use them for illegal purposes.

Gradio Deployment Guide

Contents