Synthetic Data Generation

generate() Function

The core function for generating synthetic images with automatic object detection and labeling.

Overview

The generate() function is the main method of the DetectionDataset class. It combines image generation with object detection to create labeled datasets automatically from text prompts.

How It Works

1. Generate Images: create images from the text prompt
2. Detect Objects: run OwlViT object detection on each generated image
3. Filter Labels: keep only detections matching the specified labels
4. Save Dataset: export the annotations in YOLO or COCO format
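Steps 3 and 4 can be sketched in plain Python. The detection tuples below are illustrative stand-ins for model output, not real cvpal or OwlViT API values:

```python
# Hypothetical detections: (label, (center_x, center_y, width, height), score).
detections = [
    ("cat", (0.50, 0.30, 0.20, 0.40), 0.91),
    ("lamp", (0.10, 0.20, 0.05, 0.15), 0.62),
    ("chair", (0.70, 0.80, 0.30, 0.20), 0.84),
]
labels = ["cat", "chair"]
class_ids = {name: i for i, name in enumerate(labels)}

# Step 3: drop every detection whose label was not requested.
kept = [(name, box) for name, box, score in detections if name in labels]

# Step 4: format each kept detection as one YOLO annotation line.
yolo_lines = [
    f"{class_ids[name]} {cx} {cy} {w} {h}" for name, (cx, cy, w, h) in kept
]
print("\n".join(yolo_lines))
```

Here the `lamp` detection is discarded because it is not in `labels`, and the surviving objects are numbered in the order the labels were given.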

Function Signature

python
def generate(
    self,
    prompt: str,
    num_images: int = 1,
    labels: List[str] = None,
    output_type: str = "yolo",
    overwrite: bool = False,
    **kwargs
) -> None:

Parameters

prompt (str) - Required

Text description of the images you want to generate. Be specific and descriptive for better results.

python
# Good prompts
prompt = "a cat sitting on a wooden chair in a living room"
prompt = "a person walking a dog in a park with trees"
prompt = "a red car parked on a street with buildings"
# Avoid vague prompts
prompt = "animals" # Too vague
prompt = "stuff" # Not descriptive enough

num_images (int) - Default: 1

Number of images to generate. Higher numbers take more time but provide more diversity.

python
# Generate single image
detection_dataset.generate("a cat", num_images=1)
# Generate multiple images
detection_dataset.generate("a cat", num_images=5)
# Generate large batch
detection_dataset.generate("a cat", num_images=20)

labels (List[str]) - Optional

List of object labels to detect and keep. Only objects matching these labels will be included in the dataset.

python
# Single label
detection_dataset.generate("a cat", labels=["cat"])
# Multiple labels
detection_dataset.generate("a cat on a chair", labels=["cat", "chair"])
# No labels (detect all objects)
detection_dataset.generate("a cat on a chair") # Will detect all objects

output_type (str) - Default: "yolo"

Format for saving the dataset. Choose between YOLO and COCO formats.

python
# YOLO format (individual .txt files)
detection_dataset.generate("a cat", output_type="yolo")
# COCO format (single .json file)
detection_dataset.generate("a cat", output_type="coco")

overwrite (bool) - Default: False

Whether to overwrite existing files. Set to True to replace existing datasets.

python
# Append to existing dataset
detection_dataset.generate("a cat", overwrite=False)
# Replace existing dataset
detection_dataset.generate("a cat", overwrite=True)

Basic Examples

Simple Single Image Generation

python
from cvpal.generate import DetectionDataset
# Initialize dataset
detection_dataset = DetectionDataset()
# Generate single image
detection_dataset.generate(
    prompt="a cat looking at the camera",
    num_images=1,
    labels=["cat"],
    output_type="yolo"
)

Multiple Images with Multiple Labels

python
# Generate multiple images with multiple objects
detection_dataset.generate(
    prompt="a person walking a dog in a park",
    num_images=5,
    labels=["person", "dog"],
    output_type="yolo",
    overwrite=False
)

COCO Format Export

python
# Generate dataset in COCO format
detection_dataset.generate(
    prompt="a car parked on a street with buildings",
    num_images=10,
    labels=["car", "building"],
    output_type="coco",
    overwrite=False
)

Advanced Usage

Batch Processing

Generate multiple datasets with different prompts:

python
# Generate multiple datasets
prompts = [
    "a cat sitting on a chair",
    "a dog running in a park",
    "a person riding a bicycle"
]
for i, prompt in enumerate(prompts):
    detection_dataset.generate(
        prompt=prompt,
        num_images=3,
        labels=["cat", "dog", "person"][i:i+1],  # Single label per prompt
        output_type="yolo",
        overwrite=False
    )

Quality Control Workflow

Generate, check quality, and clean up dataset:

python
# Generate initial dataset
detection_dataset.generate(
    prompt="a cat on a chair",
    num_images=10,
    labels=["cat", "chair"],
    output_type="yolo"
)
# Check for empty images
empty_images = detection_dataset.isnull()
print(f"Empty images: {len(empty_images)}")
# Remove empty images
if len(empty_images) > 0:
    detection_dataset.dropna()
# Visualize samples
detection_dataset.show_samples(num_samples=3)

Output Structure

YOLO Format

Each image gets a corresponding .txt file with normalized coordinates:

text
# Example: image001.jpg -> image001.txt
# Format: class_id center_x center_y width height
0 0.5 0.3 0.2 0.4 # cat at center
1 0.7 0.8 0.3 0.2 # chair at bottom right
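The normalized values come from dividing pixel coordinates by the image dimensions. A minimal sketch of that conversion, assuming a pixel box given as (x_min, y_min, x_max, y_max):

```python
def to_yolo(box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to
    normalized YOLO (center_x, center_y, width, height)."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return cx, cy, w, h

# A 128x192-pixel box in a 640x480 image maps to the cat line above:
print(to_yolo((256, 48, 384, 240), 640, 480))  # → (0.5, 0.3, 0.2, 0.4)
```

Because every value is divided by the image size, the same annotation remains valid if the image is later resized.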

COCO Format

Single JSON file with comprehensive metadata:

json
{
  "images": [...],
  "annotations": [...],
  "categories": [
    {"id": 0, "name": "cat"},
    {"id": 1, "name": "chair"}
  ]
}
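A file of this shape can be assembled with the standard json module. The fields below are the minimal COCO ones; cvpal's actual output may carry additional metadata, and the concrete values are illustrative:

```python
import json

coco = {
    "images": [
        {"id": 0, "file_name": "image001.jpg", "width": 640, "height": 480}
    ],
    "annotations": [
        {
            "id": 0, "image_id": 0, "category_id": 0,
            # Unlike YOLO, COCO boxes are absolute pixels: [x_min, y_min, width, height]
            "bbox": [256, 48, 128, 192],
            "area": 128 * 192,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 0, "name": "cat"}, {"id": 1, "name": "chair"}],
}

with open("annotations.json", "w") as f:
    json.dump(coco, f, indent=2)
```

Note the coordinate convention difference: COCO stores the top-left corner plus width and height in pixels, while YOLO stores a normalized center point.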

Best Practices

✅ Effective Prompts

  • Be specific and descriptive
  • Include context and environment
  • Mention object positions
  • Use consistent terminology
  • Avoid ambiguous descriptions

⚠️ Common Pitfalls

  • Vague prompts lead to poor detection
  • Too many objects in a single image
  • Inconsistent label naming
  • Not checking output quality
  • Overwriting without a backup

Troubleshooting

No Objects Detected

Issue: Generated images have no detected objects.

Solution: Make prompts more specific, verify that label names match the objects in the prompt, or lower the detection threshold.

Poor Image Quality

Issue: Generated images are blurry or unrealistic.

Solution: Use more descriptive prompts, increase the number of inference steps, or try a different generation model.

Memory Issues

Issue: Out of memory errors during generation.

Solution: Reduce the batch size, use smaller image dimensions, or generate in smaller chunks.