Core Features

Synthetic Generation

Generate synthetic datasets with automatic object detection and labeling using AI models.

Overview

cvPal's synthetic generation feature combines state-of-the-art image generation models (DALL-E 3, Stable Diffusion) with automatic object detection (OWL-ViT) to create fully annotated datasets. Simply provide a text prompt and labels, and cvPal will generate images with precise bounding box annotations.

How It Works

1. Generate Image: an AI model creates an image from the text prompt.
2. Detect Objects: OWL-ViT identifies the requested objects in the image.
3. Create Labels: bounding box annotations are generated for each detection.
4. Save Dataset: the annotated images are exported in YOLO or COCO format.
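The four steps above can be sketched end to end. Everything below is a simplified illustration rather than cvPal's actual internals: `generate_image` and `detect_objects` are stand-ins for the diffusion model and OWL-ViT, and `to_yolo_lines` shows how detected pixel boxes become normalized YOLO annotations.

```python
# Simplified sketch of the four-step pipeline. The function names and the
# fake detector below are illustrative stand-ins, not cvPal's real code.

def generate_image(prompt):
    """Stand-in for the diffusion/DALL-E step: returns (width, height) of an image."""
    return (512, 512)

def detect_objects(image_size, labels):
    """Stand-in for OWL-ViT: returns (label, pixel bbox) pairs, bbox as (x, y, w, h)."""
    return [("cat", (100, 50, 200, 150))]

def to_yolo_lines(detections, image_size, class_names):
    """Convert pixel boxes to normalized 'class x_center y_center w h' YOLO lines."""
    img_w, img_h = image_size
    lines = []
    for label, (x, y, w, h) in detections:
        cls = class_names.index(label)
        lines.append(f"{cls} {(x + w / 2) / img_w:.6f} {(y + h / 2) / img_h:.6f} "
                     f"{w / img_w:.6f} {h / img_h:.6f}")
    return lines

def build_sample(prompt, labels):
    size = generate_image(prompt)
    detections = detect_objects(size, labels)
    return to_yolo_lines(detections, size, labels)

print(build_sample("a cat sitting on a chair", ["cat", "chair"]))
# -> ['0 0.390625 0.244141 0.390625 0.292969']
```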

Basic Usage

Get started with synthetic dataset generation using the DetectionDataset class:

Simple Example

```python
from cvpal.generate import DetectionDataset

# Initialize with Stable Diffusion (default)
dataset = DetectionDataset(model="stable-diffusion")

# Generate a dataset
dataset.generate(
    prompt="a cat sitting on a chair",
    num_images=5,
    labels=["cat", "chair"],
    output_type="yolo"
)
```

πŸ“ Parameters

  • prompt - Text description for image generation
  • num_images - Number of images to generate
  • labels - List of object classes to detect
  • output_type - "yolo" or "coco" format
  • height/width - Image dimensions (default: 512x512)

🎯 Output

  • Generated images in the images/ folder
  • Corresponding labels in the labels/ folder
  • Dataset configuration in data.yaml
  • Optional COCO annotations in annotations.json

Model Selection

Choose between different image generation models based on your needs:

Stable Diffusion (Recommended)

Free, local processing with high customization options. Best for most use cases.

```python
# Stable Diffusion - No API key required
dataset = DetectionDataset(model="stable-diffusion")

# Customize generation parameters
dataset.generate(
    prompt="a dog playing in the park",
    num_images=10,
    labels=["dog", "park", "tree"],
    height=768,
    width=768,
    seed=42,
    output_type="yolo"
)
```

DALL-E 3

Premium quality images with excellent prompt understanding. Requires OpenAI API key.

```python
# DALL-E 3 - Requires OpenAI API key
dataset = DetectionDataset(
    model="dalle",
    openai_api_key="your-openai-api-key"
)

# Generate high-quality images
dataset.generate(
    prompt="a professional photo of a cat in a business suit",
    num_images=3,
    labels=["cat", "suit"],
    output_type="coco"
)
```

Advanced Features

Dataset Management

Manage and extend your generated datasets with built-in utilities:

Add Labels to Existing Dataset

```python
# Add new labels to existing images
dataset.add_labels(labels=["person", "car"])

# This will:
# - Run detection on all existing images
# - Add new annotations to the label files
# - Update data.yaml with the new classes
```

Quality Control

```python
# Check for images with no detections
dataset.isnull()

# Remove images with no detections
dataset.dropna()

# Visualize samples
dataset.show_samples(num_samples=5, annotation_type="yolo")
```
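Conceptually, `isnull()` and `dropna()` resemble scanning a YOLO dataset for images whose label file is missing or empty and removing those pairs. The sketch below illustrates that idea only; the directory layout and function names are assumptions, not cvPal's implementation.

```python
from pathlib import Path

def find_empty(dataset_dir):
    """Return image stems with no annotations (missing or empty label file)."""
    images = Path(dataset_dir, "images")
    labels = Path(dataset_dir, "labels")
    empty = []
    for img in sorted(images.iterdir()):
        lbl = labels / (img.stem + ".txt")
        if not lbl.exists() or lbl.read_text().strip() == "":
            empty.append(img.stem)
    return empty

def drop_empty(dataset_dir):
    """Delete image/label pairs that have no annotations."""
    for stem in find_empty(dataset_dir):
        for folder, ext in (("images", ".jpg"), ("labels", ".txt")):
            path = Path(dataset_dir, folder, stem + ext)
            if path.exists():
                path.unlink()
```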

Batch Processing

Generate multiple datasets by looping over a list of prompts:

```python
# Generate multiple datasets
prompts = [
    "a cat sitting on a chair",
    "a dog playing with a ball",
    "a bird flying over trees"
]

for i, prompt in enumerate(prompts):
    dataset = DetectionDataset(model="stable-diffusion")
    dataset.generate(
        prompt=prompt,
        num_images=5,
        labels=["animal", "object"],
        output_type="yolo",
        overwrite=True
    )
    print(f"Generated dataset {i+1}/3")
```
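The loop above runs one prompt at a time. If generation is I/O-bound (for example, API-backed models such as DALL-E), a thread pool can overlap requests. This is a generic Python pattern, not a documented cvPal feature; `generate_one` below is a stand-in for the per-prompt `DetectionDataset(...).generate(...)` call.

```python
from concurrent.futures import ThreadPoolExecutor

def generate_one(prompt):
    """Stand-in for DetectionDataset(...).generate(prompt=prompt, ...)."""
    return f"dataset for: {prompt}"

prompts = [
    "a cat sitting on a chair",
    "a dog playing with a ball",
    "a bird flying over trees",
]

# Overlap the per-prompt work across threads
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(generate_one, prompts))

print(results)
```

Note that this helps for network-bound generation; local Stable Diffusion on a single GPU gains little from threads, since the GPU is already the bottleneck.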

Output Formats

Choose the output format that best fits your training pipeline:

YOLO Format

Each image gets a corresponding .txt file with normalized coordinates.

```text
# image_001.txt
0 0.5 0.3 0.2 0.4     # cat: class_id x_center y_center width height
1 0.7 0.6 0.15 0.3    # chair: class_id x_center y_center width height
```

The inline comments are for illustration only; real YOLO label files contain just the five numbers per line.

Best for: YOLOv5, YOLOv8, custom detection models
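To sanity-check generated labels, a YOLO line can be converted back to pixel coordinates. A minimal sketch (the function name is illustrative, not part of cvPal's API):

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, width, height) in pixels."""
    cls, xc, yc, w, h = line.split()
    cls = int(cls)
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    pw, ph = w * img_w, h * img_h
    return cls, xc * img_w - pw / 2, yc * img_h - ph / 2, pw, ph

# The "cat" line from the example above, for a 512x512 image
print(yolo_to_pixels("0 0.5 0.3 0.2 0.4", 512, 512))
```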

COCO Format

Single JSON file with comprehensive metadata and annotations.

```json
{
  "images": [
    {"id": 1, "file_name": "image_001.jpg", "width": 512, "height": 512}
  ],
  "annotations": [
    {"id": 1, "image_id": 1, "category_id": 1, "bbox": [100, 50, 200, 150], "area": 30000}
  ],
  "categories": [
    {"id": 1, "name": "cat"}
  ]
}
```

Best for: Detectron2, MMDetection, COCO evaluation
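The COCO structure above can be assembled programmatically. Note that `bbox` is `[x_min, y_min, width, height]` in pixels and `area` is width times height (200 * 150 = 30000 in the example). A minimal sketch, with an assumed helper name:

```python
import json

def make_coco(file_name, img_size, detections, categories):
    """Build a minimal COCO dict from (category_id, [x, y, w, h]) detections."""
    width, height = img_size
    return {
        "images": [{"id": 1, "file_name": file_name, "width": width, "height": height}],
        "annotations": [
            {"id": i + 1, "image_id": 1, "category_id": cat_id,
             "bbox": bbox, "area": bbox[2] * bbox[3]}
            for i, (cat_id, bbox) in enumerate(detections)
        ],
        "categories": [{"id": cid, "name": name} for cid, name in categories],
    }

coco = make_coco("image_001.jpg", (512, 512),
                 [(1, [100, 50, 200, 150])], [(1, "cat")])
print(json.dumps(coco, indent=2))
```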

Best Practices

✅ Effective Prompts

  • Be specific about object types and positions
  • Include environmental context
  • Use descriptive adjectives
  • Mention lighting and style preferences
  • Avoid ambiguous descriptions

Good: "a black cat sitting on a wooden chair in a living room"
Bad: "cat and chair"

🎯 Label Strategy

  • Use consistent naming conventions
  • Include all objects you want to detect
  • Consider hierarchical labels
  • Test detection threshold (0.1-0.3)
  • Validate annotations manually

Example: ["person", "car", "traffic_light", "road"]

Performance Tips

🚀 Speed

  • Use GPU acceleration
  • Reduce inference steps
  • Lower detection threshold
  • Use parallel processing
  • Consider smaller image sizes

💾 Memory

  • Process images in batches
  • Use torch.float16 precision
  • Clear GPU cache regularly
  • Monitor memory usage
  • Use CPU fallback if needed

🎨 Quality

  • Use higher resolution images
  • Increase inference steps
  • Fine-tune detection threshold
  • Use diverse prompts
  • Validate with manual inspection

Troubleshooting

No Objects Detected

If OWL-ViT doesn't detect objects in generated images:

  • Lower the detection threshold (try 0.05-0.1)
  • Use more specific labels in your prompt
  • Ensure labels match objects in the image
  • Try different image generation models

Poor Image Quality

For better image generation results:

  • Increase inference steps (50-100)
  • Use higher resolution (768x768 or 1024x1024)
  • Improve prompt specificity
  • Consider using DALL-E 3 for premium quality

Memory Issues

If you encounter GPU memory errors:

  • Reduce batch size or image count
  • Use smaller image dimensions
  • Enable CPU fallback
  • Clear GPU cache between generations