Synthetic Data Generation

DetectionDataset Class

The main class for generating synthetic datasets with object detection capabilities.

Overview

The DetectionDataset class is the core component of cvPal's synthetic data generation system. It combines image generation with object detection to create labeled datasets automatically.

Key Features

• Image Generation - Create images from text prompts
• Object Detection - Automatically detect and label objects
• Dataset Export - Export in YOLO or COCO format
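
The two export formats store bounding boxes differently, which matters when labels are inspected or post-processed by hand. The sketch below is independent of cvPal and only illustrates the general conventions: COCO boxes are [x_min, y_min, width, height] in pixels, while YOLO label lines hold a class id followed by a center-based box normalized to the image size. The helper names are hypothetical and are not part of the cvPal API.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] (pixels)
    to a YOLO box (x_center, y_center, width, height) normalized to 0..1."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )


def yolo_to_coco(box, img_w, img_h):
    """Convert a normalized YOLO box back to a COCO pixel box."""
    x_center, y_center, w, h = box
    box_w, box_h = w * img_w, h * img_h
    return [x_center * img_w - box_w / 2, y_center * img_h - box_h / 2, box_w, box_h]


def yolo_label_line(class_id, box):
    """Format one line of a YOLO .txt label file."""
    return f"{class_id} " + " ".join(f"{value:.6f}" for value in box)


if __name__ == "__main__":
    yolo_box = coco_to_yolo([100, 50, 200, 100], img_w=400, img_h=200)
    print(yolo_label_line(0, yolo_box))
```

A quick round trip through both helpers is a cheap way to confirm that exported labels use the convention a downstream training tool expects.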

Initialization

Create a new DetectionDataset instance with your preferred model:

Basic Initialization

python

from cvpal.generate import DetectionDataset

# Initialize with Stable Diffusion (default)
detection_dataset = DetectionDataset()

# Initialize with specific model
detection_dataset = DetectionDataset(model="stable-diffusion")

# Initialize with DALL-E (requires API key)
detection_dataset = DetectionDataset(
    model="dall-e",
    openai_api_key="your-api-key-here"
)

📝 Parameters

• model - Image generation backend: "stable-diffusion" (default) or "dall-e"
• openai_api_key - OpenAI API key; required only when model is "dall-e"

🎯 Supported Models

• stable-diffusion - Free, local processing
• dall-e - Premium quality, API required
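
Hard-coding an API key in a script is easy to leak, so one option is to read it from the environment and build the constructor arguments from there. This is a minimal sketch: the helper build_init_kwargs and the OPENAI_API_KEY variable name are assumptions made for illustration, and only the model and openai_api_key parameter names come from the documentation above.

```python
import os

# Model names listed in the documentation above.
SUPPORTED_MODELS = ("stable-diffusion", "dall-e")


def build_init_kwargs(model="stable-diffusion", env=None):
    """Build keyword arguments for DetectionDataset (hypothetical helper).

    The API key is read from the OPENAI_API_KEY environment variable,
    which is an assumed naming choice, not something cvPal requires.
    """
    env = os.environ if env is None else env
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model: {model!r}")

    kwargs = {"model": model}
    if model == "dall-e":
        key = env.get("OPENAI_API_KEY")
        if not key:
            raise RuntimeError("Set OPENAI_API_KEY before using the dall-e model")
        kwargs["openai_api_key"] = key
    return kwargs


if __name__ == "__main__":
    print(build_init_kwargs("stable-diffusion"))
```

The result can then be unpacked into the constructor, for example DetectionDataset(**build_init_kwargs("dall-e")).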

Main Methods

generate() - Main Generation Function

Generate synthetic images with automatic object detection and labeling:

python

# Generate synthetic dataset
detection_dataset.generate(
    prompt="a cat looking at the camera",
    num_images=5,
    labels=["cat"],
    output_type="yolo",
    overwrite=False
)

add_labels() - Add Labels to Dataset

Add more labels to an existing dataset:

python

# Add labels to existing dataset
detection_dataset.add_labels(["dog", "person", "car"])

show_samples() - Visualize Samples

Display generated samples with bounding boxes:

python

# Show sample images
detection_dataset.show_samples(num_samples=3)

Quality Control Methods

isnull() - Check for Empty Detections

Identify images with no detected objects:

python

# Check for empty detections
empty_images = detection_dataset.isnull()
print(f"Found {len(empty_images)} images with no detections")

dropna() - Remove Empty Images

Remove images that have no detected objects:

python

# Remove images with no detections
detection_dataset.dropna()
print("Removed images with no detections")
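
To make the idea behind isnull() and dropna() concrete, the same check can be sketched directly on a YOLO-style export. This is not cvPal's implementation. It assumes a layout with one .txt label file per image, separate labels and images directories, and .png images; all of these, and the function names, are hypothetical.

```python
from pathlib import Path


def find_empty_labels(labels_dir):
    """Return label files that contain no annotation lines."""
    empty = []
    for path in sorted(Path(labels_dir).glob("*.txt")):
        if not path.read_text().strip():
            empty.append(path)
    return empty


def drop_empty(labels_dir, images_dir, image_ext=".png"):
    """Delete empty label files and their matching images.

    Returns the stems (file names without extension) that were removed.
    """
    removed = []
    for label_path in find_empty_labels(labels_dir):
        image_path = Path(images_dir) / (label_path.stem + image_ext)
        label_path.unlink()
        if image_path.exists():
            image_path.unlink()
        removed.append(label_path.stem)
    return removed
```

Because drop_empty deletes files, it is safer to run find_empty_labels first and review the list before removing anything.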

Complete Example

A complete workflow using the DetectionDataset class:

python

from cvpal.generate import DetectionDataset

# 1. Initialize the dataset
detection_dataset = DetectionDataset(model="stable-diffusion")

# 2. Generate initial dataset
detection_dataset.generate(
    prompt="a cat sitting on a chair",
    num_images=10,
    labels=["cat", "chair"],
    output_type="yolo",
    overwrite=False
)

# 3. Add more labels
detection_dataset.add_labels(["person", "dog"])

# 4. Generate more diverse images
detection_dataset.generate(
    prompt="a person walking a dog in the park",
    num_images=5,
    labels=["person", "dog"],
    output_type="yolo",
    overwrite=False
)

# 5. Check for quality issues
empty_images = detection_dataset.isnull()
if len(empty_images) > 0:
    print(f"Found {len(empty_images)} empty images")
    detection_dataset.dropna()

# 6. Visualize results
detection_dataset.show_samples(num_samples=5)

print("Dataset generation complete!")

Best Practices

✅ Recommended Workflow

• Start with small batches (5-10 images)
• Use descriptive, specific prompts
• Check quality with show_samples()
• Remove empty images with dropna()
• Use consistent label names
• Save progress frequently
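
The small-batch advice above can be sketched as a helper that splits a target image count into batch sizes. The function plan_batches is hypothetical and only does the arithmetic; it does not call cvPal.

```python
def plan_batches(total_images, batch_size=5):
    """Split a target image count into a list of batch sizes."""
    if total_images < 0:
        raise ValueError("total_images must be non-negative")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    full_batches, remainder = divmod(total_images, batch_size)
    return [batch_size] * full_batches + ([remainder] if remainder else [])


if __name__ == "__main__":
    for size in plan_batches(12, batch_size=5):
        print(f"Next batch: {size} images")
```

Each size could be passed as num_images to generate(), with a show_samples() review between batches so that a weak prompt is caught before many images are produced.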

⚠️ Common Issues

• Vague prompts, which lead to poor detection
• Too many objects in a single image
• Inconsistent label naming
• Skipping the check for empty images
• Unintentionally overwriting existing datasets
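
Inconsistent label naming is the easiest of these issues to guard against in code. A minimal sketch, assuming labels are plain strings: the hypothetical helper below lowercases names, collapses whitespace, and drops duplicates and empty entries while preserving order.

```python
def normalize_labels(labels):
    """Return labels lowercased, whitespace-normalized, and de-duplicated."""
    seen = set()
    result = []
    for label in labels:
        # Collapse inner whitespace and trim the ends before comparing.
        name = " ".join(label.strip().lower().split())
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


if __name__ == "__main__":
    print(normalize_labels(["Cat", " cat ", "Dog", "traffic  light", ""]))
```

Running the same normalization on every list passed as labels keeps "Cat" and "cat" from ending up as two separate classes.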
