Quickstart

Get up and running with cvpal in a few minutes. This guide walks through the core functionality: generating synthetic images, merging datasets, managing labels, and generating reports.

Basic Import

Start by importing the main modules:

python
from cvpal import generate, preprocessing

Generate Synthetic Images

Create synthetic images from text prompts for data augmentation:

python
# Generate 10 images from a single text prompt
images = generate.synthetic_images("a cat sitting on a chair", 10)
# Generate with specific style
images = generate.synthetic_images(
    "a dog running in a park",
    5,
    style="photorealistic"
)

Merge Datasets

Combine multiple datasets into a single unified dataset:

python
# Merge multiple datasets
merged_dataset = preprocessing.merge_datasets([
    "path/to/dataset1/images",
    "path/to/dataset2/images"
])
print(f"Merged dataset contains {len(merged_dataset)} images")
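To make the idea of merging concrete, here is a minimal standard-library sketch that collects image files from several folders into one list. It is not cvpal's implementation: the helper name `collect_image_paths` and the extension set are assumptions made for illustration, and `merge_datasets` may also handle annotations, duplicates, or file copying.

```python
from pathlib import Path

# Hypothetical helper, not part of cvpal: gather image files from several folders.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of extensions


def collect_image_paths(folders):
    """Return image paths found directly inside each folder, in folder order."""
    paths = []
    for folder in folders:
        # Sort entries so the result is deterministic within each folder.
        for entry in sorted(Path(folder).iterdir()):
            if entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS:
                paths.append(entry)
    return paths
```

Calling `len()` on the returned list gives the combined image count, similar in spirit to the `print` statement above.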

Manage Labels

Replace, remove, or remap labels in your dataset:

python
# Replace labels
preprocessing.replace_labels(
    "path/to/dataset",
    {"person": "pedestrian", "car": "vehicle"}
)
# Remove specific labels
preprocessing.remove_labels("path/to/dataset", ["background", "noise"])
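The effect of these two operations can be illustrated on a plain in-memory list of label names. This is only a sketch under stated assumptions: `replace_label_names` and `remove_label_names` are hypothetical helpers, not cvpal functions, and cvpal works on dataset files rather than Python lists.

```python
# Hypothetical helpers, not part of cvpal: the same operations on an in-memory list.

def replace_label_names(labels, mapping):
    """Rename labels found in `mapping`; leave all other labels unchanged."""
    return [mapping.get(label, label) for label in labels]


def remove_label_names(labels, unwanted):
    """Drop every label whose name appears in `unwanted`."""
    unwanted = set(unwanted)
    return [label for label in labels if label not in unwanted]


labels = ["person", "car", "background", "person"]
labels = replace_label_names(labels, {"person": "pedestrian", "car": "vehicle"})
labels = remove_label_names(labels, ["background", "noise"])
print(labels)  # ['pedestrian', 'vehicle', 'pedestrian']
```

Names in the removal list that never occur (here `"noise"`) are simply ignored in this sketch.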

Generate Reports

Analyze your dataset and generate comprehensive reports:

python
# Count label occurrences
label_counts = preprocessing.count_labels("path/to/dataset")
print("Label distribution:", label_counts)
# Generate detailed report
report = preprocessing.generate_report("path/to/dataset")
print(f"Total images: {report['total_images']}")
print(f"Total labels: {report['total_labels']}")
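As a rough picture of what such a report summarizes, the sketch below counts labels over a small hand-written annotation dictionary. The `annotations` data is invented for illustration, and the dictionary keys simply mirror those used in this guide; the actual structure returned by `generate_report` may differ.

```python
from collections import Counter

# Hypothetical per-image annotations; a real dataset would be read from disk.
annotations = {
    "img_001.jpg": ["pedestrian", "vehicle"],
    "img_002.jpg": ["pedestrian"],
    "img_003.jpg": [],
}

# Count how often each label occurs across all images.
label_counts = Counter(
    label for image_labels in annotations.values() for label in image_labels
)

summary = {
    "total_images": len(annotations),
    "total_labels": sum(label_counts.values()),
    "label_counts": dict(label_counts),
}
print(summary)
```

A skewed `label_counts` in a summary like this is often the first sign that a dataset needs more examples of a rare class.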

Complete Example

Here's a complete workflow example:

python
from cvpal import generate, preprocessing
# 1. Generate synthetic images for data augmentation
synthetic_images = generate.synthetic_images(
    "person walking on street",
    50,
    style="photorealistic"
)
# 2. Merge existing datasets
merged_dataset = preprocessing.merge_datasets([
    "path/to/street_dataset",
    "path/to/pedestrian_dataset"
])
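# 3. Standardize labels
# (this assumes replace_labels accepts the object returned by merge_datasets;
# the earlier examples pass a dataset path instead)
preprocessing.replace_labels(
    merged_dataset,
    {"person": "pedestrian", "car": "vehicle"}
)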
# 4. Generate dataset report
report = preprocessing.generate_report(merged_dataset)
print(f"Final dataset contains {report['total_images']} images")
print(f"Label distribution: {report['label_counts']}")

What's Next?

With the basics covered, see the Features section for advanced functionality and the API Reference for detailed documentation of each function.