Image Tagging with AI Gen

Portfolio Image
Portfolio Image

Project Information

Project Overview

The Image Tagging with AI Gen project delivers an advanced vision-language extraction pipeline designed for Pernod Ricard to automate detailed product tagging at scale. It replaces costly manual outsourcing by reducing tagging costs by 10x–50x while improving accuracy and consistency. The system supports multiple use cases, including high-volume product photo tagging (30+ attributes) and complex presentation screenshot tagging (200+ attributes). Success depends on careful setup with the client to define tags precisely and structure them into effective prompts and accepted values.

Key Features & Functionalities

Vision-Language Extraction

Leverages OpenAI GPT-4o for multi-modal extraction from images and videos, including OCR for embedded text.

Template-Driven Flexibility

Customizable JSON templates define fields, accepted values, and prompts for seamless adaptation to new tagging needs.

Intelligent Prompt Splitting

Automatically splits large extraction tasks to respect API token and image limits while maintaining field-level precision.

Double-Check & Self-Healing

Supports double-extraction with conflict resolution and automatic retries for missing or invalid fields.

Rule-Based Post-Processing

Applies YAML-defined logic to enforce business rules, derive new fields, and normalize outputs to match analytical needs.

Scalable Architecture

Modular design with batch pipelines, parameterized runs, and extensive configuration for production-scale deployment.

Architecture & Workflow

The pipeline is built with modular components that load configuration from YAML/JSON files, handle images and videos via dynamic keyframe extraction, and construct prompts tailored to each field’s accepted values. It integrates OpenAI’s GPT-4o vision models with optional OCR engines for rich text extraction. Post-processing logic enforces business rules and ensures standardized CSV output ready for downstream analytics.

The system is designed to be highly adaptable: new fields, categories, and rules can be added without changing code, making it ideal for dynamic marketing and product cataloging requirements.

Results & Business Impact

By replacing manual outsourcing with AI-driven tagging, Pernod Ricard achieved significant cost savings (10x–50x reduction) while maintaining high accuracy. The solution supports scalable, repeatable data labeling workflows that align with internal standards and marketing needs. Careful collaboration with the client's teams ensures effective tag definition and prompt design for optimal results.