Image Tagging with AI Gen
Project Information
- Category: AI for Business
- Client: Pernod Ricard
- Project date: 2025
- Project URL: github.com/albiche/ai_image_tagging
Project Overview
The Image Tagging with AI Gen project delivers a vision-language extraction pipeline built for Pernod Ricard to automate detailed product tagging at scale. It replaces costly manual outsourcing, cutting tagging costs by a factor of 10 to 50 while improving accuracy and consistency. The system supports multiple use cases, including high-volume product photo tagging (30+ attributes) and complex presentation screenshot tagging (200+ attributes). Success depends on careful setup with the client: tags must be defined precisely, then structured into effective prompts and accepted values.
Key Features & Functionalities
Vision-Language Extraction
Leverages OpenAI GPT-4o for multi-modal extraction from images and videos, including OCR for embedded text.
Template-Driven Flexibility
Customizable JSON templates define fields, accepted values, and prompts for seamless adaptation to new tagging needs.
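As an illustration of the template idea, here is a minimal sketch of how a JSON template could drive both prompt text and value validation. The key names (fields, name, prompt, accepted_values), the two example fields, and the helper functions are assumptions made for this example, not the project's actual schema.

```python
import json

# Hypothetical template: field names, keys, and values are illustrative only.
TEMPLATE_JSON = """
{
  "fields": [
    {"name": "bottle_visible",
     "prompt": "Is a bottle visible in the image?",
     "accepted_values": ["yes", "no"]},
    {"name": "setting",
     "prompt": "Where does the scene take place?",
     "accepted_values": ["bar", "home", "outdoor", "studio"]}
  ]
}
"""


def build_field_prompt(field):
    """Render one field as an instruction that lists its accepted values."""
    options = ", ".join(field["accepted_values"])
    return f'{field["name"]}: {field["prompt"]} Answer with one of: {options}.'


def is_valid(field, value):
    """Check a model answer against the field's accepted values."""
    return value in field["accepted_values"]


template = json.loads(TEMPLATE_JSON)
prompts = [build_field_prompt(f) for f in template["fields"]]
```

With this shape, adding a tag means adding one entry to the JSON file; the prompt builder and validator stay unchanged.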
Intelligent Prompt Splitting
Automatically splits large extraction tasks to respect API token and image limits while maintaining field-level precision.
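One simple way to split a large task is greedy batching: fields are added to a request until a token budget or a field cap would be exceeded. The sketch below assumes a rough four-characters-per-token estimate rather than a real tokenizer, and the function names and limits are hypothetical. An image cap could be handled the same way.

```python
def estimate_tokens(text):
    # Rough heuristic (about 4 characters per token); an assumption,
    # not a real tokenizer.
    return max(1, len(text) // 4)


def split_fields(field_prompts, max_tokens, max_fields):
    """Greedily group field prompts into batches that respect both limits."""
    batches, current, used = [], [], 0
    for prompt in field_prompts:
        cost = estimate_tokens(prompt)
        # Close the current batch if adding this prompt would break a limit.
        if current and (used + cost > max_tokens or len(current) >= max_fields):
            batches.append(current)
            current, used = [], 0
        current.append(prompt)
        used += cost
    if current:
        batches.append(current)
    return batches
```

A prompt that alone exceeds the budget still gets its own batch, so no field is silently dropped.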
Double-Check & Self-Healing
Supports double-extraction with conflict resolution and automatic retries for missing or invalid fields.
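The double-check idea can be sketched as two passes plus targeted retries. In this sketch a field is accepted only when both passes agree on a valid value; anything else is re-extracted. The extract callable, the agreement policy, and the retry count are assumptions for illustration, and the project may resolve conflicts differently.

```python
def reconcile(first, second, accepted):
    """Merge two extraction passes; return agreed values and fields to retry."""
    merged, retry = {}, []
    for name, allowed in accepted.items():
        a, b = first.get(name), second.get(name)
        if a == b and a in allowed:
            merged[name] = a
        else:
            # Missing, invalid, or conflicting answers all go back for a retry.
            retry.append(name)
    return merged, retry


def extract_with_retries(extract, accepted, max_retries=2):
    """`extract` is a hypothetical callable: list of field names -> dict of values."""
    names = list(accepted)
    merged, pending = reconcile(extract(names), extract(names), accepted)
    for _ in range(max_retries):
        if not pending:
            break
        subset = {name: accepted[name] for name in pending}
        fixed, pending = reconcile(extract(pending), extract(pending), subset)
        merged.update(fixed)
    return merged, pending
```

Fields still listed in pending after the last retry can be flagged for manual review instead of being written with a guessed value.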
Rule-Based Post-Processing
Applies YAML-defined logic to enforce business rules, derive new fields, and normalize outputs to match analytical needs.
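To make the rule layer concrete, the sketch below applies three kinds of rules to one extracted row: normalization, derivation, and enforcement. The rules are written as Python dictionaries in the shape a YAML file might load into; the rule types, keys, and field names are invented for this example and are not the project's rule format.

```python
# Hypothetical rules, shown as the data structure a YAML file could load into.
RULES = [
    {"type": "normalize", "field": "category",
     "map": {"whisky": "Whisky", "whiskey": "Whisky"}},
    {"type": "derive", "field": "has_people",
     "from": "people_count", "when_gt": 0},
    {"type": "enforce", "if": {"bottle_visible": "no"},
     "set": {"bottle_count": 0}},
]


def apply_rules(row, rules):
    """Return a copy of `row` with each rule applied in order."""
    out = dict(row)
    for rule in rules:
        if rule["type"] == "normalize":
            # Map spelling variants onto one canonical value.
            key = str(out.get(rule["field"], "")).strip().lower()
            out[rule["field"]] = rule["map"].get(key, out.get(rule["field"]))
        elif rule["type"] == "derive":
            # Create a new boolean field from a numeric one.
            out[rule["field"]] = out.get(rule["from"], 0) > rule["when_gt"]
        elif rule["type"] == "enforce":
            # Overwrite fields when every condition matches.
            if all(out.get(k) == v for k, v in rule["if"].items()):
                out.update(rule["set"])
    return out
```

Because the rules are data, a business rule can be changed by editing the rule file rather than the code.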
Scalable Architecture
Modular design with batch pipelines, parameterized runs, and extensive configuration for production-scale deployment.
Architecture & Workflow
The pipeline is built with modular components that load configuration from YAML/JSON files, handle images and videos via dynamic keyframe extraction, and construct prompts tailored to each field’s accepted values. It integrates OpenAI’s GPT-4o vision models with optional OCR engines for rich text extraction. Post-processing logic enforces business rules and ensures standardized CSV output ready for downstream analytics.
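For video inputs, one simple stand-in for keyframe extraction is uniform sampling with a cap on the number of frames sent to the model. The sketch below only computes which frame indices to keep. The function name, the two-second interval, and the cap of eight frames are assumptions, and the project's dynamic selection may use a different strategy.

```python
def keyframe_indices(total_frames, fps, seconds_per_frame=2.0, max_frames=8):
    """Pick evenly spaced frame indices, capped at `max_frames`."""
    if total_frames <= 0:
        return []
    # Sample one frame every `seconds_per_frame` seconds.
    step = max(1, int(round(fps * seconds_per_frame)))
    indices = list(range(0, total_frames, step))
    if len(indices) > max_frames:
        # Thin the samples evenly so the whole clip is still covered.
        stride = len(indices) / max_frames
        indices = [indices[int(i * stride)] for i in range(max_frames)]
    return indices
```

The cap keeps long videos within the per-request image limit, at the cost of coarser coverage.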
The system is designed to be highly adaptable: new fields, categories, and rules can be added without changing code, making it ideal for dynamic marketing and product cataloging requirements.
Results & Business Impact
By replacing manual outsourcing with AI-driven tagging, Pernod Ricard cut tagging costs by a factor of 10 to 50 while maintaining high accuracy. The solution supports scalable, repeatable data labeling workflows that align with internal standards and marketing needs. Close collaboration with the client's teams on tag definition and prompt design underpins these results.