Date: 2024-07-15

Pinterest is working on its own AI text-to-image generation technology, but with a twist compared to other applications. According to the Pinterest Engineering team, their ‘Canvas’ model is designed to generate backgrounds for product images without altering the main product itself.

This approach requires more specialised training. While most AI models generate images by matching text descriptions with visual outputs, product images usually lack detailed background descriptions. Thus, Pinterest has developed a method to separate and manipulate the background and foreground with simple commands.
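The core constraint, changing the background while leaving the product untouched, can be illustrated with simple mask-based compositing. The following is a minimal sketch, not Pinterest's implementation: the function name `composite`, the list-of-tuples image representation, and the toy 2x2 data are all assumptions made for illustration. It assumes a binary mask is already available in which 1 marks product pixels.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Mask = List[List[int]]  # 1 = product (foreground), 0 = background


def composite(product: Image, mask: Mask, background: Image) -> Image:
    """Keep product pixels where the mask is 1; use the new background elsewhere."""
    if not (len(product) == len(mask) == len(background)):
        raise ValueError("product, mask, and background must have the same height")
    result: Image = []
    for prod_row, mask_row, bg_row in zip(product, mask, background):
        if not (len(prod_row) == len(mask_row) == len(bg_row)):
            raise ValueError("all rows must have the same width")
        # Per pixel: the mask decides which source image the pixel comes from.
        result.append([p if m else b for p, m, b in zip(prod_row, mask_row, bg_row)])
    return result


# Hypothetical toy data: one red product pixel on a white studio backdrop.
RED, WHITE, BLUE = (255, 0, 0), (255, 255, 255), (0, 0, 255)
product = [[RED, WHITE], [WHITE, WHITE]]
mask = [[1, 0], [0, 0]]
new_background = [[BLUE, BLUE], [BLUE, BLUE]]

out = composite(product, mask, new_background)
```

In this toy case the red product pixel is carried over unchanged, and every other pixel comes from the replacement background. In a real system the replacement background would come from a generative model rather than a fixed image.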

As per Pinterest, “Training Pinterest Canvas gives us a strong base model that understands what objects look like, what their names are, and how they are typically composed into scenes. However, as previously stated, our goal is training models that can visualise or reimagine real ideas or products in new contexts.”

Pinterest plans to use its existing database of product images to establish standard framing, placement, and background types, facilitating AI-generated backgrounds. This process is complex but results in high accuracy.

The system uses a segmentation model to separate foreground from background, supplemented by detailed captions from a visual language model. The team trains a LoRA on all UNet layers for efficient fine-tuning, and fine-tunes on highly engaged promoted product images to match Pinterest's aesthetics.
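To show why a LoRA makes fine-tuning efficient, here is a small sketch of the standard low-rank update, in which a frozen weight matrix W is adapted as W + (alpha / r) * B * A, and only the small matrices A and B are trained. The layer size, rank, and scaling value below are hypothetical; the source does not state which values Pinterest uses, and the helper names are invented for this example.

```python
from typing import List

Matrix = List[List[float]]


def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight is fine-tuned."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain matrix product of x (n x k) and y (k x m)."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def lora_effective_weight(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * (B @ A), where r is the rank (number of rows of A)."""
    scale = alpha / len(a)
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Hypothetical layer: 1024 x 1024 weight adapted with rank 8.
full = full_param_count(1024, 1024)
lora = lora_param_count(1024, 1024, 8)

# Tiny rank-1 example on a 2 x 2 identity weight.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]   # d_out x r
a = [[0.5, 0.0]]     # r x d_in
adapted = lora_effective_weight(w, b, a, alpha=1.0)
```

For the hypothetical 1024 x 1024 layer, the rank-8 adapter trains 16,384 parameters instead of 1,048,576, which is 64 times fewer. The base weights stay frozen, so the adapter can be stored and swapped separately from the base model.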

Ultimately, this system allows brands to choose background styles easily by describing their preferences, with Pinterest's model providing suitable options for product shots. This concept is currently being tested with select ad partners.