MajjiKishore

For this project, I explored how to use AI to automate the creation of high-quality advertising images. The goal was to build a tool that could take separate images of a product, a model, and a background, and then use AI to merge them into a single, photorealistic fashion advertisement.

My Development Approach

I designed a three-step pipeline around Google's Gemini API: describe each input image in text, combine those descriptions into a single prompt, and generate a new image from that prompt.

1. Deconstructing the Scene

My first challenge was to teach the AI to "see" and understand the individual components of a potential ad. I used the Gemini API's vision capabilities to analyze the images I provided:

It extracts a detailed description of the product (e.g., "a red leather handbag with a gold chain strap").

It describes the model's appearance and pose.

It captures the mood and setting of the background image.
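The description step can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the instruction wording, the `ROLE_INSTRUCTIONS` table, and the `describe_components` function are all hypothetical names. The `describe` argument stands in for whatever function sends an image plus an instruction to the Gemini API and returns the model's text, so the sketch itself makes no API calls.

```python
from typing import Callable, Dict

# Hypothetical instruction text for each input image; the wording is an assumption.
ROLE_INSTRUCTIONS: Dict[str, str] = {
    "product": "Describe this product in detail: type, color, material, notable features.",
    "model": "Describe this person's appearance, clothing, and pose.",
    "background": "Describe the setting, lighting, and mood of this scene.",
}


def describe_components(
    image_paths: Dict[str, str],
    describe: Callable[[str, str], str],
) -> Dict[str, str]:
    """Return one text description per role (product, model, background).

    `describe(image_path, instruction)` is supplied by the caller; in the real
    pipeline it would wrap a vision request to the Gemini API.
    """
    missing = set(ROLE_INSTRUCTIONS) - set(image_paths)
    if missing:
        raise ValueError(f"missing images for: {sorted(missing)}")
    return {
        role: describe(image_paths[role], instruction).strip()
        for role, instruction in ROLE_INSTRUCTIONS.items()
    }
```

Passing the vision call in as a function keeps the three role-specific instructions in one place and makes the step easy to try out with a stub before spending API quota.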

2. Crafting the Perfect Prompt

Once I had the text descriptions, the next step was to combine them into a single, effective instruction for the image generator. This was a fun prompt engineering challenge. I wrote a Python script that weaves the product, model, and background descriptions into one cohesive scene description, so that the final output is a deliberately composed scene rather than a random combination of elements.
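One way to combine the descriptions is a simple template, sketched below. The template wording and the `build_ad_prompt` name are assumptions made for illustration; the actual script may phrase and order things differently.

```python
def build_ad_prompt(descriptions: dict) -> str:
    """Combine the three descriptions into one image-generation prompt.

    Expects the keys "product", "model", and "background".
    """
    # Drop trailing periods so the template controls the punctuation.
    product = descriptions["product"].strip().rstrip(".")
    model = descriptions["model"].strip().rstrip(".")
    background = descriptions["background"].strip().rstrip(".")
    return (
        "A photorealistic fashion advertisement. "
        f"The model: {model}. "
        f"The model is presenting the product: {product}. "
        f"The setting: {background}. "
        "Lighting and perspective are consistent across the whole scene."
    )
```

Keeping the template in a single function makes it straightforward to change one sentence and compare the images that result.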

3. Bringing the Image to Life

With the detailed prompt ready, I fed it back into the Gemini API to generate the final image. The script automatically saves the generated image and a JSON file containing all the descriptions used to create it. This makes it easy to track how different prompts affect the final result.
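The saving step might look like the sketch below. It assumes the generated image arrives as raw bytes and is written as a PNG; the `save_result` name, the `output` directory, and the timestamp-based file naming are illustrative choices, not details taken from the repository.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_result(image_bytes: bytes, descriptions: dict, prompt: str,
                out_dir: str = "output") -> Path:
    """Write the image and a JSON sidecar with the same file stem."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Timestamped stem (assumed naming scheme); two calls within the same
    # second would overwrite each other.
    stem = datetime.now(timezone.utc).strftime("ad_%Y%m%dT%H%M%S")
    image_path = out / f"{stem}.png"
    image_path.write_bytes(image_bytes)

    # Record everything needed to trace this image back to its inputs.
    metadata = {
        "image": image_path.name,
        "prompt": prompt,
        "descriptions": descriptions,
    }
    (out / f"{stem}.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return image_path
```

Because the image and its JSON file share a stem, any output can be matched to the exact prompt and descriptions that produced it.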

To keep the project clean and maintainable, I organized the code logically: a main.py script to run the process, utils.py for the core functions (like description and prompt generation), and a simple config.py to manage the API key.
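As an illustration of what a small config module can look like, the sketch below reads the key from an environment variable instead of hard-coding it. The variable name `GEMINI_API_KEY` and the `get_api_key` helper are assumptions; the repository's config.py may be organized differently.

```python
import os

# Assumed environment variable name; adjust to match your own setup.
API_KEY_ENV_VAR = "GEMINI_API_KEY"


def get_api_key(env=None) -> str:
    """Return the API key, failing early with a clear message if it is unset."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {API_KEY_ENV_VAR} environment variable before running."
        )
    return key
```

Reading the key at call time rather than at import time keeps the secret out of version control and lets the other modules be imported without credentials.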

Reference: https://github.com/majjikishore007/ImageGeneration

AI-Powered Advertising Image Generation
