Llama 3.2 Vision Image-to-Text AI Automation
Definition:
This AI automation template automates the Clean-UI interface for the Llama 3.2 11B Vision model released by Meta. You can use it to create SEO-friendly alt descriptions for all your images: in our tests, an RTX 3090 GPU generated up to 3,600 image-to-text descriptions per day. It can also be used to analyze images on the fly or for other use cases. The minimum GPU memory required is 12GB; we hope future updates will lower this requirement.
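
To put the throughput figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the peak rate of 3,600 images per day reported above holds steadily, which may not be true on other hardware or with longer prompts. The function names are illustrative and not part of the template.

```python
SECONDS_PER_DAY = 24 * 60 * 60
IMAGES_PER_DAY = 3600  # peak rate reported for an RTX 3090; treat as an upper bound


def seconds_per_image(images_per_day=IMAGES_PER_DAY):
    """Average time spent on one image at the given daily rate."""
    return SECONDS_PER_DAY / images_per_day


def estimated_hours(image_count, images_per_day=IMAGES_PER_DAY):
    """Rough number of hours needed to describe image_count images."""
    return image_count * seconds_per_image(images_per_day) / 3600
```

At that rate each image takes about 24 seconds, so a folder of 1,500 images would need roughly 10 hours.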

Requirements: Before using this template, you need to:
1) Install the Clean-UI interface: https://github.com/ThetaCursed/clean-ui
2) After installing, change into the project folder with this command: cd clean-ui
3) Activate the virtual environment with this command (Windows path shown): .\venv-ui\Scripts\activate
4) Launch the interface with this command: python clean-ui.py
5) Start this template only once the interface is ready at this URL: http://127.0.0.1:7860
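
If you want to confirm that the interface is up before starting the template, a small polling script can help. The following is a minimal sketch, not part of the template or of Clean-UI: the function name wait_until_ready and the timeout values are assumptions, and it only checks that the URL above answers with HTTP 200.

```python
import time
import urllib.request


def wait_until_ready(url="http://127.0.0.1:7860", timeout=120.0, interval=2.0):
    """Poll url until it answers with HTTP 200 or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and HTTP errors:
            # the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_until_ready() with the defaults returns True as soon as the interface responds, or False after about two minutes without a response.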

How to configure this template:

1) Update the file path to the directory that contains your images

2) Optional: You can update the prompt to get better or different results
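
Before pointing the template at a directory, it can be useful to check which files in it are images. This is a hedged sketch only: the list of extensions is an assumption, and the template itself may accept a different set of file types.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif", ".bmp"}


def list_images(directory):
    """Return the image files directly inside directory, sorted by path."""
    root = Path(directory)
    return sorted(
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

For example, len(list_images(your_directory)) gives the number of images the run would need to describe, where your_directory is the path you configured in step 1.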

This template scrapes the following data properties:
Image Path
Description
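
If you post-process the results yourself, the two properties above map naturally onto a two-column CSV file. The sketch below is an illustration only: the template's real export format is not documented here, and write_results is a hypothetical helper name.

```python
import csv

FIELDNAMES = ["Image Path", "Description"]  # the two properties listed above


def write_results(rows, csv_path):
    """Write dicts keyed by FIELDNAMES to a UTF-8 CSV file with a header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

Each row is a dict such as {"Image Path": "photos/cat.jpg", "Description": "A cat sitting on a windowsill"}, where both values are made-up examples.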