Apple has released MGIE, short for MLLM-Guided Image Editing, an AI model that edits images from natural language instructions. Developed in collaboration with researchers from the University of California, Santa Barbara, MGIE uses multimodal large language models (MLLMs) to interpret user commands and carry out pixel-level manipulations on images. The release is a notable public step for Apple in AI research.
Unlike conventional photo editing software, where users manipulate images manually, MGIE lets users describe the edits they want in plain language. Tasks such as cropping, resizing, flipping, and applying filters can be requested through text prompts alone.
The model’s capabilities extend beyond basic edits; it can also modify specific objects within an image or improve the image’s overall quality. MGIE uses the MLLM to turn a short user instruction into more explicit guidance for the editing step, with the aim of making the resulting edits more accurate.
For instance, the instruction “make the sky more blue” can be expanded into explicit guidance to increase the saturation of the sky region by a specified percentage. The model can also perform global adjustments to brightness, contrast, sharpness, and color balance, and apply artistic effects such as sketching and painting.
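To make the sky example concrete, the following minimal Python sketch shows what increasing the saturation of one region means at the pixel level. It is an illustration only and not MGIE’s implementation: MGIE produces its edits with a learned model, whereas this sketch applies a fixed formula. The function name boost_saturation, the boolean mask marking the sky, and the factor of 1.2 are all assumptions chosen for the example.

```python
import colorsys


def boost_saturation(pixels, mask, factor=1.2):
    """Return a copy of `pixels` with saturation scaled where `mask` is True.

    pixels: list of rows, each a list of (r, g, b) tuples with 0-255 ints.
    mask:   list of rows of booleans with the same shape (hypothetical
            "sky" selection; a real system would have to infer this).
    factor: saturation multiplier, e.g. 1.2 for a 20% increase (assumed).
    """
    out = []
    for row, mask_row in zip(pixels, mask):
        new_row = []
        for (r, g, b), selected in zip(row, mask_row):
            if selected:
                # Convert to HSV so saturation can be changed independently
                # of hue and brightness.
                h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
                s = min(1.0, s * factor)  # clamp to the valid range
                r, g, b = (round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))
            new_row.append((r, g, b))
        out.append(new_row)
    return out


# Tiny two-pixel "image": the first pixel is treated as sky, the second is not.
image = [[(100, 150, 200), (100, 150, 200)]]
sky_mask = [[True, False]]
edited = boost_saturation(image, sky_mask, factor=1.2)
```

Only the masked pixel changes; the unmasked pixel is copied through untouched, which is the sense in which such an edit is local to the sky.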
MGIE is available as an open-source project on GitHub, where users can access the code, data, and pre-trained models. A web demo hosted on Hugging Face Spaces also lets users try MGIE’s capabilities online.
Beyond its practical utility, MGIE is an advance in instruction-based image editing, giving users a natural-language interface for creative work. It also reflects Apple’s investment in AI research and development, in line with CEO Tim Cook’s vision of integrating more AI features into Apple devices.
While MGIE is a step forward for assistive AI in creative tasks, experts emphasize that multimodal AI systems still need ongoing refinement and improvement. Even so, the release of MGIE points toward AI becoming a routine tool for creative work.