Leverage AI
Build A Voice Activated Telegram AI Agent In N8N, OpenAI Breaks The Internet With Advanced Image Generation, Amazon Nova Act AI Browser Agent

Owain Lewis
April 01, 2025

Hey friend,

Lots of exciting stuff this week. Today, we’ll explore:

How to build a voice-activated AI agent in N8N
How to create the Studio Ghibli-style images that are taking over the internet
Amazon’s Nova Act AI browser agent
Gemini 2.5 Pro
The newest update to Ideogram (my favourite AI image generator)

Let's dive in.

Tutorial: Build A Voice Activated Telegram AI Agent In N8N

Last week I published a tutorial showing how to build your own AI agent in N8N that you can talk to via Telegram. The same pattern works for any chat- or voice-activated AI agent.
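If you prefer to see the idea in code, here is a rough Python sketch of the general shape: take an incoming Telegram message, turn voice into text if needed, pass the text to an agent, and build a reply. This is not the N8N workflow from the tutorial. The `message.chat.id`, `message.text` and `message.voice.file_id` fields follow the Telegram Bot API, but `handle_update`, `transcribe` and `agent` are hypothetical names I made up for illustration, and the two stubs at the bottom stand in for a real speech-to-text service and a real LLM call.

```python
from typing import Callable


def handle_update(
    update: dict,
    transcribe: Callable[[str], str],
    agent: Callable[[str], str],
) -> dict:
    """Route one Telegram update to the agent and return a reply payload.

    transcribe(file_id) and agent(prompt) are hypothetical helpers supplied
    by the caller; this function only decides which one to call.
    """
    message = update.get("message", {})
    chat_id = message.get("chat", {}).get("id")

    if "voice" in message:
        # Voice note: Telegram gives us a file_id, which we hand to the
        # (assumed) transcription helper to get plain text back.
        prompt = transcribe(message["voice"]["file_id"])
    elif "text" in message:
        # Ordinary chat message: use the text as the prompt directly.
        prompt = message["text"]
    else:
        # Anything else (photos, stickers, ...) is out of scope here.
        return {"chat_id": chat_id, "text": "Sorry, I only handle text and voice."}

    # Shape matches the chat_id/text parameters of Telegram's sendMessage.
    return {"chat_id": chat_id, "text": agent(prompt)}


# Stub helpers so the sketch runs without any network access.
def fake_transcribe(file_id: str) -> str:
    return "what is on my calendar today"


def echo_agent(prompt: str) -> str:
    return f"You asked: {prompt}"


text_update = {"message": {"chat": {"id": 1}, "text": "hello"}}
voice_update = {"message": {"chat": {"id": 1}, "voice": {"file_id": "abc"}}}

text_reply = handle_update(text_update, fake_transcribe, echo_agent)
voice_reply = handle_update(voice_update, fake_transcribe, echo_agent)
```

Swapping the stubs for a real transcription call and a real model call is where most of the actual work lives; the routing itself stays this small.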

If you have any topics you’d like me to cover on the channel, feel free to reply to this email.

AI News

Here’s my pick of AI news this week:

1. OpenAI Breaks The Internet With Advanced Image Generation In ChatGPT

OpenAI just baked some seriously impressive image generation right into ChatGPT with GPT-4o, letting you create and tweak visuals directly through conversation. The results are grabbing attention, especially for how well the model handles text within images.

Example image generated with GPT-4o.

What this is: Image generation is now a native capability built directly into the GPT-4o language model, enabling seamless, conversation-based image creation and editing.
Why it matters: This tight integration allows for better contextual understanding from your chat history and makes refining images or maintaining style consistency across multiple generations much easier than with separate tools.
Interesting detail: GPT-4o is now particularly good at accurately rendering text within images, making it great for creating diagrams, signs, logos, or even complex visuals like UI mockups and ad campaigns based on uploaded product photos.

images in chatgpt are wayyyy more popular than we expected (and we had pretty high expectations).
rollout to our free tier is unfortunately going to be delayed for awhile.
— Sam Altman (@sama)
8:55 PM • Mar 26, 2025

2. Amazon Unveils Nova Act AI Browser Agent

Amazon just unveiled Nova Act, an AI agent built to autonomously handle tasks for you directly within your web browser. It's now out as a research preview in the US, marking Amazon's big push into AI that takes action.

Nova Act is basically an AI assistant that lives in your browser, capable of tackling multi-step jobs like shopping online, filling out forms, or scheduling appointments all on its own.
This is Amazon's major entry into the competitive AI agent arena, shifting the focus from chatbots to AI that actually does things, and it's set to eventually power features in the upcoming Alexa+.
It's powered by five distinct Nova models, boasts a 94% score on a key web benchmark (beating some rivals), and Amazon even claims its models are 75% cheaper than competing options.

3. Google Launches Gemini 2.5 Pro: Its Most Advanced AI

Google just rolled out Gemini 2.5 Pro, their smartest AI model yet, boosting its ability to reason through problems and work with all sorts of data like text, code, and video. It's designed to think more deeply and handle way more information than before.

Gemini 2.5 Pro is Google's new flagship AI, capable of processing text, images, audio, video, and code together, and it's specifically built to analyse information step-by-step for better accuracy.
This matters because it sets new performance records on tough reasoning benchmarks in areas like math and science, and its huge 1 million token context window lets it analyse entire codebases or massive documents without losing track.
Beyond benchmarks, it's already #1 on the human preference leaderboard (LMArena), boasts strong coding abilities, and is available now via Google AI Studio and the Gemini app for Advanced users, with plans for wider access.

Note: I have been seriously impressed by this new model for my own use cases!!

4. Ideogram Launches Ideogram 3.0 With Style References

Ideogram (my favourite AI image generator) just rolled out Ideogram 3.0, a major update that seriously boosts creative control and image quality. It makes generating precise, high-quality visuals easier, especially for design and marketing tasks.

What this is: An advanced AI tool for creating impressive images from text prompts, featuring significantly improved photorealism and much clearer text rendering within the generated visuals.
Why it matters: This update gives users unprecedented control over the style and aesthetic, especially with the new Style References feature, making sophisticated image generation more accessible to everyone.
Interesting detail: You can now upload up to three reference images to guide the AI's style, allowing for precise aesthetic matching for things like branding or specific creative projects.

How To Use The New ChatGPT Image Generator

Everyone is having fun with the new ChatGPT image generator, creating Studio Ghibli-style images. To use it:

Go to chatgpt.com
Select GPT-4o as the model
Upload an image to modify, or write your own prompt
Once you have generated an image, keep refining it in the same conversation

Example: “Create an image of a cat in Studio Ghibli style”

Your Opinion Matters

What did you think of today’s email? Your feedback helps me create better emails for you!

Got more feedback or want me to cover a specific topic? Reply to this email and let me know.

Owain