AI: Beyond the Buzz
🎙️ Tiny voice model beats giants

+ Lifelike chatbots

Tim Moylan
25 Apr • Reading Time: 8 minutes

Hey folks,

Ever sit there poking the keyboard, trying to word a prompt just right? Flip the script: tell OpenAI what you want and let it craft the perfect prompt first. Try “Give me the perfect prompt for a punchy LinkedIn post on dark-mode UX, keep it casual.” Then watch it spin up a polished prompt you can feed right back in. Cleaner wording, sharper output, zero guesswork.
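
If you find yourself doing this a lot, the trick is easy to script. Here's a minimal sketch in Python that wraps any goal in the "write me a prompt" request. The function name, the `tone` parameter, and the closing instruction are my own illustrative choices, not anything official from OpenAI.

```python
def build_meta_prompt(goal: str, tone: str = "casual") -> str:
    """Wrap a plain-language goal in a request for a polished prompt.

    Hypothetical helper: the wording is just one way to phrase the ask.
    """
    return (
        f"Give me the perfect prompt for {goal}. "
        f"Keep the tone {tone}. "
        "Reply with the prompt only, so I can paste it straight back in."
    )


meta_prompt = build_meta_prompt("a punchy LinkedIn post on dark-mode UX")
print(meta_prompt)
```

Paste the printed text into your chat, then feed the prompt it returns right back in as step two.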

Let’s dive in.

Read Time: 5 minutes

🎙️ A Tiny Open-Source TTS Model That Screams and Laughs

Two undergrads. One still in the military. Zero funding.
One ridiculous goal: build a TTS model that rivals NotebookLM Podcast, ElevenLabs Studio, and Sesame CSM.
Somehow… we pulled it off. Here’s how 👇
— Toby Kim (@_doyeob_)
11:43 PM • Apr 21, 2025

🐝 The Buzz: There’s a new text-to-speech model in town, and it’s not from a tech giant—it’s from a two-person team out of Korea. Meet Dia, a surprisingly powerful open-source TTS model that’s already making people question why ElevenLabs and OpenAI need so many resources. It's fast, it’s expressive, and yeah… it can scream, cough, and laugh better than most of us on a Monday morning.

Emotion on Lock: Dia doesn’t just read text—it performs it. It handles emotional delivery, speaker turns, and nonverbal cues like (laughs), (sighs), and even (screams) with scary-good realism.
Real-Time Friendly: You can run it on a single GPU with just 10GB of VRAM. That means it’s not just powerful—it’s actually usable in live apps without needing a server farm.
Totally Open-Source: Released under Apache 2.0, Dia is free for commercial use. The model and code are up on GitHub and Hugging Face for anyone who wants to build with it or poke around under the hood.
Tiny but Mighty: At just 1.6B parameters, it punches way above its weight class—especially compared to proprietary models that are bloated, locked down, or both.
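
Curious how 1.6B parameters squares with that 10GB figure? Here's a back-of-the-envelope sketch. It counts the model weights only, using the standard byte sizes for common number formats. Which precision Dia actually runs in isn't stated here, so treat the formats as assumptions, and note that real usage also includes activations and other runtime overhead on top of the weights.

```python
PARAMS = 1.6e9  # parameter count quoted in the post

# Bytes per parameter for common numeric formats (standard sizes).
# Assumption: which of these Dia uses is not stated in the post.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for fmt, size in BYTES_PER_PARAM.items():
    print(f"{fmt}: {weights_gb(PARAMS, size):.1f} GB")
# float32: 6.4 GB
# float16: 3.2 GB
# int8: 1.6 GB
```

Even at full 32-bit precision the weights come in under 10GB, which leaves headroom for everything else the model needs while it's generating audio.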

💡 Takeaway: Dia proves that great TTS doesn't need to be huge, expensive, or gatekept. It's fast, expressive, open, and built by two people who clearly know what they’re doing. If you're building voice into anything—or just want an AI that can yell convincingly—this one’s worth checking out.

🧑‍🦰 Character.AI’s AvatarFX: Your AI Chatbot Just Got a Face

🐝 The Buzz: Character.AI just dropped AvatarFX, a new tool that turns static images into full-on animated, talking chatbots. We’re talking lip-syncing, facial expressions, even head and body movement.

One Image = Full Personality: Upload a single photo or character illustration, and AvatarFX will animate it with speech, gestures, and movement so natural you might do a double take.
Styles for Days: You’re not stuck with just human avatars either. It works with everything from cartoon animals to fantasy characters to realistic portraits.
Powered by Diffusion + TTS: It’s running on a diffusion-based model with Character.AI’s own text-to-speech system, which makes the animations feel smooth without melting your GPU.

💡 Takeaway: AvatarFX feels like the next logical step in making AI characters feel alive. It’s fun, weirdly personal, and opens up a ton of creative potential for storytellers, roleplayers, and anyone who’s ever wanted their favourite character to talk back.

💻️ How to put yourself in a glass bottle

The next ChatGPT trend: putting yourself inside a glass bottle as an anime-style chibi figure.

Go to ChatGPT
Upload a photo of yourself
Enter the following prompt:

Generate a portrait image of a detailed, all-glass gashapon capsule held between two fingers. Inside the capsule is a miniature version of myself, chibi-style and life-size, with the same face as the person in the photo. The chibi character is wearing the same outfit as seen in the uploaded photo.

🐝 AI buzz bits

🖼️ OpenAI has launched its upgraded image generator, gpt-image-1, for developers, allowing seamless integration into apps and services, all via the API.

✨ Microsoft 365 Copilot's Wave 2 spring release introduces advanced AI features, including intuitive search, agent collaboration, and creative tools, allowing businesses to enhance productivity through seamless human-agent teamwork and personalized insights.

👁️ xAI's Grok chatbot introduces Grok Vision, enabling users to visually interact with their environment by asking questions about objects seen through their smartphone camera, similar to existing features in Google’s Gemini and ChatGPT.

📰 The Washington Post has teamed up with OpenAI to integrate its content into ChatGPT, enabling users to access summaries, quotes, and links from the publication.

🖼️ Generative Art

cyberpunk funny cool beaver, frontal view, in latex suit wearing glowing minimalistic glasses, Daft Punk style, studio key light, professional photo session, detailed costume, isolated black background, black and yellow colors

AI productivity tools
🔥 Take your workflow to the next level.

Omniflow - AI-driven platform that streamlines product development from ideation to delivery with advanced automation and insights.
404Tomb - A digital graveyard for forgotten startups, honoring lost projects and their aspirations without sorrow.
Webdraw - An innovative platform to explore, remix, and build AI apps using various models for creative projects.
Cluely - AI-driven teleprompter that empowers sales teams with real-time conversation support for winning calls.

For a full list of 1000+ AI tools, visit our Supertool Directory

🤘 TGIF

I hope you enjoyed the AI buzz today.

👉️ We need your feedback below to make better content.

👍️ Refer our newsletter to your friends and help us grow.

Cheers, Tim

What did you think of today's newsletter?

This helps me make things better.