#40 - SPECIAL: GPT-X, Diffusion, and our Multimodal Future Part II
Episode 40

Multimodal by Bakz T. Future

September 22, 2022 · 1h 15m · Explicit

Show Notes

What could an AI creative tool like DALL-E 2 or Midjourney look like in the next few years?

This video explores the full range of possibilities ahead of us!

 

GPT-X, Diffusion, and our Multimodal Future (Part II) BOOK: https://docs.google.com/document/d/1TeRMqOsGX8kiK_mNiCt4yxO26Yjk6BdM/

 

Timestamps:

00:00 - Intro
05:02 - Multimodal Model Characteristics
11:26 - Creating a new image
15:56 - Prompt writing capabilities - Prompt Formatting
16:50 - Prompt autocompletion, intelligent suggestions
19:57 - Built-in prompt unbundling
21:57 - Configure Recombinants
23:45 - Recombinant blending modes
25:07 - Configure Recombinant Unbundling Details
26:12 - Recombinant Layers & Node-based
28:05 - Recombinant Import Bench
29:48 - Natural Language Image editing - refresh individual objects
31:44 - Edit mode - Natural Language Edit Prompts
32:37 - Edit prompt realtime feedback
33:05 - Natural language hex color editing support
34:35 - Advanced Natural Language Edit Prompt Changes
35:53 - Edit prompt intensity (UI controls)
36:59 - Multimodal Object Transformations
37:16 - Lighting & Camera Controls
38:00 - Advanced Editing Tools (Fix AI Weirdness and more)
39:27 - Edit images via markup (markup prompt)
40:43 - Logical Variations
43:57 - Realtime Collaboration
44:28 - Recombinant Collaborators
46:06 - Built-in Music Player
48:45 - Creative Hyperparameters
49:20 - Text Capabilities - Magic Text
50:52 - Text Capabilities - Magic Text Fill
52:41 - Latent Variation Scrubbing
53:34 - Explore Alternatives
56:34 - Productivity and commercialization
58:06 - Analysis & Feedback
59:16 - Advanced Multimodal Capabilities
1:03:45 - Offline Capabilities
1:11:11 - Advanced Workflow Support
1:11:58 - Closing thoughts

 

GPT-X, DALL-E, and our Multimodal Future (Part I):

https://www.youtube.com/playlist?list=PLza3gaByGSXjUCtIuv2x9fwkx3K_3CDmw

 

Links:

DALL-E 2 Unbundling

https://bakztfuture.substack.com/p/dall-e-2-unbundling

DALL-E 2: Recombinant Art & Design

https://bakztfuture.substack.com/p/dall-e-2-recombinant-art-and-design

DALL-E 2 - Unofficial Natural Language Image Editing, Art Critique Survey

https://bakztfuture.substack.com/p/dall-e-2-unofficial-natural-language-b14

 

Please note I do not represent, have any affiliation with, nor do I speak on behalf of OpenAI.

 

Subscribe to the Multimodal Podcast!

 

Spotify - https://open.spotify.com/show/7qrWSE7ZxFXYe8uoH8NIFV

Apple Podcasts - https://podcasts.apple.com/us/podcast/multimodal-by-bakz-t-future/id1564576820

Google Podcasts - https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkLnBvZGJlYW4uY29tL2Jha3p0ZnV0dXJlL2ZlZWQueG1s

Stitcher - https://www.stitcher.com/show/multimodal-by-bakz-t-future

Other Podcast Apps (RSS Link) - https://feed.podbean.com/bakztfuture/feed.xml

 

Connect with me:


YouTube - https://www.youtube.com/bakztfuture

Substack Newsletter - https://bakztfuture.substack.com

Twitter - https://www.twitter.com/bakztfuture

Instagram - https://www.instagram.com/bakztfuture

GitHub - https://www.github.com/bakztfuture