Show HN: Generative Fill with AI and 3D
42 by olokobayusuf | 5 comments on Hacker News.
Hey all,

You've probably seen projects that add objects to an image from a style or text prompt, like InteriorAI (levelsio) and Adobe Firefly. The prevalent issue with these diffusion-based inpainting approaches is that they don't yet have great conditioning on lighting, perspective, and structure. You'll often get incorrect or generic shadows, warped-looking objects, and distorted backgrounds.

What is Fill 3D?

Fill 3D is an exploration of doing generative fill in 3D to render ultra-realistic results that harmonize with the background image, using industry-standard path tracing, akin to compositing in Hollywood movies.

How does it work?

1. Deproject: First, deproject the image to a 3D shell using both geometric and photometric cues from the input image.
2. Place: Draw rectangles and describe what you want in them, akin to Photoshop's Generative Fill feature.
3. Render: Use good ol' path tracing to render ultra-realistic results.

Why Fill 3D?

+ The results are insanely realistic (see the video in the GitHub repo or on the website).
+ Fast enough: generations currently take 40-80 seconds. Diffusion takes ~10 seconds, so we're slower, but for this level of realism it's pretty good.
+ Potential applications: I'm thinking of virtual staging in real estate media. What do you think?
+ There's API access! :D (a rough usage sketch follows below)
+ Right now you need an image of an empty room; I'll loosen this restriction over time.

Check it out at https://fill3d.ai

Fill 3D is built on Function ( https://fxn.ai ). With Function, I can run the Python functions that do the steps above on powerful GPUs with only code (no Dockerfile, YAML, k8s, etc.) and invoke them from just about anywhere. I'm the founder of fxn.

Tell me what you think!!

PS: This is my first Show HN, so please be nice :)
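To make the "Place" step and the API-access point concrete, here is a minimal sketch of what calling such a service over HTTP could look like. Everything specific in it (the /api/render endpoint, bearer-token auth, the placements field, and the rectangle schema) is a hypothetical illustration, not the documented Fill 3D API; see https://fill3d.ai for the actual API.

```python
# Hypothetical sketch of calling a "generative fill in 3D" service over HTTP.
# The endpoint URL, field names, auth scheme, and placement schema below are
# assumptions for illustration only, not the documented Fill 3D API.
import json
import requests

API_URL = "https://fill3d.ai/api/render"  # hypothetical endpoint


def fill_3d(image_path: str, placements: list[dict], api_key: str) -> bytes:
    """Upload an empty-room photo plus rectangle prompts; return the rendered image."""
    with open(image_path, "rb") as f:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"image": f},
            # Each placement pairs a 2D rectangle (normalized image coordinates)
            # with a text prompt, mirroring the "Place" step described above.
            data={"placements": json.dumps(placements)},
            timeout=120,  # generations take 40-80 seconds per the post
        )
    response.raise_for_status()
    return response.content  # path-traced result image


if __name__ == "__main__":
    staged = fill_3d(
        "empty_living_room.jpg",
        placements=[{
            "x": 0.3, "y": 0.6, "width": 0.4, "height": 0.3,
            "prompt": "a mid-century modern sofa",
        }],
        api_key="YOUR_API_KEY",
    )
    with open("staged_living_room.png", "wb") as out:
        out.write(staged)
```

The shape of the call is the interesting part: one empty-room image in, a list of rectangle-plus-prompt placements, and a single rendered image back, with the deproject and render steps hidden behind the service.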