After playing with AI Avatars (like many of us I guess around here), I started to wonder if we could instead bring real value to people by producing affordable professional head-shots using a combination of Dreambooth and ControlNet.
Obviously it's only the beginning and there are still many imperfections, but the foundational tech behind this (Dreambooth and ControlNet) is only about 6 months and 1.5 months old, respectively, and already delivers pretty amazing results.
I came up with this little service "Virtual Face" and I'm looking for feedback if some of you are willing to try it (you can use the HUNTER50 coupon to get 50% off, can't make it free to try yet since the running costs are still non-negligible).
Cheers, Pierre
So now even our profile pictures aren't really us, but just some pseudo-reality version of who we think we are. And I know I know, people will argue that makeup/airbrushing/photoshop/facetune has been going on forever, but at some point I feel like we cross the line where it's no longer "reality with some touchups", but instead it's "complete fantasy made to mimic reality".
I just feel like AI is shooting us headfirst into a state where reality and fantasy are ever more difficult to differentiate, and I don't like the implications.
However, rationally, it's not like humanity has had picture-perfect representations of itself for very long. For most of our history, we relied on paintings & sculptures, and it was up to the artist (or the commissioner) to decide how 'real' they were. They were almost always "complete fantasy made to mimic reality".
From this lens, the use of unedited real photos of you was the strange period of time, not this AI age we seem to be headed into.
Maybe that helps put your mind a little at ease, maybe it just confuses you more (definitely the latter for me).
It's a similar concern to the updated TikTok "makeup filter" that made the rounds recently, which is extremely difficult to detect as a filter. What I found especially poignant was a photographer saying she gets these beautiful women in to do portraits, and then when they look behind the camera to view their untouched portraits, they're aghast at how "ugly" they are, because the "filtered" version of themselves has started competing with the real version in their own head.
This shit just fucks with everyone's brain long term, in an unhelpful way, in my opinion.
For most of our evolution we relied on a little thing known as "real life". The idea of the masses routinely representing their identity through imagery is firmly an artifact of the internet age. The UK didn't even have photos on driver's licenses until 1998.
It all started when my family members and I began arguing over which avatars looked more like me. It made me realize that we were much more sensitive about our self-image than I thought. My partner and I would pick different pictures from a set of 10 samples ^^ as if we had two slightly different perceptions of reality.
Now, I've changed my mind slightly and tend to see these models as another type of compression of information. Almost like a new censor of data.
Further… I think this is much more akin to those silly “make an anime avatar of yourself” apps than anything remotely resembling a headshot taken by a photographer.
For example, there is an example of a woman with medium length brown hair and the example “headshots” are… not good. It looks like her eye color is different in every picture, the black and white example looks kind of rotoscoped, and the example in the rotation immediately after the black and white has incredibly messed up eyes. One of her irises is all wonky and non-human looking. It’s a great example of why it’s important for a human to supervise and correct SD generated things like faces.
People have made their profile pictures a "pseudo-reality version" of who they think they are since the dawn of profile pictures IMHO.
But now, seeing what AI has on the horizon, I'm honestly just like "I need to go for a fucking walk, upon which I will throw my phone into the river."
Who cares if it's fantasy? We crossed that line _long_ ago with the examples you already brought up. The only difference is that this is new.
[0] https://bogdannovykov.substack.com/p/death-of-reality
Oh, you also add choosing the clothes, the hair style and your posture/facial expression to the mix.
The genre of image we're talking about is all unnatural and meant to portray people in the light somebody wants to portray them in. In fact, that's what most of "photography" is.
https://www.youtube.com/watch?v=-lblzHvKhRc
It does look a little plastique, but it is also (deliberately) an extreme case.
If you compare reality today to reality at any other time, reality is pretty great.
If I was dedicated I think there are some free solutions for training a model on your own face out there but you might have an edge in convenience. Ten dollars seems too steep for that though.
It reads to me like you cherrypicked the two best examples and even those aren't great, whether that's true or not.
Edit: Woof, I didn't realize you could get bigger versions of the examples. They were mostly fine as thumbnails but blown up to ~500x500 the eyes and mouth are rough.
I would also consider a free trial pic tied to email.
These days you don't even need to suss out the negative prompts, you can use a negative text embedding (bad-hands, easynegative) to get good quality images.
Dreambooth is practically ancient now. You don't need to lug around huge converged models trained on a few images and a few tags. You can download a much smaller LORA and include it in your prompt and it just werks.
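For a rough sense of why a LoRA file is so much smaller than a full fine-tuned checkpoint, here is a back-of-the-envelope sketch. Instead of storing a full d x k update for each adapted weight matrix, LoRA stores two low-rank factors (d x r and r x k). The 768 x 768 layer size and rank 8 below are illustrative assumptions, not measurements of any particular model, and the function names are made up for this example.

```python
def full_update_params(d: int, k: int) -> int:
    """Parameters needed to store a full update to a d x k weight matrix."""
    return d * k


def lora_update_params(d: int, k: int, r: int) -> int:
    """Parameters needed for a rank-r LoRA update: a d x r and an r x k factor."""
    return r * (d + k)


if __name__ == "__main__":
    # Assumed, illustrative values: one 768 x 768 projection layer, rank 8.
    d, k, r = 768, 768, 8
    full = full_update_params(d, k)
    lora = lora_update_params(d, k, r)
    print(f"full update: {full} params")
    print(f"LoRA update: {lora} params")
    print(f"reduction:   {full / lora:.0f}x for this layer")
```

The saving per layer grows as the rank shrinks relative to the layer size, which is roughly why the adapter can ship as a small separate file that gets applied on top of whichever base model you pick.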
Most programmers have never made a LORA and have never used Stable Diffusion. For 99.9% of people outside the programmer class, this is absolutely an "advanced algorithm."
Just because you understand what it's doing under the hood, and just because it can be abstracted into a handful of discrete steps, doesn't mean it isn't advanced.
GP isn't saying "this product would never work" (which is reminiscent of the classic Dropbox comment), GP is saying "this isn't 'advanced' and you can do it yourself" which is really helpful and exactly the sort of thing I love to see on HN.
Where do you think we are?
>this is absolutely an "advanced algorithm."
When someone puts that on his product's page, it's assumed that he developed said algorithm, or at least had a hand in it. Instead, here the real product is the pipeline and not the "algorithm". The product would do better if it was honest about what it did.
And like I mentioned, Dreambooth is pretty "old" now. The service could probably be much cheaper if the OP moved to using LORAs, and it would give better results, because it wouldn't clash with the tokens in the underlying model, and could be used with any model.
Guide to LORA:
https://imgur.com/a/mrTteIt
https://old.reddit.com/r/StableDiffusion/comments/11vw5k3/lo...
Sure it’s "easy" to do it by hand, for a few people, tweaking the parameters. Doing it at scale with good quality is still a challenge.
Bots can do the same: “generate a photo of me holding my driver’s license, I need it for Tinder verification”.
Unfortunately, when technology like that becomes more common, people are definitely going to be tricked, and feel tricked. It's going to be awkward going on a date and the person looks nothing like what you expected... Awkward on both ends.
At some point... The online reality might become so fake, with disinformation generated by bots, generated images and videos, fake dating profiles (that are actually also just bots)... I wonder if it's going to spur some kind of reactionary movement. A movement to disconnect from the internet completely, or at least to socialize more in real life. Maybe some bars and pubs where you have to check in your cellphone at the door. Offline cafes.
[0] The new TikTok makeup filters in action: https://www.youtube.com/watch?v=tw2euJzOk60
1. The "contact us" link is broken, which doesn't inspire confidence.
2. It's important to me that you don't leak or retain my data, and while I know you say you don't retain it, I will just have to take that on faith and I'm not sure I'm willing to do that.
3. Your examples include novelty shots like "pharaoh" and "superman" etc. This branding is confusing. If you're targeting people who want professional shots for their LinkedIn etc then why would you offer jokey picture styles?
If you can solve these issues then you can count me in as a customer.
> Questions about these Terms should be sent to the Company at [insert company email or contact information].
I'd appreciate it if the app allowed users to input height/weight for accuracy, didn't generate very high contrast images for some styles (like the hacker), and let users add training data to their profile over time. Finally, producing 20 images of each style in one go might not be very cost-effective on your side. I think the Dall-E and Midjourney approach where they generate one image and let you generate more variations could make this more economical for you.
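To put rough numbers on the "one image, then variations on request" idea, here is a minimal sketch comparing how many images get generated under each approach. Only the 20 images per style comes from the comment above; the number of styles, the style names, and how many variations a user asks for are all made-up assumptions.

```python
def upfront_images(num_styles: int, images_per_style: int = 20) -> int:
    """Images generated when every style is rendered in full up front."""
    return num_styles * images_per_style


def on_demand_images(num_styles: int, variations_requested: dict[str, int]) -> int:
    """One preview per style, plus only the variations the user asks for."""
    return num_styles + sum(variations_requested.values())


if __name__ == "__main__":
    # Hypothetical scenario: 10 styles, and the user only likes 3 of them
    # enough to request 4 extra variations each.
    num_styles = 10
    requested = {"corporate": 4, "casual": 4, "hacker": 4}

    print("up front: ", upfront_images(num_styles))
    print("on demand:", on_demand_images(num_styles, requested))
```

Whether this actually saves money depends on how many variations people end up requesting and on the per-user training cost, which this sketch ignores.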
There are many caveats with that usage (not everyone is built the same) but I would agree this is an interesting idea and definitely should be explored.