Readit News
woodson commented on Euro firms must ditch Uncle Sam's clouds and go EU-native   theregister.com/2026/01/3... · Posted by u/jamesblonde
krzyk · 10 days ago
What's wrong with using Linux? It is an open source project with origins in Finland and is still led by a Finn.
woodson · 10 days ago
…who lives in Oregon, in the US.
woodson commented on Qwen3-TTS family is now open sourced: Voice design, clone, and generation   qwen.ai/blog?id=qwen3tts-... · Posted by u/Palmik
whinvik · 19 days ago
Haha something that I want to try out. I have started using voice input more and more instead of typing and now I am on my second app and second TTS model, namely Handy and Parakeet V3.

Parakeet is pretty good, but there are times it struggles. Would be interesting to see how Qwen compares once Handy has it in.

woodson · 19 days ago
This is about text to speech, not speech recognition.
woodson commented on EU–INC – A new pan-European legal entity   eu-inc.org/... · Posted by u/tilt
GardenLetter27 · 20 days ago
Unfortunately this does not override employment and tax laws - so you still cannot hire someone as an FTE in Paris, from a startup in Berlin for example (without them being a freelancer, or you opening a payroll / tax office in France).

But hopefully we can move towards that - standardised taxation (especially VAT and corporation tax would help massively here), the abolition of notaries, standardised requirements for document certification, and EU-wide digital ID so no need to fly in and sign in person.

woodson · 20 days ago
You still need a tax accountant in France to register the FTE and file paperwork with the tax office and social insurance.
woodson commented on EU–INC – A new pan-European legal entity   eu-inc.org/... · Posted by u/tilt
prasoon2211 · 20 days ago
I founded a UG and a GmbH in 2024. It took me 3 months total including visits to the notary (who charges a not insignificant sum for their services).

I did this as a subsidiary for a US company and literally had to email and call people every few days to move the process along (mostly, it was the banks who somehow expected us to be a multi-national company and wanted to charge an arm and a leg just to let us open a bank account. Most banks outright refused us).

When the notary finally filed the paperwork to the court, the court replied after a few weeks with additional clarifications for which we had to go AGAIN to the notary to do the whole song and dance of them chanting at us in German at 1000 words per minute.

Everything took painfully long and delayed investment for a while. People have absolutely no idea how painful it is to merely have the incorporated entity available. Then, it takes a few weeks to get your tax ID - this is when you can start employing people / accepting payments etc.

woodson · 20 days ago
The bank issues/refusals may have something to do with FATCA. If you have anything to do with the US in terms of taxes, many EU banks don’t want you as their customer. If it’s a subsidiary of a foreign company, then a lot of paperwork is required to prove that the foreign owners actually exist.
woodson commented on Show HN: Yolobox – Run AI coding agents with full sudo without nuking home dir   github.com/finbarr/yolobo... · Posted by u/Finbarr
woodson · a month ago
This is basically a devcontainer, right?

Deleted Comment

woodson commented on Sopro TTS: A 169M model with zero-shot voice cloning that runs on the CPU   github.com/samuel-vitorin... · Posted by u/sammyyyyyyy
nateb2022 · a month ago
> That's not what happens in zero-shot voice cloning

It is exactly what happens. You are confusing the task (classification vs. generation) with the learning paradigm (zero-shot).

In the voice cloning context, the class is the speaker's voice (not observed during training), samples of which are generated by the machine learning model.

The definition applies 1:1. During inference, it is predicting the conditional probability distribution of audio samples that belong to that unseen class. It is "predict[ing] the class that they belong to," which very same class was "not observed during training."

You're getting hung up on the semantics.

woodson · a month ago
Jeez, OP asked what it means in this context (zero-shot voice cloning), whereas you quoted a generic definition copied from Wikipedia. I defined it concretely for this context. Don't take it as a slight; there is no need to get all argumentative.
woodson commented on Sopro TTS: A 169M model with zero-shot voice cloning that runs on the CPU   github.com/samuel-vitorin... · Posted by u/sammyyyyyyy
nateb2022 · a month ago
> This generic answer from Wikipedia is not very helpful in this context.

Actually, the general definition fits this context perfectly. In machine learning terms, a specific 'speaker' is simply a 'class.' Therefore, a model generating audio for a speaker it never saw during training is the exact definition of the Zero-Shot Learning problem setup: "a learner observes samples from classes which were not observed during training," as I quoted.

Your explanation just rephrases the very definition you dismissed.

woodson · a month ago
From your definition:

> a learner observes samples from classes which were not observed during training, and needs to predict the class that they belong to.

That's not what happens in zero-shot voice cloning, which is why I dismissed your definition copied from Wikipedia.

woodson commented on Sopro TTS: A 169M model with zero-shot voice cloning that runs on the CPU   github.com/samuel-vitorin... · Posted by u/sammyyyyyyy
coder543 · a month ago
Why wouldn’t that be one-shot voice cloning? The concept of calling it zero shot doesn’t really make sense to me.
woodson · a month ago
I don't disagree, but that's what people started calling it. Zero-shot doesn't make sense anyway: how would the model know what voice it should sound like (unless it's a celebrity voice or similar included in the training data, where it's enough to specify a name)?
woodson commented on Sopro TTS: A 169M model with zero-shot voice cloning that runs on the CPU   github.com/samuel-vitorin... · Posted by u/sammyyyyyyy
woodson · a month ago
Does the 169M include the ~90M params for the Mimi codec? Interesting approach using FiLM for speaker conditioning.

u/woodson

Karma: 875 · Cake day: June 15, 2009
About
Machine learning, NLP, speech recognition, speaker recognition.