lunixbochs commented on Compressing Text into Images   shkspr.mobi/blog/2024/01/... · Posted by u/edent
lunixbochs · 2 years ago
I did a silly experiment to compress word embeddings with jpeg - to see how it collapses semantically as you decrease the quality.

https://bochs.info/vec2jpg/

This was a very basic experiment. I expect you could perform the DCT more intelligently on the vector dimensions instead of trying to pack the embeddings into pixels, and get higher quality semantic compression.
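A toy sketch of that alternative idea (not the vec2jpg code — the function names and the synthetic stand-in "embedding" here are mine): run a 1-D DCT directly over the vector dimensions, keep only the lowest-frequency coefficients as a crude "quality" knob, and watch the cosine similarity to the original vector collapse as you keep fewer coefficients.

```python
import math

def dct(vec):
    # Naive O(n^2) DCT-II over the vector dimensions.
    n = len(vec)
    return [sum(vec[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n))
            for k in range(n)]

def idct(coeffs):
    # DCT-III scaled by 2/n, which inverts dct() above.
    n = len(coeffs)
    return [(coeffs[0] / 2
             + sum(coeffs[k] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                   for k in range(1, n))) * 2 / n
            for i in range(n)]

def compress(vec, keep):
    # Zero all but the first `keep` DCT coefficients -- a crude quality knob,
    # analogous to JPEG discarding high-frequency detail.
    c = [x if k < keep else 0.0 for k, x in enumerate(dct(vec))]
    return idct(c)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Synthetic stand-in for a 64-dim word embedding.
emb = [math.sin(0.3 * i) for i in range(64)]
for keep in (64, 32, 8, 2):
    print(keep, round(cosine(emb, compress(emb, keep)), 3))
```

Keeping all 64 coefficients reconstructs the vector exactly; aggressive truncation leaves a vector that is nearly orthogonal to the original, which is the "semantic collapse" the experiment visualizes.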


lunixbochs commented on StableLM: A new open-source language model   stability.ai/blog/stabili... · Posted by u/davidbarker
lhl · 2 years ago
FYI, I'm running lm-eval now w/ the tests Bellard uses (lambada_standard, hellaswag, winogrande, piqa, coqa) on the biggest 7B on a 40GB A100 atm (non-quantized version, requires 31.4GB), so it will be directly comparable to what various LLaMAs look like: https://bellard.org/ts_server/

(UPDATE: the run took 1:36 to complete, but failed at the end with a TypeError, so I'll need to poke at it and rerun.)

I'll place results in my spreadsheet (which also has my text-davinci-003 results): https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYp...


lunixbochs commented on Box64 – Linux Userspace x86_64 Emulator Targeted at ARM64 Linux Devices   github.com/ptitSeb/box64... · Posted by u/varbhat
parasti · 2 years ago
Slightly off-topic, but the author also made gl4es, a library that basically allows all kinds of OpenGL apps to run on modern devices. Shameless plug: gl4es is what allowed me to port Neverball to the browser.

https://neverball.github.io

lunixbochs · 2 years ago
> made gl4es

Neverball was working in the original glshim project before ptitseb forked it to gl4es. (Not to discount the significant work he's put in since, including the ES2 backend)

lunixbochs commented on Numen: Voice Control for Handsfree Computing   numenvoice.com/... · Posted by u/memorable
theusus · 3 years ago
I tried this and the speech recognition is really poor.
lunixbochs · 3 years ago
The Talon model is fairly accurate, but it can be confusing for new users to use the command system correctly. I posted a sibling reply about this, but the most common reason for Talon users to complain about the recognition is that they are in the strict "command mode" and say things that aren't actually commands.

If you encounter what feels like poor recognition in Talon, I recommend enabling Save Recordings and zipping+sharing some examples on the Slack and asking for advice.

The current command set is definitely harder to learn than a system designed for chat/email where "what you say is what you get", but it's much more powerful for tasks like programming once you learn it.

I'm dubious about what kind of general command accuracy Numen can get with the Vosk models, as Vosk, to my understanding, is designed more for natural language than for commands.

lunixbochs commented on Numen: Voice Control for Handsfree Computing   numenvoice.com/... · Posted by u/memorable
orbisvicis · 3 years ago
The talon demonstration from the last link was inspiring, but it works in the exact opposite fashion to what I would have imagined. The code-development examples are command-based, with a command to enter phrase mode. I'd have expected that with technology such as tree-sitter and IntelliJ etc, by parsing the syntax tree of the current computer language for completions, development could occur completely in phrase mode with only a few commands for handling unknown inputs such as new variable names.

I'm curious if anyone has ever tried implementing the latter, or compared the two approaches. I'm sure there would be many obstacles I haven't considered.

lunixbochs · 3 years ago
Fixed commands are fast, precise, and predictable.

Assuming you mean speaking in natural language: that's slower to say, and likely less precise and predictable if you want to be able to just say "anything" and have a result.

You need a command system either way. If you want to express some precise intention, you need to understand what the command system will do.

There is a combined "mixed mode" system I've been testing in the talon beta where you can use both phrases and commands without switching modes.

lunixbochs commented on Numen: Voice Control for Handsfree Computing   numenvoice.com/... · Posted by u/memorable
unshavedyak · 3 years ago
Wow eyetracking is not something i thought of.. and now i want it.

I wonder if we could replace the mouse with eyetracking? I wouldn't expect it to be accurate enough though, given the micro movements that eyes do.. and in general erratic movements.. but i'd love to be wrong.

lunixbochs · 3 years ago
Talon's eye tracking functions as a mouse replacement. Is there a specific demo you'd like to see? I can record one.
lunixbochs commented on Numen: Voice Control for Handsfree Computing   numenvoice.com/... · Posted by u/memorable
yewenjie · 3 years ago
Last time I checked Talon's models were very bad at recognizing my voice. Does it support better models now, for example OpenAI's Whisper?
lunixbochs · 3 years ago
Depending on when that was: in 2018 the free model was the macOS speech engine, in 2019 it was a fast but relatively weak model, and as of late 2021 it's a much stronger model. I'm currently working on the next model series with a lot more resources than I had before.

It's also worth saying that if you only tried things out briefly, there are a handful of reasons recognition may have seemed worse. Talon uses a strict command system by default, because that improves precision and speed for trained users, but the tradeoff there is it's more confusing for people who haven't learned it yet.

For example, Talon isn't in "dictation mode" by default, so you need to switch to that if you're trying to write email-like text and don't want to prefix your phrases with a command like "say".

The timeout system may also be confusing at first. When you pause, Talon assumes you were done speaking and tries to run whatever you said. You can mitigate this by speaking faster or increasing the timeout.

The default commands (like the alphabet) may also just not be very good for some accents, and that will be the case for any speech engine - you will likely need to change some commands if they're hard to enunciate in your accent.

I recommend joining the slack [1] and asking there if you want more specific feedback. I definitely want to support many accents and even have some users testing Talon with other spoken languages.

[1] https://talonvoice.com/chat

lunixbochs commented on Numen: Voice Control for Handsfree Computing   numenvoice.com/... · Posted by u/memorable
orbisvicis · 3 years ago
I don't know what type of speech each dataset represents, but the talon results are extremely impressive... I assume it wasn't trained on at least some subset (depending on the train/test split) of this data?
lunixbochs · 3 years ago
A handful of the datasets I tested are fully held out (I have reason to believe none of the models have trained on them), and talon was trained on none of the dev or test data of any of the datasets in question.

Due to whisper's weakly supervised training on a large amount of automatically scraped data and reliance on a bigger language model, it's far more likely whisper had seen some of the test data before.

u/lunixbochs

Karma: 1768 · Cake day: April 25, 2012
About
@lunixbochs // hn@bochs.info // talonvoice.com - don't hurt your hands!