Readit News
newhouseb commented on Unconventional Uses of FPGAs   voltagedivide.com/2024/03... · Posted by u/voxadam
newhouseb · 2 years ago
A more extreme version of the ADC/DAC projects mentioned here is my (old) project that built a Bluetooth transceiver using just an FPGA and an antenna (and a low pass filter if you don't want the FCC coming after you): https://github.com/newhouseb/onebitbt

Tl;dr: if you blast in/out the right 5GHz binary digital signal from a SERDES block on an FPGA connected to a short wire you can talk to your phone!
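The core trick can be sketched in a few lines (a toy illustration only, not the onebitbt implementation; the rates here are made up): a SERDES pin can only emit 0s and 1s, but the sign of a carrier sampled at the serializer rate is a square-ish wave whose fundamental sits at the carrier frequency, and the filter strips the harmonics.

```python
import numpy as np

serdes_rate = 16e9   # pretend serializer bit rate, samples/s (illustrative)
carrier = 2.4e9      # pretend carrier frequency, Hz (illustrative)
n = 4096

t = np.arange(n) / serdes_rate
# The "DAC": quantize the carrier to a single bit per SERDES sample.
bits = (np.sin(2 * np.pi * carrier * t) >= 0).astype(int)

# Spectrum of the bit stream (mapped to +/-1): the biggest non-DC peak
# lands at the carrier frequency; the rest is harmonics to filter out.
spectrum = np.abs(np.fft.rfft(2 * bits - 1))
freqs = np.fft.rfftfreq(n, d=1 / serdes_rate)
peak_freq = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
```

The antenna plus low-pass filter then acts as the analog reconstruction stage that a conventional radio would put after a DAC.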

newhouseb commented on Comcast squeezing 2Gbps internet speeds through decades-old coaxial cables   engadget.com/comcast-star... · Posted by u/bookofjoe
zahma · 2 years ago
Surprised it’s actually symmetrical. I’ve never understood why, but it’s always perplexed and irritated me that they typically offer upload bandwidth at a tenth of download. In this day and age of remote work, you’d think this would be illegal.
newhouseb · 2 years ago
The short but unfulfilling answer is that this is because the DOCSIS standard has historically allocated a much broader frequency range to DL than to UL. Unlike other forms of communication (like cellular) that can use similar frequency ranges for transmit and receive, DOCSIS tends to slice off something like <70 MHz for UL and ~70 MHz to 1 GHz for DL (I'm probably remembering the details incorrectly, trust Google over me!). Switching the frequency ranges often requires different circuitry and therefore different hardware.
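For a rough sense of scale (the exact split varies by DOCSIS version and operator; the figures below are the classic North American "sub-split", so treat them as illustrative):

```python
# Back-of-envelope only: spectrum splits vary by DOCSIS version and
# operator; these are the traditional sub-split band edges in MHz.
ul_mhz = 42 - 5        # upstream slice: roughly 5-42 MHz
dl_mhz = 1002 - 108    # downstream slice: roughly 108-1002 MHz

# At comparable spectral efficiency (bits/s/Hz), capacity scales with
# allocated bandwidth, so the DL:UL asymmetry falls out of the split.
ratio = dl_mhz / ul_mhz
```

That works out to a downstream slice roughly 20-30x wider than upstream, which is about the asymmetry you see in advertised plans.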

I would guess that, unlike fiber -- rarely saturated in a consumer context -- cable has ~always been asked to deliver more than it can provide, and thus the operators needed to be strategic about the allocation of bandwidth between DL and UL, hence the asymmetry.

newhouseb commented on I made a transformer to predict a simple sequence manually   vgel.me/posts/handmade-tr... · Posted by u/lukastyrychtr
teddykoker · 2 years ago
A related line of work is "Thinking Like Transformers" [1]. They introduce a primitive programming language, RASP, which is composed of operations capable of being modeled with transformer components, and demonstrate how different programs can be written with it, e.g. histograms, sorting. Sasha Rush and Gail Weiss have an excellent blog post on it as well [2]. Follow-on work demonstrated how RASP-like programs could actually be compiled into model weights without training [3].

[1] https://arxiv.org/abs/2106.06981

[2] https://srush.github.io/raspy/

[3] https://arxiv.org/abs/2301.05062
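To give a flavor of the idea, here's a loose Python imitation of RASP's select/aggregate-style primitives and its classic histogram example (the real language is far more constrained; this just mirrors the shapes):

```python
# A loose Python analogue of RASP-style primitives, not the real language.
def select(keys, queries, predicate):
    # Boolean "attention matrix": entry [q][k] says whether query
    # position q attends to key position k.
    return [[predicate(k, q) for k in keys] for q in queries]

def selector_width(sel):
    # How many positions each query attends to -- the building block
    # behind RASP's histogram program.
    return [sum(row) for row in sel]

def histogram(tokens):
    # For each token, count how many times it appears in the sequence.
    return selector_width(select(tokens, tokens, lambda k, q: k == q))
```

So `histogram(list("hello"))` gives `[1, 1, 2, 2, 1]`: each position learns the count of its own token, exactly the kind of computation a single attention head can express.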

newhouseb · 2 years ago
Huge fan of RASP et al. If you enjoy this space, might be fun to take a glance at some of my work on HandCrafted Transformers [1], wherein I hand-pick the weights in a transformer model to do long-hand addition similar to how humans learn to do it in grade school.

[1] https://colab.research.google.com/github/newhouseb/handcraft...
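For reference, the grade-school algorithm in question looks like this in plain Python (the notebook encodes the same procedure into attention/MLP weights rather than writing it out imperatively):

```python
def longhand_add(a_digits, b_digits):
    """Grade-school addition on little-endian digit lists (ones digit first)."""
    result, carry = [], 0
    for i in range(max(len(a_digits), len(b_digits))):
        a = a_digits[i] if i < len(a_digits) else 0
        b = b_digits[i] if i < len(b_digits) else 0
        s = a + b + carry
        result.append(s % 10)   # write down the ones digit
        carry = s // 10         # carry the one
    if carry:
        result.append(carry)
    return result
```

E.g. 345 + 678: `longhand_add([5, 4, 3], [8, 7, 6])` returns `[3, 2, 0, 1]`, i.e. 1023.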

newhouseb commented on Your WiFi Can See You   mrereports.substack.com/p... · Posted by u/nunodio
newhouseb · 2 years ago
I'm a little skeptical of the paper in question: https://arxiv.org/pdf/2301.00250.pdf

If you look at the training and test set info:

> We report results on two protocols: (1) Same layout: We train on the training set in all 16 spatial layouts, and test on remaining frames. Following [31], we randomly select 80% of the samples to be our training set, and the rest to be our testing set. The training and testing samples are different in the person’s location and pose, but share the same person’s identities and background. This is a reasonable assumption since the WiFi device is usually installed in a fixed location. (2) Different layout: We train on 15 spatial layouts and test on 1 unseen spatial layout. The unseen layout is in the classroom scenarios.

Depending on how they selected various frames -- let's just say it was random -- the model could have learned something to the effect of "this RF pattern is most similar to these two other readings I'm familiar with" (from the surrounding frames) and can therefore just interpolate between the resulting poses associated with those RF patterns (that the model has compressed/memorized into trained weights).

If you look at the meshes between the image ground truth and the paper's results, you'll see that they are strikingly similar. I find this also suspect because WiFi-band RF interacts a lot more with water than with clothes and so you would expect the outline/mesh to get the "meat-bag" parts of you correct but not be able to guess the contours of baggy clothes. That is... unless it has memorized them from the training set.
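To make the concern concrete, here's a toy illustration (in no way a reproduction of the paper's setup; all data is synthetic): with temporally correlated frames, a pure memorize-and-match baseline looks great under a random split and falls apart on a held-out layout.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_layout(offset, n=200):
    # A smooth "pose" trajectory and a layout-specific "RF reading".
    t = np.linspace(0, 1, n)
    pose = np.sin(2 * np.pi * t)
    signal = pose + offset + rng.normal(0, 0.01, n)
    return signal, pose

layouts = [make_layout(off) for off in (0.0, 3.0, 6.0, 9.0)]

def knn_error(train_x, train_y, test_x, test_y):
    # 1-nearest-neighbour "model": memorize training pairs, nothing more.
    preds = train_y[np.abs(train_x[None, :] - test_x[:, None]).argmin(axis=1)]
    return float(np.mean(np.abs(preds - test_y)))

# Protocol 1: random 80/20 split across all layouts.
x = np.concatenate([s for s, _ in layouts])
y = np.concatenate([p for _, p in layouts])
idx = rng.permutation(len(x))
tr, te = idx[:640], idx[640:]
same_layout_err = knn_error(x[tr], y[tr], x[te], y[te])

# Protocol 2: hold out an entire layout.
x_tr = np.concatenate([s for s, _ in layouts[:3]])
y_tr = np.concatenate([p for _, p in layouts[:3]])
x_te, y_te = layouts[3]
unseen_layout_err = knn_error(x_tr, y_tr, x_te, y_te)
```

Under the random split the nearest memorized frame is almost always a temporal neighbor with nearly the same pose, so error is tiny; on the unseen layout the same memorizer does dramatically worse -- without any actual generalization being learned.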

newhouseb commented on Guidance: A guidance language for controlling large language models   github.com/guidance-ai/gu... · Posted by u/bx376
mmoskal · 2 years ago
It updates token logits (probabilities) after every token before sampling. I don't think this is very common yet.
newhouseb · 2 years ago
Right, there are many folks (dozens of us!) yelling about logit processors and building them into various frameworks.

The most widely accessible form of this is probably BNF grammar biasing in llama.cpp: https://github.com/ggerganov/llama.cpp/blob/master/grammars/...
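The underlying idea is simple enough to sketch (this is the general logit-masking pattern, not llama.cpp's actual GBNF machinery; the tiny vocab and "grammar" below are made up):

```python
import math

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
vocab = ['{', '}', '"', 'a', ':', '1']

def allowed_next(prefix):
    # Stand-in for a real grammar/parser state machine: a JSON-ish
    # object must open with '{', after which a key or '}' may follow.
    if not prefix:
        return {'{'}
    if prefix == '{':
        return {'"', '}'}
    return set(vocab)

def constrained_argmax(logits, prefix):
    # Mask out grammar-invalid tokens, then take the best survivor.
    mask = allowed_next(prefix)
    best, best_logit = None, -math.inf
    for tok, logit in zip(vocab, logits):
        if tok in mask and logit > best_logit:
            best, best_logit = tok, logit
    return best
```

Even if the raw logits strongly prefer an invalid token, the masked argmax can only ever emit something the grammar accepts.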

newhouseb commented on Apple Sep 2023 Event: “Wonderlust”   apple.com/apple-events/... · Posted by u/jlaneve
maxclark · 3 years ago
We’re in the same boat. USB-C mandatory going forward.

I kinda hate that the new MacBook Air went back to a MagSafe charger for this reason.

newhouseb · 3 years ago
You still charge through the USB-C / Thunderbolt ports (and use a USB-C to USB-C cable with the provided power brick to do so). The MagSafe plug is just for convenience (and perhaps slightly higher wattage than is normally specced for USB-C on the beefier MBPs).
newhouseb commented on Show HN: LLMs can generate valid JSON 100% of the time   github.com/normal-computi... · Posted by u/remilouf
druskacik · 3 years ago
In this case (multiple choice generation), if one of the possible outputs does not match the regex, you can just exclude it from generation.

I am trying to think of an example where "answer prefix might have been extremely unlikely to yield a valid response, but the technique ( ... ) constructs a valid response from it regardless", which might really cause a problem. But no luck so far. Does anyone have any idea? This could potentially be an interesting research question.

newhouseb · 3 years ago
An example from an earlier comment of mine on a different thread (assuming I've understood correctly):

> let's say we had a grammar that had a key "healthy" with values "very_unhealthy" or "moderately_healthy." For broccoli, the LLM might intend to say "very_healthy" and choose "very" but then be pigeonholed into saying "very_unhealthy" because it's the only valid completion according to the grammar.

That said, you can use beam search to more or less solve this problem by evaluating the joint probability of all tokens in each branch of the grammar and picking the one with the highest probability (you might need some more nuance for free-form strings where the LLM can do whatever it wants and be "valid").
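The broccoli example can be sketched end to end (all probabilities are made up; the "beam search" here is the degenerate full-width case that scores every grammar branch):

```python
import math

# Hypothetical token probabilities from an LM that "wants" to say
# "very_healthy" -- which the grammar doesn't allow.
probs = {
    (): {"very": 0.7, "moderately": 0.3},
    ("very",): {"_unhealthy": 0.05, "_healthy": 0.95},
    ("moderately",): {"_healthy": 0.9, "_unhealthy": 0.1},
}

grammar = [("very", "_unhealthy"), ("moderately", "_healthy")]  # legal values

def greedy(grammar):
    # Pick the locally best grammar-valid token at each step.
    prefix = ()
    while any(len(g) > len(prefix) for g in grammar):
        legal = {g[len(prefix)] for g in grammar
                 if g[:len(prefix)] == prefix and len(g) > len(prefix)}
        prefix += (max(legal, key=lambda t: probs[prefix][t]),)
        grammar = [g for g in grammar if g[:len(prefix)] == prefix]
    return "".join(prefix)

def joint_best(grammar):
    # Score each complete branch by its joint probability instead.
    def score(g):
        return math.prod(probs[g[:i]][g[i]] for i in range(len(g)))
    return "".join(max(grammar, key=score))
```

Greedy picks "very" (0.7) and gets pigeonholed into "very_unhealthy" (joint 0.035), while scoring whole branches picks "moderately_healthy" (joint 0.27) -- the answer the model actually preferred among legal outputs.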

newhouseb commented on Show HN: LLMs can generate valid JSON 100% of the time   github.com/normal-computi... · Posted by u/remilouf
2bitencryption · 3 years ago
it still blows my mind that OpenAI exposes an API with Functions calling, and yet does not guarantee the model will call your function correctly, in fact, it does not even guarantee the output will be valid JSON.

When this is, really, a solved problem. I've been using github.com/microsoft/guidance for weeks, and it genuinely, truly guarantees correct output, because it simply does not sample from tokens that would be invalid.

It just seems so obvious, I still have no clue why OpenAI does not do this. Like, why fuss around with validating JSON after the fact, when you can simply guarantee it is correct in the first place, by only sampling tokens if they conform to the grammar you are trying to emit?

newhouseb · 3 years ago
I think this is likely a consequence of a couple of factors:

1. Fancy token selection within batches (read: beam search) is probably fairly hard to implement at scale without a significant loss in GPU utilization. Normally you can batch up a bunch of parallel generations and just push them all through the LLM at once, because every generated token (of similar prompt size, + some padding perhaps) takes a predictable time. If you stick a parser between every token that can take a variable amount of time, then your batch is slowed by the most complex grammar of the bunch.

2. OpenAI appears to work under the thesis articulated in the Bitter Lesson [i] that more compute (either via fine-tuning or bigger models) is the least foolish way to achieve improved capabilities, hence their approach of function-calling just being... a fine-tuned model.

[i] http://www.incompleteideas.net/IncIdeas/BitterLesson.html
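Point 1 can be put in toy-model terms (all timings are made up): each batch step pays the shared forward pass plus the slowest per-sequence parser.

```python
# Toy cost model: the GPU forward pass is shared across the batch, but
# each sequence's grammar/parser work happens between steps, so the
# whole batch advances at the pace of the most expensive grammar.
def batch_step_time(forward_ms, parser_ms_per_seq):
    return forward_ms + max(parser_ms_per_seq, default=0.0)

no_grammar = batch_step_time(10.0, [0.0] * 8)            # everyone unconstrained
one_heavy_grammar = batch_step_time(10.0, [0.0] * 7 + [5.0])  # one slow parser
```

One sequence with a 5 ms parser makes all eight sequences 50% slower per step in this model, which is exactly the utilization hit that's hard to justify at scale.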

newhouseb commented on Llama: Add grammar-based sampling   github.com/ggerganov/llam... · Posted by u/davepeck
svc0 · 3 years ago
I think it should be noted that this enforces grammatical constraints on the model's generated text, but it doesn't do anything to properly align the content. This would be useful if you needed to ensure a server delivered well-formatted JSON, but it I suspect it wont solve a lot of alignment issues with current language generation. For example current iterations of Llama and GPT often do not label markdown code-blocks correctly. Using grammar-based sampling, you could enforce that it labels code blocks but you couldn't enforce correct labeling since this is context-dependent. You also couldn't invent a novel domain-specific language without aligning against that language and expect good output.
newhouseb · 3 years ago
Also important to call out that anytime you have a freeform string, it's pretty much an open invitation for the LLM to go completely haywire and run off into all sorts of weird tangents. So these methods are best used with other heuristics to bias sampling once you get to free-form text territory (e.g. a repetition penalty, etc.).
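One such heuristic, sketched below, is the common CTRL-style repetition penalty (exact details vary by implementation; this is the widely copied divide-positive/multiply-negative variant):

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Discourage tokens that have already been generated by shrinking
    their logits: divide positive logits, multiply negative ones, so
    the adjustment always moves probability mass away from repeats."""
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out
```

Applied every step before sampling, this makes the "repeat the same phrase forever" failure mode much less likely in free-form spans.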
newhouseb commented on Llama: Add grammar-based sampling   github.com/ggerganov/llam... · Posted by u/davepeck
contravariant · 3 years ago
> A more advanced feature not commonly used is to also enable back-tracking if the AI gets stuck and can’t produce a valid output.

Technically that part is mandatory if you don't just want it to produce an output but to make it produce an output that correctly matches the temperature (i.e. one that you could have gotten by randomly sampling the LLM until you got a correct one). Randomly picking the next token that isn't grammatically incorrect works, but oversamples paths where most of the options are invalid. The ultimate example of this is that it can get stuck at a branch with probability 0.

From a probabilistic standpoint what you'd need to do is not just make it backtrack but make it keep generating until it generates a grammatically correct output in one go.

Maybe there is something clever that can be done to avoid regenerating from the start? What you'd need to achieve is that a token that has an x% probability of leading to an incorrect output also has an x% probability of being erased.
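The oversampling effect can be computed exactly on a toy model (all numbers made up): masking at each step is not the same distribution as conditioning the model on validity.

```python
# Toy two-token language model.
p_first = {"a": 0.5, "b": 0.5}
p_second = {"a": {"x": 0.9, "y": 0.1}, "b": {"x": 0.5, "y": 0.5}}
valid = {"ay", "bx", "by"}  # the grammar rejects "ax"

def masked_dist():
    # Strategy 1: mask invalid tokens at each step and renormalize.
    dist = {}
    for f, pf in p_first.items():
        legal = {s: p for s, p in p_second[f].items() if f + s in valid}
        z = sum(legal.values())
        for s, p in legal.items():
            dist[f + s] = pf * p / z
    return dist

def rejection_dist():
    # Strategy 2: sample unconstrained, reject invalid outputs -- i.e.
    # the joint distribution conditioned on validity.
    joint = {f + s: pf * p
             for f, pf in p_first.items() for s, p in p_second[f].items()}
    z = sum(p for w, p in joint.items() if w in valid)
    return {w: p / z for w, p in joint.items() if w in valid}
```

Here the model almost never says "ay" (joint probability 0.05), but step-wise masking assigns it probability 0.5, because choosing "a" first forecloses everything except "y" -- exactly the "oversampling paths where most options are invalid" problem.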

newhouseb · 3 years ago
The way LLMs work is they output probabilities for every _token_, so you don't really need to backtrack; you can always just pick a token that matches the provided grammar.

That said, you might want to do something like (backtracking) beam-search which uses various heuristics to simultaneously explore multiple different paths because the semantic information may not be front-loaded, i.e. let's say we had a grammar that had a key "healthy" with values "very_unhealthy" or "moderately_healthy." For broccoli, the LLM might intend to say "very_healthy" and choose "very" but then be pigeonholed into saying "very_unhealthy" because it's the only valid completion according to the grammar.

That said, there are a lot of shortcuts you can take to make this fairly efficient thanks to the autoregressive nature of (most modern) LLMs. You only need to regenerate / recompute from where you want to backtrack from.
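A toy sketch of why backtracking is cheap for autoregressive models (the cache here is a stand-in for a real KV cache; token strings reuse the earlier broccoli example):

```python
# Toy autoregressive decoder with a per-position cache, counting how
# many per-token computations ("forward passes") we actually pay for.
class CachedDecoder:
    def __init__(self):
        self.cache = []         # stands in for the per-token KV cache
        self.forward_calls = 0  # per-token computations performed

    def extend(self, tokens):
        # Each new token costs one forward pass; cached positions are free.
        for t in tokens:
            self.forward_calls += 1
            self.cache.append(t)  # a real model stores keys/values here

    def backtrack_to(self, length):
        # Dropping cache entries costs nothing; nothing is recomputed yet.
        self.cache = self.cache[:length]

dec = CachedDecoder()
dec.extend(["very"])         # 1 forward pass
dec.extend(["_unhealthy"])   # 1 more
dec.backtrack_to(1)          # undo the bad branch for free
dec.extend(["_healthy"])     # only the replacement token is recomputed
```

Three forward passes total, not four: everything before the divergence point is reused, which is why beam-search-style exploration over grammar branches is affordable in practice.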

u/newhouseb

Karma: 2321 · Cake day: February 16, 2009
About
@newhouseb on twitter, newhouseb@gmail.com. previously head of sync at dropbox, now cto at aidkit.org

[ my public key: https://keybase.io/benzn; my proof: https://keybase.io/benzn/sigs/ijTuK7E61Do30-K2liLR8gjVuthKb_5Z0_z0m4kwRKM ]
