pseudonom- (u/pseudonom-)

pseudonom- commented on Why are we templating YAML? (2019) leebriggs.co.uk/blog/2019... · Posted by u/spiros

lamontcg · 2 years ago

> I agree that YAML templating is kind of insane, but I will never understand why we don't stop using fake languages and simply use a real language.

The problem is language nerds write languages for other language nerds.

They all want it to be whatever the current sexiness is in language design and want it to be self-hosting and be able to write fast multithreaded webservers in it and then it becomes conceptually complicated.

What we need is like a "Logo" for systems engineers / devops which is a simple toy language that can be described entirely in a book the size of the original K&R C book. It probably needs to be dynamically typed, have control structures that you can learn in a weekend, not have any threading or concurrency, not be object oriented or have inheritance and be functional/modular in design. And have a very easy to use FFI model so it can call out to / be called from other languages and frameworks.

The problem is that language nerds can't control themselves and would add stuff that would grow the language to be more complex, and then they'd use that in core libraries and style guides so that newbies would have to learn it all. I myself would tend towards adding "each/map" kinds of functions on arrays/hashmaps instead of just using for loops and having first class functions and closures, which might be mistakes. There's that immutable FP language for configuration which already exists (i can't google this morning yet) which is exactly the kind of language which will never gain any traction because >95% of the people using templated YAML don't want to learn to program that way.

pseudonom- · 2 years ago

Dhall is the FP config language you're thinking of, I think.

pseudonom- commented on Generalized K-Means Clustering github.com/derrickburns/g... · Posted by u/derrickrburns

minimaxir · 2 years ago

I built a pipeline to automatically cluster and visualize large amounts of text documents in a completely unsupervised manner:

- Embed all the text documents.

- Project to 2D using UMAP which also creates its own emergent "clusters".

- Use k-means clustering with a high cluster count depending on dataset size.

- Feed the ChatGPT API ~10 examples from each cluster and ask it to provide a concise label for the cluster.

- Bonus: Use DBSCAN to identify arbitrary subclusters within each cluster.

It is extremely effective, and I have a theoretical implementation of a more practical use case: using said UMAP dimensionality reduction for better inference. There is evidence that current popular text embedding models (e.g. OpenAI ada, which outputs 1536D embeddings) are way too big for most use cases and could be giving poorly specified results for embedding similarity as a result, in addition to raising costs for the entire pipeline.

pseudonom- · 2 years ago

Funny, I did almost the exact same thing: https://github.com/colehaus/hammock-public. Though I project to 3D and then put them in an interactive 3D plot. The other fun little thing the interactive plotting enables is stepping through a variety of clustering granularities.
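
For readers who want something concrete, here is a minimal sketch of the k-means step and of stepping through clustering granularities. It assumes the documents have already been embedded and projected to 2D; the `points` list is made-up illustrative data, and `kmeans` is a hypothetical helper written in plain Python, not the code from either pipeline described above.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on 2D points; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct input points

    def nearest(p):
        # Index of the centroid with the smallest squared Euclidean distance to p.
        return min(
            range(k),
            key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
        )

    for _ in range(iterations):
        labels = [nearest(p) for p in points]
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return [nearest(p) for p in points]


# Hypothetical 2D projections of six documents (two obvious groups).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Stepping through granularities: one labelling per cluster count.
granularities = {k: kmeans(points, k) for k in (2, 3)}
```

Each entry in `granularities` could then be sent to a labelling step or a plot, one cluster count at a time.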

pseudonom- commented on Helen Toner shares her side wsj.com/tech/ai/helen-ton... · Posted by u/Leary

sertbdfgbnfgsd · 2 years ago

No, and I don't think it particularly matters. In practice, the people who signal they'll make their friends rich will have the most backers. The board is just there to give the appearance of checks and balances. If these two claims sound outrageous consider that this is exactly how it's playing out.

But you seem to have something in mind. Tell me.

pseudonom- · 2 years ago

OpenAI was formed as a nonprofit with a specific charter ("OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity. We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.") and the capped-profit entity under which daily operations occur formed years later with the claim that it was instrumentally useful for pursuing that charter. The capped-profit entity remains a subsidiary of the nonprofit. The board in the dispute is the board of the overseeing nonprofit.

So there are many particulars that mean pattern matching to a standard board dispute will lose something. I think it's likely many of the primary actors have, at various times, had non-strictly-pecuniary motives. That one side won doesn't mean that the other side was always a farce.

pseudonom- commented on Helen Toner shares her side wsj.com/tech/ai/helen-ton... · Posted by u/Leary

sertbdfgbnfgsd · 2 years ago

> The board’s mandate is to “humanity,” not investors.

Somebody put her there, told her "you know, say it's for humanity or something" and she actually believed it.

No, it's for people to get rich.

pseudonom- · 2 years ago

Are you aware of the history and governance structure of OpenAI?

pseudonom- commented on NixOS and Flakes Book: An unofficial book for beginners (free) nixos-and-flakes.thiscute... · Posted by u/beeburrt

menthe · 2 years ago

Quite the contrary. Home-manager is literally the only thing that's worth using Nix for. Anything beyond that is far too esoteric, unsupported, non-backward-compatible and continuously broken.

My dotfiles managed by nix(-darwin) and home-manager break every time I update my pins, and I find myself having to bisect which commit introduced the issues. Given that, I just don't see how that would scale to a full OS, let alone to a team at work. It's 1000% better and simpler with understandable Dockerfiles and Kubernetes YAML manifests, or with Ansible YAML. At least everyone can StackOverflow and ChatGPT it to a working state, and have it work for a considerable amount of time without further maintenance.

pseudonom- · 2 years ago

Unless I'm misunderstanding something, this is precisely why I don't use Home Manager. I've literally never had my NixOS setup break over the course of many years.

pseudonom- commented on A closer look at BookCorpus, a key dataset in machine learning towardsdatascience.com/di... · Posted by u/Kaibeezy

lukev · 2 years ago

I wonder how much better an LLM could be given even better training data.

For example, the total number of tokens contained in the physical and digital collections of a moderately-sized university library is (probably) equal to or on par with the size of the training data for GPT 3.5.

What would happen if you could train just on that? I know we're using huge training sets, but how much of it is just junk from the internet?

(There should be some representative junk in the dataset, but nowhere near the majority.)

pseudonom- · 2 years ago

https://arxiv.org/abs/2306.11644 is along these lines.

pseudonom- commented on Fixing for loops in Go 1.22 go.dev/blog/loopvar-previ... · Posted by u/todsacerdoti

philosopher1234 · 2 years ago

What stance against PLT are you referring to?

pseudonom- · 2 years ago

Probably quotes like:

"It must be familiar, roughly C-like. Programmers working at Google are early in their careers and are most familiar with procedural languages, particularly from the C family. The need to get programmers productive quickly in a new language means that the language cannot be too radical."

And not including sum types despite having a sum-type-shaped hole in the language (`if err != nil`).

And some of the discussion about "why no generics" seemed kind of divorced from existing PL knowledge on the topic.
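
To illustrate the "sum-type-shaped hole": a sum type makes a result either a value or an error, never both and never neither. Below is a minimal sketch in Python (not Go). The names `Ok`, `Err`, `Result`, `parse_port` and `describe` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# A result is exactly one of the two variants.
Result = Union[Ok[T], Err[E]]


def parse_port(text: str) -> Result[int, str]:
    """Parse a TCP port, returning an Err variant instead of a (value, err) pair."""
    if not text.isdigit():
        return Err(f"not a number: {text!r}")
    port = int(text)
    if port > 65535:
        return Err(f"out of range: {port}")
    return Ok(port)


def describe(result: Result[int, str]) -> str:
    # The caller has to pick a branch before it can touch the payload.
    if isinstance(result, Ok):
        return f"port {result.value}"
    return f"error: {result.error}"
```

With a `(value, err)` pair, nothing stops code from reading the value while ignoring the error; here the value only exists inside the `Ok` variant.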

pseudonom- commented on How Is LLaMa.cpp Possible? finbarr.ca/how-is-llama-c... · Posted by u/birriel

ripvanwinkle · 2 years ago

Thank you! Is there a sweet spot with quantization? How much can you quantize for a given model type and size and still have it be useful?

pseudonom- · 2 years ago

Tim Dettmers recently (https://www.manifold1.com/episodes/ai-on-your-phone-tim-dett...):

"But what we found with these neural networks is, if you use 32 bits, they're just fine. And then you use 16 bits, and they're just fine. And then with eight bits, you need to use a couple of tricks and then it's just fine.

And now we find if you can go to four bits, and for some networks, that's much easier. For some networks, it's much more difficult, but then you need a couple more tricks. And so it seems they're much more robust."
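
As a rough picture of what dropping to 8 or 4 bits means, here is a minimal sketch of symmetric "absmax" quantization, one simple scheme among many. It is an assumption for illustration, not the method from the interview, and it leaves out the "tricks" the quote mentions. `quantize` and `dequantize` are hypothetical helpers and the weights are made up.

```python
def quantize(weights, bits):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax
    if scale == 0:
        scale = 1.0  # all-zero input: any scale works
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


def max_error(weights, bits):
    """Largest absolute round-trip error at the given bit width."""
    quantized, scale = quantize(weights, bits)
    restored = dequantize(quantized, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


weights = [0.5, -1.0, 0.25, 0.0, 0.8, -0.33]  # made-up example weights
errors = {bits: max_error(weights, bits) for bits in (8, 4)}
```

The round-trip error is bounded by half the scale, so each bit removed roughly doubles the worst-case error for a fixed weight range.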

pseudonom- commented on Llama from scratch, or how to implement a paper without crying blog.briankitano.com/llam... · Posted by u/bkitano19

ripvanwinkle · 2 years ago

Thank you! What is batch normalization doing and how does it help?

pseudonom- · 2 years ago

There are other mechanisms for dealing with vanishing and exploding gradients. I (maybe wrongly?) think of batch normalization as being most distinctively about fighting internal covariate shift: https://machinelearning.wtf/terms/internal-covariate-shift/
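
Mechanically, the forward pass is small: each feature is normalized over the batch, so the next layer sees inputs with roughly zero mean and unit variance no matter how the previous layer's outputs drift. Below is a minimal training-mode sketch in plain Python. `batch_norm` is a hypothetical helper, `gamma` and `beta` are simplified to scalars (they are normally learned per feature), and running statistics for inference are omitted.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) of a batch of rows, then scale and shift."""
    n = len(batch)
    n_features = len(batch[0])
    # Per-feature mean and (biased) variance over the batch dimension.
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(n_features)
    ]
    return [
        [
            gamma * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta
            for j in range(n_features)
        ]
        for row in batch
    ]


# Two features on very different scales end up on the same scale.
batch = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
normalized = batch_norm(batch)
```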

pseudonom- commented on Llama from scratch, or how to implement a paper without crying blog.briankitano.com/llam... · Posted by u/bkitano19

bravura · 2 years ago

Overall, a good sense of fundamental principles demonstrated.

Particularly:

"Use .shape religiously. assert and plt.imshow are your friends." Thank you. You should always assert pre and post conditions of shape. (Do bear or typeguard allow you to do this using decorators?)

Some nits:

"Before you even look at the paper, pick a small, simple, and fast model that you've done in the past. Then make a helper function to evaluate the model qualitatively." Don't you mean quantitatively? So that you establish a numerical baseline against which you can compare the more advanced method.

"Start by picking apart different components of the paper, and then implementing them one-by-one, training and evaluating as you go." Can you be precise what you mean here? A lot of work is like: "Okay we tried 10 changes things [for unspecified reasons], some major and some minor, to get our final thing, and here's an ablation study to show how much we lose if we remove each piece." If you would say: "Implement the meat first (the major architectural change fundamental to the work, i.e. the ablation study line-item all the way at the bottom with no seasoning or spices on it)" then yeah, that's a good place to start. But you can't start with a broccoli recipe, switch to a meat recipe, and taste it halfway before it's done cooking and you haven't flipped it, you're not going to learn much. This sort of advance is better framed as: "Evaluate each time you make an atomic change to the approach, prioritizing changes in the order that had the most impact in the ablation study from easiest to hardest, respecting the DAG in which certain changes can be made."

pseudonom- · 2 years ago

> (Do bear or typeguard allow you to do this using decorators?)

You can push some of this directly into Python type annotations thanks to https://peps.python.org/pep-0646/.

e.g.

  @overload
  def mean(a: ndarray[float, Dim1, *Shape], axis: Literal[0]) -> ndarray[float, *Shape]: ...
  @overload
  def mean(a: ndarray[float, Dim1, Dim2, *Shape], axis: Literal[1]) -> ndarray[float, Dim1, *Shape]: ...