ebolyen (u/ebolyen) - Readit News

ebolyen commented on “The Mind in the Wheel” lays out a new foundation for the science of mind experimental-history.com/... · Posted by u/CharlesW

ebolyen · 3 months ago

If anyone is interested in more formal descriptions of these control loops, with more testable mechanisms, check out the concept of reward-taxis. Here are two neat papers that I think are more closely related than might initially appear:

"Is Human Behavior Just Running and Tumbling?": https://osf.io/preprints/psyarxiv/wzvn9_v1 (This used to be a blog post, but it's down, so here's an essentially identical preprint.) A scale-invariant control loop such as chemotaxis may still be the root algorithm we use, just adjusted for a dopamine gradient mediated by the prefrontal cortex.

"Give-up-itis: Neuropathology of extremis": https://www.sciencedirect.com/science/article/abs/pii/S03069... What happens when that dopamine gradient shuts down?
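For readers who have not met run-and-tumble before, here is a toy sketch of the control loop in one dimension. It is not taken from either paper: the reward function, the parameter names (`tumble_up`, `tumble_down`) and their values are all assumptions chosen for illustration. The only rule is "tumble rarely while the signal is improving, tumble often when it is getting worse".

```python
import random

def run_and_tumble(reward, start=0.0, steps=2000, step_size=0.1,
                   tumble_up=0.1, tumble_down=0.9, seed=0):
    """Toy 1-D run-and-tumble walker (hypothetical parameters, not from the papers)."""
    rng = random.Random(seed)
    x = start
    direction = rng.choice([-1, 1])
    last = reward(x)
    for _ in range(steps):
        # Run: keep moving in the current direction.
        x += direction * step_size
        now = reward(x)
        # Tumble: pick a fresh random direction, more often if things got worse.
        p_tumble = tumble_up if now > last else tumble_down
        if rng.random() < p_tumble:
            direction = rng.choice([-1, 1])
        last = now
    return x

def reward(x):
    # Assumed reward landscape: a single peak at x = 5.
    return -abs(x - 5.0)

final = run_and_tumble(reward)
print(f"started at 0.0, ended near {final:.2f}")
```

The walker never computes a gradient or remembers where the peak is. It only compares the current reading with the previous one, which is what makes the loop cheap enough to be plausible at many scales.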

ebolyen commented on A Tiny Boltzmann Machine eoinmurray.info/boltzmann... · Posted by u/anomancer

itissid · 4 months ago

IIUC, we need Gibbs sampling (to compute the weight updates) instead of the gradient-based forward and backward passes that we are used to with today's neural networks. Does anyone understand why that is so?

ebolyen · 3 months ago

Not an expert, but I have a bit of formal training on Bayesian stuff which handles similar problems.

Usually Gibbs is used when there's no straightforward gradient (or when you are interested in reproducing the distribution itself, rather than a point estimate), but you do have some marginal/conditional likelihoods which are simple to sample from.

Since each visible node depends on each hidden node and each hidden node affects all visible nodes, the gradient ends up being very messy, so it's much simpler to use Gibbs sampling to adjust based on marginal likelihoods.
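To make "simple to sample from" concrete, here is a minimal sketch of alternating (block) Gibbs sampling, assuming a restricted Boltzmann machine with binary units and no connections within a layer. The weights, biases and layer sizes are made up for illustration and are not taken from the linked article. Given one layer, every unit in the other layer is an independent coin flip with a sigmoid probability.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_hidden(v, W, b_h, rng):
    """Sample every hidden unit given the visible layer."""
    return [
        1 if rng.random() < sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v)))) else 0
        for j in range(len(b_h))
    ]

def sample_visible(h, W, b_v, rng):
    """Sample every visible unit given the hidden layer."""
    return [
        1 if rng.random() < sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h)))) else 0
        for i in range(len(b_v))
    ]

def gibbs_chain(v0, W, b_v, b_h, steps, rng):
    """Alternate between the two conditionals for a number of steps."""
    v = list(v0)
    h = sample_hidden(v, W, b_h, rng)
    for _ in range(steps):
        v = sample_visible(h, W, b_v, rng)
        h = sample_hidden(v, W, b_h, rng)
    return v, h

# Hypothetical tiny model: 3 visible units, 2 hidden units, random weights.
rng = random.Random(0)
n_visible, n_hidden = 3, 2
W = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)] for _ in range(n_visible)]
b_v = [0.0] * n_visible
b_h = [0.0] * n_hidden

v, h = gibbs_chain([1, 0, 1], W, b_v, b_h, steps=100, rng=rng)
print("visible sample:", v, "hidden sample:", h)
```

Samples drawn this way stand in for the model's own expectations when estimating a weight update, as opposed to backpropagating a loss through a feed-forward pass.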

ebolyen commented on How to Average in Prolog (2017) storytotell.org/how-to-av... · Posted by u/todsacerdoti

danilafe · 4 months ago

This is a strange article to me. I've not seen any class that teaches Prolog place these constraints (use recursion / don't add new predicates) or even accidentally have the outcome of "making prolog look tedious". What's the joke here?

That aside, I wonder if getting the reverse solution (sum(?, 10)) is better served by the straightforward or the tedious approach. I suspect both would work just the same, but I'd be curious if anyone knows otherwise.

ebolyen · 4 months ago

It's been a long time since I took a class like this, but I definitely had a similar experience to the author.

Ideas like fold and map were _never_ mentioned in Lisp (to exaggerate, every function had to have the recursive implementation with 5 state variables and then a simpler form for the initial call). At no point did higher-order functions or closures make an appearance while rotating a list by 1 and then 2 positions.

The treatment of Prolog was somehow worse. Often the code only made any sense once you reversed what the lecturer was saying, realizing the arrow meant "X given Y" not "X implies Y", at which point, if you could imagine the variables ran "backwards" (unification was not explained) the outcome might start to seem _possible_. I expect the lecturer was as baffled by their presentation as we were.

In general, it left the rest of the class believing quite strongly that languages other than Java were impossible to use and generally a bad move. I may have been relatively bitter in the course evaluation by the end.

ebolyen commented on Conda: A package management disaster? pyherald.com/articles/16_... · Posted by u/osdotsystem

mbreese · 8 months ago

I wish you luck with tracking down versions of software used when you're writing papers... especially if you're using multiple conda environments. This is pretty much the example used in the article -- version mismatches.

But, I think this illustrates the problem very well.

Conda isn't just used for Python. It's used for general tools and libraries that Python scripts depend on. They could be C/C++ that needs to be compiled. It could be a Cython library. It could be...

When you're trying to be a package manager that operates on top of the operating system's package manager, you're always going to have issues. And that is why Conda is such a mess: it's trying to do too much. Installation issues are one of the reasons why I stopped writing so many projects in Python. For now, I'm only doing smaller scripts in Python. Anything larger than a module gets written in something else.

People here have mentioned Rust as an example of a language with a solid dependency toolchain. I've used more Go, which similarly has had dependency management tooling from the beginning. By and large, these languages aren't trying to bring in C libraries that need to be compiled and linked into Python accessible code (it's probably possible, but not the main use-case).

For Python code though, when I do need to import a package, I always start with a fresh venv virtual environment, install whatever libraries are needed in that venv, and then always run the python from that absolute path (ex: `venv/bin/python3 script.py`). This has solved 99% of my dependency issues. If you can separate yourself from the system python as much as possible, you're 90% of the way there.
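A small sketch of how a script could check which interpreter it is actually running under, assuming the environment was created with the standard `venv` module (for example `python3 -m venv venv`). The function names are made up for illustration. Inside such an environment `sys.prefix` points at the venv, while `sys.base_prefix` still points at the installation it was created from.

```python
import sys

def in_virtualenv():
    """True when the running interpreter belongs to a venv rather than the base install."""
    return sys.prefix != sys.base_prefix

def describe_interpreter():
    """Human-readable note about where this interpreter lives."""
    where = "a virtual environment" if in_virtualenv() else "the base/system install"
    return f"{sys.executable} is running from {where}"

print(describe_interpreter())
```

Run as `venv/bin/python3 script.py` this should report a virtual environment, and run with the system `python3` it should report the base install.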

Side rant: This is why I think there is a problem with Python to begin with -- *nix OSes all include a system level Python install. Dependencies only become a problem when you're installing libraries in a global path. If you can have separate dependency trees for individual projects, you're largely safe. It's not very storage efficient, but that's a different issue.

ebolyen · 8 months ago

> I wish you luck with tracking down versions of software used when you're writing papers... especially if you're using multiple conda environments.

How would you do this otherwise? I find `conda list` to be terribly helpful.
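For comparison, a rough Python-only sketch of the same idea using the standard library's `importlib.metadata`. The function names and the manifest format are assumptions for illustration. It only sees Python distributions visible to the running interpreter, so unlike `conda list` it will miss compiled tools and non-Python libraries.

```python
from importlib import metadata

def installed_versions():
    """Map each installed Python distribution name to its version string."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            versions[name] = dist.version
    return versions

def write_manifest(path):
    """Write a sorted name==version list, e.g. to archive alongside a paper."""
    versions = installed_versions()
    with open(path, "w") as fh:
        for name in sorted(versions, key=str.lower):
            fh.write(f"{name}=={versions[name]}\n")

if __name__ == "__main__":
    for name, version in sorted(installed_versions().items()):
        print(f"{name}=={version}")
```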

As a tool developer for bioinformaticians, I can't imagine trying to work with OS package managers, so that would leave vendoring multiple languages and libraries in a home-grown scheme slightly worse and more brittle than conda.

I also don't think it's realistic to imagine that any single language (and thus language-specific build tools or pkg manager) is sufficient, since we're still using Fortran deep in the guts of many higher level libraries (recent tensor stuff is disrupting this a bit, but it's not like OpenBLAS isn't still there as a default backend).

ebolyen commented on World’s oldest tree? Genetic analysis traces evolution of iconic Pando forest nature.com/articles/d4158... · Posted by u/pseudolus

Buttons840 · 10 months ago

My guess at a definition: All parts connected, having the same DNA, and supporting each other by sharing nutrients.

ebolyen · 10 months ago

This is true of corals, and they are often considered "colonial" organisms instead of an individual.

That said, I don't think anyone who studies biology is particularly concerned with hard-line definitions, as nature tends to eschew them every chance it has.

I think Pando and corals being considered "modular bodyplans/habits" is perhaps a more useful concept than individual or clone.

ebolyen commented on Don't DRY Your Code Prematurely testing.googleblog.com/20... · Posted by u/thunderbong

Stratoscope · a year ago

Good point. This may be a case where domain knowledge is helpful.

One of the reasons they brought me in on this project is that besides knowing how to wrangle data, I'm also an experienced pilot. So I had a good intuitive sense of the meaning and purpose of the data.

The part of the data that was identical is the description of the airspace boundaries. Pilots will recognize this as the famous "upside down wedding cake". But it's not just simple circles like a wedding cake. There are all kinds of cutouts and special cases.

Stuff like "From point A, draw an arc to point B with its center at point C. Then track the centerline of the San Seriffe River using the following list of points. Finally, from point D draw a straight line back to point A."

The FAA would be very reluctant to change this, for at least two reasons:

1. Who will provide us the budget to make these changes?

2. Who will take the heat when we break every client of this data?

ebolyen · a year ago

I see, so it's a procedural language that is well understood by those who fly (not just some semi-structured data or ontology). This is a great example of the advantage of domain experience. Thanks for sharing!

ebolyen commented on Don't DRY Your Code Prematurely testing.googleblog.com/20... · Posted by u/thunderbong

TeMPOraL · a year ago

In such a case I think I'd go for an internal-DRYing + copy-on-write approach. That is, two identical classes or entry points, one for each format; internally, they'd share all the common code. Over time, if something changes in one format but not the other, that piece of code gets duplicated and then changed, so the other format retains the original code, which it now owns.
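A minimal sketch of what that could look like, with everything hypothetical: the class names, the record layout, and the "lat,lon" boundary format are invented for illustration and bear no relation to the real FAA data. Each format gets its own entry point, and both delegate to one shared helper for the part that is currently identical.

```python
def parse_boundary(lines):
    """Shared code: turn hypothetical 'lat,lon' strings into (lat, lon) float tuples."""
    return [tuple(float(part) for part in line.split(",")) for line in lines]

class ClassAirspaceImporter:
    """Entry point for one format; owns anything specific to it."""
    def load(self, record):
        return {
            "kind": "class",
            "name": record["name"],
            "boundary": parse_boundary(record["boundary"]),
        }

class SpecialUseAirspaceImporter:
    """Entry point for the other format; shares the boundary code for now."""
    def load(self, record):
        return {
            "kind": "special_use",
            "name": record["name"],
            "boundary": parse_boundary(record["boundary"]),
        }

record = {"name": "EXAMPLE", "boundary": ["1.0,2.0", "3.0,4.0"]}
print(ClassAirspaceImporter().load(record))
print(SpecialUseAirspaceImporter().load(record))
```

The "copy-on-write" step happens the day one format changes its boundary rules: `parse_boundary` is copied into that importer and edited there, and the other importer keeps the original untouched.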

ebolyen · a year ago

I like that approach.

ebolyen commented on Don't DRY Your Code Prematurely testing.googleblog.com/20... · Posted by u/thunderbong

Stratoscope · a year ago

Sometimes it's best to be DRY right from the start.

Several years ago, I did some contract work for a company that needed importers for airspace data and various other kinds of data relevant to flying.

In the US, the Federal Aviation Administration (FAA) publishes datasets for several kinds of airspace data. Two of them are called "Class Airspace" and "Special Use Airspace".

The guy who wrote the original importers for these treated them as completely separate and unrelated data. He used an internal generic tool to convert the FAA data for each kind of airspace into a format used within the company, and then wrote separate C++ code, thousands of lines of code each.

Thing is, the data for these two kinds of airspace is mostly identical. You could process it all with one common codebase, with separate code for only the 10% of the data that is different between the two formats.

When I asked him about this, he said, "I have this philosophy that says if you only have two similar things, it's best to write separate code for each. Once you get to a third, then you can think about refactoring and making some common code."

That is a good philosophy! I have often followed it myself.

But in this case, it was obvious that the two data formats were mostly the same, and there was never going to be a third kind of almost-identical airspace, only the two. So we had twice the code we needed.

ebolyen · a year ago

I don't know, that sounds like a complex kind of ingest which could be arbitrarily subtle and diverge over time for legal and bureaucratic reasons.

I would kind of appreciate having two formats, since what are the odds they would change together? While there may never be a 3rd format, a DRY importer would imply that the source generating the data is also DRY.

ebolyen commented on TypeScript: Branded Types prosopo.io/articles/types... · Posted by u/arbol

LadyCailin · a year ago

That’s just plain encapsulation, if I understand you correctly. Branding, on the other hand, prevents complex types from being confused.

ebolyen · a year ago

Branding works on primitive types as well, which I think is the most interesting use case.

I would also agree that it's harder to confuse complex types, as any single instance of a type is unlikely to overlap once you have a few fields.
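The article is about TypeScript, where a brand is usually a phantom property intersected with the base type. As a rough analogue of the primitive case in Python, `typing.NewType` gives two integer IDs distinct static types. The `UserId`/`OrderId` names and the `cancel_order` function are made up for illustration.

```python
from typing import NewType

# Two "brands" over the same primitive type.
UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)

def cancel_order(order_id: OrderId) -> str:
    return f"cancelled order {order_id}"

user = UserId(42)
order = OrderId(42)

print(cancel_order(order))

# cancel_order(user) would be flagged by a static checker such as mypy,
# even though at runtime both values are plain ints and compare equal.
print(user == order)
```

As with TypeScript brands, the distinction exists only for the type checker. Nothing is wrapped or validated at runtime.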

ebolyen commented on Distributed Authorization osohq.com/post/distribute... · Posted by u/AnhTho_FR

ebolyen · a year ago

Slightly tangential, but is there any hope of seeing Polar return as a (maintained) open source system?

I absolutely love the concept of using a logic language for authorization, and I think Polar's aesthetic qualities make it significantly more approachable for most people (over Prolog/Datalog).

But even without the authorization problem, Polar is just... really nice looking. It would be awesome to be able to use it as its own language outright.