Krapivin made this breakthrough by being unaware of Yao's conjecture.
The developer of Balatro made an award winning deck builder game by not being aware of existing deck builders.
I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before. This makes me kind of sad, because the current world is so interconnected that we rarely see such novelty; people tend to "fall in the rut of thought" of those that came before. The internet is great, but it also homogenizes the world of thought, and that kind of sucks.
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before. This makes me kind of sad, because the current world is so interconnected that we rarely see such novelty; people tend to "fall in the rut of thought" of those that came before. The internet is great, but it also homogenizes the world of thought, and that kind of sucks.
I think this is true only if there is a novel solution that is in a drastically different direction than similar efforts that came before. Most of the time when you ignore previous successful efforts, you end up resowing non-fertile ground.
Right, the other side is when you end up with rediscoveries of the same ideas. The example that comes to my mind is when a medical researcher found the trapezoidal rule for integration again[1].
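For reference, the rediscovered rule is just the standard composite trapezoidal approximation (written out here as a refresher, not taken from the paper in question):

    \int_a^b f(x)\,dx \;\approx\; \sum_{i=1}^{n} \frac{f(x_{i-1}) + f(x_i)}{2}\,(x_i - x_{i-1})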
There might be a decent amount of "survivorship bias" too, meaning you only hear of the few events where someone starts from first principles and actually finds, be it by luck or smarts, a novel solution which improves on the status quo. But I'd argue there are N other similar situations where you don't end up with a better solution.
That being said, I so disagree with just taking the "state of the art" as written in stone, and "we can't possibly do better than library x" etc.
In 1973 Clifford Cocks solved the public-key problem, the first time in history anyone had, after no one at GCHQ had managed to crack it in the previous three years. He jotted down the solution in half an hour after hearing about the problem, then wondered why it was such a big deal for everyone else. A fresh view unclouded by prejudice can make all the difference.
I'm not sure this is true, though I think it looks true.
I think the issue is that when a lot of people have put work into something, you think that your own chances of success are low. This is a pretty reasonable belief too. With the current publish-or-perish paradigm, I think this discourages a lot of people from even attempting. You have evidence that the problem is hard and that even if it is solvable, it will probably take a long time, so why risk your entire career? There are other interesting things that are less risky. In fact, I'd argue that this environment in and of itself results in far less risk being taken. (There are other issues too, and I laid out some in another comment.) But I think this would look identical to what we're seeing.
Right. FWIW, Feynman predicted that physics would become rather boring in this regard, because physics education had become homogenized. This isn't to propose a relativism, but rather that top-down imposed curricula may do a good deal of damage to the creativity of science.
That being said, what we need is more rigorous thinking and more courage in pursuing the truth where it leads. While advisors can be useful guides, and consensus can be a useful data point, there can also be an over-reliance on such opinions to guide and decide where to put one's research efforts, what to reevaluate, what to treat as basically certain knowledge, and so on. Frankly, moral virtue and wisdom are the most important. Otherwise, scientific praxis degenerates into a popularity contest, fitting in, grants, and other incentives that vulgarize science.
In my experience the best approach is to first try to solve the problem without having read the prior work, then read the prior work, then improve your approach based on the prior work.
If you read the prior work too early, you get locked into existing mindsets. If you never read it, then you miss important things you didn't think of.
Even if your approach is not as good as the prior work (the normal case), you gain important insights into why the state-of-the-art approach is better by comparing it with what you came up with.
A decade ago I read this same advice in "The Curmudgeon's Guide to Practicing Law": spend at least a little time trying to solve the problem before you look at how others have solved it. One benefit is that occasionally you may stumble on a better method. But the more common benefit is that it helps develop your problem-solving skills and primes you to understand and appreciate existing solutions.
> If you read the prior work too early, you get locked into existing mindsets.
I agree, though in some cases coming up with your own ideas first can result in you becoming attached to them, because they are your own. It is unlikely for this to happen if you read the prior work first.
Though I think reading the prior work later is probably still a good idea overall, with the intention of not becoming too impressed with whatever you came up with before.
> The developer of Balatro made an award winning deck builder game by not being aware of existing deck builders.
He was aware of deck builders and was directly inspired by Luck be a Landlord, but he was not aware of just how massive the genre is.
Direct quote from the developer:
> The one largest influence on Balatro was Luck Be a Landlord. I watched Northernlion play for a few videos and loved the concept of a non-fantasy themed score attack roguelike a ton, so I modified the card game I was working on at the time into a roguelike.
> I cut myself off from the genre at that point intentionally, I wanted to make my own mistakes and explore the design space naively just because that process is so fun. I hear the comparison to Slay the Spire a lot but the truth is that I hadn't played that game or seen footage of it when I designed Balatro, not until much later.
“They’re cheering for you,” she said with a smile.
“But I could never have done it,” [Milo] objected, “without everyone else’s help.”
“That may be true,” said Reason gravely, “but you had the courage to try;
and what you can do is often simply a matter of what you will do.”
“That’s why,” said King Azaz, “there was one very important thing about your quest
that we couldn’t discuss until you returned.”
“I remember,” said Milo eagerly. “Tell me now.”
“It was impossible,” said the king, looking at the Mathemagician.
“Completely impossible,” said the Mathemagician, looking at the king.
“Do you mean … ,” said the bug, who suddenly felt a bit faint.
“Yes, indeed,” they repeated together, “but if we’d told you then, you might not have gone …
and, as you’ve discovered, so many things are possible just as long as you don’t know they’re impossible.”
A professor I had in college, whose first published result came from a piece of homework he turned in where he incidentally solved an open question about a bound on a problem, had a curious habit.
I ended up failing and taking his course again (because I had A Lot going on in college), and thus, noticed something.
Each semester, on one of the assignments in the latter half of the class, he assigned one problem out of perhaps 30 in the problem set that, as written, was actually an open problem. Then, a day or two before the assignments were due, he'd send out an "oops, my bad" revised version.
I suspect that this was not an accident, given that it always happened only once.
> A professor I had in college, whose first published result came from a piece of homework he turned in where he incidentally solved an open question about a bound on a problem, had a curious habit.
A related anecdote is about George Dantzig (perhaps best known for the simplex algorithm):
> During his study in 1939, Dantzig solved two unproven statistical theorems due to a misunderstanding. Near the beginning of a class, Professor Spława-Neyman wrote two problems on the blackboard. Dantzig arrived late and assumed that they were a homework assignment. According to Dantzig, they "seemed to be a little harder than usual", but a few days later he handed in completed solutions for both problems, still believing that they were an assignment that was overdue. Six weeks later, an excited Spława-Neyman eagerly told him that the "homework" problems he had solved were two of the most famous unsolved problems in statistics.
I think going one layer lower - the fundamental issue is that the internet drives people to unrealistic perceptions of the competence of others. Think about all of the undeniably brilliant people that have been involved in software over the past 40 years, and how many of them used hash tables in performance critical environments. Let alone mathematicians and others using them in applied domains. And you think there's something fundamental that all of these people just somehow missed?
The argument of 'if that's such a good idea, why wouldn't somebody have just done it already?' seems to have grown exponentially with the advent of the internet. And I think it's because the visibility of others' competence became so much clearer. For those who lived through e.g. Carmack's Golden Age, you knew you were never going to be half the coder he was, at least based on the image he successfully crafted. That 'slight' at the end is not to say he wasn't a brilliant developer, or even perhaps the best in the world at his peak, but rather that brilliance + image crafting creates this Gargantuan beast of infallibility and exceptionalism that just doesn't really exist in reality. I think it's from this exact phenomenon that you also get the practical fetishism of expertise.
Picking two examples out of all people approaching problems, while ignoring wasted effort and failures to make progress because of not understanding current knowledge, is an absolutely terrible reason to approach from ignorance.
The biggest gains in theory and in practice are far more often obtained by masters of craft, giving much more weight to attacking problems from a position of knowledge.
In fact, even in this case, this progress required that the author was aware of very recent results in computer science, was thinking deeply about them, and most likely was scouring the literature for pieces to help. The “Tiny Pointers” paper is mentioned directly.
Well said. I really dislike the narrative here that ignorance is something that leads to discovery. One, the poster gives two examples, as if there's something we should conclude from such a small sample. In addition, one of the examples isn't valid: the student's former professor is a co-author of the "Tiny Pointers" [1] paper that he was reading and working through. And it was a conjecture; I don't see why anyone would take that to mean it's impossible.
I would rather, instead of thinking ignorance is a key ingredient to discovery, that instead it's the willingness to try things.
A similar idea came up in Veritasium's latest video today. DeepMind's AI for predicting protein folding improved greatly by withholding the most evident information about a protein's primary structure, its linear polypeptide chain, within the Structure Module step. [0]
After asking ChatGPT not to agree with me that your comment and these two different approaches to solving problems are alike, it concluded there still might be similarities between the two.
It’s important to think outside the box, and that’s easier when you’re not aware of the box, but we also stand on the shoulders of giants, and are doomed to repeat history if we don’t learn from it. As usual, things aren’t clear-cut.
Plus, their choice of bringing up Balatro wasn't correct either. The developer DID play other deck builders, just not that many. In particular, they played Slay the Spire, which is the genre's most influential "Giant" and the entire reason for the game's progression structure (small fights leading to a known big fight with particular anti-strategy gimmicks).
> It’s important to think outside the box, and that’s easier when you’re not aware of the box
I don't think so. If you are not aware of the box, there is a much greater chance for you to be well within the box and not realizing it. By that, I mean that either you rediscovered something, or you are wrong in the same way as many others before you. By chance, you may find something new and unexpected, but that's more of an exception than the rule.
I've been working on and off for years on a Scrabble endgame solver; it uses all these techniques from chess like transposition tables, negamax with alpha-beta pruning, NegaScout, aspiration search and so on. There's a French person who built his own endgame solver, and his solver is significantly faster than mine, even with all of the optimizations that I've put into it. He is kind of secretive about it because it's closed source and he makes some money on it, but we've talked a bit about it, compared some positions, and we've determined that his move generation algorithm is actually not as optimized as mine. But he can still solve the endgame faster despite seeing fewer positions, which implies to me that he's doing a significantly better job of pruning the tree.
But when we try to talk details, I asked him for example whether he uses minimax with alpha-beta pruning, and he told me something like "I'm not sure if I am using minimax or what that is :(" ... When I ask him to describe what he does, he essentially describes minimax with pruning. I've sort of figured out that he must be doing some very intelligent version of an aspiration search. It's really eye-opening because he doesn't have any of this training. He's never seen any related algorithms; he's just figuring all this out on his own.
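For anyone who hasn't seen it, the core idea he reinvented looks roughly like this; a minimal negamax-with-alpha-beta sketch in generic Python, where legal_moves, apply, undo and evaluate are placeholder hooks rather than anything from either of our solvers:

    # Minimal negamax with alpha-beta pruning (illustrative only, not either
    # solver discussed above). `state` is assumed to expose legal_moves(),
    # apply(move), undo(move) and evaluate() -- all placeholder hooks.

    def negamax(state, depth, alpha, beta):
        if depth == 0 or not state.legal_moves():
            # Score from the perspective of the player to move.
            return state.evaluate()
        best = float("-inf")
        for move in state.legal_moves():
            state.apply(move)
            score = -negamax(state, depth - 1, -beta, -alpha)
            state.undo(move)
            best = max(best, score)
            alpha = max(alpha, score)
            if alpha >= beta:
                break  # cutoff: the opponent will never allow this line
        return best

The whole trick is in that cutoff line: once a move is good enough that the opponent would avoid the position, the rest of the subtree doesn't need to be searched.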
I think of Andre Geim as a great example of balancing the two. I couldn't find the exact quote, but he said something to the effect of "when I enter a new field, I make sure I learn the basics so I don't spend all my time making dumb mistakes. But I don't get so into it that I get stuck in the mindshare."
I'll also say I think that diversity in approaches is more important than One Right Way. Some people need to set out on their own, while others spend decades refining one technique. Both have led to extraordinary results!
Many problems are abstract and so we have to build "cartoon" models of what's going on, trying to distill the essence of the problem down to a simple narrative for what the shape of the problem space is and where the limitations are. That often works but backfires when the cartoon is wrong or some assumptions are violated about when the cartoon description works.
Results like this are pretty rare, nowadays, and I suspect this happened because the problem was niche enough or some new idea has had time to ferment that could be applied to this region. This seems like a pretty foundational result, so maybe I'm wrong about that for this case.
A lot of progress is made when there's deeper knowledge about the problem space along with some maturity for when these cartoon descriptions are invalid.
If we achieved local maximum at something, the only way to progress is to make a big leap that brings you out of it. The trouble is that most of such big leaps are unsuccessful. For every case like you are describing there are probably hundreds or thousands of people who tried to do it and ended up with something worse than the status quo.
This reminds me of the Neal Stephenson article "Innovation Starvation" from 2011:
>A number of engineers are sitting together in a room, bouncing ideas off each other. Out of the discussion emerges a new concept that seems promising. Then some laptop-wielding person in the corner, having performed a quick Google search, announces that this “new” idea is, in fact, an old one—or at least vaguely similar—and has already been tried. Either it failed, or it succeeded. If it failed, then no manager who wants to keep his or her job will approve spending money trying to revive it. If it succeeded, then it’s patented and entry to the market is presumed to be unattainable, since the first people who thought of it will have “first-mover advantage” and will have created “barriers to entry.” The number of seemingly promising ideas that have been crushed in this way must number in the millions. What if that person in the corner hadn’t been able to do a Google search?
>In a world where decision-makers are so close to being omniscient, it’s easy to see risk as a quaint artefact of a primitive and dangerous past (…) Today’s belief in ineluctable certainty is the true innovation-killer of our age
I believe Ramanujan did the same with various maths problems. The Cambridge undergrad course sprinkles a few unsolved problems in the practice questions just in case someone does this again.
There's a reason the phrase "beginner's luck" exists. I'm not sure the naïveté and success are causally related so much as they might be coincident.
Could knowing about prior research skew one's perspective and tarnish novel thought? Sure. But we don't know. Maybe we'd have an even better Balatro if the creator knew about some other deck builders. Maybe we wouldn't, we don't know. We cannot prove the counterfactual.
On the opposite extreme, there are examples of thinkers whose success stemmed from knowing much about one domain or much about many domains and integrating (Luhmann, Goethe, Feynman, Von Neumann etc.). In the general case, I think we are probably much better off promoting knowledge and study, and not ignorance and chance.
That said, I do think we should retain our willingness to play and to try things that are "out of bounds" with respect to the existing accumulated knowledge. We should live informed lives, but play and explore like unschooled children.
> the authors have also learned of several other hash tables that make use of the same high-level idea in different settings [7, 9].
At least part of the result was already known, and the fact that the authors didn't know about it mostly speaks to the large corpus of knowledge we already possess.
But the core inspiration came from looking at another recent research paper, "Tiny Pointers": that is totally against your premise.
If Krapivin was a software engineer looking to implement this solution as optimization for a particular problem, he would have done so without ever thinking of making a research paper to prove it formally, but mostly relied on benchmarking to prove his implementation works better.
Now, it has always been somewhat true that lots of existing knowledge limits our creativity in familiar domains, but you need both to really advance science.
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
This is an "Einstein failed Math" fallacy. It's true that novel and notable work tends strongly not to be bound by existing consensus, which when you say it that way is hardly surprising. So yes, if consensus is wrong in some particular way the people most likely to see that are the ones least invested in the consensus.
But most problems aren't like that! Almost always the "best" way to solve a problem, certainly the best way you're going to solve the problem, is "however someone else already solved it". But sometimes it's not, and that's when interesting stuff happens.
There's a problem in all human understanding - knowing when, and knowing when not to apply pre-existing knowledge to a problem.
Have we been grinding away in the right direction and are only moments away from cracking the problem, or should we drop everything and try something completely new because we've obviously not found the solution in the direction we were heading?
To put it into a CS-type context: should we be using DFS or BFS to search for the solution? We don't have knowledge of future cost (so UCS/Dijkstra's is out), nor do we know where the solution lies in general (so A* is out, even if you ignore the UCS component).
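As a toy illustration of that framing (generic Python, not tied to any particular research problem), the only difference between the two blind strategies is how the frontier is managed:

    from collections import deque

    # Blind search over a graph given as {node: [neighbors]}. DFS pops from
    # the end (a stack), BFS pops from the front (a queue); neither uses
    # cost-to-go or a heuristic, mirroring the comment above.

    def blind_search(graph, start, is_goal, strategy="bfs"):
        frontier = deque([start])
        visited = {start}
        while frontier:
            node = frontier.pop() if strategy == "dfs" else frontier.popleft()
            if is_goal(node):
                return node
            for nxt in graph.get(node, []):
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append(nxt)
        return None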
> Krapivin made this breakthrough by being unaware of Yao's conjecture.
I don't think there's any evidence of this. Yao's conjecture is not exactly standard undergraduate material (although it might be; this is a commentary on detail rather than difficulty, and I certainly didn't encounter this conjecture in school). If not knowing this conjecture was the key, millions and millions of students failed to see what Krapivin did. I imagine you'd have to ask him what the key to his insight was.
Hashing is a pretty unintuitive sort of computation. I'm not surprised that there are still surprises.
Great point. Also, Krapivin was working on another paper co-authored by his former professor; he was in fact not working from ignorance. And like you said, almost no one knew anything about this conjecture, so ignorance certainly wasn't the key ingredient here.
This is confirmation/survivorship bias. You only hear about these positive cases. The vast majority just end up rediscovering old techniques, and their year-long paper/work gets rejected.
Your exact thoughts have already been put to paper by
L. P. Hammett, godfather of physical organic chemistry (the exact description of chemical reactions):
one might “... overlook the great difference between exact theory and approximate theory. Again, let me emphasize my great respect for approximate theory. [...] if one starts looking for an effect predicted by this kind of theory to be impossible, the odds are against a favorable outcome. Fortunately, however, the community of scientists, like that of horseplayers, contains some people who prefer to bet against the odds as well as a great many who always bet on the favorite. In science we should, I think, do all we can to encourage the man who is willing to gamble against the odds of this sort.
This does not mean that we should encourage the fool or the ignoramus who wants to play against suicidal odds, the man who wants to spend his time and usually someone else’s money looking for an effect incompatible with, let us say one of the conclusions reached by Willard Gibbs. Gibbs started from thoroughly proven generalizations, the first and second laws of thermodynamics, and reasoned from them by exact mathematical procedures, and his conclusions are the best example I know of exact theory, theory against which it is futile to struggle.”
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
Both Danny Trejo and Tim Allen spent time in prison before becoming famous. While that's interesting, I'm not sure I'm ready to believe that's the best way to become a professional actor.
Edit to be a little less snarky, apologies:
"Outsiders" are great for approaching problems from fresh angles, but I can almost guarantee that the majority of nuts-and-bolts progress in a field comes from people who "fall in the rut of thought" in the sense that they area aware enough of the field to know which paths might be most fruitful. If I had to place a bet on myself, I wouldn't take a wild uninformed swing: I'd get myself up to speed on things first.
Outsiders sometimes do great work. They also sometimes:
There is a book about this theory written in the 1960s called 'The Structure of Scientific Revolutions' by Kuhn, which talks about some sciences progressing one funeral at a time and how progress is not linear. He also remarks on how people from outside the standard thinking and education surrounding the current system are typically the ones who actually advance science.
One example is Geocentrism vs Copernican astronomical models -- Copernican could never have sprung from the status quo because everything revolved around the Earth in Geocentrism instead of around the Sun. You can't square that circle.
About geocentrism vs heliocentrism, 3blue1brown has recently released a video [1] that talks about it. It is about the cosmic distance ladder, but geocentrism is mentioned, and it makes a lot of sense in context.
To summarize: heliocentrism was known to the ancient Greeks, who realized the Sun was much bigger than the Earth, so it would seem logical to have the Earth go around the Sun.
But the counterargument was that if the Earth goes around the Sun, the stars should move relative to each other during the year, because of parallax, and they didn't have instruments that were precise enough to see it, so they assumed that they didn't.
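Roughly (my back-of-the-envelope addition, not from the video), a star at distance d should appear to shift each year by about the parallax angle

    p \approx \frac{1\ \text{AU}}{d}

which is under an arcsecond even for the nearest stars, far below what pre-telescopic instruments could resolve.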
Copernicus's major contribution wasn't heliocentrism, but the orbital periods of the planets. And the model wasn't complete until Kepler calculated the shapes of the orbits. For details, watch the video; it is really good.
I'm being picky here, but I don't think you portray a fair view of Kuhn's epistemology.
Kuhn does not define a value scale between the two methods; on the contrary, he merely introduces the concept of two kinds of research: one critical (developing new paradigms) and one accumulative (further refining existing paradigms).
He also hints at the almost inevitably organic interaction between the two, such that critical research naturally evolves from a pragmatic need to express things simply in a new paradigm when the old one becomes too clumsy for a use case.
This is what happened in your example as well. Copernicus (and later Galileo) did not invent heliocentrism out of the blue; the theory had existed since ancient Greece. It is even arguably the Renaissance, leading metaphysicists to revisit ancient texts, that gave Copernicus the idea to consider it. But ultimately the need for the new paradigm was pushed by the need to revise the calendar, which was drifting, and the difficulty of doing so in a geocentric world, where you have to take planetary retrograde motion into account.
Heliocentrism was well known; the issue was that the Copernican model was a bad model given the evidence and knowledge of physics available at the time (it was basically equivalent to a geocentric model but needed more epicycles, not fewer, and it also required that the Earth rotated, plus some unusual properties for the stars). It took Kepler figuring out ellipses and slowly beating out epicycles as a way to do the math, as well as some other experiments which established that the Earth did indeed rotate (not for lack of trying by heliocentrism advocates, but it's a hard measurement to make), to bring the idea mainstream. (And arguably only Newton's laws of motion actually tied it all together.)
Having just finally read (well, listened to) Kuhn's book, I can say:
(a) I wouldn't quite characterize the book as being "about this theory"; it's a bit more nuanced. He definitely says that it's usually younger scientists, with less invested in the currently reigning theory, who are most likely to push forward a revolution. However, I don't recall any examples in the book of people who were wholly _unaware_ of the previous theory.
(b) You should absolutely, definitely read it. It's a classic for a reason, and the writing style is a delight.
I've always had a mind that worked that way: I can imagine how something works or could work before looking up how it actually does work. But there's no real benefit to thinking that way in my experience. Thinking differently has only been a career impediment or gotten on my teachers' nerves for being "smart".
For example, as a young kid I saw a geometric ball made up of hinges that allow it to expand and contract, and in some stages it looks a little like a gear. So I started wondering whether a car could continuously change a gear's size instead of switching between gears. Then a decade or so later I started seeing CVT transmissions in cars, which are the same concept: you change the size/ratio by expanding or contracting the roller instead of switching gears.
Last year's Advent of Code had a task that was NP-complete and lacked good, well-known approximation algorithms. I almost gave up on it when I realised that, because it felt impossible.
In practice the data was well behaved enough and small enough that it was very doable.
I agree in the specific case that the state of the art is at a local maximum, but saying "the best way to approach a problem is by not being aware of or disregarding previous attempts" ignores the much more frequent, banal work of iterative improvement. Leaping out of a local maximum is rare and sexy and gets articles written about you and is important, but the work of slowly iterating up to a nearby peak is also important.
I think progress needs both: individuals who break out of preconceived notions, and the communal work of improving within the notions we currently have.
I watched a casual YouTube video by a philosophy professor making the same point: great scholars are different from great thinkers. Many great thinkers came up with great philosophies because they misread past works.
This is a really tough problem. I don't think ignorance is the answer, but it's also difficult to set aside things that seem legitimate and go down a rabbit hole of reinventing something on a hunch. I guess the saving grace is that it's impossible to know enough about such a wide swathe of things for this to often be a problem. With large models that could conceivably encode our collective knowledge, though, we have to be vigilant about creating an orthodoxy that ultimately constrains us.
"fall[ing] in the rut of thought" reminds me of this paragraph from "The Footpath Way":
> So long as man does not bother about what he is or whence he came or whither he is going, the whole thing seems as simple as the verb "to be"; and you may say that the moment he does begin thinking about what he is (which is more than thinking that he is) and whence he came and whither he is going, he gets on to a lot of roads that lead nowhere, and that spread like the fingers of a hand or the sticks of a fan; so that if he pursues two or more of them he soon gets beyond his straddle, and if he pursues only one he gets farther and farther from the rest of all knowledge as he proceeds. You may say that and it will be true. But there is one kind of knowledge a man does get when he thinks about what he is, whence he came and whither he is going, which is this: that it is the only important question he can ask himself. (The Footpath Way, Introduction (1))
Even though the author is talking about a different kind of knowledge, the image of sticks of a fan - where going down one gradually excludes the others - stuck with me.
Eventually I'll get to actually building a POC/tech demonstrator that just has fewer modules at perhaps lower current density, to show that even several kV DC can be efficiently converted, not just on paper, down to a few kV or sub-kV DC. At enough voltage, grounding is no longer optional anyway, so you might as well do essentially an autotransformer plus extra protection to protect humans against electric shock (an RCD doesn't work directly, but the functionality can still be offered; it just has to sense quite differently).
Why DC? An overhead line limited only by peak voltage (arcing) and thermals can carry twice the power when running DC instead of AC, assuming both are measured relative to ground.
Also, you can run your transistors completely steady-state at all frequency components between their own switching fundamental and your load transients.
No more over provisioning just to make up for legacy 50/60 Hz AC.
Also, to a degree, you can just plug raw batteries into that DC grid, with at most a little bit of DC regulation to force the voltage a bit higher/lower than the batteries'.
Like, a power supply basically rated to a couple percent of the battery's max input/output power: it only needs to move the small extra voltage, though of course at the full current.
Lastly, DC converters are just way smaller and lighter, so you could avoid the heavy, bulky transformers in trains and alleviate the power limits they impose. Relevant for fast double-decker trains, because you'd prefer to have human space where you currently park the transformer.
I have to say though, novel development of technology by pulling in recent innovations from the fundamental/material science fields underlying the target is not at all an easy thing to do.
You get novel branches of thought, but in the limit case, you're also reinventing the universe to bake an apple pie.
So there's something of a tradeoff between attempting to ensure people can do more than mimic existing doctrine and efficiency of getting up to speed without having to re-prove existing math and science.
The Balatro dev also, for example, has talked about how he was heavily influenced by several specific other games.
I would say not letting your thoughts be constrained by the bias of existing approaches.
This isn't easy at all. It requires training yourself to have an open and flexible mind in general.
Not knowing about something is more like a cheat to get there more easily.
But it's super common that innovation involves a lot of well-known foundational work and is only very different in one specific aspect, and it's quite hard to know about all the other foundational work but not that specific aspect, especially if you don't even know which aspect can fundamentally be "revolutionized"/"innovated".
But what always helps when you learn about a new topic is to try blindly first yourself and then look at what the existing approaches do. Not just for doing groundbreaking work, but even for, e.g., just learning math.
One of the math teachers I had in the school years before university used this approach for teaching math. It yielded way better independent understanding and engagement, and it helped me a lot later on. Sadly I only had that teacher for two years.
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
Extrapolating a "best way" from a couple of examples of success is bad reasoning. There are definitely ways in which it can be necessary to ignore the standing wisdom to make progress. There are also definitely ways in which being ignorant of the knowledge gained by past attempts can greatly impede progress.
I would point out that it is also possible to question and challenge the assumptions that prior approaches have made, without being ignorant of what those approaches tried.
Figuring out which is which is indeed hard. Generally, it seems to work well to have a majority of people expanding/refining prior work and a minority of people starting from scratch to figure out which of the current assumptions or beliefs can be productively challenged/dropped. The precise balance point is vague, but it seems pretty clear that going too far in either direction harms the rate of progress.
Walter Isaacson said something similar about Einstein and Steve Jobs. Sometimes you need to reject commonly held assumptions to make progress. Einstein rejected the idea of ether. According to Isaacson this was probably because Einstein was working outside of university. Inside university, professors would likely have pushed Einstein to stick to the idea of ether.
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
I don't think that's warranted.
You will find that the vast majority of lottery winners have bought lottery tickets. However that doesn't mean that buying lottery tickets is a good idea financially.
Consider a simplified example. There is some area of scientific research. Working within the framework gives you a 1 in 4 chance of making some minor improvement. Working outside the framework gives you a 1 in a million chance to create a great leap in knowledge.
For any single individual, the best choice is the former. The latter is a gamble that most people will lose, wasting their lives chasing crazy theories.
For society, you want a split. You need some doing the second option to have the eventual amazing discovery, but you also need to progress the current understanding further.
If we introduce a chance for the minor progress to lead to the same major advancement, it becomes a bit simpler for society to calculate the best allocation of researchers, but for any single person the best option still remains to dedicate themselves to the small advancements.
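To make that toy example concrete, here is the expected-value arithmetic with payoff numbers that are entirely made up for illustration:

    # Made-up payoffs: a minor result is worth 1 unit of scientific value,
    # a great leap is worth 10,000 units. Probabilities are from the comment above.
    p_minor, value_minor = 1 / 4, 1
    p_leap, value_leap = 1 / 1_000_000, 10_000

    ev_inside = p_minor * value_minor    # 0.25 expected units for the individual
    ev_outside = p_leap * value_leap     # 0.01 expected units for the individual

    print(ev_inside, ev_outside)
    # Even with a 10,000x larger prize, the individual's expected payoff is far
    # lower outside the framework, which is why society as a whole still wants
    # some fraction of researchers taking that bet.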
Ok, but you are disregarding the 1000s of things the undergrad was aware of and the fact that he worked with other researchers who were aware of the existing results enough to understand the significance of the result.
The real trick is simply to try to understand things directly and not rely on proof by authority all the time.
It is just easier to think outside the box when your mind is not "polluted" with previous ideas, and from time to time someone appears who was thinking in another way, probably the way most obvious to them, without knowing about the orthodox thinking on the subject.
It takes time to read all the prior research. You could grow old by the time you get through it all. Likelihood of contributing to the field declines with age.
You might believe someone's proof of a conjecture and then be discouraged from delving any more into that rabbit hole.
More often than not you will be reinventing something. But that's not necessarily less productive than reading other people's work. In the former case, you're at least making something, even if it's not new.
So there are some arguments for being fresh in a well-trodden field with an ocean of research that you cannot boil all at once.
On the other hand, there is the publish-or-perish pressure in academia, which requires original research. You could just keep busy and throw enough shit against the wall that some of it sticks.
Domain knowledge is valuable because you can wield it to great effect as opportunities arise. It lets you leapfrog problems by applying known solutions. The risk is being blind to novel approaches that require innovation.
Being capable of tackling problems from first principles is invaluable because we frequently encounter problems that are novel in some dimension, even if that dimension is just the combination of dimensions. This lets you leapfrog large problems by decomposition, possibly going against the grain and innovating by, hopefully, simplifying. However, there is a risk of falling into traps that countless others have already learned about the hard way.
This may come as a surprise to some but, believe it or not, you can have both. In fact, you should.
There is certainly a need for ignoring common wisdom if you want to make something new. I don't think being unaware of it is necessary as long as you are willing to go forward while being told that you are on a fool's errand.
It's less that it's the best way to approach a problem, and more that it optimizes for a different goal. Building on existing knowledge is how you find the local maximum for a problem by moving along the slope you are on. Starting from scratch is how you find different slopes, which may lead to higher local maxima.
Of course, if you happen to be on a slope that leads to the global maximum, starting from scratch is far less effective. We don't really know where we are, usually, so there's a trade-off.
There was a good article posted to HN years ago that covered this and used rocketry as one of the examples, but I don't recall what it was. The author was well known, IIRC.
In university lectures, we'd be presented with a problem on one slide, and then like ten seconds later the solution on the next. I'd usually cover my ears and look away because I was still busy coming up with my own solution!
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
I think sometimes this is true. When I've had new starters on my engineering team, I've always tried to teach them about the problem before they get exposed to any of our solutions. Sometimes they have brand new insights that we've been completely blind to. It doesn't always happen, but there is only one opportunity for this; once they've seen the solutions, they can't be unseen.
Somewhat surprisingly (to me), this is also found for user interfaces [0]. The best initial design for a user interface for a feature phone was done by designers who were not shown previous work by other designers. Iterations based on previous designs were only better if the designers were shown the "winning" initial design.
This sounds like the approach the DeepSeek CEO used for hiring. He favored young, inexperienced teams so they could bring a fresh perspective and try things in new ways. It paid off nicely.
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before...
That depends...
- Krapivin was an undergrad, tinkering with stuff for fun. If he'd put a couple months into this, and just ended up re-inventing a few things? That'd be decently educational.
- Vs. if your team needs to ship product, on something resembling the schedule? Yeah. You definitely stick to tried-and-true algorithms.
When training physically, you can overtrain one muscle and come to depend on it. By deliberately not using that muscle, you can improve your other muscles.
It is well known that limitations improve creativity.
That said, I still think the best path is to learn the classical path; if you want, you can question some axioms, but it's mostly irrational in that there's almost no reward for you personally except clout; most of the reward goes to science as a whole.
In terms of practical engineering, this is also why I love to do side projects that reject existing bodies of libraries, and try to work up from first principles, and/or focus on composability rather than integration.
It's a trade-off, at first it takes longer to iterate on features, but sometimes a more minimal and/or composable tool finds its way to production. Real Systems are made of duct tape anyways.
I used to think this a few decades ago. I think it's just as accessible with some mix of anti-authoritarianism and defiant personality.
Essentially you learn a thing, you accept it for now and you think "well but maybe!"
Like I personally think there should be multiple mathematical zeroes but I accept it as wackiness unless I can clearly demonstrate coherency and utility as to why.
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
Everyone likes to focus on why you cannot do something and why trying will be futile.
You don't have to disregard prior efforts. You just have to focus on one simple question:
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
Survivorship bias, you aren't aware of all the failures where people who were unaware of prior art made all the mistakes predictable to people who were.
> I'm beginning to think that the best way to approach a problem is by either not being aware of or disregarding most of the similar efforts that came before.
Maybe it's just because there are more people working on these problems who don't know previous approaches than the opposite.
This is nonsense. You need to double check the answers, spot mistakes, adapt the code to your needs and go to the sources it lists to learn rapidly about that particular thing.
I think these are two different approaches: some enjoy the one (playing with the thing itself) and some enjoy the other (playing with the various secondhand abstractions that refer to the thing).
They are different tastes. They deliver different results.
That's a load of selection bias though. I'm sure there have been many, many more people who don't know anything about deck builder games who tried to make one and didn't succeed.
This is relevant to HN because I'm probably paraphrasing this incorrectly but pg has said the following about why it's hard to launch a startup: the vast majority of ideas that sound stupid are, in fact, stupid. The ones that sound like great ideas have most likely already been done. Thus, the startups that have huge growth potential tend to be the ones that sound stupid given the conventional wisdom (so unlikely to have been tried yet) but are, contrary to the norm, actually great ideas in disguise. These ideas are basically by definition very rare.
I get what you are saying, but what if the number of breakthroughs by people who did know about what came before was orders of magnitude higher than this number? Would that change your mind?
Similarly, the Fortran and Algol teams implemented a lot of optimization tricks on the first try, things that are now considered advanced, without "knowing it".
You hear about this stuff because it's notable. Almost 100% of the time, if you disregard what other people have done, you are going to waste a lot of time.
For every case like this you have thousands of people who waste a huge amount of time and mental effort recreating something that has already been invented.
Maybe the best way to have the best of both worlds is to ensure well-established areas of research are open to “outsider art” submissions on the topic?
You've remembered two examples of this (arguably) happening, so you attempt to draw a conclusion based on the ease with which you came up with those examples. But in reality, this method of inference is prone to error, as it doesn't consider the denominator, or how many attempts were made to achieve the results you're able to remember.
Sounds like a bit of survivorship bias. Every success from people following well known principles does not translate into a blog post or research paper. You also don’t hear about all of the people who failed because they tried something novel and it didn’t work.
I would suggest the positive takeaway is: trust, but verify. If you've got a novel solution idea and don't understand why others aren't doing it that way, do both and compare. You're guaranteed to learn something one way or another. Also: if you reinvent the wheel or do something suboptimal, then that's okay too. Sometimes the solutions don't make sense until you see what doesn't work. Likewise: be open to learning from others and exploring solutions outside of your predefined notion of how things should work.
A while back I was asked to write some software to print labels small enough to fit through a button hole. I convinced myself it wasn't possible because of the spacing between the cutter and the print head. Then my boss showed me a label that was printed by a competitor's system, and I had it figured out within an hour or so. Although this was a minor thing, it convinced me how powerful knowing something is possible (or not) really is.
> You also don’t hear about all of the people who failed because they tried something novel and it didn’t work.
Or all the people who, through ignorance or hubris, thought they could do better than the state of the art. Or all the people who independently invent things that already exist. These may be what you're referring to, but I thought it was worth restating them, as they are far more common cases than people inventing truly novel approaches.
I cannot tell you the number of times I thought I invented something new and novel only to later find out it already existed. So, while it is true that you can sometimes find paths untraveled, many things related to first principles seem already heavily explored in CS.
I actually have a hot take that is related to this (been showing up in a few of my recent comments). It is about why there's little innovation in academia, but I think it generalizes.
Major breakthroughs are those that make paradigm shifts. So, by definition, that means that something needs to be done that others are not doing. If not, things would have been solved and the status quo method would work.
Most major breakthroughs are not the result of continued progress in one direction, but rather they are made by dark horses. Often by nobodies. You literally have to say "fuck you all, I'm doing this anyways." Really this is not so different from the founder mentality we encourage vocally yet discourage monetarily[0]. (I'm going to speak from the side of ML, because that's my research domain, but understand that this is not as bad in other fields, though I believe the phenomenon still exists, just not to the same degree.) Yet, it is really hard to publish anything novel. While reviewers care a lot about novelty, they actually care about something more: metrics. Not metrics in the sense that you provided strong evidence for a hypothesis, but metrics in the sense that you improved the state of the field.
We have 2 big reasons this environment will slow innovation and make breakthroughs rare.
1. It is very hard to do better than the current contenders on your first go. You're competing against not one player, but the accumulated work of thousands over years or decades. You can find a flaw in that paradigm and address the specific flaw, but it is a lot of work to follow this through and mature it. Technological advancement is the sum of S-curves, and the new thing always starts out worse. For example, think of solar panels. PVs were staggeringly expensive in the beginning and for little benefit. But now they can beat grid pricing. New non-PV based solar is starting to make its way in; it started out way worse than PV but addresses PV's theoretical limitations on power efficiency.
2. One needs to publish often. Truly novel work takes a lot of time. There's lots of pitfalls and nuances that need to be addressed. It involves A LOT of failure and from the outside (and even the inside) it is near impossible to quantify progress. It looks no different than wasting time, other than seeing that the person is doing "something." So what do people do? They pursue the things that are very likely to lead to results. By nature, these are low hanging fruit. (Well... there's also fraud... but that's a different discussion) Even if you are highly confident a research direction will be fruitful, it will often take too much time or be too costly to actually pursue (and not innovative/meaningful enough to "prototype"). So we all go in mostly the same direction.
(3. Tie in grants and funding. Your proposals need to be "promising" so you can't suggest something kinda out there. You're competing against a lot of others who are much more likely to make progress, even if the impact would be far lower)
So ironically, our fear of risk-taking is making us worse at advancing. We try so hard to pick what are the right directions to go in, yet the truth is that no one has any idea, and history backs this up. I'm not saying to just make it all chaotic. I think of it more like this: when exploring, you have a main party that travels in a set direction. Their strength together makes good progress, but the downside is there's less exploration. I am not saying that anyone should be able to command the ship on a whim, but rather that we need to let people be able to leave the ship if they want and to pursue their hunches or ideas. Someone thinks they saw an island off in the distance? Let them go. Even if you disagree, I do not think their efforts are fruitless, and even if wrong they help map out the territory faster. But if we put all our eggs in one basket, we'll miss a lot of great opportunities. Right now, we let people off the main ship when there's an island that looks promising, and there are those that steal a lifeboat in the middle of the night. But we're all explorers, and it seems like a bad idea to dissuade people who have that drive and passion in them. I know a lot of people in academia (including myself) who feel shackled by the systems, when really all they want to do is research. Not every one of these people is going to change things; in fact, likely most won't. But the truth is, that's probably true if they stay on the ship too. Not to mention that it is incredibly common for these people to just leave academia altogether anyways.
Research really is just a structured version of "fuck around and find out". So I think we should stop asking "why" we should pursue certain directions. "Because" is just as good an excuse as any. In my ideal world, we'd publish anything as long as there is technical correctness and no plagiarism, because we usually don't know what is impactful. There are known knowns, known unknowns, and unknown unknowns. We really are trying to pretend that the unknown unknowns either don't exist, are not important, or are very small. But we can't know; they're unknown unknowns, so why pretend?
[0] An example might be all the LLM-based companies trying to make AGI. You want to compete? You're not going to win by making a new LLM. But one can significantly increase their odds by taking a riskier move and funding things that are not well established: other types of architectures. And hey, we know the LLM isn't the only way, because we humans aren't LLMs. And we humans also use a lot less energy and require far less data, so even if you are fully convinced that LLMs will get us all the way, we know there are other ways to solve this problem.
No, no, no. That's the wrong thing to take away from it.
Something I got from Richard Feynman's descriptions of his method of study was to, first and foremost, read the problem statements and work diligently at solving the problems by himself for a reasonable amount of time.
Then, and only then, go and read the other solutions. The solutions can be the same or they can be different, and by doing all this preliminary work the researcher can truly understand the nuances of those solutions, something they can't grasp if the solutions are shown just after reading the problem.
So, the best way to approach a problem is:
- Try to solve it by yourself. Several times if necessary, give it an honest effort.
- Then, solved or not, go and read other people's solutions.
Ok, big shout out to monort [0] for the link to the video [1].
This is just a quick overview from a single viewing of the video, but it's called "funnel hashing". The idea is to split into exponentially smaller sub arrays, so the first chunk is n/m, the second is n/(m^2), etc. until you get down to a single element. Call them A0, A1, etc., so |A0| = n/m, |A1| = n/(m^2) etc., k levels in total.
Try inserting into A0 c times. If it fails, try inserting into A1 c times. If it fails, go down the "funnel" until you find a free slot.
Call \delta the fraction of slots that are empty (I'm unclear if this is a parameter that gets set at hash table creation or one that's dynamically updated). Setting c = log(1/\delta) and k = log(1/\delta) gives worst-case complexity O(log^2(1/\delta)).
This circumvents Yao's result by not being greedy. Yao's result holds true for greedy insertion and search policies and the above is non-greedy, as it's cascading down the funnels.
There are probably many little hairy details to work out but that's the idea, as far as I've been able to understand it. People should let me know if I'm way off base.
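In rough Python, my reading of the idea looks something like this; it is just a sketch of the cascade, with made-up hash choices and parameters, not the construction from the paper:

    # Rough sketch of the "funnel" cascade described above -- my own
    # simplification, not the paper's construction. Level sizes shrink
    # geometrically (n/m, n/m^2, ..., 1), and insertion tries c slots per
    # level before falling through to the next one.

    class FunnelTable:
        def __init__(self, n, m=2, c=8):
            self.levels = []
            size = n
            while True:
                size = max(1, size // m)
                self.levels.append([None] * size)
                if size == 1:
                    break
            self.c = c

        def insert(self, key, value):
            for depth, level in enumerate(self.levels):
                for attempt in range(self.c):
                    slot = hash((key, depth, attempt)) % len(level)
                    if level[slot] is None or level[slot][0] == key:
                        level[slot] = (key, value)
                        return True
            return False  # every level too full; a real table would handle this

        def get(self, key):
            for depth, level in enumerate(self.levels):
                for attempt in range(self.c):
                    entry = level[hash((key, depth, attempt)) % len(level)]
                    if entry is not None and entry[0] == key:
                        return entry[1]
            return None

A real implementation would also need the special final level, resizing, and the exact probe sequences the paper specifies; this is only meant to show the cascading-down-the-funnel shape.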
This very much reminds me of the "Distinct Elements in Streams" idea by Chakraborty, Vinodchandran and Meel[2].
Actually they propose two algorithms: Funnel Hashing and Elastic Hashing. Funnel Hashing is "greedy" and defeats Yao's conjecture, which concerns greedy hash mechanisms. Elastic Hashing is "non-greedy" and provides better amortized time than greedy algorithms.
That it circumvents Yao's conjecture by being non-greedy contradicts the article. Is the article wrong, or is your understanding of the paper? I don't know; I just want to see if you're noticing something the article's authors don't know.
> Farach-Colton, Krapivin and Kuszmaul wanted to see if that same limit also applied to non-greedy hash tables. They showed that it did not by providing a counterexample, a non-greedy hash table with an average query time that’s much, much better than log x.
One thing I don't understand from watching the video, is what happens in the (very rare) case that you get collisions all the way down the funnel. I assume this is related to the "One special final level to catch a few keys" (around 14:41 in the video), but given that it has to be fixed size, this can also get full. What do you do in that case?
Quite a neat idea that could be useful for memory-constrained environments.
[Shameless plug]:
If you are into hashtables, you might want to check out Dandelion Hashtable [0]. We use it in our next-generation databases, and it was published in HPDC'24. It is currently the fastest in-memory hashtable in practice.
It improves closed addressing via bounded-cacheline chaining to surpass 1B in-memory requests/second on a commodity server.
[0] https://dandelion-datastore.com/#dlht
This strikes me as something that many people probably figured out a non-rigorous version of and didn't think it was special.
It's kind of one of those resource management hacks you do when you're constrained and screwed by limitations. Splitting things up by priority is a common go-to for resource allocation. This is a spin on that.
I wonder how many other "in the trenches hacks" people have done that overturn widely accepted things the inventors didn't realize were a big deal: "well I usually have a bunch of deliveries to make and I've figured out a clever way to map out the quickest path... "
Don't get me wrong - recognizing it and then formalizing it, doing the work, publishing the paper - that's a lot of effort. I'm not taking that away.
Also relevant: in this particular case the authors themselves note that the result has better theoretical behavior in the worst case, but no practical uses yet. So I think any software engineer exploring this direction would have abandoned it pretty quickly, for the same reason that galactic algorithms aren't typically invented by them either (unless they also do compsci as a hobby of course). In fact the Wiki page for galactic algorithm mentions another optimal-but-impractical hash table as one of its examples[0][1].
[0] https://en.wikipedia.org/wiki/Galactic_algorithm
[1] https://www.quantamagazine.org/scientists-find-optimal-balan...
> I wonder how many other "in the trenches hacks" people have done that overturn widely accepted things the inventors didn't realize were a big deal: "well I usually have a bunch of deliveries to make and I've figured out a clever way to map out the quickest path... "
A lot of them. Having said that: yes, I can imagine that others would have thought up Dijkstra's shortest path algorithm, since he himself said it came to him while shopping, and that it only took him twenty minutes to reason through the original O(n²) algorithm. (edit: oh wait, that's what you're alluding to isn't it? Heh, that went straight over my head).
On the other hand, I don't think the faster versions of Dijkstra's algorithm would have been invented by anyone without at least some understanding of priority queues and big-O behavior. And at that point I hope people realize that they possess some specialized knowledge that might not be entirely common.
In fact, I'd argue that the true strength of Dijkstra's write-up is that it gives us a vocabulary to reason about it and come up with specialized data structures for particular situations.
Anyway, what you're touching on is the difference between engineering and science: engineering works with confidence built from tests, rules of thumb that reflect lessons learned from historical results, and (in modern times) verified predictions from science. Those rules of thumb might be used when lacking a deeper scientific understanding of why it works. The tests might exist to work around the limitations of scientific knowledge (e.g. modelling turbulence). Science creates insights and predictions through modelling of empirical results. At least that's the difference according to Bill Hammack[0].
In an ideal world the two professions work together and build on each other's results to propel each other forward of course.
[0] https://www.youtube.com/playlist?list=PL0INsTTU1k2X4kCPqmi1e...
Thanks so much for this link. I remain convinced that papers are so much more understandable with an accompanying talk by the creators. I wish papers would just come with a video talk included.
Exactly, the authors get to eschew the formalism required in papers. Often the core ideas of research are simple in themselves and the real complexity lies in formally proving the results.
Also, I'd not be surprised if someone already invented and used this funnel hashing technique in, say, the 80's in some game or whatnot but just never realized what they had stumbled onto. Not to diminish the research, it's very ingenious.
Thanks for the video, def a lot better than the article.
I do find it a bit weird that this is somehow better than just over-allocating (and thus reducing the chances of key collisions, which also makes worst case 'less worse') given his approach also allocates more memory through the aux arrays.
I don't think anybody is really saying it is. Academics treat big-Oh performance on very very full hash tables like a sport. Real world code on real world CPUs often has a more complex cost function than what the academics considered; cache sizes, fitting in a cacheline, memory bandwidth, icache pressure, ...
He's not allocating through aux arrays, he's splitting the already allocated memory into log(n) layers. You can just track those aux arrays with math in the implementation.
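For instance, a sketch of that bookkeeping (with m as the hypothetical shrink factor): the levels can live inside the one already-allocated array, with each level's offset computed on the fly instead of stored.

    def level_bounds(n, m, i):
        """Start and end offsets of level i inside a single array of n slots,
        assuming level sizes shrink geometrically: n/m, n/m^2, ..."""
        start = sum(n // m ** j for j in range(1, i + 1))
        size = n // m ** (i + 1)
        return start, start + size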
It's probably not better than over-allocating except in memory constrained scenarios. But the overhead of funnel hashing is not high - it requires 0 extra memory
Overallocation has a limit. You only have so much RAM/storage. Beyond that you start swapping.
I could really use a hash table (or similar structure) that degrades less with higher occupancy.
Skimming the paper [1], the key difference they used is that their hash table insertion algorithm will probe further than the first empty slot, instead of greedily filling the first empty slot it finds. They combine this with a clever probing sequence which provably finds empty slots efficiently, even if the table is very full.
This means insertions when the hash table is less full are slower, but you avoid the worst-case scenario where you're probing for the last (few) remaining open slot(s) without any idea as to where they are.
An interesting theoretical result but I would expect the current 'trick' of simply allocating a larger table than necessary to be the superior solution in practice. For example, Rust's hashbrown intentionally leaves 1/8th (12.5%) of the table empty, which does cost a bit more memory but makes insertions/lookups very fast with high probability.
[1]: https://arxiv.org/pdf/2501.02305
I might be misreading their algorithm, but from my look at the paper the key improvement is a non-uniform strategy where they divide the array into buckets and focus on different buckets as they fill the table. This increases the average number of locations to be probed even when the table is emptier. They still place the item in the first empty slot they see with this strategy.
The "skipping slots" has to do with jumping ahead in the hash sequence.
But you could do some hybrid, where you do greedy fill for a while and then switch to this fancier fill once your table is approaching full (using some heuristic)?
Not really because you have to use the same algorithms during subsequent lookups. Imagine you add a bunch of items with insert algorithm A, then you cross some threshold and you insert some more items with Algorithm B. Then you delete (tombstone) a bunch of items. Now you look up an item, and the slot is filled, and you're stuck. You have to proceed your lookup with both algorithms to find what you're looking for (thereby giving up any performance benefits) because the current occupancy doesn't tell you anything about the occupancy at the time when the item you were looking for was inserted.
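A way to see that constraint, with plain linear probing standing in for whichever policy is used (a sketch, not anyone's production code): the probe order is the contract shared by insert and lookup, so a lookup has to replay whatever order was in force when the key went in.

    def probe_order(key, capacity):
        # the shared contract: both insert and lookup walk slots in exactly this order
        start = hash(key) % capacity
        for i in range(capacity):
            yield (start + i) % capacity

    def lookup(table, key):
        # assumes no deletions/tombstones, which is exactly the complication raised above
        for slot in probe_order(key, len(table)):
            entry = table[slot]
            if entry is None:
                return None           # insert would have stopped here, so the key can't be further along
            if entry[0] == key:
                return entry[1]
        return None

If some keys were placed with probe order A and later ones with order B, a lookup that only replays A can walk straight past a key placed by B, so in practice you would have to run both.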
This is neat! I always wondered if there would be a way to 'containerize' tables like this. I.e., a regular table is like a bulk carrier ship, with everything stuffed into it. If you could better organize it like a container ship, you could carry much more stuff more efficiently (and offload it faster too!)
The theoretical properties of hash tables always seemed so impressive to me that they bordered on magic (and this just extends them). What seemed crazy was how they could be so much better than trees, which to me were intuitively the most efficient way to store data.
What I realized is that the theory of hash tables involves a fixed-size collection of objects. For this fixed collection, you create a hash function and use it like a vector index, storing the collection in a (pre-allocated) vector. This gives a (fuzzy-lens'd) recipe for O(1) time insert, deletion and look-up. (The various tree structures, in contrast, don't assume a particular size.)
The two problems are that you have to decide the size beforehand, and if your vector gets close to full, your insert etc. processes might bog down. So scanning the article, it seems this is a solution to the bogging-down part - it allows quick insertion into a nearly-full table. It seems interesting and clever but actually not a great practical advance. In practice, rather than worrying about a clever way to fill the table, I'd assume you'd just increase your assumed size.
Edit: I'm posting partly to test my understanding, so feel free to correct me if I'm not getting something.
Trees (sorted) are good at finding subsets and ranges ("scanning" or "searching"), but hashmaps are better at "seeking", i.e. a key-value lookup.
Proofs of constant-time operations include the time taken to resize the table. An insertion that triggers a resize takes much more time (linear in the size of the table), but that time is amortized over all the insertions already done. It still works out to constant average time if you grow the table enough each time (once it starts to get too full) so that resizes happen with decreasing frequency.
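A minimal sketch of that argument (the 75% threshold and doubling factor are arbitrary choices here): the occasional O(n) rehash is paid for by the roughly n cheap inserts that preceded it, so the average cost per insert stays constant.

    class GrowingTable:
        """Open-addressed table that rehashes into double the capacity once load passes 75%."""

        def __init__(self):
            self.capacity = 8
            self.slots = [None] * self.capacity
            self.count = 0

        def _find(self, key):
            i = hash(key) % self.capacity
            while self.slots[i] is not None and self.slots[i][0] != key:
                i = (i + 1) % self.capacity   # linear probing; terminates because load stays below 75%
            return i

        def insert(self, key, value):
            if self.count + 1 > 0.75 * self.capacity:
                old = [e for e in self.slots if e is not None]
                self.capacity *= 2            # geometric growth is what makes the amortized cost O(1)
                self.slots = [None] * self.capacity
                for k, v in old:
                    self.slots[self._find(k)] = (k, v)
            i = self._find(key)
            if self.slots[i] is None:
                self.count += 1
            self.slots[i] = (key, value)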
It looks to me like the idea is, as you generally describe, that you segment your table into a 2d structure (well conceptually) and proceed to fill one ‘row’ at a time until it’s about 75% full, at which point you move on to the next one.
I don’t have time to fully grok the paper, but they claim this makes insertion consistently fast (I believe this until we’re at 75% of total capacity, but maybe they have some other mode for filling when they’re at 75% in every row?). They also claim retrieval is fast, and I didn’t read enough to understand how even retrieval works, or why it is faster.
I'll point out that there are a lot of times when it would be really nice to have a nearly full hash table still, you know, work. You can't always change the size of one during execution of a program. And, in some environments memory counts a lot. That said, I would like to see and play with an implementation — I'm not sure this is 'worth it' in the general case.
It is also probably cache inefficient, as are most things about hash tables, with the exception of linear probing for reading out of a fairly full one, in which case, you get to just keep pulling stuff directly out of memory to check it. So, it’s not clear to me that this is performance wise worth it. Anyway, I’d like to fully understand it, it seems like an interesting new idea.
> And for this new hash table, the time required for worst-case queries and insertions is proportional to (log x)^2 — far faster than x.
> The team’s results may not lead to any immediate applications
I don't understand why it wouldn't lead to immediate applications. Is this a situation where analysis of real-world use cases allows you to tune your hash implementation better than what a purely mathematical approach would get you?
I haven't read the paper, but sometimes asymptotic improvements do not translate to real world improvements due to a large multiplicative factor in the complexity that gets factored out in the O() analysis. So the dataset required to see speed-up is impractically large.
In this case "x" is 1/d where d is the unused fraction of space.
So if you leave 0.1% of your hashtable unused your x is 1000 - quite problematic. However if you leave 12.5% of your hashtable unused your x is 8 - quite reasonable, and not something logarithmic behavior would necessarily speed up, for reasonable constants.
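Plugging numbers into that comparison (ignoring constants, which is exactly what you can't do in practice):

    import math

    for delta in (0.125, 0.01, 0.001):   # fraction of the table left empty
        x = 1 / delta
        print(f"delta={delta}: x = {x:.0f}  vs  (log2 x)^2 = {math.log2(x) ** 2:.0f}")

    # delta=0.125: x = 8     vs  (log2 x)^2 = 9    -- no win at everyday load factors
    # delta=0.01:  x = 100   vs  (log2 x)^2 = 44
    # delta=0.001: x = 1000  vs  (log2 x)^2 = 99   -- the bound only pulls ahead when the table is very full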
This is pretty much exactly the case for this algorithm. It is a very slow hash table due to the lack of locality. It seems to only have benefits at very large size and very high load factor.
At that scale, it may be practically better to use a B-tree or a radix tree over a 128-bit hash of the key.
I'm not up to date on the state of the art but I've implemented hash tables a few times and we would expand the hash table whenever it was 75% full, which means x is never greater than 4. Improving the runtime from O(x) to O((log x)^2) doesn't matter when x is so small.
I imagine there are some niche memory-constrained applications where you'd let x get larger but I never ran into them personally.
Robin Hood hash works well with high load factors ~95% because it balances the average lookup with a fair distribution of buckets. That makes it ideal when you don't want to waste memory on bloated tables.
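For anyone who hasn't seen it, the Robin Hood rule is small enough to sketch (linear probing, table assumed never to fill completely; not any particular library's implementation):

    def robin_hood_insert(slots, key, value):
        """An incoming entry may evict a resident entry that sits closer to its home
        slot ("steal from the rich"), which evens out probe lengths across the table."""
        capacity = len(slots)
        dist = 0
        i = hash(key) % capacity
        while True:
            entry = slots[i]
            if entry is None:
                slots[i] = (key, value, dist)
                return
            k, v, d = entry
            if k == key:
                slots[i] = (key, value, d)          # same key: update in place
                return
            if d < dist:                            # resident is closer to home than the newcomer
                slots[i] = (key, value, dist)       # take its slot...
                key, value, dist = k, v, d          # ...and keep walking with the evicted entry
            i = (i + 1) % capacity
            dist += 1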
I'm pretty sure nobody uses uniform probing hash tables in the wild. Every time I have wanted to have very high load factors (>90%), cuckoo hashing has been good enough, and below 70-80%, linear probing is blazing fast and absolutely good enough.
Last time I needed high occupancy it was for a cache. So I stirred my 32b keys with Phi-mul-rshift and randomly (xor-shift) displaced the picked slot's old value by log2(size)/2, with linear probing of up to log2(size).
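For readers who haven't met the "Phi-mul-rshift" trick: it's Fibonacci hashing, i.e. multiply by 2^32/φ and keep the top bits (a sketch, not the poster's exact code):

    GOLDEN_32 = 2654435769                 # floor(2**32 / golden ratio), a well-mixing odd multiplier

    def fib_hash(key32, log2_size):
        """Map a 32-bit key to a slot index in a table of 2**log2_size slots."""
        return ((key32 * GOLDEN_32) & 0xFFFFFFFF) >> (32 - log2_size)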
In practice the worst-case operations are avoided by reserving a little more space for the hash table. And the new results come at the cost of slower “good case” insertions.
Real world hash tables can either be resized to avoid worst-case degradation or else be provisioned to have a good amount of slack when the worst expected case occurs. (E.g. a hash table used as a cache will apply pressure to evacuate old entries before it gets too full.)
Bucket hashing obviously has no problem finding a free spot. I didn't say chained on purpose because chains can be resizable vectors, so we load at most one pointer when going from table to bucket.
it's more like real-world hash tables are tailored to the specific pattern of querying and inserting data.
if you know empirically that 80% of your requests come to a tiny subset of hot keys - you make a specific hashset just for these hot keys with constant access time, while keeping the rest of the table cold.
your L2 cache is an example of such optimization - a high bandwidth and low latency memory on the CPU die, that is orders of magnitude faster than random DRAM access - a tiered memory model
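A minimal sketch of that hot/cold split (all names made up): consult a small dedicated dict first and fall back to the big table, accepting that writes now have to touch both tiers.

    class TieredMap:
        def __init__(self, hot_keys, cold_store):
            # hot_keys: the empirically known hot subset; cold_store: the big backing table
            self.cold = cold_store
            self.hot = {k: cold_store[k] for k in hot_keys if k in cold_store}

        def get(self, key, default=None):
            if key in self.hot:            # small and cache-friendly; serves the ~80% hot traffic
                return self.hot[key]
            return self.cold.get(key, default)

        def put(self, key, value):
            self.cold[key] = value
            if key in self.hot:            # keep the tiers consistent
                self.hot[key] = value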
Isn’t the problem that the scaling behavior only dominates with infinite n?
If you have a constant factor, that doesn't go into the scaling rule, so having something scale as (log x)^2 could still be 100 times more expensive than something that scales linearly with x for all x smaller than 2^100.
Perhaps the technique requires a lot of additional metadata, so that you could fit a 50% full "normal" hash table in less memory than it takes to store a 99% full hash table using this new approach. Thus the normal hash table can always outperform the new hash table in practice despite worse big O performance because it doesn't hit the pathological worst case except in situations where the new hash table would have run out of memory.
[1]: https://fliptomato.wordpress.com/2007/03/19/medical-research...
Everything is impossible until someone comes along that's crazy enough to do it.
If you read the prior work too early, you get locked into existing mindsets. If you never read it, then you miss important things you wouldn't have thought of.
Even if your approach is less good than the prior work (the normal case) you gain important insights into why the state of the art approach is better by comparing it with what you came up with.
I agree, though in some cases coming up with your own ideas first can result in you becoming attached to them, because they are your own. It is unlikely for this to happen if you read the prior work first.
Though I think overall reading the prior work later is probably still a good idea, but with the intention of not becoming too impressed with whatever you came up with before.
He was aware of deck builders and was directly inspired by Luck be a Landlord, but he was not aware of just how massive the genre is.
Direct quote from the developer:
> The one largest influence on Balatro was Luck Be a Landlord. I watched Northernlion play for a few videos and loved the concept of a non-fanatsy themed score attach roguelike a ton, so I modified the card game I was working on at the time into a roguelike.
> I cut myself off from the genre at that point intentionally, I wanted to make my own mistakes and explore the design space naively just because that process is so fun. I hear the comparison to Slay the Spire a lot but the truth is that I hadn't played that game or seen footage of it when I designed Balatro, not until much later.
https://www.reddit.com/r/Games/comments/1bdtmlg/comment/kup7...
I ended up failing and taking his course again (because I had A Lot going on in college), and thus, noticed something.
Each semester, on one of the assignments in the latter half of the class, he assigned one problem out of, perhaps, 30 in the problem set, where as written, it was actually an open problem, and then a day or two before they were due, he'd send out an "oops, my bad" revised version.
I suspect that this was not an accident, given that it always happened only once.
A related anecdote is about George Dantzig (perhaps best known for the simplex algorithm):
> During his study in 1939, Dantzig solved two unproven statistical theorems due to a misunderstanding. Near the beginning of a class, Professor Spława-Neyman wrote two problems on the blackboard. Dantzig arrived late and assumed that they were a homework assignment. According to Dantzig, they "seemed to be a little harder than usual", but a few days later he handed in completed solutions for both problems, still believing that they were an assignment that was overdue. Six weeks later, an excited Spława-Neyman eagerly told him that the "homework" problems he had solved were two of the most famous unsolved problems in statistics.
https://en.wikipedia.org/wiki/George_Dantzig#Education
The argument of 'if that's such a good idea, why wouldn't somebody have just done it already?' seems to have grown exponentially with the advent of the internet. And I think it's because the visibility of the competence of others became so much more clear. For those who lived through e.g. Carmack's Golden Age you knew you were never going to be half the coder he was, at least based on the image he successfully crafted. That 'slight' at the end is not to say he wasn't a brilliant developer or even perhaps the best in the world at his peak, but rather that brilliance + image crafting creates this Gargantuan beast of infallibility and exceptionalism that just doesn't really exist in reality. I think it's from this exact phenomenon that you also get the practical fetishism of expertise.
The biggest gains in theory and in practice are far more often obtained by masters of craft, giving much more weight to attacking problems from a position of knowledge.
In fact, even in this case, this progress required that the author was aware of very recent results in computer science, was thinking deeply about them, and most likely was scouring the literature for pieces to help. The “Tiny Pointers” paper is mentioned directly.
I would rather, instead of thinking ignorance is a key ingredient to discovery, that instead it's the willingness to try things.
[1] https://arxiv.org/abs/2111.12800
After asking ChatGPT not to agree with me that your comment and these two different approaches to solving problems are alike, it concluded there still might be similarities between the two.
[0] https://youtu.be/P_fHJIYENdI?feature=shared&t=1030
[1] https://chatgpt.com/share/67aa8340-e540-8004-8438-3200e0d4e8...
I don't think so. If you are not aware of the box, there is a much greater chance for you to be well within the box and not realizing it. By that, I mean that either you rediscovered something, or you are wrong in the same way as many others before you. By chance, you may find something new and unexpected, but that's more of an exception than the rule.
But when we try to talk details, I asked him, for example, do you use minimax with alpha-beta pruning, and he told me "I'm not sure if I am using minimax or what that is :(" ... I asked him to describe what he does, and he essentially describes minimax with pruning. I've sorta figured out that he must be doing some very intelligent version of an aspiration search. It's really eye-opening because he doesn't have any of this training. He's never seen any related algorithms, he's just figuring all this out on his own.
I'll also say I think that diversity in approaches is more important than One Right Way. Some people need to set out on their own, while others spend decades refining one technique. Both have led to extraordinary results!
Many problems are abstract and so we have to build "cartoon" models of what's going on, trying to distill the essence of the problem down to a simple narrative for what the shape of the problem space is and where the limitations are. That often works but backfires when the cartoon is wrong or some assumptions are violated about when the cartoon description works.
Results like this are pretty rare, nowadays, and I suspect this happened because the problem was niche enough or some new idea has had time to ferment that could be applied to this region. This seems like a pretty foundational result, so maybe I'm wrong about that for this case.
A lot of progress is made when there's deeper knowledge about the problem space along with some maturity for when these cartoon descriptions are invalid.
>A number of engineers are sitting together in a room, bouncing ideas off each other. Out of the discussion emerges a new concept that seems promising. Then some laptop-wielding person in the corner, having performed a quick Google search, announces that this “new” idea is, in fact, an old one—or at least vaguely similar—and has already been tried. Either it failed, or it succeeded. If it failed, then no manager who wants to keep his or her job will approve spending money trying to revive it. If it succeeded, then it’s patented and entry to the market is presumed to be unattainable, since the first people who thought of it will have “first-mover advantage” and will have created “barriers to entry.” The number of seemingly promising ideas that have been crushed in this way must number in the millions. What if that person in the corner hadn’t been able to do a Google search?
>In a world where decision-makers are so close to being omniscient, it’s easy to see risk as a quaint artefact of a primitive and dangerous past (…) Today’s belief in ineluctable certainty is the true innovation-killer of our age
Could knowing about prior research skew one's perspective and tarnish novel thought? Sure. But we don't know. Maybe we'd have an even better Balatro if the creator knew about some other deck builders. Maybe we wouldn't, we don't know. We cannot prove the counterfactual.
On the opposite extreme, there are examples of thinkers whose success stemmed from knowing much about one domain or much about many domains and integrating (Luhmann, Goethe, Feynman, Von Neumann etc.). In the general case, I think we are probably much better off promoting knowledge and study, and not ignorance and chance.
That said, I do think we should retain our willingness to play and to try things that are "out of bounds" with respect to the existing accumulated knowledge. We should live informed lives, but play and explore like unschooled children.
At least part of the result was already known, and the fact that the authors didn't know about it mostly speaks to the large corpus of knowledge we already possess.
But the core inspiration came from looking at another recent research paper "Tiny Pointers": that is totally against your premise.
If Krapivin was a software engineer looking to implement this solution as optimization for a particular problem, he would have done so without ever thinking of making a research paper to prove it formally, but mostly relied on benchmarking to prove his implementation works better.
Now, it has always been somewhat true that lots of existing knowledge limits our creativity in familiar domains, but you need both to really advance science.
https://en.wikipedia.org/wiki/Shoshin
This is an "Einstein failed Math" fallacy. It's true that novel and notable work tends strongly not to be bound by existing consensus, which when you say it that way is hardly surprising. So yes, if consensus is wrong in some particular way the people most likely to see that are the ones least invested in the consensus.
But most problems aren't like that! Almost always the "best" way to solve a problem, certainly the best way you're going to solve the problem, is "however someone else already solved it". But sometimes it's not, and that's when interesting stuff happens.
There are a lot of great deck builders that are not roguelike. Has he played Dominion, Magic the Gathering, Hearthstone?
Have we been grinding away in the right direction and are only moments away from cracking the problem, or should we drop everything and try something completely new because we've obviously not found the solution in the direction we were heading.
To put it into a CS type context - Should we be using a DFS or BFS search for the solution, because we don't have knowledge of future cost (so UCS/Dijkstra's is out) nor do we know where the solution lies in general (so A* is out, even if you ignore the UCS component)
I don't think there's any evidence of this. Yao's conjecture is not exactly standard undergraduate material (although it might be — this is a commentary on detail rather than difficulty. But I certainly didn't encounter this conjecture in school). If not knowing this conjecture was the key, millions and millions of students failed to see what Krapivin did. I imagine you'd have to ask him what the key to his insight is.
Hashing is a pretty unintuitive sort of computation. I'm not surprised that there are still surprises.
one might “... overlook the great difference between exact theory and approximate theory. Again, let me emphasize my great respect for approximate theory. [...] if one starts looking for an effect predicted by this kind of theory to be impossible, the odds are against a favorable outcome. Fortunately, however, the community of scientists, like that of horseplayers, contains some people who prefer to bet against the odds as well as a great many who always bet on the favorite. In science we should, I think, do all we can to encourage the man who is willing to gamble against the odds of this sort.
This does not mean that we should encourage the fool or the ignoramus who wants to play against suicidal odds, the man who wants to spend his time and usually someone else’s money looking for an effect incompatible with, let us say one of the conclusions reached by Willard Gibbs. Gibbs started from thoroughly proven generalizations, the first and second laws of thermodynamics, and reasoned from them by exact mathematical procedures, and his conclusions are the best example I know of exact theory, theory against which it is futile to struggle.”
Both Danny Trejo and Tim Allen spent time in prison before becoming famous. While that's interesting, I'm not sure I'm ready to believe that's the best way to become a professional actor.
Edit to be a little less snarky, apologies:
"Outsiders" are great for approaching problems from fresh angles, but I can almost guarantee that the majority of nuts-and-bolts progress in a field comes from people who "fall in the rut of thought" in the sense that they area aware enough of the field to know which paths might be most fruitful. If I had to place a bet on myself, I wouldn't take a wild uninformed swing: I'd get myself up to speed on things first.
Outsiders sometimes do great work. They also sometimes:
https://www.reddit.com/r/mathmemes/comments/wq9hcl/terrence_...
One example is Geocentrism vs Copernican astronomical models -- Copernican could never have sprung from the status quo because everything revolved around the Earth in Geocentrism instead of around the Sun. You can't square that circle.
https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Re...
To summarize: heliocentrism was known to the ancient Greeks, who realized the Sun was much bigger than the Earth, so it would seem logical to have the Earth go around the Sun. But the counterargument was that if the Earth goes around the Sun, the stars should move relative to each other during the year because of parallax, and they didn't have instruments precise enough to see it, so they assumed that they didn't. Copernicus' major contribution wasn't heliocentrism but the orbital periods of the planets. And the model wasn't complete until Kepler calculated the shapes of the orbits. For details, watch the video, it is really good.
[1] https://youtu.be/YdOXS_9_P4U
Kuhn does not define a value scale for the two methods; on the contrary, he merely introduces the concept of different kinds of research: one being critical (developing new paradigms) and one being accumulative (further refining existing paradigms).
He also hints at the almost inevitably organic interactions between the two, such that critical research naturally evolves from a pragmatic need to express things simply in a new paradigm when the old one becomes too clumsy for a use case.
This is what happened in your example as well. Copernicus (and later Galileo) did not invent heliocentrism out of the blue; the theory around it had existed since ancient Greece. It is even arguably the Renaissance, leading metaphysicists to revisit ancient texts, that spurred Copernicus to consider it. But ultimately the need for the new paradigm was pushed by the need to revisit the calendar, which was drifting, and the difficulty of doing that in a geocentric world, where you have to take planetary retrograde motion into account.
(a) I wouldn't quite characterize the book as being "about this theory" — it's a bit more nuanced. He definitely says that it's usually younger scientists, with less invested in the currently reigning theory, who are most likely to push forward a revolution. However, I don't recall any examples in the book of people who were wholly _unaware_ of the previous theory.
(b) You should absolutely, definitely read it. It's a classic for a reason, and the writing style is a delight.
For example, as a young kid I saw a geometric ball made up of hinges that allow it to expand and contract, and in some stages it looks a little like a gear. So then I started wondering if you could change the gear itself instead of switching gears in a car. Then a decade or so later I started seeing CVT transmissions in cars, which is the same concept, where you can change the size/ratio by expanding or contracting the roller instead of switching gears.
In practice the data was well behaved enough and small enough that it was very doable.
I think progress needs both the individual achievers who break out of preconceived notions and the communal work of improving within the notions we currently have.
If anyone wants to watch: https://youtu.be/4vou_dXuB8M?si=Wdr7q96MFULPAEc4
Definitely something we should all keep in mind that sometimes you just have to pave your own way and hope it is great on its own merits.
> So long as man does not bother about what he is or whence he came or whither he is going, the whole thing seems as simple as the verb "to be"; and you may say that the moment he does begin thinking about what he is (which is more than thinking that he is) and whence he came and whither he is going, he gets on to a lot of roads that lead nowhere, and that spread like the fingers of a hand or the sticks of a fan; so that if he pursues two or more of them he soon gets beyond his straddle, and if he pursues only one he gets farther and farther from the rest of all knowledge as he proceeds. You may say that and it will be true. But there is one kind of knowledge a man does get when he thinks about what he is, whence he came and whither he is going, which is this: that it is the only important question he can ask himself. (The Footpath Way, Introduction (1))
Even though the author is talking about a different kind of knowledge, the image of sticks of a fan - where going down one gradually excludes the others - stuck with me.
1: https://www.gutenberg.org/ebooks/59813
Why DC? An overhead line only limited by peak voltage (arc) and thermals can carry twice the power when running DC instead of AC, assuming both measured relative to ground.
Also, you can run your transistors completely steady-state at all frequency components between their own switching fundamental and your load transients. No more over-provisioning just to make up for legacy 50/60 Hz AC.
Also, to a degree, you can just plug raw batteries into that DC grid, at most having a little bit of DC regulation to force the voltage a bit higher/lower than the batteries. Like, a power supply basically rated to a couple percent of the battery input/output max power: it only needs to move the small extra voltage, though of course at the full current.
Lastly, DC converters are just way smaller and lighter, so you could avoid the heavy bulky transformers in trains and alleviate power limiting from them. Relevant for fast double-decker trains because you'd prefer to have human space where you currently park the transformer.
I have to say though, novel development of technology by pulling in recent innovations from the fundamental/material science fields underlying the target is not an easy thing to do at all.
You get novel branches of thought, but in the limit case, you're also reinventing the universe to bake an apple pie.
So there's something of a tradeoff between attempting to ensure people can do more than mimic existing doctrine and efficiency of getting up to speed without having to re-prove existing math and science.
The Balatro dev also, for example, has talked about how he was heavily influenced by several specific other games.
This isn't easy, at all. It requires training yourself into having an open and flexible mind in general.
Not knowing about something is more like a cheat to get there easier.
But it's super common that innovation involves a lot of well-known foundational work and is only very different in one specific aspect, and it's quite hard to know about the other foundational work but not that specific aspect, especially if you don't even know which aspect can fundamentally be "revolutionized"/"innovated".
But what always helps when you learn about a new topic is to try blindly first yourself and then look at what the existing approaches do. Not just for doing groundbreaking work but even for, e.g., just learning math.
One of the math teachers I had over the school years before university used this approach for teaching math. It yielded way better independent understanding and engagement, and it helped me a lot later on. Sadly I only had that teacher for 2 years.
Extrapolating a "best way" from a couple of examples of success is bad reasoning. There are definitely ways in which it can be necessary to ignore the standing wisdom to make progress. There are also definitely ways in which being ignorant of the knowledge gained by past attempts can greatly impede progress.
I would point out, that it is also possible to question and challenge the assumptions that prior approaches have made, without being ignorant of what those approaches tried.
Figuring out which is which is indeed hard. Generally, it seems to work well to have a majority of people expanding/refining prior work and a minority of people going in and starting from scratch to figure out which of the current assumptions or beliefs can be productively challenged/dropped. The precise balance point is vague, but it seems pretty clear that going too far in either direction harms the rate of progress.
I don't think that's warranted.
You will find that the vast majority of lottery winners have bought lottery tickets. However that doesn't mean that buying lottery tickets is a good idea financially.
Consider a simplified example. There is some area of scientific research. Working within the framework gives you a 1 in 4 chance of making some minor improvement. Working outside the framework gives you a 1 in a million chance to create a great leap in knowledge.
For any single individual, the best choice is the former. The latter is a gamble that most people will lose, wasting their lives chasing crazy theories.
For society, you want a split. You need some doing the second option to have the eventual amazing discovery, but you also need to progress the current understanding further.
If we introduce a chance for the minor progress to lead to the same major advancement, it becomes a bit more simple for society to calculate the best allocation of researchers, but for any single person, the best option still remains to dedicate themselves to the small advancement.
The real trick is simply to try to understand things directly and not rely on proof by authority all the time.
This is valuable.
You might believe someone's proof of a conjecture and then be discouraged from delving any more into that rabbit hole.
More often than not you will be reinventing something. But that's not necessarily less productive than reading other people's work. In the former case, you're at least making something, even if it isn't new.
So there are some arguments for being fresh in a well-trodden field with an ocean of research that you cannot boil all at once.
On the other hand, there is the publish-or-perish pressure in academia, which requires original research. You could just keep busy and throw enough shit against the wall that enough of it sticks.
Being capable of tackling problems from first principles is invaluable because we frequently encounter problems that are novel in some dimension, even if that dimension is the combination of dimensions. This lets you leap frog large problems by decomposition, possibly going against the grain and innovating by, hopefully, simplifying. However there is risk in falling into traps that countless others have already learned the hard way.
This may come as a surprise to some but, believe it or not, you can have both. In fact, you should.
Of course, if you happen to be on a slope that leads to the global maxima, starting from scratch is far less effective. We don't really know where we are usually, so there's a trade-off.
There was a good article posted to HN years ago that covered this and used rocketry as one of the examples, but I don't recall what it was. The author was well known, IIRC.
I think sometimes this is true. The times I've had new starters on my engineering team, I've always tried to teach them about the problem before they get exposed to any of our solutions. Sometimes they will have brand new insights that we've been completely blind to. It doesn't always happen, but there is only one opportunity for this: once they've seen the solutions they can't be unseen.
[0] https://www.nngroup.com/articles/parallel-design/
George Dantzig also solved two open problems because he thought they were homework.
That depends...
- Krapivin was an undergrad, tinkering with stuff for fun. If he'd put a couple months into this, and just ended up re-inventing a few things? That'd be decently educational.
- Vs. if your team needs to ship product, on something resembling the schedule? Yeah. You definitely stick to tried-and-true algorithms.
It is well known that limitations improve creativity.
That said, I still think the best path is to learn the classical path. If you want, you can question some axioms, but it's mostly irrational in that there's almost no reward for you personally, except clout; most of the reward goes to science as a whole.
It's a trade-off, at first it takes longer to iterate on features, but sometimes a more minimal and/or composable tool finds its way to production. Real Systems are made of duct tape anyways.
Essentially you learn a thing, you accept it for now and you think "well but maybe!"
Like I personally think there should be multiple mathematical zeroes but I accept it as wackiness unless I can clearly demonstrate coherency and utility as to why.
Everyone likes to focus on why you cannot do and why trying will be futile.
You don't have to disregard prior efforts. You just have to focus on one simple question:
"how can I do ______ ?"
Survivorship bias, you aren't aware of all the failures where people who were unaware of prior art made all the mistakes predictable to people who were.
Maybe it's just because there are more people working on these problems who don't know previous approaches than the opposite.
Funny this breakthrough happens at the same time Antirez made this post https://news.ycombinator.com/item?id=42983275
They are different tastes. They deliver different results.
This is relevant to HN because I'm probably paraphrasing this incorrectly but pg has said the following about why it's hard to launch a startup: the vast majority of ideas that sound stupid are, in fact, stupid. The ones that sound like great ideas have most likely already been done. Thus, the startups that have huge growth potential tend to be the ones that sound stupid given the conventional wisdom (so unlikely to have been tried yet) but are, contrary to the norm, actually great ideas in disguise. These ideas are basically by definition very rare.
https://thedecisionlab.com/biases/einstellung-effect
You've remembered two examples of this (arguably) happening, so you attempt to draw a conclusion based on the ease with which you came up with those examples. But in reality, this method of inference is prone to error, as it doesn't consider the denominator, or how many attempts were made to achieve the results you're able to remember.
I would suggest positive takeaways is to: trust but verify. If you’ve got a novel solution idea and don’t understand why others aren’t doing it that way, do both and compare. You’re guaranteed to learn something one way or another. Also: if you reinvent the wheel or do something suboptimal then that’s okay too. Sometimes the solutions don’t make sense until you see what doesn’t work. Likewise: be open to learning from others and exploring solutions outside of your predefined notion of how things should work.
Or all the people who, through ignorance or hubris, thought they could do better than the state of the art. Or all the people who independently invent things that already exist. These may be what you're referring to, but thought it's worth re-stating these as they are far more common cases than people inventing truly novel approaches.
Major breakthroughs are those that make paradigm shifts. So, by definition, that means that something needs to be done that others are not doing. If not, things would have been solved and the status quo method would work.
Most major breakthroughs are not the result of continued progress in one direction, but rather they are made by dark horses. Often by nobodies. You literally have to say "fuck you all, I'm doing this anyways." Really this is not so much different than the founder mentality we encourage vocally yet discourage monetarily[0]. (I'm going to speak from the side of ML, because that's my research domain, but understand that this is not as bad in other fields, though I believe the phenomenon still exists, just not to the same degree). Yet, it is really hard to publish anything novel. While reviewers care a lot about novelty, they actually care about something more: metrics. Not metrics in the way that you provided strong evidence for a hypothesis, but metrics in the way that you improved the state of the field.
We have 2 big reasons this environment will slow innovation and make breakthroughs rare.
1. It is very hard to do better than the current contenders on your first go. You're competing against not one player, but the accumulated work of thousands and over years or decades. You can find a flaw in that paradigm, address the specific flaw, but it is a lot of work to follow this through and mature it. Technological advancement is through the sum of s-curves, and the new thing always starts out worse. For example, think of solar panels. PVs were staggeringly expensive in the beginning and for little benefit. But now you can beat the grid pricing. New non-PV based solar is starting to make their way in and started out way worse than PV but addressed PV's theoretical limitations on power efficiency.
2. One needs to publish often. Truly novel work takes a lot of time. There's lots of pitfalls and nuances that need to be addressed. It involves A LOT of failure and from the outside (and even the inside) it is near impossible to quantify progress. It looks no different than wasting time, other than seeing that the person is doing "something." So what do people do? They pursue the things that are very likely to lead to results. By nature, these are low hanging fruit. (Well... there's also fraud... but that's a different discussion) Even if you are highly confident a research direction will be fruitful, it will often take too much time or be too costly to actually pursue (and not innovative/meaningful enough to "prototype"). So we all go in mostly the same direction.
(3. Tie in grants and funding. Your proposals need to be "promising" so you can't suggest something kinda out there. You're competing against a lot of others who are much more likely to make progress, even if the impact would be far lower)
So ironically, our fear of risk taking is making us worse at advancing. We try so hard to pick what are the right directions to go in, yet the truth is that no one has any idea and history backs this up. I'm not saying to just make it all chaotic. I think of it more like this: when exploring, you have a main party that travels in a set direction. Their strength together makes good progress, but the downside is there's less exploration. I am not saying that anyone should be able to command the ship on a whim, but rather that we need to let people be able to leave the ship if they want and to pursue their hunches or ideas. Someone thinks they saw an island off in the distance? Let them go. Even if you disagree, I do not think their efforts are fruitless and even if wrong they help map out the territory faster. But if we put all our eggs in one basket, we'll miss a lot of great opportunities. Right now, we let people off the main ship when there's an island that looks promising, and there are those that steal a lifeboat in the middle of the night. But we're all explorers and it seems like a bad idea to dissuade people who have that drive and passion in them. I know a lot of people in academia (including myself) who feel shackled by the systems, when really all they want to do is research. Not every one of these people are going to change things, in fact, likely most won't. But truth is, that's probably true if they stay on the ship too. Not to mention that it is incredibly common for these people to just leave academia all together anyways.
Research really is just a structured version of "fuck around and find out". So I think we should stop asking "why" we should pursue certain directions. "Because" is just as good of an excuse as any. In my ideal world, we'd publish anything as long as there is technical correctness and a lack of plagiarism. Because we usually don't know what is impactful. There are known knowns, known unknowns, and unknown unknowns. We really are trying to pretend that the unknown unknowns either don't exist, are not important, or are very small. But we can't know, they're unknown unknowns, so why pretend?
[0] An example might be all the LLM based companies trying to make AGI. You want to compete? You're not going to win by making a new LLM. But one can significantly increase their odds by taking a riskier move, and fund things that are not well established. Other types of architectures. And hey, we know the LLM isn't the only way because we humans aren't LLMs. And we humans also use a lot less energy and require far less data, so even if you are fully convinced that LLMs will get us all the way, we know there are other ways to solve this problem.
Academia is a weird and broken place.
Disclaimer: work in a research lab full of awesome PhDs who largely agree with me!
Convert the table row to a string (JSON or whatever).
Apply base16 to that variable.
You've now got a base16 string of that data.
Create a hash table and set up a key/value entry for that base16 string.
You now have a container holding the data.
All you need to do is decode the hex string and you've got the original data back.
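If I'm reading those steps right, this is the whole recipe in Python (the row contents here are made up for illustration):

    import json

    row = {"id": 42, "name": "example"}     # hypothetical table row

    encoded = json.dumps(row, sort_keys=True).encode().hex()   # the base16 string
    container = {encoded: row}              # hash table keyed by the hex string

    # Decoding the hex key gives back the original JSON text
    assert json.loads(bytes.fromhex(encoded)) == row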
What I realized is that the theory of hash tables involves a fixed-size collection of objects. For this fixed collection, you create a hash function and use it like a vector index, storing the collection in a (pre-allocated) vector. This gives a (fuzzy-lens'd) recipe for O(1)-time insertion, deletion and look-up. (The various tree structures, in contrast, don't assume a particular size.)
The two problems are that you have to decide the size beforehand, and that if your vector gets close to full, your insert etc. processes might bog down. So scanning the article, it seems this is a solution to the bogging-down part: it allows quick insertion into a nearly-full table. It seems interesting and clever but actually not a great practical advance. In practice, rather than worrying about a clever way to fill the table, I'd assume you just increase your assumed size.
Edit: I'm posting partly to test my understanding, so feel free to correct me if I'm not getting something.
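Here's a minimal sketch of that picture (fixed capacity, the hash used as a vector index, linear probing on collisions); it's a toy illustration of the classic scheme, not anything from the paper:

    class FixedHashTable:
        def __init__(self, capacity):
            self.capacity = capacity
            self.keys = [None] * capacity   # the pre-allocated vector
            self.vals = [None] * capacity

        def _probe(self, key):
            """Slot indices starting at hash(key), wrapping around (linear probing)."""
            start = hash(key) % self.capacity
            for offset in range(self.capacity):
                yield (start + offset) % self.capacity

        def insert(self, key, value):
            for i in self._probe(key):
                if self.keys[i] is None or self.keys[i] == key:
                    self.keys[i], self.vals[i] = key, value
                    return
            raise RuntimeError("table is full")   # the decide-size-beforehand problem

        def lookup(self, key):
            for i in self._probe(key):
                if self.keys[i] is None:
                    raise KeyError(key)
                if self.keys[i] == key:
                    return self.vals[i]
            raise KeyError(key)

    # t = FixedHashTable(8); t.insert("a", 1); t.lookup("a")  -> 1

The bogging down is visible right in insert: as the keys vector fills up, the probe loop walks further and further before it finds a None slot. (Deletion needs tombstones and is omitted here.)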
Trees (sorted) are good at finding subsets and ranges ("scanning" or "searching"), but hashmaps are better at "seeking", i.e. a single key-value lookup.
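A toy illustration of that split, using a sorted list with bisect as a stand-in for a tree:

    import bisect

    keys = sorted(["apple", "banana", "cherry", "date", "fig", "grape"])
    index = {k: i for i, k in enumerate(keys)}   # hash map for point lookups

    # "Seeking": one key -> one value, O(1) expected
    print(index["cherry"])                       # 2

    # "Scanning": everything in the range ["b", "e"), O(log n + k) on the sorted structure
    lo = bisect.bisect_left(keys, "b")
    hi = bisect.bisect_left(keys, "e")
    print(keys[lo:hi])                           # ['banana', 'cherry', 'date']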
I don’t have time to fully grok the paper, but they claim this makes insertion consistently fast (I believe this until we’re at 75% of total capacity, but maybe they have some other mode for filling when they’re at 75% in every row?). They also claim retrieval is fast, and I didn’t read enough to understand how even retrieval works, or why it is faster.
I'll point out that there are a lot of times when it would be really nice to have a nearly full hash table still, you know, work. You can't always change the size of one during execution of a program. And, in some environments, memory counts a lot. That said, I would like to see and play with an implementation; I'm not sure this is 'worth it' in the general case.
It is also probably cache inefficient, as are most things about hash tables, with the exception of linear probing for reading out of a fairly full one, in which case, you get to just keep pulling stuff directly out of memory to check it. So, it’s not clear to me that this is performance wise worth it. Anyway, I’d like to fully understand it, it seems like an interesting new idea.
> The team’s results may not lead to any immediate applications
I don't understand why it wouldn't lead to immediate applications. Is this a situation where analysis of real-world use cases allows you to tune your hash implementation better than what a purely mathematical approach would get you?
So if you leave 0.1% of your hashtable unused, your x is 1000 (x being the reciprocal of the fraction left empty), which is quite problematic. However, if you leave 12.5% of your hashtable unused, your x is 8: quite reasonable, and not something logarithmic behavior would necessarily speed up, for reasonable constants.
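A quick back-of-the-envelope comparison of the two regimes, ignoring constant factors on both sides:

    import math

    for x in (8, 1000):
        log_sq = math.log2(x) ** 2
        print(f"x = {x}: linear ~ {x}, (log2 x)^2 ~ {log_sq:.0f}")

    # x = 8:    8 vs ~9     -> no win with 12.5% headroom
    # x = 1000: 1000 vs ~99 -> a big win only when the table is nearly full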
At that scale, it may be practically better to use a B-tree or a radix tree over a 128-bit hash of the key.
I imagine there are some niche memory-constrained applications where you'd let x get larger but I never ran into them personally.
Bucket hashing obviously has no problem finding a free spot. I deliberately didn't say "chained", because the buckets can be resizable vectors, so we load at most one pointer when going from the table to a bucket.
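For reference, a minimal sketch of that layout, where each bucket is a growable list rather than a linked chain:

    class BucketHashTable:
        def __init__(self, n_buckets=16):
            # one resizable vector per bucket; no per-entry pointer chasing
            self.buckets = [[] for _ in range(n_buckets)]

        def insert(self, key, value):
            bucket = self.buckets[hash(key) % len(self.buckets)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)    # overwrite an existing key
                    return
            bucket.append((key, value))         # always room: the bucket just grows

        def lookup(self, key):
            bucket = self.buckets[hash(key) % len(self.buckets)]
            for k, v in bucket:
                if k == key:
                    return v
            raise KeyError(key)

The single indirection is the jump from the table slot to the bucket's backing array; after that, scanning the bucket is a contiguous walk.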
If you know empirically that 80% of your requests go to a tiny subset of hot keys, you make a dedicated hashset just for those hot keys with constant access time, while keeping the rest of the table cold.
Your L2 cache is an example of such an optimization: a high-bandwidth, low-latency memory on the CPU die that is orders of magnitude faster than a random DRAM access. It's a tiered memory model.
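A sketch of that kind of two-tier lookup (the key names and the hot set are placeholders; in Python both tiers are plain dicts, so this shows the shape of the idea rather than the speedup):

    class TieredMap:
        """A small 'hot' table for the keys that take most of the traffic,
        with a fallback to the big 'cold' table for everything else."""
        def __init__(self, hot_keys, cold_store):
            self.cold = cold_store
            self.hot = {k: cold_store[k] for k in hot_keys if k in cold_store}

        def get(self, key):
            if key in self.hot:      # small enough to stay cache-resident
                return self.hot[key]
            return self.cold[key]    # the larger, colder structure

    # cold = {f"user:{i}": i for i in range(1_000_000)}
    # tiered = TieredMap(hot_keys=["user:1", "user:2"], cold_store=cold)
    # tiered.get("user:1")  -> 1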
A constant factor doesn't show up in the scaling rule, so something that scales as (log x)^2 could still be 100 times more expensive than something that scales linearly with x for all x smaller than 2^100.
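As a purely hypothetical check on how far a constant can push that crossover, compare a cost of c * (log2 x)^2 against a linear cost of x:

    import math

    def crossover(c):
        """Smallest power of two where the linear cost x overtakes c * (log2 x)^2."""
        x = 2
        while c * math.log2(x) ** 2 >= x:
            x *= 2
        return x

    for c in (2, 100, 10_000):
        print(c, crossover(c))   # 128, 32768, 8388608: the crossover moves out as c grows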