I'm surprised that many comments here seem to have missed this bit of context:
> One thing you need to know about me is that despite working on SumatraPDF C++ code base for 16 years, I don’t know 80% of C++.
I'm pretty sure that most "why don't you just use x…" questions are implicitly answered by it, with the answer being "because using x correctly requires learning about all of its intricacies and edge cases, which in turn requires understanding related features q, r, s… all the way to z, because C++ edge-case complexity doesn't exist in a vacuum".
I agree: this quote is the star of the show. I'll troll a little bit here and say I thought to myself: "Finally, a humble C++ programmer. Really, they do exist... well, at least one."
> Even I can’t answer every question about C++ without reference to supporting material (e.g. my own books, online documentation, or the standard). I’m sure that if I tried to keep all of that information in my head, I’d become a worse programmer.
-- Bjarne Stroustrup, creator of C++
30-year C++ veteran here. Also don't know 80%. I used to know more, but a combination of much better tooling, "modern" C++, and, most importantly, realising that thinking about things other than language details leads to better software means I have forgotten stuff I used to know.
Sounds like the 80/20 rule but applied to mastering C++ features. Which honestly makes sense to me, and doesn't even feel C++ specific, really. It just becomes more applicable the bigger and more complicated a language gets.
Out of curiosity, what's your use case for it? Years ago I preferred Sumatra/Foxit to Adobe, but every major browser has supported rendering PDFs for at least a decade and I haven't needed or wanted a dedicated PDF reader in all that time.
Opening a PDF inside a browser feels to me like an application inside an application. My brain can't handle that load. I would rather have the browser browse the internet and a PDF reader display PDFs. If I click on a link to a PDF, it is _not_ part of the web, and I want the browser to stay out of it. The same goes for Office 365 wanting to open documents inside my browser. I don't want it to do that; I have the necessary apps installed for it.
Not only is it faster to open than a browser and a better separation of concerns (documents get their own app, which I can leave open with its own tabs), it also opens EPUB, .cbz, and other formats, so I have it installed on all my Windows machines. And I do eventually open a book.
Part of why I use SumatraPDF is that it automatically reloads its view when the files change (at least for PDFs, I haven't tested on the other file types it supports).
Sumatra excels at read-only viewing. Usually anything to do with PDF is synonymous with slow, bloated, and buggy, but Sumatra, at just 10 MB, manages to feel snappy and fast, like a native Win32 UI.
> I haven't needed or wanted a dedicated PDF reader in all that time.
OK. Now load 100 PDFs. You will need a dedicated PDF reader unless you don't mind wasting a truckload of RAM. Also, browser PDF readers are generally slower and are not optimal at search/bookmarks/navigation/etc.
I use my browser for most PDFs. But for PDFs that have a lot of vector graphics and are over 50-100 MB, the browser viewer is very slow to load and render the pages.
Even zooming in on a part of a drawing can take 10-15 seconds in the browser which is pretty disruptive.
Sumatra has no issues with 200 MB+ PDFs, or ones with complex drawings.
These are all engineering drawings such as mechanical, electrical, and architectural drawings, so mine might not be a use case everyone has.
It's smaller, lighter and much faster than launching a web browser to view a PDF. I can configure it to open a new instance for each PDF which is nice if you need to have several docs open at once. Again, nothing that you can't do with a browser and dragging tabs, but I prefer this.
As I recall, it's possible to configure an external editor so that when you click anywhere in the SumatraPDF viewer, it opens the source file at the position you clicked. This is extremely helpful when working with LaTeX documents.
Sumatra will reload any PDF that changes while you are viewing it (Adobe locks the file, so you can't change it to begin with). This is incredibly useful when you are writing documentation using a document generating system (like docbook).
This seems very similar to Java's oldschool single-interface callback mechanism. Originally, Java didn't have lambdas or closures or anything of the sort, so instead they'd litter the standard library with single-method interfaces with names like ActionListener, MouseListener, ListItemSelectedListener, etc. You'd make a class that implements that interface, manually adding whatever data you need in the callback (just like here), and implement the callback method itself of course.
I think that has the same benefit as this, that the callbacks are all very clearly named and therefore easy to pick out of a stack trace.
(In fact, it seems like a missed opportunity that modern Java lambdas, which are simply syntactical sugar around the same single-method interface, do not seem to use the interface name in the autogenerated class)
They don't autogenerate classes anymore, just private static methods, though I agree that it would be nice to have more of the metadata in the name of the generated method.
How does that work with variables in the closure then? I could see that working with the autogenerated class: just make a class field for every variable referenced inside the lambda function body, and assign those in the constructor. Pretty similar to the article here. But it's not immediately obvious to me how private static methods can be used to do the same, except for callbacks that do not form a closure (e.g. filter predicates, sort comparators, and the like that only use the function parameters).
I'm not a C++ programmer, but I was under the impression that closures in c++ were just classes that overload the function call operator `operator()`. So each closure could also be implemented as a named class. Something like:
Indeed, that is exactly the case; lambdas are essentially syntax sugar for doing this.
The one thing the author's solution does which this solution (and lambdas) does not is type erasure: if you want to pass that closure around, you have to use templates, and you can't store different lambdas in the same data structure even if they have the same signature.
You could solve that in your case by making `void operator()` virtual and inheriting (though that means you have to heap-allocate all your lambdas), or use `std::function<>`, which is a generic solution to this problem (and which may or may not allocate; if the lambda is small enough, it's usually stored inline).
I get where the author is coming from, but this seems very much like an inferior solution to just using `std::function<>`.
The author of the article freely admits that `std::function<>` is more flexible. He still prefers this solution, as it is easier for him to reason about. This is covered in the "Fringe Benefits" part of the document.
> though that means you have to heap-allocate all your lambdas
I think whether or not you have to allocate from the heap depends on the lifetime of the lambda. Virtual methods also work just fine on stack-allocated objects.
Exactly! And if you need type erasure, you can just store it in a std::function.
> OnListItemSelectedData data;
In this case you can just store the data as member variables. No need for defining an extra class just for the data.
As I've written elsewhere, you can also just use a lambda and forward the captures and arguments to a (member) function. Or if you're old-school, use std::bind.
Yes, but that's exactly why I mention this. By explicitly creating a class (that behaves the same as a lambda) the author might get better names in crash reports.
I don't have this problem with backtraces in Clang. The 'anonymous' lambdas have debugging symbols named after the function it lexically appears in, something like parent_function::$_0::invoke. $_0 is the first lambda in that function, then $_1, etc. So it's easy enough to look up.
It's up to the demangler, the info must be there in the decorated/mangled name. Demanglers sometimes choke on these complex symbols.
AFAIK MSVC also changed their lambda ABI once, including mangling. As I recall at one point it even produced some hash in the decorated/mangled name, with no way to revert it, but that was before /Zc:lambda (enabled by default from C++20).
(This approach also requires explicitly writing the argument type. It's possible to remove the need for this, but not without the kind of complexity you're trying to avoid.)
I don’t really understand what problem this is trying to solve, or how the solution is better than std::function. (I understand the issue with crash reports and lambdas being anonymous classes, but I'm not sure how the solution improves on this, or how std::function has this problem.)
I haven’t used windows in a long time but back in the day I remember installing SumatraPDF to my Pentium 3 system running windows XP and that shit rocked
I think none of these points is demonstrated in the post, hence I fail to visualize it.
Also I copy pasted the code from the post and I got this:
    test.cpp:70:14: error: assigning to 'void *' from 'func0Ptr' (aka 'void (*)(void *)') converts between void pointer and function pointer
       70 |     res.fn = (func0Ptr)fn;
You can't just keep claiming these things without providing evidence. How much faster? How much smaller? These claims are meaningless without numbers to back it up.
Your Func thing is better than std::function the same way a hammer is better than a drill press... i.e. it's not better, because it's not the same thing at all. Yes, the hammer can do some of the same things, at a lower complexity, but it can't do all the same things.
What I'm trying to say is being better than x means you can do all the same things as x better. Your thing is not better, it is just different.
Yours is smaller (in terms of sizeof) because std::function employs the small-buffer optimization (SBO): if the user data fits into a specific size, it's stored inline in the std::function instead of getting heap allocated. Yours needs heap allocation for the ones that take data.
Whether yours win or lose on using less memory heavily depends on your typical closure sizes.
It's a daily thing we all do: decide whether this problem is better solved by a big chunk of code that is probably well tested but satisfies a bunch of requirements and constraints beyond yours, or a smaller chunk of code that I can write or vendor in and that has other advantages, or maybe I just prefer how it's spelled. Sometimes there's a "right" answer, e.g. you should generally link in your TLS implementation unless you're a professional TLS person, but usually it's a judgement call, and the aggregate of all those micro-decisions is a component of the intangible ideal of "good taste" (also somewhat subjective, but most agree on the concept of an ideal).
In this instance the maintainer of a useful piece of software has made a choice that's a little less common in C++ (totally standard practice in C), and it seems fine. It's on the bubble; I'd probably default the other way, but std::function is complex and there are platforms where that kind of machine economy is a real consideration, so why not?
In a zillion-contributor project I'd be a little more skeptical of the call, but even on massive projects like the Linux kernel they make decisions about the house style that seem unorthodox to outsiders, and they have their reasons for doing so. I misplaced the link, but a kernel maintainer once raised grep-friendliness as a reason he didn't want a patch. At first I was like, nah, you're not saying the real reason, but I looked a little and indeed, the new stuff would be harder to navigate without a super well-configured LSP.
Longtime maintainers have reasons they do things a certain way, and the real test is the result. In this instance (and in most) I think the maintainer seems to know what's best for their project.
I guess the point is that the article does not prove what he did is better in any of the ways he claimed, except for the “I understand it” part
Making changes like this, claiming they will result in faster or smaller code, without any test or comparison of before vs. after, doesn't seem like the best way of engineering something
I think this is why the thread has seen a lot of push back overall
Maybe the claims are true or maybe they are not - we cannot really say based on the article (though I’m guessing not really)
I just want to thank SumatraPDF's creator, he literally saved my sanity from the evil that Adobe Acrobat Reader is. He probably saved millions of people thousands of hours of frustration using Acrobat Reader.
Deleted Comment
So one can expect that zero-days exist and are exploited.
That may not be a feature for you, but it is for attackers.
You could get around this by using a wrapper function, at the cost of a slightly different interface: https://gcc.godbolt.org/z/EaPqKfvne
- Smaller size at runtime (uses less memory).
- Smaller generated code.
- Faster at runtime.
- Faster compilation times.
- Smaller implementation.
- Implementation that you can understand.
How is it worse?
std::function + lambda with variable capture has better ergonomics, i.e. less typing.
My unanswered question on this from 8 years ago:
https://stackoverflow.com/questions/41385439/named-c-lambdas...
If there were a way to name lambdas for debug purposes, then all other downsides would be irrelevant (for most usual use cases of callbacks).
How much smaller is it? Does it reduce the binary size and RAM usage by just 100 bytes?
Is it actually faster?
How much faster does it compile? 2ms faster?
> Faster at runtime
Benchmark, please.