I'm surprised that many comments here seem to have missed this bit of context:
> One thing you need to know about me is that despite working on SumatraPDF C++ code base for 16 years, I don’t know 80% of C++.
I'm pretty sure that most "why don't you just use x…" questions are implicitly answered by it, with the answer being "because using x correctly requires learning about all of its intricacies and edge cases, which in turn requires understanding related features q, r, s… all the way to z, because C++ edge-case complexity doesn't exist in a vacuum".
I agree: this quote is the star of the show. I'll troll a little bit here and say I thought to myself: "Finally, a humble C++ programmer. Really, they do exist... well, at least one."
> Even I can’t answer every question about C++ without reference to supporting material (e.g. my own books, online documentation, or the standard). I’m sure that if I tried to keep all of that information in my head, I’d become a worse programmer.
-- Bjarne Stroustrup, creator of C++
30-year C++ veteran here. Also don't know 80%. I used to know more, but a combination of much better tooling, "modern" C++, and, most importantly, realising that thinking about things other than language details leads to better software means I have forgotten stuff I used to know.
Sounds like the 80/20 rule but applied to mastering C++ features. Which honestly makes sense to me, and doesn't even feel C++ specific, really. It just becomes more applicable the bigger and more complicated a language gets.
Out of curiosity, what's your use case for it? Years ago I preferred Sumatra/Foxit to Adobe, but every major browser has supported rendering PDFs for at least a decade and I haven't needed or wanted a dedicated PDF reader in all that time.
Opening a PDF inside a browser feels to me like an application inside an application. My brain can't handle that load. I would rather have the browser browse the internet and a PDF reader display PDFs. If I click on a link to a PDF, it is _not_ part of the web, and I want the browser to stay out of it. The same goes for Office 365 wanting to open documents inside my browser. I don't want it to do that; I have the necessary apps installed for it.
Not only is it faster to open than a browser and a better separation of concerns (documents get their own app, which I can leave open with its own tabs), it also opens EPUB, .cbz, and other formats, so I have it installed on all my Windows machines. And I do eventually open a book.
Part of why I use SumatraPDF is that it automatically reloads its view when the files change (at least for PDFs, I haven't tested on the other file types it supports).
Sumatra excels at read-only viewing. Usually anything to do with PDF is synonymous with slow, bloated, and buggy, but Sumatra, at just 10 MB, manages to feel snappy and fast, like a native Win32 UI.
> I haven't needed or wanted a dedicated PDF reader in all that time.
OK. Now load 100 PDFs. You will need a dedicated PDF reader unless you don't mind wasting a truckload of RAM. Also, browser PDF readers are generally slower and are not optimal at search/bookmarks/navigation/etc.
I use my browser for most PDFs. But for PDFs that have a lot of vector graphics and are over 50-100 MB, the browser viewer is very slow to load and render the pages.
Even zooming in on a part of a drawing can take 10-15 seconds in the browser which is pretty disruptive.
Sumatra has no issues with 200 MB+ PDFs, or ones with complex drawings.
These are all engineering drawings such as mechanical, electrical, and architectural drawings, so mine might not be a use case everyone has.
It's smaller, lighter and much faster than launching a web browser to view a PDF. I can configure it to open a new instance for each PDF which is nice if you need to have several docs open at once. Again, nothing that you can't do with a browser and dragging tabs, but I prefer this.
As I recall, it's possible to configure an external editor so that when you click anywhere in the SumatraPDF viewer, it opens the source file at the position you clicked. This is extremely helpful when working with LaTeX documents.
Sumatra will reload any PDF that changes while you are viewing it (Adobe locks the file, so you can't change it to begin with). This is incredibly useful when you are writing documentation using a document generating system (like docbook).
This seems very similar to Java's oldschool single-interface callback mechanism. Originally, Java didn't have lambdas or closures or anything of the sort, so instead they'd litter the standard library with single-method interfaces with names like ActionListener, MouseListener, ListItemSelectedListener, etc. You'd make a class that implements that interface, manually adding whatever data you need in the callback (just like here), and implement the callback method itself of course.
I think that has the same benefit as this, that the callbacks are all very clearly named and therefore easy to pick out of a stack trace.
(In fact, it seems like a missed opportunity that modern Java lambdas, which are simply syntactical sugar around the same single-method interface, do not seem to use the interface name in the autogenerated class)
They don't autogenerate classes anymore, just private static methods, though I agree that it would be nice to have more of the metadata in the name of the generated method.
How does that work with variables in the closure then? I could see that working with the autogenerated class: just make a class field for every variable referenced inside the lambda function body, and assign those in the constructor. Pretty similar to the article here. But it's not immediately obvious to me how private static methods can be used to do the same, except for callbacks that do not form a closure (e.g. filter predicates, sort comparators, and the like that only use the function parameters).
I'm not a C++ programmer, but I was under the impression that closures in c++ were just classes that overload the function call operator `operator()`. So each closure could also be implemented as a named class. Something like:
Indeed, that is exactly the case; lambdas are essentially syntax sugar for doing this.
The one thing the author's solution does which this solution (and lambdas) does not is type erasure: if you want to pass that closure around, you have to use templates, and you can't store different lambdas in the same data structure even if they have the same signature.
You could solve that in your case by making `void operator()` virtual and inheriting (though that means you have to heap-allocate all your lambdas), or use `std::function<>`, which is a generic solution to this problem (and which may or may not allocate; if the lambda is small enough, it's usually stored inline).
I get where the author is coming from, but this seems very much like an inferior solution to just using `std::function<>`.
The author of the article freely admits that `std::function<>` is more flexible. He still prefers this solution, as it is easier for him to reason about. This is covered in the "Fringe Benefits" part of the document.
> though that means you have to heap-allocate all your lambdas
I think whether or not you have to allocate from the heap depends on the lifetime of the lambda. Virtual methods also work just fine on stack-allocated objects.
Exactly! And if you need type erasure, you can just store it in a std::function.
> OnListItemSelectedData data;
In this case you can just store the data as member variables. No need for defining an extra class just for the data.
As I've written elsewhere, you can also just use a lambda and forward the captures and arguments to a (member) function. Or if you're old-school, use std::bind.
Yes, but that's exactly why I mention this. By explicitly creating a class (that behaves the same as a lambda) the author might get better names in crash reports.
I don't have this problem with backtraces in Clang. The 'anonymous' lambdas have debugging symbols named after the function it lexically appears in, something like parent_function::$_0::invoke. $_0 is the first lambda in that function, then $_1, etc. So it's easy enough to look up.
It's up to the demangler, the info must be there in the decorated/mangled name. Demanglers sometimes choke on these complex symbols.
AFAIK MSVC also changed their lambda ABI once, including mangling. As I recall at one point it even produced some hash in the decorated/mangled name, with no way to revert it, but that was before /Zc:lambda (enabled by default from C++20).
(This approach also requires explicitly writing the argument type. It's possible to remove the need for this, but not without the kind of complexity you're trying to avoid.)
I don’t really understand what problem this is trying to solve, or how the solution is better than std::function. (I understand the issue with crash reports and lambdas being anonymous classes, but I'm not sure how the solution improves on this, or how std::function has this problem.)
I haven’t used windows in a long time but back in the day I remember installing SumatraPDF to my Pentium 3 system running windows XP and that shit rocked
I think none of these points is demonstrated in the post, hence I fail to visualize it.
Also I copy pasted the code from the post and I got this:
    test.cpp:70:14: error: assigning to 'void *' from 'func0Ptr' (aka 'void (*)(void *)') converts between void pointer and function pointer
       70 |     res.fn = (func0Ptr)fn;
You can't just keep claiming these things without providing evidence. How much faster? How much smaller? These claims are meaningless without numbers to back it up.
Your Func thing is better than std::function the same way a hammer is better than a drill press... i.e. it's not better, because it's not the same thing at all. Yes, the hammer can do some of the same things, at a lower complexity, but it can't do all the same things.
What I'm trying to say is being better than x means you can do all the same things as x better. Your thing is not better, it is just different.
Yours is smaller (in terms of sizeof) because std::function employs the small-buffer optimization (SBO): if the user data fits into a specific size, it's stored inline in the std::function instead of getting heap allocated. Yours needs heap allocation for the ones that take data.
Whether yours win or lose on using less memory heavily depends on your typical closure sizes.
It's a daily thing we all do: decide whether this problem is better solved by a big chunk of code that is probably well tested but satisfies a bunch of requirements and constraints beyond yours, or a smaller chunk of code that I can write or vendor in and that has other advantages, or maybe I just prefer how it's spelled. Sometimes there's a "right" answer, e.g. you should generally link in your TLS implementation unless you're a professional TLS person, but usually it's a judgement call, and the aggregate of all those micro-decisions is a component of the intangible ideal of "good taste" (also somewhat subjective, but most agree on the concept of an ideal).
In this instance the maintainer of a useful piece of software has made a choice that's a little less common in C++ (totally standard practice in C), and it seems fine. It's on the bubble; I'd probably default the other way, but std::function is complex and there are platforms where that kind of machine economy is a real consideration, so why not?
In a zillion-contributor project I'd be a little more skeptical of the call, but even on massive projects like the Linux kernel they make decisions about the house style that seem unorthodox to outsiders, and they have their reasons for doing so. I misplaced the link, but a kernel maintainer once raised grep-friendliness as a reason he didn't want a patch. At first I was like, nah, you're not saying the real reason, but I looked a little and indeed, the new stuff would be harder to navigate without a super well-configured LSP.
Longtime maintainers have reasons they do things a certain way, and the real test is the result. In this instance (and in most) I think the maintainer seems to know what's best for their project.
I guess the point is that the article does not prove what he did is better in any of the ways he claimed, except for the “I understand it” part
Making changes like this, claiming they will result in faster or smaller code, without any test or comparison of before vs. after, doesn't seem like the best way of engineering something
I think this is why the thread has seen a lot of push back overall
Maybe the claims are true or maybe they are not - we cannot really say based on the article (though I’m guessing not really)
I just want to thank SumatraPDF's creator, he literally saved my sanity from the evil that Adobe Acrobat Reader is. He probably saved millions of people thousands of hours of frustration using Acrobat Reader.
Deleted Comment
So one can expect that zero-days exist and are exploited.
That may not be a feature for you, but it is for attackers.
You could get around this by using a wrapper function, at the cost of a slightly different interface: https://gcc.godbolt.org/z/EaPqKfvne
- Smaller size at runtime (uses less memory).
- Smaller generated code.
- Faster at runtime.
- Faster compilation times.
- Smaller implementation.
- Implementation that you can understand.
How is it worse?
std::function + lambda with variable capture has better ergonomics, i.e. less typing.
My unanswered question on this from 8 years ago:
https://stackoverflow.com/questions/41385439/named-c-lambdas...
If there were a way to name lambdas for debug purposes, then all other downsides would be irrelevant (for most usual use cases of callbacks).
How much smaller is it? Does it reduce the binary size and RAM usage by just 100 bytes?
Is it actually faster?
How much faster does it compile? 2ms faster?
> Faster at runtime
Benchmark, please.