C++ patterns for low-latency applications including high-frequency trading

Fairly trivial base introduction to the subject.

In my experience teaching undergrads they mostly get this stuff already. Their CompArch class has taught them the basics of branch prediction, cache coherence, and instruction caches; the trivial elements of performance.

I'm somewhat surprised the piece doesn't deal at all with a classic performance killer, false sharing, although it seems mostly concerned with single-threaded latency. The total lack of "free" optimization tricks like fat LTO, PGO, or even the standardized hinting attributes ([[likely]], [[unlikely]]) for optimizing icache layout was also surprising.

Neither this piece, nor my undergraduates, deal with the more nitty-gritty elements of performance. These mostly get into the usage specifics of particular IO APIs, synchronization primitives, IPC mechanisms, and some of the more esoteric compiler builtins.

Besides all that, what the nascent low-latency programmer almost always lacks, and the hardest thing to instill in them, is a certain paranoia. A genuine fear, hate, and anger, towards unnecessary allocations, copies, and other performance killers. A creeping feeling that causes them to compulsively run the benchmarks through callgrind looking for calls into the object cache that miss and go to an allocator in the middle of the hot loop.

I think a formative moment for me was when I was writing a low-latency server and I realized that constructing a vector I/O operation ended up being overall slower than just copying the small objects I was dealing with into a contiguous buffer and performing a single write. There's no such thing as a free copy, and that includes fat pointers.

chipdart · a year ago

> Fairly trivial base introduction to the subject.

Might be, but low-latency C++, in spite of being a field on its own, is a desert of information.

The best resources available at the moment on low-latency C++ are a hand full of lectures from C++ conferences which left much to be desired.

Putting aside the temptation to grandstand, this document is an outstanding contribution to the field and perhaps the first authoritative reference on the subject. Vague claims that you can piece together similar info from other courses does not count as a contribution, and helps no one.

Aurornis · a year ago

> this document is an outstanding contribution to the field and perhaps the first authoritative reference on the subject.

I don't know how you arrive at this conclusion. The document really is an introduction to the same basic performance techniques that have been covered over and over. Loop unrolling, inlining, and the other techniques have appeared in countless textbooks and blog posts already.

I was disappointed to read the paper because they spent so much time covering really basic micro techniques but then didn't cover any of the more complicated issues mentioned in the parent comment.

I don't understand why you'd think this is an "outstanding contribution to the field" when it's basically a recap of simple techniques that have been covered countless times in textbooks and other works already. This paper may seem profound if someone has never, ever read anything about performance optimization before, but it's likely mundane to anyone who has worked on performance before or even wondered what inlining or -Funroll-loops does while reading some other code.

radarsat1 · a year ago

> A creeping feeling that causes them to compulsively run the benchmarks through callgrind

I'm happy I don't deal with such things these days, but I feel where the real paranoia always lies is the Heisenberg feeling of not even being able to even trust these things, the sneaky suspicion that the program is doing something different when I'm not measuring it.

mbo · a year ago

Out of interest, do you have any literature that you'd recommend instead?

nickelpro · a year ago

On the software side I don't think HFT is as special a space as this paper makes it out to be.[1] Each year at cppcon there's another half-dozen talks going in depth on different elements of performance that cover more ground collectively than any single paper will.

Similarly, there's an immense amount of formal literature and textbooks out of the game development space that can be very useful to newcomers looking for structural approaches to high performance compute and IO loops. Games care a lot about local and network latency, the problem spaces aren't that far apart (and writing games is a very fun way to learn).

I don't have specific recommendations for holistic introductions to the field. I learn new techniques primarily through building things, watching conference talks, reading source code of other low latency projects, and discussion with coworkers.

[1]: HFT is quite special on the hardware side, which is discussed in the paper. The NICs, network stacks, and extensive integration of FPGAs do heavily differentiate the industry and I don't want to insinuate otherwise.

You will not find a lot of SystemVerilog programmers at a typical video game studio.

_aavaa_ · a year ago

I'd recommend: https://www.computerenhance.com

The author has a strong game development (engine and tooling) background and I have found it incredibly useful.

It also satisfies the requirement for "A genuine fear, hate, and anger, towards unnecessary allocations, copies, and other performance killers."

contingencies · a year ago

How I might approach it. Interested in feedback from people closer to the space.

First, split the load in to simple asset-specific data streams with a front-end FPGA for raw speed. Resist the temptation to actually execute here as the friction is too high for iteration, people, supply chain, etc. Input may be a FIX stream or similar, output is a series of asset-specific binary event streams along low-latency buses, split in to asset-specific segments of a scalable cluster of low-end MCUs. Second, get rid of the general purpose operating system assumption on your asset-specific MCU-based execution platform to enable faster turnaround using low-level code you can actually find people to write on hardware you can actually purchase. Third, profit? In such a setup you'd need to monitor the overall state with a general purpose OS based governor which could pause or change strategies by reprogramming the individual elements as required.

Just how low are the latencies involved? At a certain point you're better off paying to get the hardware closer to the core than bothering with engineering, right? I guess that heavily depends on the rules and available DCs / link infrastructure offered by the exchanges or pools in question. I would guess a number of profitable operations probably don't disclose which pools they connect to and make a business of front-running, regulations or terms of service be damned. In such cases, the relative network geographic latency between two points of execution is more powerful than the absolute latency to one.

nickelpro · a year ago

The work I do is all in the single-to-low-double-digit microsecond range to give you an idea of timing constraints. I'm peripheral to HFT as a field though.

> First, split the load in to simple asset-specific data streams with a front-end FPGA for raw speed. Resist the temptation to actually execute here as the friction is too high for iteration, people, supply chain, etc.

This is largely incorrect, or more generously out-of-date, and it influences everything downstream of your explanation. Think of FPGAs as far more flexible GPUs and you're in the right arena. Input parsing and filtering are the obvious applications, but this is by no means the end state.

A wide variety of sanity checks and monitoring features are pushed to the FPGAs, fixed calculation tasks, and output generation. It is possible for the entire stack for some (or most, or all) transactions to be implemented at the FPGA layer. For such transactions the time magnitudes are mid-to-high triple digit nanoseconds. The stacks I've seen with my own two eyeballs talked to supervision algorithms over PCIe (which themselves must be not-slow, but not in the same realm as <10us work), but otherwise nothing crazy fancy. This is well covered in the older academic work on the subject [1], which is why I'm fairly certain its long out of date by now.

HRT has some public information on the pipeline they use for testing and verifying trading components implemented in HDL.[2] With the modern tooling, namely Verilator, development isn't significantly different than modern software development. If anything, SystemVerilog components are much easier to unit test than typical C++ code.

Beyond that it gets way too firm-specific to really comment on anything, and I'm certainly not the one to comment. There's maybe three dozen HFT firms in the entire United States? It's not a huge field with widely acknowledged industry norms.

[1]: https://ieeexplore.ieee.org/document/6299067

[2]: https://www.hudsonrivertrading.com/hrtbeat/verify-custom-har...

PoignardAzur · a year ago

> optimization tricks like fat LTO, PGO, or even the standardized hinting attributes ([[likely]], [[unlikely]]) for optimizing icache layout

If you do PGO, aren't hinting attributes counter-productive?

In fact, the common wisdom I mostly see compiler people express is that most of the time they're counter-productive even without PGO, and modern compilers trust their own analysis passes more than they trust these hints and will usually ignore them.

FWIW, the only times I've seen these hints in the wild were in places where the compiler could easily insert them, eg the null check after a malloc call.

nickelpro · a year ago

I said "or even", if you're regularly using PGO they're irrelevant, but not everyone regularly uses PGO in a way that covers all their workloads.

The hinting attributes are exceptional for lone conditionals (not if/else trees) without obvious context to the compiler if it will frequently follow or skip the branch. Compilers are frequently conservative with such things and keep the code in the hot path.

The [[likely]] attribute then doesn't matter so much, but [[unlikely]] is absolutely respected and gets the code out of the hot path, especially with inlined into a large section. Godbolt is useful to verify this but obviously there's no substitute for benchmarking the performance impact.

matheusmoreira · a year ago

> allocations, copies, and other performance killers

Please elaborate on those other performance killers.

T *item = &this->shared_mem_region ->entities[this->shared_mem_region->consumer_position]; this->shared_mem_region->consumer_position++; this->shared_mem_region->consumer_position %= this->slots;

Is there any good reason for high-frequency trading to exist? People often complain about bitcoin wasting energy, but oddly this gets a free pass despite this being a definite net negative to society as far as I can tell.

jeffreyrogers · a year ago

Bid/ask spreads are far narrower than they were previously. If you look at the profits of the HFT industry as a whole they aren't that large (low billions) and their dollar volume is in the trillions. Hard to argue that the industry is wildly prosocial but making spreads narrower does mean less money goes to middlemen.

akira2501 · a year ago

> but making spreads narrower does mean less money goes to middlemen.

On individual trades. I would think you'd have to also argue that their high overall trading volume is somehow also a benefit to the broader market or at the very least that it does not outcompete the benefits of narrowing.

sesuximo · a year ago

Why do high spreads mean more money for middlemen?

TheAlchemist · a year ago

Because it's not explicitly forbidden ?

I would argue that HFT is a rather small space, albeit pretty concentrated. It's several orders of magnitude smaller in terms of energy wasting than Bitcoin.

The only positive from HFT is liquidity and tighter spreads, but it also depends what people put into HFT definition. For example, Robinhood and free trading, probably wouldn't exist without it.

They are taking a part of the cake that previously went to brokers and banks. HFT is not in a business of screwing 'the little guy'.

From my perspective there is little to none negative to the society. If somebody is investing long term in the stock market, he couldn't care less about HFT.

musicale · a year ago

Buying and quickly selling a stock requires a buyer and a seller, so it doesn't seem like an intrinsic property that real liquidity would be increased. Moreover, fast automated trading seems likely to increase volatility.

I tend to agree that for long term investment it probably doesn't make a huge difference except for possible cumulative effects of decreased liquidity, increased volatility, flash crashes, etc. Also possibly a loss of small investor confidence since the game seems even more rigged in a way that they cannot compete with.

bravura · a year ago

Warren Buffett proposed that the stock market should be open less frequently, like once a quarter or similar. This would encourage long-term investing rather than reacting to speculation.

Regardless, there are no natural events that necessitate high-frequency trading. The underlying value of things rarely changes very quickly, and if it does it's not volatile, rather it's a firm transiton.

astromaniak · a year ago

> Warren Buffett proposed that the stock market should be open less frequently

This will result in another market where deals will be made and then finalized on that 'official' when it opens. It's like with employee stock. You can sell it before you can...

Arnt · a year ago

AIUI the point of HFT isn't to trade frequently, but rather to change offer/bid prices in the smallest possible steps (small along both axes) while waiting for someone to accept the offer/bid.

affyboi · a year ago

Why shouldn’t people be allowed to do both? I don’t see much of an advantage to making the markets less agile.

It would be nice to be able to buy and sell stocks more than once a quarter, especially given plenty of events that do affect the perceived value of a company happen more frequently than that

musicale · a year ago

The value of a stock usually doesn't change every millisecond. There doesn't seem to be a huge public need to pick winners and losers based on ping time.

htrp · a year ago

most trading volume already happens at open or before close.

kolbe · a year ago

Seems like a testable hypothesis. Choose the four times a year that the stock is at a fair price, and buy when it goes below it and sell when it goes above it?

FredPret · a year ago

Non-bitcoin transactions are just a couple of entries in various databases. Mining bitcoin is intense number crunching.

HFT makes the financial markets a tiny bit more accurate by resolving inconsistencies (for example three pairs of currencies can get out of whack with one another) and obvious mispricings (for various definitions of "obvious")

jeffbee · a year ago

That's a nice fairy tale that they probably tell their kids when asked, but what the profitable firms are doing at the cutting edge is inducing responses in the other guys' robots, in a phase that the antagonist controls, then trading against what they know is about to happen. It is literally market manipulation. A way to kill off this entire field of endeavor is to charge a tax on cancelled orders.

idohft · a year ago

How far have you tried to tell, and do you buy/sell stocks?

There's someone on the other side of your trade when you want to trade something. You're more likely than not choosing to interact with an HFT player at your price. If you're getting a better price, that's money that you get to keep.

*I'm going to disagree on "free pass" also. HFT is pretty often criticized here.

pi-rat · a year ago

Generally gets attributed with:

- Increased liquidity. Ensures there's actually something to be traded available globally, and swiftly moves it to places where it's lacking.

- Tighter spreads, the difference between you buying and then selling again is lower. Which often is good for the "actual users" of the market.

- Global prices / less geographical differences in prices. Generally you can trust you get the right price no matter what venue you trade at, as any arbitrage opportunity has likely already been executed on.

- etc..

munk-a · a year ago

> Tighter spreads, the difference between you buying and then selling again is lower. Which often is good for the "actual users" of the market.

I just wanted to highlight this one in particular - the spread is tighter because HFTs eat the spread and reduce the error that market players can benefit from. The spread is disappearing because of rent-seeking from the HFTs.

rcxdude · a year ago

Yes: it put a much larger, more expensive, and less efficient part of wall street out of business. Before it was done with computers, it was done with lots of people doing the same job. What was that job? Well, if you want to go to a market and sell something, you generally would like for there to be someone there who's buying it. But it's not always the case that there's a buyer there right at the time who actually wants the item for their own use. The inverse is also true for a prospective buyer. Enter middle-men or market makers who just hang around the marketplace, learning roughly how much people will buy or sell a given good for, and buy it for slightly less than they can sell it later for. This is actually generally useful if you just want to buy or sell something.

Now, does this need to get towards milli-seconds or nano-seconds? No, this is just the equivalent of many of these middle-men racing to give you an offer. But it's (part of) how they compete with each other, and as they do so they squeeze the margins of the industry as a whole: In fact the profits of HFT firms have decreased as a percentage of the overall market and in absolute terms after the initial peak as they displaced the day traders doing the same thing.

bostik · a year ago

> it's not always the case that there's a buyer there right at the time

This hits the nail on the head. For a trade to happen, counterparties need to meet in price and in time. A market place is useless if there is nobody around to buy or sell at the same time you do.

The core service market makers provide is not liquidity. It's immediacy: they offer (put up) liquidity in order to capture trades, but the value proposition for other traders - and the exchanges! - is that there is someone to take the other side of a trade when a non-MM entity wants to buy or sell instruments.

It took me a long time to understand what the difference is. And in order to make sure that there is sufficient liquidity in place, exchanges set up both contractual requirements and incentive structures for their market makers.

torlok · a year ago

To go a step further, I don't think you should be required to talk to middlemen when buying stocks, yet here we are. The house wants its cut.

smabie · a year ago

So who would sell it to you then? At any given time there's not very many actual natural buyer and sellers for a single security.

mrbald · a year ago

Central counterparty concept implemented by most exchanges is a valid service, as otherwise counterparty risk management would be a nightmare - an example of a useful “middleman”.

astromaniak · a year ago

Under the carpet deals would make it less transparent. It would be hard to detect that top manager sold all his stock to competitors and now is making decisions in their favor.

arcimpulse · a year ago

It would be trivial (and vastly more equitable) to quantize trade times.

FredPret · a year ago

You mean like settling trades every 0.25 seconds or something like that? Wouldn't there be a queue of trades piling up every 0.25 seconds, incentivizing maximum speed anyway?

lifeformed · a year ago

Just from an energy perspective, I'm pretty sure HFT uses many orders of magnitude less energy than bitcoin mining.

Deleted Comment

mkoubaa · a year ago

Why does it exist?

Because it's legal and profitable.

If you don't like it, try to convince regulators that it shouldn't be legal and provide a framework for criminalizing/fining it without unintended consequences, and then find a way to pay regulators more in bribes than the HFT shops do, even though their pockets are deeper than yours, and then things may change.

If that sounds impossible, that's another answer to your question

lmm · a year ago

> a definite net negative to society as far as I can tell.

What did you examine to reach that conclusion? If high-frequency trading were positive for society, what would you expect to be different?

The reason for high-frequency trading to exist is that the sub-penny rule makes it illegal to compete on price so you have to compete on speed instead. Abolishing the sub-penny rule would mean high-frequency trading profits got competed-away to nothing, although frankly they're already pretty close. The whole industry is basically an irrelevant piece of plumbing anyway.

calibas · a year ago

A net negative to society, but a positive for the wealthiest.

cheonic730 · a year ago

> A net negative to society, but a positive for the wealthiest.

No.

When your passive index fund manager rebalances every month because “NVDA is now overweighted in VTI, QQQ” the manager does not care about the bid/ask spread.

When VTI is $1.6 trillion, even a $0.01 difference in price translates to a loss $60 million for the passive 401k, IRA, investors.

HFT reduces the bid/ask spread, and “gives this $60 million back” to the passive investors for every $0.01 price difference, every month. Note that VTI mid price at time of writing is $272.49.