wowczarek · 3 days ago
The problem with these types of posts is that this is an area that many are unfamiliar with (at least in any depth), and making lots of authoritative statements makes it believable at face value. There are so many variables in network time sync that you have to design specifically to minimise them - for example, no multipath and no asymmetry, unless you have PTP P2P transparent clocks everywhere.

The author also mixes precision with accuracy and relies on self-reported figures from NTP ("chrony says xxx ns jitter"). Every media speed change introduces asymmetry, which affects accuracy (though not always precision). So your 100m->1G link for example will already introduce over 1 us of error (to accuracy!), but NTP will never show you this - nothing will, unless you measure both ends with 1PPS - and the only way around it is a PTP BC or TC.

There is a very long list of similar clarifications that can be made. For example, nothing is mentioned about message rates / intervals, which are crucial for increasing the number of samples the filters work with - or the fact that ptp4l and especially phc2sys aren't great at filtering.

Finally, getting time into the OS clock - unless you use PCIE PTM, which practically limits you to newer Intel CPUs and newer Intel NICs - relies on a PCIE transaction with unknown delays and asymmetries, and without PTM (excluding a few NICs) your OS clock is nearly always 500+ ns away from the PHC; you don't know by how much, and you can't measure it. It's just a complex topic, and it requires an end-to-end, leave-no-stone-unturned, semi-scientific approach to really present things correctly.
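
As a back-of-the-envelope check on that error figure (the 110-byte on-wire packet size here is an assumption, not a number from the post):

```python
def serialization_delay(frame_bytes: int, link_bps: float) -> float:
    """Seconds needed to clock a frame onto the wire."""
    return frame_bytes * 8 / link_bps

FRAME_BYTES = 110  # assumed on-wire size of a small timing packet

slow = serialization_delay(FRAME_BYTES, 100e6)  # 100 Mb/s leg: 8.8 us
fast = serialization_delay(FRAME_BYTES, 1e9)    # 1 Gb/s leg:   0.88 us

# If the extra serialization shows up in only one direction of the
# exchange, half the difference lands on the clock offset estimate.
offset_error = (slow - fast) / 2  # ~3.96 us
```

So even a small packet puts the resulting offset error well past the 1 us mark.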

mlichvar · 3 days ago
I agree with most of what you said.

The author has other posts in the series where he tried to measure the accuracy relative to the PHC (not system clock) using PPS: https://scottstuff.net/posts/2025/06/02/measuring-ntp-accura...

Steering the same PHC with phc2sys as chronyd is using for HW timestamping is not the best approach as that creates a feedback loop (instability). It would be better to leave the PHC running free and just compare the sys<->PHC with PHC<->PPS offsets.

> So your 100m->1G link for example will already introduce over 1 us of error (to accuracy!), but NTP will never show you this

That doesn't apply to NTP to the same extent as PTP, because NTP timestamps the end of reception (HW RX timestamps are transposed to the end of the packet), so the asymmetries in transmission times due to different link speeds should cancel out (unless the switches are cutting through in the faster->slower direction, but that seems to be rare).
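
A toy model of the store-and-forward case (numbers illustrative, propagation and switch processing ignored) shows why the end-of-packet convention cancels:

```python
SLOW = 8.8e-6   # serialization time on a 100 Mb/s leg (illustrative)
FAST = 0.88e-6  # serialization time on a 1 Gb/s leg (illustrative)

def one_way(first_leg: float, second_leg: float, end_of_packet: bool) -> float:
    """Measured one-way delay through one store-and-forward switch."""
    if end_of_packet:
        # NTP-style: RX timestamp transposed to the end of the packet,
        # so serialization on both legs is counted in both directions.
        return first_leg + second_leg
    # Start-of-frame timestamp: the receiver stamps the frame as it
    # begins to arrive, so only the first leg's serialization (the
    # store-and-forward wait) is counted.
    return first_leg

asym_end   = one_way(SLOW, FAST, True)  - one_way(FAST, SLOW, True)   # 0.0
asym_start = one_way(SLOW, FAST, False) - one_way(FAST, SLOW, False)  # ~7.9e-6
```

With end-of-packet timestamps the two directions see identical delays; with start-of-frame timestamps, half of asym_start (roughly 4 us here) would land in the offset.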

wowczarek · 3 days ago
Yes, I saw those PTP posts, and I saw where the methodology falls a bit short.

Re. asymmetries canceling out: OK, I oversimplified, and this is true in theory and often in practice. But having done this with nearly all generations of enterprise-type Broadcom ASICs from roughly 2008 onwards, I know there are so many variations to this behaviour that the only way to know is to precisely measure latencies in each direction for a variety of speed, CT vs. S&F, frame size, and even bandwidth combinations, and see. I used to characterise switches for this, build test harnesses, measurement tools, etc., and I saw everything: CT one way and S&F the other way, but not for all speed combinations; CT behaviour regardless of enabling or disabling it; latency having quantised step characteristics in increments of say X bytes because internally the switching fabric used X-byte cells; CT only behaving like CT above certain frame sizes. There's just a lot to take into account. There are even cases where a certain level of background traffic _improves_ latency fairness and symmetry - an equivalent of keeping the caches hot.

The author's best bet at reliable numbers would be to get himself a Latte Panda Mu or another platform with TGPIO and measure against 1PPS straight from the CPU. That would be the ultimate answer. Failing that, at least a PTM NIC synced to the OS clock, but that will alter the noise profile of the OS clock.

But you and I know all this because we've been digging into the hardware and software guts of these things for years and have done it for a job - what's a home lab user to do? It's a never-ending learning exercise, and the key is to acknowledge the possible unknowns. By that I don't mean scientific unknowns, but that we don't know what we don't know - and bloggers sometimes don't do this.

eqvinox · 3 days ago
> Steering the same PHC with phc2sys as chronyd is using for HW timestamping is not the best approach as that creates a feedback loop (instability).

This is standard practice, though, for most PTP slave clocks. The feedback is just factored into the math. (Why? No idea. I just know how the code works.)

Although… it's standard practice in PTP setups that are designed for it. Not NTP… if only there were a specification… :)

I do have to wonder though. Of what use are timestamps from an unsynchronized PHC to chrony? Is it continuously taking twin sys+PHC timestamps to line up things?

Dylan16807 · 3 days ago
It's not that the author is mixing accuracy and precision, it's that they only care about precision.

Any asymmetry that is consistent is irrelevant.

petesoper · 3 days ago
The OP says early on that he only needs 10us.
eqvinox · 3 days ago
ConnectX-6 has PTM, though I have not tested it.
wowczarek · 2 days ago
Do it, that should make a significant difference. See https://www.opencompute.org/wiki/PTM_Readiness for other hardware that supports it, i225/226 are the most common these days, but also a system with TGPIO 1PPS will show the real picture.
anon6362 · 2 days ago
Sigh Yep. Dunning-Kruger effect specimens hammer out puff pieces to get their participation awards.

Meanwhile, here are some other articles:

NTP: https://austinsnerdythings.com/2025/02/14/revisiting-microse...

PTP: https://austinsnerdythings.com/2025/02/18/nanosecond-accurat...

https://www.jeffgeerling.com/blog/2025/diy-ptp-grandmaster-c...

diarmuidc · 3 days ago
Why is there no mention of PTP here? If you want accurate time synchronisation in a network just use the correct tool, https://en.wikipedia.org/wiki/Precision_Time_Protocol

Linux PTP (https://linuxptp.sourceforge.net/) and hardware timestamping in the network card will get you in the sub 100ns range

jacob2161 · 3 days ago
Chrony over NTP is capable of incredible accuracy, as shown in the post. Most users who think they need PTP actually just need Chrony and high quality NICs.

Chrony is also much better software than any of the PTP daemons I tested a few years ago (for an onboard autonomous vehicle system).

eqvinox · 3 days ago
NTP fundamentally cannot reach the same accuracy as PTP: Ethernet switches introduce jitter due to queueing delays, and they can report (and correct for) that delay in PTP - as transparent clocks - but not in NTP.
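
A sketch of the mechanism (all numbers invented): transparent clocks accumulate per-switch residence time into the PTP correctionField so the receiver can subtract it, while NTP sees the same residence time as unexplained jitter.

```python
import random

random.seed(42)
BASE_NS = 10_000  # fixed propagation + serialization delay (invented)

def measured_delay_ns(transparent_clocks: bool) -> int:
    # three switches, each adding a random queueing residence time
    residence = sum(random.randrange(0, 50_000) for _ in range(3))
    # a transparent clock reports the residence time so the receiver
    # can subtract it out; NTP has no equivalent field
    correction = residence if transparent_clocks else 0
    return BASE_NS + residence - correction

ptp_samples = [measured_delay_ns(True) for _ in range(5)]   # constant 10_000
ntp_samples = [measured_delay_ns(False) for _ in range(5)]  # jitters by up to ~150 us
```
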
ainiriand · 3 days ago
Correct me if I am wrong, but wouldn't that be true only when testing across comparable hardware? Would it hold in scenarios like the one the author describes, where he uses 3 different systems (Threadripper CPU, Raspberry Pi, and a LeoNTP GPS-backed NTP server) and architectures?
secondcoming · 3 days ago
On our GCP cloud VMs, cloud-init installs chrony and uninstalls ntp automatically.
EnnEmmEss · 3 days ago
You're probably looking for https://scottstuff.net/posts/2025/06/10/timing-conclusions/ which discusses the overall conclusions of NTP vs PTP and is the culmination of several blog posts on the topic.
senderista · 3 days ago
PTP is discussed in the concluding article of the series: https://scottstuff.net/posts/2025/06/10/timing-conclusions/
michaelt · 3 days ago
The blog's next post is about PTP, if that's what you're interested in.

The Linux PTP stack is great for the price, but as an open source project it's hamstrung by the fact that the PTP standard (IEEE 1588) is paywalled, and by the fact that it doesn't work on wifi or USB-Ethernet converters (meaning it also doesn't work on laptop docking stations, or on the Raspberry Pi 3 and earlier).

This limits people developing/using it for fun. And it's the people using it for fun who actually write all the documentation; the 'serious users' at high-frequency trading firms and cell phone networks aren't blogging about their exploits.

RossBencina · 3 days ago
> it doesn't work on wifi

802.1AS-2020 (gPTP) includes 802.11-2016 (wifi) support.

The IEEE's gatekeeping is indeed odious.

The biggest limitation is that many ethernet MACs do not support hardware timestamping. Nor do many entry-level ethernet switches.

For what it's worth, I'm interested in TSN for fun (music, actually), and I'm prepared to buy compatible networking hardware to do it. No difference to gamers spending money on a GPU.

rendaw · 3 days ago
There's a discussion on that in the comments at the bottom of the article, where the author explains why it wasn't analyzed.

eqvinox · 3 days ago
GPS modules need to be put in a special stationary mode (and ideally measured-in to a location for a day or two) to get accurate timing. I'm consistently achieving ca. 10ns of deviation. Hope the author didn't forget this. (But it might also just be crappy GPS modules, I'm using u-blox M8T which are specifically intended for timing.)
mwpmaybe · 3 days ago
Interesting. The serial GPS module currently wired up to my Raspberry Pi doesn't have a stationary mode per se but there is a feature called AlwaysLocate that seems related. I can choose between Periodic Backup/Standby and AlwaysLocate Backup/Standby modes. I'll need to look into this...

ETA: I can also increase the nav speed threshold to 2m/s.

eqvinox · 3 days ago
Not all modules have this feature, and it's also sometimes locked behind feature/license bits. It's obviously not needed for normal GPS use… u-blox timing-targeted modules definitely have it. Some have a "measure-in" mode where you let the module sit for a while (days) and it does all the setup automatically; in other cases you actually have to feed the position into the module (annoying and error-prone…)

It's simply that if you know your location, you can remove it as a free variable from the equations and instead constrain the time further.
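
A toy illustration of that (not real GNSS math - the point is just that with position held fixed, each satellite's pseudorange residual becomes a direct noisy measurement of the one remaining unknown, the clock error, so the residuals can simply be averaged):

```python
import random

random.seed(7)
C = 299_792_458.0      # speed of light, m/s
TRUE_CLOCK_ERR = 1e-7  # receiver clock error: 100 ns (invented)

def residual_m(noise_sigma_m: float) -> float:
    # position known => geometric range known, so the pseudorange
    # residual is c * clock_error plus measurement noise
    return C * TRUE_CLOCK_ERR + random.gauss(0.0, noise_sigma_m)

residuals = [residual_m(3.0) for _ in range(12)]  # 12 satellites in view
clock_est = sum(residuals) / len(residuals) / C
# noise on the estimate shrinks roughly as 1/sqrt(number of satellites)
```
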

RossBencina · 3 days ago
What method do you use to measure 10ns deviation?
eqvinox · 3 days ago
Delta between modules on a scope

(offset values on the hardware timestamp on the immediately connected PTP clock also line up with this)

[Caveat: everything is in the same room with the same ambient temperature drifts…]

azalemeth · 3 days ago
My experience with rt Linux is that it can be exceptionally good at keeping time, if you give up the rest of the multitasking micro sleeping architecture. What do you need this accurate time for? I'm equally sure, as acknowledged, the multipath routing isn't helping either.
michaelt · 3 days ago
> What do you need this accurate time for?

Some major uses of high-precision timing, albeit not with NTP, include:

* Synchronising cell phone towers, the system partly relies on precise timing to avoid them interfering with one another.

* Timestamping required by regulators, in industries like high-frequency trading.

* Certain video production systems, where a ten-parts-per-million framerate error would build up into an unacceptable 1.7 second error over 48 hours.

* Certain databases (most famously Google Spanner) that use precise timing to ensure events are ordered correctly.

* As a debugging luxury in distributed systems.

In some of these applications you could get away with a precise-but-free-floating time, as you only need things precisely synchronised relative to one another, not globally. But if you're building an entire data centre, a sub-$1000 GPS-controlled clock is barely noticeable.
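
The framerate figure in the list above checks out (10 ppm accumulated over 48 hours):

```python
drift_ppm = 10
seconds = 48 * 3600
error_s = drift_ppm * 1e-6 * seconds  # 1.728 s over 48 hours
```
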

rkomorn · 3 days ago
> But if you're building an entire data centre, a sub-$1000 GPS-controlled clock is barely noticeable.

Dumb personal and useless anecdote: one of those appliances made my life more difficult for months (at a FAANG company that built its own data centers, no less) for the nearly comical reason that we needed to move it but somehow couldn't rewire the GPS antenna, and the delays kept retriggering alerting that we kept disabling until the expected "it'll be moved by then" time.

So, I guess to make the anecdote more useful: if you're gonna get one, make sure it doesn't hamstring you with wires...

JdeBP · 3 days ago
Bear in mind that the author specifically reminds us, halfway down, that the goal is consistency, not accuracy per se. Making all of the systems accurate to GNSS is merely a means of achieving the consistency goal so that event logs from multiple systems can be aligned.
mustache_kimono · 3 days ago
> What do you need this accurate time for?

Securities regulation?: https://podcasts.apple.com/us/podcast/signals-and-threads/id...

aa-jv · 3 days ago
> What do you need this accurate time for?

Scientific and consistent analysis of streaming realtime sensor data.

Been there, done that, shipped the package. Took quite a bit of fun to get it working consistently, which was the main thing.

RossBencina · 2 days ago
> What do you need this accurate time for?

Synchronising the clocks on network connected audio devices (ADCs, DACs, DSP processors) on a LAN (https://en.wikipedia.org/wiki/Audio_Video_Bridging), or over the internet (broadcast-grade live streaming). This, and related standards, are more or less the norm in live sound and high-channel-count digital recording setups.

mschuster91 · 3 days ago
> What do you need this accurate time for?

Say you are running a few geographically apart radio receivers to triangulate signals, you want to have all of them as closely synchronized as possible for better accuracy.

stinkbeetle · 3 days ago
What is the rest of the multitasking micro sleeping architecture, and how do you give it up to improve time keeping?
RossBencina · 3 days ago
There was some related discussion a couple of weeks ago here:

Graham: Synchronizing Clocks by Leveraging Local Clock Properties (2022) [pdf] (usenix.org) https://news.ycombinator.com/item?id=44860832

In particular the podcast about Jane Street's NTP setup was discussed.

Bender · 3 days ago
They state there is a problem, but then say they are happy with what Chrony is doing - so what exactly is the problem they are trying to solve? What on their network requires better than 200ns, or even 400ns for that matter? Not in theory, but in reality? There are also optimizations they are missing in this document [1], such as disabling EEE.

On a more taboo note: while RasPis can make great little time servers, they have more drift and higher jitter, but that should not matter for a home setup and should not be surprising. If jitter is their concern, they should consider using mini-PCs, disabling cpuspeed and all power management, confining/locking the min/max frequency to half the CPU's capability, and disabling all services other than chrony. It will use more power but would address their concerns. They could also try different models of layer 2 switches; consumer switches add some artificial jitter, and that varies wildly by make, model, and even batch - but again, for a home network that should not matter. I think they are nitpicking. Perfect is the enemy of good, especially in a day and age when people prefer power saving over accuracy.

[Edit] As a side note, the aggressive min/max poll settings they are using can amplify the inefficiencies of consumer switches and NICs regardless of filter settings, and that can make the graphs more chaotic. They should consider re-testing on data-center-class servers, server NICs, and enterprise-class switches - or just reduce the polling to something reasonable for a home network: minpoll 1 maxpoll 2 for clients, minpoll 5 maxpoll 7 for an edge server talking to a dozen stratum 1's with a high combinelimit. Presend should not be required even with default ARP neighbor GC times and intervals. Oh, and if you want to try something fun with the graphs, run chronyc makestep every minute from cron on every node. Yeah, yeah - I know why one would not do that, and it's just cheating.

[1] - https://chrony-project.org/examples.html
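
For what it's worth, the suggested poll settings as a chrony.conf sketch (the hostnames and the combinelimit value are placeholders of mine, not recommendations from the post):

```
# edge server: a dozen stratum 1 sources, relaxed polling
server stratum1-a.example.net iburst minpoll 5 maxpoll 7
server stratum1-b.example.net iburst minpoll 5 maxpoll 7
# ... ten more sources ...
combinelimit 8

# on LAN clients, poll the local edge server aggressively instead:
# server edge.lan iburst minpoll 1 maxpoll 2
```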

wowczarek · 3 days ago
In addition, for the purposes of characterising the system using NTP, ideally one should either avoid any ensembling / combining of sources - because that's just pulling in multiple sources of noise - or prove that doing so does not affect the final results (and if it does, by how much).

There's so much more that can be picked apart here because it's an absolute rabbit hole of a topic - for example, saturate the links a little or a little more, especially with bursty traffic in both directions (or do an 80-20 cycle), and watch those measurements go out the window; only with PTP-capable switches at every hop will you survive this. The telecom industry has done this ad nauseam and for years, with appropriate standardised measurements, test masks, and requirements.

And this whole business is also not fundamentally PTP vs. NTP, because the principles are exactly the same; it's that PTP was designed with hardware timestamping in mind. It would serve no purpose more useful than NTP had NTP gained support for one-step operation, hardware timestamping, and network assistance. But the default PTP profile uses known multicast groups, and thus known destination MACs, so it was the easiest entry into hardware packet matching - early "PTP-enabled" NICs only timestamped PTP packets (and most only multicast); only more modern ones can timestamp all packets, and that includes NTP.

And as far as the RasPi goes: for time sync, at least in terms of COTS equipment, Intel is king, but that's because they had smart people working hard for years to purposefully integrate time-aware functionality into their architectures (hey Kevin and team!) - invariant TSC, ART, culminating in PCIE PTM. And this is what matters when aiming for the tens-to-single-digit-ns region.

You can easily deliver sub-10 ns sync to a NIC, but a huge source of uncertainty is time transfer from your hardware-timestamping NIC to the OS clock. PTM is the only way to do this in hardware, otherwise, with Solarflare being the only NON-PTM exception I've worked with, comparing NIC to OS time is literally reading the time register on the NIC and the kernel time in quick succession in batches (granted, with local interrupts disabled), and then picking the pair of reads that seems to have taken the least amount of time. Unknowns on top of unknowns.
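
The technique described in that last paragraph - bracketing each device-clock read with system-clock reads and keeping the pair that took the least time, which is essentially what the kernel's PTP_SYS_OFFSET ioctl does in batches - can be sketched like this (function and parameter names are mine):

```python
def best_phc_offset(read_sys, read_phc, samples: int = 16) -> int:
    """Estimate (PHC - system clock) in ns: bracket each device-clock
    read with two system-clock reads and keep the sample whose bracket
    is narrowest, i.e. the read pair that took the least time."""
    best_window = None
    best_offset = None
    for _ in range(samples):
        t1 = read_sys()
        phc = read_phc()
        t2 = read_sys()
        window = t2 - t1
        if best_window is None or window < best_window:
            best_window = window
            # assume the PHC read landed mid-bracket; the narrower the
            # bracket, the smaller the unknowable error in that guess
            best_offset = phc - (t1 + t2) // 2
    return best_offset

# deterministic fake clocks for illustration: each read costs 100 ns,
# and the "PHC" runs exactly 5000 ns ahead of the system clock
_state = {"now": 0}
def fake_sys() -> int:
    _state["now"] += 100
    return _state["now"]
def fake_phc() -> int:
    _state["now"] += 100
    return _state["now"] + 5000

offset = best_phc_offset(fake_sys, fake_phc)  # 5000
```

With real clocks the bracket width is the irreducible uncertainty the comment describes - you never learn where inside the bracket the PHC read actually happened.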

Bender · 3 days ago
> There's so much more that can be picked apart here because it's an absolute rabbit hole of a topic

That pretty much sums it up, and I agree with everything you stated. There are countless variables that one could spend a lifetime trying to understand, tune, and compensate for, and all of it changes with each combination of hardware - and refreshing hardware is inevitable. It can be a never-ending game. I just tune for good enough for my needs, that being slightly better than defaults.

nullc · 3 days ago
GPS timing modules should output a sawtooth correction value that tells you the error between the upcoming pulse and GPS time - the issue being that the PPS pulse has to be aligned to the receiver's clock. Using that correction will remove the main source of jitter.
wowczarek · 3 days ago
Applying sawtooth correction will remove _some_ of the jitter, but at the time-server level, not the network level, and quantisation error is not the main source of jitter here (I'm talking long time constants) - Packet Delay Variation (PDV) and internal time comparisons are. Plus, any decent loop should average the sawtooth and transform it into fluctuations slow enough that they won't have much effect on what is being measured in the blog post - the output of the time server looks nothing like the raw 1PPS input, at least in the short term. Of course the sawtooth should be removed, and let's hope his time servers do it, especially the RPi ones.
nullc · 3 days ago
> any decent loop should average the sawtooth

You can't, really: depending on their relative phases and the resulting aliasing products, the average of the sawtooth error can still have an arbitrary offset which lasts for an arbitrarily long time.

> that they will not have that much effect

Okay, fine - for some definition of 'not much' that's true. But failing to account for it can result in a bigger error than many people expect, and in an annoying way: when you test, it might be in a state where it averages out okay, but later shift into a state where it produces an offset that doesn't average out.

Assuming your receiver outputs the correction it's pretty easy to handle, so long as you know it's a thing.
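
That state-dependence is easy to demonstrate with a quick simulation (the receiver clock frequency is invented): quantize each 1PPS edge to a clock that is a hair off an integer number of hertz, and the error only walks through its full range over ~1000 s, so any loop averaging over shorter windows sees a nearly constant offset that slowly drifts.

```python
import math

F_RX = 26e6 + 0.001  # receiver clock in Hz: almost exactly 26 MHz (invented)

def pps_error_s(n: int) -> float:
    """Quantization error of the n-th 1PPS edge: the pulse can only be
    emitted on the next receiver clock edge after true second n."""
    return math.ceil(n * F_RX) / F_RX - n

avg_early = sum(pps_error_s(n) for n in range(1, 61)) / 60     # ~37 ns
avg_late  = sum(pps_error_s(n) for n in range(500, 560)) / 60  # ~18 ns
```

A 60-second average taken early sees roughly twice the offset of one taken 500 s later - the sawtooth hasn't averaged out, it has parked.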

RossBencina · 3 days ago
Only the expensive ones have the correction capability (e.g. u-blox LEA-M8T); hat tip to tverbeure:

https://news.ycombinator.com/item?id=44147523

Aligning the PPS pulse with an asynchronous local clock is going to require a very large number of measurements, or a high-resolution timer (e.g. a time-to-digital converter, TDOA chip, etc. - there are a few options).

wowczarek · 3 days ago
To an extent. You can get previous-generation GNSS receivers with sawtooth correction for cheap - eBay is full of them: say, an old Trimble Resolution whatever, lots of LEA-6T carrier boards going for the $20 range, and bare modules for less. I would trust the carrier boards more, though - less chance of getting a fake module.
watersb · 3 days ago
Segal's Law:

"A man with a watch knows what time it is. A man with two watches is never sure."

https://en.m.wikipedia.org/wiki/Segal's_law

ofalkaed · 3 days ago
If you actually care about what time it is, you need at least three, so you can average them and knock out the error. The Beagle carried 22 chronometers when it also carried Darwin; over the nearly five-year expedition they lost only 30-odd seconds.
nullc · 3 days ago
A person with three or more watches knows what time it is in proportion to the square root of the number of watches.
stinkbeetle · 3 days ago
A person with four watches knows what time it is in proportion to 2?
JdeBP · 3 days ago
A person with two watches finds xyrself suddenly in the messy business of doing full NTP, rather than the much simpler model of SNTP. (-: