We’ve previously described why we think it’s time to leave the leap second in the past. In today’s rapidly evolving digital landscape, introducing new leap seconds to account for the long-term slowdown of the Earth’s rotation is a risky practice that, frankly, does more harm than good. This is particularly true in the data center space, where new protocols like Precision Time Protocol (PTP) are allowing systems to be synchronized down to nanosecond precision.
With the ever-growing demand for higher precision time distribution, and the larger role of PTP for time synchronization in data centers, we need to consider how to handle leap seconds within systems that use PTP and are thus far more time sensitive.
Leap second smearing – a solution past its time
Leap second smearing, the process of adjusting clock speeds to absorb the correction gradually, has been a common method for handling leap seconds. At Meta, we’ve traditionally focused our smearing effort on NTP since it has been the de facto standard for time synchronization in data centers.
In large NTP deployments, leap second smearing is generally performed at the Stratum 2 layer, which consists of NTP servers that interact directly with the NTP clients (Stratum 3) that are the downstream users of the NTP service.
There are multiple approaches to smearing. In the case of NTP, linear or quadratic smearing formulas can be applied.
Quadratic smearing is often preferred due to the layered nature of the NTP protocol, where clients are encouraged to dynamically adjust their polling interval as the value of the pending correction increases. This solution has its own tradeoffs, such as inconsistent adjustments, which can lead to different offset values across a large server fleet.
Linear smearing may be superior if an entire fleet relies on the same time sources and performs smearing at the same time. Combined with more frequent sync cycles, typically once per second, this is a more predictable, precise, and reliable approach.
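To make the two shapes concrete, here is a minimal Python sketch of a linear and a quadratic smear curve. It is an illustration only, not the formula used by any particular NTP implementation: the function names, the 18-hour window, and the specific quadratic shape (rate ramping up over the first half of the window and back down over the second) are all assumptions.

```python
def _fraction(elapsed_s: float, duration_s: float) -> float:
    """Fraction of the smear window that has elapsed, clamped to [0, 1]."""
    return min(max(elapsed_s / duration_s, 0.0), 1.0)


def linear_smear(elapsed_s: float, duration_s: float, leap_s: float = 1.0) -> float:
    """Correction applied so far when the leap is spread at a constant rate."""
    return leap_s * _fraction(elapsed_s, duration_s)


def quadratic_smear(elapsed_s: float, duration_s: float, leap_s: float = 1.0) -> float:
    """Correction applied so far when the rate ramps up, then back down.

    Assumed shape: the rate rises linearly during the first half of the
    window and falls linearly during the second half, so the accumulated
    correction is piecewise quadratic and starts and ends smoothly.
    """
    frac = _fraction(elapsed_s, duration_s)
    if frac < 0.5:
        return leap_s * 2.0 * frac * frac
    return leap_s * (1.0 - 2.0 * (1.0 - frac) ** 2)


WINDOW_S = 64_800.0  # hypothetical 18-hour smear window

for elapsed in (0.0, 16_200.0, 32_400.0, 48_600.0, 64_800.0):
    print(elapsed, linear_smear(elapsed, WINDOW_S), quadratic_smear(elapsed, WINDOW_S))
```

Both curves start at zero and end at the full correction, but they disagree in between, which is why two fleets smearing with different formulas report different times partway through the window.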
Handling leap seconds in PTP
In contrast to NTP, which synchronizes at the millisecond level, PTP provides a level of precision typically in the range of nanoseconds. At this level of precision, even periodic linear smearing would create too much delta across the fleet and violate the guarantees provided to customers.
To handle leap seconds in a PTP environment, we take an algorithmic approach that shifts time automatically for systems that use PTP, and we combine this with an emphasis on using Coordinated Universal Time (UTC) over International Atomic Time (TAI).
Self-smearing
At Meta, users interact with the PTP service via the fbclock library, which provides a tuple of values, {earliest_ns, latest_ns}, representing a time interval referred to as the Window of Uncertainty (WOU). Each time the library is called during the smearing period, we adjust the return values based on the smearing algorithm, which shifts the time values by 1 nanosecond every 62.5 microseconds.
This approach has several advantages, including being completely stateless and reproducible. The service continues to use TAI timestamps but can return UTC timestamps to clients via the API. And, because the start time is determined by tzdata timestamps, the current smearing position can be determined even after a server is rebooted.
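The stepping rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the fbclock implementation: the function names, the integer-nanosecond inputs, and the choice to subtract the offset (as for a positive leap second) are assumptions. Only the step of 1 nanosecond every 62.5 microseconds comes from the text.

```python
STEP_INTERVAL_NS = 62_500    # one step every 62.5 microseconds (from the text)
STEP_SIZE_NS = 1             # each step shifts time by 1 nanosecond (from the text)
LEAP_NS = 1_000_000_000      # total correction: one leap second


def smear_offset_ns(now_ns: int, smear_start_ns: int) -> int:
    """Offset accumulated at now_ns, derived only from the known start time.

    No state is kept between calls, so the same inputs always give the same
    answer, including after a reboot.
    """
    elapsed_ns = now_ns - smear_start_ns
    if elapsed_ns <= 0:
        return 0  # smearing has not started yet
    steps = elapsed_ns // STEP_INTERVAL_NS
    return min(steps * STEP_SIZE_NS, LEAP_NS)  # stop once the full second is absorbed


def smear_window(earliest_ns: int, latest_ns: int, smear_start_ns: int) -> tuple[int, int]:
    """Apply the smear to both bounds of a {earliest_ns, latest_ns} window.

    Hypothetical helper: assumes a positive leap second, so the offset is
    subtracted from each bound.
    """
    return (
        earliest_ns - smear_offset_ns(earliest_ns, smear_start_ns),
        latest_ns - smear_offset_ns(latest_ns, smear_start_ns),
    )
```

At this rate, absorbing the full second takes 10^9 steps of 62.5 microseconds each, or 62,500 seconds (about 17.4 hours).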
This approach does come with some tradeoffs. For example, since the leap smearing strategy differs between the NTP (quadratic) and PTP (linear) ecosystems, services may struggle to match timestamps acquired from different sources during the smearing period.
The difference between the two approaches can amount to more than 100 microseconds, creating challenges for services that consume time from both systems.
UTC over TAI
The smearing strategy we implemented in our fbclock library shows good performance. However, it still introduces significant time deltas between multiple hosts during the smearing period, despite being fully stateless and using small (1 nanosecond) and fixed step sizes.
Another significant drawback comes from periodically running jobs. At 1 nanosecond per 62.5 microseconds, 60 seconds of smearing accumulates 960,000 nanoseconds of shift, so scheduling is off by close to 1 millisecond after 60 seconds for services that run at precise intervals.
This isn’t ideal for a service that guarantees nanosecond-level accuracy and precision.
As a result, we recommend that customers use TAI over UTC and thus avoid having to deal with leap seconds. Unfortunately, in most cases, the conversion to UTC is still required and eventually has to be performed somewhere.
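As a rough illustration of what that conversion involves, the Python sketch below maps a TAI-based second count to a Unix-style UTC second count using a small leap second table. The function name and table layout are hypothetical, the table holds only the two most recent leap seconds (TAI minus UTC became 36 seconds on 2015-07-01 and 37 seconds on 2017-01-01), and the input is assumed to count seconds from the Unix epoch without skipping leap seconds, so that it equals the Unix UTC count plus the TAI minus UTC offset.

```python
from bisect import bisect_right

# (first TAI-based second at which the offset applies, TAI - UTC in seconds)
# Deliberately truncated: a real table would cover every leap second.
_LEAP_TABLE = [
    (1435708800 + 36, 36),  # offset in effect from 2015-07-01T00:00:00Z
    (1483228800 + 37, 37),  # offset in effect from 2017-01-01T00:00:00Z
]
_THRESHOLDS = [threshold for threshold, _ in _LEAP_TABLE]


def tai_to_utc_s(tai_s: int) -> int:
    """Convert a TAI-based second count to a Unix-style UTC second count."""
    idx = bisect_right(_THRESHOLDS, tai_s) - 1
    if idx < 0:
        raise ValueError("timestamp predates this sketch's leap second table")
    return tai_s - _LEAP_TABLE[idx][1]


# The inserted leap second and the second after it land on the same UTC
# value, because Unix-style UTC counts cannot represent 23:59:60.
print(tai_to_utc_s(1483228800 + 36), tai_to_utc_s(1483228800 + 37))
```

The duplicated value in the last line is the ambiguity that smearing tries to hide and that a leap-second-free UTC would remove.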
PTP without leap seconds
At Meta, we support the recent push to freeze any new leap seconds after 2035. If we can prevent the introduction of new leap seconds, then the entire industry can rely on UTC instead of TAI for higher precision timekeeping. This will simplify infrastructure and remove the need for different smearing solutions.
Ultimately, a future without leap seconds is one where we can push systems to greater levels of timekeeping precision more easily and efficiently.