Semaphore Lag Time Tests
There are a number of different interfaces for using semaphores in a multi-threaded or multi-process program. I was curious which was the most efficient, so I ran some tests:
Lag Time (ms) vs # of Threads
2.16GHz Core 2 Duo, Mac OS X


This chart shows the performance results from the code on a 2.16GHz Core 2 Duo running Mac OS X, where the time lag between setting a semaphore in one thread and waking each of n observing threads is recorded. Note that the vertical time scale is in milliseconds, meaning these measurements are on the order of microseconds per trial.
Tests were also conducted on a single-core 3.06GHz Pentium 4 running Linux, giving very similar results, both in absolute lag time and relative performance between methods.
Lag Time (ms) vs # of Threads
3.06GHz Pentium 4, Linux 2.6.13.4 (FC3)


You may be tempted to compare the platforms to each other, but that would compare a dual-core machine to a single-core machine as well as two different operating systems, so two important variables are conflated in these results.
Test descriptions:
• POSIX interface consists of sem_open, sem_post, and sem_wait.
• SysV interface consists of semget, semctl, and semop.
• The SysV-A style uses a single semaphore which is raised by the number of observers, and each observer then lowers the semaphore by one.
• The SysV-B style gives each observer a separate semaphore, and then uses the SETALL semctl command to raise them all together.
• The SysV-C style also gives each observer a separate semaphore, but uses semop with an operation array to raise them all. SysV-C can run into problems on systems with a low SEMOPM setting, which restricts the number of semaphores that can be raised at once.
• A version using pthread_cond_broadcast was also tried, but it is slightly slower, perhaps due to the additional locking needed to ensure observers don’t miss the signal if they weren’t ready when it was sent. The condition variable’s lag time alone is probably comparable to the other methods when it is used for its intended purpose, without the additional mutexes needed to make it behave like a semaphore.
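The three SysV styles described above might look roughly like the following. This is a sketch under assumptions, not the benchmark's own code: the helper names (make_set, notify_a, notify_b, notify_c, wait_one, sem_value) are hypothetical, error handling is minimal, and SysV-B is shown setting every semaphore to 1 because SETALL assigns values rather than incrementing them.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Linux leaves union semun for the caller to define; Mac OS X and the
   BSDs already declare it in <sys/sem.h>. */
#if defined(__linux__)
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};
#endif

/* Set every semaphore in the set to the same value (used below). */
static int set_all(int id, int n, unsigned short value)
{
    unsigned short *vals = malloc((size_t)n * sizeof *vals);
    if (vals == NULL)
        return -1;
    for (int i = 0; i < n; i++)
        vals[i] = value;
    union semun arg = { .array = vals };
    int rc = semctl(id, 0, SETALL, arg);
    free(vals);
    return rc;
}

/* Create a private set of n semaphores, all zero. Returns the id or -1. */
int make_set(int n)
{
    int id = semget(IPC_PRIVATE, n, IPC_CREAT | 0600);
    if (id == -1)
        return -1;
    if (set_all(id, n, 0) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }
    return id;
}

/* SysV-A: raise semaphore 0 by n; every observer waits on semaphore 0. */
int notify_a(int id, int n)
{
    struct sembuf op = { .sem_num = 0, .sem_op = (short)n, .sem_flg = 0 };
    return semop(id, &op, 1);
}

/* SysV-B: one semaphore per observer, all set to 1 with SETALL. */
int notify_b(int id, int n)
{
    return set_all(id, n, 1);
}

/* SysV-C: one semaphore per observer, all raised by one atomic semop.
   Fails with E2BIG if n exceeds the system's SEMOPM limit. */
int notify_c(int id, int n)
{
    struct sembuf *ops = malloc((size_t)n * sizeof *ops);
    if (ops == NULL)
        return -1;
    for (int i = 0; i < n; i++) {
        ops[i].sem_num = (unsigned short)i;
        ops[i].sem_op = 1;
        ops[i].sem_flg = 0;
    }
    int rc = semop(id, ops, (size_t)n);
    free(ops);
    return rc;
}

/* Observer side: lower semaphore idx by one, blocking until possible. */
int wait_one(int id, int idx)
{
    struct sembuf op = { .sem_num = (unsigned short)idx, .sem_op = -1, .sem_flg = 0 };
    return semop(id, &op, 1);
}

/* Read the current value of semaphore idx. */
int sem_value(int id, int idx)
{
    return semctl(id, idx, GETVAL);
}
```

In the SysV-A style every observer calls wait_one(id, 0); in SysV-B and SysV-C observer i calls wait_one(id, i). The set should be removed with semctl(id, 0, IPC_RMID) when it is no longer needed, since SysV semaphores outlive the process.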
Subjective notes:
The POSIX interface is the most straightforward in my opinion, but the SysV interface is much more powerful, allowing you to raise or lower by more than one at a time, and allowing you to do atomic operations on multiple semaphores. Also, it seems that currently only the SysV interface supports “shared” semaphores across processes. POSIX has an interface for requesting shared semaphores, but as far as I can tell it isn’t widely implemented.
Summary:
The SysV semaphore narrowly edges out the other implementations for this task, specifically SysV-A, which uses a single semaphore raised by the number of observers, each of which then lowers the semaphore by one. SysV-C is equivalent as long as the number of threads to be notified does not exceed SEMOPM (defined as 5 on Mac OS X). POSIX’s results are equivalent to the SysV method for notifying a single thread, but perhaps due to the lack of an interface for raising or lowering by more than one, it scales slightly less well with additional threads (still linear, but with a higher constant).
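To illustrate the point about POSIX, here is a rough sketch of what notifying n observers looks like when the semaphore can only be raised by one at a time. The helper names open_private_sem and post_n are hypothetical, and this is only one plausible way to write the loop, not necessarily how the benchmark does it.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

/* Open a fresh named semaphore with value 0 and unlink its name right
   away so it disappears once closed. Returns SEM_FAILED on error. */
sem_t *open_private_sem(void)
{
    char name[64];
    snprintf(name, sizeof name, "/postn-%ld", (long)getpid());
    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, 0);
    if (sem != SEM_FAILED)
        sem_unlink(name);
    return sem;
}

/* Wake n observers: sem_post raises by exactly one, so notifying n
   threads costs n separate calls instead of the single semop call
   that SysV-A needs. Returns 0 on success, -1 on the first failure. */
int post_n(sem_t *sem, int n)
{
    for (int i = 0; i < n; i++) {
        if (sem_post(sem) == -1)
            return -1;
    }
    return 0;
}
```

Each observer simply calls sem_wait on the shared semaphore, so the per-observer cost on the notifying side is one extra sem_post call.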
Benchmark source code:
