Announcing TPC-C.js; A Lightweight Implementation of TPC-C
I am glad to announce the beta release of TPC-C.js, which implements one of the most popular database benchmarks, TPC-C. It’s not a coincidence that today is also the 22nd anniversary of the TPC-C benchmark.
It currently supports Postgres database, but can be easily extended to test other database systems.
You might ask “Why another TPC-C implementation when we already have so many of them?””
Short answer: This one is very light on system resources, so you can
- Run the benchmark strictly adhering to the specification, and
- Invest more in database hardware, rather than client hardware.
Long answer: It’s covered in the Motivation section of TPC-C.js, which I’ll quote here:
Motivation
The TPC-C benchmark drivers currently available to us, like TPCC-UVa, DBT2, HammerDB, BenchmarkSQL, etc., all run one process (or thread) per simulated client. Because the TPC-C benchmark specification limits the max tpmC metric (transactions per minute of benchmark-C) from any single client to be 1.286 tpmC, this means that to get a result of, say, 1 million tpmC we have to run about 833,000 clients. Even for a decent number as low as 100,000 tpmC, one has to run 83,000 clients.
Given that running a process/thread, even on modern operating systems, is a bit expensive, it requires a big upfront investment in hardware to run the thousands of clients required for driving a decent tpmC number. For example, the current TPC-C record holder had to run 6.8 million clients to achieve 8.55 million tpmC, and they used 16 high-end servers to run these clients, which cost them about $ 220,000 (plus $ 550,000 in client-side software).
So, to avoid those high costs, these existing open-source implementations of TPC-C compromise on the one of the core requirements of the TPC-C benchmark: keying and thinking times. These implementations resort to just hammering the SUT (system under test) with a constant barrage of transactions from a few clients (ranging from 10-50).
So you can see that even though a decent modern database (running on a single machine) can serve a few hundred clients simultaneously, it ends up serving very few (10-50) clients. I strongly believe that this way the database is not being tested to its full capacity; at least not as the TPC-C specification intended.
The web-servers of yesteryears also suffer from the same problem; using one process for each client request prohibits them from scaling, because the underlying operating system cannot run thousands of processes efficiently. The web-servers solved this problem (known as c10k problem) by using event-driven architecture which is capable of handling thousands of clients using a single process, and with minimal effort on the operating system’s part.
So this implementation of TPC-C uses a similar architecture and uses NodeJS, the event-driven architecture, to run thousands of clients against a database.