Workload Design and Behaviors
The File Workloads (NFS, SMB) use a probability-based design to generate the user-specified distribution of commands and load. Factors such as SUT performance, the test bed, errors, and runtime duration can cause the actual distribution and load to differ significantly from the expected distribution and load.
For example, a READ/WRITE ratio of 71%/29% means that the READ probability is 0.71 and the WRITE probability is 0.29 throughout the workload run. The “larger” your test environment is, the longer the workload run duration you should configure, because from a probability perspective a longer run makes it more likely that the observed distribution will match the configured one. Some metadata commands depend on each other, such as CLOSE and OPEN, or READDIR and LOOKUP, so it is recommended that you specify “realistic” distributions for these types of commands.
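The effect of run length on the observed ratio can be illustrated with a small simulation. This is only a sketch of the statistical idea, not the product's implementation: the function name `observed_read_fraction`, the operation counts, and the fixed seed are assumptions chosen for illustration.

```python
import random

def observed_read_fraction(num_ops, read_prob=0.71, seed=0):
    """Draw num_ops operations, each a READ with probability read_prob,
    and return the fraction that turned out to be READs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    reads = sum(1 for _ in range(num_ops) if rng.random() < read_prob)
    return reads / num_ops

# Hypothetical run lengths, expressed as a number of operations.
for n in (100, 10_000, 200_000):
    frac = observed_read_fraction(n)
    print(f"{n:>7} ops: observed READ fraction {frac:.4f}, "
          f"deviation from 0.71 = {abs(frac - 0.71):.4f}")
```

With only a few operations the observed fraction can sit noticeably away from 0.71, while with many operations the deviation tends to shrink, which is why a longer run duration is recommended for larger environments.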
Throughout a workload test run, a single concurrent worker chooses a set of operations based on probability and performs it from start to finish in one pass; on the next pass it may perform a different set of operations. For example, in the first pass the worker may perform some LOOKUPs, OPENs, and sequential WRITEs on file X, and on the next pass some more LOOKUPs, CLOSEs, and random READs on a different file Y. When you have multiple concurrent workers, each worker goes through its own “pass”, so many different workers are doing different things on different files on the SUT. Given sufficient runtime duration, and in the absence of unexpected errors or performance issues, all of the configured operations will eventually be performed over the course of the workload test run.
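The pass-based behavior described above can be sketched as a toy model. Everything here is an assumption made for illustration: the operation weights in `OP_WEIGHTS`, the file names, the number of operations per pass, and the helper names `run_pass` and `run_workers` are hypothetical and do not reflect product defaults or internals. Workers are simulated one after another rather than truly concurrently, and dependencies between commands such as OPEN and CLOSE are not modeled.

```python
import random
from collections import Counter

# Hypothetical operation mix; the weights are illustrative only.
OP_WEIGHTS = {"LOOKUP": 0.2, "OPEN": 0.1, "CLOSE": 0.1, "READ": 0.4, "WRITE": 0.2}
FILES = ["X", "Y", "Z"]  # hypothetical target files

def run_pass(rng, ops_per_pass=5):
    """One pass: pick a target file, then draw a set of operations by probability."""
    target = rng.choice(FILES)
    ops = rng.choices(list(OP_WEIGHTS), weights=list(OP_WEIGHTS.values()),
                      k=ops_per_pass)
    return target, ops

def run_workers(num_workers, passes_per_worker, seed=0):
    """Let every worker go through its own passes and tally all operations."""
    totals = Counter()
    for worker_id in range(num_workers):
        rng = random.Random(seed + worker_id)  # each worker draws independently
        for _ in range(passes_per_worker):
            _, ops = run_pass(rng)
            totals.update(ops)
    return totals

totals = run_workers(num_workers=8, passes_per_worker=500)
total_ops = sum(totals.values())
for op, weight in OP_WEIGHTS.items():
    print(f"{op:<6} configured {weight:.2f}  observed {totals[op] / total_ops:.3f}")
```

Any single pass may contain only a few of the configured operations, but the aggregate across all workers and passes tends toward the configured mix as the number of passes grows.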
While the protocol commands and some backend mechanisms differ across protocols, the “experience” the SUT goes through is largely similar across the file protocols. For example, SMB workloads use a backend Loop-Thread mechanism, whereas NFS does not, but from the SUT’s perspective, it still sees similar behavior in that many different clients are issuing a variety of metadata and data operations across its shares, folders, and files at the same time.