K8s storage options
Migrating to OpenEBS Mayastor
Recently I posted here about the issues I was having with Longhorn. After reading more about OpenEBS Mayastor, I decided to give it a try. Here are the results.
OpenEBS Mayastor
The results below are from the fio test mentioned in the OpenEBS documentation.
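Reconstructing from the output (job name `benchtest`, 4k blocks, iodepth 16, mixed random read/write, 60 second runtime), the invocation was roughly the following. The filename, size, and job count are my guesses rather than a verbatim copy, so check the OpenEBS docs for the exact job:

```sh
# Approximate fio job, reconstructed from the output below.
# /mnt/volume/benchtest is a placeholder path on the PVC under test;
# numjobs=7 is inferred from the ~14% per-job bandwidth share (per=14.09%).
fio --name=benchtest \
    --filename=/mnt/volume/benchtest \
    --size=800m \
    --direct=1 \
    --rw=randrw \
    --ioengine=libaio \
    --bs=4k \
    --iodepth=16 \
    --numjobs=7 \
    --time_based --runtime=60
```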
benchtest: (groupid=0, jobs=1): err= 0: pid=22: Sat Dec 21 03:54:57 2024
read: IOPS=966, BW=3864KiB/s (3957kB/s)(226MiB/60006msec)
slat (usec): min=3, max=728958, avg=364.19, stdev=5211.50
clat (usec): min=9, max=2678.8k, avg=7410.38, stdev=43892.33
lat (usec): min=281, max=2678.8k, avg=7774.57, stdev=44403.61
clat percentiles (usec):
| 1.00th=[ 553], 5.00th=[ 832], 10.00th=[ 1057],
| 20.00th=[ 1434], 30.00th=[ 1827], 40.00th=[ 2311],
| 50.00th=[ 2966], 60.00th=[ 3851], 70.00th=[ 5014],
| 80.00th=[ 6980], 90.00th=[ 11469], 95.00th=[ 19006],
| 99.00th=[ 74974], 99.50th=[ 128451], 99.90th=[ 341836],
| 99.95th=[ 557843], 99.99th=[2667578]
bw ( KiB/s): min= 1, max=16472, per=14.09%, avg=4136.52, stdev=2979.44, samples=107
iops : min= 0, max= 4118, avg=1033.92, stdev=744.90, samples=107
write: IOPS=964, BW=3857KiB/s (3949kB/s)(226MiB/60006msec); 0 zone resets
slat (usec): min=3, max=1878.8k, avg=391.48, stdev=8841.78
clat (usec): min=122, max=2678.8k, avg=8375.08, stdev=41173.43
lat (usec): min=313, max=2678.8k, avg=8766.56, stdev=42873.68
clat percentiles (usec):
| 1.00th=[ 619], 5.00th=[ 914], 10.00th=[ 1172],
| 20.00th=[ 1598], 30.00th=[ 2089], 40.00th=[ 2737],
| 50.00th=[ 3556], 60.00th=[ 4621], 70.00th=[ 6063],
| 80.00th=[ 8356], 90.00th=[ 13566], 95.00th=[ 21890],
| 99.00th=[ 82314], 99.50th=[ 135267], 99.90th=[ 387974],
| 99.95th=[ 557843], 99.99th=[1501561]
bw ( KiB/s): min= 48, max=15864, per=14.20%, avg=4164.25, stdev=2945.69, samples=106
iops : min= 12, max= 3966, avg=1040.82, stdev=736.49, samples=106
lat (usec) : 10=0.01%, 100=0.01%, 250=0.05%, 500=0.47%, 750=2.45%
lat (usec) : 1000=4.71%
lat (msec) : 2=23.43%, 4=26.51%, 10=28.36%, 20=8.85%, 50=3.48%
lat (msec) : 100=0.93%, 250=0.58%, 500=0.12%, 750=0.01%, 1000=0.01%
lat (msec) : 2000=0.03%, >=2000=0.01%
cpu : usr=1.07%, sys=3.72%, ctx=33243, majf=6, minf=59
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=57968,57857,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
READ: bw=28.7MiB/s (30.0MB/s), 3369KiB/s-4071KiB/s (3449kB/s-4169kB/s), io=1720MiB (1803MB), run=60001-60008msec
WRITE: bw=28.6MiB/s (30.0MB/s), 3352KiB/s-4075KiB/s (3433kB/s-4173kB/s), io=1718MiB (1802MB), run=60001-60008msec
Disk stats (read/write):
nvme0n1: ios=0/0, sectors=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
- CPU usage for OpenEBS is over 6 CPUs; the io-engine pods are set up with a request of 2 CPUs each (see the measurement sketch below).
- Memory is around 736 MiB.
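If you want to check the same numbers on your own cluster, pod metrics are the easiest way; a minimal sketch, assuming metrics-server is installed and the default install namespaces:

```sh
# Per-pod CPU/memory for OpenEBS (default namespace "openebs").
kubectl top pods -n openebs --sort-by=cpu

# Same for Longhorn (default namespace "longhorn-system").
kubectl top pods -n longhorn-system --sort-by=cpu
```

Note that Mayastor's io-engine busy-polls its reserved cores, so it reports close to its full CPU limit even when the volumes are idle.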
Longhorn
benchtest: (groupid=0, jobs=1): err= 0: pid=22: Sat Dec 21 04:05:29 2024
read: IOPS=21, BW=85.1KiB/s (87.2kB/s)(5112KiB/60050msec)
slat (usec): min=4, max=800017, avg=1135.57, stdev=23789.63
clat (usec): min=136, max=5372.3k, avg=264699.64, stdev=536953.31
lat (msec): min=4, max=5372, avg=265.84, stdev=537.19
clat percentiles (msec):
| 1.00th=[ 10], 5.00th=[ 22], 10.00th=[ 31], 20.00th=[ 48],
| 30.00th=[ 64], 40.00th=[ 81], 50.00th=[ 101], 60.00th=[ 124],
| 70.00th=[ 163], 80.00th=[ 255], 90.00th=[ 810], 95.00th=[ 1150],
| 99.00th=[ 2735], 99.50th=[ 4530], 99.90th=[ 5336], 99.95th=[ 5403],
| 99.99th=[ 5403]
bw ( KiB/s): min= 1, max= 487, per=17.52%, avg=125.77, stdev=92.86, samples=65
iops : min= 0, max= 121, avg=31.03, stdev=23.19, samples=65
write: IOPS=22, BW=91.3KiB/s (93.4kB/s)(5480KiB/60050msec); 0 zone resets
slat (usec): min=4, max=2955.6k, avg=4996.26, stdev=97229.41
clat (usec): min=156, max=6285.4k, avg=447512.02, stdev=768041.10
lat (msec): min=10, max=6285, avg=452.51, stdev=773.90
clat percentiles (msec):
| 1.00th=[ 19], 5.00th=[ 30], 10.00th=[ 43], 20.00th=[ 70],
| 30.00th=[ 102], 40.00th=[ 136], 50.00th=[ 180], 60.00th=[ 247],
| 70.00th=[ 363], 80.00th=[ 609], 90.00th=[ 1083], 95.00th=[ 1519],
| 99.00th=[ 4530], 99.50th=[ 5537], 99.90th=[ 6208], 99.95th=[ 6275],
| 99.99th=[ 6275]
bw ( KiB/s): min= 9, max= 550, per=18.88%, avg=138.08, stdev=84.52, samples=62
iops : min= 2, max= 137, avg=34.13, stdev=21.13, samples=62
lat (usec) : 250=0.15%
lat (msec) : 4=0.04%, 10=0.34%, 20=2.53%, 50=14.20%, 100=21.94%
lat (msec) : 250=30.59%, 500=12.05%, 750=3.89%, 1000=4.72%, 2000=6.99%
lat (msec) : >=2000=2.57%
cpu : usr=0.01%, sys=0.10%, ctx=1061, majf=0, minf=46
IO depths : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=99.4%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=1278,1370,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
READ: bw=714KiB/s (731kB/s), 71.8KiB/s-104KiB/s (73.5kB/s-106kB/s), io=42.0MiB (44.1MB), run=60022-60335msec
WRITE: bw=731KiB/s (749kB/s), 77.4KiB/s-107KiB/s (79.3kB/s-110kB/s), io=43.1MiB (45.2MB), run=60022-60335msec
Disk stats (read/write):
sdg: ios=10766/11044, sectors=86624/88496, merge=0/18, ticks=993148/1872498, in_queue=2865647, util=69.67%
- CPU usage for Longhorn is around 300m.
- Memory is around 1.32 GiB.
Comparison
| Metric | OpenEBS | Longhorn | Winner |
|---|---|---|---|
| CPU | 6 CPUs | 300m | Longhorn |
| Memory | 736 MiB | 1.32 GiB | OpenEBS |
| Read IOPS | 966 | 21 | OpenEBS |
| Read BW (KiB/s) | 3864 | 85.1 | OpenEBS |
| Write IOPS | 964 | 22 | OpenEBS |
| Write BW (KiB/s) | 3857 | 91.3 | OpenEBS |
| Avg read latency (ms) | 7.77 | 265.84 | OpenEBS |
| Avg write latency (ms) | 8.77 | 452.51 | OpenEBS |
Pretty incredible results. OpenEBS Mayastor is the clear winner, except when it comes to CPU. The io-engine pods are what use so much CPU: the documentation and the Helm chart set the io-engine pods to 2 CPUs for both requests and limits, and each pod uses 100% of that, since the engine busy-polls its reserved cores. I have 3 worker nodes, which means 3 io-engine pods, so 6 CPUs.
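The reservation is tunable through the Helm chart if 6 CPUs is too steep, though with fewer polling cores you presumably give up some of the throughput above. A sketch of what that looks like; the exact value paths vary by chart version, so treat these as hypothetical and verify against `helm show values` first:

```sh
# Dump the chart defaults to find the io-engine resource knobs
# (the value paths below are my best guess for the umbrella chart).
helm show values openebs/openebs | grep -A8 io_engine

# Hypothetical override dropping the io-engine to 1 CPU per node.
helm upgrade openebs openebs/openebs -n openebs --reuse-values \
  --set mayastor.io_engine.resources.requests.cpu=1 \
  --set mayastor.io_engine.resources.limits.cpu=1
```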
Months ago I ran similar fio tests on Longhorn in Harvester at work, and the results were much better, though that cluster was on a 10 Gbps network. I would be curious to compare OpenEBS Mayastor and Longhorn head to head on a 10 Gbps network. It would also be interesting to test Longhorn's V2 engine.
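For what it's worth, Longhorn's V2 engine is SPDK-based like Mayastor, so it should be a much fairer fight. Roughly, enabling it looks like the sketch below; it is still marked experimental, and the setting and parameter names here are from memory, so double-check the Longhorn docs for your version:

```sh
# Turn on the experimental V2 data engine cluster-wide
# (requires hugepages on the nodes; see the Longhorn docs).
kubectl -n longhorn-system patch settings.longhorn.io v2-data-engine \
  --type=merge -p '{"value":"true"}'

# StorageClass that provisions volumes on the V2 engine.
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-v2
provisioner: driver.longhorn.io
parameters:
  dataEngine: "v2"
  numberOfReplicas: "3"
EOF
```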