fio -direct=1 -iodepth=32 -rw=read -ioengine=libaio -bs=4m -size=10g -numjobs=1 -runtime=1000 -group_reporting -filename=testfile --allow_mounted_write=1 -name=Sequ_Read_Testing

Sequ_Read_Testing: (g=0): rw=read, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=32
fio-3.16
Starting 1 process
Jobs: 1 (f=1): [R(1)][96.7%][r=368MiB/s][r=92 IOPS][eta 00m:01s]
Sequ_Read_Testing: (groupid=0, jobs=1): err= 0: pid=74694: Thu May 16 18:40:31 2024
  read: IOPS=86, BW=347MiB/s (364MB/s)(10.0GiB/29479msec)
    slat (msec): min=7, max=260, avg=11.51, stdev=14.19
    clat (usec): min=17, max=2646.5k, avg=344495.25, stdev=192586.04
     lat (msec): min=11, max=2655, avg=356.01, stdev=197.77
    clat percentiles (msec):
     |  1.00th=[  284],  5.00th=[  305], 10.00th=[  309], 20.00th=[  313],
     | 30.00th=[  326], 40.00th=[  330], 50.00th=[  334], 60.00th=[  334],
     | 70.00th=[  338], 80.00th=[  338], 90.00th=[  342], 95.00th=[  347],
     | 99.00th=[  435], 99.50th=[ 2601], 99.90th=[ 2635], 99.95th=[ 2635],
     | 99.99th=[ 2635]
   bw (  KiB/s): min=303104, max=417792, per=100.00%, avg=384223.08, stdev=21823.92, samples=53
   iops        : min=   74, max=  102, avg=93.77, stdev= 5.35, samples=53
  lat (usec)   : 20=0.04%
  lat (msec)   : 20=0.04%, 50=0.12%, 100=0.16%, 250=0.55%, 500=98.16%
  lat (msec)   : 750=0.08%, 1000=0.04%, 2000=0.16%, >=2000=0.66%
  cpu          : usr=0.04%, sys=6.67%, ctx=2614, majf=0, minf=32779
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=98.8%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2560,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=347MiB/s (364MB/s), 347MiB/s-347MiB/s (364MB/s-364MB/s), io=10.0GiB (10.7GB), run=29479-29479msec
fio -direct=1 -iodepth=32 -rw=read -ioengine=libaio -bs=4m -size=10g -numjobs=1 -runtime=1000 -group_reporting -filename=testfile --allow_mounted_write=1 -name=Sequ_Read_TestingA

Sequ_Read_TestingA: (g=0): rw=read, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=32
fio-3.16
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=364MiB/s][r=91 IOPS][eta 00m:00s]
Sequ_Read_TestingA: (groupid=0, jobs=1): err= 0: pid=563: Thu May 16 18:11:28 2024
  read: IOPS=78, BW=315MiB/s (330MB/s)(10.0GiB/32490msec)
    slat (msec): min=7, max=261, avg=12.69, stdev=21.49
    clat (usec): min=3, max=5356.2k, avg=357490.74, stdev=319869.00
     lat (msec): min=10, max=5386, avg=370.18, stdev=334.81
    clat percentiles (msec):
     |  1.00th=[  288],  5.00th=[  309], 10.00th=[  313], 20.00th=[  317],
     | 30.00th=[  326], 40.00th=[  334], 50.00th=[  334], 60.00th=[  338],
     | 70.00th=[  342], 80.00th=[  347], 90.00th=[  347], 95.00th=[  351],
     | 99.00th=[  355], 99.50th=[ 3171], 99.90th=[ 5336], 99.95th=[ 5336],
     | 99.99th=[ 5336]
   bw (  KiB/s): min=98304, max=417792, per=100.00%, avg=376955.83, stdev=41482.14, samples=54
   iops        : min=   24, max=  102, avg=92.00, stdev=10.13, samples=54
  lat (usec)   : 4=0.04%
  lat (msec)   : 20=0.04%, 50=0.12%, 100=0.16%, 250=0.51%, 500=98.24%
  lat (msec)   : 750=0.04%, 1000=0.04%, 2000=0.16%, >=2000=0.66%
  cpu          : usr=0.05%, sys=6.02%, ctx=2757, majf=0, minf=32780
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=98.8%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2560,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=315MiB/s (330MB/s), 315MiB/s-315MiB/s (330MB/s-330MB/s), io=10.0GiB (10.7GB), run=32490-32490msec

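In Fluid, a PVC backed by ThinRuntime loses little performance compared with mounting directly on the host: the two runs above reached 347 MiB/s and 315 MiB/s, a gap of roughly 9%. Note that the `bs` (block size) and `size` parameters strongly affect the results. When only 1g of data is read, sequential read throughput can exceed 500 MB/s; with a block size of 128k, it drops to just over 100 MB/s. The test parameters therefore need to match the actual workload for the evaluation to be accurate.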
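Because the results depend so heavily on `bs` and `size`, it can help to run the same test over a small grid of values rather than a single point. The following is a minimal sketch, not the procedure used for the numbers above: it only prints one fio command per combination (a dry run), reusing the read-related flags from the command shown earlier. The function names `fio_cmd` and `sweep`, the job names, and the chosen grid (128k/4m, 1g/10g) are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sweep: print one fio sequential-read command per (bs, size) pair.
# Flags mirror the command used above; only -bs, -size and -name vary.
# fio_cmd and sweep are hypothetical helper names, not part of fio or Fluid.

fio_cmd() {
  # $1 = block size (e.g. 128k), $2 = total amount of data to read (e.g. 1g)
  printf 'fio -direct=1 -iodepth=32 -rw=read -ioengine=libaio -bs=%s -size=%s -numjobs=1 -runtime=1000 -group_reporting -filename=testfile -name=seqread_bs%s_size%s\n' "$1" "$2" "$1" "$2"
}

sweep() {
  # Illustrative grid; adjust to the block sizes and data sizes of the real workload.
  for bs in 128k 4m; do
    for size in 1g 10g; do
      fio_cmd "$bs" "$size"
    done
  done
}

sweep
```

Saved as, say, `sweep.sh` (a hypothetical name), the script can be reviewed first and then executed with `sh sweep.sh | sh` from the directory under test, once on the host mount and once inside a Pod that mounts the PVC, so that the two environments are compared with identical parameters.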

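Recently, our model inference services running in China needed to be deployed overseas as well. We chose AWS FSx for Lustre as the storage backend, but to keep the way the business layer uses storage consistent, Lustre had to be integrated into Fluid.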

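Fluid has supported Lustre since its early versions, but the Fluid community provides neither detailed documentation nor a demo example. This post therefore records the process of integrating Lustre through Fluid's ThinRuntime.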