How to analyze why the IOPS value of a fio random read test may be too high

2025-01-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com) 05/31

Why might the IOPS value measured by a fio random read test come out too high? This is a confusing result if you have not run into it before, so this article works through the cause of the problem and how to avoid it.

Problem description:

When using fio to test the IOPS of a virtual machine disk (a Ceph RBD volume formatted with an ext4 file system), the randread result was much higher than expected.

The problem occurs when a randwrite test is run first and a randread test with the same parameters is run afterwards.

It does not occur when the test file is built with dd before the randread test; in that case the IOPS value is normal.

The hypothesis is that fio's random offsets are pseudo-random, so randwrite and randread use the same pseudo-random sequence. If the file system allocates physical blocks from front to back, the logically random blocks are written to the physical disk sequentially, and the later random read is in fact a sequential read. The disk scheduler then merges the I/Os, fewer I/Os actually reach the device, and the measured IOPS is too high. The detailed analysis and tests below check this hypothesis.
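The hypothesis can be illustrated with a small toy model. This is only a sketch under stated assumptions: Python's `random.Random` stands in for fio's generator, `first_write_allocation` is a hypothetical allocator that hands out physical blocks strictly in order of first write, and blocks are drawn without repetition. Real file systems are far more complicated.

```python
import random


def pseudo_random_blocks(seed, count, total_blocks):
    """Return `count` distinct logical block numbers for a given seed.

    random.Random is only a stand-in for fio's generator.
    """
    rng = random.Random(seed)
    return rng.sample(range(total_blocks), count)


def first_write_allocation(logical_sequence):
    """Hypothetical allocator: the n-th newly written logical block
    is placed in physical block n."""
    mapping = {}
    for logical in logical_sequence:
        if logical not in mapping:
            mapping[logical] = len(mapping)
    return mapping


# Same seed for the write run and the read run, as in the hypothesis.
write_seq = pseudo_random_blocks(seed=1, count=8, total_blocks=1000)
read_seq = pseudo_random_blocks(seed=1, count=8, total_blocks=1000)

layout = first_write_allocation(write_seq)
physical_reads = [layout[block] for block in read_seq]
print(physical_reads)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The logical read order looks random, but the physical order is strictly ascending, which is the access pattern a scheduler can merge.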

Print the debug log

Enable fio's debug output, run both tests, and save the logs:

$ fio -direct=1 -iodepth=128 -rw=randwrite -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=10 -group_reporting -filename=iotest -name=Rand_Write_Testing --debug=random > rand_write_offset.log
$ fio -direct=1 -iodepth=128 -rw=randread -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=10 -group_reporting -filename=iotest -name=Rand_Read_Testingg --debug=random > rand_read_offset.log

View the log:

$ head -n30 rand_write_offset.log
fio: set debug option random
Rand_Write_Testing: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
fio-3.1
Starting 1 process
random 4057532 off rand 259043585
random 4057532 off rand 3179521932
random 4057532 off rand 3621444214
random 4057532 off rand 2018697059
random 4057532 off rand 1726199243
random 4057532 off rand 3608323581
random 4057532 off rand 1634212905
random 4057532 off rand 1518359867
random 4057532 off rand 3921331707
random 4057532 off rand 287004724
random 4057532 off rand 3673173177
random 4057532 off rand 2796675757
random 4057532 off rand 3988051731
random 4057532 off rand 1060357494
random 4057532 off rand 1685717462
random 4057532 off rand 2400737531
random 4057532 off rand 1891936796
random 4057532 off rand 3455447349
random 4057532 off rand 1553547805
random 4057532 off rand 2660809810
random 4057532 off rand 17263379
random 4057532 off rand 1823528783
random 4057532 off rand 1355450167
random 4057532 off rand 2956359995
random 4057532 off rand 3392712188
random 4057532 off rand 4240594610

$ head -n30 rand_read_offset.log
fio: set debug option random
Rand_Read_Testingg: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
fio-3.1
Starting 1 process
random 4057831 off rand 259043585
random 4057831 off rand 3179521932
random 4057831 off rand 3621444214
random 4057831 off rand 2018697059
random 4057831 off rand 1726199243
random 4057831 off rand 3608323581
random 4057831 off rand 1634212905
random 4057831 off rand 1518359867
random 4057831 off rand 3921331707
random 4057831 off rand 287004724
random 4057831 off rand 3673173177
random 4057831 off rand 2796675757
random 4057831 off rand 3988051731
random 4057831 off rand 1060357494
random 4057831 off rand 1685717462
random 4057831 off rand 2400737531
random 4057831 off rand 1891936796
random 4057831 off rand 3455447349
random 4057831 off rand 1553547805
random 4057831 off rand 2660809810
random 4057831 off rand 17263379
random 4057831 off rand 1823528783
random 4057831 off rand 1355450167
random 4057831 off rand 2956359995
random 4057831 off rand 3392712188
random 4057831 off rand 4240594610

Comparing the two logs shows that the random values in the right-hand column are identical in both runs.
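Comparing the logs by eye gets tedious for longer runs. The following is a minimal sketch of how the comparison could be automated; the helper names `extract_random_values` and `same_sequence` are made up for this example, and the only thing taken from fio is the `off rand <number>` text visible in the logs above.

```python
import re

# Matches the "off rand <number>" part of a --debug=random log line.
OFF_RAND = re.compile(r"off rand\s+(\d+)")


def extract_random_values(log_text):
    """Return the generated random values in the order they were logged."""
    return [int(match.group(1)) for match in OFF_RAND.finditer(log_text)]


def same_sequence(write_log, read_log):
    """True if both logs start with the same non-empty run of values."""
    written = extract_random_values(write_log)
    read = extract_random_values(read_log)
    common = min(len(written), len(read))
    return common > 0 and written[:common] == read[:common]


# Two lines from each log above; the PID column differs, the values do not.
write_log = """random 4057532 off rand 259043585
random 4057532 off rand 3179521932
"""
read_log = """random 4057831 off rand 259043585
random 4057831 off rand 3179521932
"""
print(same_sequence(write_log, read_log))  # True
```

For the real files, the log text could be read with `open("rand_write_offset.log").read()` and `open("rand_read_offset.log").read()`.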

Get the fio source code

The source and version of the source code are as follows:

$ git clone https://github.com/axboe/fio.git
$ cd fio
$ git branch -av
* master                       ee636f3 libaio: switch to newer libaio polled IO API
  remotes/origin/HEAD          -> origin/master
  remotes/origin/latency-probe fcd4e74 target: fixes
  remotes/origin/master        ee636f3 libaio: switch to newer libaio polled IO API

Analyze the debug option

Find the definition and reference location of the debug option:

$ grep -rHn \"debug\"
init.c:176: .name = (char *) "debug"

Looking for the definition and references of the random parameter shows that it is defined with the FD_RANDOM macro or enum value:

$ grep -rHn \"random\" -A5 init.c
init.c:2260: { .name = "random",
init.c-2261- .help = "Random generation logging",
init.c-2262- .shift = FD_RANDOM,
init.c-2263- },
init.c-2264- { .name = "parse",
init.c-2265- .help = "Parser logging"

Searching for the definition and references of FD_RANDOM shows that it is defined in debug.h and referenced in io_u.c, where it switches debug printing on. The format string at line 98 of io_u.c matches the debug log above:

$ grep -rHn FD_RANDOM
debug.h:13: FD_RANDOM,
init.c:2262: .shift = FD_RANDOM,
io_u.c:98: dprint(FD_RANDOM, "off rand %llu\n", (unsigned long long) r);
io_u.c:124: dprint(FD_RANDOM, "get_next_rand_offset: offset %llu busy\n",

Check the source code near the reference of FD_RANDOM. Line 96 is where the random number is generated, and line 98 prints the generated random number:

$ grep -rHn FD_RANDOM io_u.c -C12
io_u.c-86-
io_u.c-87-static int __get_next_rand_offset(struct thread_data *td, struct fio_file *f,
io_u.c-88- enum fio_ddir ddir, uint64_t *b,
io_u.c-89- uint64_t lastb)
io_u.c-90-{
io_u.c-91- uint64_t r;
io_u.c-92-
io_u.c-93- if (td->o.random_generator == FIO_RAND_GEN_TAUSWORTHE ||
io_u.c-94- td->o.random_generator == FIO_RAND_GEN_TAUSWORTHE64) {
io_u.c-95-
io_u.c-96- r = __rand(&td->random_state);
io_u.c-97-
io_u.c:98: dprint(FD_RANDOM, "off rand %llu\n", (unsigned long long) r);
io_u.c-99-
io_u.c-100- *b = lastb * (r / (rand_max(&td->random_state) + 1.0));
io_u.c-101- } else {
io_u.c-102- uint64_t off = 0;
io_u.c-103-
io_u.c-104- assert(fio_file_lfsr(f));
io_u.c-105-
io_u.c-106- if (lfsr_next(&f->lfsr, &off))
io_u.c-107- return 1;
io_u.c-108-
io_u.c-109- *b = off;
io_u.c-110- }
--
io_u.c-112- /*
io_u.c-113- * if we are not maintaining a random map, we are done.
io_u.c-114- */
io_u.c-115- if (!file_randommap(td, f))
io_u.c-116- goto ret;
io_u.c-117-
io_u.c-118- /*
io_u.c-119- * calculate map offset and check if it's free
io_u.c-120- */
io_u.c-121- if (random_map_free(f, *b))
io_u.c-122- goto ret;
io_u.c-123-
io_u.c:124: dprint(FD_RANDOM, "get_next_rand_offset: offset %llu busy\n",
io_u.c-125- (unsigned long long) *b);
io_u.c-126-
io_u.c-127- *b = axmap_next_free(f->io_axmap, *b);
io_u.c-128- if (*b == (uint64_t) -1ULL)
io_u.c-129- return 1;
io_u.c-130-ret:
io_u.c-131- return 0;
io_u.c-132-}
io_u.c-133-
io_u.c-134-static int __get_next_rand_offset_zipf(struct thread_data *td,
io_u.c-135- struct fio_file *f, enum fio_ddir ddir,
io_u.c-136- uint64_t *b)

Analyze the dprint function

Find the definition and reference of a dprint function or macro, which is defined in debug.h:

$ grep -rHn "dprint"
debug.h:62:#define dprint(type, str, args...) \
debug.h:71:static inline void dprint(int type, const char *str, ...)
gettime.c:320: dprint(FD_TIME, "tmp=%llu, sft=%u\n", tmp, sft);
io_u.h:153:static inline void dprint_io_u(struct io_u *io_u, const char *p)
io_u.h:170:#define dprint_io_u(io_u, p)
t/time-test.c:88:#define dprintf(...) if (DEBUG) { printf(__VA_ARGS__); }

View the contents of the dprint definition in debug.h:

$ grep -rHn "dprint" -C7 debug.h
debug.h-55-};
debug.h-56-extern const struct debug_level debug_levels[];
debug.h-57-
debug.h-58-extern unsigned long fio_debug;
debug.h-59-
debug.h-60-void __dprint(int type, const char *str, ...) __attribute__((format (printf, 2, 3)));
debug.h-61-
debug.h:62:#define dprint(type, str, args...) \
debug.h-63- do { \
debug.h-64- if ((1 [...]

[...]

            [...] name, strlen(dl->name));
            if (!found)
                continue;

            if (dl->shift == FD_JOB) {
                opt = strchr(opt, ':');
                if (!opt) {
                    log_err("fio: missing job number\n");
                    break;
                }
                opt++;
                fio_debug_jobno = atoi(opt);
                log_info("fio: set debug jobno %d\n", fio_debug_jobno);
            } else {
                log_info("fio: set debug option %s\n", opt);
                fio_debug |= (1UL << dl->shift);
            }
            break;
        }

        if (!found)
            log_err("fio: debug mask %s not found\n", opt);
    }

    return 0;
}
#else
static int set_debug(const char *string)
{
    log_err("fio: debug tracing not included in build\n");
    return 1;
}
#endif

Analyze the randwrite option

Find the definition and reference of the randwrite parameter, using TD_DDIR_RANDWRITE as the parameter value:

$ grep -rHn \"randwrite\" -C5
io_ddir.h-62-}
io_ddir.h-63-
io_ddir.h-64-static inline const char *ddir_str(enum td_ddir ddir)
io_ddir.h-65-{
io_ddir.h-66- static const char *__str[] = { NULL, "read", "write", "rw", "rand",
io_ddir.h:67: "randread", "randwrite", "randrw",
io_ddir.h-68- "trim", NULL, "trimwrite", NULL, "randtrim" };
io_ddir.h-69-
io_ddir.h-70- return __str[ddir];
io_ddir.h-71-}
io_ddir.h-72-
--
options.c-1690- },
options.c-1691- { .ival = "randread",
options.c-1692- .oval = TD_DDIR_RANDREAD,
options.c-1693- .help = "Random read",
options.c-1694- },
options.c:1695: { .ival = "randwrite",
options.c-1696- .oval = TD_DDIR_RANDWRITE,
options.c-1697- .help = "Random write",
options.c-1698- },
options.c-1699- { .ival = "randtrim",
options.c-1700- .oval = TD_DDIR_RANDTRIM,
--
profiles/act.c-182-
profiles/act.c-183- if (act_add_opt("name=act-%s-%s", reads ? "read" : "write", dev))
profiles/act.c-184- return 1;
profiles/act.c-185- [...]
profiles/act.c-186- return 1;
profiles/act.c:187: if (act_add_opt("rw=%s", reads ? "randread" : "randwrite"))
profiles/act.c-188- return 1;
profiles/act.c-189- if (reads) {
profiles/act.c-190- int rload = ao->load * R_LOAD / ao->threads_per_queue;
profiles/act.c-191-
profiles/act.c-192- if (act_add_opt("numjobs=%u", ao->threads_per_queue))
--
t/sgunmap-test.py-116-
t/sgunmap-test.py-117-
t/sgunmap-test.py-118-def runalltests(args, qd, batch):
t/sgunmap-test.py-119- block = False
t/sgunmap-test.py-120- for dev in [args.chardev, args.blockdev]:
t/sgunmap-test.py:121: for rw in ["randread", "randwrite", "randtrim"]:
t/sgunmap-test.py-122- parameters = ["--name=test", "--time_based", "--runtime=30s", "--output-format=json", "--ioengine=sg", [...]

Look for the definition of TD_DDIR_RANDWRITE, which consists of TD_DDIR_WRITE and TD_DDIR_RAND. We should pay attention to the impact of the parameter TD_DDIR_RAND on program execution:

$ grep -rHn TD_DDIR_RANDWRITE
io_ddir.h:38: TD_DDIR_RANDWRITE = TD_DDIR_WRITE | TD_DDIR_RAND,
options.c:1696: .oval = TD_DDIR_RANDWRITE,
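The way TD_DDIR_RANDWRITE is composed from two flag bits can be sketched as follows. The numeric values are an assumption inferred from the ddir_str table shown earlier (index 1 is "read", 2 is "write", 4 is "rand", 6 is "randwrite"); the class `TdDdir` and the function `td_random` here are illustrative Python stand-ins, not fio code.

```python
from enum import IntFlag


class TdDdir(IntFlag):
    """Stand-in for fio's td_ddir enum (bit values assumed, see above)."""
    READ = 1 << 0
    WRITE = 1 << 1
    RAND = 1 << 2
    RANDREAD = READ | RAND
    RANDWRITE = WRITE | RAND


def td_random(td_ddir):
    """True if the RAND flag bit is set in the data direction."""
    return bool(td_ddir & TdDdir.RAND)


print(int(TdDdir.RANDWRITE))        # 6
print(td_random(TdDdir.RANDWRITE))  # True
print(td_random(TdDdir.RANDREAD))   # True
print(td_random(TdDdir.WRITE))      # False
```

Both randread and randwrite carry the same RAND bit, so both go through the same random offset code path.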

Find the definition and references of TD_DDIR_RAND. It is mainly referenced by the td_random macro, which tests it as a flag bit:

$ grep -rHn TD_DDIR_RAND -C3
io_ddir.h-31-enum td_ddir {
io_ddir.h-32- TD_DDIR_READ = 1 [...] o.td_ddir & TD_DDIR_RAND)
io_ddir.h-49-#define file_randommap(td, f) (!(td)->o.norandommap && fio_file_axmap((f)))
io_ddir.h-50-#define td_trimwrite(td) (((td)->o.td_ddir & TD_DDIR_TRIMWRITE) \
io_ddir.h-51- == TD_DDIR_TRIMWRITE)
--
options.c-1689- .help = "Sequential trim",
options.c-1690- },
options.c-1691- { .ival = "randread",
options.c:1692: .oval = TD_DDIR_RANDREAD,
options.c-1693- .help = "Random read",
options.c-1694- },
options.c-1695- { .ival = "randwrite",
options.c:1696: .oval = TD_DDIR_RANDWRITE,
options.c-1697- .help = "Random write",
options.c-1698- },
options.c-1699- { .ival = "randtrim",
options.c:1700: .oval = TD_DDIR_RANDTRIM,
options.c-1701- .help = "Random trim",
options.c-1702- },
options.c-1703- { .ival = "rw",
--
options.c-1709- .help = "Sequential read and write mix",
options.c-1710- },
options.c-1711- { .ival = "randrw",
options.c:1712: .oval = TD_DDIR_RANDRW,
options.c-1713- .help = "Random read and write mix"
options.c-1714- },
options.c-1715- { .ival = "trimwrite",

Then look for the definition and references of td_random, focusing on io_u.c, since that is where the I/O sequence is generated. When td_random is true, get_next_rand_block is called, which should be where the random numbers are generated:

$ grep -rHn td_random -C5 io_u.c
io_u.c-416- assert(ddir_rw(ddir));
io_u.c-417-
io_u.c-418- b = offset = -1ULL;
io_u.c-419-
io_u.c-420- if (rw_seq) {
io_u.c:421: if (td_random(td)) {
io_u.c-422- if (should_do_random(td, ddir)) {
io_u.c-423- ret = get_next_rand_block(td, f, ddir, &b);
io_u.c-424- *is_random = true;
io_u.c-425- } else {
io_u.c-426- *is_random = false;
--
io_u.c-934- }
io_u.c-935-
io_u.c-936- /*
io_u.c-937- * mark entry before potentially trimming io_u
io_u.c-938- */
io_u.c:939: if (td_random(td) && file_randommap(td, io_u->file))
io_u.c-940- io_u->buflen = mark_random_map(td, io_u, offset, io_u->buflen);
io_u.c-941-
io_u.c-942-out:
io_u.c-943- dprint_io_u(io_u, "fill");
io_u.c-944- td->zone_bytes += io_u->buflen;

Check the code of get_next_rand_block in io_u.c. It ends up in the function containing the dprint call analyzed earlier, where __rand and rand_max are used to compute the random block:

static int get_next_rand_block(struct thread_data *td, struct fio_file *f,
                               enum fio_ddir ddir, uint64_t *b)
{
    if (!get_next_rand_offset(td, f, ddir, b))
        return 0;

    if (td->o.time_based ||
        (td->o.file_service_type & __FIO_FSERVICE_NONUNIFORM)) {
        fio_file_reset(td, f);
        loop_cache_invalidate(td, f);
        if (!get_next_rand_offset(td, f, ddir, b))
            return 0;
    }

    dprint(FD_IO, "%s: rand offset failed, last=%llu, size=%llu\n",
           f->file_name, (unsigned long long) f->last_pos[ddir],
           (unsigned long long) f->real_file_size);
    return 1;
}

static int get_next_rand_offset(struct thread_data *td, struct fio_file *f,
                                enum fio_ddir ddir, uint64_t *b)
{
    if (td->o.random_distribution == FIO_RAND_DIST_RANDOM) {
        uint64_t lastb;

        lastb = last_block(td, f, ddir);
        if (!lastb)
            return 1;

        return __get_next_rand_offset(td, f, ddir, b, lastb);
    } else if (td->o.random_distribution == FIO_RAND_DIST_ZIPF)
        return __get_next_rand_offset_zipf(td, f, ddir, b);
    else if (td->o.random_distribution == FIO_RAND_DIST_PARETO)
        return __get_next_rand_offset_pareto(td, f, ddir, b);
    else if (td->o.random_distribution == FIO_RAND_DIST_GAUSS)
        return __get_next_rand_offset_gauss(td, f, ddir, b);
    else if (td->o.random_distribution == FIO_RAND_DIST_ZONED)
        return __get_next_rand_offset_zoned(td, f, ddir, b);
    else if (td->o.random_distribution == FIO_RAND_DIST_ZONED_ABS)
        return __get_next_rand_offset_zoned_abs(td, f, ddir, b);

    log_err("fio: unknown random distribution: %d\n", td->o.random_distribution);
    return 1;
}

static int __get_next_rand_offset(struct thread_data *td, struct fio_file *f,
                                  enum fio_ddir ddir, uint64_t *b,
                                  uint64_t lastb)
{
    uint64_t r;

    if (td->o.random_generator == FIO_RAND_GEN_TAUSWORTHE ||
        td->o.random_generator == FIO_RAND_GEN_TAUSWORTHE64) {

        r = __rand(&td->random_state);

        dprint(FD_RANDOM, "off rand %llu\n", (unsigned long long) r);

        *b = lastb * (r / (rand_max(&td->random_state) + 1.0));
    } else {
        uint64_t off = 0;

        assert(fio_file_lfsr(f));

        if (lfsr_next(&f->lfsr, &off))
            return 1;

        *b = off;
    }

    /*
     * if we are not maintaining a random map, we are done.
     */
    if (!file_randommap(td, f))
        goto ret;

    /*
     * calculate map offset and check if it's free
     */
    if (random_map_free(f, *b))
        goto ret;

    dprint(FD_RANDOM, "get_next_rand_offset: offset %llu busy\n",
           (unsigned long long) *b);

    *b = axmap_next_free(f->io_axmap, *b);
    if (*b == (uint64_t) -1ULL)
        return 1;
ret:
    return 0;
}

Analyze the random number calculation

Find the __rand function, which is a static inline function defined in lib/rand.h:

$ grep -rHn "__rand("
backend.c:1012: io_u->rand_seed = __rand(&td->verify_state);
backend.c:1014: io_u->rand_seed *= __rand(&td->verify_state);
engines/rdma.c:715: index = __rand(&rd->rand_state) % rd->rmt_nr;
engines/rdma.c:725: index = __rand(&rd->rand_state) % rd->rmt_nr;
filesetup.c:337: r = __rand(&td->file_size_state);
io_u.c:96: r = __rand(&td->random_state);
io_u.c:548: r = __rand(&td->bsrange_state[ddir]);
io_u.c:1165: r = __rand(&td->next_file_state);
lib/gauss.c:16: r = __rand(&gs->r);
lib/gauss.c:28: sum += __rand(&gs->r) % (gs->nranges + 1);
lib/rand.c:128: unsigned long r = __rand(fs);
lib/rand.c:131: r *= (unsigned long) __rand(fs);
lib/rand.c:190: unsigned long r = __rand(fs);
lib/rand.c:193: r *= (unsigned long) __rand(fs);
lib/rand.h:96:static inline uint64_t __rand(struct frand_state *state)
lib/zipf.c:32: zs->rand_off = __rand(&zs->rand);
lib/zipf.c:55: rand_uni = (double) __rand(&zs->rand) / (double) FRAND32_MAX;
lib/zipf.c:82: double rand = (double) __rand(&zs->rand) / (double) FRAND32_MAX;
trim.c:77: r = __rand(&td->trim_state);
verify.c:1370: io_u->rand_seed = __rand(&td->verify_state);
verify.c:1372: io_u->rand_seed *= __rand(&td->verify_state);

Looking at the implementation of this function, the final calculation result is only related to the input parameters, which is really pseudo-random:

struct taus88_state {
    unsigned int s1, s2, s3;
};

struct taus258_state {
    uint64_t s1, s2, s3, s4, s5;
};

struct frand_state {
    unsigned int use64;
    union {
        struct taus88_state state32;
        struct taus258_state state64;
    };
};

static inline unsigned int __rand32(struct taus88_state *state)
{
#define TAUSWORTHE(s,a,b,c,d) ((s&c)<<d) ^ (((s <<a) ^ s)>>b)

    state->s1 = TAUSWORTHE(state->s1, 13, 19, 4294967294UL, 12);
    state->s2 = TAUSWORTHE(state->s2, 2, 25, 4294967288UL, 4);
    state->s3 = TAUSWORTHE(state->s3, 3, 11, 4294967280UL, 17);

    return (state->s1 ^ state->s2 ^ state->s3);
}

static inline uint64_t __rand64(struct taus258_state *state)
{
    uint64_t xval;

    xval = ((state->s1 << 1) ^ state->s1) >> 53;
    state->s1 = ((state->s1 & 18446744073709551614ULL) << 10) ^ xval;

    xval = ((state->s2 << 24) ^ state->s2) >> 50;
    state->s2 = ((state->s2 & 18446744073709551104ULL) << 5) ^ xval;

    xval = ((state->s3 << 3) ^ state->s3) >> 23;
    state->s3 = ((state->s3 & 18446744073709547520ULL) << 29) ^ xval;

    xval = ((state->s4 << 5) ^ state->s4) >> 24;
    state->s4 = ((state->s4 & 18446744073709420544ULL) << 23) ^ xval;

    xval = ((state->s5 << 3) ^ state->s5) >> 33;
    state->s5 = ((state->s5 & 18446744073701163008ULL) << 8) ^ xval;

    return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4 ^ state->s5);
}

static inline uint64_t __rand(struct frand_state *state)
{
    if (state->use64)
        return __rand64(&state->state64);
    else
        return __rand32(&state->state32);
}
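To make the "depends only on its input" point concrete, here is a minimal Python port of the 32-bit generator, written as a sketch. It assumes the standard three-component Tausworthe (taus88) shift and mask constants; the class name `Taus88` and the starting state are made up for the demonstration and are not fio's seeding.

```python
MASK32 = 0xFFFFFFFF


def _tausworthe(s, a, b, c, d):
    """One taus88 component step, emulating 32-bit unsigned arithmetic."""
    return (((s & c) << d) & MASK32) ^ ((((s << a) & MASK32) ^ s) >> b)


class Taus88:
    """Sketch of a 32-bit combined Tausworthe generator."""

    def __init__(self, s1, s2, s3):
        self.s1 = s1 & MASK32
        self.s2 = s2 & MASK32
        self.s3 = s3 & MASK32

    def next(self):
        self.s1 = _tausworthe(self.s1, 13, 19, 4294967294, 12)
        self.s2 = _tausworthe(self.s2, 2, 25, 4294967288, 4)
        self.s3 = _tausworthe(self.s3, 3, 11, 4294967280, 17)
        return self.s1 ^ self.s2 ^ self.s3


# Arbitrary starting state, chosen only for this demonstration.
first_run = Taus88(12345, 67890, 424242)
second_run = Taus88(12345, 67890, 424242)

first_values = [first_run.next() for _ in range(5)]
second_values = [second_run.next() for _ in range(5)]
print(first_values == second_values)  # True
```

There is no hidden source of entropy: two generators that start from the same state produce the same values forever.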

Again, look for the implementation of rand_max, which is still defined in lib/rand.h:

$ grep -rHn "rand_max("
filesetup.c:336: frand_max = rand_max(&td->file_size_state);
io_u.c:546: frand_max = rand_max(&td->bsrange_state[ddir]);
io_u.c:1162: uint64_t frand_max = rand_max(&td->next_file_state);
lib/rand.h:27:static inline uint64_t rand_max(struct frand_state *state)
trim.c:76: frand_max = rand_max(&td->trim_state);

Its output likewise depends only on its input; nothing here is truly random:

#define FRAND32_MAX (-1U)
#define FRAND64_MAX (-1ULL)

static inline uint64_t rand_max(struct frand_state *state)
{
    if (state->use64)
        return FRAND64_MAX;
    else
        return FRAND32_MAX;
}

Analyze seed initialization

As shown above, the result of the whole calculation depends only on the input random_state. Looking up the initialization and references of random_state shows that it is initialized with init_rand_seed, and that afterwards only __rand modifies it (rand_max merely reads its use64 field):

$ grep -rHn random_state
fio.h:357: struct frand_state random_state;
init.c:1056: init_rand_seed(&td->random_state, td->rand_seeds[FIO_RAND_BLOCK_OFF], use64);
io_u.c:96: r = __rand(&td->random_state);
io_u.c:100: *b = lastb * (r / (rand_max(&td->random_state) + 1.0));
verify.c:1623: if (td->random_state.use64) {
verify.c:1624: s->rand.state64.s[0] = cpu_to_le64(td->random_state.state64.s1);
verify.c:1625: s->rand.state64.s[1] = cpu_to_le64(td->random_state.state64.s2);
verify.c:1626: s->rand.state64.s[2] = cpu_to_le64(td->random_state.state64.s3);
verify.c:1627: s->rand.state64.s[3] = cpu_to_le64(td->random_state.state64.s4);
verify.c:1628: s->rand.state64.s[4] = cpu_to_le64(td->random_state.state64.s5);
verify.c:1632: s->rand.state32.s[0] = cpu_to_le32(td->random_state.state32.s1);
verify.c:1633: s->rand.state32.s[1] = cpu_to_le32(td->random_state.state32.s2);
verify.c:1634: s->rand.state32.s[2] = cpu_to_le32(td->random_state.state32.s3);
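The io_u.c:100 line in the output above maps a raw random value to a block number. A worked example of that arithmetic, as a sketch: it assumes the 32-bit generator is in use (so rand_max is 0xFFFFFFFF, which fits the values in the debug log) and that lastb is simply the file size divided by the block size, 1G / 4k = 262144. The function name `rand_to_block` is made up for this example.

```python
FRAND32_MAX = 0xFFFFFFFF  # (-1U) as an unsigned 32-bit value


def rand_to_block(r, lastb, rand_max=FRAND32_MAX):
    """Mirror of: *b = lastb * (r / (rand_max + 1.0)), truncated to an integer."""
    return int(lastb * (r / (rand_max + 1.0)))


# Assumed block count for the test above: 1G file, 4k blocks.
lastb = (1 << 30) // 4096

# First three values from the debug log.
logged_values = [259043585, 3179521932, 3621444214]
blocks = [rand_to_block(r, lastb) for r in logged_values]
print(blocks)  # [15810, 194062, 221035]
```

Because the mapping is a pure function of r and lastb, identical random values in the two runs mean identical block numbers (before any adjustment by the random map).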

If you look at the implementation of init_rand_seed, it is still pseudo-random and is only related to the input parameter seed:

// lib/rand.h
static inline uint64_t __rand64(struct taus258_state *state)
{
    uint64_t xval;

    xval = ((state->s1 << 1) ^ state->s1) >> 53;
    state->s1 = ((state->s1 & 18446744073709551614ULL) << 10) ^ xval;

    xval = ((state->s2 << 24) ^ state->s2) >> 50;
    state->s2 = ((state->s2 & 18446744073709551104ULL) << 5) ^ xval;

    xval = ((state->s3 << 3) ^ state->s3) >> 23;
    state->s3 = ((state->s3 & 18446744073709547520ULL) << 29) ^ xval;

    xval = ((state->s4 << 5) ^ state->s4) >> 24;
    state->s4 = ((state->s4 & 18446744073709420544ULL) << 23) ^ xval;

    xval = ((state->s5 << 3) ^ state->s5) >> 33;
    state->s5 = ((state->s5 & 18446744073701163008ULL) << 8) ^ xval;

    return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4 ^ state->s5);
}

// lib/rand.c
static inline uint64_t __seed(uint64_t x, uint64_t m)
{
    return (x < m) ? x + m : x;
}

static void __init_rand32(struct taus88_state *state, unsigned int seed)
{
    int cranks = 6;

#define LCG(x, seed) ((x) * 69069 ^ (seed))

    state->s1 = __seed(LCG((2^31) + (2^17) + (2^7), seed), 1);
    state->s2 = __seed(LCG(state->s1, seed), 7);
    state->s3 = __seed(LCG(state->s2, seed), 15);

    while (cranks--)
        __rand32(state);
}

static void __init_rand64(struct taus258_state *state, uint64_t seed)
{
    int cranks = 6;

#define LCG64(x, seed) ((x) * 6906969069ULL ^ (seed))

    state->s1 = __seed(LCG64((2^31) + (2^17) + (2^7), seed), 1);
    state->s2 = __seed(LCG64(state->s1, seed), 7);
    state->s3 = __seed(LCG64(state->s2, seed), 15);
    state->s4 = __seed(LCG64(state->s3, seed), 33);
    state->s5 = __seed(LCG64(state->s4, seed), 49);

    while (cranks--)
        __rand64(state);
}

void init_rand(struct frand_state *state, bool use64)
{
    state->use64 = use64;

    if (!use64)
        __init_rand32(&state->state32, 1);
    else
        __init_rand64(&state->state64, 1);
}

void init_rand_seed(struct frand_state *state, unsigned int seed, bool use64)
{
    state->use64 = use64;

    if (!use64)
        __init_rand32(&state->state32, seed);
    else
        __init_rand64(&state->state64, seed);
}

Analyze the seed source

Take a look at the references to the seed argument, td->rand_seeds[FIO_RAND_BLOCK_OFF], which is initialized at line 1054 of init.c:

$ grep -rHn "rand_seeds\[FIO_RAND_BLOCK_OFF\]"
filesetup.c:1292: seed = td->rand_seeds[FIO_RAND_BLOCK_OFF];
filesetup.c:1882: lfsr_reset(&f->lfsr, td->rand_seeds[FIO_RAND_BLOCK_OFF]);
init.c:1054: td->rand_seeds[FIO_RAND_BLOCK_OFF] = FIO_RANDSEED * td->thread_number;
init.c:1056: init_rand_seed(&td->random_state, td->rand_seeds[FIO_RAND_BLOCK_OFF], use64);
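The init.c:1054 line above is the whole seed derivation for block offsets. A tiny sketch of what that implies, with two assumptions: the numeric value of FIO_RANDSEED used here (0xb1899bed) is recalled from fio's headers and is not shown in this article, and the product is truncated to 32 bits because init_rand_seed takes an unsigned int seed. The function name `block_offset_seed` is made up.

```python
FIO_RANDSEED = 0xB1899BED  # assumed value; not shown in this article


def block_offset_seed(thread_number):
    """Mirror of: FIO_RANDSEED * td->thread_number, truncated to 32 bits."""
    return (FIO_RANDSEED * thread_number) & 0xFFFFFFFF


# A single-job randwrite run and a later single-job randread run
# both use the same thread_number, hence the same seed.
write_run_seed = block_offset_seed(1)
read_run_seed = block_offset_seed(1)
print(write_run_seed == read_run_seed)  # True

# Two jobs inside one run get different seeds.
print(block_offset_seed(1) != block_offset_seed(2))  # True
```

Nothing in this derivation depends on the time, the PID, or the I/O direction, which is why separate fio invocations repeat the same sequence.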

Looking at the function that contains this line shows that the seed depends on the job's thread_number variable:

// init.c
static void td_fill_rand_seeds_internal(struct thread_data *td, bool use64)
{
    unsigned int read_seed = td->rand_seeds[FIO_RAND_BS_OFF];
    unsigned int write_seed = td->rand_seeds[FIO_RAND_BS1_OFF];
    unsigned int trim_seed = td->rand_seeds[FIO_RAND_BS2_OFF];
    int i;

    /*
     * trimwrite is special in that we need to generate the same
     * offsets to get the "write after trim" effect. If we are
     * using bssplit to set buffer length distributions, ensure that
     * we seed the trim and write generators identically. Ditto for
     * verify, read and writes must have the same seed, if we are doing
     * read verify.
     */
    if (td->o.verify != VERIFY_NONE)
        write_seed = read_seed;
    if (td_trimwrite(td))
        trim_seed = write_seed;
    init_rand_seed(&td->bsrange_state[DDIR_READ], read_seed, use64);
    init_rand_seed(&td->bsrange_state[DDIR_WRITE], write_seed, use64);
    init_rand_seed(&td->bsrange_state[DDIR_TRIM], trim_seed, use64);

    td_fill_verify_state_seed(td);
    init_rand_seed(&td->rwmix_state, td->rand_seeds[FIO_RAND_MIX_OFF], false);

    if (td->o.file_service_type == FIO_FSERVICE_RANDOM)
        init_rand_seed(&td->next_file_state, td->rand_seeds[FIO_RAND_FILE_OFF], use64);
    else if (td->o.file_service_type & __FIO_FSERVICE_NONUNIFORM)
        init_rand_file_service(td);

    init_rand_seed(&td->file_size_state, td->rand_seeds[FIO_RAND_FILE_SIZE_OFF], use64);
    init_rand_seed(&td->trim_state, td->rand_seeds[FIO_RAND_TRIM_OFF], use64);
    init_rand_seed(&td->delay_state, td->rand_seeds[FIO_RAND_START_DELAY], use64);
    init_rand_seed(&td->poisson_state[0], td->rand_seeds[FIO_RAND_POISSON_OFF], 0);
    init_rand_seed(&td->poisson_state[1], td->rand_seeds[FIO_RAND_POISSON2_OFF], 0);
    init_rand_seed(&td->poisson_state[2], td->rand_seeds[FIO_RAND_POISSON3_OFF], 0);
    init_rand_seed(&td->dedupe_state, td->rand_seeds[FIO_DEDUPE_OFF], false);
    init_rand_seed(&td->zone_state, td->rand_seeds[FIO_RAND_ZONE_OFF], false);

    if (!td_random(td))
        return;

    if (td->o.rand_repeatable)
        td->rand_seeds[FIO_RAND_BLOCK_OFF] = FIO_RANDSEED * td->thread_number;

    init_rand_seed(&td->random_state, td->rand_seeds[FIO_RAND_BLOCK_OFF], use64);

    for (i = 0; i < DDIR_RWDIR_CNT; i++) {
        struct frand_state *s = &td->seq_rand_state[i];

        init_rand_seed(s, td->rand_seeds[FIO_RAND_SEQ_RAND_READ_OFF], false);
    }
}

Looking for the initialization and references of the thread_number variable turns up what appears to be an explanation in the HOWTO:

$ grep -rHn thread_number
HOWTO:1245: * thread_number`, where the thread number is a counter that starts at 0 and
backend.c:64:unsigned int thread_number = 0;
[...] ret = fio_cpus_split(&o->cpumask, td->thread_number - 1);
backend.c:1899: verify_save_state(td->thread_number);
backend.c:2131: td->thread_number - 1, &data);
backend.c:2248: todo = thread_number;
backend.c:2254: print_status_init(td->thread_number - 1);
backend.c:2488: if (!thread_number)
client.c:923: pdu.thread_number = cpu_to_le32(client->thread_number);
client.c:948: dst->thread_number = le32_to_cpu(src->thread_number);
client.c:1078: if (client->opt_lists && p->ts.thread_number [...] jobs)
client.c:1079: opt_list = &client->opt_lists[p->ts.thread_number - 1];
client.c:1095: client_ts.thread_number = p->ts.thread_number;
client.c:1653: ret->thread_number = le32_to_cpu(ret->thread_number);
client.c:1832: client->thread_number = le32_to_cpu(pdu->thread_number);
client.h:60: uint32_t thread_number;
eta.c:41: char c = __run_str[td->thread_number - 1];
eta.c:118: __run_str[td->thread_number - 1] = c;
eta.c:411: eta_secs = malloc(thread_number * sizeof(uint64_t));
eta.c:412: memset(eta_secs, 0, thread_number * sizeof(uint64_t));
eta.c:530: je->nr_threads = thread_number;
eta.c:704: if (!thread_number)
filesetup.c:1203: seed = jhash(f->file_name, strlen(f->file_name), 0) * td->thread_number;
fio.1:1012:* thread_number', where the thread number is a counter that starts at 0 and
fio.h:183: unsigned int thread_number;
fio.h:509:extern unsigned int thread_number;
fio.h:701: for ((i) = 0, (td) = &threads[0]; (i) < (int) thread_number; (i)++, (td)++)
gclient.c:299: client_ts.thread_number = p->ts.thread_number;
gclient.c:578: p->thread_number = le32_to_cpu(p->thread_number);
init.c:480: if (thread_number >= max_jobs) {
init.c:486: td = &threads[thread_number++];
init.c:505: td->thread_number = thread_number;
init.c:536: memset(&threads[td->thread_number - 1], 0, sizeof(*thread));
init.c:537: thread_number--;
init.c:1054: td->rand_seeds[FIO_RAND_BLOCK_OFF] = FIO_RANDSEED * td->thread_number;
init.c:1073: td->rand_seeds[i] = FIO_RANDSEED * td->thread_number
init.c:1235: td->rand_seeds[i] = seed * td->thread_number + i;
init.c:1565: td->thread_number, suf, o->per_job_logs);
init.c:1569: td->thread_number, suf, o->per_job_logs);
init.c:1573: td->thread_number, suf, o->per_job_logs);
init.c:1605: td->thread_number, suf, o->per_job_logs);
init.c:1637: td->thread_number, suf, o->per_job_logs);
init.c:1668: td->thread_number, suf, o->per_job_logs);
init.c:3004: if (!thread_number) {
libfio.c:160: thread_number = 0;
server.c:758: spdu.jobs = cpu_to_le32(thread_number);
server.c:801: spdu.jobs = cpu_to_le32(thread_number);
server.c:842: spdu.jobs = cpu_to_le32(thread_number);
server.c:943: tnumber = le32_to_cpu(pdu->thread_number);
server.c:947: if (!tnumber || tnumber > thread_number) {
server.c:1478: p.ts.thread_number = cpu_to_le32(ts->thread_number);
server.c:1958: .thread_number = cpu_to_le32(td->thread_number),
server.c:2029: .thread_number = cpu_to_le32(td->thread_number),
server.h:172: uint32_t thread_number;
server.h:192: uint32_t thread_number;
stat.c:1782: ts->thread_number = td->thread_number;
stat.c:1998: rt = malloc(thread_number * sizeof(unsigned long long));
stat.h:152: uint32_t thread_number;
stat.h:365:#define THREAD_RUNSTR_SZ __THREAD_RUNSTR_SZ(thread_number)
verify.c:1168: hdr->thread = td->thread_number;
verify.c:1797: fd = open_state_file(td->o.name, prefix, td->thread_number - 1, 0);

The HOWTO shows that this variable is a per-job counter. Our tests always ran a single job, so the value is the same in every run. (The HOWTO describes the counter as starting at 0; in get_new_job in init.c, td->thread_number is assigned after the global counter has been incremented, so the first job appears to get 1. Either way, it is a constant across runs.)

$ grep -rHn thread_number HOWTO -C5
HOWTO-1240- offset is aligned to the minimum block size.
HOWTO-1241-
HOWTO-1242-.. option:: offset_increment=int
HOWTO-1243-
HOWTO-1244- If this is provided, then the real offset becomes `offset + offset_increment
HOWTO:1245: * thread_number`, where the thread number is a counter that starts at 0 and
HOWTO-1246- is incremented for each sub-job (i.e. when :option:`numjobs` option is
HOWTO-1247- specified). This option is useful if there are several jobs which are
HOWTO-1248- intended to operate on a file in parallel disjoint segments, with even
HOWTO-1249- spacing between the starting points.
HOWTO-1250-

Look up its references in init.c. The statements near line 480 seem to be the main place where this value changes:

$ grep -rHn thread_number init.c
init.c:480: if (thread_number >= max_jobs) {
init.c:486: td = &threads[thread_number++];
init.c:505: td->thread_number = thread_number;
init.c:536: memset(&threads[td->thread_number - 1], 0, sizeof(*thread));
init.c:537: thread_number--;
init.c:1054: td->rand_seeds[FIO_RAND_BLOCK_OFF] = FIO_RANDSEED * td->thread_number;
init.c:1073: td->rand_seeds[i] = FIO_RANDSEED * td->thread_number
init.c:1235: td->rand_seeds[i] = seed * td->thread_number + i;
init.c:1565: td->thread_number, suf, o->per_job_logs);
init.c:1569: td->thread_number, suf, o->per_job_logs);
init.c:1573: td->thread_number, suf, o->per_job_logs);
init.c:1605: td->thread_number, suf, o->per_job_logs);
init.c:1637: td->thread_number, suf, o->per_job_logs);
init.c:1668: td->thread_number, suf, o->per_job_logs);
init.c:3004: if (!thread_number) {

The code in init.c confirms that thread_number is incremented each time a new job is created:

// init.c
/*
 * Return a free job structure.
 */
static struct thread_data *get_new_job(bool global, struct thread_data *parent,
                                       bool preserve_eo, const char *jobname)
{
    struct thread_data *td;

    if (global)
        return &def_thread;
    if (setup_thread_area()) {
        log_err("error: failed to setup shm segment\n");
        return NULL;
    }
    if (thread_number >= max_jobs) {
        log_err("error: maximum number of jobs (%d) reached.\n", max_jobs);
        return NULL;
    }

    td = &threads[thread_number++];
    *td = *parent;

    INIT_FLIST_HEAD(&td->opt_list);
    if (parent != &def_thread)
        copy_opt_list(td, parent);

    td->io_ops = NULL;
    td->io_ops_init = 0;
    if (!preserve_eo)
        td->eo = NULL;

    td->o.uid = td->o.gid = -1U;

    dup_files(td, parent);
    fio_options_mem_dupe(td);

    profile_add_hooks(td);

    td->thread_number = thread_number;
    td->subjob_number = 0;

    if (jobname)
        td->o.name = strdup(jobname);

    if (!parent->o.group_reporting || parent == &def_thread)
        stat_number++;

    return td;
}

Final summary

The values of the whole random sequence depend only on the sequence number of the job (thread_number), which determines the seed, and on the position within the sequence, so the sequence is indeed pseudo-random. The seed is derived this way when rand_repeatable is set, as the td_fill_rand_seeds_internal code shows. In the earlier debug=random output, the first number on each line is the PID and the second is the generated random value.

First, randwrite writes data into the file system. If the file system is empty and no other files are written during the test, the writes are logically random, but the file system assigns them to contiguous physical sectors in the order they arrive. (The write operations themselves are not merged by the block device scheduler, so the number of random write I/Os is not reduced.)

Then, when randread is tested, the pseudo-random generator produces the same logical sequence as it did for randwrite. By the time these requests pass through the file system to the block device, they address physically contiguous sectors, so the block device scheduler merges them into sequential reads. Fewer I/Os actually reach the device, and the measured IOPS is too high. (Why the writes are not merged while the reads are depends on the file system and scheduler algorithms: read and write merging are governed by different parameters, such as wait time and queue length.)
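The effect of merging on the number of device I/Os can be sketched with a toy counter. This is not how any real I/O scheduler is implemented: `count_device_requests` is a hypothetical function that merges a request into the previous one only when it addresses the next physical block, and `max_merge` is a made-up cap on how many blocks one merged request may cover.

```python
def count_device_requests(physical_blocks, max_merge=32):
    """Count requests left after merging runs of adjacent blocks (toy model)."""
    requests = 0
    run_length = 0
    previous = None
    for block in physical_blocks:
        adjacent = previous is not None and block == previous + 1
        if adjacent and run_length < max_merge:
            run_length += 1      # merged into the current request
        else:
            requests += 1        # starts a new request
            run_length = 1
        previous = block
    return requests


# 128 reads that are logically random but physically sequential.
sequential_on_disk = list(range(128))
# 4 reads that are scattered on disk.
scattered_on_disk = [0, 50, 3, 99]

print(count_device_requests(sequential_on_disk))  # 4
print(count_device_requests(scattered_on_disk))   # 4
```

In this model, 128 submitted reads turn into 4 device requests when they are physically sequential, while the 4 scattered reads stay at 4. fio counts the I/Os it submitted, so a benchmark in the first situation reports far more IOPS than the device performs as random I/O.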

Before running a random read test with fio, the test file should first be initialized with a sequential write, using dd or fio. The write order then differs from the read order, and the problem is avoided.
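One way to script this two-step procedure is to build both fio command lines up front, as in the sketch below. The helper `fio_command`, the job name `Seq_Prefill`, and the 1M block size for the prefill pass are illustrative choices, not something the tests above used; the remaining options mirror the commands from the beginning of the article, written in double-dash form. The sketch only prints the commands and does not run them.

```python
import shlex


def fio_command(name, rw, bs, extra=()):
    """Build a fio argument list with the article's base options."""
    args = [
        "fio", "--direct=1", "--ioengine=libaio", "--size=1G",
        "--numjobs=1", "--group_reporting", "--filename=iotest",
        f"--name={name}", f"--rw={rw}", f"--bs={bs}",
    ]
    args.extend(extra)
    return args


# Step 1: lay the file out with a sequential write (no runtime limit,
# so the whole file is written).
prefill = fio_command("Seq_Prefill", "write", "1M")

# Step 2: the random read measurement itself.
randread = fio_command(
    "Rand_Read_Testing", "randread", "4k",
    extra=["--iodepth=128", "--runtime=10"],
)

for command in (prefill, randread):
    print(shlex.join(command))
```

fio also has a randrepeat option, which is what sets rand_repeatable in the source analyzed above; passing `--randrepeat=0` through `extra` would be another way to keep the read sequence from matching an earlier write sequence, although that variant was not tested here.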

As for random write tests, it is best to first create, delete, read, and write a series of files of irregular sizes on the file system, so that its sector allocation is no longer contiguous and the test results are more reliable.
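A rough sketch of such a warm-up step follows. The function `age_directory` and all of its parameters are hypothetical; it simply writes files of irregular sizes and deletes some of them at random, which leaves holes in the allocation. How much fragmentation this produces depends on the file system, and a real warm-up would use far larger sizes and more rounds.

```python
import os
import random
import tempfile


def age_directory(directory, rounds=20, max_size=64 * 1024, seed=0):
    """Create files of irregular sizes and delete some of them at random.

    Returns the paths of the files that are left behind.
    """
    rng = random.Random(seed)
    alive = []
    for index in range(rounds):
        path = os.path.join(directory, f"aging_{index}.tmp")
        with open(path, "wb") as handle:
            handle.write(os.urandom(rng.randint(1, max_size)))
        alive.append(path)
        # Randomly free an earlier file so later writes fill the gaps.
        if len(alive) > 1 and rng.random() < 0.5:
            victim = alive.pop(rng.randrange(len(alive)))
            os.remove(victim)
    return alive


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as scratch:
    remaining = age_directory(scratch, rounds=10, max_size=4096)
    print(len(remaining), "files left after aging")
```

For an actual test, `directory` would be a path on the file system under test, and the leftover files would stay in place while fio runs.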

That concludes the analysis of why the IOPS value of a fio random read test may be too high. Thank you for reading!
