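Tracking Down an XFS Assertion Failure Caused by a Compiler Optimization

Recently, on our company's ARM machines, XFS would occasionally hit an assertion failure when a new file was created with vim in a directory where files were already being created rapidly. The failure triggers a kernel oops and hangs the vim process. The kernel itself stays alive, but XFS becomes unusable: any process that performs a file operation on it hangs.

After two weeks of investigation, the cause turned out to be a bug in the gcc 4.9.0 cross-compiler supplied by Amlogic, which miscompiles a loop when optimizing. The investigation is recorded below.

(1) The kernel log (dmesg) is as follows: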

[  673.286664] XFS: Assertion failed: index == 0 || be32_to_cpu(ents[index - 1].hashval) <= args->hashval, file: fs/xfs/xfs_dir2_node.c, line: 471
[  673.299510] ------------[ cut here ]------------
[  673.303979] kernel BUG at fs/xfs/xfs_message.c:108!
[  673.308977] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[  673.314928] Modules linked in: ufsd(PO) jnl(O) mali aml_nftl_dev(P)
[  673.321315] CPU: 2 PID: 8641 Comm: vim Tainted: P           O 3.10.33 #3
[  673.328123] task: e0a4af80 ti: df992000 task.ti: df992000
[  673.333659] PC is at assfail+0x24/0x28
[  673.337523] LR is at assfail+0x24/0x28
[  673.341404] pc : [<c029e394>]    lr : [<c029e394>]    psr: 40000013
               sp : df993c50  ip : 00000000  fp : 00000000
..........................................................................
[  673.371765] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  674.354687] [<c029e394>] (assfail+0x24/0x28) from [<c02e1098>] (xfs_dir2_leafn_add+0x198/0x3bc)
[  674.363477] [<c02e1098>] (xfs_dir2_leafn_add+0x198/0x3bc) from [<c02e3804>] (xfs_dir2_node_addname+0x154/0x200)
[  674.373654] [<c02e3804>] (xfs_dir2_node_addname+0x154/0x200) from [<c02d8270>] (xfs_dir_createname+0x16c/0x174)
[  674.383831] [<c02d8270>] (xfs_dir_createname+0x16c/0x174) from [<c02a2c5c>] (xfs_create+0x3a8/0x660)
[  674.393058] [<c02a2c5c>] (xfs_create+0x3a8/0x660) from [<c029bb10>] (xfs_vn_mknod+0x140/0x1f4)
[  674.401772] [<c029bb10>] (xfs_vn_mknod+0x140/0x1f4) from [<c011a59c>] (vfs_create+0x94/0xd8)
[  674.410390] [<c011a59c>] (vfs_create+0x94/0xd8) from [<c011c11c>] (do_last.isra.29+0x604/0xc90)
[  674.419117] [<c011c11c>] (do_last.isra.29+0x604/0xc90) from [<c011c854>] (path_openat.isra.30+0xac/0x49c)
[  674.428774] [<c011c854>] (path_openat.isra.30+0xac/0x49c) from [<c011d994>] (do_filp_open+0x2c/0x78)
[  674.438011] [<c011d994>] (do_filp_open+0x2c/0x78) from [<c010f018>] (do_sys_open+0xe8/0x170)
[  674.446546] [<c010f018>] (do_sys_open+0xe8/0x170) from [<c000e040>] (ret_fast_syscall+0x0/0x30)

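The log shows the classic file-creation path, from the VFS down to the concrete filesystem, XFS.

Creating a file in a directory generally works as follows:

1. Starting from the directory's path string, resolve the path components one at a time from the top to find the directory's inode number.

2. Use the directory's inode number to read the inode's data area.

3. Write the new file's inode number, name, name length and file type (directory, regular file, symbolic link, device file) into the data area of the containing directory.

(2) Log analysis

According to the log, the failure is at an ASSERT in xfs_dir2_leafn_add().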

 466     ASSERT(index == 0 || be32_to_cpu(ents[index - 1].hashval) <= args->hashval);
 467     ASSERT(index == leafhdr.count ||
 468            be32_to_cpu(ents[index].hashval) >= args->hashval);

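As shown above, and judging by the message, the failure is at the ASSERT on line 466 of xfs_dir2_leafn_add() (the log reports line 471; the listing here numbers it 466). The assertion is a logical OR. Added debug prints confirmed that the bug is triggered because be32_to_cpu(ents[index - 1].hashval) <= args->hashval does not hold, that is, be32_to_cpu(ents[index - 1].hashval) > args->hashval.

The source of xfs_dir2_leafn_add() is as follows: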

 418 static int                  /* error */
 419 xfs_dir2_leafn_add(
 420     struct xfs_buf      *bp,        /* leaf buffer */
 421     xfs_da_args_t       *args,      /* operation arguments */
 422     int         index)      /* insertion pt for new entry */
 423 {
 424     int         compact;    /* compacting stale leaves */
 425     xfs_inode_t     *dp;        /* incore directory inode */
 426     int         highstale;  /* next stale entry */
 427     xfs_dir2_leaf_t     *leaf;      /* leaf structure */
 428     xfs_dir2_leaf_entry_t   *lep;       /* leaf entry */
 429     int         lfloghigh;  /* high leaf entry logging */
 430     int         lfloglow;   /* low leaf entry logging */
 431     int         lowstale;   /* previous stale entry */
 432     xfs_mount_t     *mp;        /* filesystem mount point */
 433     xfs_trans_t     *tp;        /* transaction pointer */
 434     struct xfs_dir3_icleaf_hdr leafhdr;
 435     struct xfs_dir2_leaf_entry *ents;
 436 
 437     trace_xfs_dir2_leafn_add(args, index);
 438 
 439     dp = args->dp;
 440     mp = dp->i_mount;
 441     tp = args->trans;
 442     leaf = bp->b_addr;
 443     xfs_dir3_leaf_hdr_from_disk(&leafhdr, leaf);
 444     ents = xfs_dir3_leaf_ents_p(leaf);
 445 
 446     /*
 447      * Quick check just to make sure we are not going to index
 448      * into other peoples memory
 449      */
 450     if (index < 0)
 451         return XFS_ERROR(EFSCORRUPTED);
 452 
 453     /*
 454      * If there are already the maximum number of leaf entries in
 455      * the block, if there are no stale entries it won't fit.
 456      * Caller will do a split.  If there are stale entries we'll do
 457      * a compact.
 458      */
 459 
 460     if (leafhdr.count == xfs_dir3_max_leaf_ents(mp, leaf)) {
 461         if (!leafhdr.stale)
 462             return XFS_ERROR(ENOSPC);
 463         compact = leafhdr.stale > 1;
 464     } else
 465         compact = 0;
 466     ASSERT(index == 0 || be32_to_cpu(ents[index - 1].hashval) <= args->hashval);
 467     ASSERT(index == leafhdr.count ||
 468            be32_to_cpu(ents[index].hashval) >= args->hashval);
 469 
 470     if (args->op_flags & XFS_DA_OP_JUSTCHECK)
 471         return 0;
 472 
 473     /*
 474      * Compact out all but one stale leaf entry.  Leaves behind
 475      * the entry closest to index.
 476      */
 477     if (compact)
 478         xfs_dir3_leaf_compact_x1(&leafhdr, ents, &index, &lowstale,
 479                      &highstale, &lfloglow, &lfloghigh);
 480     else if (leafhdr.stale) {
 481         /*
 482          * Set impossible logging indices for this case.
 483          */
 484         lfloglow = leafhdr.count;
 485         lfloghigh = -1;
 486     }
 487 
 488     /*
 489      * Insert the new entry, log everything.
 490      */
 491     lep = xfs_dir3_leaf_find_entry(&leafhdr, ents, index, compact, lowstale,
 492                        highstale, &lfloglow, &lfloghigh);
 493 
 494     lep->hashval = cpu_to_be32(args->hashval);
 495     lep->address = cpu_to_be32(xfs_dir2_db_off_to_dataptr(mp,
 496                 args->blkno, args->index));
 497 
 498     xfs_dir3_leaf_hdr_to_disk(leaf, &leafhdr);
 499     xfs_dir3_leaf_log_header(tp, bp);
 500     xfs_dir3_leaf_log_ents(tp, bp, lfloglow, lfloghigh);
 501     xfs_dir3_leaf_check(mp, bp);
 502     return 0;
 503 }

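This function takes the insertion index found by the caller (its index parameter) and inserts the new file's hash/address pair at that position in ents, the directory's hash array.

Adding a file to an XFS directory involves two things:

1. Write the directory entry (the new file's name, namelen, inode number, file type and so on) into free space in the directory's data area.

2. Compute a hash from the new file's name and namelen, use that hash to find the right insertion position index in the parent directory's ents array, and insert the hash/address pair there.

The ents array is sorted by hash in ascending order. On a hash collision the insertion position moves toward the end of the array until the hashes differ. Each entry of ents is a hash/address pair; the address points to where the directory entry written in step 1 lives. It is an on-disk address made up of a disk block and an offset within that block.

When a directory holds many files, looking up one of them is still fast: a binary search of ents by the file's hash finds the hash/address pair, and the address is then used to read the file's name and inode number from disk. If the search runs into a hash collision, the names must also be compared to confirm the match.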
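The lookup described above can be sketched in plain C. This is only an illustration under assumptions, not the kernel's xfs_dir2_leaf_search_hash(): struct leaf_entry, first_index_with_hash() and insertion_index() are hypothetical names, and the entries are held in CPU byte order rather than big-endian as on disk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory mirror of a leaf entry, already in CPU byte order. */
struct leaf_entry {
    uint32_t hashval;   /* hash of the file name */
    uint32_t address;   /* location of the directory entry in the data area */
};

/* Binary search: index of the first entry whose hash is >= want (count if none). */
static int first_index_with_hash(const struct leaf_entry *ents, int count,
                                 uint32_t want)
{
    int low = 0;
    int high = count;

    assert(count >= 0);
    while (low < high) {
        int mid = low + (high - low) / 2;

        if (ents[mid].hashval < want)
            low = mid + 1;
        else
            high = mid;
    }
    return low;
}

/* On a collision, step past every entry that shares the hash. */
static int insertion_index(const struct leaf_entry *ents, int count,
                           uint32_t want)
{
    int index = first_index_with_hash(ents, count, want);

    while (index < count && ents[index].hashval == want)
        index++;
    return index;
}
```

For an array with hashes 10, 20, 20, 30, searching for 20 lands on index 1 and the insertion position for another 20 is index 3, just past the last equal hash.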

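xfs_dir2_leafn_add() works as follows:

1. Sanity-check the insertion index found by the caller. Since the array is sorted by hash in ascending order, the hash of the file being inserted must be >= the hash of ents[index - 1]. (This is the assert that fails.)

2. Compact the ents hash array. For example, when an entry is found whose file has already been deleted, all later entries are copied forward to overwrite the stale hash/address pair.

3. Because the ents array has been compacted, the earlier index may no longer be the right insertion position, so work out the correct index and insert the hash/address pair there.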
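Steps 1 and 3 can be illustrated with a small C sketch. It is a simplified model under assumptions: struct leaf_entry, index_keeps_order() and insert_entry() are hypothetical names, entries are in CPU byte order, and the stale-entry compaction of step 2 is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory mirror of a leaf entry, already in CPU byte order. */
struct leaf_entry {
    uint32_t hashval;
    uint32_t address;
};

/* The two ordering checks from step 1: returns 1 when inserting hashval
 * at index keeps the array sorted in ascending order. */
static int index_keeps_order(const struct leaf_entry *ents, int count,
                             int index, uint32_t hashval)
{
    int left_ok  = (index == 0)     || ents[index - 1].hashval <= hashval;
    int right_ok = (index == count) || ents[index].hashval >= hashval;

    return left_ok && right_ok;
}

/* Insert a hash/address pair at index and return the new entry count.
 * The caller must provide room for count + 1 entries. */
static int insert_entry(struct leaf_entry *ents, int count, int index,
                        uint32_t hashval, uint32_t address)
{
    assert(index >= 0 && index <= count);
    assert(index_keeps_order(ents, count, index, hashval));

    /* Shift the tail up by one slot to open a gap at index. */
    memmove(&ents[index + 1], &ents[index],
            (size_t)(count - index) * sizeof(ents[0]));
    ents[index].hashval = hashval;
    ents[index].address = address;
    return count + 1;
}
```

With hashes 10, 20, 30, an index of 3 for hash 20 fails the left-hand check because ents[2].hashval is 30. That is the same shape of failure as the assertion in the log: an index that has run too far toward the end of the array.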

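Since it is step 1 that asserts, the first suspect is the source of the index value.

Tracing the XFS file-creation path shows that the insertion index is computed in xfs_dir2_leafn_lookup_for_addname().

(3) The xfs_dir2_leafn_lookup_for_addname() function

This function mainly does the following:

1. Call xfs_dir2_leaf_search_hash() to binary-search for the index at which the hash should be inserted.

2. While the entry at index has the same hash as the file being inserted, move the insertion position toward the end of the array.

3. Find the data block that holds free space in the directory, into which the new file's directory entry will be written.

Debugging showed that the problem lies in the for loop of step 2.

The for loop is: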

 607      * Loop over leaf entries with the right hash value.
 608      */
 609     for (lep = &ents[index];
 610          index < leafhdr.count && be32_to_cpu(lep->hashval) == args->hashval;
 611          lep++, index++) {
 612         /*
 613          * Skip stale leaf entries.
 614          */
 615         if (be32_to_cpu(lep->address) == XFS_DIR2_NULL_DATAPTR)
 616             continue;

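In the code above:

lep: pointer to an array entry;

ents: the hash array;

index: the insertion position found by the binary search;

leafhdr.count: the number of entries in use in the array (including those of files already deleted);

args->hashval: the hash of the name of the file being created.

The concrete symptom: with a print added inside the for loop (before the continue), be32_to_cpu(lep->hashval) and args->hashval turn out to be unequal when the assert fails. (If they are unequal, how did execution get into the loop body?)

The first thought was a race that corrupts the data, or an out-of-bounds write that tramples memory.

Operations on a directory take the directory's mutex, which guarantees exclusive access. In addition, prints added before and inside the for loop showed that the data had not been corrupted.

That rules out both concurrency and out-of-bounds memory corruption.

So at this point the suspicion shifted to a problem introduced by compiler optimization.

(4) Disassembly of xfs_dir2_leafn_lookup_for_addname()

First, the function's source: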

 556 xfs_dir2_leafn_lookup_for_addname(
 557     struct xfs_buf      *bp,        /* leaf buffer */
 558     xfs_da_args_t       *args,      /* operation arguments */
 559     int         *indexp,    /* out: leaf entry index */
 560     xfs_da_state_t      *state)     /* state to fill in */
 561 {
 562     struct xfs_buf      *curbp = NULL;  /* current data/free buffer */
 563     xfs_dir2_db_t       curdb = -1; /* current data block number */
 564     xfs_dir2_db_t       curfdb = -1;    /* current free block number */
 565     xfs_inode_t     *dp;        /* incore directory inode */
 566     int         error;      /* error return value */
 567     int         fi;     /* free entry index */
 568     xfs_dir2_free_t     *free = NULL;   /* free block structure */
 569     int         index;      /* leaf entry index */
 570     xfs_dir2_leaf_t     *leaf;      /* leaf structure */
 571     int         length;     /* length of new data entry */
 572     xfs_dir2_leaf_entry_t   *lep;       /* leaf entry */
 573     xfs_mount_t     *mp;        /* filesystem mount point */
 574     xfs_dir2_db_t       newdb;      /* new data block number */
 575     xfs_dir2_db_t       newfdb;     /* new free block number */
 576     xfs_trans_t     *tp;        /* transaction pointer */
 577     struct xfs_dir2_leaf_entry *ents;
 578     struct xfs_dir3_icleaf_hdr leafhdr;
 579 
 580     dp = args->dp;
 581     tp = args->trans;
 582     mp = dp->i_mount;
 583     leaf = bp->b_addr;
 584     xfs_dir3_leaf_hdr_from_disk(&leafhdr, leaf);
 585     ents = xfs_dir3_leaf_ents_p(leaf);
 586 
 587     xfs_dir3_leaf_check(mp, bp);
 588     ASSERT(leafhdr.count > 0);
 589 
 590     /*
 591      * Look up the hash value in the leaf entries.
 592      */
 593     index = xfs_dir2_leaf_search_hash(args, bp);
 594     /*
 595      * Do we have a buffer coming in?
 596      */
 597     if (state->extravalid) {
 598         /* If so, it's a free block buffer, get the block number. */
 599         curbp = state->extrablk.bp;
 600         curfdb = state->extrablk.blkno;
 601         free = curbp->b_addr;
 602         ASSERT(free->hdr.magic == cpu_to_be32(XFS_DIR2_FREE_MAGIC) ||
 603                free->hdr.magic == cpu_to_be32(XFS_DIR3_FREE_MAGIC));
 604     }
 605     length = xfs_dir2_data_entsize(args->namelen);
 606     /*
 607      * Loop over leaf entries with the right hash value.
 608      */
 609     for (lep = &ents[index];
 610          index < leafhdr.count && be32_to_cpu(lep->hashval) == args->hashval;
 611          lep++, index++) {
 612         /*
 613          * Skip stale leaf entries.
 614          */
 615         if (be32_to_cpu(lep->address) == XFS_DIR2_NULL_DATAPTR)
 616             continue;
 617         /*
 618          * Pull the data block number from the entry.
 619          */
 620         newdb = xfs_dir2_dataptr_to_db(mp, be32_to_cpu(lep->address));
 621         /*
 622          * For addname, we're looking for a place to put the new entry.
 623          * We want to use a data block with an entry of equal
 624          * hash value to ours if there is one with room.
 625          *
 626          * If this block isn't the data block we already have
 627          * in hand, take a look at it.
 628          */
 629         if (newdb != curdb) {
 630             __be16 *bests;
 631 
 632             curdb = newdb;
 633             /*
 634              * Convert the data block to the free block
 635              * holding its freespace information.
 636              */
 637             newfdb = xfs_dir2_db_to_fdb(mp, newdb);
 638             /*
 639              * If it's not the one we have in hand, read it in.
 640              */
 641             if (newfdb != curfdb) {
 642                 /*
 643                  * If we had one before, drop it.
 644                  */
 645                 if (curbp)
 646                     xfs_trans_brelse(tp, curbp);
 647 
 648                 error = xfs_dir2_free_read(tp, dp,
 649                         xfs_dir2_db_to_da(mp, newfdb),
 650                         &curbp);
 651                 if (error)
 652                     return error;
 653                 free = curbp->b_addr;
 654 
 655                 xfs_dir2_free_hdr_check(mp, curbp, curdb);
 656             }
 657             /*
 658              * Get the index for our entry.
 659              */
 660             fi = xfs_dir2_db_to_fdindex(mp, curdb);
 661             /*
 662              * If it has room, return it.
 663              */
 664             bests = xfs_dir3_free_bests_p(mp, free);
 665             if (unlikely(bests[fi] == cpu_to_be16(NULLDATAOFF))) {
 666                 XFS_ERROR_REPORT("xfs_dir2_leafn_lookup_int",
 667                             XFS_ERRLEVEL_LOW, mp);
 668                 if (curfdb != newfdb)
 669                     xfs_trans_brelse(tp, curbp);
 670                 return XFS_ERROR(EFSCORRUPTED);
 671             }
 672             curfdb = newfdb;
 673             if (be16_to_cpu(bests[fi]) >= length)
 674                 goto out;
 675         }
 676     }
 677     /* Didn't find any space */
 678     fi = -1;
 679 out:
 680     ASSERT(args->op_flags & XFS_DA_OP_OKNOENT);
 681     if (curbp) {
 682         /* Giving back a free block. */
 683         state->extravalid = 1;
 684         state->extrablk.bp = curbp;
 685         state->extrablk.index = fi;
 686         state->extrablk.blkno = curfdb;
 687 
 688         /*
 689          * Important: this magic number is not in the buffer - it's for
 690          * buffer type information and therefore only the free/data type
 691          * matters here, not whether CRCs are enabled or not.
 692          */
 693         state->extrablk.magic = XFS_DIR2_FREE_MAGIC;
 694     } else {
 695         state->extravalid = 0;
 696     }
 697     /*
 698      * Return the index, that will be the insertion point.
 699      */
 700     *indexp = index;
 701     return XFS_ERROR(ENOENT);
 702 }
                                                                                                                                                                              

Run arm-linux-gnueabihf-gdb vmlinux and, inside gdb, run disassemble xfs_dir2_leafn_lookup_for_addname to view the disassembly.

The code related to the for loop is:

 78    0xc02e2390 <+300>:   ldrh    r2, [sp, #66]   ; 0x42
 79    0xc02e2394 <+304>:   lsl r4, r7, #3
 80    0xc02e2398 <+308>:   ldr r3, [r11, #4]
 81    0xc02e239c <+312>:   cmp r7, r2
 82    0xc02e23a0 <+316>:   add r3, r3, #18
 83    0xc02e23a4 <+320>:   bic r3, r3, #7
 84    0xc02e23a8 <+324>:   str r3, [sp, #24]
 85    0xc02e23ac <+328>:   bge 0xc02e25fc <xfs_dir2_leafn_lookup_for_addname+920>
 86    0xc02e23b0 <+332>:   ldr r2, [r11, #20]
 87    0xc02e23b4 <+336>:   add r4, r4, #8
 88    0xc02e23b8 <+340>:   add r4, r5, r4
 89    0xc02e23bc <+344>:   ldr r3, [r4, #-8]
 90    0xc02e23c0 <+348>:   rev r3, r3
 91    0xc02e23c4 <+352>:   cmp r2, r3
 92    0xc02e23c8 <+356>:   bne 0xc02e25fc <xfs_dir2_leafn_lookup_for_addname+920>
 93    0xc02e23cc <+360>:   mvn r0, #0
 94    0xc02e23d0 <+364>:   str r11, [sp, #44]  ; 0x2c
 95    0xc02e23d4 <+368>:   ldr r3, [r4, #-4]
 96    0xc02e23d8 <+372>:   rev r3, r3
 97    0xc02e23dc <+376>:   cmp r3, #0
 98    0xc02e23e0 <+380>:   beq 0xc02e25dc <xfs_dir2_leafn_lookup_for_addname+888>
218    0xc02e25c0 <+860>:   ldr r1, [sp, #24]
219    0xc02e25c4 <+864>:   rev16   r2, r2
220    0xc02e25c8 <+868>:   uxth    r2, r2
221    0xc02e25cc <+872>:   cmp r2, r1
222    0xc02e25d0 <+876>:   bge 0xc02e2604 <xfs_dir2_leafn_lookup_for_addname+928>
223    0xc02e25d4 <+880>:   mov r10, r5
224    0xc02e25d8 <+884>:   mov r0, r6
225    0xc02e25dc <+888>:   ldrh    r3, [sp, #66]   ; 0x42
226    0xc02e25e0 <+892>:   add r7, r7, #1
227    0xc02e25e4 <+896>:   add r4, r4, #8
228    0xc02e25e8 <+900>:   cmp r3, r7
229    0xc02e25ec <+904>:   bgt 0xc02e23d4 <xfs_dir2_leafn_lookup_for_addname+368>
230    0xc02e25f0 <+908>:   ldr r11, [sp, #44]  ; 0x2c
231    0xc02e25f4 <+912>:   mvn r3, #0
232    0xc02e25f8 <+916>:   b   0xc02e260c <xfs_dir2_leafn_lookup_for_addname+936>
233    0xc02e25fc <+920>:   mvn r3, #0
234    0xc02e2600 <+924>:   b   0xc02e260c <xfs_dir2_leafn_lookup_for_addname+936>

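In the listing above:

Line 78: r2 is loaded from address sp + 66, a reference to one of the function's local variables; here it corresponds to leafhdr.count.

Line 79: r7 holds index and r4 becomes 8 * index; r4 is used later.

Line 81: r7 is index; this line compares index with leafhdr.count.

Line 85: if index >= leafhdr.count, branch to 0xc02e25fc <xfs_dir2_leafn_lookup_for_addname+920>.

Use gdb to see which source line 0xc02e25fc corresponds to.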

(gdb) l *(0xc02e25fc)
0xc02e25fc is in xfs_dir2_leafn_lookup_for_addname (fs/xfs/xfs_dir2_node.c:678).
673				if (be16_to_cpu(bests[fi]) >= length)
674					goto out;
675			}
676		}
677		/* Didn't find any space */
678		fi = -1;
679	out:
680		ASSERT(args->op_flags & XFS_DA_OP_OKNOENT);
681		if (curbp) {
682			/* Giving back a free block. */
(gdb) 

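Address 0xc02e25fc corresponds to line 678, the first line after the for loop in xfs_dir2_leafn_lookup_for_addname(), so the branch leaves the loop.

Line 86: r11 holds args (the function's second parameter). The instruction loads the u32 at offset 20 from args into r2. The args structure is: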

176 typedef struct xfs_da_args {
177     const __uint8_t *name;      /* string (maybe not NULL terminated) */
178     int     namelen;    /* length of string (maybe no NULL) */
179     __uint8_t   *value;     /* set of bytes (maybe contain NULLs) */
180     int     valuelen;   /* length of value */
181     int     flags;      /* argument flags (eg: ATTR_NOCREATE) */
182     xfs_dahash_t    hashval;    /* hash value of name */
183     xfs_ino_t   inumber;    /* input/output inode number */
184     struct xfs_inode *dp;       /* directory inode to manipulate */
185     xfs_fsblock_t   *firstblock;    /* ptr to firstblock for bmap calls */
186     struct xfs_bmap_free *flist;    /* ptr to freelist for bmap_finish */
187     struct xfs_trans *trans;    /* current trans (changes over time) */
188     xfs_extlen_t    total;      /* total blocks needed, for 1st bmap */
189     int     whichfork;  /* data or attribute fork */
190     xfs_dablk_t blkno;      /* blkno of attr leaf of interest */
191     int     index;      /* index of attr of interest in blk */
192     xfs_dablk_t rmtblkno;   /* remote attr value starting blkno */
193     int     rmtblkcnt;  /* remote attr value block count */
194     xfs_dablk_t blkno2;     /* blkno of 2nd attr leaf of interest */
195     int     index2;     /* index of 2nd attr in blk */
196     xfs_dablk_t rmtblkno2;  /* remote attr value starting blkno */
197     int     rmtblkcnt2; /* remote attr value block count */
198     int     op_flags;   /* operation flags */
199     enum xfs_dacmp  cmpresult;  /* name compare result for lookups */
200 } xfs_da_args_t;

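On 32-bit ARM the five fields before hashval (two pointers and three ints) take 4 bytes each, so offset 20 is hashval and r2 holds args->hashval.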
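The offset can be checked with offsetof on a mirror of the leading fields. This is a sketch under assumptions: struct da_args_arm32 and hashval_offset() are hypothetical, the two pointers are modelled as uint32_t so the layout matches 32-bit ARM even when compiled on a 64-bit host, and xfs_dahash_t is taken to be a 32-bit unsigned integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the first six fields of xfs_da_args as laid out
 * on 32-bit ARM. Pointers are modelled as uint32_t (4 bytes each). */
struct da_args_arm32 {
    uint32_t name;      /* const __uint8_t *name */
    int32_t  namelen;   /* int namelen */
    uint32_t value;     /* __uint8_t *value */
    int32_t  valuelen;  /* int valuelen */
    int32_t  flags;     /* int flags */
    uint32_t hashval;   /* xfs_dahash_t hashval */
};

/* Byte offset of hashval within the mirrored structure. */
static size_t hashval_offset(void)
{
    size_t off = offsetof(struct da_args_arm32, hashval);

    /* Five 4-byte fields precede hashval. */
    assert(off == 5 * sizeof(uint32_t));
    return off;
}
```

hashval_offset() yields 20, matching the #20 in the ldr r2, [r11, #20] instruction.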

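Line 88: r5 is the base address of the ents array and r4 was set to 8 * index on line 79. index and lep are incremented together in the loop, but each entry that lep points to is 8 bytes, so lep's byte offset from the start of ents is 8 * index. Together with the add on line 87, r4 now holds the address of the current lep plus 8. The lep structure is: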

492 /*   
493  * Leaf block entry.
494  */  
495 typedef struct xfs_dir2_leaf_entry {
496     __be32          hashval;    /* hash value of name */
497     __be32          address;    /* address of data entry */
498 } xfs_dir2_leaf_entry_t;
499 
500 /*   

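Line 89: read this together with line 87. Line 87 added 8 to r4 and this load uses r4 - 8, so it fetches the first four bytes of the current lep into r3; r3 is lep->hashval.

Line 90: rev reverses the four bytes of a word and corresponds to be32_to_cpu; r3 is now be32_to_cpu(lep->hashval).

Line 91: compare r2 with r3, that is, args->hashval with be32_to_cpu(lep->hashval).

Line 92: if they differ, branch to 0xc02e25fc which, as noted earlier, is the first line after the for loop, so the loop is exited.

Line 95: again with line 87 in mind, r3 is loaded from r4 - 4, that is, the four bytes at lep + 4, which is lep->address.

Line 96: r3 is be32_to_cpu(lep->address).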
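The address arithmetic and byte swapping in lines 79 to 96 can be mimicked in C. The sketch below is an illustration under assumptions: rev32(), load32() and decode_entry() are hypothetical helpers, and the host is assumed to be little-endian like the ARM target, so that a byte swap is what turns the big-endian on-disk value into a CPU value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* What rev does: reverse the four bytes of a 32-bit word. */
static uint32_t rev32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* What ldr does: read a 32-bit word in the CPU's own byte order. */
static uint32_t load32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

struct decoded_entry {
    uint32_t hashval;
    uint32_t address;
};

/* Decode the 8-byte entry at index the way the generated code does:
 * keep a pointer biased by +8 and load at negative offsets. */
static struct decoded_entry decode_entry(const unsigned char *ents, int index)
{
    struct decoded_entry d;
    const unsigned char *r4;

    assert(index >= 0);
    r4 = ents + 8 * index + 8;          /* assembly lines 79, 87, 88 */
    d.hashval = rev32(load32(r4 - 8));  /* lines 89-90: lep->hashval */
    d.address = rev32(load32(r4 - 4));  /* lines 95-96: lep->address */
    return d;
}
```

Keeping the pointer biased by 8 changes nothing semantically: the loads at -8 and -4 still hit bytes 0 to 3 and 4 to 7 of the current entry.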

Line 97: corresponds to these lines of the function:

 615         if (be32_to_cpu(lep->address) == XFS_DIR2_NULL_DATAPTR)
 616             continue;

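The XFS_DIR2_NULL_DATAPTR macro is 0.

If they are equal, branch to 0xc02e25dc, which is line 225 of the assembly. What follows is the path taken by continue.

Line 225: load leafhdr.count into r3.

Line 226: index++;

Line 227: lep++ (the pointer advances by 8 bytes);

Line 228: compare leafhdr.count with index.

Line 229: if (leafhdr.count > index), i.e. if (index < leafhdr.count), branch to 0xc02e23d4 and continue the loop.

Address 0xc02e23d4 is line 95 of the assembly, so execution resumes at: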

 615         if (be32_to_cpu(lep->address) == XFS_DIR2_NULL_DATAPTR)
 616             continue;

Note that when the loop goes round for the next iteration, only if (index < leafhdr.count) is checked; be32_to_cpu(lep->hashval) == args->hashval is not.

 609     for (lep = &ents[index];
 610          index < leafhdr.count && be32_to_cpu(lep->hashval) == args->hashval;
 611          lep++, index++) {
 612         /*
 613          * Skip stale leaf entries.
 614          */
 615         if (be32_to_cpu(lep->address) == XFS_DIR2_NULL_DATAPTR)
 616             continue;

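As shown above, on the continue path the expression after && has been optimized away. The normal path that does not take continue can be analyzed the same way: at the end of every iteration execution still reaches line 225 of the assembly, so the expression after && still has no effect. On the first pass through the for loop, however, the expression after && is evaluated; it was not optimized away there.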
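To make the difference concrete, the two control flows can be written out as C. This is a simplified model, not the kernel code: struct leaf_entry, scan_as_written() and scan_as_miscompiled() are hypothetical names, entries are in CPU byte order, and the free-space search in the loop body (including its goto out exit) is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory mirror of a leaf entry, already in CPU byte order. */
struct leaf_entry {
    uint32_t hashval;
    uint32_t address;   /* 0 marks a stale entry */
};

/* The loop as written in the source: both conditions are tested on
 * every iteration. */
static int scan_as_written(const struct leaf_entry *ents, int count,
                           int index, uint32_t hashval)
{
    const struct leaf_entry *lep;

    assert(index >= 0);
    for (lep = &ents[index];
         index < count && lep->hashval == hashval;
         lep++, index++) {
        if (lep->address == 0)
            continue;   /* skip stale entries */
        /* free-space search omitted */
    }
    return index;
}

/* The control flow seen in the old toolchain's disassembly: the hash
 * comparison happens only before the first iteration; afterwards only
 * index < count is tested. */
static int scan_as_miscompiled(const struct leaf_entry *ents, int count,
                               int index, uint32_t hashval)
{
    const struct leaf_entry *lep = &ents[index];

    assert(index >= 0);
    if (index >= count || lep->hashval != hashval)
        return index;
    do {
        /* loop body omitted */
        lep++;
        index++;
    } while (index < count);
    return index;
}
```

With hashes 10, 20, 30, 40 and a starting index of 1 for hash 20, the loop as written stops at index 2, while the miscompiled flow runs to index 4. ents[3].hashval is then 40, which is greater than 20, so the check on ents[index - 1] in xfs_dir2_leafn_add() fails.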

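Conclusion: the code above was built with Amlogic's arm-linux-gnueabihf-gcc 4.9, which has an optimization bug.

Later, Teacher Lin provided an arm-linux-gnueabihf-gcc 4.9.4 cross toolchain, with which the for loop is no longer optimized this way.

The following analyzes the disassembly of xfs_dir2_leafn_lookup_for_addname() produced by the newer toolchain.

(5) Comparison: disassembly of xfs_dir2_leafn_lookup_for_addname() generated by the new toolchain

The analysis focuses on the assembly for the for loop: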

 80    0xc02e0174 <+308>:   ldrh    r0, [sp, #66]   ; 0x42
 81    0xc02e0178 <+312>:   lsl r4, r8, #3
 82    0xc02e017c <+316>:   ldr r3, [r10, #4]
 83    0xc02e0180 <+320>:   cmp r8, r0
 84    0xc02e0184 <+324>:   add r3, r3, #18
 85    0xc02e0188 <+328>:   bic r3, r3, #7
 86    0xc02e018c <+332>:   str r3, [sp, #28]
 87    0xc02e0190 <+336>:   bge 0xc02e03e4 <xfs_dir2_leafn_lookup_for_addname+932>
 88    0xc02e0194 <+340>:   add r4, r4, #8
 89    0xc02e0198 <+344>:   mvn r12, #0
 90    0xc02e019c <+348>:   add r4, r5, r4
 91    0xc02e01a0 <+352>:   b   0xc02e03c4 <xfs_dir2_leafn_lookup_for_addname+900>
 92    0xc02e01a4 <+356>:   ldr r3, [r4, #-4]
 93    0xc02e01a8 <+360>:   rev r3, r3
 94    0xc02e01ac <+364>:   cmp r3, #0
 95    0xc02e01b0 <+368>:   beq 0xc02e03b4 <xfs_dir2_leafn_lookup_for_addname+884>
228    0xc02e03c4 <+900>:   ldr r2, [r10, #20]
229    0xc02e03c8 <+904>:   ldr r3, [r4, #-8]
230    0xc02e03cc <+908>:   rev r3, r3
231    0xc02e03d0 <+912>:   cmp r2, r3
232    0xc02e03d4 <+916>:   beq 0xc02e01a4 <xfs_dir2_leafn_lookup_for_addname+356>
233    0xc02e03d8 <+920>:   b   0xc02e03e4 <xfs_dir2_leafn_lookup_for_addname+932>
 92    0xc02e01a4 <+356>:   ldr r3, [r4, #-4]
 93    0xc02e01a8 <+360>:   rev r3, r3
 94    0xc02e01ac <+364>:   cmp r3, #0
 95    0xc02e01b0 <+368>:   beq 0xc02e03b4 <xfs_dir2_leafn_lookup_for_addname+884>
224    0xc02e03b4 <+884>:   add r8, r8, #1
225    0xc02e03b8 <+888>:   add r4, r4, #8
226    0xc02e03bc <+892>:   cmp r0, r8
227    0xc02e03c0 <+896>:   ble 0xc02e03e4 <xfs_dir2_leafn_lookup_for_addname+932>
228    0xc02e03c4 <+900>:   ldr r2, [r10, #20]
229    0xc02e03c8 <+904>:   ldr r3, [r4, #-8]
230    0xc02e03cc <+908>:   rev r3, r3
231    0xc02e03d0 <+912>:   cmp r2, r3
232    0xc02e03d4 <+916>:   beq 0xc02e01a4 <xfs_dir2_leafn_lookup_for_addname+356>
233    0xc02e03d8 <+920>:   b   0xc02e03e4 <xfs_dir2_leafn_lookup_for_addname+932>

Line 80: r0 is leafhdr.count.

Line 81: r8 is index.

Line 83: compare index with leafhdr.count.

Line 87: if index >= leafhdr.count, branch to 0xc02e03e4. Here is the source corresponding to 0xc02e03e4:

(gdb) l *(0xc02e03e4)
0xc02e03e4 is in xfs_dir2_leafn_lookup_for_addname (fs/xfs/xfs_dir2_node.c:678).
673				if (be16_to_cpu(bests[fi]) >= length)
674					goto out;
675			}
676		}
677		/* Didn't find any space */
678		fi = -1;
679	out:
680		ASSERT(args->op_flags & XFS_DA_OP_OKNOENT);
681		if (curbp) {
682			/* Giving back a free block. */
(gdb) 

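This is the first line after the for loop, so the branch leaves the loop.

Lines 88-91: if index < leafhdr.count, set r4 to the address of lep plus 8, then branch to 0xc02e03c4, which is line 228 of the assembly.

Lines 228-233: test whether be32_to_cpu(lep->hashval) equals args->hashval. If so, branch to 0xc02e01a4; otherwise branch to 0xc02e03e4 and end the loop.

0xc02e01a4 is line 92 of the assembly.

Lines 92-95: correspond to this C code: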

 615         if (be32_to_cpu(lep->address) == XFS_DIR2_NULL_DATAPTR)
 616             continue;

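If the condition holds, branch to 0xc02e03b4 and continue the loop. 0xc02e03b4 is line 224 of the assembly.

Lines 224-225: index++; lep++ (the pointer advances by 8 bytes);

Lines 226-227: if leafhdr.count <= index, branch to 0xc02e03e4 and end the for loop.

Lines 228-232: if leafhdr.count > index, test whether be32_to_cpu(lep->hashval) equals args->hashval.

Note that here the comparison between be32_to_cpu(lep->hashval) and args->hashval has not been optimized away.

If they are equal, branch to 0xc02e01a4 and continue the loop; if not, branch to 0xc02e03e4 and leave the loop.

The above analyzed only the continue path. The normal path also checks whether be32_to_cpu(lep->hashval) equals args->hashval, so it is not analyzed further.

So with the arm-linux-gnueabihf-gcc 4.9.4 cross toolchain that Teacher Lin provided, the for loop is no longer miscompiled.

Conclusion: a compiler optimization must not change a program's logic or a function's output. When it does, the optimization has overstepped its role, and that is a compiler bug.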


Reposted from blog.csdn.net/hjkfcz/article/details/82082901