GCC Source Code Analysis (5): Instruction Generation

Original link: http://blog.csdn.net/sonicling/article/details/8246231

1. Preface

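It has been a long while since my last post; I really have been busy. The previous article introduced GCC's pass layout, which is the core architecture of GCC's intermediate-language part and also the backbone of the whole compilation flow. Once the optimizations are done, the last thing GCC must do is emit the final compilation result, which is normally an assembly file (whether text or binary does not matter).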

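As mentioned before, the core data structures of GCC's intermediate languages are GENERIC, GIMPLE and RTL. Of these, RTL is the one closely tied to instructions, and it is the starting point of instruction generation.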

2. RTL and INSN

2.1 What is RTL, and what is an INSN

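RTL stands for Register Transfer Language. Despite the name, it also covers memory operations. RTL is designed as a functional-style language made up of expressions and objects. Objects are registers, memory and values (constants or the values of expressions); expressions are operations on objects and sub-expressions. All of this is covered in the GCC Internals manual.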

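RTL objects and operations form RTL expressions, and sub-expressions combined with operations form compound RTL expressions. When an RTL expression represents one intermediate-language instruction, it is called an INSN. RTL Expression is abbreviated as RTX in the GCC code, and the rtx type in the code is a pointer to an RTL expression. So every insn is an rtx, but not every rtx is an insn.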
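
As a rough illustration of that last point, here is a minimal C sketch of a tagged expression node. All toy_ names are hypothetical and are not GCC's real definitions; the sketch only shows the idea that an insn is an rtx whose code marks it as an instruction, and that its body is an ordinary expression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a handful of RTL codes.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_SET, TOY_INSN };

/* Every node is an expression: a code plus up to two operands.  */
struct toy_rtx_def
{
  enum toy_code code;
  struct toy_rtx_def *op[2];
  long value;                 /* register number or constant value */
};
typedef struct toy_rtx_def *toy_rtx;

/* An insn is simply an rtx whose code says "I am an instruction".  */
bool
toy_insn_p (const struct toy_rtx_def *x)
{
  return x != NULL && x->code == TOY_INSN;
}

/* The body of an insn is itself a plain rtx, here a SET.  */
toy_rtx
toy_pattern (toy_rtx insn)
{
  assert (toy_insn_p (insn));
  return insn->op[0];
}
```

Wrapping a (set (reg) (const_int)) expression in a TOY_INSN node makes toy_insn_p true for the wrapper only; the SET inside it remains an rtx that is not an insn.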

2.2 Generating INSNs

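RTL is generated from GIMPLE, and the conversion from GIMPLE to RTL is called "expand". In the optimization pass chain this step is done by pass_expand, which is implemented in gcc/cfgexpand.c. Its execute function, gimple_expand_cfg, is long, but its core job is to convert each basic block: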

 FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)  
   bb = expand_gimple_basic_block (bb);  

expand_gimple_basic_block calls expand_gimple_stmt to expand each GIMPLE statement and links the resulting rtx together. This raises the first question: how is an insn generated?

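In addition, each expand_xxx function does only part of the work. Some of them return an rtx and some return nothing, and even for those that do return a value, the caller usually does not keep the returned insn in any variable. Hence the second question: where do the expanded insns go?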

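To answer these two questions we first have to find where insns are created. That is a laborious job, so let us approach it from one concrete starting point, say a function call statement. Starting from expand_gimple_basic_block, we can follow the trail and see how a GIMPLE_CALL is translated into insns.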

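First, expand_gimple_basic_block contains a loop over the GIMPLE statements of the basic block. The loop first checks some special cases, such as debug statements, which we ignore. Only the last part of the loop body gets to the point: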

  if (is_gimple_call (stmt) && gimple_call_tail_p (stmt)) // tail call: a special case, ignore it
    {
      bool can_fallthru;
      new_bb = expand_gimple_tailcall (bb, stmt, &can_fallthru);
      if (new_bb)
        {
          if (can_fallthru)
            bb = new_bb;
          else
            return new_bb;
        }
    }
  else
    {
      def_operand_p def_p;
      def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);

      if (def_p != NULL)
        {
          /* Ignore this stmt if it is in the list of
             replaceable expressions.  */
          if (SA.values
              && bitmap_bit_p (SA.values,
                               SSA_NAME_VERSION (DEF_FROM_PTR (def_p))))
            continue;
        }
      last = expand_gimple_stmt (stmt); // this is where the real work happens
      maybe_dump_rtl_for_gimple_stmt (stmt, last);
    }

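Step into expand_gimple_stmt. The function is short, and it is easy to see that its core is expand_gimple_stmt_1 (stmt);, which expands stmt case by case. GIMPLE_CALL is handled by expand_call_stmt. That function is not long either, and the key part is at the end.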

 if (lhs)  
   expand_assignment (lhs, exp, false); // lhs = func(args)  
 else  
   expand_expr_real_1 (exp, const0_rtx, VOIDmode, EXPAND_NORMAL, NULL); // func(args)  

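A GIMPLE call statement has the form lhs = func ( args );, where lhs is optional. If lhs exists, the statement is expanded as an assignment; otherwise it is expanded as an expression. The right-hand side of an assignment is also an expression, so expanding as an assignment eventually expands the "func(args)" part as an expression as well.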

expand_expr_real_1 is very long because it has to handle many kinds of expressions. The part we care about is the case CALL_EXPR: branch:

    case CALL_EXPR:
      /* All valid uses of __builtin_va_arg_pack () are removed during
         inlining.  */
      if (CALL_EXPR_VA_ARG_PACK (exp))
        error ("%Kinvalid use of %<__builtin_va_arg_pack ()%>", exp);
      {
        tree fndecl = get_callee_fndecl (exp), attr;

        if (fndecl
            && (attr = lookup_attribute ("error",
                                         DECL_ATTRIBUTES (fndecl))) != NULL)
          error ("%Kcall to %qs declared with attribute error: %s",
                 exp, identifier_to_locale (lang_hooks.decl_printable_name (fndecl, 1)),
                 TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))));
        if (fndecl
            && (attr = lookup_attribute ("warning",
                                         DECL_ATTRIBUTES (fndecl))) != NULL)
          warning_at (tree_nonartificial_location (exp),
                      0, "%Kcall to %qs declared with attribute warning: %s",
                      exp, identifier_to_locale (lang_hooks.decl_printable_name (fndecl, 1)),
                      TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))));

        /* Check for a built-in function.  */
        if (fndecl && DECL_BUILT_IN (fndecl))
          {
            gcc_assert (DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_FRONTEND);
            return expand_builtin (exp, target, subtarget, tmode, ignore); // built-in function
          }
      }
      return expand_call (exp, target, ignore); // ordinary function

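Built-in functions have their own expansion method, which I will cover some other time. Here we look at ordinary functions. The if statements at the top are only checks; the expansion itself is done by expand_call. This function is quite long because handling arguments, the stack and so on is tedious. But one thing is certain: an ordinary function call cannot be implemented by a single simple insn. It corresponds to a series of insns that includes at least three parts: pushing the arguments, the call itself, and popping the stack. So where is this series of insns?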

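To find out where this series of insns lives in the code, we have to mention three functions that take no arguments: start_sequence (), get_insns () and end_sequence (). The first opens a new insn sequence. The second returns the first insn of the current sequence; since a sequence is a doubly linked list, every later insn is reachable from the first one. The last closes the sequence, after which emit_xxx can no longer insert insns into it. The reason cannot be explained yet, because it is tied to the second question: where do the insns go?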

So where do the insns go? The answer is at the end of expand_call:

 /* If tail call production succeeded, we need to remove REG_EQUIV notes on 
    arguments too, as argument area is now clobbered by the call.  */  
 if (tail_call_insns)
   {
     emit_insn (tail_call_insns); // rtx for the tail call
     crtl->tail_call_emit = true;
   }
 else
   emit_insn (normal_call_insns); // rtx for the normal call
   
 currently_expanding_call--;  
   
 if (stack_usage_map_buf)  
   free (stack_usage_map_buf);  
   
 return target;  

A tail call is essentially return tail_call(...);, and there is a dedicated optimization for it. But however it is optimized, the final insns get emitted:

 rtx  
 emit_insn (rtx x)  
 {  
   rtx last = last_insn;  
   rtx insn;  
   
   if (x == NULL_RTX)  
     return last;  
   
   switch (GET_CODE (x))  
     {  
     // the special cases are omitted here
     default:
       last = make_insn_raw (x);
       add_insn (last); // here
       break;
     }

   return last;
 }

 void
 add_insn (rtx insn) // a standard append to a doubly linked list
 {  
   PREV_INSN (insn) = last_insn;  
   NEXT_INSN (insn) = 0;  
   
   if (NULL != last_insn)  
     NEXT_INSN (last_insn) = insn;  
   
   if (NULL == first_insn)  
     first_insn = insn;  
   
   last_insn = insn;  
 }  

Here first_insn and last_insn are macros:

 #define first_insn (crtl->emit.x_first_insn)  
 #define last_insn (crtl->emit.x_last_insn)  
   
 /* Datastructures maintained for currently processed function in RTL form.  */
 struct rtl_data x_rtl;

 // macro defined in function.h
 #define crtl (&x_rtl)

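So the generated insns are inserted into the insn list of the current function. This list contains all insns of the current function, stored in layout order. Where there are jumps, there are corresponding jump insns and label insns. If you think of insns as assembly, this list is effectively the "assembly" sequence.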

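OK, back to the start_sequence/get_insns/end_sequence group mentioned earlier. All emit_xxx functions insert at first_insn/last_insn, and a new sequence must also be filled through emit_xxx. In other words, between the start_sequence and end_sequence calls, every emit_xxx must emit into the new sequence. There is only one way to do that: make first_insn/last_insn point at the sequence currently being built, and restore them once the sequence is complete. (A rather clumsy and resigned design, but there are far too many emit_xxx functions to touch.)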
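
The save-and-restore trick can be sketched in a few lines of C. This is a toy model under my own assumptions, not GCC's implementation: the toy_ names, the fixed-depth stack and the int-tagged insn are all hypothetical, and only the redirection of the "current chain" pointers follows the description above.

```c
#include <assert.h>
#include <stddef.h>

struct toy_insn { int id; struct toy_insn *prev, *next; };

/* The current chain, playing the role of first_insn/last_insn.  */
static struct toy_insn *toy_first, *toy_last;

/* Saved outer chains; a fixed depth keeps the sketch short.  */
#define TOY_MAX_DEPTH 8
static struct { struct toy_insn *first, *last; } toy_stack[TOY_MAX_DEPTH];
static int toy_depth;

/* Append to whatever chain is current, like add_insn.  */
void
toy_emit (struct toy_insn *insn)
{
  insn->prev = toy_last;
  insn->next = NULL;
  if (toy_last != NULL)
    toy_last->next = insn;
  if (toy_first == NULL)
    toy_first = insn;
  toy_last = insn;
}

/* Save the current chain and start an empty one.  */
void
toy_start_sequence (void)
{
  assert (toy_depth < TOY_MAX_DEPTH);
  toy_stack[toy_depth].first = toy_first;
  toy_stack[toy_depth].last = toy_last;
  toy_depth++;
  toy_first = toy_last = NULL;
}

/* First insn of the chain currently being built.  */
struct toy_insn *
toy_get_insns (void)
{
  return toy_first;
}

/* Drop back to the saved outer chain.  */
void
toy_end_sequence (void)
{
  assert (toy_depth > 0);
  toy_depth--;
  toy_first = toy_stack[toy_depth].first;
  toy_last = toy_stack[toy_depth].last;
}
```

The caller has to grab the result with toy_get_insns before calling toy_end_sequence, which is the same ordering the generated gen_xxx functions follow with get_insns and end_sequence.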

That settles where insns go, but the first question remains: how is an insn built? Let us keep following the trail. In expand_call, one statement stands out:

  /* Generate the actual call instruction.  */  
  emit_call_1 (funexp, exp, fndecl, funtype, unadjusted_args_size,  
 adjusted_args_size.constant, struct_value_size,  
 next_arg_reg, valreg, old_inhibit_defer_pop, call_fusage,  
 flags, & args_so_far);  

Even without understanding the code, the comment makes it clear: this generates a call insn. Let us step in:

 #if defined (HAVE_call) && defined (HAVE_call_value)  
   if (HAVE_call && HAVE_call_value)  
     {  
       if (valreg)
         emit_call_insn (GEN_CALL_VALUE (valreg,
                                         gen_rtx_MEM (FUNCTION_MODE, funexp),
                                         rounded_stack_size_rtx, next_arg_reg,
                                         NULL_RTX));
       else
         emit_call_insn (GEN_CALL (gen_rtx_MEM (FUNCTION_MODE, funexp),
                                   rounded_stack_size_rtx, next_arg_reg,
                                   GEN_INT (struct_value_size)));
     }  
   else  
 #endif  

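This is only a small part of emit_call_1. gen_rtx_MEM creates an rtx for a memory address; here it is used to obtain the address of the called function. (Note that the address is represented symbolically, because it is not yet known where the function will be placed; it gets a symbol, and the assembler and linker translate it into a real address.) So what is GEN_CALL? That is not known until GCC is built. What I can tell you is that it is determined by something called the Machine Description. GEN_CALL here calls the gen_call function, which is defined in insn-emit.c, a file generated from the Machine Description at build time. In the i386 Machine Description, gen_call in turn calls ix86_expand_call, so the real call insn is produced by that function. That function calls a bunch of gen_rtx_XXX to assemble the insn, and these gen_rtx_XXX are generated automatically from gcc/rtl.def.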

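The rtl.def file consists of a series of macro invocations of the form DEF_RTL_EXPR(ENUM, NAME, FORMAT, CLASS). ENUM is the enumerator name, and the XXX in gen_rtx_XXX is this name. NAME is the identifying name, used elsewhere to identify the RTL. FORMAT is the operand format: it says how many operands the rtx has and what type each one is. For example, 0 marks a slot that is unused (which is why the generator code below skips it when producing parameters), and e stands for an expression. CLASS is the category.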
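
A list of this shape is typically consumed with the X-macro technique: the same entries are expanded several times, each time with a different definition of the macro. The sketch below imitates that under a few assumptions of mine: every TOY_ name is hypothetical, the list is an inline macro rather than a separate .def file, only four entries are shown, and the CLASS argument is left out.

```c
#include <assert.h>
#include <string.h>

/* A tiny excerpt in the style of rtl.def (ENUM, NAME, FORMAT).  */
#define TOY_RTL_DEFS \
  TOY_DEF_RTL_EXPR (TOY_PLUS, "plus", "ee") \
  TOY_DEF_RTL_EXPR (TOY_SET,  "set",  "ee") \
  TOY_DEF_RTL_EXPR (TOY_CALL, "call", "ee") \
  TOY_DEF_RTL_EXPR (TOY_USE,  "use",  "e")

/* First expansion: keep only the enumerator names.  */
enum toy_rtx_code
{
#define TOY_DEF_RTL_EXPR(ENUM, NAME, FORMAT) ENUM,
  TOY_RTL_DEFS
#undef TOY_DEF_RTL_EXPR
  TOY_NUM_RTX_CODE
};

/* Second expansion: keep only the printable names.  */
static const char *const toy_rtx_name[] = {
#define TOY_DEF_RTL_EXPR(ENUM, NAME, FORMAT) NAME,
  TOY_RTL_DEFS
#undef TOY_DEF_RTL_EXPR
};

/* Third expansion: keep only the operand format strings.  */
static const char *const toy_rtx_format[] = {
#define TOY_DEF_RTL_EXPR(ENUM, NAME, FORMAT) FORMAT,
  TOY_RTL_DEFS
#undef TOY_DEF_RTL_EXPR
};

/* Number of operand slots of CODE: one per format character.  */
int
toy_rtx_length (enum toy_rtx_code code)
{
  assert (code < TOY_NUM_RTX_CODE);
  return (int) strlen (toy_rtx_format[code]);
}
```

Because the enum and both tables come from one list, adding an entry keeps all three in step, which is the main attraction of the technique.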

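In the gcc directory there is a file called gengenrtl.c. It has its own main function, so it is a standalone program. It translates rtl.def into two files, genrtl.h and genrtl.c. The former declares the mapping from gen_rtx_XXX to gen_rtx_fmt_FFF_stat, where FFF is the FORMAT argument of the macro; gen_rtx_CALL, for example, maps to gen_rtx_fmt_ee_stat. The latter defines the implementations of gen_rtx_fmt_FFF_stat.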

 /* Write the declarations for the routine to allocate RTL with FORMAT.  */  
   
 static void  
 gendecl (const char *format) // create a declaration for each gen_rtx_fmt_FFF_stat
 {  
   const char *p;  
   int i, pos;  
   
   printf ("extern rtx gen_rtx_fmt_%s_stat\t (RTX_CODE, ", format);  
   printf ("enum machine_mode mode");  
   
   /* Write each parameter that is needed and start a new line when the line 
      would overflow.  */  
   for (p = format, i = 0, pos = 75; *p != 0; p++)  
     if (*p != '0')  
       {  
     int ourlen = strlen (type_from_format (*p)) + 6 + (i > 9);  
   
     printf (",");  
     if (pos + ourlen > 76)  
       printf ("\n\t\t\t\t      "), pos = 39;  
   
     printf (" %sarg%d", type_from_format (*p), i++);  
     pos += ourlen;  
       }  
   printf (" MEM_STAT_DECL");  
   
   printf (");\n");  
   printf ("#define gen_rtx_fmt_%s(c, m", format);  // map gen_rtx_fmt_FFF to gen_rtx_fmt_FFF_stat
   for (p = format, i = 0; *p != 0; p++)  
     if (*p != '0')  
       printf (", p%i",i++);  
   printf (")\\\n        gen_rtx_fmt_%s_stat (c, m", format);  
   for (p = format, i = 0; *p != 0; p++)  
     if (*p != '0')  
       printf (", p%i",i++);  
   printf (" MEM_STAT_INFO)\n\n");  
 }  
   
 /* Generate macros to generate RTL of code IDX using the functions we 
    write.  */  
   
 static void  
 genmacro (int idx)  
 {  
   const char *p;  
   int i;  
   
   /* We write a macro that defines gen_rtx_RTLCODE to be an equivalent to 
      gen_rtx_fmt_FORMAT where FORMAT is the RTX_FORMAT of RTLCODE.  */  
   
   if (excluded_rtx (idx))  
     /* Don't define a macro for this code.  */  
     return;  
   
   printf ("#define gen_rtx_%s%s(MODE",  
        special_rtx (idx) ? "raw_" : "", defs[idx].enumname); // map gen_rtx_ENUM to gen_rtx_fmt_FFF
   
   for (p = defs[idx].format, i = 0; *p != 0; p++)  
     if (*p != '0')  
       printf (", ARG%d", i++);  
   
   printf (") \\\n  gen_rtx_fmt_%s (%s, (MODE)",  
       defs[idx].format, defs[idx].enumname);  
   
   for (p = defs[idx].format, i = 0; *p != 0; p++)  
     if (*p != '0')  
       printf (", (ARG%d)", i++);  
   
   puts (")");  
 }  
   
 /* Generate the code for the function to generate RTL whose 
    format is FORMAT.  */  
   
 static void  
 gendef (const char *format) // create a definition for each gen_rtx_fmt_FFF_stat
 {  
   const char *p;  
   int i, j;  
   
   /* Start by writing the definition of the function name and the types 
      of the arguments.  */  
   
   printf ("rtx\ngen_rtx_fmt_%s_stat (RTX_CODE code, enum machine_mode mode", format);  
   for (p = format, i = 0; *p != 0; p++) // walk the characters of format; each one corresponds to a parameter
     if (*p != '0')  
       printf (",\n\t%sarg%d", type_from_format (*p), i++);  
   
   puts (" MEM_STAT_DECL)");  
   
   /* Now write out the body of the function itself, which allocates 
      the memory and initializes it.  */  
   puts ("{");  
   puts ("  rtx rt;");  
   puts ("  rt = rtx_alloc_stat (code PASS_MEM_STAT);\n");  
   
   puts ("  PUT_MODE (rt, mode);");  
   
   for (p = format, i = j = 0; *p ; ++p, ++i)  // each parameter yields one assignment to an rtx field
     if (*p != '0')  
       printf ("  %s (rt, %d) = arg%d;\n", accessor_from_format (*p), i, j++);  
     else  
       printf ("  X0EXP (rt, %d) = NULL_RTX;\n", i);  
   
   puts ("\n  return rt;\n}\n");  
 }  

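So, overall, an insn is built bottom-up: atomic rtx are first built by the constructors generated from rtl.def, and then the Machine Description assembles them into an insn or an insn sequence.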
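
To make the generator's output easier to picture, here is a hand-written miniature of what gendef and genmacro print for the format "ee". It is a sketch under assumptions: the toy_ types and the calloc-based allocator are hypothetical stand-ins (the real code uses rtx_alloc_stat, PUT_MODE and the XEXP-style accessors), and the memory-statistics arguments are dropped.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_code { TOY_REG, TOY_MEM, TOY_CALL };
enum toy_mode { TOY_VOIDmode, TOY_QImode, TOY_SImode };

struct toy_rtx_def
{
  enum toy_code code;
  enum toy_mode mode;
  struct toy_rtx_def *op[2];
};
typedef struct toy_rtx_def *toy_rtx;

/* Stand-in for rtx_alloc: a zeroed node with its code set.  */
toy_rtx
toy_rtx_alloc (enum toy_code code)
{
  toy_rtx rt = calloc (1, sizeof *rt);
  if (rt == NULL)
    abort ();
  rt->code = code;
  return rt;
}

/* The shape gendef prints for "ee": allocate, set the mode, then
   one assignment per format character.  */
toy_rtx
toy_gen_rtx_fmt_ee (enum toy_code code, enum toy_mode mode,
                    toy_rtx arg0, toy_rtx arg1)
{
  toy_rtx rt = toy_rtx_alloc (code);
  rt->mode = mode;
  rt->op[0] = arg0;
  rt->op[1] = arg1;
  return rt;
}

/* The shape genmacro prints: a per-code name that forwards to the
   per-format constructor.  */
#define toy_gen_rtx_CALL(MODE, ARG0, ARG1) \
  toy_gen_rtx_fmt_ee (TOY_CALL, (MODE), (ARG0), (ARG1))
```

All codes that share a format share one constructor, so the generated code grows with the number of distinct formats rather than with the number of RTL codes.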

2.3 INSNs in a Basic Block

As mentioned earlier, a basic block carries two instruction systems: GIMPLE and RTL. So where does the RTL in a basic block come from? Back to expand_gimple_basic_block:

   if (stmt || elt)
     {
       last = get_last_insn ();

       // (some code omitted)

       /* Java emits line number notes in the top of labels.
          ??? Make this go away once line number notes are obsoleted.  */
       BB_HEAD (bb) = NEXT_INSN (last);
       if (NOTE_P (BB_HEAD (bb)))
         BB_HEAD (bb) = NEXT_INSN (BB_HEAD (bb)); // look here
       note = emit_note_after (NOTE_INSN_BASIC_BLOCK, BB_HEAD (bb));

       maybe_dump_rtl_for_gimple_stmt (stmt, last);
     }
   else
     note = BB_HEAD (bb) = emit_note (NOTE_INSN_BASIC_BLOCK); // or here

   // (a large chunk omitted)

   last = get_last_insn ();
   if (BARRIER_P (last))
     last = PREV_INSN (last);
   if (JUMP_TABLE_DATA_P (last))
     last = PREV_INSN (PREV_INSN (last));
   BB_END (bb) = last; // and here

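Correspondingly, the middle of the function also assigns BB_HEAD(bb), which marks the start of the basic block's insn sequence. In the code shown, BB_HEAD skips a note that precedes the first real insn of the block, and BB_END excludes a trailing barrier and the jump table at the end of the block. So the insn sequence of each basic block is a subsequence of the function's insn sequence. The insn sequences of different basic blocks never overlap, and they may not even be adjacent, because items such as barriers and jump tables can sit between them.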

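The passes after pass_expand are essentially all RTL passes. They either walk the whole function's insn list from get_insns () to get_last_insn () (which includes everything that sits between basic blocks), or use FOR_EACH_BB together with BB_HEAD and BB_END to walk the insns inside each basic block (which leaves out whatever lies between blocks).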
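
The two traversal styles can be contrasted with a toy model. The struct layouts and toy_ names below are hypothetical; the only thing taken from the text is that a block is described by a head and an end pointer into the function-wide chain. GCC's own FOR_BB_INSNS macro has, as far as I recall, a loop of the same shape.

```c
#include <assert.h>
#include <stddef.h>

struct toy_insn { int uid; struct toy_insn *prev, *next; };

/* A basic block only records where its slice of the chain starts
   and ends; it owns no list of its own.  */
struct toy_bb { struct toy_insn *head, *end; };

/* Walk the whole function chain from its first insn.  */
int
toy_count_all (const struct toy_insn *first)
{
  int n = 0;
  for (const struct toy_insn *i = first; i != NULL; i = i->next)
    n++;
  return n;
}

/* Walk only the insns in the closed range [head, end] of one block.  */
int
toy_count_bb (const struct toy_bb *bb)
{
  assert (bb != NULL);
  const struct toy_insn *stop = bb->end != NULL ? bb->end->next : NULL;
  int n = 0;
  for (const struct toy_insn *i = bb->head; i != NULL && i != stop; i = i->next)
    n++;
  return n;
}
```

With a chain of five insns where the third one is a barrier between two blocks, the whole-function walk sees five entries while each block walk sees two.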

3. Machine Description

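For each CPU platform, GCC has a corresponding Machine Description that guides instruction generation. The code lives under gcc/config/<platform name>; for Intel platforms, for example, it is gcc/config/i386/. A Machine Description file is the core of its platform, for example gcc/config/i386/i386.md.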

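An md file can define many things, such as constant, attr, insn and expand. A constant gives a name to a number, so other places can use the name instead of the number; in i386.md, for example, every register has a number. An attr is an attribute attached to insns on the target, such as the length, mode and unit attributes set in the define_insn example later in this section. insn and expand are the main body of an md file and are used to define insns. The difference is that the output of the former is asm, used for instruction generation, whereas the output of the latter is an insn sequence, used for converting GIMPLE to RTL.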

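Each insn and expand has these elements: a name, an RTL template, a condition and an output template. The name identifies the insn; for example, the name of CALL in rtl.def is call, so the corresponding entry in the md file is define_expand call. The RTL template is the specification of the RTX and serves two purposes: 1. deciding whether an rtx matches a given insn, and 2. describing the properties of each operand (size, usage, pre- and post-conditions). The condition checks the precondition of the insn; if it does not hold, the pattern must not be used. The output template is the assembly output format of the insn, used for the final instruction emission.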

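Note that an md file defines insn patterns; the concrete insns are generated by the expand_xxx, emit_xxx, gen_rtx_xxx and gen_xxx family of functions. So an insn in the md file serves only two purposes: 1. checking insns; 2. emitting asm.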

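So how does an md file get integrated into GCC? Through the build again! Much like rtl.def generating genrtl.h and genrtl.c as described above, the md file is translated by a series of tools into code for different purposes: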

 [root@localhost gcc]# ls insn-*.h   
 insn-attr.h insn-codes.h insn-config.h insn-constants.h insn-flags.h insn-modes.h   
 [root@localhost gcc]# ls insn-*.c   
 insn-attrtab.c insn-emit.c insn-modes.c insn-output.c insn-preds.c   
 insn-automata.c insn-extract.c insn-opinit.c insn-peep.c insn-recog.c  

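Only three files are discussed here: insn-recog.c contains the RTL template matching code, used to check the validity of an rtx; insn-emit.c contains the code that builds insns; insn-output.c contains the asm output for each insn. These three files are generated by three programs compiled from gcc/genrecog.c, gcc/genemit.c and gcc/genoutput.c respectively. Let us take the call above as an example again: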

 (define_expand "call"  
   [(call (match_operand:QI 0 "" "")  
      (match_operand 1 "" ""))  
    (use (match_operand 2 "" ""))]  
   ""  
 {  
   ix86_expand_call (NULL, operands[0], operands[1], operands[2], NULL, 0);  
   DONE;  
 })  

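This call pattern requires its first operand to be an integer (QI); the second and third operands are unconstrained, but the third is marked as used by the program. From expand_call we can see that the first operand is the address of the called function, the second is the size of the argument stack, and the third is the argument list (all arguments are inside this third operand). This expand is used for converting a gimple_call into insns.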

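The genemit tool converts this md definition into a function called gen_call. Apart from preparing the operands, the core of its body is the call to ix86_expand_call. This is the result of the conversion: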

 /* /usr/src/develop/gcc-4.5.2/gcc/config/i386/i386.md:13574 */  
 rtx  
 gen_call (rtx operand0,  
         rtx operand1,  
         rtx operand2)  
 {  
   rtx _val = 0;  
   start_sequence ();  
   {  
     rtx operands[3];  
     operands[0] = operand0;  
     operands[1] = operand1;  
     operands[2] = operand2;  
 #line 13579 "/usr/src/develop/gcc-4.5.2/gcc/config/i386/i386.md"  
 {  
   ix86_expand_call (NULL, operands[0], operands[1], operands[2], NULL, 0); // the C code of the expand is copied into the gen_xxx function
   DONE;  
 }  
     operand0 = operands[0];  
     operand1 = operands[1];  
     operand2 = operands[2];  
   }  
   emit_call_insn (gen_rtx_CALL (VOIDmode,  
         operand0,  
         operand1));  
   emit_insn (gen_rtx_USE (VOIDmode,  
         operand2));  
   _val = get_insns ();  
   end_sequence ();  
   return _val;  
 }  

This is an expand, which generates insns, so it has no corresponding output. Now look at an insn example:

 (define_insn "x86_fnstsw_1"  
   [(set (match_operand:HI 0 "register_operand" "=a")  
     (unspec:HI [(reg:CCFP FPSR_REG)] UNSPEC_FNSTSW))]
   "TARGET_80387" ; usable only when 80387 instructions are enabled
   "fnstsw\t%0"  ; asm instruction template
   [(set (attr "length") (symbol_ref "ix86_attr_length_address_default (insn) + 2"))  
    (set_attr "mode" "SI")  
    (set_attr "unit" "i387")])  

After conversion to gen_xxx it becomes:

 /* /usr/src/develop/gcc-4.5.2/gcc/config/i386/i386.md:1361 */  
 rtx  
 gen_x86_fnstsw_1 (rtx operand0 ATTRIBUTE_UNUSED)  
 {  
   return gen_rtx_SET (VOIDmode,  
         operand0,  
         gen_rtx_UNSPEC (HImode,  
         gen_rtvec (1,  
                 gen_rtx_REG (CCFPmode,  
         18)),  
         31));  
 }  

The asm template does not appear in gen_xxx, because that function is used (during pass_expand) to build insns. The asm template is converted into insn-output.c instead:

   // Initializer of one struct insn_data entry.
   /* /usr/src/develop/gcc-4.5.2/gcc/config/i386/i386.md:1361 */
   {
     "x86_fnstsw_1",
 #if HAVE_DESIGNATED_INITIALIZERS
     { .single = // a single-line template uses .single; a multi-line one gets a generated output function, i.e. .function = { output_nnn }
 #else
     {
 #endif
       "fnstsw\t%0", // ASM output template
 #if HAVE_DESIGNATED_INITIALIZERS  
     },  
 #else  
     0,  
     0  
     },  
 #endif  
     (insn_gen_fn) gen_x86_fnstsw_1,  
     &operand_data[24],  
     1,  
     0,  
     1,  
     1  
   }  

4. Instruction Generation

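At the very end of the optimization pass sequence there is an RTL pass called pass_final, which translates RTL into ASM. The three most important lines of its execute function are: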

 final_start_function (get_insns (), asm_out_file, optimize);  
 final (get_insns (), asm_out_file, optimize);  
 final_end_function ();  

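The first line emits the function header, including the function's assembly directives and the setup of the stack frame. The second line emits the instruction sequence. The third line ends the function, including the teardown of the stack frame and the closing directives.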

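The final function walks the whole insn sequence of the function and calls final_scan_insn to output each insn. That function is very long because it has to deal with notes, debug information, frames and all sorts of odds and ends. But the key part in the middle is where it uses the Machine Description to output ASM: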

     insn_code_number = recog_memoized (insn); // find the insn code number, i.e. the index of the insn definition
     cleanup_subreg_operands (insn);

     // (several lines omitted)

     /* Find the proper template for this insn.  */
     templ = get_insn_template (insn_code_number, insn); // fetch the ASM output template of the define_insn

     /* If the C code returns 0, it means that it is a jump insn
        which follows a deleted test insn, and that test insn
        needs to be reinserted.  */
     if (templ == 0)
       {
         rtx prev;

         // (several more lines omitted)

         return prev;
       }

     /* If the template is the string "#", it means that this insn must
        be split.  */
     if (templ[0] == '#' && templ[1] == '\0')
       {
         rtx new_rtx = try_split (body, insn, 0); // this invokes define_split

         // (several more lines omitted)

         return new_rtx;
       }

     // (unimportant parts omitted)

     /* Output assembler code from the template.  */
     output_asm_insn (templ, recog_data.operand); // output asm according to the template

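The most critical step of instruction generation is the first thing this code does: recognizing the insn. This is puzzling at first: since insns are generated from the md, it should already be known at generation time which md definition supplies the asm output for a given insn, so why recognize it again? Because some insns are not generated purely from the RTL templates. Take the call above: although it provides an expand, the real work is done by the ix86_expand_call function defined in gcc/config/i386/i386.c. That function builds a series of insns by hand to carry out the call, so how are those insns to be output?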

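So GCC has genrecog generate the recog function to recognize insns. Recognition treats all the RTL expressions in the md file as a set of patterns and checks which one the actual insn matches; that insn then has a corresponding definition for output. recog returns the number of the matching insn definition; that number is used to look up the md definition and find its asm output template, which is what the output code above does.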

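The core of the recog function is a hard-coded decision tree. genrecog first scans all md definitions, extracts every RTL pattern, breaks it into a series of predicates, and inserts these predicates into the decision tree. The recog function feeds in the predicates of an unknown insn while making decisions from the root (which really just means jumping), until it reaches a leaf and the decision is complete.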
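
A hand-written miniature gives a feel for what such a hard-coded tree looks like. Everything here is an assumption made for illustration: the three patterns, their code numbers and the toy_ names are invented, and a real generated recog tests modes and operand predicates as well as rtx codes.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_SET };

struct toy_rtx { enum toy_code code; const struct toy_rtx *op[2]; };

/* Hypothetical insn code numbers, one per pattern:
     movimm: (set (reg) (const_int))
     movreg: (set (reg) (reg))
     addimm: (set (reg) (plus (reg) (const_int)))  */
enum { TOY_CODE_movimm, TOY_CODE_movreg, TOY_CODE_addimm };

static int
toy_is (const struct toy_rtx *x, enum toy_code code)
{
  return x != NULL && x->code == code;
}

/* Return the code number of the matching pattern, or -1.  */
int
toy_recog (const struct toy_rtx *x)
{
  /* Root test shared by all three patterns: a SET.  */
  if (!toy_is (x, TOY_SET))
    return -1;
  /* A well-formed SET always has a destination and a source.  */
  assert (x->op[0] != NULL && x->op[1] != NULL);
  if (!toy_is (x->op[0], TOY_REG))
    return -1;

  /* Branch on the code of the source operand.  */
  const struct toy_rtx *src = x->op[1];
  switch (src->code)
    {
    case TOY_CONST_INT:
      return TOY_CODE_movimm;
    case TOY_REG:
      return TOY_CODE_movreg;
    case TOY_PLUS:
      /* Leaf test: both operands of the PLUS must fit.  */
      if (toy_is (src->op[0], TOY_REG) && toy_is (src->op[1], TOY_CONST_INT))
        return TOY_CODE_addimm;
      return -1;
    default:
      return -1;
    }
}
```

The shared prefix of the patterns is tested once at the root, and each further test only separates the candidates that are still possible, which is the point of merging all patterns into one tree.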

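The two passes after this merely clean up data structures. With that, the whole pass chain has run, and GCC has completed the conversion from GENERIC to GIMPLE, then to RTL, and finally to ASM.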

5. Summary

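This series has given a rough analysis of GCC's flow from input to output. The core of a compiler is its optimization part. The specific optimization steps were not covered in this series because there are too many of them, and they are too tedious and too theoretical. Later I may pick out the optimizations mentioned in textbooks and analyze them, but I have no time for that at the moment, so the series ends here for now.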