String Resources

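Some Windows applications store their string resources, in particular localized (multi-language) strings, inside the Microsoft executable file itself. How can these string resources be extracted from the executable?

In Python, the open-source pefile library can be used for this. It is available at https://github.com/erocarrera/pefile.

Once it is installed, the code below extracts the multi-language strings from an executable. Each string is prefixed with a <language ID_string ID> tag and saved to the file strings.txt: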

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import pefile
  
def pe_extract_strings(pathfile):
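    '''
    In the .rsrc resource section, strings are looked up through three nested
    directory levels: type (6, i.e. RT_STRING), name (the block number; each
    block holds at most 16 consecutive string IDs), and language.
    '''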
    
    pe = pefile.PE(pathfile)
    # The List will contain all the extracted Unicode strings
    #
    strings = list()
    
    # Fetch the index of the resource directory entry containing the strings
    #
    rt_string_idx = [
      entry.id for entry in 
      pe.DIRECTORY_ENTRY_RESOURCE.entries].index(pefile.RESOURCE_TYPE['RT_STRING'])
    
    # Get the directory entry
    #
    rt_string_directory = pe.DIRECTORY_ENTRY_RESOURCE.entries[rt_string_idx]
    
    # For each of the entries (which will each contain a block of 16 strings)
    #
    for name_entry in rt_string_directory.directory.entries:
        print("DataIsDir:", name_entry.struct.DataIsDirectory, "Name:", name_entry.struct.Name, "Id:", name_entry.struct.Id, "OffsetToData:", hex(name_entry.struct.OffsetToData), "OffsetToDirectory:", hex(name_entry.struct.OffsetToDirectory))
        name_id = (name_entry.struct.Id-1)*16
        print("Name index:", name_id)
        
        for lang_entry in name_entry.directory.entries:
            print("Lang id:", lang_entry.struct.Id)
            data_rva = lang_entry.data.struct.OffsetToData
            size = lang_entry.data.struct.Size
            print( 'Directory entry at RVA', hex(data_rva), 'of size', hex(size) )
            
            data = pe.get_memory_mapped_image()[data_rva:data_rva+size]
            offset = 0
            lang_name_id = name_id
            while True:
                # Exit once there's no more data to read
                if offset >= size:
                    break
                # Fetch the length of the unicode string
                #
                ustr_length = pe.get_word_from_data(data[offset:offset+2], 0)
                offset += 2
            
                # If the string is empty, skip it
                if ustr_length == 0:
                    lang_name_id += 1
                    continue
            
                # Get the Unicode string
                #
                ustr = bytes('<'+ str(lang_entry.struct.Id) +'_'+ str(lang_name_id) + '>', encoding='utf-8') + pe.get_string_u_at_rva(data_rva+offset, max_length=ustr_length)
                offset += ustr_length*2
                strings.append(ustr)
                print( 'String of length', ustr_length, 'at offset', offset )
                lang_name_id += 1
            
    with open('strings.txt', 'wb') as f_strings:
        for l in strings:
            f_strings.write(l)
            f_strings.write(b'\n')
                
if __name__ == '__main__':
    pe_extract_strings('path/to/executable')
