03 Scanning the Module Search Path

import sys,os,pprint
trace=0 # 1 = print directories, 2 = also print files

visited={}
allsizes=[]
for srcdir in sys.path:
    for (thisDir,subsHere,filesHere) in os.walk(srcdir):
        if trace>0:print(thisDir)
        thisDir=os.path.normpath(thisDir)
        fixcase=os.path.normcase(thisDir)
        if fixcase in visited:
            continue
        else:
            visited[fixcase]=True

        for filename in filesHere:
            if filename.endswith('.py'):
                if trace>1:print('...',filename)
                pypath=os.path.join(thisDir,filename)
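                try:
                    pysize=os.path.getsize(pypath)
                except os.error:
                    print('skipping',pypath,sys.exc_info())
                else:
                    # binary mode avoids Unicode decoding errors; 'with' closes the file
                    with open(pypath,'rb') as pyfile:
                        pylines=len(pyfile.readlines())
                    # record the file only when its size could be read
                    allsizes.append((pysize,pylines,pypath))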
print('By size...')
allsizes.sort()
pprint.pprint(allsizes[:3])
pprint.pprint(allsizes[-3:])

print('By lines...')
allsizes.sort(key=lambda x:x[1])
pprint.pprint(allsizes[:3])
pprint.pprint(allsizes[-3:])

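When run, this script walks the module import path and every valid directory beneath it, attempting a complete search of the tree. It is really three nested loops: over each entry on the path, over each directory under that entry, and over each file in that directory. Because the module path may contain arbitrarily named directories, the script has to take care of the following during the search: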

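Normalize the directory path format. Resolve slash and dot differences so that every directory is written in one style.

Normalize directory name case. On Windows, which is case-insensitive, convert names to lowercase.

Detect duplicates so that no directory is visited twice (the same directory may be reachable from sys.path along more than one route).

Open files in binary mode to count lines, which avoids potential Unicode decoding errors in the file content.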
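The normalization and duplicate check can be isolated into a small sketch. The helper names canonical_key and unique_dirs are hypothetical and do not appear in the script above; they only illustrate how os.path.normpath and os.path.normcase combine to produce a comparison key.

```python
import os

def canonical_key(path):
    """Return a key that is the same for equivalent spellings of a directory.

    Hypothetical helper, not part of the original script.
    """
    # normpath collapses '.', '..' and redundant separators;
    # normcase lowercases and unifies slashes on Windows and
    # leaves the path unchanged on POSIX systems.
    return os.path.normcase(os.path.normpath(path))

def unique_dirs(paths):
    """Yield each directory once, skipping later spellings of the same one."""
    seen = set()
    for path in paths:
        key = canonical_key(path)
        if key not in seen:
            seen.add(key)
            yield path
```

For example, list(unique_dirs(['a/b', 'a/./b', 'a/b/', 'a/c'])) keeps only 'a/b' and 'a/c'. Note that this compares path strings only. It does not resolve symbolic links, so two different link names for one directory would still count as distinct unless something like os.path.realpath were applied first.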

This version of the script also adds a line counter. That may lengthen the run time, but reporting the line count is a useful feature.
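If the extra run time or memory use matters, one possible variation is to count lines by iterating over the file rather than loading every line with readlines(). The function name count_lines is an assumption made for this sketch and is not in the script above.

```python
def count_lines(path):
    """Count the lines in a file, reading it in binary mode.

    Hypothetical helper: iterating over the file object yields one
    line at a time, so the whole file is never held in memory, and
    binary mode means no bytes are ever decoded as text.
    """
    with open(path, 'rb') as f:
        return sum(1 for _ in f)
```

The result should match len(open(path, 'rb').readlines()) for the same file, including a final line that has no trailing newline.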
