Liangmeng Yao 20190919-3 Performance Analysis

For the requirements of this assignment, see https://edu.cnblogs.com/campus/nenu/2019fall/homework/7628

Git repository: https://e.coding.net/hahaa/wf.git

Requirement 0: Use War and Peace as the input file, read in from the file system through redirection. Run the program three times in a row and report the elapsed time and CPU usage of each run. (2 points)
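As a rough illustration of measuring three consecutive runs, here is a minimal sketch that records wall-clock time and CPU time from inside Python. The function `count_words`, the helper `time_runs`, and the sample text are hypothetical stand-ins, not the code in the repository; the real measurement would run the whole program on the War and Peace file.

```python
import re
import time
from collections import Counter


def count_words(text):
    """Hypothetical stand-in for the program's word counting."""
    return Counter(re.findall(r'[a-z0-9^-]+', text.lower()))


def time_runs(func, arg, runs=3):
    """Return one (wall_seconds, cpu_seconds) tuple per run."""
    results = []
    for _ in range(runs):
        wall_start = time.perf_counter()   # elapsed (wall-clock) time
        cpu_start = time.process_time()    # CPU time used by this process
        func(arg)
        results.append((time.perf_counter() - wall_start,
                        time.process_time() - cpu_start))
    return results


# Assumed sample input; replace with the contents of the real input file.
sample_text = "war and peace and war " * 20000
timings = time_runs(count_words, sample_text)
for i, (wall, cpu) in enumerate(timings, start=1):
    print("run %d: wall %.4f s, cpu %.4f s" % (i, wall, cpu))
```

Reporting both numbers helps separate time spent computing from time spent waiting on file I/O.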

First run:

Second run:

Third run:


Requirement 1: State your guess at the program's bottleneck: the place where you think optimization would have the greatest effect, or where you already optimized last week (or where you had optimization in mind and therefore never wrote the worse code).

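Required: give the code snippet, explain why you think this is the bottleneck, and estimate how much improvement optimization would bring.
1. My first guess is the step that counts the words in the file. Last week my regular expression had a small error that made the word count wrong. This week I fixed it by converting all words to lowercase first and then filtering.
However, I think this step can still be optimized, because converting uppercase letters to lowercase before filtering costs extra time. I expect the run time to be only slightly shorter after optimization.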
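To check how much of the cost the lowercase conversion could account for, a minimal sketch with `timeit` can time `lower()` alone against `lower()` plus `re.findall`. The sample text and repetition counts are assumptions chosen only to make the run short; the real input is the War and Peace file.

```python
import re
import timeit

# Assumed sample input, roughly 700 KB of mixed-case text.
text = "The Quick brown fox; THE lazy dog. " * 20000
pattern = r'[a-z0-9^-]+'  # same pattern as in the post

words = re.findall(pattern, text.lower())

# Time the conversion alone, then the conversion plus the regex scan.
lower_seconds = timeit.timeit(lambda: text.lower(), number=5)
findall_seconds = timeit.timeit(lambda: re.findall(pattern, text.lower()), number=5)

print("lower() only:      %.4f s" % lower_seconds)
print("lower() + findall: %.4f s" % findall_seconds)
```

If the first number is small compared with the second, removing `lower()` can only give the small speed-up guessed at above.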

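2. The second place is the function for feature 4. To get the output format right, I wrote the word-counting logic out a second time there. If optimizing, this can be changed to call the encapsulated word-counting function directly.
def redirect(txt):  # Feature 4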
    words = re.findall(r'[a-z0-9^-]+', txt.lower())
    user_counters=Counter(words)
    total=0
    for user_counter in user_counters:
        total+=1
    print("total %d words\n"%total)
    lsts=user_counters.most_common(10)
    for lst in lsts:
        print("%s  %d"%(lst[0],lst[1]))
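One detail in the snippet above is easy to verify: the `for` loop that increments `total` counts one per distinct word, which is what `len()` of the `Counter` returns. A small sketch with a made-up word list:

```python
from collections import Counter

# Made-up input for illustration only.
words = ["the", "war", "and", "the", "peace", "and", "the"]
user_counters = Counter(words)

# Loop from the snippet above: adds one per distinct word.
total = 0
for user_counter in user_counters:
    total += 1

# Same value without an explicit Python-level loop.
distinct = len(user_counters)

print(total, distinct)  # 4 4
```

Note that this is the number of distinct words, not the number of word occurrences, even though the output line reads "total ... words".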
 

Requirement 2: Use a profiler to find the program's bottleneck. Give the three functions (or code fragments) that take the most running time. A screenshot is required. (5 points)

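Required: analyze why this is the bottleneck.

Required: the profile must report each function's running time and call count. Reporting only CPU and memory usage earns no points.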
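A profile with per-function running times and call counts can be produced with the standard-library `cProfile` and `pstats` modules. The sketch below is an assumption about how such a profile could be gathered, not the exact command used for the screenshots; `count_words`, `top_ten`, and the sample text are hypothetical.

```python
import cProfile
import io
import pstats
import re
from collections import Counter


def count_words(text):
    """Hypothetical stand-in for the word-counting step."""
    return Counter(re.findall(r'[a-z0-9^-]+', text.lower()))


def top_ten(text):
    """Hypothetical stand-in for the top-10 output step."""
    return count_words(text).most_common(10)


sample_text = "war and peace and war " * 5000  # assumed sample input

profiler = cProfile.Profile()
profiler.enable()
result = top_ten(sample_text)
profiler.disable()

# Collect the report as text, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats()
report = buffer.getvalue()
print(report)
```

The `ncalls` column gives the call count, and `tottime` and `cumtime` give the time spent in each function excluding and including its callees.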

 

 

 

 

Snippet:

def redirect(txt):  # Feature 4
    words = re.findall(r'[a-z0-9^-]+', txt.lower())
    user_counters=Counter(words)
    total=0
    for user_counter in user_counters:
        total+=1
    print("total %d words\n"%total)
    lsts=user_counters.most_common(10)
    for lst in lsts:
        print("%s  %d"%(lst[0],lst[1]))

Requirement 3: Based on the bottleneck, make a "best effort" to optimize the program's performance.

1. Before optimization:

def redirect(txt):  # Feature 4
    words = re.findall(r'[a-z0-9^-]+', txt.lower())
    user_counters=Counter(words)
    total=0
    for user_counter in user_counters:
        total+=1
    print("total %d words\n"%total)
    lsts=user_counters.most_common(10)
    for lst in lsts:
        print("%s  %d"%(lst[0],lst[1]))

 

After optimization:
The re.findall() call is simplified by dropping the lower() conversion, and the counting and output are delegated to the encapsulated function coTotal.

def redirect(txt):  # Feature 4
    words = re.findall(r'[a-zA-Z0-9^-]+', txt)
    coTotal(words)
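The post does not show the body of `coTotal`, so the version below is a hypothetical sketch of what such a helper could look like. The second half compares the two regular expressions on a made-up sentence: without `lower()`, words that differ only in case are counted separately, so the optimized version is faster but may not report the same counts.

```python
import re
from collections import Counter


def coTotal(words):
    """Hypothetical sketch of the shared helper; its real body is not shown in the post."""
    counts = Counter(words)
    print("total %d words\n" % len(counts))
    for word, count in counts.most_common(10):
        print("%s  %d" % (word, count))
    return counts


txt = "The war and the peace"  # made-up input

# Before: lowercase first, so "The" and "the" are the same word.
before = Counter(re.findall(r'[a-z0-9^-]+', txt.lower()))
# After: no lowercase conversion, so "The" and "the" are counted separately.
after = Counter(re.findall(r'[a-zA-Z0-9^-]+', txt))

print(before["the"], after["the"], after["The"])  # 2 1 1
```

If case-insensitive counts are required, the lowercase conversion has to stay somewhere in the pipeline.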

2. Before optimization:

def file_name(path):
    path = path + '.txt'
    try:
        with open(path, encoding='utf-8') as f:
            content = f.read()
    except FileNotFoundError:  # Exception handling: file not found, report that it does not exist
        msg = "The file " + path + " does not exist."
        print(msg)
    else:
        words = re.findall(r'[a-z0-9^-]+', content.lower())
        coTotal(words)

After optimization:

def file_name(path):  # Feature 2: takes a file name without its extension
    path = path + '.txt'
    with open(path, encoding='utf-8') as f:
        content = f.read()
        words = re.findall(r'[a-z0-9^-]+', content.lower())
        coTotal(words)
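The optimized version removes the try/except, so a missing file now raises an uncaught FileNotFoundError instead of printing a message. A minimal sketch, assuming the message is still wanted, shows that the handler can be kept without affecting the normal path. Here `file_name` returns the counts rather than calling `coTotal`, purely so the example is self-contained.

```python
import os
import re
import tempfile
from collections import Counter


def file_name(path):
    """Sketch of feature 2: read '<path>.txt'; return word counts, or None if missing."""
    path = path + '.txt'
    try:
        with open(path, encoding='utf-8') as f:
            content = f.read()
    except FileNotFoundError:
        print("The file " + path + " does not exist.")
        return None
    return Counter(re.findall(r'[a-z0-9^-]+', content.lower()))


# Demonstrate both paths with a temporary file (names are made up).
with tempfile.TemporaryDirectory() as folder:
    base = os.path.join(folder, "sample")
    with open(base + ".txt", "w", encoding="utf-8") as f:
        f.write("War and peace and war")
    found = file_name(base)
    missing = file_name(os.path.join(folder, "absent"))

print(found["war"], found["and"], found["peace"], missing)  # 2 2 1 None
```

Entering a try block costs very little when no exception is raised, so dropping it is unlikely to be where the time goes.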

Requirement 4: Profile again and report how much time the three most time-consuming functions from Requirement 1 now take. A screenshot is required. (2 points)

 

 

 

Time now spent in the three functions:

Three more test runs:



Origin www.cnblogs.com/summerkingy/p/11568471.html