Python multithreaded programming: threading (Part 1)

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/monkeysheep1234/article/details/84383176

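Consider the following scenario:

A crawler fetches Taobao's shop list page, collects the URL of every shop, and then visits each shop URL to scrape its data.

This scenario is a good fit for multithreading. Fetching the list page and fetching the detail pages do not interfere with each other, so they can run in separate threads as long as communication between the threads is handled properly.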
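One common way to handle the communication between the two crawler threads is a thread-safe queue. The sketch below is only an illustration under assumptions: the function names, the example.com URLs, and the SENTINEL stop marker are made up for this example and are not part of the original code. A producer thread stands in for the list-page crawler, and a consumer thread stands in for the detail-page crawler.

```python
import queue
import threading

SENTINEL = None  # hypothetical marker telling the consumer to stop


def produce_detail_urls(url_queue, shop_count):
    """Stand-in for the list-page crawler: puts made-up shop URLs on the queue."""
    for i in range(shop_count):
        url_queue.put("https://example.com/shop/{}".format(i))
    url_queue.put(SENTINEL)


def consume_detail_urls(url_queue, results):
    """Stand-in for the detail-page crawler: takes URLs off the queue."""
    while True:
        url = url_queue.get()  # blocks until an item is available
        if url is SENTINEL:
            break
        results.append("fetched " + url)


def crawl(shop_count=3):
    url_queue = queue.Queue()
    results = []
    producer = threading.Thread(target=produce_detail_urls, args=(url_queue, shop_count))
    consumer = threading.Thread(target=consume_detail_urls, args=(url_queue, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    for line in crawl():
        print(line)
```

queue.Queue does its own locking, so neither function needs an explicit lock. With a single consumer, the results come back in the order the URLs were produced.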

Here is the illustrative code, with time.sleep() standing in for the network requests:

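#-*-coding:utf-8-*-
import threading
import time


def get_detail_html(url):
    # Simulates fetching a shop detail page (takes 2 seconds).
    print("get detail html started")
    time.sleep(2)
    print("get detail html end")


def get_detail_url(url):
    # Simulates fetching the list page to collect detail URLs (takes 4 seconds).
    print("get detail url started")
    time.sleep(4)
    print("get detail url end")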

Call these functions in the different ways shown below and compare the output. First, some background on two Thread methods.

thread.setDaemon(True):

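setDaemon() relates to daemon threads. Calling thread.setDaemon(True) before the thread is started marks it as a daemon thread, meaning the thread is not essential and the process does not wait for it to finish when exiting. This is useful for preventing a child thread stuck in an infinite loop from keeping the program alive. With setDaemon(True), the child thread is terminated once the main thread and all other non-daemon threads have finished. With setDaemon(False), which is the default for threads created from the main thread, the process stays alive after the main thread finishes until the child thread completes. Note that setDaemon() has been deprecated since Python 3.10 in favor of the daemon attribute (thread.daemon = True) or the daemon=True constructor argument.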
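As a small sketch of the non-deprecated spellings, the daemon flag can be passed to the Thread constructor or assigned as an attribute before start(). The worker names and the background_task function below are made up for illustration.

```python
import threading
import time


def background_task():
    time.sleep(0.1)


# Option 1: pass daemon=True to the constructor.
worker_a = threading.Thread(target=background_task, daemon=True)

# Option 2: set the attribute before calling start().
worker_b = threading.Thread(target=background_task)
worker_b.daemon = True

# An explicitly non-daemon thread, for comparison.
worker_c = threading.Thread(target=background_task, daemon=False)

for worker in (worker_a, worker_b, worker_c):
    worker.start()

print(worker_a.daemon, worker_b.daemon, worker_c.daemon)

# The flag must be set before start(); changing it afterwards raises RuntimeError.
try:
    worker_c.daemon = True
    changed_after_start = True
except RuntimeError:
    changed_after_start = False

for worker in (worker_a, worker_b, worker_c):
    worker.join()
```

Both options are equivalent to calling setDaemon(True) before start().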

thread.join():

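join() is used for thread synchronization: the calling thread (here, the main thread) blocks at the join() call until the thread being joined has finished, and only then continues.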
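join() also accepts an optional timeout in seconds. The sketch below, with a made-up slow_task function and shortened sleep times, shows that join(timeout) may return while the thread is still running, so is_alive() is needed to tell whether the thread actually finished.

```python
import threading
import time


def slow_task():
    time.sleep(0.5)


worker = threading.Thread(target=slow_task)
worker.start()

# join(timeout) waits at most `timeout` seconds. It always returns None,
# so is_alive() is the way to check whether the thread has finished.
worker.join(timeout=0.1)
still_running = worker.is_alive()
print("still running after 0.1s:", still_running)

# join() without a timeout blocks until the thread has finished.
worker.join()
finished = not worker.is_alive()
print("finished after join():", finished)
```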

if __name__=="__main__":
    thread1 = threading.Thread(target=get_detail_html,args=("",))
    thread2 = threading.Thread(target=get_detail_url, args=("",))
    start_time = time.time()
    thread1.start()
    thread2.start()

    print("last time:{}".format(time.time()-start_time))

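Output (the main thread prints the elapsed time immediately; the non-daemon threads keep running, and the process exits only after they finish):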

get detail html started
get detail url started
last time:0.002000093460083008
get detail html end
get detail url end

if __name__=="__main__":
    thread1 = threading.Thread(target=get_detail_html,args=("",))
    thread2 = threading.Thread(target=get_detail_url, args=("",))
    start_time = time.time()

    thread1.setDaemon(True)
    thread2.setDaemon(True)

    thread1.start()
    thread2.start()

    print("last time:{}".format(time.time()-start_time))

Output:

get detail html started
get detail url started
last time:0.0

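Because the main thread finishes almost immediately, the daemon threads are killed while they are still inside time.sleep(), so neither "end" message is printed.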
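This behavior can be checked without watching the terminal by running a small script in a child interpreter and capturing its output. The following is a sketch under assumptions: CHILD_SCRIPT and the messages it prints are made up for this example, and the 0.2 second pause is there only so the worker reliably prints its first line before the main thread exits.

```python
import subprocess
import sys

# Script run in a child interpreter; names and messages are illustrative.
CHILD_SCRIPT = """
import threading, time

def work():
    print("worker started", flush=True)
    time.sleep(5)
    print("worker end", flush=True)

t = threading.Thread(target=work, daemon=True)
t.start()
time.sleep(0.2)  # give the worker time to print its first line
print("main thread done", flush=True)
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD_SCRIPT],
    capture_output=True,
    text=True,
    timeout=30,
)
print(result.stdout)
```

The child process exits as soon as its main thread is done, so the captured output contains the worker's first message but not its last.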

if __name__=="__main__":
    thread1 = threading.Thread(target=get_detail_html,args=("",))
    thread2 = threading.Thread(target=get_detail_url, args=("",))
    start_time = time.time()
    #
    # thread1.setDaemon(True)
    # thread2.setDaemon(True)

    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    print("last time:{}".format(time.time()-start_time))

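Output (the main thread waits for both threads; the total is about 4 seconds, the longer of the two sleeps, because the threads run concurrently):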

get detail html started
get detail url started
get detail html end
get detail url end
last time:4.0012288093566895

if __name__=="__main__":
    thread1 = threading.Thread(target=get_detail_html,args=("",))
    thread2 = threading.Thread(target=get_detail_url, args=("",))
    start_time = time.time()

    thread1.setDaemon(True)
    thread2.setDaemon(True)

    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    print("last time:{}".format(time.time()-start_time))

Output (join() still waits for both threads to finish, so the daemon flag makes no difference here):

get detail html started
get detail url started
get detail html end
get detail url end
last time:4.0012290477752686
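Both join() runs take about 4 seconds rather than 6 because the two sleeps overlap. The sketch below reproduces that with scaled-down durations. The helper names simulated_fetch and run_concurrently are made up for this example, and time.perf_counter() is used instead of time.time() as a clock better suited to measuring intervals.

```python
import threading
import time


def simulated_fetch(seconds):
    # Stands in for a network request that takes `seconds` to complete.
    time.sleep(seconds)


def run_concurrently(durations):
    """Start one thread per duration, join them all, return elapsed seconds."""
    threads = [threading.Thread(target=simulated_fetch, args=(d,)) for d in durations]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_concurrently([0.2, 0.4])
print("elapsed: {:.2f}s".format(elapsed))
```

The elapsed time should be close to the longest duration (0.4 s), not the sum (0.6 s).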
