Downloading all the source code from https://github.com/android

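Getting source code from https://android.googlesource.com/ is a real hassle: you need a VPN, and repo takes ages on top of that. GitHub is much faster and, crucially, needs no VPN, so you can download at any time. GitHub also lets you download a zip archive directly, which is far faster than anything repo can manage. Below is a script for batch-downloading the zip archives: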


import re
import requests

download_path = '.'  # where the downloaded archives are stored
tag = 'android-4.1.2_r2.1'  # branch or tag name; use master for the main branch

base_url = 'https://github.com/android'
archive_url = 'https://github.com/android/%s/archive/%s.zip'
# Both patterns are tied to the HTML of the GitHub organization page.
pagination_re = r'<a href="/android\?page=.*?">(.*?)</a>'
repo_re = r'<a href="/android/.*?" itemprop="name codeRepository">(.*?)</a>'
page_count = 1
repo_items = []

session = requests.Session()
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)',
})

# Fetch the first page and read the page count from the pagination links.
html = session.get(base_url).text
page_result = re.findall(pagination_re, html, re.S)
if len(page_result) >= 2:
    # The second-to-last link is taken as the highest page number
    # (the last one is presumably the "Next" link).
    page_count = int(page_result[-2])

# Collect the repository names from the first page.
repo_items += re.findall(repo_re, html, re.S)

# Collect the repository names from the remaining pages.
for current_page in range(2, page_count + 1):
    html = session.get(base_url + '/?page=%d' % current_page).text
    repo_items += re.findall(repo_re, html, re.S)

# Strip surrounding whitespace from every name, not just those on page one.
repo_items = [name.strip() for name in repo_items]
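When the script finishes, it generates a bat file that uses wget to do the downloading; its contents look like the following. Run the bat file and wait for the downloads to complete.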


wget "https://github.com/android/platform_frameworks_base/archive/android-4.1.2_r2.1.zip" -c --output-document=".\platform_frameworks_base.zip" --no-check-certificate
wget "https://github.com/android/kernel_common/archive/android-4.1.2_r2.1.zip" -c --output-document=".\kernel_common.zip" --no-check-certificate
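The listing above ends once repo_items has been filled, and the step that writes the bat file is not shown. Here is a minimal sketch of how that step could look, reproducing the format of the two sample lines. The function names build_wget_command and write_bat and the file name download.bat are assumptions for illustration only; the original post does not give them.

```python
import ntpath

ARCHIVE_URL = 'https://github.com/android/%s/archive/%s.zip'


def build_wget_command(repo, tag, download_path='.'):
    """Return one wget command line in the same format as the sample output."""
    url = ARCHIVE_URL % (repo, tag)
    # ntpath joins with backslashes on every platform, matching the bat output.
    target = ntpath.join(download_path, repo + '.zip')
    return ('wget "%s" -c --output-document="%s" --no-check-certificate'
            % (url, target))


def write_bat(repo_items, tag, download_path='.', bat_name='download.bat'):
    """Write one wget line per repository; bat_name is an assumed file name."""
    lines = [build_wget_command(repo, tag, download_path) for repo in repo_items]
    # newline='\r\n' makes every written '\n' a Windows line ending.
    with open(bat_name, 'w', newline='\r\n') as bat_file:
        for line in lines:
            bat_file.write(line + '\n')
    return lines
```

Calling write_bat(repo_items, tag, download_path) at the end of the script would produce the file. The -c flag lets wget resume a partially downloaded archive, so the bat file can simply be run again after an interruption.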


Reposted from blog.csdn.net/kowity/article/details/19006775