[Scrapy: Scrape a Website in Five Minutes] [General News] Hands-On Scrapy: Scraping the Entire 中国经济网 (China Economic Net) Site

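About the target site

中国经济网 (China Economic Net, ce.cn) is the only key national news website built around economic reporting. It produces a large volume of economic news every day and also aggregates economic news and information from major domestic media, providing an authoritative reference for government departments and corporate decision-makers…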

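Getting started with Scrapy

Preparing for data collection

1. If you are not familiar with the five-minute approach to scraping a site, read this first:
[Scrapy: Scrape a Website in Five Minutes] Essential Background for Full-Site Scraping

2. If you are not familiar with how scraping tasks are organized and managed, read this first:
[Scrapy: Scrape a Website in Five Minutes] Organizing Crawl Targets and Preparing Data

3. If you are not familiar with mass-producing spiders from the Scrapy template, read this first (required):
[Scrapy: Scrape a Website in Five Minutes] A General-Purpose Project Template for Data Scraping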

Result of organizing the targets

1. Screenshot of the targets saved in Excel

Applying the template

The <project>.py file under Spider

1. Create the spider

scrapy genspider www_ce_cn " "

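2. Work out the site-wide CSS styles
Start by looking at the CSS styles used on the pages. The whole site shares six basic styles; the remaining special-case styles are numerous, so they are all handed to extract_list from gerapy_auto_extractor.extractors.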
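
The split described here (a few known list layouts handled explicitly, everything else delegated to an automatic extractor) can be pictured as a simple fallback chain. The sketch below is only an illustration under stated assumptions: `extract_links`, `by_marker`, `known`, and `auto` are hypothetical names that do not come from the template, and the extractors are plain callables so the snippet runs without Scrapy. In a real spider the known extractors would presumably wrap the six CSS selectors and the fallback would wrap `extract_list`.

```python
from typing import Callable, Iterable, List

# An extractor takes raw HTML and returns the article links it found.
Extractor = Callable[[str], List[str]]


def extract_links(html: str,
                  known_extractors: Iterable[Extractor],
                  fallback: Extractor) -> List[str]:
    """Return links from the first known extractor that finds any,
    otherwise hand the page to the fallback extractor."""
    for extractor in known_extractors:
        links = extractor(html)
        if links:
            return links
    return fallback(html)


def by_marker(marker: str) -> Extractor:
    """Toy stand-in for a CSS-selector extractor (hypothetical):
    it 'matches' when the marker string occurs in the HTML."""
    def extractor(html: str) -> List[str]:
        return [marker + "/a.shtml"] if marker in html else []
    return extractor


# Two toy "known layouts" and a toy automatic extractor (all hypothetical).
known = [by_marker("list_one"), by_marker("list_two")]
auto = lambda html: ["auto/a.shtml"]

matched = extract_links("<ul class='list_two'></ul>", known, auto)
unmatched = extract_links("<div class='special'></div>", known, auto)
```

Keeping the known layouts first means the automatic extractor only runs on pages that none of the explicit patterns recognize.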


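3. Modify the contents of www_ce_cn.py

Only the parts that need to change are described here; everything else follows the template and needs no modification.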
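
The `start_menu` configured in this step is a list of groups, each group being a list of dicts with `channel_name` and `url` keys. It spans several `ce.cn` subdomains and repeats a few URLs (for example, 金融证券-焦点财讯 is listed twice). As a minimal sketch, assuming you want to sanity-check the menu before running the spider, the helpers below flatten the groups, skip repeated URLs, and list the hosts involved. `sample_menu`, `flatten_menu`, and `hosts_in_menu` are hypothetical names, not part of the template.

```python
from urllib.parse import urlparse

# Shortened sample in the same list-of-lists shape as start_menu.
sample_menu = [
    [
        {"channel_name": "金融证券-财经滚动新闻", "url": "http://finance.ce.cn/rolling/"},
        {"channel_name": "金融证券-焦点财讯", "url": "http://finance.ce.cn/sub/cj2009/"},
        {"channel_name": "金融证券-焦点财讯", "url": "http://finance.ce.cn/sub/cj2009/"},
    ],
    [
        {"channel_name": "产业市场-旅游频道-滚动", "url": "http://travel.ce.cn/gdtj/"},
    ],
]


def flatten_menu(menu):
    """Yield each channel dict once, skipping URLs that were already seen."""
    seen = set()
    for group in menu:
        for channel in group:
            if channel["url"] not in seen:
                seen.add(channel["url"])
                yield channel


def hosts_in_menu(menu):
    """Return the sorted set of host names used by the menu's URLs."""
    return sorted({urlparse(c["url"]).netloc for c in flatten_menu(menu)})


channels = list(flatten_menu(sample_menu))
hosts = hosts_in_menu(sample_menu)
```

The host list is a quick way to see which subdomains the crawl will touch, which matters if you later decide to fill in `allowed_domains` instead of leaving it empty.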

  • Scope and custom settings
    allowed_domains = []
    web_name = "中国经济网"
  • Add the channels to scrape
    start_menu = [
        # Finance and securities
        [
            {"channel_name": "金融证券-互联网金融观察", "url": "http://finance.ce.cn/hlwjr/"},
            {"channel_name": "金融证券-专题精选", "url": "http://finance.ce.cn/home/zt/"},
            {"channel_name": "金融证券-财经滚动新闻", "url": "http://finance.ce.cn/rolling/"},
            {"channel_name": "金融证券-公司聚焦", "url": "http://finance.ce.cn/home/jrzq/dc/"},
            {"channel_name": "金融证券-私募观点", "url": "http://finance.ce.cn/jjpd/jjdsp/jjsyzn/"},
            {"channel_name": "金融证券-板块研究", "url": "http://finance.ce.cn/10cjsy/bk/"},
        ],
        # Futures channel
        [
            {"channel_name": "金融证券-期货频道-交易所&协会通知", "url": "http://finance.ce.cn/futures/jysjxhtz/"},
            {"channel_name": "金融证券-期货频道-公司专栏", "url": "http://finance.ce.cn/futures/qhgszl/"},
            {"channel_name": "金融证券-期货频道-会议论坛", "url": "http://finance.ce.cn/futures/qhhyjlt/"},
            {"channel_name": "金融证券-期货频道-投资顾问", "url": "http://finance.ce.cn/futures/qhtzgw/"},
            {"channel_name": "金融证券-期货频道-评论&研报", "url": "http://finance.ce.cn/futures/qhscpl/"},
            {"channel_name": "金融证券-期货频道-资讯&公告", "url": "http://finance.ce.cn/futures/qtzx/"},
            {"channel_name": "金融证券-期货频道-现货资讯", "url": "http://finance.ce.cn/futures/xhzx/"},
            {"channel_name": "金融证券-期货频道-期货滚动报道", "url": "http://finance.ce.cn/futures/qhgdbd/"},
            {"channel_name": "金融证券-期货频道-期市要闻区", "url": "http://finance.ce.cn/futures/qhywq/"},
            {"channel_name": "金融证券-期货频道-期货交易所", "url": "http://finance.ce.cn/futures/gjqhjyslj/"},
            {"channel_name": "金融证券-期货频道-期货", "url": "http://finance.ce.cn/futures/zjqhxy/"},
        ],
        # NEEQ (New Third Board)
        [
            {"channel_name": "金融证券-新三板-新三板滚动新闻", "url": "http://finance.ce.cn/xsb/xsbgdxw/"},
        ],
        # Wealth management
        [
            {"channel_name": "金融证券-理财-热点聚焦", "url": "http://finance.ce.cn/money/2016lc/rdjj/"},
        ],
        # Funds channel
        [
            {"channel_name": "金融证券-基金频道-基金滚动新闻", "url": "http://finance.ce.cn/jjpd/jjpdgd/"},
            {"channel_name": "金融证券-基金频道-基金人物秀", "url": "http://finance.ce.cn/jjpd/jjpddyp/rwx/"},
            {"channel_name": "金融证券-基金频道-中经视点", "url": "http://finance.ce.cn/jjpd/jjpddyp/zjsd/"},
            {"channel_name": "金融证券-基金频道-基金研报", "url": "http://finance.ce.cn/jjpd/jjdsp/yb/"},
            {"channel_name": "金融证券-基金频道-基金看市", "url": "http://finance.ce.cn/jjpd/jjpddyp/jjks/"},
            {"channel_name": "金融证券-基金频道-基金公告", "url": "http://finance.ce.cn/jjpd/jjdsp/jjgg/"},
            {"channel_name": "金融证券-基金频道-机构专栏", "url": "http://finance.ce.cn/jjpd/jjpddyp/jjpdzl/jjpdjg/"},
            {"channel_name": "金融证券-基金频道-私募动态", "url": "http://finance.ce.cn/jjpd/jjdep/dt/"},
            {"channel_name": "金融证券-基金频道-要闻", "url": "http://finance.ce.cn/jjpd/jjpddyp/jjpdyw/"},
            {"channel_name": "金融证券-基金频道-基金经理风采", "url": "http://finance.ce.cn/jjpd/jjdsp/jjfc/"},
            {"channel_name": "金融证券-基金频道-基金申赎异动", "url": "http://finance.ce.cn/jjpd/jjdsp/ssyd/"},
            {"channel_name": "金融证券-基金频道-新基速递", "url": "http://finance.ce.cn/jjpd/jjdsp/tj/"},
            {"channel_name": "金融证券-基金频道-基金创新", "url": "http://finance.ce.cn/jjpd/jjdsp/jjcx/"},
            {"channel_name": "金融证券-基金频道-私募研报", "url": "http://finance.ce.cn/jjpd/jjdsp/zcg/"},
            {"channel_name": "金融证券-基金频道-基金学堂", "url": "http://finance.ce.cn/jjpd/jjdep/xt/"},
        ],
        # Insurance channel
        [
            {"channel_name": "金融证券-保险频道-业内交流", "url": "http://finance.ce.cn/insurance/jjdt/"},
            {"channel_name": "金融证券-保险频道-记者观察", "url": "http://finance.ce.cn/insurance/zzbx/"},
            {"channel_name": "金融证券-保险频道-保险专题", "url": "http://finance.ce.cn/insurance/bxzt/"},
            {"channel_name": "金融证券-保险频道-政策法规", "url": "http://finance.ce.cn/insurance/zcfg/"},
            {"channel_name": "金融证券-保险频道-保险课堂", "url": "http://finance.ce.cn/insurance/bxlp/"},
            {"channel_name": "金融证券-保险频道-险企新闻", "url": "http://finance.ce.cn/insurance/ylbx/"},
            {"channel_name": "金融证券-保险频道-行业动态", "url": "http://finance.ce.cn/insurance/ccbx/"},
            {"channel_name": "金融证券-保险频道-理赔维权", "url": "http://finance.ce.cn/insurance/jkx/"},
            {"channel_name": "金融证券-保险频道-险种产品", "url": "http://finance.ce.cn/insurance/ywx/"},
            {"channel_name": "金融证券-保险频道-保险数据", "url": "http://finance.ce.cn/insurance/zbx/"},
            {"channel_name": "金融证券-保险频道-2021保险", "url": "http://finance.ce.cn/insurance1/scrollnews/"},
            {"channel_name": "金融证券-保险频道-要闻", "url": "http://finance.ce.cn/insurance/yw/"},
            {"channel_name": "金融证券-保险频道-黑名单", "url": "http://finance.ce.cn/insurance/wzbx/"},
            {"channel_name": "金融证券-保险频道-深度报道", "url": "http://finance.ce.cn/sub/gssj/sdbd/"},
        ],
        # Banking channel
        [
            {"channel_name": "金融证券-银行频道-要闻关注", "url": "http://finance.ce.cn/bank/yw/"},
            {"channel_name": "金融证券-银行频道-信贷融资", "url": "http://finance.ce.cn/bank/xdfx/"},
            {"channel_name": "金融证券-银行频道-理财产品", "url": "http://finance.ce.cn/bank/lccp/"},
            {"channel_name": "金融证券-银行频道-上市银行", "url": "http://finance.ce.cn/bank/dzyh/"},
            {"channel_name": "金融证券-银行频道-行业新闻", "url": "http://finance.ce.cn/bank/sryh/"},
            {"channel_name": "金融证券-银行频道-优惠信息", "url": "http://finance.ce.cn/bank/yhk/"},
            {"channel_name": "金融证券-银行频道-滚动新闻", "url": "http://finance.ce.cn/bank12/scroll/"},
            {"channel_name": "金融证券-银行频道-银行专题", "url": "http://finance.ce.cn/bank/yhzt/"},
            {"channel_name": "金融证券-银行频道-政策法规", "url": "http://finance.ce.cn/bank/zcfg/"},
            {"channel_name": "金融证券-银行频道-银行课堂", "url": "http://finance.ce.cn/bank/hqjr/"},
            {"channel_name": "金融证券-银行频道-独家报道", "url": "http://finance.ce.cn/bank/jgdt/"},
            {"channel_name": "金融证券-银行频道-机构专栏", "url": "http://finance.ce.cn/bank/zzyh/"},
            {"channel_name": "金融证券-银行频道-业绩一览", "url": "http://finance.ce.cn/bank/wzyh/"},
        ],
        # Stock market channel
        [
            {"channel_name": "金融证券-股市频道-债市聚焦", "url": "http://finance.ce.cn/home/cfzq/zq/"},
            {"channel_name": "金融证券-股市频道-股指期货", "url": "http://finance.ce.cn/10cjsy/qt/"},
            {"channel_name": "金融证券-股市频道-海外市场", "url": "http://finance.ce.cn/10cjsy/hw/"},
            {"channel_name": "金融证券-股市频道-并购重组", "url": "http://finance.ce.cn/10cjsy/bg/"},
            {"channel_name": "金融证券-股市频道-大势研判", "url": "http://finance.ce.cn/home/zqzq/dp/"},
            {"channel_name": "金融证券-股市频道-即时解盘", "url": "http://finance.ce.cn/stock/jsjp/"},
            {"channel_name": "金融证券-股市频道-上市全观察", "url": "http://finance.ce.cn/shqgc/"},
            {"channel_name": "金融证券-股市频道-金融证券", "url": "http://finance.ce.cn/"},
            {"channel_name": "金融证券-股市频道-滚动资讯", "url": "http://finance.ce.cn/stock/gsgdbd/"},
        ],
        # Rolling news
        [
            {"channel_name": "金融证券-外汇滚动新闻", "url": "http://finance.ce.cn/fe/gdxw/"},
            {"channel_name": "金融证券-小微金融滚动", "url": "http://finance.ce.cn/xwjr/gd/"},
            {"channel_name": "金融证券-债券滚动报道", "url": "http://finance.ce.cn/bond/zqgdbd/"},
            {"channel_name": "金融证券-新三板评论", "url": "http://finance.ce.cn/xsb/xsbpl/"},
            {"channel_name": "金融证券-新三板知识", "url": "http://finance.ce.cn/xsb/xsbzs/"},
            {"channel_name": "金融证券-新三板公司动态", "url": "http://finance.ce.cn/xsb/xsbgsdt/"},
            {"channel_name": "金融证券-上市动态", "url": "http://finance.ce.cn/shqgc/sc/"},
            {"channel_name": "金融证券-公司解析", "url": "http://finance.ce.cn/shqgc/pl/"},
            {"channel_name": "金融证券-融资滚动", "url": "http://finance.ce.cn/rz/rzgd/"},
            {"channel_name": "金融证券-股市七日谈", "url": "http://finance.ce.cn/sub/qrt/gs/"},
            {"channel_name": "金融证券-银行七日谈", "url": "http://finance.ce.cn/sub/qrt/yh/"},
            {"channel_name": "金融证券-保险七日谈", "url": "http://finance.ce.cn/sub/qrt/bx/"},
            {"channel_name": "金融证券-上市动态", "url": "http://finance.ce.cn/shqgc/sc/"},
            {"channel_name": "金融证券-上市公司人事更多报道", "url": "http://finance.ce.cn/sub/ssgsrs/gd/"},
            {"channel_name": "金融证券-最新报道", "url": "http://finance.ce.cn/sub/ggttk/zx/"},
            {"channel_name": "金融证券-小微金融滚动", "url": "http://finance.ce.cn/xwjr/gd/"},
            {"channel_name": "金融证券-热点聚焦", "url": "http://finance.ce.cn/2015home/jj/"},
            {"channel_name": "金融证券-焦点财讯", "url": "http://finance.ce.cn/sub/cj2009/"},
            {"channel_name": "金融证券-焦点财讯", "url": "http://finance.ce.cn/sub/cj2009/"},
            {"channel_name": "金融证券-最新报道", "url": "http://finance.ce.cn/sub/ggttk/zx/"},
        ],
        # Food
        [
            {"channel_name": "产业市场-食品-食品专题", "url": "http://www.ce.cn/cysc/sp/subject/"},
            {"channel_name": "产业市场-食品-曝光台", "url": "http://www.ce.cn/cysc/sp/baoguantai/"},
            {"channel_name": "产业市场-食品-公司观察", "url": "http://www.ce.cn/cysc/sp/ssgs/"},
            {"channel_name": "产业市场-食品-食品行业动态", "url": "http://www.ce.cn/cysc/sp/info/"},
            {"channel_name": "产业市场-食品-中经舆情", "url": "http://www.ce.cn/cysc/sp/zhongjingyuqing/"},
            {"channel_name": "产业市场-食品-食品安全大讲堂", "url": "http://www.ce.cn/cysc/sp/djt/"},
            {"channel_name": "产业市场-食品-食品监管动态", "url": "http://www.ce.cn/cysc/sp/shiyaojianju/"},
            {"channel_name": "产业市场-食品-老年食品与营养", "url": "http://www.ce.cn/cysc/sp/lnsp/"},
            {"channel_name": "产业市场-食品-各地美食", "url": "http://www.ce.cn/cysc/sp/wy/"},
            {"channel_name": "产业市场-食品-科学用药", "url": "http://www.ce.cn/cysc/sp/aqts/"},
            {"channel_name": "产业市场-食品-中经调查", "url": "http://www.ce.cn/cysc/sp/dc/"},
            {"channel_name": "产业市场-食品-酒业", "url": "http://www.ce.cn/cysc/sp/jiu/"},
            {"channel_name": "产业市场-食品-会展报道", "url": "http://www.ce.cn/cysc/sp/xcbd/"},
            {"channel_name": "产业市场-食品-本网专稿", "url": "http://www.ce.cn/cysc/sp/bwzg/"},
            {"channel_name": "产业市场-食品-医疗器械", "url": "http://www.ce.cn/cysc/sp/tjj/"},
            {"channel_name": "产业市场-食品-餐饮", "url": "http://www.ce.cn/cysc/sp/cyaq/"},
            {"channel_name": "产业市场-食品-饮料", "url": "http://www.ce.cn/cysc/sp/cy/"},
            {"channel_name": "产业市场-食品-乳业", "url": "http://www.ce.cn/cysc/sp/ry/"},
            {"channel_name": "产业市场-食品-保健食品", "url": "http://www.ce.cn/cysc/sp/ly/"},
            {"channel_name": "产业市场-食品-药品", "url": "http://www.ce.cn/cysc/sp/bk/"},
        ],
        # Real estate
        [
            {"channel_name": "产业市场-房产-房产资讯", "url": "http://www.ce.cn/cysc/fdc/fc/"},
            {"channel_name": "产业市场-房产-商业地产", "url": "http://www.ce.cn/cysc/fdc/jn/sy/"},
            {"channel_name": "产业市场-房产-本网专稿", "url": "http://www.ce.cn/cysc/fdc/12/"},
        ],
        # Energy
        [
            {"channel_name": "产业市场-能源-滚动新闻", "url": "http://www.ce.cn/cysc/ny/gdxw/"},
            {"channel_name": "产业市场-能源-冶金", "url": "http://www.ce.cn/cysc/newmain/jdpd/yj/"},
            {"channel_name": "产业市场-能源-本网专稿", "url": "http://www.ce.cn/cysc/newmain/right/zg/"},
            {"channel_name": "产业市场-能源-专题列表", "url": "http://www.ce.cn/cysc/newmain/yc/zt/"},
        ],
        # IT
        [
            {"channel_name": "产业市场-IT-本网专稿", "url": "http://www.ce.cn/cysc/newmain/right/zg/"},
            {"channel_name": "产业市场-IT-滚动新闻", "url": "http://www.ce.cn/cysc/tech/gd2012/"},
        ],
        # Home appliances
        [
            {"channel_name": "产业市场-家电-网购卖场", "url": "http://www.ce.cn/cysc/zgjd/wgsv/"},
            {"channel_name": "产业市场-家电-政策法规", "url": "http://www.ce.cn/cysc/zgjd/zcfg/"},
            {"channel_name": "产业市场-家电-业绩财报", "url": "http://www.ce.cn/cysc/zgjd/yjcb/"},
            {"channel_name": "产业市场-家电-质量曝光", "url": "http://www.ce.cn/cysc/zgjd/jdsh/"},
            {"channel_name": "产业市场-家电-行业新闻", "url": "http://www.ce.cn/cysc/zgjd/hyfx/"},
            {"channel_name": "产业市场-家电-公司观察", "url": "http://www.ce.cn/cysc/zgjd/qycz/"},
            {"channel_name": "产业市场-家电-业界动态", "url": "http://www.ce.cn/cysc/zgjd/yjxw/"},
            {"channel_name": "产业市场-家电-今日更新", "url": "http://www.ce.cn/cysc/zgjd/kx/"},
        ],
        # Transportation
        [
            {"channel_name": "产业市场-交通-要闻", "url": "http://www.ce.cn/cysc/jtys/yw/"},
            {"channel_name": "产业市场-交通-铁路", "url": "http://www.ce.cn/cysc/jtys/tielu/"},
            {"channel_name": "产业市场-交通-航空", "url": "http://www.ce.cn/cysc/jtys/hangkong/"},
            {"channel_name": "产业市场-交通-公路", "url": "http://www.ce.cn/cysc/jtys/gonglu/"},
            {"channel_name": "产业市场-交通-海运", "url": "http://www.ce.cn/cysc/jtys/haiyun/"},
            {"channel_name": "产业市场-交通-城市交通", "url": "http://www.ce.cn/cysc/jtys/csjt/"},
            {"channel_name": "产业市场-交通-综合物流", "url": "http://www.ce.cn/cysc/jtys/zhwl/"},
            {"channel_name": "产业市场-交通-交通法规", "url": "http://www.ce.cn/cysc/jtys/fgjd/"},
            {"channel_name": "产业市场-交通-交通运输", "url": "http://www.ce.cn/cysc/jtys/"},
        ],
        # Quality and safety
        [
            {"channel_name": "产业市场-质量安全-每日更新", "url": "http://www.ce.cn/cysc/zljd/gd/"},
            {"channel_name": "产业市场-质量安全-权威发布", "url": "http://www.ce.cn/cysc/zljd/qwfb/"},
            {"channel_name": "产业市场-质量安全-消费预警", "url": "http://www.ce.cn/cysc/zljd/xfyj/"},
            {"channel_name": "产业市场-质量安全-黑榜", "url": "http://www.ce.cn/cysc/zljd/hb/"},
            {"channel_name": "产业市场-质量安全-红榜", "url": "http://www.ce.cn/cysc/zljd/hong/"},
            {"channel_name": "产业市场-质量安全-电子商务", "url": "http://www.ce.cn/cysc/zljd/dzsw/"},
            {"channel_name": "产业市场-质量安全-召回信息", "url": "http://www.ce.cn/cysc/zljd/zhxx/"},
            {"channel_name": "产业市场-质量安全-各地市场信息", "url": "http://www.ce.cn/cysc/zljd/zlxx/"},
            {"channel_name": "产业市场-质量安全-本网原创", "url": "http://www.ce.cn/cysc/zljd/yc/"},
            {"channel_name": "产业市场-质量安全-关注度", "url": "http://www.ce.cn/cysc/zljd/gzd/"},
            {"channel_name": "产业市场-质量安全-质量观察", "url": "http://www.ce.cn/cysc/zljd/yqhz/"},
            {"channel_name": "产业市场-质量安全-服务质量", "url": "http://www.ce.cn/cysc/zljd/fwzl/"},
            {"channel_name": "产业市场-质量安全-消协资讯", "url": "http://www.ce.cn/cysc/zljd/xxzx/"},
            {"channel_name": "产业市场-质量安全-标准纵览", "url": "http://www.ce.cn/cysc/zljd/bz/"},
            {"channel_name": "产业市场-质量安全-政策法规", "url": "http://www.ce.cn/cysc/zljd/zcfg/"},
            {"channel_name": "产业市场-质量安全-质量知识大讲堂", "url": "http://www.ce.cn/cysc/zljd/djt/"},
            {"channel_name": "产业市场-质量安全-滚动", "url": "http://www.ce.cn/cysc/zljd/gd/"},
        ],
        # Quality economy
        [
            {"channel_name": "产业市场-质量经济-曝光台", "url": "http://12365.ce.cn/zlpd/bgtd/"},
            {"channel_name": "产业市场-质量经济-质量专题", "url": "http://12365.ce.cn/zlpd/rdzt/"},
            {"channel_name": "产业市场-质量经济-质量舆论", "url": "http://12365.ce.cn/zlpd/zlsp/"},
            {"channel_name": "产业市场-质量经济-质量管理", "url": "http://12365.ce.cn/zlpd/jdgl/"},
            {"channel_name": "产业市场-质量经济-品牌建设", "url": "http://12365.ce.cn/zlpd/bytx/"},
            {"channel_name": "产业市场-质量经济-地方质检", "url": "http://12365.ce.cn/zlpd/dfzj/"},
            {"channel_name": "产业市场-质量经济-权威发布", "url": "http://12365.ce.cn/zlpd/qwfb/"},
            {"channel_name": "产业市场-质量经济-质量资讯", "url": "http://12365.ce.cn/zlpd/jsxx/"},
            {"channel_name": "产业市场-质量经济-质量提升", "url": "http://12365.ce.cn/zlpd/yw/yw/"},
            {"channel_name": "产业市场-质量经济-高度关注", "url": "http://12365.ce.cn/zlpd/ldr/"},
            {"channel_name": "产业市场-质量经济-质量技术基础", "url": "http://12365.ce.cn/zlpd/rzbz/"},
            {"channel_name": "产业市场-质量经济-诚信责任", "url": "http://12365.ce.cn/zlpd/ppjs/"},
        ],
        # Pharmaceuticals channel
        [
            {"channel_name": "产业市场-医药频道-大咖谈", "url": "http://www.ce.cn/cysc/yy/qyjzf/"},
            {"channel_name": "产业市场-医药频道-行业动态", "url": "http://www.ce.cn/cysc/yy/hydt/"},
            {"channel_name": "产业市场-医药频道-权威发布", "url": "http://www.ce.cn/cysc/yy/qwfb/"},
            {"channel_name": "产业市场-医药频道-医药科普", "url": "http://www.ce.cn/cysc/yy/yydjt/"},
            {"channel_name": "产业市场-医药频道-资本市场", "url": "http://www.ce.cn/cysc/yy/ssgs/"},
            {"channel_name": "产业市场-医药频道-监督报道", "url": "http://www.ce.cn/cysc/yy/yyhhb/"},
            {"channel_name": "产业市场-医药频道-公司新闻", "url": "http://www.ce.cn/cysc/yy/gdpl/"},
            {"channel_name": "产业市场-医药频道-中医药", "url": "http://www.ce.cn/cysc/yy/zy/"},
            {"channel_name": "产业市场-医药频道-药店", "url": "http://www.ce.cn/cysc/yy/yd/"},
            {"channel_name": "产业市场-医药频道-临床研究", "url": "http://www.ce.cn/cysc/yy/hzp/"},
            {"channel_name": "产业市场-医药频道-医美·化妆品", "url": "http://www.ce.cn/cysc/yy/lcyj/"},
            {"channel_name": "产业市场-医药频道-医疗器械", "url": "http://www.ce.cn/cysc/yy/ylqx/"},
            {"channel_name": "产业市场-医药频道-医疗新闻", "url": "http://www.ce.cn/cysc/yy/ylxw/"},
            {"channel_name": "产业市场-医药频道-创新药", "url": "http://www.ce.cn/cysc/yy/hwkx/"},
        ],
        # Ecological civilization
        [
            {"channel_name": "产业市场-生态文明-滚动新闻", "url": "http://www.ce.cn/cysc/stwm/gd/"},
            {"channel_name": "产业市场-生态文明-美丽中国", "url": "http://www.ce.cn/cysc/stwm/mlzg/"},
            {"channel_name": "产业市场-生态文明-环境监管", "url": "http://www.ce.cn/cysc/stwm/qygc/"},
            {"channel_name": "产业市场-生态文明-绿色发展", "url": "http://www.ce.cn/cysc/stwm/lsjj/"},
            {"channel_name": "产业市场-生态文明-污染防治", "url": "http://www.ce.cn/cysc/stwm/wrfz/"},
            {"channel_name": "产业市场-生态文明-生态保护", "url": "http://www.ce.cn/cysc/stwm/zxdt/"},
            {"channel_name": "产业市场-生态文明-政策解读", "url": "http://www.ce.cn/cysc/stwm/zc/"},
            {"channel_name": "产业市场-生态文明-本网专稿", "url": "http://www.ce.cn/cysc/stwm/zg/"},
        ],
        # Travel channel
        [
            {"channel_name": "产业市场-旅游频道-滚动", "url": "http://travel.ce.cn/gdtj/"},
            {"channel_name": "产业市场-旅游频道-文化旅游", "url": "http://travel.ce.cn/xsy/gl/"},
            {"channel_name": "产业市场-旅游频道-舆情投诉", "url": "http://travel.ce.cn/xsy/yq/"},
            {"channel_name": "产业市场-旅游频道-酒店航空", "url": "http://travel.ce.cn/xsy/jd/"},
            {"channel_name": "产业市场-旅游频道-在线旅游", "url": "http://travel.ce.cn/xsy/zx/"},
            {"channel_name": "产业市场-旅游频道-旅游经济信息联播", "url": "http://travel.ce.cn/xsy/sp/"},
            {"channel_name": "产业市场-旅游频道-权威发布", "url": "http://travel.ce.cn/xsy/fb/"},
            {"channel_name": "产业市场-旅游频道-产业经济", "url": "http://travel.ce.cn/xsy/cy/"},
        ],
        # Cultural industry
        [
            {"channel_name": "产业市场-文化产业-中经文化产业", "url": "http://www.ce.cn//culture/whcyk/zjwhcy/"},
            {"channel_name": "产业市场-文化产业-独家专稿", "url": "http://www.ce.cn/culture/whcyk/zg/"},
            {"channel_name": "产业市场-文化产业-专题", "url": "http://www.ce.cn/culture/whcyk/zt/"},
            {"channel_name": "产业市场-文化产业-文化名人访", "url": "http://www.ce.cn/culture/whmrf/"},
            {"channel_name": "产业市场-文化产业-文化达人", "url": "http://www.ce.cn/culture/dr/"},
            {"channel_name": "产业市场-文化产业-文化月报", "url": "http://www.ce.cn/culture/yb/"},
            {"channel_name": "产业市场-文化产业-文化舆情", "url": "http://www.ce.cn/culture/whcyk/jrht/"},
            {"channel_name": "产业市场-文化产业-文化要闻", "url": "http://www.ce.cn/culture/whcyk/yaowen/"},
            {"channel_name": "产业市场-文化产业-滚动", "url": "http://www.ce.cn/culture/gd/"},
        ],
        # Calligraphy and painting
        [
            {"channel_name": "产业市场-书画-文化名人访", "url": "http://www.ce.cn/culture/whmrf/"},
            {"channel_name": "产业市场-书画-文化产业频道", "url": "http://www.ce.cn/culture/"},
            {"channel_name": "产业市场-书画-书画高清图", "url": "http://shuhua.ce.cn/dtbf/"},
            {"channel_name": "产业市场-书画-名人库", "url": "http://shuhua.ce.cn/ren/"},
            {"channel_name": "产业市场-书画-展览", "url": "http://shuhua.ce.cn/sy2015/zhan/"},
            {"channel_name": "产业市场-书画-艺术市场", "url": "http://shuhua.ce.cn/sy2015/pmxw/"},
            {"channel_name": "产业市场-书画-要闻", "url": "http://shuhua.ce.cn/sy2015/yw/"},
            {"channel_name": "产业市场-书画-书画快报", "url": "http://shuhua.ce.cn/xinxi/"},
        ],
        # Politics and society
        [
            {"channel_name": "时政社会-人事动态", "url": "http://district.ce.cn/newarea/sddy/"},
            {"channel_name": "时政社会-宏观经济", "url": "http://www.ce.cn/macro/more/"},
            {"channel_name": "时政社会-时政", "url": "http://www.ce.cn/xwzx/gnsz/gdxw/"},
            {"channel_name": "时政社会-要闻", "url": "http://www.ce.cn/xwzx/gnsz/szyw/"},
            {"channel_name": "时政社会-社会", "url": "http://www.ce.cn/xwzx/shgj/"},
            {"channel_name": "时政社会-法制", "url": "http://www.ce.cn/xwzx/fazhi/"},
            {"channel_name": "时政社会-地方党政人物库", "url": "http://district.ce.cn/zt/rwk/"},
            {"channel_name": "时政社会-专题", "url": "http://www.ce.cn/zt/sz/"},
            {"channel_name": "时政社会-专稿", "url": "http://www.ce.cn/xwzx/xinwen/bwzg/"},
            {"channel_name": "时政社会-即时要闻", "url": "http://www.ce.cn/xwzx/xinwen/jsyw/"},
            {"channel_name": "时政社会-社会广角", "url": "http://www.ce.cn/xwzx/shgj/gdxw/"},
            {"channel_name": "时政社会-科教", "url": "http://www.ce.cn/xwzx/kj/"},
            {"channel_name": "时政社会-科普知识", "url": "http://www.ce.cn/xwzx/xinwen/kjjy/kpzs/"},
            {"channel_name": "时政社会-教育资讯", "url": "http://www.ce.cn/xwzx/xinwen/kjjy/jyzx/"},
            {"channel_name": "时政社会-图片中心", "url": "http://www.ce.cn/xwzx/photo/"},
        ],
        # CE video (中经视频)
        [
            {"channel_name": "中经视频-最新", "url": "http://cen.ce.cn/more/"},
            {"channel_name": "中经视频-中韩专线直击", "url": "http://cen.ce.cn/cevideo/cen/zj/"},
            {"channel_name": "中经视频-巴中特快", "url": "http://cen.ce.cn/cevideo/cen/ct/"},
            {"channel_name": "中经视频-一带一路·面对面", "url": "http://cen.ce.cn/cevideo/cen/ff/"},
            {"channel_name": "中经视频-中巴经贸热线", "url": "http://cen.ce.cn/cevideo/cen/rx/"},
            {"channel_name": "中经视频-短视频", "url": "http://cen.ce.cn/cevideo/sv/"},
            {"channel_name": "中经视频-每周中国经济", "url": "http://cen.ce.cn/cevideo/cen/mz/"},
            {"channel_name": "中经视频-巴基斯坦人在中国", "url": "http://cen.ce.cn/cevideo/cen/wic/"},
            {"channel_name": "中经视频-直播", "url": "http://cen.ce.cn/cevideo/zb/h/"},
            {"channel_name": "中经视频-中巴经贸企业名录", "url": "http://cen.ce.cn/cevideo/cen/qy/"},
            {"channel_name": "中经视频-专题·活动", "url": "http://cen.ce.cn/cevideo/sc/"},
            {"channel_name": "中经视频-关于中经网韩国(株)", "url": "http://cen.ce.cn/cevideo/cek/"},
            {"channel_name": "中经视频-关于中经视频", "url": "http://cen.ce.cn/cevideo/cevideo/"},
        ],
        # Commentary and theory
        [
            {"channel_name": "评论理论-专题", "url": "http://views.ce.cn/main/zt/"},
            {"channel_name": "评论理论-经济大讲堂", "url": "http://views.ce.cn/view/society/"},
            {"channel_name": "评论理论-观察家", "url": "http://views.ce.cn/view/obs/"},
            {"channel_name": "评论理论-经济眼", "url": "http://views.ce.cn/view/economy/"},
            {"channel_name": "评论理论-经济学人", "url": "http://views.ce.cn/fun/who/"},
            {"channel_name": "评论理论-声音", "url": "http://views.ce.cn/main/net/"},
            {"channel_name": "评论理论-理论前沿", "url": "http://views.ce.cn/main/qy/"},
            {"channel_name": "评论理论-经点热评", "url": "http://views.ce.cn/main/jdrp/"},
            {"channel_name": "评论理论-网言众议", "url": "http://views.ce.cn/main/disc/"},
            {"channel_name": "评论理论-中经天天评", "url": "http://views.ce.cn/main/yc/"},
            {"channel_name": "评论理论-理论动态", "url": "http://views.ce.cn/main/lldt/"},
            {"channel_name": "评论理论-今日看点", "url": "http://views.ce.cn/main/kd/"},
            {"channel_name": "评论理论-理论百科", "url": "http://views.ce.cn/fun/llbk/"},
        ],
        # Poverty alleviation
        [
            {"channel_name": "脱贫攻坚-攻坚先锋", "url": "http://tuopin.ce.cn/rw/"},
            {"channel_name": "脱贫攻坚-书记县长纵横谈", "url": "http://tuopin.ce.cn/sjxz/"},
            {"channel_name": "脱贫攻坚-政策指南", "url": "http://tuopin.ce.cn/zczn/"},
            {"channel_name": "脱贫攻坚-产业兴县", "url": "http://tuopin.ce.cn/cyxx/"},
            {"channel_name": "脱贫攻坚-独家视角", "url": "http://tuopin.ce.cn/exclusive/"},
            {"channel_name": "脱贫攻坚-热点话题", "url": "http://tuopin.ce.cn/rdht/"},
            {"channel_name": "脱贫攻坚-省部动态", "url": "http://tuopin.ce.cn/sbdt/"},
            {"channel_name": "脱贫攻坚-今日要闻", "url": "http://tuopin.ce.cn/yw/"},
            {"channel_name": "脱贫攻坚-专稿", "url": "http://tuopin.ce.cn/zg/"},
            {"channel_name": "脱贫攻坚-滚动资讯", "url": "http://tuopin.ce.cn/news/"},
            {"channel_name": "脱贫攻坚-美丽乡村", "url": "http://tuopin.ce.cn/mlxc/"},
            {"channel_name": "脱贫攻坚-实用信息", "url": "http://tuopin.ce.cn/syxx/"},
            {"channel_name": "脱贫攻坚-谈贫论富", "url": "http://tuopin.ce.cn/pfl/"},
            {"channel_name": "脱贫攻坚-国际扶贫", "url": "http://tuopin.ce.cn/gjfp/"},
            {"channel_name": "脱贫攻坚-驻村帮扶", "url": "http://tuopin.ce.cn/zcbf/"},
            {"channel_name": "脱贫攻坚-培训讲坛", "url": "http://tuopin.ce.cn/pxjt/"},
            {"channel_name": "脱贫攻坚-社会扶贫", "url": "http://tuopin.ce.cn/sh/"},
        ],
        # Automobiles
        [
            {"channel_name": "汽车频道-滚动", "url": "http://auto.ce.cn/auto/gundong/"},
            {"channel_name": "汽车频道-社会责任", "url": "http://auto.ce.cn/car/csr/"},
            {"channel_name": "汽车频道-后市场", "url": "http://auto.ce.cn/car/hsc/"},
            {"channel_name": "汽车频道-观察家", "url": "http://auto.ce.cn/car/gcj/"},
            {"channel_name": "汽车频道-领袖说", "url": "http://auto.ce.cn/car/lx/"},
            {"channel_name": "汽车频道-新车", "url": "http://auto.ce.cn/car/xc/"},
            {"channel_name": "汽车频道-产经", "url": "http://auto.ce.cn/car/cj/"},
            {"channel_name": "汽车频道-资讯", "url": "http://auto.ce.cn/car/zx/"},
            {"channel_name": "汽车频道-试驾", "url": "http://auto.ce.cn/auto/shijia/"},
            {"channel_name": "汽车频道-特别报道", "url": "http://auto.ce.cn/car/tbbd/"},
            {"channel_name": "汽车频道-特别策划", "url": "http://auto.ce.cn/car/ch/"},
            {"channel_name": "汽车频道-原创观点", "url": "http://auto.ce.cn/car/yc/"},
        ],
        # Exhibitions
        [
            {"channel_name": "会展中国-滚动", "url": "http://expo.ce.cn/gd/"},
            {"channel_name": "会展中国-直播", "url": "http://expo.ce.cn/shy/zb/"},
            {"channel_name": "会展中国-专题", "url": "http://expo.ce.cn/shy/zt/"},
            {"channel_name": "会展中国-论道", "url": "http://expo.ce.cn/shy/ld/"},
            {"channel_name": "会展中国-政策", "url": "http://expo.ce.cn/shy/zc/"},
            {"channel_name": "会展中国-会展名人堂", "url": "http://expo.ce.cn/shy/mrt/"},
            {"channel_name": "会展中国-艺术博览", "url": "http://expo.ce.cn/shy/ys/01/"},
            {"channel_name": "会展中国-节庆活动", "url": "http://expo.ce.cn/shy/jq/01/"},
            {"channel_name": "会展中国-会奖商旅", "url": "http://expo.ce.cn/shy/MICE/01/"},
            {"channel_name": "会展中国-名企", "url": "http://expo.ce.cn/shy/mq/"},
            {"channel_name": "会展中国-产业会展", "url": "http://expo.ce.cn/shy/cy/"},
        ],
        # City channel
        [
            {"channel_name": "城市频道-城市建设", "url": "http://city.ce.cn/main/build/"},
            {"channel_name": "城市频道-城市探索者", "url": "http://city.ce.cn/main/exclusive/"},
            {"channel_name": "城市频道-生态城市", "url": "http://city.ce.cn/main/ecological/"},
            {"channel_name": "城市频道-城市经济", "url": "http://city.ce.cn/main/economy/"},
            {"channel_name": "城市频道-环球观察", "url": "http://city.ce.cn/main/observation/"},
            {"channel_name": "城市频道-省市动态", "url": "http://city.ce.cn/main/news/"},
            {"channel_name": "城市频道-城市周刊", "url": "http://city.ce.cn/main/cityweek/"},
        ],
        # Public welfare channel
        [
            {"channel_name": "公益频道-社会责任", "url": "http://gongyi.ce.cn/gy/zr/"},
            {"channel_name": "公益频道-公益行动", "url": "http://gongyi.ce.cn/gy/gyxd/"},
            {"channel_name": "公益频道-公益新闻", "url": "http://gongyi.ce.cn/news/"},
        ],
        # Lifestyle channel
        [
            {"channel_name": "生活频道", "url": "http://fashion.ce.cn/"},
        ],
        # 健康频道
        [
            {"channel_name": "健康频道-专稿", "url": "http://health.ce.cn/zg/"},
            {"channel_name": "健康频道-资讯", "url": "http://health.ce.cn/news/"},
            {"channel_name": "健康频道-养老咨询", "url": "http://health.ce.cn/sy2015/ylzx/"},
            {"channel_name": "健康频道-权威发布", "url": "http://health.ce.cn/sy2015/qwfb/"},
            {"channel_name": "健康频道-家庭护理", "url": "http://health.ce.cn/sy2015/jthl/"},
            {"channel_name": "健康频道-高端访谈", "url": "http://health.ce.cn/sy2015/gdft/"},
            {"channel_name": "健康频道-休闲健身", "url": "http://health.ce.cn/sy2015/xxjs/"},
            {"channel_name": "健康频道-健康产业", "url": "http://health.ce.cn/sy2015/jkcy/"},
            {"channel_name": "健康频道-图片", "url": "http://health.ce.cn/sy2015/tp/"},
            {"channel_name": "健康频道-医药科技", "url": "http://health.ce.cn/sy2015/yykj/"},
            {"channel_name": "健康频道-健康名人堂", "url": "http://health.ce.cn/sy2015/qwzj/"},
            {"channel_name": "健康频道-养生保健", "url": "http://health.ce.cn/sy2015/ysbj/"},
            {"channel_name": "健康频道-曝光台", "url": "http://health.ce.cn/sy2015/hyxw/"},
            {"channel_name": "健康频道-健康资讯", "url": "http://health.ce.cn/sy2015/jkzx/"},
        ],
        # 科技频道
        [
            {"channel_name": "科技频道-产经资讯", "url": "http://tech.ce.cn/cjzx/"},
            {"channel_name": "科技频道-在线教育", "url": "http://tech.ce.cn/tech2018/zxjy/"},
            {"channel_name": "科技频道-网络安全", "url": "http://tech.ce.cn/tech2018/safe/"},
            {"channel_name": "科技频道-创新科技", "url": "http://tech.ce.cn/tech2018/newtech/"},
            {"channel_name": "科技频道-科技生活", "url": "http://tech.ce.cn/tech2018/life/"},
            {"channel_name": "科技频道-人工智能", "url": "http://tech.ce.cn/tech2018/rgzn/"},
            {"channel_name": "科技频道-科学新知", "url": "http://tech.ce.cn/tech2018/kx/"},
            {"channel_name": "科技频道-科技名企", "url": "http://tech.ce.cn/tech2018/kjmq/"},
            {"channel_name": "科技频道-科技新闻", "url": "http://tech.ce.cn/news/"},
        ],
        # 旅游经济
        [
            {"channel_name": "旅游经济-滚动", "url": "http://travel.ce.cn/gdtj/"},
            {"channel_name": "旅游经济-海南", "url": "http://travel.ce.cn/ztk/hainan/"},
            {"channel_name": "旅游经济-文化旅游", "url": "http://travel.ce.cn/xsy/gl/"},
            {"channel_name": "旅游经济-舆情投诉", "url": "http://travel.ce.cn/xsy/yq/"},
            {"channel_name": "旅游经济-酒店航空", "url": "http://travel.ce.cn/xsy/jd/"},
            {"channel_name": "旅游经济-在线旅游", "url": "http://travel.ce.cn/xsy/zx/"},
            {"channel_name": "旅游经济-旅游经济信息联播", "url": "http://travel.ce.cn/xsy/sp/"},
            {"channel_name": "旅游经济-权威发布", "url": "http://travel.ce.cn/xsy/fb/"},
            {"channel_name": "旅游经济-产业经济", "url": "http://travel.ce.cn/xsy/cy/"},
        ],
        # 中国商用汽车网
        [
            {"channel_name": "中国商用汽车网-滚动新闻", "url": "http://cv.ce.cn/news/"},
            {"channel_name": "中国商用汽车网-交通新闻", "url": "http://cv.ce.cn/2020/jtxw/"},
            {"channel_name": "中国商用汽车网-专题推荐", "url": "http://cv.ce.cn/2020/zttj/"},
            {"channel_name": "中国商用汽车网-试驾报告", "url": "http://cv.ce.cn/2020/xcsj/"},
            {"channel_name": "中国商用汽车网-新车上市", "url": "http://cv.ce.cn/2020/xcsj/"},
            {"channel_name": "中国商用汽车网-企业动态", "url": "http://cv.ce.cn/2020/qydt/"},
            {"channel_name": "中国商用汽车网-行业资讯", "url": "http://cv.ce.cn/2020/hyzx/"},
            {"channel_name": "中国商用汽车网-本网专稿", "url": "http://cv.ce.cn/2020/bwzg/"},
        ],
        # 家电频道
        [
            {"channel_name": "家电频道-网购卖场", "url": "http://www.ce.cn/cysc/zgjd/wgsv/"},
            {"channel_name": "家电频道-政策法规", "url": "http://www.ce.cn/cysc/zgjd/zcfg/"},
            {"channel_name": "家电频道-业绩财报", "url": "http://www.ce.cn/cysc/zgjd/yjcb/"},
            {"channel_name": "家电频道-质量曝光", "url": "http://www.ce.cn/cysc/zgjd/jdsh/"},
            {"channel_name": "家电频道-行业新闻", "url": "http://www.ce.cn/cysc/zgjd/hyfx/"},
            {"channel_name": "家电频道-公司观察", "url": "http://www.ce.cn/cysc/zgjd/qycz/"},
            {"channel_name": "家电频道-业界动态", "url": "http://www.ce.cn/cysc/zgjd/yjxw/"},
            {"channel_name": "家电频道-今日更新", "url": "http://www.ce.cn/cysc/zgjd/kx/"},
        ],
    ]
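
With a menu this long, a copy-paste slip is easy to miss. The following is a minimal sketch, not part of the original template, of a check that lists any URL appearing under more than one channel name. The helper name find_duplicate_urls is hypothetical; the sample entries are copied from the 中国商用汽车网 group above, where 试驾报告 and 新车上市 point at the same address.

```python
from collections import defaultdict


def find_duplicate_urls(start_menu):
    """Return {url: [channel_name, ...]} for every URL listed more than once."""
    seen = defaultdict(list)
    for group in start_menu:
        for channel in group:
            seen[channel["url"]].append(channel["channel_name"])
    return {url: names for url, names in seen.items() if len(names) > 1}


# Three entries copied from the 中国商用汽车网 group of start_menu.
sample_menu = [
    [
        {"channel_name": "中国商用汽车网-试驾报告", "url": "http://cv.ce.cn/2020/xcsj/"},
        {"channel_name": "中国商用汽车网-新车上市", "url": "http://cv.ce.cn/2020/xcsj/"},
        {"channel_name": "中国商用汽车网-企业动态", "url": "http://cv.ce.cn/2020/qydt/"},
    ],
]

duplicates = find_duplicate_urls(sample_menu)
for url, names in duplicates.items():
    print(url, "->", ", ".join(names))
```

Whether the shared URL is intentional cannot be told from the list alone, so the check only reports it and leaves the decision to you.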
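  • Mapping styles to parsers

Write one parseX method for every distinct list-page style found on the site, then register each one in parse_list, as the per-group comments show: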

        parse_list = [
            self.parse1,  # 金融证券
            self.parse1,  # 期货频道
            self.parse1,  # 新三板
            self.parse1,  # 理财
            self.parse1,  # 基金频道
            self.parse1,  # 保险频道
            self.parse1,  # 银行频道
            self.parse1,  # 股市频道
            self.parse1,  # 滚动新闻
            self.parse2,  # 食品
            self.parse3,  # 房产
            self.parse2,  # 能源
            self.parse2,  # IT
            self.parse2,  # 家电
            self.parse2,  # 交通
            self.parse2,  # 质量安全
            self.parse4,  # 质量经济
            self.parse5,  # 医药频道
            self.parse2,  # 生态文明
            self.parse6,  # 旅游频道
            self.parse0,  # 文化产业
            self.parse0,  # 书画
            self.parse0,  # 时政社会
            self.parse0,  # 中经视频
            self.parse0,  # 评论理论
            self.parse6,  # 脱贫攻坚
            self.parse7,  # 汽车
            self.parse0,  # 会展
            self.parse0,  # 城市频道
            self.parse6,  # 公益频道
            self.parse0,  # 中国经济网-生活频道
            self.parse6,  # 健康频道
            self.parse6,  # 科技频道
            self.parse6,  # 旅游经济
            self.parse2,  # 中国商用汽车网
            self.parse2,  # 家电频道
        ]
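
The template's own dispatch code is not shown here, so the following is only a sketch of how the two lists could be tied together, assuming that parse_list and start_menu correspond by position (which is what the per-group comments suggest). The function pair_groups_with_parsers and the stand-in parsers are hypothetical names.

```python
def pair_groups_with_parsers(start_menu, parse_list):
    """Yield (channel, parser) pairs, using one parser per channel group."""
    if len(start_menu) != len(parse_list):
        raise ValueError(
            f"start_menu has {len(start_menu)} groups "
            f"but parse_list has {len(parse_list)} parsers"
        )
    for group, parser in zip(start_menu, parse_list):
        for channel in group:
            yield channel, parser


# Stand-ins for self.parse0 and self.parse6 in the spider.
def parse0(response):
    return "generic style"


def parse6(response):
    return "style 6"


demo_menu = [
    [{"channel_name": "城市频道-城市建设", "url": "http://city.ce.cn/main/build/"}],
    [{"channel_name": "公益频道-公益新闻", "url": "http://gongyi.ce.cn/news/"}],
]
demo_parsers = [parse0, parse6]

pairs = [
    (channel["channel_name"], parser.__name__)
    for channel, parser in pair_groups_with_parsers(demo_menu, demo_parsers)
]
print(pairs)
```

In a real spider each pair would typically become a request such as scrapy.Request(channel["url"], callback=parser). The length check is there because a missing or extra line in either list would otherwise silently shift every later group onto the wrong parser.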
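  • Title, link and cover image
    The site's list pages generally carry no images, so Item_thumbImg is not used; style 7 is the one exception.
# Generic style: let extract_list detect the list automatically
        data = extract_list(response.text)
        for entry in data:
            item['title'] = entry["title"].strip()  # article title
            item['url'] = parse.urljoin(response.url, entry["url"])  # absolute URL of the article page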

# Style 1
        Item_title = response.xpath('//tr/td[@class="font14"]/a/text()').extract()  # list of article titles
        Item_url = response.xpath('//tr/td[@class="font14"]/a/@href').extract()  # list of article links

# Style 2
        Item_title = response.xpath('//div[@class="left"]/ul/li/a/text()').extract()  # list of article titles
        Item_url = response.xpath('//div[@class="left"]/ul/li/a/@href').extract()  # list of article links

# Style 3
        Item_title = response.xpath('//div[@class="sec_left"]/ul/li/span/a/text()').extract()  # list of article titles
        Item_url = response.xpath('//div[@class="sec_left"]/ul/li/span/a/@href').extract()  # list of article links

# Style 4
        Item_title = response.xpath('//div[@class="listf"]/ul/li/span/a/text()').extract()  # list of article titles
        Item_url = response.xpath('//div[@class="listf"]/ul/li/span/a/@href').extract()  # list of article links

# Style 5
        Item_title = response.xpath('//tr/td[@style="font-size:14px"]/a/text()').extract()  # list of article titles
        Item_url = response.xpath('//tr/td[@style="font-size:14px"]/a/@href').extract()  # list of article links

# Style 6
        Item_title = response.xpath('//div[@class="list"]/ul/li/a/text()').extract()  # list of article titles
        Item_url = response.xpath('//div[@class="list"]/ul/li/a/@href').extract()  # list of article links

# Style 7
        Item_title = response.xpath('//div[@class="piclist plno clearfix"]/h2/a/text()').extract()  # list of article titles
        Item_url = response.xpath('//div[@class="piclist plno clearfix"]/h2/a/@href').extract()  # list of article links
        Item_thumbImg = response.xpath('//div[@class="piclist plno clearfix"]/a/img/@src').extract()  # list of cover image URLs
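
Each style yields two parallel lists, and the template code that turns them into items is not reproduced in this section. As a rough sketch of that step, the snippet below pairs titles with links and resolves relative links against the list page's URL. The helper build_entries and the sample titles and paths are made up for illustration.

```python
from urllib.parse import urljoin


def build_entries(titles, hrefs, base_url):
    """Pair each title with its link and resolve the link against base_url."""
    entries = []
    for title, href in zip(titles, hrefs):
        title = title.strip()
        if not title or not href:
            continue  # skip anchors with no usable text or link
        entries.append({"title": title, "url": urljoin(base_url, href)})
    return entries


# Hypothetical values standing in for Item_title / Item_url from one list page.
Item_title = ["  First headline ", "Second headline"]
Item_url = ["./202102/26/t20210226_1.shtml", "http://tech.ce.cn/news/a.shtml"]

entries = build_entries(Item_title, Item_url, "http://tech.ce.cn/news/")
for entry in entries:
    print(entry["title"], entry["url"])
```

One caveat with this approach: a/text() returns nothing for an anchor whose text sits inside a child tag, while a/@href still returns its link, so the two lists can drift out of step. Comparing len(Item_title) with len(Item_url) before zipping is a cheap guard.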


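The parse_detail.py file under the spider directory

1. Scraping the detail-page content

Adjust the selectors used to extract the detail page behind each list entry. Three patterns cover this site.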

        # Keep the detail page's markup; fall back to the whole page when no known container matches
        item['content'] = ""
        if 'class="content"' in response.text and len(None2Str(item['content'])) < 5:
            item['content'] = response.xpath('//div[@class="content"]').extract_first()
        if 'tbody' in response.text and len(None2Str(item['content'])) < 5:
            item['content'] = response.xpath('//tbody').extract_first()
        if 'body' in response.text and len(None2Str(item['content'])) < 5:
            item['content'] = response.xpath('//body').extract_first()
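
The same fallback chain can be written as a loop over (marker, XPath) pairs, which makes it easier to add a pattern later. This is a sketch rather than the template's code: extract_content and FALLBACKS are hypothetical names, and xpath_first stands for any callable that returns the first match of an XPath as a string or None, for example a thin wrapper around response.xpath(...).extract_first(). Unlike the original, it returns an empty string instead of a too-short match.

```python
FALLBACKS = [
    ('class="content"', '//div[@class="content"]'),
    ("tbody", "//tbody"),
    ("body", "//body"),
]


def extract_content(html, xpath_first, min_len=5):
    """Return the first extraction that is at least min_len characters long."""
    for marker, xpath in FALLBACKS:
        if marker not in html:
            continue  # cheap substring test before running the XPath
        content = xpath_first(xpath) or ""
        if len(content) >= min_len:
            return content
    return ""


# Stand-in for a page that has no div.content but does have a table body.
fake_html = "<html><body><table><tbody><tr><td>text</td></tr></tbody></table></body></html>"
fake_matches = {
    "//tbody": "<tbody><tr><td>text</td></tr></tbody>",
    "//body": "<body>...</body>",
}

content = extract_content(fake_html, fake_matches.get)
print(content)
```

Order matters here: the most specific container comes first and //body comes last, so the whole page is stored only when nothing narrower matched.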

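2. A special note

Some sites are built so inconsistently that ten pages can use nine different layouts. Opening every page to inspect its detail-page markup is impractical, so a general workaround is used instead.

  • After the first crawl finishes, open MongoDB and run the command below. It returns the records whose content contains body, that is, pages that matched none of the specific selectors and were stored as the whole page.
db.your_collection.find({content:/body/})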


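  • Open any of the returned links, add a selector for that page's layout to the detail-page handling, and repeat until the MongoDB command returns nothing.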
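
The review loop above can also be driven from Python. The sketch below assumes the items were stored with the url and content fields used earlier, and that collection behaves like a pymongo collection, whose find(filter, projection) method it calls. The helper find_unstyled_pages is a hypothetical name, and FakeCollection is an in-memory stand-in so the example runs without a database.

```python
import re


def find_unstyled_pages(collection, limit=20):
    """Return up to `limit` URLs whose stored content contains 'body'."""
    query = {"content": {"$regex": "body"}}  # same filter as {content:/body/}
    urls = []
    for doc in collection.find(query, {"url": 1, "_id": 0}):
        urls.append(doc["url"])
        if len(urls) >= limit:
            break
    return urls


class FakeCollection:
    """In-memory stand-in that understands only the query used above."""

    def __init__(self, docs):
        self.docs = docs

    def find(self, query, projection=None):
        pattern = re.compile(query["content"]["$regex"])
        return [
            {"url": doc["url"]}
            for doc in self.docs
            if pattern.search(doc["content"] or "")
        ]


demo = FakeCollection([
    {"url": "http://tech.ce.cn/news/a.shtml", "content": '<div class="content">Article text</div>'},
    {"url": "http://city.ce.cn/main/news/b.shtml", "content": "<body><p>Whole page</p></body>"},
])

todo = find_unstyled_pages(demo)
print(todo)
```

Note that the pattern body also matches tbody, so records captured by the //tbody fallback show up in the result alongside whole-page captures.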


Reposted from blog.csdn.net/qq_20288327/article/details/114133616