Python3 Crawler: Solving the Company Proxy Problem

Preamble

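I haven't reinvented a wheel in a long time, so on a whim I decided to sort out the proxy problem I run into whenever I write a crawler at the office.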

Main text

If there were no proxy problem, the following code would be enough to fetch the HTML source of a page:

import urllib
import urllib.request
from bs4 import BeautifulSoup

url = "http://wintersmilesb101.online/"

# Send a browser-like User-Agent with the request
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
req = urllib.request.Request(url, headers={
    'User-Agent': user_agent
})
response = urllib.request.urlopen(req)
# Read the response body and decode it as UTF-8
content = response.read().decode('utf-8')
print(content)

Run it:

It fails with the following error:

"C:\Program Files (x86)\Anaconda3\python.exe" D:/Alvin/PersonalProjects/Python/Spider/WinterSmileSB101Blog/main_error.py
Traceback (most recent call last):
  File "D:/Alvin/PersonalProjects/Python/Spider/WinterSmileSB101Blog/main_error.py", line 12, in <module>
    response = urllib.request.urlopen(req)
  File "C:\Program Files (x86)\Anaconda3\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files (x86)\Anaconda3\lib\urllib\request.py", line 532, in open
    response = meth(req, response)
  File "C:\Program Files (x86)\Anaconda3\lib\urllib\request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Program Files (x86)\Anaconda3\lib\urllib\request.py", line 570, in error
    return self._call_chain(*args)
  File "C:\Program Files (x86)\Anaconda3\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Program Files (x86)\Anaconda3\lib\urllib\request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 407: Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied.  )
Process finished with exit code 1

The message urllib.error.HTTPError: HTTP Error 407: Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied. )

shows that we need to set up proxy authentication.
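As a minimal sketch of how a script could recognize this situation instead of dying with a traceback, the code below catches urllib.error.HTTPError and checks for status 407. The names explain_http_error and fetch, and the 10-second timeout, are assumptions made for this illustration and are not part of the original script.

```python
import urllib.error
import urllib.request


def explain_http_error(err):
    """Return a short hint for an HTTPError; 407 means the proxy wants credentials."""
    if err.code == 407:
        return "proxy authentication required: check the proxy settings"
    return "HTTP error %d: %s" % (err.code, err.reason)


def fetch(url, timeout=10):
    """Return the decoded page, or None after printing a hint on an HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode('utf-8')
    except urllib.error.HTTPError as err:
        print(explain_http_error(err))
        return None
```

Calling fetch("http://wintersmilesb101.online/") behind a proxy like the one above would then print the hint rather than the full traceback.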

We set the proxy with ProxyHandler from urllib.request (imported as req below):

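proxy = req.ProxyHandler({'https': 's1firewall:8080'}) is the setting for my company's proxy. The key is the URL scheme, http or https; the http key did not work for me, so I use https here. The value is the proxy's address and port. Note that urllib applies each entry only to requests whose URL uses that scheme, so with only an https entry the http:// URL in this post is most likely not sent through s1firewall at all. Passing an explicit ProxyHandler also stops urllib from using the automatically detected system proxy, which is probably why the 407 goes away.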
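The dictionary given to ProxyHandler maps protocol names to proxy addresses, and the entry that matters for a request is the one matching the scheme of the requested URL. The sketch below imitates that lookup with a hypothetical helper, proxy_for, which is not part of urllib and ignores bypass rules such as no_proxy. It is only meant to make the matching visible.

```python
from urllib.parse import urlsplit


def proxy_for(url, proxies):
    """Return the proxy entry whose key matches the scheme of url, or None.

    Hypothetical helper for illustration only; it mimics the scheme lookup
    and ignores proxy bypass rules such as no_proxy.
    """
    scheme = urlsplit(url).scheme.lower()
    return proxies.get(scheme)


proxies = {'https': 's1firewall:8080'}
print(proxy_for('https://wintersmilesb101.online/', proxies))
print(proxy_for('http://wintersmilesb101.online/', proxies))
```

With only an https entry, the first call finds s1firewall:8080 and the second finds nothing, so an http:// URL would need its own http entry to be routed through the proxy.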

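Some proxies also require a username and password, in which case the setting looks like the line below. My company's proxy does not need this, and setting it this way actually makes the connection to the proxy server fail: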

proxy = req.ProxyHandler({'http': r'http://username:password@url:port'})
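When credentials go into the proxy URL, characters such as @ or : in the username or password need percent-encoding, otherwise the URL is parsed incorrectly. The following is a sketch under assumptions: the helper name proxy_url, the host proxy.example.com, and the credentials are all made up for illustration. urllib decodes the credentials again and sends them to the proxy as Basic authentication.

```python
import urllib.request
from urllib.parse import quote


def proxy_url(username, password, host, port, scheme='http'):
    """Build a proxy URL with percent-encoded credentials (hypothetical helper)."""
    user = quote(username, safe='')
    pwd = quote(password, safe='')
    return '%s://%s:%s@%s:%d' % (scheme, user, pwd, host, port)


# Placeholder values, not a real proxy or real credentials
url_with_auth = proxy_url('alice', 'p@ss:word', 'proxy.example.com', 8080)
proxy = urllib.request.ProxyHandler({'http': url_with_auth, 'https': url_with_auth})
print(url_with_auth)
```

Whether a given proxy accepts Basic credentials depends on how it is configured, so this only covers proxies that do.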

The complete configuration code is as follows:

After setting the proxy

import urllib
import urllib.request as req
from bs4 import BeautifulSoup

url = "http://wintersmilesb101.online/"

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
# Set the proxy address; the http key did not work, so use https
proxy = req.ProxyHandler({'https': 's1firewall:8080'})
auth = req.HTTPBasicAuthHandler()
# Build the opener
opener = req.build_opener(proxy, auth, req.HTTPHandler)
# Add the request header
opener.addheaders = [('User-Agent', user_agent)]
# Install the opener globally
req.install_opener(opener)
# Open the URL
conn = req.urlopen(url)
# Read the page content and decode it as UTF-8
content = conn.read().decode('utf-8')
# Print it
print(content)
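install_opener changes the behavior of every later urlopen call in the process. One possible alternative, sketched below, keeps the opener local and calls opener.open directly. The function names make_opener and fetch, the 10-second timeout, and the fallback to UTF-8 when the server declares no charset are assumptions of this sketch, not part of the script above.

```python
import urllib.request

USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'


def make_opener(proxies, user_agent=USER_AGENT):
    """Build an opener with the given proxy mapping, without installing it globally."""
    proxy = urllib.request.ProxyHandler(proxies)
    opener = urllib.request.build_opener(proxy)
    opener.addheaders = [('User-Agent', user_agent)]
    return opener


def fetch(opener, url, timeout=10):
    """Fetch url through opener and decode it with the declared charset."""
    with opener.open(url, timeout=timeout) as conn:
        # Fall back to UTF-8 when the response does not name a charset
        charset = conn.headers.get_content_charset() or 'utf-8'
        return conn.read().decode(charset)


opener = make_opener({'https': 's1firewall:8080'})
# html = fetch(opener, 'http://wintersmilesb101.online/')  # needs network access
```

Only requests made through this opener use the proxy mapping; other code that calls urlopen keeps its default behavior.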

Run the script with the proxy configured (main.py):

Final output:

"C:\Program Files (x86)\Anaconda3\python.exe" D:/Alvin/PersonalProjects/Python/Spider/WinterSmileSB101Blog/main.py
<!doctype html>


<html class="theme-next mist use-motion" lang="zh-Hans,zh-hk,en,fr-FR,ru,de,ja,id,ko,default">
<head>
  <meta charset="UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1"/>
<meta http-equiv="Cache-Control" content="no-transform" />
<meta http-equiv="Cache-Control" content="no-siteapp" />


  <link href="/lib/fancybox/source/jquery.fancybox.css?v=2.1.5" rel="stylesheet" type="text/css" />































    <link href="//fonts.googleapis.com/css?family=Monda:300,300italic,400,400italic,700,700italic|Roboto Slab:300,300italic,400,400italic,700,700italic|Lobster Two:300,300italic,400,400italic,700,700italic|PT Mono:300,300italic,400,400italic,700,700italic&subset=latin,latin-ext" rel="stylesheet" type="text/css">


<link href="/lib/font-awesome/css/font-awesome.min.css?v=4.6.2" rel="stylesheet" type="text/css" />
<link href="/css/main.css?v=5.1.0" rel="stylesheet" type="text/css" />

  <meta name="keywords" content="Android,JAVA,Unity3D,C#,javaScript,开发者,程序猿,极客,编程,开源,IT网站,Developer,Programmer,Coder,Geek,html,用户体验" />
  <link rel="alternate" href="http://blog.csdn.net/qq_21265915/rss/list" title="WinterSmileSB101 的个人房间" type="application/atom+xml" />

  <link rel="shortcut icon" type="image/x-icon" href="/images/myHeadImg.jpeg?v=5.1.0" />

<meta name="description" content="成都工业学院14级,学习了各种后台技能,对前端也甚是抱有好感,准备再入坑前端。">
<meta property="og:type" content="website">
<meta property="og:title" content="WinterSmileSB101 的个人房间">
<meta property="og:url" content="http://WinterSmileSB101.online/index.html">
<meta property="og:site_name" content="WinterSmileSB101 的个人房间">
<meta property="og:description" content="成都工业学院14级,学习了各种后台技能,对前端也甚是抱有好感,准备再入坑前端。">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="WinterSmileSB101 的个人房间">
<meta name="twitter:description" content="成都工业学院14级,学习了各种后台技能,对前端也甚是抱有好感,准备再入坑前端。">
<script type="text/javascript" id="hexo.configurations">
  var NexT = window.NexT || {};
  var CONFIG = {
    root: '/',
    scheme: 'Mist',
    sidebar: {"position":"right","display":"post","offset":12,"offset_float":0,"b2t":true,"scrollpercent":true},
    fancybox: true,
    motion: true,
    duoshuo: {
      userId: '6376853978663093000',
      author: 'WinterSmileSB101'
    },
    algolia: {
      applicationID: '',
      apiKey: '',
      indexName: '',
      hits: {"per_page":10},
      labels: {"input_placeholder":"Search for Posts","hits_empty":"We didn't find any results for the search: ${query}","hits_stats":"${hits} results found in ${time} ms"}
    }
  };
</script>
  <link rel="canonical" href="http://WinterSmileSB101.online/"/>
  <title> WinterSmileSB101 的个人房间 </title>
</head>
<body itemscope itemtype="http://schema.org/WebPage" lang="zh-Hans">






  <div class="container sidebar-position-right
   page-home
 ">
    <div class="headband"></div>
    <header id="header" class="header" itemscope itemtype="http://schema.org/WPHeader">
      <div class="header-inner"><div class="site-brand-wrapper">
  <div class="site-meta ">

    <div class="custom-logo-site-title">
      <a href="/"  class="brand" rel="start">
        <span class="logo-line-before"><i></i></span>
        <span class="site-title">WinterSmileSB101 的个人房间</span>
        <span class="logo-line-after"><i></i></span>
      </a>
    </div>

        <h1 class="site-subtitle" itemprop="description">胆小认生,不易相处</h1>

  </div>
  <div class="site-nav-toggle">
    <button>
      <span class="btn-bar"></span>
      <span class="btn-bar"></span>
      <span class="btn-bar"></span>
    </button>
  </div>
</div>
<nav class="site-nav">


    <ul id="menu" class="menu">


        <li class="menu-item menu-item-home">
          <a href="/" rel="section">

              <i class="menu-item-icon fa fa-fw fa-home"></i> <br />

            首页
          </a>
        </li>


        <li class="menu-item menu-item-categories">
          <a href="/categories" rel="section">

              <i class="menu-item-icon fa fa-fw fa-th"></i> <br />

            分类
          </a>
        </li>


        <li class="menu-item menu-item-about">
          <a href="/about" rel="section">

              <i class="menu-item-icon fa fa-fw fa-user"></i> <br />

            关于
          </a>
        </li>


        <li class="menu-item menu-item-archives">
          <a href="/archives" rel="section">

              <i class="menu-item-icon fa fa-fw fa-archive"></i> <br />

            归档
          </a>
        </li>


        <li class="menu-item menu-item-tags">
          <a href="/tags" rel="section">

              <i class="menu-item-icon fa fa-fw fa-tags"></i> <br />

            标签
          </a>
        </li>


        <li class="menu-item menu-item-commonweal">
          <a href="/404.html" rel="section">

              <i class="menu-item-icon fa fa-fw fa-heartbeat"></i> <br />

            公益404
          </a>
        </li>


        <li class="menu-item menu-item-search">

            <a href="javascript:;" class="popup-trigger">


              <i class="menu-item-icon fa fa-search fa-fw"></i> <br />

            搜索
          </a>
        </li>

    </ul>


    <div class="site-search">

  <div class="popup search-popup local-search-popup">
  <div class="local-search-header clearfix">
    <span class="search-icon">
      <i class="fa fa-search"></i>
    </span>
    <span class="popup-btn-close">
      <i class="fa fa-times-circle"></i>
    </span>
    <div class="local-search-input-wrapper">
      <input autocapitalize="off" autocomplete="off" autocorrect="off"
             placeholder="搜索..." spellcheck="false"
             type="text" id="local-search-input">
    </div>
  </div>
  <div id="local-search-result"></div>
</div>
    </div>

</nav>
 </div>
    </header>
    <main id="main" class="main">
      <div class="main-inner">
        <div class="content-wrap">
          <div id="content" class="content">

  <section id="posts" class="posts-expand">






  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/04/08/Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup 抓取解析网页/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/04/08/Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup 抓取解析网页/" itemprop="url">
                  Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup4 抓取解析网页
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">发表于</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-04-08T16:55:47+08:00">
                2017-04-08
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">更新于</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-04-09T14:17:52+08:00">
                2017-04-09
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">分类于</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">爬虫</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/Python-爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Python 爬虫</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/04/08/Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup 抓取解析网页/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/04/08/Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup 抓取解析网页/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/04/08/Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup 抓取解析网页/" class="leancloud_visitors" data-flag-title="Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup4 抓取解析网页">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">阅读次数 </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






版权声明:本文为 wintersmilesb101 -(个人独立博客– http://wintersmilesb101.online 欢迎访问)博主原创文章,未经博主允许不得转载。
开篇上一篇中我们通过原生的 re 模块已经完成了网页的解析,对于熟悉正则表达式的童鞋来说很好上手,但是对于萌新来说
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/04/08/Python3.7 爬虫(二)使用 Urllib2 与 BeautifulSoup 抓取解析网页/#more" rel="contents">
              阅读全文 &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/04/08/Python3.7 爬虫(一)使用 Urllib 与正则表达式抓取/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/04/08/Python3.7 爬虫(一)使用 Urllib 与正则表达式抓取/" itemprop="url">
                  Python3.7 爬虫(一)使用 Urllib2 与正则表达式抓取
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">发表于</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-04-08T16:55:47+08:00">
                2017-04-08
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">更新于</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-04-09T10:25:07+08:00">
                2017-04-09
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">分类于</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">爬虫</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/Python-爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Python 爬虫</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/04/08/Python3.7 爬虫(一)使用 Urllib 与正则表达式抓取/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/04/08/Python3.7 爬虫(一)使用 Urllib 与正则表达式抓取/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/04/08/Python3.7 爬虫(一)使用 Urllib 与正则表达式抓取/" class="leancloud_visitors" data-flag-title="Python3.7 爬虫(一)使用 Urllib2 与正则表达式抓取">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">阅读次数 </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






版权声明:本文为 wintersmilesb101 -(个人独立博客– http://wintersmilesb101.online 欢迎访问)博主原创文章,未经博主允许不得转载。
我们今天就一起来通过 Python3 自带库 Urllib 与正则表达式来抓取糗事百科。废话不多说,下面正题:分析
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/04/08/Python3.7 爬虫(一)使用 Urllib 与正则表达式抓取/#more" rel="contents">
              阅读全文 &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/04/08/Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup 爬取网易云音乐歌单/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/04/08/Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup 爬取网易云音乐歌单/" itemprop="url">
                  Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup4 爬取网易云音乐歌单
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">发表于</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-04-08T16:55:47+08:00">
                2017-04-08
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">更新于</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-04-09T20:03:39+08:00">
                2017-04-09
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">分类于</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">爬虫</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/Python-爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Python 爬虫</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/04/08/Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup 爬取网易云音乐歌单/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/04/08/Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup 爬取网易云音乐歌单/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/04/08/Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup 爬取网易云音乐歌单/" class="leancloud_visitors" data-flag-title="Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup4 爬取网易云音乐歌单">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">阅读次数 </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






版权声明:本文为 wintersmilesb101 -(个人独立博客– http://wintersmilesb101.online 欢迎访问)博主原创文章,未经博主允许不得转载。
废话在前面的的博客中我们已经能够使用 python3 配合自带的库或者第三方库抓取以及解析网页,我们今天来试试抓取
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/04/08/Python3.7 爬虫(三)使用 Urllib2 与 BeautifulSoup 爬取网易云音乐歌单/#more" rel="contents">
              阅读全文 &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/03/29/css-els/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/03/29/css-els/" itemprop="url">
                  Css 文字省略样式(单行/多行)
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">发表于</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-03-29T08:47:44+08:00">
                2017-03-29
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">更新于</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-03-29T09:03:16+08:00">
                2017-03-29
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">分类于</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/" itemprop="url" rel="index">
                    <span itemprop="name">WEB</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/前端开发/" itemprop="url" rel="index">
                    <span itemprop="name">前端开发</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/前端开发/CSS/" itemprop="url" rel="index">
                    <span itemprop="name">CSS</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/03/29/css-els/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/03/29/css-els/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/03/29/css-els/" class="leancloud_visitors" data-flag-title="Css 文字省略样式(单行/多行)">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">阅读次数 </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






版权声明:本文为 wintersmilesb101 -(个人独立博客– http://wintersmilesb101.online 欢迎访问)博主转载文章,原文地址。
效果图
上面的效果实现代码如下:12345678910111213141516171819202122232425262728
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/03/29/css-els/#more" rel="contents">
              阅读全文 &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/03/28/mui-tab-pages/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/03/28/mui-tab-pages/" itemprop="url">
                  MUI 使用爬坑之路之 tab 多页面操作
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">发表于</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-03-28T13:08:23+08:00">
                2017-03-28
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">更新于</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-03-29T09:04:49+08:00">
                2017-03-29
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">分类于</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/" itemprop="url" rel="index">
                    <span itemprop="name">WEB</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/前端开发/" itemprop="url" rel="index">
                    <span itemprop="name">前端开发</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/前端开发/Hbuilder/" itemprop="url" rel="index">
                    <span itemprop="name">Hbuilder</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/前端开发/Hbuilder/MUI/" itemprop="url" rel="index">
                    <span itemprop="name">MUI</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/03/28/mui-tab-pages/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/03/28/mui-tab-pages/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/03/28/mui-tab-pages/" class="leancloud_visitors" data-flag-title="MUI 使用爬坑之路之 tab 多页面操作">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">阅读次数 </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






版权声明:本文为 wintersmilesb101 -(个人独立博客– http://wintersmilesb101.online 欢迎访问)博主原创文章,未经博主允许不得转载。
最近想入坑前端开发,也是为了开发 App 更快更接地气。在各种前端框架的纠结中我还是决定先入坑 MUI ,开坑不易
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/03/28/mui-tab-pages/#more" rel="contents">
              阅读全文 &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/03/27/IOnic-first/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/03/27/IOnic-first/" itemprop="url">
                  Pitfalls of Using Ionic2
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">Posted on</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-03-27T19:21:20+08:00">
                2017-03-27
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">Updated on</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-03-27T22:20:20+08:00">
                2017-03-27
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">Categorized in</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/" itemprop="url" rel="index">
                    <span itemprop="name">WEB</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/前端开发/" itemprop="url" rel="index">
                    <span itemprop="name">Front-End Development</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/WEB/前端开发/IOnic-AngularJS/" itemprop="url" rel="index">
                    <span itemprop="name">IOnic AngularJS</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/03/27/IOnic-first/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/03/27/IOnic-first/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/03/27/IOnic-first/" class="leancloud_visitors" data-flag-title="Ionic2 的使用之坑">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">Views </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






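Copyright notice: This is an original article by wintersmilesb101 (personal blog: http://wintersmilesb101.online, visitors welcome). Do not repost without the author's permission.
Here is a link to where I learned IOnic, 菜鸟驿站, which covers not only IOnic but also many other topics such as Node.js, vue, Re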
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/03/27/IOnic-first/#more" rel="contents">
              Read more &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/03/24/use-phantomjs-dynamic/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/03/24/use-phantomjs-dynamic/" itemprop="url">
                  Learning Crawlers Together: Node.js Crawlers (Part 3), Scraping Dynamic Pages with PhantomJS
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">Posted on</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-03-24T09:29:38+08:00">
                2017-03-24
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">Updated on</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-03-24T12:57:00+08:00">
                2017-03-24
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">Categorized in</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Crawlers</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/Node-js-爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Node.js Crawlers</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/03/24/use-phantomjs-dynamic/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/03/24/use-phantomjs-dynamic/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/03/24/use-phantomjs-dynamic/" class="leancloud_visitors" data-flag-title="一起学爬虫 Node.js 爬虫篇(三)使用 PhantomJS 爬取动态页面">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">Views </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






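Copyright notice: This is an original article by wintersmilesb101 (personal blog: http://wintersmilesb101.online, visitors welcome). Do not repost without the author's permission.
Today we will learn how to use PhantomJS to scrape dynamic web pages. As for what PhantomJS is, see here. Here we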
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/03/24/use-phantomjs-dynamic/#more" rel="contents">
              Read more &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/03/24/get-phantomJS-start/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/03/24/get-phantomJS-start/" itemprop="url">
                  Scraping Dynamic Pages with Node.js: Getting Started with PhantomJS
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">Posted on</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-03-24T08:43:25+08:00">
                2017-03-24
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">Updated on</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-03-24T10:22:14+08:00">
                2017-03-24
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">Categorized in</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Crawlers</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/Node-js-爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Node.js Crawlers</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/03/24/get-phantomJS-start/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/03/24/get-phantomJS-start/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/03/24/get-phantomJS-start/" class="leancloud_visitors" data-flag-title="Node.js 动态网页爬取 PhantomJS 使用入门">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">Views </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






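Copyright notice: This is an original article by wintersmilesb101 (personal blog: http://wintersmilesb101.online, visitors welcome). Do not repost without the author's permission.
Since this is an introduction, let's start from the origin of humankind.. that is, from PhantomJS. 1. What is PhantomJS? PhantomJS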
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/03/24/get-phantomJS-start/#more" rel="contents">
              Read more &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/03/23/node-spider-scend/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/03/23/node-spider-scend/" itemprop="url">
                  Learning Crawlers Together: Node.js Crawlers (Part 2)
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">Posted on</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-03-23T17:17:58+08:00">
                2017-03-23
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">Updated on</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-03-24T10:22:12+08:00">
                2017-03-24
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">Categorized in</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Crawlers</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/Node-js-爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Node.js Crawlers</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/03/23/node-spider-scend/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/03/23/node-spider-scend/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/03/23/node-spider-scend/" class="leancloud_visitors" data-flag-title="一起学爬虫 Node.js 爬虫篇(二)">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">Views </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






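Copyright notice: This is an original article by wintersmilesb101 (personal blog: http://wintersmilesb101.online, visitors welcome). Do not repost without the author's permission.
In the previous post we scraped the title of the Baidu home page. This time I planned to go straight to scraping the recommended news that we could not get last time, but it turned out that the loaded page had no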
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/03/23/node-spider-scend/#more" rel="contents">
              Read more &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>







  <article class="post post-type-normal " itemscope itemtype="http://schema.org/Article">
    <link itemprop="mainEntityOfPage" href="http://WinterSmileSB101.online/2017/03/23/node-spider-first/">
    <span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
      <meta itemprop="name" content="WinterSmileSB101">
      <meta itemprop="description" content="">
      <meta itemprop="image" content="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg">
    </span>
    <span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
      <meta itemprop="name" content="WinterSmileSB101 的个人房间">
    </span>

      <header class="post-header">


          <h2 class="post-title" itemprop="name headline">




                <a class="post-title-link" href="/2017/03/23/node-spider-first/" itemprop="url">
                  Learning Crawlers Together: Node.js Crawlers (Part 1)
                </a>


          </h2>

        <div class="post-meta">
          <span class="post-time">

              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-o"></i>
              </span>

                <span class="post-meta-item-text">Posted on</span>

              <time title="创建于" itemprop="dateCreated datePublished" datetime="2017-03-23T14:16:38+08:00">
                2017-03-23
              </time>


              <span class="post-meta-divider">|</span>


              <span class="post-meta-item-icon">
                <i class="fa fa-calendar-check-o"></i>
              </span>

                <span class="post-meta-item-text">Updated on</span>

              <time title="更新于" itemprop="dateModified" datetime="2017-03-24T10:22:55+08:00">
                2017-03-24
              </time>

          </span>

            <span class="post-category" >

              <span class="post-meta-divider">|</span>

              <span class="post-meta-item-icon">
                <i class="fa fa-folder-o"></i>
              </span>

                <span class="post-meta-item-text">Categorized in</span>


                <span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Crawlers</span>
                  </a>
                </span><span itemprop="about" itemscope itemtype="http://schema.org/Thing">
                  <a href="/categories/爬虫/Node-js-爬虫/" itemprop="url" rel="index">
                    <span itemprop="name">Node.js Crawlers</span>
                  </a>
                </span>



            </span>



              <span class="post-comments-count">
                <span class="post-meta-divider">|</span>
                <span class="post-meta-item-icon">
                  <i class="fa fa-comment-o"></i>
                </span>
                <a href="/2017/03/23/node-spider-first/#comments" itemprop="discussionUrl">
                  <span class="post-comments-count ds-thread-count" data-thread-key="2017/03/23/node-spider-first/" itemprop="commentCount"></span>
                </a>
              </span>




             <span id="/2017/03/23/node-spider-first/" class="leancloud_visitors" data-flag-title="一起学爬虫 Node.js 爬虫篇(一)">
               <span class="post-meta-divider">|</span>
               <span class="post-meta-item-icon">
                 <i class="fa fa-eye"></i>
               </span>

                 <span class="post-meta-item-text">Views </span>

                 <span class="leancloud-visitors-count"></span>
             </span>




        </div>
      </header>


    <div class="post-body" itemprop="articleBody">






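Copyright notice: This is an original article by wintersmilesb101 (personal blog: http://wintersmilesb101.online, visitors welcome). Do not repost without the author's permission.
Whenever you read about crawlers or search Baidu for them, Python crawlers are everywhere. It has to be said that when it comes to crawler frameworks and material, Python is basically the most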
          ...
          <!--noindex-->
          <div class="post-button text-center">
            <a class="btn" href="/2017/03/23/node-spider-first/#more" rel="contents">
              Read more &raquo;
            </a>
          </div>
          <!--/noindex-->


    </div>
    <div>

    </div>
    <div>

    </div>
    <div>

    </div>
    <footer class="post-footer">





        <div class="post-eof"></div>

    </footer>
  </article>


  </section>

  <nav class="pagination">
    <span class="page-number current">1</span><a class="page-number" href="/page/2/">2</a><span class="space">&hellip;</span><a class="page-number" href="/page/6/">6</a><a class="extend next" rel="next" href="/page/2/"><i class="fa fa-angle-right"></i></a>
  </nav>
          </div>



        </div>



  <div class="sidebar-toggle">
    <div class="sidebar-toggle-line-wrap">
      <span class="sidebar-toggle-line sidebar-toggle-line-first"></span>
      <span class="sidebar-toggle-line sidebar-toggle-line-middle"></span>
      <span class="sidebar-toggle-line sidebar-toggle-line-last"></span>
    </div>
  </div>
  <aside id="sidebar" class="sidebar">
    <div class="sidebar-inner">


      <section class="site-overview sidebar-panel sidebar-panel-active">
        <div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
          <img class="site-author-image" itemprop="image"
               src="http://on792ofrp.bkt.clouddn.com/17-3-22/29073846-file_1490159480452_d2de.jpg"
               alt="WinterSmileSB101" />
          <p class="site-author-name" itemprop="name">WinterSmileSB101</p>

              <p class="site-description motion-element" itemprop="description"></p>

        </div>
        <nav class="site-state motion-element">

            <div class="site-state-item site-state-posts">
              <a href="/archives">
                <span class="site-state-item-count">52</span>
                <span class="site-state-item-name">Posts</span>
              </a>
            </div>




            <div class="site-state-item site-state-categories">
              <a href="/categories/index.html">
                <span class="site-state-item-count">26</span>
                <span class="site-state-item-name">Categories</span>
              </a>
            </div>




            <div class="site-state-item site-state-tags">
              <a href="/tags/index.html">
                <span class="site-state-item-count">113</span>
                <span class="site-state-item-name">Tags</span>
              </a>
            </div>

        </nav>

          <div class="feed-link motion-element">
            <a href="http://blog.csdn.net/qq_21265915/rss/list" rel="alternate">
              <i class="fa fa-rss"></i>
              RSS
            </a>
          </div>

        <!-- Custom social links -->
        <div class="links-of-author motion-element">
         <span class="links-of-author-item">
         <a href="https://github.com/WinterSmileSB101" title="Github">
         <i class="fa fa-fw fa-github fa-lg"></i>
         </a>
         </span>
         <span class="links-of-author-item">
                  <a href="http://weibo.com/5602632941/profile?rightmod=1&wvr=6&mod=personinfo&is_all=1" title="微博">
                  <i class="fa fa-fw fa-weibo fa-lg"></i>
                  </a>
                  </span>
         <span class="links-of-author-item">
         <a href="http://www.jianshu.com/users/73344bc7e890/timeline" title="简书">
         <i class="fa fa-fw fa-bookmark fa-lg"></i>
         </a>
         </span>
<br />
        <span class="links-of-author-item">
                 <a href="https://www.douban.com/people/159359470/" title="豆瓣">
                 <i class="fa fa-fw fa-newspaper-o fa-lg"></i>
                 </a>
                 </span>
        <span class="links-of-author-item">
                 <a href="http://blog.csdn.net/qq_21265915" title="CSDN博客">
                 <i class="fa fa-fw fa-bug fa-lg"></i>
                 </a>
                 </span>
        </div>
        <!-- Custom social links -->






      </section>


        <div class="back-to-top">
          <i class="fa fa-arrow-up"></i>

            <span id="scrollpercent"><span>0</span>%</span>

        </div>

    </div>
  </aside>


      </div>
    </main>
    <footer id="footer" class="footer">
      <div class="footer-inner">
        <div class="copyright" >

  &copy;  2017.3.20 -
  <span itemprop="copyrightYear">2017</span>
  <span class="with-love">
    <i class="fa fa-heart"></i>
  </span>
  <span class="author" itemprop="copyrightHolder">Powered By - WinterSmileSB101</span>
</div>

<div class="powered-by">
    Personal site
</div>
<div class="theme-info">
  Blog -
  WinterSmileSB101
</div>



      </div>
    </footer>

  </div>

<script type="text/javascript">
  if (Object.prototype.toString.call(window.Promise) !== '[object Function]') {
    window.Promise = null;
  }
</script>




  <script type="text/javascript" src="/lib/jquery/index.js?v=2.1.3"></script>

  <script type="text/javascript" src="/lib/fastclick/lib/fastclick.min.js?v=1.0.6"></script>

  <script type="text/javascript" src="/lib/jquery_lazyload/jquery.lazyload.js?v=1.9.7"></script>

  <script type="text/javascript" src="/lib/velocity/velocity.min.js?v=1.2.1"></script>

  <script type="text/javascript" src="/lib/velocity/velocity.ui.min.js?v=1.2.1"></script>

  <script type="text/javascript" src="/lib/fancybox/source/jquery.fancybox.pack.js?v=2.1.5"></script>

  <script type="text/javascript" src="/lib/canvas-nest/canvas-nest.min.js"></script>



  <script type="text/javascript" src="/js/src/utils.js?v=5.1.0"></script>
  <script type="text/javascript" src="/js/src/motion.js?v=5.1.0"></script>





  <script type="text/javascript" src="/js/src/bootstrap.js?v=5.1.0"></script>




  <script type="text/javascript">
    var duoshuoQuery = {short_name:"wintersmilesb101"};
    (function() {
      var ds = document.createElement('script');
      ds.type = 'text/javascript';ds.async = true;
      ds.id = 'duoshuo-script';
      ds.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') + '//static.duoshuo.com/embed.js';
      ds.charset = 'UTF-8';
      (document.getElementsByTagName('head')[0]
      || document.getElementsByTagName('body')[0]).appendChild(ds);
    })();
  </script>



      <script src="/lib/ua-parser-js/dist/ua-parser.min.js?v=0.7.9"></script>
      <script src="/js/src/hook-duoshuo.js?v=5.1.0"></script>





  <script type="text/javascript">
    // Popup Window;
    var isfetched = false;
    // Search DB path;
    var search_path = "search.xml";
    if (search_path.length == 0) {
      search_path = "search.xml";
    }
    var path = "/" + search_path;
    // monitor main search box;
    function proceedsearch() {
      $("body")
        .append('<div class="search-popup-overlay local-search-pop-overlay"></div>')
        .css('overflow', 'hidden');
      $('.popup').toggle();
    }
    // search function;
    var searchFunc = function(path, search_id, content_id) {
      'use strict';
      $.ajax({
        url: path,
        dataType: "xml",
        async: true,
        success: function( xmlResponse ) {
          // get the contents from search data
          isfetched = true;
          $('.popup').detach().appendTo('.header-inner');
          var datas = $( "entry", xmlResponse ).map(function() {
            return {
              title: $( "title", this ).text(),
              content: $("content",this).text(),
              url: $( "url" , this).text()
            };
          }).get();
          var $input = document.getElementById(search_id);
          var $resultContent = document.getElementById(content_id);
          $input.addEventListener('input', function(){
            var matchcounts = 0;
            var str='<ul class=\"search-result-list\">';
            var keywords = this.value.trim().toLowerCase().split(/[\s\-]+/);
            $resultContent.innerHTML = "";
            if (this.value.trim().length > 1) {
              // perform local searching
              datas.forEach(function(data) {
                var isMatch = false;
                var content_index = [];
                var data_title = data.title.trim().toLowerCase();
                var data_content = data.content.trim().replace(/<[^>]+>/g,"").toLowerCase();
                var data_url = decodeURIComponent(data.url);
                var index_title = -1;
                var index_content = -1;
                var first_occur = -1;
                // only match articles with non-empty titles and contents
                if(data_title != '') {
                  keywords.forEach(function(keyword, i) {
                    index_title = data_title.indexOf(keyword);
                    index_content = data_content.indexOf(keyword);
                    if( index_title >= 0 || index_content >= 0 ){
                      isMatch = true;
                      if (i == 0) {
                        first_occur = index_content;
                      }
                    }
                  });
                }
                // show search results
                if (isMatch) {
                  matchcounts += 1;
                  str += "<li><a href='"+ data_url +"' class='search-result-title'>"+ data_title +"</a>";
                  var content = data.content.trim().replace(/<[^>]+>/g,"");
                  if (first_occur >= 0) {
                    // cut out 100 characters
                    var start = first_occur - 20;
                    var end = first_occur + 80;
                    if(start < 0){
                      start = 0;
                    }
                    if(start == 0){
                      end = 50;
                    }
                    if(end > content.length){
                      end = content.length;
                    }
                    var match_content = content.substring(start, end);
                    // highlight all keywords
                    keywords.forEach(function(keyword){
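                      // escape regex metacharacters so keywords such as "c++" do not throw
                      var regS = new RegExp(keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "gi");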
                      match_content = match_content.replace(regS, "<b class=\"search-keyword\">"+keyword+"</b>");
                    });
                    str += "<p class=\"search-result\">" + match_content +"...</p>";
                  }
                  str += "</li>";
                }
              })};
            str += "</ul>";
            if (matchcounts == 0) { str = '<div id="no-result"><i class="fa fa-frown-o fa-5x" /></div>' }
            if (keywords == "") { str = '<div id="no-result"><i class="fa fa-search fa-5x" /></div>' }
            $resultContent.innerHTML = str;
          });
          proceedsearch();
        }
      });}
    // handle and trigger popup window;
    $('.popup-trigger').click(function(e) {
      e.stopPropagation();
      if (isfetched == false) {
        searchFunc(path, 'local-search-input', 'local-search-result');
      } else {
        proceedsearch();
      };
    });
    $('.popup-btn-close').click(function(e){
      $('.popup').hide();
      $(".local-search-pop-overlay").remove();
      $('body').css('overflow', '');
    });
    $('.popup').click(function(e){
      e.stopPropagation();
    });
  </script>


  <script src="https://cdn1.lncld.net/static/js/av-core-mini-0.6.1.js"></script>
  <script>AV.initialize("cOFi0858xVYxKW1wnErxqEra-gzGzoHsz", "7LaqqR82XnjzTbkv5eCKw5aW");</script>
  <script>
    function showTime(Counter) {
      var query = new AV.Query(Counter);
      var entries = [];
      var $visitors = $(".leancloud_visitors");
      $visitors.each(function () {
        entries.push( $(this).attr("id").trim() );
      });
      query.containedIn('url', entries);
      query.find()
        .done(function (results) {
          var COUNT_CONTAINER_REF = '.leancloud-visitors-count';
          if (results.length === 0) {
            $visitors.find(COUNT_CONTAINER_REF).text(0);
            return;
          }
          for (var i = 0; i < results.length; i++) {
            var item = results[i];
            var url = item.get('url');
            var time = item.get('time');
            var element = document.getElementById(url);
            $(element).find(COUNT_CONTAINER_REF).text(time);
          }
          for(var i = 0; i < entries.length; i++) {
            var url = entries[i];
            var element = document.getElementById(url);
            var countSpan = $(element).find(COUNT_CONTAINER_REF);
            if( countSpan.text() == '') {
              countSpan.text(0);
            }
          }
        })
        .fail(function (object, error) {
          console.log("Error: " + error.code + " " + error.message);
        });
    }
    function addCount(Counter) {
      var $visitors = $(".leancloud_visitors");
      var url = $visitors.attr('id').trim();
      var title = $visitors.attr('data-flag-title').trim();
      var query = new AV.Query(Counter);
      query.equalTo("url", url);
      query.find({
        success: function(results) {
          if (results.length > 0) {
            var counter = results[0];
            counter.fetchWhenSave(true);
            counter.increment("time");
            counter.save(null, {
              success: function(counter) {
                var $element = $(document.getElementById(url));
                $element.find('.leancloud-visitors-count').text(counter.get('time'));
              },
              error: function(counter, error) {
                console.log('Failed to save Visitor num, with error message: ' + error.message);
              }
            });
          } else {
            var newcounter = new Counter();
            /* Set ACL */
            var acl = new AV.ACL();
            acl.setPublicReadAccess(true);
            acl.setPublicWriteAccess(true);
            newcounter.setACL(acl);
            /* End Set ACL */
            newcounter.set("title", title);
            newcounter.set("url", url);
            newcounter.set("time", 1);
            newcounter.save(null, {
              success: function(newcounter) {
                var $element = $(document.getElementById(url));
                $element.find('.leancloud-visitors-count').text(newcounter.get('time'));
              },
              error: function(newcounter, error) {
                console.log('Failed to create');
              }
            });
          }
        },
        error: function(error) {
          console.log('Error:' + error.code + " " + error.message);
        }
      });
    }
    $(function() {
      var Counter = AV.Object.extend("Counter");
      if ($('.leancloud_visitors').length == 1) {
        addCount(Counter);
      } else if ($('.post-title-link').length > 1) {
        showTime(Counter);
      }
    });
  </script>



</body>
</html>

Process finished with exit code 0

OK, the request now gets through the proxy and returns the page source, so you can do whatever you like with it.
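As one possible next step, here is a minimal sketch of pulling the page title and links out of the fetched source. It uses only the standard library's `html.parser` instead of BeautifulSoup so it runs without extra installs. The names `TitleAndLinkParser` and `extract_title_and_links` are made up for this illustration, and `sample` is a stand-in for the `content` string returned by the request.

```python
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Hypothetical helper: collects the <title> text and all <a href> values."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs; skip anchors without href
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title>...</title> is kept
        if self._in_title:
            self.title += data


def extract_title_and_links(html):
    """Return (title, [href, ...]) for an HTML string."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.links


# Stand-in for the `content` string fetched through the proxy (assumption)
sample = (
    "<html><head><title> Demo </title></head>"
    '<body><a href="/a">A</a><a>no href</a><a href="/b">B</a></body></html>'
)
title, links = extract_title_and_links(sample)
print(title)
print(links)
```

In a real run you would pass `content` in place of `sample`. For messy real-world pages BeautifulSoup is usually more forgiving, but this keeps the example free of dependencies.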

If you spot any problems, corrections are welcome.


Reposted from blog.csdn.net/qq_21265915/article/details/78695632