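Fixing a failed install of the third-party Python library BeautifulSoup from CMD

The problem

While writing a web crawler, I needed to install the third-party library BeautifulSoup.

What I tried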

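First attempt: pip install BeautifulSoup
Problem 1
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("Unit tests have failed!")?
Method 1
Download the source archive from the official package site: https://files.pythonhosted.org/packages/1e/ee/295988deca1a5a7accd783d0dfe14524867e31abb05b6c0eeceee49c759d/BeautifulSoup-3.2.1.tar.gz
After extracting it, run this from the extracted directory: python setup.py install
P.S. To install from a wheel file instead: pip install **.whl
The installation still failed.
Reading setup.py in the extracted source shows that print is used as a Python 2 statement, not called as a function: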

from distutils.core import setup
import unittest
import warnings
warnings.filterwarnings("ignore", "Unknown distribution option")

import sys
# patch distutils if it can't cope with the "classifiers" keyword
if sys.version < '2.2.3':
    from distutils.dist import DistributionMetadata
    DistributionMetadata.classifiers = None
    DistributionMetadata.download_url = None

from BeautifulSoup import __version__

#Make sure all the tests complete.
import BeautifulSoupTests
loader = unittest.TestLoader()
result = unittest.TestResult()
suite = loader.loadTestsFromModule(BeautifulSoupTests)
suite.run(result)
if not result.wasSuccessful():
    print "Unit tests have failed!"
    for l in result.errors, result.failures:
        for case, error in l:
            print "-" * 80
            desc = case.shortDescription()
            if desc:
                print desc
            print error        
    print '''If you see an error like: "'ascii' codec can't encode character...", see\nthe Beautiful Soup documentation:\n http://www.crummy.com/software/BeautifulSoup/documentation.html#Why%20can't%20Beautiful%20Soup%20print%20out%20the%20non-ASCII%20characters%20I%20gave%20it?'''
    print "This might or might not be a problem depending on what you plan to do with\nBeautiful Soup."
    if sys.argv[1] == 'sdist':
        print
        print "I'm not going to make a source distribution since the tests don't pass."
        sys.exit(1)

setup(name="BeautifulSoup",
      version=__version__,
      py_modules=['BeautifulSoup', 'BeautifulSoupTests'],
      description="HTML/XML parser for quick-turnaround applications like screen-scraping.",
      author="Leonard Richardson",
      author_email = "[email protected]",
      long_description="""Beautiful Soup parses arbitrarily invalid SGML and provides a variety of methods and Pythonic idioms for iterating and searching the parse tree.""",
      classifiers=["Development Status :: 5 - Production/Stable",
                   "Intended Audience :: Developers",
                   "License :: OSI Approved :: Python Software Foundation License",
                   "Programming Language :: Python",
                   "Topic :: Text Processing :: Markup :: HTML",
                   "Topic :: Text Processing :: Markup :: XML",
                   "Topic :: Text Processing :: Markup :: SGML",
                   "Topic :: Software Development :: Libraries :: Python Modules",
                   ],
      url="http://www.crummy.com/software/BeautifulSoup/",
      license="BSD",
      download_url="http://www.crummy.com/software/BeautifulSoup/download/"
      )
    
    # Send announce to:
    #   [email protected]
    #   [email protected]
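
To make the failure concrete, here is a minimal sketch (not part of the package) that asks the running interpreter whether one of the print lines from the listing above parses. The helper name compiles_on_this_interpreter and the two constants are made up for illustration.

```python
# One line copied from the setup.py listing (Python 2 print statement).
PY2_LINE = 'print "Unit tests have failed!"'
# The same line written as a Python 3 function call.
PY3_LINE = 'print("Unit tests have failed!")'


def compiles_on_this_interpreter(source):
    """Return True if `source` parses as valid code on the running Python."""
    try:
        # compile() only parses the text; nothing is executed.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("print statement parses:", compiles_on_this_interpreter(PY2_LINE))
    print("print() call parses:   ", compiles_on_this_interpreter(PY3_LINE))
```

On a Python 3 interpreter the first check reports False, which is the same SyntaxError that pip hits when it runs this setup.py.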

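Solution

The root cause is that Python changed substantially between 2.x and 3.x.
print went from a statement to the print() function, so the Python 2-only setup.py of BeautifulSoup 3.2.1 cannot run on Python 3.
Tested: Python 3.7 works with the BS4 library (Beautiful Soup 4, installed with pip install beautifulsoup4).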
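
As a quick check that BS4 works on Python 3, a sketch like the following can be run after installing it. The HTML string and the helper extract_title are invented for this example; BeautifulSoup(html, "html.parser") and soup.title.string are regular bs4 calls. The script assumes bs4 may be missing and only prints a hint in that case.

```python
import importlib.util

# Tiny hand-written page used only for this demonstration.
HTML = "<html><head><title>Demo page</title></head><body><p>hello</p></body></html>"


def extract_title(html):
    """Parse `html` with Beautiful Soup 4 and return the text of <title>."""
    # Imported inside the function so this file still loads without bs4.
    from bs4 import BeautifulSoup

    # "html.parser" is the parser bundled with the standard library.
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.string


if __name__ == "__main__":
    if importlib.util.find_spec("bs4") is None:
        print("bs4 is not installed; try: pip install beautifulsoup4")
    else:
        print(extract_title(HTML))
```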

Also

Note that in IDLE the BS4 library cannot be imported under the name BeautifulSoup4; at the time I did not know why.
Resolution

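Beautiful Soup 3 is no longer developed, and Beautiful Soup 4 is recommended for current projects. Beautiful Soup 4 ships as the package bs4, so the import has to be written as import bs4. The article this explanation comes from used Beautiful Soup 4.3.2 (BS4 for short) on Python 2.7.7, and passed on a rumor that BS4 supports Python 3 poorly, suggesting that Python 3 users try BS3 instead. That suggestion does not match what happened here: BS3 is the Python 2-only package that failed to install above, while BS4 worked on Python 3.7.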
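
The IDLE import failure comes down to the name given to pip differing from the module name used in import. A small sketch, assuming nothing beyond the standard library, shows which spellings the interpreter can actually locate; is_importable is a hypothetical helper name.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module called `module_name` can be found."""
    # find_spec() searches sys.path without importing the module.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "beautifulsoup4" is the name given to pip; "bs4" is the module name.
    for name in ("bs4", "BeautifulSoup4", "beautifulsoup4"):
        status = "importable" if is_importable(name) else "not importable"
        print(name, "->", status)
```

With Beautiful Soup 4 installed, only bs4 should show as importable, which is why from bs4 import BeautifulSoup works and import BeautifulSoup4 does not.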


Reposted from blog.csdn.net/chenmo2019/article/details/84856803