Scrapy error: LookupError: unknown encoding: 'b'utf8''; XPath parsing error: LookupError: unknown encoding: 'b'utf8''

The full error output is as follows:

2025-02-18 18:11:01 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.taiwu.com/zufang> (referer: None)
Traceback (most recent call last):
  File "D:\soft\anaconda3\Lib\site-packages\twisted\internet\defer.py", line 1075, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\scrapy\spiders\__init__.py", line 79, in _parse
    return self.parse(response, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\PycharmProjects\crawler_project\taiwu\taiwu\spiders\taiwu_zufang.py", line 23, in parse
    span = response.xpath("//a[@class='cursor-p']/span").getall()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\scrapy\http\response\text.py", line 157, in xpath
    return self.selector.xpath(query, **kwargs)
           ^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\scrapy\http\response\text.py", line 145, in selector
    self._cached_selector = Selector(self)
                            ^^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\scrapy\selector\unified.py", line 97, in __init__
    super().__init__(text=text, type=st, **kwargs)
  File "D:\soft\anaconda3\Lib\site-packages\parsel\selector.py", line 496, in __init__
    root, type = _get_root_and_type_from_text(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\parsel\selector.py", line 377, in _get_root_and_type_from_text
    root = _get_root_from_text(text, type=type, **lxml_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\parsel\selector.py", line 329, in _get_root_from_text
    return create_root_node(text, _ctgroup[type]["_parser"], **lxml_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\parsel\selector.py", line 110, in create_root_node
    parser = parser_cls(recover=True, encoding=encoding, huge_tree=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\soft\anaconda3\Lib\site-packages\lxml\html\__init__.py", line 1887, in __init__
    super().__init__(**kwargs)
  File "src\\lxml\\parser.pxi", line 1806, in lxml.etree.HTMLParser.__init__
  File "src\\lxml\\parser.pxi", line 858, in lxml.etree._BaseParser.__init__
LookupError: unknown encoding: 'b'utf8''
2025-02-18 18:11:01 [scrapy.core.engine] INFO: Closing spider (finished)
2025-02-18 18:11:01 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 310,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 102477,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'elapsed_time_seconds': 0.738762,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2025, 2, 18, 10, 11, 1, 573745, tzinfo=datetime.timezone.utc),
 'httpcompression/response_bytes': 471422,
 'httpcompression/response_count': 1,
 'log_count/DEBUG': 4,
 'log_count/ERROR': 1,
 'log_count/INFO': 10,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'spider_exceptions/LookupError': 1,
 'start_time': datetime.datetime(2025, 2, 18, 10, 11, 0, 834983, tzinfo=datetime.timezone.utc)}
2025-02-18 18:11:01 [scrapy.core.engine] INFO: Spider closed (finished)

D:\PycharmProjects\crawler_project\taiwu\taiwu\spiders>

The key error is the last line of the traceback:
LookupError: unknown encoding: 'b'utf8''
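One way to read the odd-looking message: the inner b'utf8' looks like the repr of a Python bytes object embedded in a quoted string. That suggests the encoding name utf8 reached the parser as bytes and was rejected there, so the name itself is the thing to look at. This is an interpretation of the message text, not something the traceback states. The sketch below only reproduces the string formatting; the variable names are hypothetical and this is not lxml's actual source.

```python
# Hypothetical reconstruction of how the message text could be produced.
# Assumption: the encoding name is held as bytes when the message is built.
encoding_name = "utf8".encode("ascii")  # b'utf8'

# Formatting a bytes object into a string inserts its repr, including the b prefix.
message = f"unknown encoding: '{encoding_name}'"
print(message)
```

If this reading is right, the b'...' wrapper is only a display detail, and the name utf8 is what the parser did not accept.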
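From the traceback, the error is raised in D:\soft\anaconda3\Lib\site-packages\parsel\selector.py, at line 110 in create_root_node, where the lxml parser is constructed with encoding=encoding.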
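The cause is that when response.xpath() builds the selector, the parser fails to recognize the encoding name it is given. After modifying selector.py, rerun the spider.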
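The text above does not spell out the exact edit. A workaround often reported for this error, assumed here rather than confirmed by this post, is to use the canonical name "utf-8" instead of the alias "utf8" for the encoding passed to the parser in parsel's selector.py. The sketch below uses only the standard library to show how an alias can be mapped to its canonical name; canonical_encoding is a hypothetical helper, not part of Scrapy, parsel, or lxml.

```python
import codecs


def canonical_encoding(name):
    """Return Python's canonical codec name for an encoding alias.

    Hypothetical helper for illustration; accepts str or bytes.
    """
    if isinstance(name, bytes):
        name = name.decode("ascii")
    # codecs.lookup resolves aliases such as "utf8" or "UTF8"
    # and raises LookupError for names Python does not know.
    return codecs.lookup(name).name


print(canonical_encoding("utf8"))
print(canonical_encoding(b"UTF8"))
```

Editing files under site-packages is overwritten by the next package upgrade, so it may also be worth checking whether a different lxml or parsel version avoids the error in your environment.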