Python + Selenium: Automatically Fetching Douyu Live-Stream Information from the Web

Enterprise Development 2023-04-06 16:03:46

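Environment Setup

Python
A driver for Google Chrome or another browser. Ideally add it to the PATH environment variable (a global driver), although a driver referenced by a local path also works.
Install the selenium library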
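
Before running the script, it can help to confirm that the pieces listed above are in place. The following is a minimal sketch that uses only the standard library. The helper name `check_environment` and the default driver name `chromedriver` are assumptions made for illustration. Selenium 4.6 and later can also fetch a matching driver through Selenium Manager, so a missing driver on PATH is not necessarily a problem.

```python
import importlib.util
import shutil


def check_environment(driver_name="chromedriver"):
    """Report whether selenium is importable and where the driver sits on PATH.

    `driver_name` is an assumed default; pass e.g. "geckodriver" for Firefox.
    """
    return {
        # find_spec returns None when the package is not installed
        "selenium_installed": importlib.util.find_spec("selenium") is not None,
        # shutil.which returns None when the executable is not on PATH
        "driver_path": shutil.which(driver_name),
    }
```

Calling `print(check_environment())` shows both results at a glance.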

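Implementation

Open the browser
Locate elements

Open the page in the browser's developer tools and you will see that each live-stream card is an li element whose class is layout-Cover-item. We can simply fetch the whole collection of these li elements.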
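
The cards are rendered after the page loads, so a lookup made too early can come back empty. As a hedged sketch of one way to handle that, the helper below polls `find_elements` until something appears or a timeout expires. The name `wait_for_elements` and its `timeout` and `interval` parameters are hypothetical. Selenium ships `WebDriverWait` for the same purpose; this hand-rolled version only makes the idea visible and is not the article's method.

```python
import time


def wait_for_elements(driver, by, value, timeout=10.0, interval=0.5):
    """Poll driver.find_elements(by, value) until it returns a non-empty list.

    Raises TimeoutError if nothing is found within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements(by, value)
        if found:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no elements matched {value!r} within {timeout}s")
        time.sleep(interval)
```

With a real driver it would be called as `wait_for_elements(driver, By.CLASS_NAME, "layout-Cover-item")`.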

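Operate on elements

Once the li elements are fetched, we still need to extract each live room's title, streamer name, popularity, and other data.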

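All of this data can be retrieved. Note that what we iterate over is the web elements in the fetched li collection, so each field is looked up inside its own li element.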
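
The per-card lookup can be sketched as a small helper. This is an illustration under assumptions, not the author's code: the names `FIELDS` and `extract_room` are hypothetical, and the class names are the ones used in the article's script. It uses `find_elements` rather than `find_element` because the former returns an empty list when nothing matches, so a card with a missing field yields None instead of raising an exception.

```python
# Class names taken from the article's script. "class name" is the string
# value of selenium's By.CLASS_NAME, used so this helper needs no import.
FIELDS = {
    "title": "DyListCover-intro",
    "types": "DyListCover-zone",
    "name": "DyListCover-userName",
    "hot": "DyListCover-hot",
}


def extract_room(card, by="class name"):
    """Build one dict for a live-room card, using None for missing fields."""
    item = {}
    for key, class_name in FIELDS.items():
        # Search inside this card only, not the whole page
        matches = card.find_elements(by, class_name)
        item[key] = matches[0].text if matches else None
    return item
```

In the scraping loop this would replace the four separate lookups with `item = extract_room(li)`.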

Get element information

# @creator by wlh
# @date 2023/3/16 14:01
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = "https://www.douyu.com/directory/all"

driver.get(url)
# Maximize the window
driver.maximize_window()
time.sleep(1)

n = 1
# Simulate fetching 10 pages
while n < 11:
    # Scroll to the bottom
    driver.execute_script("window.scrollTo(0, 10000)")
    # Wait for the elements to load
    time.sleep(2)
    # Get the list of all live-room elements
    lst = driver.find_elements(By.CLASS_NAME, "layout-Cover-item")
    # Iterate over all li tags
    for li in lst:
        # Look up each field inside the current li element
        item = {
            "title": li.find_element(By.CLASS_NAME, "DyListCover-intro").text,
            "types": li.find_element(By.CLASS_NAME, "DyListCover-zone").text,
            "name": li.find_element(By.CLASS_NAME, "DyListCover-userName").text,
            "hot": li.find_element(By.CLASS_NAME, "DyListCover-hot").text,
        }
        print(item)

    # Go to the next page ("下一页" is the button's title attribute on the page)
    next_btn = driver.find_element(By.XPATH, "//*[@title='下一页']")
    if not next_btn.is_enabled():
        # Stop here; without this the loop would never end on the last page
        break
    next_btn.click()
    n += 1
    time.sleep(1)

time.sleep(5)
driver.quit()
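
The script prints `hot` as the raw text shown on the page. If the values need to be sorted or compared, they have to be turned into numbers first. The sketch below assumes the page abbreviates large counts with the units 万 (ten thousand) and 亿 (hundred million), for example "12.3万"; that format is an assumption and is not confirmed by the article. The name `parse_hot` is hypothetical.

```python
from decimal import Decimal, InvalidOperation

# Assumed unit suffixes; adjust if the page uses a different format
UNITS = {"万": 10_000, "亿": 100_000_000}


def parse_hot(text):
    """Convert a popularity string such as "12.3万" to an int, or None."""
    text = (text or "").strip()
    if not text:
        return None
    multiplier = 1
    if text[-1] in UNITS:
        multiplier = UNITS[text[-1]]
        text = text[:-1]
    try:
        # Decimal avoids binary floating-point rounding on values like 12.3
        return int(Decimal(text) * multiplier)
    except (InvalidOperation, ValueError, OverflowError):
        return None
```

Inside the loop this could be applied as `item["hot"] = parse_hot(item["hot"])` before printing.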

Reposted from blog.csdn.net/weixin_45248492/article/details/129587613
