1. Basic usage
Many tutorials on the Internet use the find_element_by_xpath method, but with the relatively new selenium webdriver I used, this method is no longer available. Use find_element(By.XPATH,'A XPATH Value') instead:
from selenium.webdriver.common.by import By
web_element = driver.find_element(By.XPATH,'A XPATH Value')
2. About xpath
(1) Copy an xpath from the web page
The naive xpath value used in this article was obtained as follows:
press F12 to open the developer tools,
click the element-picker icon in the upper left corner of the developer tools panel,
and click an element on the page; its location in the source code is then highlighted on the right.
Right-click that position in the source code to copy an xpath. This kind of xpath is "not good", though, because if the page source changes, the xpath is likely to change as well.
(2) Why does xpath look like this
First, a simple analysis of the xpath just mentioned:
//*[@id="root"]/div/main/div/div/div/div/div[2]/div/div[1]/div/div[1]/form/div[2]/div/label/input
username = self.driver.find_element(By.XPATH,'//*[@id="root"]/div/main/div/div/div/div/div[2]/div/div[1]/div/div[1]/form/div[2]/div/label/input').text
The xpath above, combined with the python3 statement above, can be read as follows. Within self.driver (the whole page), search all tags at any depth (because of the leading //) for a tag of any name (because of *) whose id attribute has the value "root". From that tag, follow the path one level at a time: its child div, the main tag under that, four nested levels of div, then the second div (div[2]) at the next level, and so on down to the form, the second div under the form, a div, a label, and finally the input tag. A step without an index matches every child with that name; because the find_element method is used (expanded on later), only the first match is returned.
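To see how the steps and indexes in such a path resolve, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree instead of a browser. The PAGE document is made up for illustration and is much shallower than the real page. ElementTree supports only a subset of xpath and expects paths to start with a dot, so the leading // becomes .// here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page: not the real site's markup.
PAGE = """
<html><body>
  <div id="root">
    <main>
      <form>
        <div><label><input name="other"/></label></div>
        <div><label><input name="username"/></label></div>
      </form>
    </main>
  </div>
</body></html>
""".strip()

tree = ET.fromstring(PAGE)

# .//*            any tag at any depth
# [@id="root"]    ...whose id attribute equals "root"
# /main/form      then child steps, one level at a time
# /div[2]         the second div child of the form
found = tree.find('.//*[@id="root"]/main/form/div[2]/label/input')
print(found.get("name"))
```

Changing div[2] to div[1] selects the input inside the first div of the form instead, which is why a copied path breaks as soon as the page inserts or removes a sibling.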
3. Customize xpath according to your needs
(1) Multiple xpath methods can correspond to the same HTML tag
Once you roughly understand how xpath works, you can design your own xpath according to your needs. Taking the xpath above as an example:
①
//*[@id="root"]/div/main/div/div/div/div/div[2]/div/div[1]/div/div[1]/form/div[2]/div/label/input
can be changed to:
//div[@id="root"]/div/main/div/div/div/div/div[2]/div/div[1]/div/div[1]/form/div[2]/div/label/input
This is only a small change: the * (any tag) at the start is replaced with div.
② As another example, the complete xpath value of the tag is:
/html/body/div[1]/div/main/div/div/div/div/div[2]/div/div[1]/div/div[1]/form/div[2]/div/label/input
Here /html/body/div[1] and //div[@id="root"] point to the same tag, so they can be replaced with each other.
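The equivalence of these spellings can be checked without a browser. The sketch below again uses the standard-library xml.etree.ElementTree on a made-up PAGE (an assumption, not the real markup). Because the parsed root is already the html element, the full path /html/body/div[1] is written as ./body/div[1].

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page used only for illustration.
PAGE = """
<html><body>
  <div id="root"><form><label><input name="username"/></label></form></div>
  <div id="footer"></div>
</body></html>
""".strip()

html = ET.fromstring(PAGE)

# Three different xpaths for the same input tag.
by_any_tag = html.find('.//*[@id="root"]/form/label/input')
by_div_tag = html.find('.//div[@id="root"]/form/label/input')
by_full_path = html.find('./body/div[1]/form/label/input')

# True only if all three lookups returned the very same element object.
same = by_any_tag is by_div_tag is by_full_path
print(same)
```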
(2) Obtain the custom xpath value according to the attribute information of the tag
Notice that the input tag above has a class attribute whose value is "Input i7cW1UcwT6ThdhTakqFm username-input". Searching for class="Input i7cW1UcwT6ThdhTakqFm username-input" with ctrl+F in the console shows that only two tags have it (that is, two input tags meet this condition). We want the input tag for the phone number, so the xpath used in the python code below can be designed like this:
phone_number = self.driver.find_element(By.XPATH,'XPATH')
//input[@class="Input i7cW1UcwT6ThdhTakqFm username-input"]
Or
//*[@class="Input i7cW1UcwT6ThdhTakqFm username-input"]
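The meaning is: within self.driver, find the first input tag whose class attribute is "Input i7cW1UcwT6ThdhTakqFm username-input". The part in square brackets is the restricting condition, and the @ that marks an attribute must not be left out. In this case input can also be replaced with * (any tag).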
As mentioned above, the xpath copied from the console is generally not very good (in my opinion), because any change in the page source can change the xpath. The attributes of a given tag, however, usually do not change, so an xpath based on them makes the code more tolerant of changes in the page source.
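One caveat is that @class="..." compares the whole attribute string, so the match fails if a class is added, removed, or reordered. A part such as i7cW1UcwT6ThdhTakqFm also looks machine-generated and may change between releases (this is my assumption, not something the page confirms). The sketch below builds both the exact form used above and a looser form that only requires one class name to be present; the two helper functions are hypothetical names introduced for illustration.

```python
def xpath_attr_equals(tag: str, attr: str, value: str) -> str:
    """Build an xpath matching tags whose attribute is exactly `value`."""
    if '"' in value:
        raise ValueError("value must not contain double quotes in this sketch")
    return f'//{tag}[@{attr}="{value}"]'


def xpath_class_contains(tag: str, class_name: str) -> str:
    """Build an xpath matching tags whose class list contains `class_name`."""
    if '"' in class_name or " " in class_name:
        raise ValueError("class_name must be a single class without quotes")
    # Padding with spaces makes sure only whole class names match.
    return (
        f'//{tag}[contains(concat(" ", normalize-space(@class), " "), '
        f'" {class_name} ")]'
    )


exact = xpath_attr_equals("input", "class", "Input i7cW1UcwT6ThdhTakqFm username-input")
loose = xpath_class_contains("input", "username-input")
print(exact)
print(loose)
```

Either string can be passed as the second argument of find_element(By.XPATH, ...). The looser form may match more tags than the exact one, so it is worth checking the match count in the console first.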
(3) Get all web elements that meet a certain xpath condition
With an xpath value customized from tag attributes, as above, you can get all web elements that meet a certain xpath condition at once. This is very useful when processing similar data in batches. In the example above, you need to enter the verification code after entering the mobile phone number. If you keep using the find_element method with full xpaths, you may need python statements like these:
phone_number = self.driver.find_element(By.XPATH,'//*[@id="root"]/div/main/div/div/div/div/div[2]/div/div[1]/div/div[1]/form/div[2]/div/label/input')
code = self.driver.find_element(By.XPATH,'//*[@id="root"]/div/main/div/div/div/div/div[2]/div/div[1]/div/div[1]/form/div[3]/div/label/input')
With this approach you have to copy an xpath value from the web page for every element.
Using the attribute-based xpath described above instead, you can write the following python code:
input_list = self.driver.find_elements(By.XPATH,'//input[@class="Input i7cW1UcwT6ThdhTakqFm username-input"]')
phone_number = input_list[0]
code = input_list[1]
The find_elements method is used here. Unlike find_element, which returns only the first match (the same element as find_elements(...)[0]), it returns a list of web elements. The code above means: input_list is a list of all web elements under self.driver (the whole page) whose class is "Input i7cW1UcwT6ThdhTakqFm username-input".
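Indexing the list directly assumes that exactly two inputs match. When nothing matches, find_elements returns an empty list rather than raising as find_element does, so input_list[0] would fail with a bare IndexError. The sketch below adds a length check; find_phone_and_code is a hypothetical helper, and the try/except around the import is only there so the snippet also runs where Selenium is not installed.

```python
try:
    from selenium.webdriver.common.by import By
    XPATH = By.XPATH
except ImportError:
    # Fallback for reading/running the sketch without Selenium;
    # "xpath" is the string value of By.XPATH.
    XPATH = "xpath"

INPUT_XPATH = '//input[@class="Input i7cW1UcwT6ThdhTakqFm username-input"]'


def find_phone_and_code(driver):
    """Return the (phone_number, code) inputs, failing loudly if the page changed."""
    input_list = driver.find_elements(XPATH, INPUT_XPATH)
    if len(input_list) != 2:
        raise LookupError(f"expected 2 matching inputs, found {len(input_list)}")
    return input_list[0], input_list[1]
```

Inside the class from the article this would be called as phone_number, code = find_phone_and_code(self.driver).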
(4) Relative path + xpath value
Sometimes there is a requirement like this:
for every answer of a user, you need to obtain the number of approvals, the title, the content, and so on. Of course, you can write code like the following:
agree_list = self.driver.find_elements(By.XPATH,'//Button[@class="Button VoteButton VoteButton--up FEfUrdfMIKpQDJDqkjte"]/span')
titles = ....
content = ...
......
for i, agree_info in enumerate(agree_list):
    agree = agree_list[i]
    title = titles[i]
    ....
But sometimes one answer may contain more than one tag with class="Button VoteButton VoteButton--up FEfUrdfMIKpQDJDqkjte" (the agree button), and then the indexes of the lists no longer line up. To give a contrived example:
what if the button at the underlined position in the picture below also had class="Button VoteButton VoteButton--up FEfUrdfMIKpQDJDqkjte"?
How would you solve that? (It actually does not have this class; this is just an example.)
One way to approach it is shown below
(the code belongs inside a method of some class):
list_items = self.driver.find_elements(By.XPATH,'//div[@class="List-item"]')
for list_item in list_items:
    agree = list_item.find_element(By.XPATH,'.//Button[@class="Button VoteButton VoteButton--up FEfUrdfMIKpQDJDqkjte"]/span')
    ...
......
Note: in the code above, the // at the beginning of the xpath is replaced with .// and self.driver is replaced with another web element (list_item), so that only the button inside this particular list item is obtained.
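The difference between a page-wide search and a search scoped to one list item can be tried without a browser. The sketch below uses the standard-library xml.etree.ElementTree on a made-up PAGE with shortened class names (both are assumptions for illustration). In Selenium the leading dot matters for the same reason: an xpath that starts with // still searches the whole document even when it is called on list_item.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the first answer contains two vote buttons.
PAGE = """
<div id="answers">
  <div class="List-item">
    <h2>First answer</h2>
    <button class="VoteButton"><span>12</span></button>
    <button class="VoteButton"><span>3</span></button>
  </div>
  <div class="List-item">
    <h2>Second answer</h2>
    <button class="VoteButton"><span>7</span></button>
  </div>
</div>
""".strip()

page = ET.fromstring(PAGE)

# Page-wide search: three buttons for two answers, so indexes do not line up.
all_votes = [s.text for s in page.findall('.//button[@class="VoteButton"]/span')]

# Scoped search: start from each list item and take its first matching button.
per_answer = []
for list_item in page.findall('.//div[@class="List-item"]'):
    title = list_item.find('./h2').text
    agree = list_item.find('.//button[@class="VoteButton"]/span').text
    per_answer.append((title, agree))

print(all_votes)
print(per_answer)
```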
4. Explicitly wait for the element corresponding to xpath to appear
If the network is slow, the python file may call find_element or other methods before the page is fully loaded, causing the program to report an error and exit abnormally. To prevent this, you can use the following statements to wait for the element corresponding to the xpath to appear:
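from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Wait up to 10000 seconds for the element to be present in the page
WebDriverWait(self.driver,10000).until(EC.presence_of_element_located((By.XPATH, 'XPATH')))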
Here 10000 is the maximum waiting time, in seconds. If the element corresponding to the xpath still has not appeared when the timeout expires, the program will still report an error. Compared with using sleep, the advantage of this method is that as soon as the element corresponding to the xpath appears on the page, the program continues executing immediately, instead of still waiting after the element has already appeared.
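The difference from a fixed sleep is easier to see in a small polling loop. The sketch below is a conceptual stand-in written with the standard library only, not Selenium's actual implementation; wait_until and element_appears are hypothetical names. WebDriverWait works along the same lines, checking its condition repeatedly (every 0.5 seconds by default) until it succeeds or the timeout expires.

```python
import time


def wait_until(condition, timeout, poll_interval=0.5):
    """Call `condition()` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # return as soon as the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Usage with a stand-in condition that succeeds on the third check.
attempts = []


def element_appears():
    attempts.append(1)
    return "element" if len(attempts) >= 3 else None


result = wait_until(element_appears, timeout=5, poll_interval=0.01)
print(result, len(attempts))
```

The call returns after three quick checks rather than after the full five seconds, which is the behavior that makes an explicit wait preferable to sleep.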
5. WEBELEMENT.click() cannot click an invisible element
When an element is invisible (not displayed in the current view), the WEBELEMENT.click() method cannot click it.
As a solution, you can use the following statement:
self.driver.execute_script("arguments[0].click();", web_element_clickable)
Here web_element_clickable is the web element to be clicked.
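A JavaScript click skips the checks that WEBELEMENT.click() performs, so it can also click things a real user could not reach. One option is to try the normal click first and use the script only as a fallback. The sketch below shows that idea; click_with_js_fallback is a hypothetical helper, and the try/except around the import only lets the snippet run where Selenium is not installed.

```python
try:
    from selenium.common.exceptions import WebDriverException
except ImportError:
    # Stand-in so the sketch runs without Selenium installed.
    class WebDriverException(Exception):
        pass


def click_with_js_fallback(driver, web_element):
    """Try a normal click first; fall back to a JavaScript click if it fails."""
    try:
        web_element.click()
        return "native"
    except WebDriverException:
        # Same statement as in the article, used only when click() raised.
        driver.execute_script("arguments[0].click();", web_element)
        return "javascript"
```

The return value is only there to make it visible which path was taken; it can be dropped in real code.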
6. The page does not load all the content at one time
Another issue is that the page does not load all of its content at one time; you have to scroll down the page to load more. You can use the following python statements. In the example below, if the current page has fewer than 10 web elements matching the xpath //div[@class="List-item"], the page is scrolled down until the condition is met.
# Minimum number of "List-item" elements to load before stopping
valid_answers_count = 10
ask_answers = self.driver.find_elements(By.XPATH,'//div[@class="List-item"]')
ask_answers_count = len(ask_answers)
# Scroll down until the count is at least valid_answers_count
while ask_answers_count < valid_answers_count:
    # Scroll by 100 pixels
    self.driver.execute_script("window.scrollBy(0, 100);")
    # Get the updated ask_answers_count of div elements with class "List-item"
    ask_answers = self.driver.find_elements(By.XPATH,'//div[@class="List-item"]')
    ask_answers_count = len(ask_answers)
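A loop like this never ends if the page runs out of content before the target count is reached. The sketch below wraps the same idea in a function with an upper bound on the number of scrolls; scroll_until_count, max_scrolls, and step_px are hypothetical names, and the try/except around the import only lets the snippet run where Selenium is not installed.

```python
try:
    from selenium.webdriver.common.by import By
    XPATH = By.XPATH
except ImportError:
    # "xpath" is the string value of By.XPATH.
    XPATH = "xpath"


def scroll_until_count(driver, xpath, wanted, max_scrolls=200, step_px=100):
    """Scroll down until at least `wanted` elements match, or give up."""
    items = driver.find_elements(XPATH, xpath)
    scrolls = 0
    while len(items) < wanted and scrolls < max_scrolls:
        # Scroll by step_px pixels, then count the matching elements again.
        driver.execute_script("window.scrollBy(0, arguments[0]);", step_px)
        scrolls += 1
        items = driver.find_elements(XPATH, xpath)
    return items  # may hold fewer than `wanted` items if the limit was hit
```

Inside the class from the article this would be called as ask_answers = scroll_until_count(self.driver, '//div[@class="List-item"]', 10). A short pause between scrolls may be needed on slow pages so that new content has time to load.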