After every scroll step, you can check whether the scroll actually changed the page height:
    import time

    pause = 2  # example value (assumption): seconds to wait after each scroll

    lastHeight = driver.execute_script("return document.body.scrollHeight")
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)
        newHeight = driver.execute_script("return document.body.scrollHeight")
        if newHeight == lastHeight:
            break  # height unchanged, so nothing new was loaded
        lastHeight = newHeight
This uses a static wait, which is bad for two reasons: you don't want to wait unnecessarily when the load finishes sooner, and you don't want the script to exit prematurely when the dynamic load takes longer than the pause for some reason.
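One way to soften the fixed-pause problem, as a rough sketch: poll the height in short intervals and stop waiting as soon as it changes, giving up only after a timeout. The function name `scroll_to_bottom` and the `timeout` and `poll` parameters are made up for this illustration; the only thing assumed about `driver` is that it has `execute_script`.

```python
import time

def scroll_to_bottom(driver, timeout=20.0, poll=0.25):
    """Scroll until the page height stops growing (hypothetical helper).

    Instead of one fixed sleep, poll the height every `poll` seconds and
    give up only after `timeout` seconds without any change.
    Returns the number of scrolls that loaded new content.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    loads = 0
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        deadline = time.monotonic() + timeout
        new_height = last_height
        while time.monotonic() < deadline:
            new_height = driver.execute_script(get_height)
            if new_height != last_height:
                break  # content arrived early, no need to wait longer
            time.sleep(poll)
        if new_height == last_height:
            return loads  # nothing changed within the timeout
        last_height = new_height
        loads += 1
```

This returns early when content arrives quickly, but it still can't tell a very slow load from the real end of the page, so `timeout` has to be chosen generously.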
Since a page usually loads more elements into a list, you can record the length of the list before scrolling and then wait until the next element has been loaded.
For Twitter, this could look like this:
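    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    while True:
        # Count the items currently in the stream before scrolling.
        elemsCount = browser.execute_script("return document.querySelectorAll('.stream-items > li.stream-item').length")
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        try:
            # Wait up to 20 seconds for item number elemsCount+1 to appear.
            WebDriverWait(browser, 20).until(
                lambda x: x.find_element(By.XPATH,
                    "//*[contains(@class,'stream-items')]/li[contains(@class,'stream-item')]["+str(elemsCount+1)+"]"))
        except TimeoutException:
            break  # no new item arrived in time, so assume the end of the stream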
I used an XPath expression because PhantomJS 1.x has a bug that sometimes shows up when using :nth-child() CSS selectors.
Full version for reference.