Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

selenium - Python Scraping - Unable to get required data from Flipkart

I am trying to scrape customer reviews from the Flipkart website; the page URL is the one used in the code below. This is my code, but it always returns an empty list:

>>> from bs4 import BeautifulSoup
>>> import requests

>>> r = requests.get('https://www.flipkart.com/samsung-galaxy-j5-6-new-2016-edition-white-16-gb/product-reviews/itmegmrnzqjcpfg9?pid=MOBEG4XWJG7F9A6Z')
>>> soup = BeautifulSoup(r.content, 'lxml') # Tried with 'html.parser' also
>>> soup.find_all('div', '_3DCdKt')
[]
>>> soup.find_all('div', {'class': '_3DCdKt'})
[]
>>> soup.find_all('div', {'class': 'row _3wYu6I _3BRC7L'})
[]
>>> soup.find_all('div', {'class': '_1GRhLX hFPo14'})
[]

So, I tried to get the entire section, but I was getting only the following:

>>> soup.find_all('div', {'class': 'col-9-12'})
[<div class="col-9-12" data-reactid="96"><div class="row _2_xtR5" data-reactid="97"></div><div class="row _3wYu6I _1KVtzT" data-reactid="98"></div></div>]

The rest of the content was missing. Next I tried Selenium, but it returned None for every element. This is my Selenium code:

>>> from selenium import webdriver
>>> driver = webdriver.Firefox()
>>> driver.get('https://www.flipkart.com/samsung-galaxy-j5-6-new-2016-edition-white-16-gb/product-reviews/itmegmrnzqjcpfg9?pid=MOBEG4XWJG7F9A6Z')
>>> a = driver.find_elements_by_class_name("_3DCdKt")
>>> len(a)
10
>>> for i in a:
...    print i.get_attribute('value')
...
None
None
None
None
None
None
None
None
None
None

What might be the problem? Am I making a mistake in the code? Please help; I am new to Python.

1 Answer

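The reviews are populated client-side by React, so the HTML that requests downloads contains only empty containers. In the Selenium attempt, get_attribute('value') returns None because a div has no value attribute; the element's .text property holds the visible text. The review data itself is retrieved with an AJAX request, which you can mimic with requests: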

import requests

# productId is the pid query parameter at the end of the review page URL
# (pid=MOBEG4XWJG7F9A6Z).
params = {"productId": "MOBEG4XWJG7F9A6Z",
          "count": "15",
          "ratings": "ALL",
          "reviewerType": "ALL",
          "sortOrder": "MOST_HELPFUL"}

# Custom x-user-agent header sent along with the request.
headers = {"x-user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.92 Safari/537.36 FKUA/website/41/website/Desktop"}

# Call the reviews endpoint and parse the JSON body into a dict.
data = requests.get("https://www.flipkart.com/api/3/product/reviews", params=params, headers=headers).json()
print(data)
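
The productId above was copied by hand from the pid query parameter of the review page URL. As a small sketch of doing that step in code, the helper below (product_id_from_url is a hypothetical name, not part of requests or of Flipkart's API) pulls the parameter out with the standard library. It assumes the URL carries a pid parameter the way the one in the question does.

```python
from urllib.parse import urlparse, parse_qs


def product_id_from_url(url):
    """Return the value of the `pid` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("pid")
    return values[0] if values else None


# The review page URL from the question.
review_url = ("https://www.flipkart.com/samsung-galaxy-j5-6-new-2016-edition-white-16-gb/"
              "product-reviews/itmegmrnzqjcpfg9?pid=MOBEG4XWJG7F9A6Z")
print(product_id_from_url(review_url))  # MOBEG4XWJG7F9A6Z
```

The returned string can then be used as the "productId" value in the request parameters.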

The reviews are in data["RESPONSE"]["data"], which is a list of dicts:

for dct in data["RESPONSE"]["data"]:
    print(dct)

This prints one dict per review (Python 2 output, truncated here):

{u'action': None, u'fixed': False, u'value': {u'rating': 5, u'text': u'Thanks to Flipkart who deliver it me with in 5 days 
Good Phone With Metal Body 
And Best front Camera With Flash
Best for night Selfie 
I Take more than 30 pic in night mode with front flash 
good smartphone  gold color is also supereb
best ever smartphone under 15k by samsung
Good Battery
Good Camera Front with Flash and Rear Also Superb', u'reportAbuse': {u'action': {u'originalUrl': None, u'params': {u'vote': u'ABUSE', u'reviewId': u'be37810e-20fe-4417-9d88-2709288cf2ba', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 0, u'type': u'VoteValue'}, u'tracking': None}, u'totalCount': 285, u'downvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'DOWN', u'reviewId': u'be37810e-20fe-4417-9d88-2709288cf2ba', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 74, u'type': u'VoteValue'}, u'tracking': None}, u'id': u'be37810e-20fe-4417-9d88-2709288cf2ba', u'author': u'Happy Thakur', u'url': u'/reviews/be37810e-20fe-4417-9d88-2709288cf2ba', u'upvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'UP', u'reviewId': u'be37810e-20fe-4417-9d88-2709288cf2ba', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 211, u'type': u'VoteValue'}, u'tracking': None}, u'helpfulCount': 211, u'created': u'16 May, 2016', u'certifiedBuyer': True, u'title': u'Best Smartphone by Samsung', u'type': u'ProductReviewValue'}, u'tracking': None}
{u'action': None, u'fixed': False, u'value': {u'rating': 5, u'text': u"Updated Review on 02-August after 3 months of usage:
What I liked most:
Look : 100/100 - Very good looking phone. Gold color and the finishing is super cool
Size : 100/100 - 5.2 Inch is neither big nor small. I can still operate with one hand.. 
Battery : 100/100 - 3100 mAH is outstanding. 3G is always ON when i am out of home and Wi-Fi is always ON in home. I am charging mobile only once in every 36 hours. I use Whatsapp, instagram and Browsing mostly. 
Display : 90/100 - Not so bright and sharp as S series phones, but a real deal for the price. Impressed again. My only worry is about it is not having a Gorilla scratch proof glass. I may need to use tempered glass.
Touch : 95/100 - So smooth and I dont see any lags as of now.
Camera : 90/100 - Photos are good and can capture fast, but again not as great as S series phones. but at this price I believe this phone outclasses all other competitors in camera department. 

One last thing is about the SAMSUNG brand and its service center coverage, which is again awesome. 
Overall I am completely satisfied with the phone and this phone reached my expectations. 
What I disliked:
Earphone jack at the bottom.. I feel uncomfortable when chatting and listening to songs at same time
Low speaker volume, not a big deal though for me, As i don't use loudspeaker for songs mostly", u'reportAbuse': {u'action': {u'originalUrl': None, u'params': {u'vote': u'ABUSE', u'reviewId': u'e786669a-024b-4ef0-b70c-1e4fcf5fe5ff', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 0, u'type': u'VoteValue'}, u'tracking': None}, u'totalCount': 272, u'downvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'DOWN', u'reviewId': u'e786669a-024b-4ef0-b70c-1e4fcf5fe5ff', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 87, u'type': u'VoteValue'}, u'tracking': None}, u'id': u'e786669a-024b-4ef0-b70c-1e4fcf5fe5ff', u'author': u'Naresh Kareti', u'url': u'/reviews/e786669a-024b-4ef0-b70c-1e4fcf5fe5ff', u'upvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'UP', u'reviewId': u'e786669a-024b-4ef0-b70c-1e4fcf5fe5ff', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 185, u'type': u'VoteValue'}, u'tracking': None}, u'helpfulCount': 185, u'created': u'13 May, 2016', u'certifiedBuyer': True, u'title': u'Absolute Stunner and Impressive', u'type': u'ProductReviewValue'}, u'tracking': None}
{u'action': None, u'fixed': False, u'value': {u'rating': 3, u'text': u'Hi,

I got this phone from Flipkart on Friday and here is my 3 days review.

Pros:
 * Beautiful design
 * Very handy, easy to handle
 * Battery backup is great
 * Back camera is good
 * No heating issues
 
Cons:
 * If we are charging, it will not show any light or any notification whether it is charging or not. We need to on the screen and check whether it is charging or not. So every time we need to turn it on and see whether it is charging or not.
* Camera issue: Once you take the picture and then press the back button it is taking some time to come back to camera mode.
* If you turn on the flash and take pic with back camera it is taking some time to capture the picture. With out Flash it is taking very fast.
* Volume is very low. Not enough for a medium sized room.
* Ear phones are not good especially for me. 


Will post my feedback after using it another 15 days.

Thanks', u'reportAbuse': {u'action': {u'originalUrl': None, u'params': {u'vote': u'ABUSE', u'reviewId': u'9cbcd27c-a8ad-4793-978a-5903cd086252', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 0, u'type': u'VoteValue'}, u'tracking': None}, u'totalCount': 212, u'downvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'DOWN', u'reviewId': u'9cbcd27c-a8ad-4793-978a-5903cd086252', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 67, u'type': u'VoteValue'}, u'tracking': None}, u'id': u'9cbcd27c-a8ad-4793-978a-5903cd086252', u'author': u'ileep ', u'url': u'/reviews/9cbcd27c-a8ad-4793-978a-5903cd086252', u'upvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'UP', u'reviewId': u'9cbcd27c-a8ad-4793-978a-5903cd086252', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 145, u'type': u'VoteValue'}, u'tracking': None}, u'helpfulCount': 145, u'created': u'16 May, 2016', u'certifiedBuyer': True, u'title': u'Good looking phone with some drawbacks', u'type': u'ProductReviewValue'}, u'tracking': None}
{u'action': None, u'fixed': False, u'value': {u'rating': 5, u'text': u'Super Amoled Display..2 GB RAM with Latest Android Marshmallow OS only for 13K....its difficult to get Samsung Phone with 2 GB ram in such a low price Range...used for 15 days....Going Smooth....Awesome Earphone Quality.....selfie and back Camera Good.....Battery last for more than a day with Continous usage or will go for two days....Free Microsoft apps and Much More...', u'reportAbuse': {u'action': {u'originalUrl': None, u'params': {u'vote': u'ABUSE', u'reviewId': u'1546ed16-5945-4257-9f2d-0d86db7ed92e', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 0, u'type': u'VoteValue'}, u'tracking': None}, u'totalCount': 34, u'downvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'DOWN', u'reviewId': u'1546ed16-5945-4257-9f2d-0d86db7ed92e', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 9, u'type': u'VoteValue'}, u'tracking': None}, u'id': u'1546ed16-5945-4257-9f2d-0d86db7ed92e', u'author': u'Prashant Dias', u'url': u'/reviews/1546ed16-5945-4257-9f2d-0d86db7ed92e', u'upvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'UP', u'reviewId': u'1546ed16-5945-4257-9f2d-0d86db7ed92e', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 25, u'type': u'VoteValue'}, u'tracking': None}, u'helpfulCount': 25, u'created': u'7 Sep, 2016', u'certifiedBuyer': True, u'title': u'Brilliant Phone Compared to Money', u'type': u'ProductReviewValue'}, u'tracking': None}
{u'action': None, u'fixed': False, u'value': {u'rating': 5, u'text': u"Nice.battery backup it's good", u'reportAbuse': {u'action': {u'originalUrl': None, u'params': {u'vote': u'ABUSE', u'reviewId': u'a9f2f6a0-2272-4187-bd37-48eb8a0a85c9', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 0, u'type': u'VoteValue'}, u'tracking': None}, u'totalCount': 5, u'downvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'DOWN', u'reviewId': u'a9f2f6a0-2272-4187-bd37-48eb8a0a85c9', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 0, u'type': u'VoteValue'}, u'tracking': None}, u'id': u'a9f2f6a0-2272-4187-bd37-48eb8a0a85c9', u'author': u'Flipkart Customer', u'url': u'/reviews/a9f2f6a0-2272-4187-bd37-48eb8a0a85c9', u'upvote': {u'action': {u'originalUrl': None, u'params': {u'vote': u'UP', u'reviewId': u'a9f2f6a0-2272-4187-bd37-48eb8a0a85c9', u'reviewDomain': u'PRODUCT'}, u'loginType': u'LEGACY_LOGIN', u'url': None, u'fallback': None, u'type': u'REVIEW_VOTE', u'omnitureData': None, u'screenType': None, u'tracking': {}}, u'fixed': False, u'value': {u'count': 5, u'type': u'VoteValue'}, u'tracking': None}, u'helpfulCount': 5, u'created': u'17 Aug, 2016', u'certifiedBuyer': True, u'title': u"It's very good", u'type': u'ProductReviewValue'}, u'tracking': None}
{u'action': None, u'fixed': False, u'value': {u'rating': 5, u'text': u'This Phone is awesome..Must Buy', u'reportAbuse': {u'action': {u'originalUrl': No
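
Each dict in the output nests the fields of interest (rating, title, author, created, text) under its "value" key. A minimal sketch of flattening them into short records follows; summarize_reviews is a hypothetical helper, and the sample dict is a hand-made stand-in trimmed to keys visible in the output above (with the review text shortened), not a real response.

```python
def summarize_reviews(payload):
    """Flatten the nested review dicts into small, readable records."""
    rows = []
    for item in payload.get("RESPONSE", {}).get("data", []):
        value = item.get("value") or {}  # review fields live under "value"
        rows.append({
            "author": value.get("author"),
            "rating": value.get("rating"),
            "title": value.get("title"),
            "created": value.get("created"),
            "text": value.get("text"),
        })
    return rows


# Stand-in for the parsed JSON, using only keys seen in the printed output.
sample = {"RESPONSE": {"data": [
    {"action": None,
     "value": {"rating": 5,
               "author": "Prashant Dias",
               "title": "Brilliant Phone Compared to Money",
               "created": "7 Sep, 2016",
               "text": "Super Amoled Display.."}},
]}}

for row in summarize_reviews(sample):
    print(row["rating"], row["author"], "-", row["title"])
```

Using .get() throughout means a response with a missing or differently shaped key yields None fields or an empty list instead of raising KeyError, which is useful because the shape of this endpoint's response is not documented here.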
