Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to create a dictionary with bs4 and requests?

I need to create a dictionary in Python that maps each 'data-value' to its size (for example, 'US: 6Y').

Page source code looks like this:

<div data-header-element=".b-userMenu" data-sticky-header="true" id="js-size-pick">



<p class="m-productDescr_sizeItem">
<a class="m-productDescr_sizeBtn js-sizeItem js-tooltipHtml js-tooltip_rm" data-carturl="/cart/add?id=545896443" data-tip="    &lt;span&gt;   US: 3,5Y  &lt;/span&gt;
    &lt;span&gt;   EU: 35,5  &lt;/span&gt;
    &lt;span&gt;   CM: 22,5  &lt;/span&gt;
" data-value="545896443">
                                35,5
                            </a>
<span class="js-tooltipContent g-dn">
<span>   US: 3,5Y  </span>
<span>   EU: 35,5  </span>
<span>   CM: 22,5  </span>
</span>
</p>


<p class="m-productDescr_sizeItem">
<a class="m-productDescr_sizeBtn js-sizeItem js-tooltipHtml js-tooltip_rm" data-carturl="/cart/add?id=545895979" data-tip="    &lt;span&gt;   US: 4Y  &lt;/span&gt;
    &lt;span&gt;   EU: 36  &lt;/span&gt;
    &lt;span&gt;   CM: 23  &lt;/span&gt;
" data-value="545895979">
                                36
                            </a>
<span class="js-tooltipContent g-dn">
<span>   US: 4Y  </span>
<span>   EU: 36  </span>
<span>   CM: 23  </span>
</span>
</p>

Do you have any idea how to solve this? I tried looping over the elements with class "m-productDescr_sizeItem".



1 Answer


You'll have to iterate through the span tags. Keep in mind that a dictionary can't have duplicate keys, so a single dictionary with 'size' and 'id' keys could hold only one entry. I built a list of dictionaries instead, one per size:

html = '''<div data-header-element=".b-userMenu" data-sticky-header="true" id="js-size-pick">



<p class="m-productDescr_sizeItem">
<a class="m-productDescr_sizeBtn js-sizeItem js-tooltipHtml js-tooltip_rm" data-carturl="/cart/add?id=545896443" data-tip="    &lt;span&gt;   US: 3,5Y  &lt;/span&gt;
    &lt;span&gt;   EU: 35,5  &lt;/span&gt;
    &lt;span&gt;   CM: 22,5  &lt;/span&gt;
" data-value="545896443">
                                35,5
                            </a>
<span class="js-tooltipContent g-dn">
<span>   US: 3,5Y  </span>
<span>   EU: 35,5  </span>
<span>   CM: 22,5  </span>
</span>
</p>


<p class="m-productDescr_sizeItem">
<a class="m-productDescr_sizeBtn js-sizeItem js-tooltipHtml js-tooltip_rm" data-carturl="/cart/add?id=545895979" data-tip="    &lt;span&gt;   US: 4Y  &lt;/span&gt;
    &lt;span&gt;   EU: 36  &lt;/span&gt;
    &lt;span&gt;   CM: 23  &lt;/span&gt;
" data-value="545895979">
                                36
                            </a>
<span class="js-tooltipContent g-dn">
<span>   US: 4Y  </span>
<span>   EU: 36  </span>
<span>   CM: 23  </span>
</span>
</p>'''

Given that html:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
# Each hidden tooltip container holds the US/EU/CM spans for one product size.
spans = soup.find_all('span', {'class': 'js-tooltipContent g-dn'})

dict_list = []
for span in spans:
    # The <a> in the same <p> carries the data-value id for this size.
    id_temp = span.parent.find('a')['data-value']
    for each in span.find_all('span'):
        # Text looks like "US: 3,5Y"; keep only the part after the colon.
        v = each.text.strip().split(':', 1)[1].strip()
        dict_list.append({'size': v, 'id': id_temp})

Output:

print(dict_list)
[{'size': '3,5Y', 'id': '545896443'}, {'size': '35,5', 'id': '545896443'}, {'size': '22,5', 'id': '545896443'}, {'size': '4Y', 'id': '545895979'}, {'size': '36', 'id': '545895979'}, {'size': '23', 'id': '545895979'}]
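
If a single dictionary keyed by 'data-value' is closer to what the question asks for, one option is to nest the sizes under each id, so that no key is repeated. The following is only a sketch under some assumptions: the sample markup is a trimmed copy of the HTML above, and the helper name sizes_by_id and the variable us_only are made up for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the page markup from the question (two sizes only).
html = '''
<div id="js-size-pick">
<p class="m-productDescr_sizeItem">
<a class="m-productDescr_sizeBtn js-sizeItem" data-value="545896443">35,5</a>
<span class="js-tooltipContent g-dn">
<span>   US: 3,5Y  </span>
<span>   EU: 35,5  </span>
<span>   CM: 22,5  </span>
</span>
</p>
<p class="m-productDescr_sizeItem">
<a class="m-productDescr_sizeBtn js-sizeItem" data-value="545895979">36</a>
<span class="js-tooltipContent g-dn">
<span>   US: 4Y  </span>
<span>   EU: 36  </span>
<span>   CM: 23  </span>
</span>
</p>
</div>
'''


def sizes_by_id(markup):
    """Hypothetical helper: map each data-value to a {system: size} dict."""
    soup = BeautifulSoup(markup, 'html.parser')
    result = {}
    for item in soup.select('p.m-productDescr_sizeItem'):
        link = item.select_one('a[data-value]')
        tooltip = item.select_one('span.js-tooltipContent')
        if link is None or tooltip is None:
            continue  # skip items that lack an id or a tooltip
        sizes = {}
        for span in tooltip.find_all('span'):
            # "US: 3,5Y" becomes system "US" and value "3,5Y"
            system, _, value = span.get_text(strip=True).partition(':')
            sizes[system.strip()] = value.strip()
        result[link['data-value']] = sizes
    return result


sizes = sizes_by_id(html)
# Flat variant in the shape the question describes: data-value -> 'US: <size>'
us_only = {key: 'US: ' + s['US'] for key, s in sizes.items() if 'US' in s}
print(us_only)
```

Keying on 'data-value' works here because each id appears once per size item; if the real page repeated an id, later entries would overwrite earlier ones.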
