Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
860 views
in Technique[技术] by (71.8m points)

parsing - Find div with class using PHP Simple HTML DOM Parser

I am just starting with the mentioned Parser and somehow running on problems directly with the beginning.

Referring to this tutorial:

http://net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/

I want now simply find in a sourcecode tne content of a div with a class ClearBoth Box

I retrieve the code with curl and create a simple html dom object:

$cl = curl_exec($curl);  
$html = new simple_html_dom();
$html->load($cl);

Then I wanted to add the content of the div into an array called divs:

$divs = $html->find('div[.ClearBoth Box]');

But now, when I print_r the $divs, it gives much more, despite the fact that the sourcecode has not more inside the div.

Like this:

Array
(
    [0] => simple_html_dom_node Object
        (
            [nodetype] => 1
            [tag] => br
            [attr] => Array
                (
                    [class] => ClearBoth
                )

            [children] => Array
                (
                )

            [nodes] => Array
                (
                )

            [parent] => simple_html_dom_node Object
                (
                    [nodetype] => 1
                    [tag] => div
                    [attr] => Array
                        (
                            [class] => SocialMedia
                        )

                    [children] => Array
                        (
                            [0] => simple_html_dom_node Object
                                (
                                    [nodetype] => 1
                                    [tag] => iframe
                                    [attr] => Array
                                        (
                                            [id] => ShowFacebookButtons
                                            [class] => SocialWeb FloatLeft
                                            [src] => http://www.facebook.com/plugins/xxx
                                            [style] => border:none; overflow:hidden; width: 250px; height: 70px;
                                        )

                                    [children] => Array
                                        (
                                        )

                                    [nodes] => Array
                                        (
                                        )

I do not understand why the $divs has not simply the code from the div?

Here is an example of the source code at the site:

<div class="ClearBoth Box">
          <div>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualit?t: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualit?t: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualit?t: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualit?t: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualit?t: Sehr empfehlenswert"></i>

              <strong class="AlignMiddle LeftSmallPadding">gute peppige Qualit?t</strong> <span class="AlignMiddle">(17.03.2013)</span>
          </div>
          <div class="BottomMargin">
            gute Verarbeitung, sch?nes Design,
          </div>
        </div>

What am I doing wrong?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The right code to get a div with class is:

$ret = $html->find('div.foo');
//OR
$ret = $html->find('div[class=foo]');

Basically you can get elements as you were using a CSS selector.

source: http://simplehtmldom.sourceforge.net/manual.htm
How to find HTML elements? section, tab Advanced


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...