Welcome to the Soupy documentation

Soupy is a wrapper around BeautifulSoup that makes it easier to search through HTML and XML documents.

    from soupy import Soupy, Q

    html = """
    <div id="main">
      <div>The web is messy</div>
      and full of traps
      <div>but Soupy loves you</div>
    </div>"""

    print(Soupy(html).find(id='main').children
          .each(Q.text.strip())  # extract text from each node, trim whitespace
          .filter(len)           # remove empty strings
          .val())                # dump out of Soupy

[u'The web is messy', u'and full of traps', u'but Soupy loves you']

Compare to the same task in BeautifulSoup:

    from bs4 import BeautifulSoup, NavigableString

    html = """
    <div id="main">
      <div>The web is messy</div>
      and full of traps
      <div>but Soupy loves you</div>
    </div>"""

    result = []
    for node in BeautifulSoup(html).find(id='main').children:
        if isinstance(node, NavigableString):
            text = node.strip()
        else:
            text = node.text.strip()
        if len(text):
            result.append(text)

    print(result)

[u'The web is messy', u'and full of traps', u'but Soupy loves you']

Soupy uses BeautifulSoup under the hood and provides a very similar API, while smoothing over some of the warts in BeautifulSoup. Soupy also adds a functional interface for chaining together operations, gracefully dealing with failed searches, and extracting data into simpler formats.
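
The idea of chaining operations while tolerating a failed search can be sketched without any parsing library. The snippet below is a minimal illustration only: the `Wrapped` class, its method names, and the nested dictionary standing in for a parsed document are all hypothetical and are not Soupy's actual classes or implementation.

```python
class Wrapped:
    """Hypothetical chainable wrapper; an illustration, not Soupy's API."""

    def __init__(self, value):
        self._value = value

    def get(self, key):
        # A lookup that may fail: anything other than a dict containing
        # the key produces an empty wrapper instead of raising.
        if isinstance(self._value, dict):
            return Wrapped(self._value.get(key))
        return Wrapped(None)

    def map(self, func):
        # Apply func only when a value is present; otherwise stay empty.
        if self._value is None:
            return self
        return Wrapped(func(self._value))

    def orelse(self, default):
        # Substitute a fallback when an earlier step came up empty.
        if self._value is None:
            return Wrapped(default)
        return self

    def val(self):
        # Leave the wrapper and return the plain Python value.
        return self._value


# A nested dict stands in for a parsed document (assumption for the sketch).
doc = {"main": {"title": "  The web is messy  "}}

found = (Wrapped(doc).get("main").get("title")
         .map(str.strip).orelse("not found").val())
missing = (Wrapped(doc).get("sidebar").get("title")
           .map(str.strip).orelse("not found").val())

print(found)
print(missing)
```

In this sketch the second chain looks up a key that does not exist, yet every later step still runs safely and the fallback is used at the end, so no intermediate `None` checks are needed.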