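This article explains in detail how to use the select() method of BeautifulSoup when writing Python crawlers. The details are as follows, starting with the sample document used in every example: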

    from bs4 import BeautifulSoup

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="title" name="dromouse"><b>The Dormouse's story</b></p>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
    <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
    <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
    and they lived at the bottom of a well.</p>
    <p class="story">...</p>
    """

    # The parser is an assumption; "lxml" also works if it is installed.
    soup = BeautifulSoup(html, "html.parser")

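When writing CSS, a tag name is written with no prefix, a class name is prefixed with a dot, and an id is prefixed with #. Elements can be filtered here in the same way, using soup.select(), which returns a list.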
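Because select() returns a list, the result can be measured, indexed and iterated like any other list. The following is a minimal sketch, assuming a shortened copy of the sample document and the built-in "html.parser" parser (both choices are assumptions made for illustration):

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document; "html.parser" is an assumed parser choice.
html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# select() returns a list of matching tags, in document order.
links = soup.select("a")
print(isinstance(links, list))  # True
print(len(links))               # 3

# Each item is a Tag, so attributes are read with dictionary-style access.
link_ids = [a["id"] for a in links]
link_hrefs = [a["href"] for a in links]
print(link_ids)    # ['link1', 'link2', 'link3']
print(link_hrefs)

# A selector that matches nothing yields an empty list, not None.
no_match = soup.select("table")
print(no_match)    # []
```

Since an unmatched selector gives an empty list, it is safer to check the length before indexing with [0].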

(1) Finding by tag name

    print(soup.select('title'))
    # [<title>The Dormouse's story</title>]

    print(soup.select('a'))
    # [<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]

    print(soup.select('b'))
    # [<b>The Dormouse's story</b>]

(2) Finding by class name

    print(soup.select('.sister'))
    # [<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
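A class selector matches any element whose class attribute contains that class, even when the element carries several classes. The sketch below uses a small hypothetical document (not the sample above) to show this, and to show chaining two class names to require both:

```python
from bs4 import BeautifulSoup

# Hypothetical markup invented for this illustration.
html = """
<p class="story intro">First paragraph</p>
<p class="story">Second paragraph</p>
<p class="title">Heading paragraph</p>
"""
soup = BeautifulSoup(html, "html.parser")

# ".story" matches every element that has "story" among its classes.
story = soup.select(".story")
story_texts = [p.get_text() for p in story]
print(story_texts)  # ['First paragraph', 'Second paragraph']

# Chained class names (no space) require all of them on the same element.
both = soup.select(".story.intro")
both_texts = [p.get_text() for p in both]
print(both_texts)   # ['First paragraph']

# BeautifulSoup stores the class attribute as a list of class names.
first_classes = story[0]["class"]
print(first_classes)  # ['story', 'intro']
```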

(3) Finding by id

    print(soup.select('#link1'))
    # [<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]

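(4) Combined lookup

Combined lookup follows the same rules as combining tag names with class names and ids in a CSS file. For example, to find the element with id link1 inside a p tag, separate the two selectors with a space: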

    print(soup.select('p #link1'))
    # [<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]
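The space matters: with a space, the second selector is searched for inside the first element; without a space, both parts must describe the same element. A minimal sketch of the difference, again assuming a shortened copy of the sample document and "html.parser":

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document; "html.parser" is an assumed parser choice.
html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title" name="dromouse"><b>The Dormouse's story</b></p>
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# With a space: an element with id "link1" somewhere inside a <p>.
inside_p = [tag["id"] for tag in soup.select("p #link1")]
print(inside_p)     # ['link1']

# Without a space: an <a> that itself has id "link2".
same_node = [tag["id"] for tag in soup.select("a#link2")]
print(same_node)    # ['link2']

# Both styles together: <a class="sister"> inside <p class="story">.
story_links = [tag["id"] for tag in soup.select("p.story a.sister")]
print(story_links)  # ['link1', 'link2', 'link3']

# No <p> has id "link1" itself, so this compound selector matches nothing.
p_with_id = soup.select("p#link1")
print(p_with_id)    # []
```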

Finding direct child tags

    print(soup.select("head > title"))
    # [<title>The Dormouse's story</title>]
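The > combinator only matches direct children, whereas a space matches descendants at any depth. The following sketch contrasts the two on a shortened copy of the sample document (the "html.parser" choice is an assumption):

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document; "html.parser" is an assumed parser choice.
html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# The <a> tags are direct children of <p>, so "p > a" finds all three.
child_of_p = [a["id"] for a in soup.select("p > a")]
print(child_of_p)          # ['link1', 'link2', 'link3']

# They are descendants of <body>, but not direct children of it.
child_of_body = soup.select("body > a")
descendant_of_body = [a["id"] for a in soup.select("body a")]
print(child_of_body)       # []
print(descendant_of_body)  # ['link1', 'link2', 'link3']

# <title> sits inside <head>, so it is not a direct child of <html>.
title_child = soup.select("html > title")
title_descendant = [t.get_text() for t in soup.select("html title")]
print(title_child)         # []
print(title_descendant)    # ["The Dormouse's story"]
```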

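(5) Finding by attribute

A selector can also include attribute conditions, which must be enclosed in square brackets. Note that the attribute and the tag belong to the same node, so there must be no space between them; otherwise nothing will match.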

    print(soup.select('a[href="http://example.com/elsie"]'))
    # [<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]
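Besides exact matches, the standard CSS attribute operators can be used, and the no-space rule described above is easy to check directly. This is a sketch under the same assumptions as before (shortened sample document, "html.parser"):

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document; "html.parser" is an assumed parser choice.
html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title" name="dromouse"><b>The Dormouse's story</b></p>
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")


def ids(selector):
    """Return the id of every element matched by the selector."""
    return [tag["id"] for tag in soup.select(selector)]


has_id = ids("a[id]")                                # attribute is present
starts = ids('a[href^="http://example.com/"]')       # value starts with
ends = ids('a[href$="tillie"]')                      # value ends with
contains = ids('a[href*="lacie"]')                   # value contains
print(has_id)    # ['link1', 'link2', 'link3']
print(starts)    # ['link1', 'link2', 'link3']
print(ends)      # ['link3']
print(contains)  # ['link2']

# Attributes other than href work the same way.
named = [p["name"] for p in soup.select('p[name="dromouse"]')]
print(named)     # ['dromouse']

# With a space, the selector looks for a matching element INSIDE an <a>,
# which does not exist here, so the result is empty.
with_space = soup.select('a [id="link1"]')
print(with_space)  # []
```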

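Likewise, attribute conditions can be combined with the lookup methods above: parts that refer to different nodes are separated by a space, and parts that refer to the same node are written without a space.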

    print(soup.select('p a[href="http://example.com/elsie"]'))
    # [<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]
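The same rule extends to longer selectors that mix tag names, classes, child combinators and attribute conditions. A minimal sketch, assuming a shortened copy of the sample document and "html.parser":

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample document; "html.parser" is an assumed parser choice.
html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title" name="dromouse"><b>The Dormouse's story</b></p>
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# Attribute condition on the <p>, then a space, then the descendant <b>.
title_bold = [b.get_text() for b in soup.select('p[name="dromouse"] b')]
print(title_bold)  # ["The Dormouse's story"]

# Tag, class and attribute all describe the same <a>, so no spaces between them.
lacie = [a.get_text() for a in soup.select('p.story > a.sister[href$="lacie"]')]
print(lacie)       # ['Lacie']

# A stray space before the bracket moves the condition to a node inside <a>.
stray_space = soup.select('p a [href$="lacie"]')
print(stray_space)  # []
```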