A small Python scraping script

Author: admin  Category: python  Published: 2012-08-01 15:35

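# coding=utf8
# LINUXQQ for crawler data v0.1
# Written for Python 2 (urllib.urlopen, print statement).
import os
import re
import urllib

videourl = 'http://www.51zxw.net/'
rootdir = 'D:\\video\\'

def progress(blocknum, blocksize, totalsize):
    # Same signature as a urllib.urlretrieve reporthook:
    # prints the percentage downloaded so far, capped at 100.
    per = 100.0 * blocknum * blocksize / totalsize
    if per > 100:
        per = 100
    print "%.2f%%" % per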
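def contact(link, directory):
    # Links arrive HTML-escaped, so turn '&amp;' back into '&'.
    newlink = link.replace('&amp;', '&')
    newhtml = urllib.urlopen(str(videourl + newlink))
    # [gap in source: the newdata pattern, the rest of contact() and the
    #  opening of crawler() are missing; only a pattern tail survives]

def crawler(url):
    html = urllib.urlopen(url)
    # [start of this pattern is missing in the source]
    data = re.compile(']*?>(.*?)', re.S|re.U)
    req = re.findall(data, html.read())
    for i in req:
        contact(i[0], i[1])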

if __name__ == '__main__':
    i = 8
    p = 1
    while p <= i:
        url = 'http://www.51zxw.net/list.aspx?page=%d&cid=359' % (p)
        p += 1
        crawler(url)
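
For anyone running this today, here is a minimal Python 3 sketch of the three steps the script relies on: building the list-page URLs, pulling links out of a page, and working out download progress. It does no network access, so it can be tried as-is. The anchor pattern `LINK_RE` and the helper names `page_urls`, `extract_links`, and `percent_done` are assumptions made for illustration, not the original code; the site's real markup may well need a different pattern.

```python
import html
import re
from urllib.parse import urljoin

BASE_URL = 'http://www.51zxw.net/'

# Assumed pattern: captures an anchor's href and its link text.
# The pattern used by the original script is not known.
LINK_RE = re.compile(r'<a\s[^>]*?href="([^"]+)"[^>]*?>(.*?)</a>', re.S)


def page_urls(first=1, last=8, cid=359):
    """Build the list-page URLs that the main loop walks through."""
    return ['%slist.aspx?page=%d&cid=%d' % (BASE_URL, p, cid)
            for p in range(first, last + 1)]


def extract_links(page_html):
    """Return (absolute_url, title) pairs for every anchor in page_html."""
    pairs = []
    for href, text in LINK_RE.findall(page_html):
        # Links inside HTML are entity-escaped: '&amp;' stands for '&'.
        url = urljoin(BASE_URL, html.unescape(href))
        pairs.append((url, html.unescape(text).strip()))
    return pairs


def percent_done(blocknum, blocksize, totalsize):
    """Percent downloaded, capped at 100; None when the size is unknown."""
    if totalsize <= 0:
        return None
    return min(100.0, 100.0 * blocknum * blocksize / totalsize)
```

`percent_done` takes the same three arguments that `urllib.request.urlretrieve` passes to its `reporthook`, so a small wrapper that prints its result could serve as the progress callback. Returning `None` for an unknown size avoids a division by zero when the server sends no content length.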

This article comes from 小Q. When reposting, please credit the source and include the link.

Permalink: http://www.linuxqq.com/archives/871.html
