Scraping Meizitu (妹子图) with Python 3: a crawler that downloads web images

IT黑名单 2016-12-2 14:04:11

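Prerequisites:

Python 3.4.3
BeautifulSoup 4 (the bs4 package, which the script imports)
An image website to scrape (this example uses http://www.mmjpg.com/)

The complete script:
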
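import gzip
import re
import time
from urllib.parse import urljoin
from urllib.request import urlopen, urlretrieve

from bs4 import BeautifulSoup


def unzip(data, charset='utf8'):
    # Decompress a gzip-encoded response body and decode it to text.
    return gzip.decompress(data).decode(charset)


def getHtml(url, charset='utf8'):
    # Fetch a page and return its HTML as a string.
    print(url)
    resp = urlopen(url)
    encoding = resp.info().get('Content-Encoding')
    data = resp.read()
    if encoding == 'gzip':
        return unzip(data, charset)
    return data.decode(charset)


def getImg(url):
    # Download every <img> on the page whose src contains ".jpg".
    html = getHtml(url)
    soup = BeautifulSoup(html, 'html.parser')
    # Raw string so that "\." is a regex escape, not a string escape.
    imgs = soup.find_all(name="img", attrs={"src": re.compile(r"\.jpg")})
    for img in imgs:
        imgUrl = urljoin(url, img['src'])  # resolve relative src values
        print(imgUrl)
        # Save into the current directory, named after the current timestamp.
        urlretrieve(imgUrl, '%s.jpg' % time.time())


def main():
    getImg('http://www.mmjpg.com/')


if __name__ == '__main__':
    main()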
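
The script above names every download after time.time(), so the saved files give no hint of which image they came from. One possible alternative, shown here only as a sketch, is to reuse the file name found in the image URL and add a numeric suffix when that name is already taken. The helpers filename_from_url and unique_path, and the fallback name image.jpg, are hypothetical names introduced for this example; they are not part of the original script.

```python
import os
from urllib.parse import unquote, urlsplit


def filename_from_url(img_url, default='image.jpg'):
    """Return the last path component of img_url, or default if it is empty."""
    path = urlsplit(img_url).path            # drops the query string and fragment
    name = os.path.basename(unquote(path))   # '/a/b/1.jpg' -> '1.jpg'
    return name or default


def unique_path(directory, name):
    """Return a path in directory that does not exist yet.

    If name is taken, try name-1, name-2, ... (suffix goes before the extension).
    """
    stem, ext = os.path.splitext(name)
    candidate = os.path.join(directory, name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, '%s-%d%s' % (stem, counter, ext))
        counter += 1
    return candidate


# Hypothetical use inside the download loop, replacing the timestamp name:
#     urlretrieve(imgUrl, unique_path('.', filename_from_url(imgUrl)))
```

Keeping the name from the URL makes it easier to trace a file back to its page, at the cost of one extra existence check per download.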

Here is a screenshot of the result: (and now I'm running away...)



When reposting, please credit the source: IT黑名单

Permalink: http://blog.itblacklist.cn/20161202/8432.html
