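Requests is an HTTP library written in Python on top of urllib3 and released under the Apache 2.0 license. This article summarizes how several ways of sending a POST request with requests, as commonly needed when writing crawlers, differ on the wire.
requests version: 2.24.0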

0x01 POST with form parameters

import requests
url = "http://192.168.1.122"
payload = {'key1': 'value#1', 'key2': 'value 2'}
# Alternatively, pass data a list of tuples, payload = [('key1', 'value#1'), ('key2', 'value 2')]; both send the same request
r = requests.post(url, data=payload)

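The request sent is shown below. The Content-Type header is application/x-www-form-urlencoded, the default encoding for form POSTs, and the body is URL-encoded: spaces in names and values are replaced with plus signs; reserved and non-ASCII characters are percent-encoded (here # becomes %23); each name is joined to its value with '=', and the pairs are joined with '&'.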

POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 27
Content-Type: application/x-www-form-urlencoded

key1=value%231&key2=value+2
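
The encoding rules above can be reproduced with the standard library alone. This is a minimal sketch, assuming that `urllib.parse.urlencode` with its default settings yields the same string as the captured body for this simple payload; it does not send anything over the network.

```python
from urllib.parse import urlencode

payload = {'key1': 'value#1', 'key2': 'value 2'}
pairs = [('key1', 'value#1'), ('key2', 'value 2')]

# '#' is percent-encoded as %23 and the space becomes '+'
body = urlencode(payload)
print(body)                      # key1=value%231&key2=value+2
print(len(body))                 # 27, the Content-Length in the capture

# A dict and a list of tuples with the same pairs encode identically
print(urlencode(pairs) == body)  # True
```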

0x02 POST with JSON parameters

import requests
url = "http://192.168.1.122"
payload = {'key1': 'value#1', 'key2': 'value 2'}
# The json parameter serializes the dict itself, so the json module is not needed
r = requests.post(url, json=payload)

In this case the Content-Type header is application/json and the request body is sent as JSON:

POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 38
Content-Type: application/json

{"key1": "value#1", "key2": "value 2"}
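
The body in this capture can be checked offline. A minimal sketch, assuming requests 2.24.0 serializes the dict the same way `json.dumps` does with its default separators, which is consistent with the captured bytes:

```python
import json

payload = {'key1': 'value#1', 'key2': 'value 2'}

# Default separators are ', ' and ': ', so the output keeps the spaces seen in the capture
body = json.dumps(payload)
print(body)                       # {"key1": "value#1", "key2": "value 2"}
print(len(body.encode('utf-8')))  # 38, the Content-Length in the capture
```

Unlike the form case, '#' and the space are sent literally, because JSON needs no URL encoding.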

0x03 POST a string

import requests
url = "http://192.168.1.122"
xml = "my xml\n"
xml2 = """{"key":"value"}"""
xml3 = xml + xml2
# headers can be customized as needed and passed with headers=headers, e.g. headers = {'Content-Type': 'application/html'}
r = requests.post(url, data=xml3)

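Packet captures often show POST bodies that are fairly complex. In such cases we can build the correctly formatted data as a string, upload it as-is, and set the headers ourselves as needed. Note that no Content-Type header is added for a string body. The request looks like this: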

POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 22

my xml
{"key":"value"}
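
The Content-Length of 22 in this capture can be verified by hand, since a string body is sent unchanged. A small sketch that rebuilds the same string and counts its bytes; the non-ASCII example at the end is an addition for illustration, not part of the capture:

```python
# Rebuild the body exactly as in the example above
xml = "my xml\n"
xml2 = """{"key":"value"}"""
body = xml + xml2

# Content-Length counts bytes: 7 for "my xml\n" plus 15 for the JSON text
content_length = len(body.encode("utf-8"))
print(content_length)  # 22

# For non-ASCII text the byte count exceeds the character count
sample = "值"
print(len(sample), len(sample.encode("utf-8")))  # 1 3
```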

0x04 POST a file

import requests
url = "http://192.168.1.122"
# Open in binary mode; the with block closes the file once the request is done
with open('test.txt', 'rb') as f:
    files = {'file': f}
    r = requests.post(url, files=files)

Passing the files parameter uploads the file as a multipart/form-data form.

POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 157
Content-Type: multipart/form-data; boundary=2e6d3d07e843ebdfed4289659ba3a23a

--2e6d3d07e843ebdfed4289659ba3a23a
Content-Disposition: form-data; name="file"; filename="test.txt"

hello world!

--2e6d3d07e843ebdfed4289659ba3a23a--
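
To make the multipart layout concrete, the body above can be assembled by hand. This is a sketch of the wire format only, not of how requests builds it internally. `build_multipart` is a hypothetical helper, the boundary is copied from the capture (requests generates a random one per request), and the content of test.txt is assumed to be "hello world!" followed by a newline, which is what the captured length implies.

```python
def build_multipart(field, filename, content, boundary):
    """Assemble a single-file multipart/form-data body (hypothetical helper)."""
    delimiter = b"--" + boundary.encode("ascii")
    disposition = 'Content-Disposition: form-data; name="%s"; filename="%s"' % (field, filename)
    parts = [
        delimiter,                    # opening boundary
        disposition.encode("utf-8"),  # part header
        b"",                          # blank line ends the part headers
        content,                      # raw file bytes
        delimiter + b"--",            # closing boundary
        b"",                          # trailing CRLF
    ]
    return b"\r\n".join(parts)

boundary = "2e6d3d07e843ebdfed4289659ba3a23a"  # taken from the capture
file_content = b"hello world!\n"               # assumed content of test.txt

body = build_multipart("file", "test.txt", file_content, boundary)
print(len(body))  # 157, the Content-Length in the capture
```

The lines are separated by CRLF, and the closing boundary carries two extra hyphens, which is how the receiver knows there are no more parts.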
Last modification: January 20th, 2021 at 01:43 am