HTTPX: An Asynchronous HTTP Client for Python

2024-06-27 码农

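HTTPX is an HTTP client library for Python that offers both a synchronous and an asynchronous API. It is built on httpcore, and its async client runs on asyncio (Trio is also supported). Because requests can be awaited instead of blocking a thread, an application can keep many requests in flight at once, which improves throughput when it has to handle a large number of concurrent requests.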

「Installation」

pip install httpx

「Usage」

「1. Sending a GET request」:

import asyncio

import httpx


async def fetch_data():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://api.example.com/data')
        print(response.status_code)  # print the status code
        print(response.json())  # print the JSON response body


# Run the async function
asyncio.run(fetch_data())

「2. Sending a POST request」:

import asyncio

import httpx


async def post_data():
    async with httpx.AsyncClient() as client:
        response = await client.post('https://api.example.com/submit', json={'key': 'value'})
        print(response.status_code)
        print(response.text)  # print the response body


asyncio.run(post_data())

「3. Sending a request with custom headers」:

import asyncio

import httpx


async def request_with_headers():
    headers = {
        'User-Agent': 'MyApp/1.0',
        'Accept': 'application/json',
    }
    # Headers set on the client are sent with every request it makes.
    async with httpx.AsyncClient(headers=headers) as client:
        response = await client.get('https://api.example.com/data')
        print(response.json())


asyncio.run(request_with_headers())

「4. Using query parameters」:

import asyncio

import httpx


async def request_with_params():
    params = {
        'query': 'search term',
        'page': 2,
    }
    # params are URL-encoded and appended to the URL as the query string.
    async with httpx.AsyncClient() as client:
        response = await client.get('https://api.example.com/search', params=params)
        print(response.json())


asyncio.run(request_with_params())

「5. Setting a timeout」:

import asyncio

import httpx


async def request_with_timeout():
    timeout = 5  # seconds; applied to the connect, read, write, and pool phases
    async with httpx.AsyncClient(timeout=timeout) as client:
        response = await client.get('https://api.example.com/data')
        print(response.status_code)


asyncio.run(request_with_timeout())

「6. Using a proxy」:

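import asyncio

import httpx


async def request_with_proxy():
    # Route each URL scheme through its own proxy (HTTPX 0.26 and later).
    # The older `proxies=` argument was deprecated in 0.26 and removed in 0.28.
    mounts = {
        'http://': httpx.AsyncHTTPTransport(proxy='http://10.10.1.10:3128'),
        'https://': httpx.AsyncHTTPTransport(proxy='https://10.10.1.11:1080'),
    }
    async with httpx.AsyncClient(mounts=mounts) as client:
        response = await client.get('https://api.example.com/data')
        print(response.status_code)


asyncio.run(request_with_proxy())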

「7. Using cookies」:

import asyncio

import httpx


async def request_with_cookies():
    cookies = {'session_token': '123456789'}
    # Cookies set on the client are sent with every request it makes.
    async with httpx.AsyncClient(cookies=cookies) as client:
        response = await client.get('https://api.example.com/data')
        print(response.status_code)


asyncio.run(request_with_cookies())

「Example: fetching Baidu search results」

import asyncio

import httpx
from bs4 import BeautifulSoup


async def fetch_baidu_search_results(keyword):
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"}
    async with httpx.AsyncClient(headers=headers) as client:
        # Pass the keyword via params so that it is substituted and URL-encoded.
        response = await client.get("https://www.baidu.com/s", params={"wd": keyword})

    search_results = []
    if response.status_code == 200:
        soup = BeautifulSoup(response.text, 'html.parser')
        # Each result title is an <a> inside an <h3>.
        for h3 in soup.find_all('h3'):
            a_tag = h3.find('a')
            if a_tag:
                title = a_tag.get_text(strip=True)
                link = a_tag.get('href')
                search_results.append({'title': title, 'link': link})
    return search_results  # empty list if the request did not return 200


keyword = "HTTPX"  # search keyword
results = asyncio.run(fetch_baidu_search_results(keyword))
for result in results:
    print(result)

「HTTPX vs Requests」

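Feature	HTTPX	Requests
Async support	Yes (`httpx.AsyncClient`)	No
Sync support	Yes (`httpx.Client`)	Yes
HTTP/2	Supported (optional; install `httpx[http2]` and pass `http2=True`)	Not supported
Connection pooling	Built in when a `Client` or `AsyncClient` is reused	Built in when a `requests.Session` is reused
Streaming uploads	Supported	Supported
Streaming downloads	Supported	Supported
Timeout control	Separate connect, read, write, and pool timeouts; 5-second default	Connect and read timeouts; no default timeout
Retries	Connection retries only, via `httpx.HTTPTransport(retries=...)`	Via `HTTPAdapter(max_retries=...)` with `urllib3`'s `Retry`
Proxy support	Built in	Built in
Cookie management	Built in	Built in
JSON support	Built in	Built in
Form data	Built in	Built in
SSL/TLS support	Built in	Built in
Testing support	`httpx.MockTransport`, plus WSGI/ASGI transports for calling an app directly	None built in
Error handling	Raises exceptions	Raises exceptions
Community and popularity	Newer, but growing quickly	Very popular and widely used
Maintenance status	Active	Active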