Python,學霸
Reading guide
Installation
Introduction
Example
Output
Installation
pip install mechanicalsoup
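Introduction
Hello everyone! Today's post is a simple example that uses mechanicalsoup to scrape Gitee search results. The number of result pages to fetch is configurable.
Example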
import mechanicalsoup


def fetch_repo_info(keyword, pages=2):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
    }
    base_url = "https://search.gitee.com/?skin=rec&type=repository&q={}&pageno={}"
    all_results = []  # Collects the results from every page

    for page in range(1, pages + 1):
        url = base_url.format(keyword, page)
        browser = mechanicalsoup.StatefulBrowser()
        browser.session.headers.update(headers)
        browser.open(url)  # Visit the search URL for this page
        page_content = browser.get_current_page()  # Parsed HTML of the current page
        items = page_content.find_all(class_="item")

        for item in items:
            link_element = item.find("a", href=True)
            if link_element is None:
                continue  # Skip items without a link, since they have no title either
            title = link_element.text.strip()
            link = link_element["href"]
            desc_element = item.find(class_="desc")
            description = desc_element.text.strip() if desc_element else "No description found"
            all_results.append({
                "title": title,
                "link": link,
                "description": description
            })

    return all_results


# Run a search
print(fetch_repo_info("PYTHON", 3))
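The script formats the keyword straight into the query string. That is fine for "PYTHON", but a keyword containing spaces or non-ASCII characters would probably need percent-encoding first. Below is a minimal sketch of building the same URL with the standard library's urllib.parse.urlencode. The helper name build_search_url and the constant SEARCH_ENDPOINT are made up for illustration and are not part of mechanicalsoup or the original script. How Gitee treats the encoded keyword has not been verified here.

```python
from urllib.parse import urlencode

# Same endpoint and query parameters as base_url in the script above.
SEARCH_ENDPOINT = "https://search.gitee.com/"


def build_search_url(keyword, page):
    """Return the search URL for one results page, with the keyword percent-encoded."""
    params = {
        "skin": "rec",
        "type": "repository",
        "q": keyword,
        "pageno": page,
    }
    # urlencode escapes each value and joins the pairs with "&".
    return SEARCH_ENDPOINT + "?" + urlencode(params)


print(build_search_url("PYTHON", 1))
print(build_search_url("machine learning", 2))
```

Inside fetch_repo_info, the line that calls base_url.format(keyword, page) could be swapped for build_search_url(keyword, page) without touching the rest of the loop.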
Output
[{'title': 'OpenHarmony', 'link': 'https://gitee.com/openharmony?_from=gitee_search', 'description': 'OpenHarmony 是開放原子開源基金會(OpenAtom Foundation)旗下開源計畫,定位是一款面向全場景的開源分散式作業系統。'},
 {'title': '小柒2012/從零學Python', 'link': 'https://gitee.com/52itstyle/Python?_from=gitee_search', 'description': '從零學Python,各種開發案例,非週期性更新。'},
 {'title': '程式語言演算法集/Python', 'link': 'https://gitee.com/TheAlgorithms/Python?_from=gitee_search', 'description': 'Python 演算法集'},
 {'title': '程式設計師晚楓/python-office', 'link': 'https://gitee.com/CoderWanFeng/python-office?_from=gitee_search', 'description': 'Python自動化辦公的第三方庫:pip install python-office'},
 {'title': 'YuHong-LDU/Python-AI', 'link': 'https://gitee.com/yuhong-ldu/python-ai?_from=gitee_search', 'description': 'Python與人工智慧實踐 (魯東大學信電學院人工智慧教研室)'},
 {'title': '碼多多AI/likeadmin(Python版)', 'link': 'https://gitee.com/likeadmin/likeadmin_python?_from=gitee_search', 'description': '🚀🚀🚀likeadmin是一套快速開發管理後台,使用流行的技術棧Python3、FastAPI、TypeScript、Vue3、vite2、Element Plus1.2(ElementUI)。 後台管理系統、後台管理框架、Python管理後台、FastApi管理後台、前後端分離管理後台、Vue3管理後台、Vue'},
 {'title': 'awesome-lib/awesome-python', 'link': 'https://gitee.com/awesome-lib/awesome-python?_from=gitee_search', 'description': 'awesome-python 的中文版'},
 {'title': 'Gitee 極速下載/jackfrued-Python-100-Days', 'link': 'https://gitee.com/mirrors/jackfrued-Python-100-Days?_from=gitee_search', 'description': 'Python - 100天從新手到大師'},
 {'title': 'mktime/python-learn', 'link': 'https://gitee.com/mktime/python-learn?_from=gitee_search', 'description': 'GPT對話,Python基礎編程範例:Excel讀寫追加處理,XML解析、JSON解析、FLV與MP4轉換,PyQT界面應用程式開發範例等,https證書到期檢測,糗百爬蟲,pdf和圖片互相轉換,socket使用,百度OCR呼叫例子,IP及埠快速掃描。'},
 {'title': '非空/QrF.Python.FaceRecognition', 'link': 'https://gitee.com/QR/QrF.Python.FaceRecognition?_from=gitee_search', 'description': 'Python 人臉辨識技術'},
 {'title': '天勤量化(TqSdk)/tqsdk-python', 'link': 'https://gitee.com/tianqin_quantification_tqsdk/tqsdk-python?_from=gitee_search', 'description': '簡單但強大的Python量化開發包'},
 {'title': 'EliteQuant/EliteQuant_Python', 'link': 'https://gitee.com/EliteQuant/EliteQuant_Python?_from=gitee_search', 'description': 'Python量化投資交易平台。基於Python3的多執行緒並行式高頻交易平台, 提供一致的回測和即時交易解決方案。它遵循現代設計模式,例如事件驅動, 伺服器/客戶端架構和松散耦合的強大穩定的分布式系統。它遵循與其他EliteQuant產品線相同的結構和績效'},
 {'title': 'OpenHarmony', 'link': 'https://gitee.com/openharmony?_from=gitee_search', 'description': 'OpenHarmony 是開放原子開源基金會(OpenAtom Foundation)旗下開源計畫,定位是一款面向全場景的開源分散式作業系統。'},
 {'title': '火鳥/Python開源踩地雷遊戲PyMine', 'link': 'https://gitee.com/jerryshensjf/PyMine?_from=gitee_search', 'description': 'Python WxPython開源踩地雷遊戲PyMine為開源踩地雷遊戲PyMine 使用Python語言和WxPython UI框架。本例移植自本人開源常式JMine 請在程式所在目錄使用python PyMine.py啟動常式需要先安裝Python 3'},
 {'title': '武沛齊/python_course', 'link': 'https://gitee.com/wupeiqi/python_course?_from=gitee_search', 'description': 'Python全棧開發課件 & 源碼 & 題目 & 答案'},
 {'title': '唐佐林/Python for OpenHarmony', 'link': 'https://gitee.com/delphi-tang/python-for-hos?_from=gitee_search', 'description': '在鴻蒙裝置上使用 Python 編程。'},
 {'title': 'Gitee Community/Python 貪吃蛇魔改大賽', 'link': 'https://gitee.com/gitee-community/Adapted-game?_from=gitee_search', 'description': 'Python 「貪吃蛇」 魔改大賽,是 Gitee 面向 Python 愛好者舉辦的一場創意編程比賽,旨在鼓勵喜愛 Python 編程或有豐富想象力、創新力的小夥伴積極參與開源,並為其提供競技和展示的舞台,將天馬行空的想象化為一行行程式碼,為經典的「貪吃蛇」小遊戲煥發新的生命力!'},
 {'title': 'Python自動化辦公社群/python_auto_office', 'link': 'https://gitee.com/zhaofeng092/python_auto_office?_from=gitee_search', 'description': '關註公眾號:Python自動化辦公社群,發送:1109,領取【47頁PPT-Python如何進行自動化辦公?】。'},
 {'title': 'keijack/python-simple-http-server', 'link': 'https://gitee.com/keijack/python-simple-http-server?_from=gitee_search', 'description': '一個超輕量級的 HTTP Server,支持執行緒和協程模式,源生支持 websocket 哦!你也可以非常容易的將其嵌入到 WSGI 與 ASGI 的伺服器裏。並且支持分布式 Session!'},
 {'title': 'andyham/Python_junior', 'link': 'https://gitee.com/andyham_andy.ham/Python_junior?_from=gitee_search', 'description': 'The foundation of financial risk model programming'},
 {'title': '耿直的小爬蟲/Python爬蟲', 'link': 'https://gitee.com/testp2y/python_reptilian?_from=gitee_search', 'description': '大數據時代 讓爬蟲爬取我們所需'},
 {'title': 'vn.py官方/vn.py', 'link': 'https://gitee.com/vnpy/vnpy?_from=gitee_search', 'description': '基於Python的開源量化交易平台開發框架'},
 {'title': '30秒學程式碼/30-seconds-of-python-code', 'link': 'https://gitee.com/seconds-of-code/30-seconds-of-python-code?_from=gitee_search', 'description': 'Python 語言版的 30 秒學程式碼'},
 {'title': 'src-openEuler/fuse-python', 'link': 'https://gitee.com/src-openeuler/fuse-python?_from=gitee_search', 'description': 'Python bindings for FUSE - filesystem in userspace.'},
 {'title': 'OpenHarmony', 'link': 'https://gitee.com/openharmony?_from=gitee_search', 'description': 'OpenHarmony 是開放原子開源基金會(OpenAtom Foundation)旗下開源計畫,定位是一款面向全場景的開源分散式作業系統。'},
 {'title': 'src-openEuler/python-docker', 'link': 'https://gitee.com/src-openeuler/python-docker?_from=gitee_search', 'description': 'A Python library for the Docker Engine API'},
 {'title': 'keijack/python-eureka-client', 'link': 'https://gitee.com/keijack/python-eureka-client?_from=gitee_search', 'description': '一個 Python 編寫的 eureka 客戶端,同時支持註冊與發現服務,能使得你的程式碼非常方便地接入 spring cloud 中。'},
 {'title': 'src-openEuler/python-importlib-metadata', 'link': 'https://gitee.com/src-openeuler/python-importlib-metadata?_from=gitee_search', 'description': 'Read metadata from Python packages'},
 {'title': '6tail/lunar-python', 'link': 'https://gitee.com/6tail/lunar-python?_from=gitee_search', 'description': '行事曆、公歷(陽歷)、農歷(陰歷、老黃歷)、佛歷、道歷,支持節假日、星座、儒略日、幹支、生肖、節氣、節日、彭祖百忌、每日宜忌、吉神宜趨兇煞宜忌、吉神(喜神/福神/財神/陽貴神/陰貴神)方位、胎神方位、沖煞、納音、星宿、八字、五行、十神、建除十二值星、青龍'},
 {'title': 'src-openEuler/python-texttable', 'link': 'https://gitee.com/src-openeuler/python-texttable?_from=gitee_search', 'description': 'Python module to generate a formatted text table, using ASCII characters'},
 {'title': '百曉通客棧/BXT-AR4Python', 'link': 'https://gitee.com/Lindor_L/BXT-AR4Python?_from=gitee_search', 'description': '百曉通客棧-增強現實開發庫(with Python)'},
 {'title': 'src-oepkgs/python-kafka-python', 'link': 'https://gitee.com/src-oepkgs/python-kafka-python?_from=gitee_search', 'description': 'Pure Python client for Apache Kafka'},
 {'title': 'src-oepkgs/python-python-gitlab', 'link': 'https://gitee.com/src-oepkgs/python-python-gitlab?_from=gitee_search', 'description': 'Python module for interacting with the GitLab API'},
 {'title': 'ni1o1/pygeo-tutorial', 'link': 'https://gitee.com/ni1o1/pygeo-tutorial?_from=gitee_search', 'description': 'Tutorial of geospatial data processing using python 用python分析時空數據的教程(in Chinese and English )'},
 {'title': 'src-openEuler/python-meson-python', 'link': 'https://gitee.com/src-openeuler/python-meson-python?_from=gitee_search', 'description': 'Meson Python build backend (PEP 517)'},
 {'title': 'wilson_yin/Zero basics Python', 'link': 'https://gitee.com/wilsonyin/zero-basics-python?_from=gitee_search', 'description': '零基礎學Python'}]
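The OpenHarmony entry shows up three times in the output, apparently once per fetched page. If repeats like this are unwanted, the list can be filtered after the fact. The following is a rough sketch of one way to do that, keeping the first occurrence of each link. The helper name dedupe_by_link is made up for illustration, and the sample list is a trimmed stand-in for the real return value of fetch_repo_info, with descriptions left empty for brevity.

```python
def dedupe_by_link(results):
    """Keep the first entry for each link, preserving the original order."""
    seen = set()
    unique = []
    for entry in results:
        if entry["link"] in seen:
            continue  # This link was already emitted on an earlier page
        seen.add(entry["link"])
        unique.append(entry)
    return unique


# Trimmed sample in the same shape as the script's output (descriptions omitted).
sample = [
    {"title": "OpenHarmony", "link": "https://gitee.com/openharmony?_from=gitee_search", "description": ""},
    {"title": "awesome-lib/awesome-python", "link": "https://gitee.com/awesome-lib/awesome-python?_from=gitee_search", "description": ""},
    {"title": "OpenHarmony", "link": "https://gitee.com/openharmony?_from=gitee_search", "description": ""},
]

for entry in dedupe_by_link(sample):
    print(entry["title"], entry["link"])
```

With the real data, this would be called as dedupe_by_link(fetch_repo_info("PYTHON", 3)).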