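- A third-party library already exists, but it is updated slowly and is not very mature
- Library name: scrapy-ja3
- Usage 1: add the following directly to the settings.py configuration file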
# JA3 fingerprint spoofing
DOWNLOAD_HANDLERS = {
    'http': 'scrapy_ja3.download_handler.JA3DownloadHandler',
    'https': 'scrapy_ja3.download_handler.JA3DownloadHandler'
}
- Usage 2: configure it in the spider file (with nothing added to settings.py)
from scrapy import Request, Spider


class Ja3TestSpider(Spider):
    name = "ja3_test"

    # Per-spider override, so settings.py stays unchanged
    custom_settings = {
        'DOWNLOAD_HANDLERS': {
            'http': 'scrapy_ja3.download_handler.JA3DownloadHandler',
            'https': 'scrapy_ja3.download_handler.JA3DownloadHandler',
        }
    }

    def start_requests(self):
        start_urls = [
            'https://tls.browserleaks.com/json',
        ]
        for url in start_urls:
            yield Request(url=url, callback=self.parse_ja3)

    def parse_ja3(self, response):
        # The endpoint echoes back the TLS fingerprint it observed
        self.logger.info(response.text)
        self.logger.info("ja3_hash: " + response.json()['ja3_hash'])
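For context on what the logged ja3_hash represents: a JA3 fingerprint is the MD5 of a string built from five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats), joined with "," between fields and "-" within a field. The sketch below only illustrates that construction. The function names and sample values are made up for illustration, it is not how scrapy-ja3 works internally, and it skips details such as filtering GREASE values.

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string from decimal ClientHello values.

    Hypothetical helper for illustration; empty lists become empty fields.
    """
    fields = [[tls_version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(ja3):
    """Return the MD5 hex digest of a JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Sample values chosen for illustration only
sample = ja3_string(771, [4865, 4866], [0, 10], [29, 23], [0])
print(sample)
print(ja3_hash(sample))
```

Because the hash depends on the order and contents of these lists, changing the cipher or extension list the client offers changes the fingerprint a server sees.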
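Because scrapy-ja3 does not support the latest version of Scrapy, the first two dependencies below must be pinned to the versions shown; otherwise you will run into all sorts of dependency problems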
pip install Twisted==22.10.0
pip install Scrapy==2.9.0
pip install scrapy-ja3
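After installing, it can help to confirm that the pinned versions are really the ones in the active environment. The following is a minimal sketch using only the standard library's importlib.metadata. The check_pins helper and the PINNED dict are hypothetical names introduced here, with the version numbers copied from the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install commands; adjust if your setup differs
PINNED = {"Twisted": "22.10.0", "Scrapy": "2.9.0"}


def check_pins(pins, get_version=version):
    """Return {package: problem} for each package that is missing or mismatched."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if found != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    for name, problem in check_pins(PINNED).items():
        print(f"{name}: {problem}")
```

An empty result means both pins match. The get_version parameter exists only so the check can be exercised without the packages installed.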