A list of common crawler user-agent (UA) strings, kept for reference when collecting statistics on crawler visits.
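Owner | user-agent | Purpose |
---|---|---|
Google | googlebot | Search engine |
Google | google-structured-data-testing-tool | Testing tool |
Google | Mediapartners-Google | AdSense; the crawler visits after a page carrying AdSense ads has been accessed |
Microsoft | bingbot | Search engine |
LinkedIn | linkedinbot | In-app search |
Baidu | baiduspider | Search engine |
Qihoo 360 | 360Spider | Search engine |
Sogou | Sogou Spider | Search engine |
Yahoo | Yahoo! Slurp China | Search engine |
Yahoo | Yahoo! Slurp | Search engine |
Toutiao | Bytespider | Search engine |
Twitter | twitterbot | In-app search |
Facebook | facebookexternalhit | In-app search |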
- | rogerbot | - |
- | embedly | - |
Quora | quora link preview | - |
- | showyoubot | - |
- | outbrain | - |
- | - | |
- | slackbot | - |
- | vkShare | - |
- | W3C_Validator | - |
Matching in nginx:
if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|360Spider|Sogou Spider|Bytespider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp") {
    # do something
}
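
For offline statistics, the same pattern can also be applied to user-agent strings that have already been logged. The following is a minimal Python sketch, assuming the UA strings have been extracted from the access log beforehand; `CRAWLER_RE` and `count_crawlers` are illustrative names and not part of nginx or any library.

```python
import re
from collections import Counter

# Same alternation as the nginx rule above.
# re.IGNORECASE mirrors the case-insensitive ~* operator in nginx.
CRAWLER_RE = re.compile(
    r"googlebot|bingbot|yandex|baiduspider|360Spider|Sogou Spider|Bytespider"
    r"|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly"
    r"|quora link preview|showyoubot|outbrain|pinterest/0\.|pinterestbot"
    r"|slackbot|vkShare|W3C_Validator|whatsapp",
    re.IGNORECASE,
)


def count_crawlers(user_agents):
    """Count hits per matched crawler token (lowercased); other UAs are skipped."""
    counts = Counter()
    for ua in user_agents:
        match = CRAWLER_RE.search(ua)
        if match:
            counts[match.group(0).lower()] += 1
    return counts
```

Calling `count_crawlers` with an iterable of UA strings returns a `Counter` keyed by the matched token, so `most_common()` gives a quick ranking of which crawlers visit most often.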