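I'm looking for a regular expression that matches URLs, ideally one that can extract the host, port, path, params, and hash.
It's for recognizing links on the front end, mainly to shorten the displayed text. If it can't extract that much detail, getting just the host is fine too~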
Try this:
/^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/
Reference: "Parsing URLs with Regular Expressions" (用正则表达式分析 URL). The article explains the pattern in detail.
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
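var url = "https://harttle.land:80/tags.html?simple=true#HTML";
var result = parse_url.exec(url);
// Seven spaces, used to pad each label so the values line up.
var blanks = '       ';
// Field names in the same order as the regex capture groups (index 0 is the whole match).
var fields = ['url', 'scheme', 'slash', 'host', 'port', 'path', 'query', 'hash'];
fields.forEach(function (field, i) {
    console.log(field + ':' + blanks.slice(field.length) + result[i]);
});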
This parses into:
url:    https://harttle.land:80/tags.html?simple=true#HTML
scheme: https
slash:  //
host:   harttle.land
port:   80
path:   tags.html
query:  simple=true
hash:   HTML
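For everyday use it may be handier to wrap the regex in a small function that returns a named object, and to handle input that doesn't match at all (in that case `exec` returns `null`). Here is a minimal sketch; `parseUrl` is a name I made up for illustration, not something from the referenced article. Parts that are absent from the URL, such as a missing port, come back as `undefined`.

```javascript
// Same pattern as above.
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

// Hypothetical helper: returns an object with named parts,
// or null when the input does not match the pattern.
function parseUrl(url) {
    var m = parse_url.exec(url);
    if (m === null) {
        return null;
    }
    return {
        scheme: m[1],
        slash: m[2],
        host: m[3],
        port: m[4],   // undefined when the URL has no explicit port
        path: m[5],
        query: m[6],
        hash: m[7]
    };
}

var parts = parseUrl("https://harttle.land:80/tags.html?simple=true#HTML");
if (parts !== null) {
    console.log(parts.host);
}
```

If you only need the host to shorten link text, `parts.host` is enough. Note that the host group accepts only letters, digits, dots, and hyphens, so inputs such as IPv6 literals will not match.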
This site shows the regex's state-machine structure, which is handy for debugging: