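Looking for a regular expression that matches URLs, ideally one that can extract the host, port, path, params, and hash

The front end needs to recognize links, mainly so their display text can be shortened. If extracting that much detail is not feasible, extracting just the host is fine too.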


Try this:

/^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/

Reference: "Parsing URLs with Regular Expressions" (the article explains the pattern in detail).

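var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
// Declare everything in one var statement so blanks and fields do not leak as implicit globals.
var url = "https://harttle.land:80/tags.html?simple=true#HTML",
    result = parse_url.exec(url),
    blanks = '       ',
    fields = ['url', 'scheme', 'slash', 'host', 'port', 'path', 'query', 'hash'];
// result[0] is the whole match; result[1..7] are the capture groups, in the same order as fields.
fields.forEach(function(field, i){
    // Pad after the colon so the values line up in a column.
    console.log(field + ':' + blanks.slice(field.length) + result[i]);
});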

This parses the URL into:

url:    https://harttle.land:80/tags.html?simple=true#HTML
scheme: https
slash:  //
host:   harttle.land
port:   80
path:   tags.html
query:  simple=true
hash:   HTML
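
For the link-shortening use case in the question, the same pattern could be wrapped in a small helper. This is only a sketch: `parseUrl` and `shortenLink` are made-up names, and it assumes that showing the host alone is an acceptable short form. Note that `exec` returns `null` when the input does not match, and that the host character class accepts only letters, digits, dots, and hyphens, so inputs with userinfo (`user@host`) or bracketed IPv6 hosts will not match.

```javascript
// Same pattern as above; capture groups: scheme, slash, host, port, path, query, hash.
var PARSE_URL = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
var FIELDS = ['url', 'scheme', 'slash', 'host', 'port', 'path', 'query', 'hash'];

// Hypothetical helper: returns an object keyed by field name,
// or null when the input does not match the pattern.
function parseUrl(url) {
    var match = PARSE_URL.exec(url);
    if (match === null) {
        return null;
    }
    var parts = {};
    FIELDS.forEach(function (field, i) {
        // Groups that did not participate in the match stay undefined.
        parts[field] = match[i];
    });
    return parts;
}

// Hypothetical helper: display only the host; fall back to the raw text
// when the input cannot be parsed.
function shortenLink(url) {
    var parts = parseUrl(url);
    return parts ? parts.host : url;
}
```

With the example URL above, `shortenLink` would return `harttle.land`; a string containing spaces is returned unchanged because the pattern rejects it.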

This site shows the state-machine structure of a regular expression, which is handy for debugging: