Problem description
I'm trying to write a regular expression.
Some background info: I am trying to check whether the REQUEST_URI of my website's URL contains another URL, like this:
- http://mywebsite.com/google.com/search=xyz
However, the URL won't always contain the 'http' or the 'www', so the pattern should also match strings like:
- http://mywebsite.com/yahoo.org/search=xyz
- http://mywebsite.com/www.yahoo.org/search=xyz
- http://mywebsite.com/msn.co.uk
- http://mywebsite.com/http://msn.co.uk
There are plenty of regexps out there that match URLs, but none of the ones I have found make the 'http' and 'www' parts optional.
I'm wondering if the pattern to match could be something like:
^([a-z]).(com|ca|org|etc)(.)
I thought another option might be to just match any string that has a dot (.) in it, as the other REQUEST_URIs in my application typically won't contain dots.
Does this make sense to anyone? I'd really appreciate some help with this; it's been blocking my project for weeks.
Thanks very much, Tim
Recommended answer
I suggest a simple approach that builds on what you said: treat anything with a dot in it as a URL, but take the forward slashes into account too. That captures everything without missing unusual URLs. So something like:
^((?:https?://)?[^./]+(?:\.[^./]+)+(?:/.*)?)$
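It reads as:
- an optional http:// or https://
- one or more characters that are not a dot or a forward slash
- one or more groups of a dot followed by characters that are not a dot or a forward slash
- an optional forward slash and anything after it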
The whole thing is captured in the first group.
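The breakdown can be sketched in Python using the `re` module's verbose mode, with one comment per part of the pattern. This is only an illustration: the name `EMBEDDED_URL` is made up here, and the literal dot is written as `\.` because the description calls for a dot followed by non-dot, non-slash characters.

```python
import re

# Verbose form of the answer's pattern; whitespace and comments are ignored.
# EMBEDDED_URL is a hypothetical name chosen for this sketch.
EMBEDDED_URL = re.compile(
    r"""
    ^(                    # group 1: the whole candidate URL
        (?:https?://)?    # optional http:// or https://
        [^./]+            # characters that are neither a dot nor a slash
        (?:\.[^./]+)+     # one or more groups of: a dot, then such characters
        (?:/.*)?          # optional forward slash and anything after it
    )$
    """,
    re.VERBOSE,
)

for candidate in ["nic.uk", "http://nic.uk/", "index.php", "directory/index.php"]:
    match = EMBEDDED_URL.match(candidate)
    print(candidate, "->", match.group(1) if match else None)
```

The loop prints the captured first group for each candidate, or `None` when the pattern does not match.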
It would match, for example:
nic.uk
nic.uk/
http://nic.uk
http://nic.uk/
https://example.com/test/?a=bcd
Verifying that they are valid URLs is another story! It would also match:
index.php
It would not match:
directory/index.php
The minimal match is basically something.something, with no forward slash in it unless the slash comes at least one character past the dot. So just be sure not to use that format for anything else.
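As a rough sketch of how this might be applied to a request path, the Python below strips the leading slash and then tests the remainder against the pattern. The helper name `embedded_url` and the assumption that REQUEST_URI always begins with a single "/" are mine, not part of the question or the answer.

```python
import re
from typing import Optional

# The answer's pattern, with the literal dot escaped.
URL_PATTERN = re.compile(r"^((?:https?://)?[^./]+(?:\.[^./]+)+(?:/.*)?)$")


def embedded_url(request_uri: str) -> Optional[str]:
    """Return the URL embedded in the request path, or None if there is none."""
    # Assumption: the path starts with one "/" that is not part of the embedded URL.
    candidate = request_uri[1:] if request_uri.startswith("/") else request_uri
    match = URL_PATTERN.match(candidate)
    return match.group(1) if match else None


print(embedded_url("/google.com/search=xyz"))
print(embedded_url("/http://msn.co.uk"))
print(embedded_url("/about/contact"))
```

A path such as `/about/contact` yields `None` because it contains no dot before the first slash, which is the property the approach relies on.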