Shortest URL regex match in JavaScript

  • Create the shortest regular expression that will roughly match a URL in text when run in JavaScript


    "some text".match(/your regular expression goes here/);

    The regular expression needs to

    • capture all valid URLs that use http or https.

    • not worry about rejecting URL-looking strings that aren't actually valid URLs, such as super.awesome/cool

    • be valid when run as a JavaScript regex

    Test criteria:


    Not Match:

    • example

    • super/cool

    • Good Morning

    • i:can

    • hello.
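
    One way to check a candidate against the list above is a small harness like the following. This is only a sketch: the `candidate` regex and the `shouldMatch` samples are hypothetical (the challenge text here lists no positive cases), and the candidate is neither golfed nor the accepted answer.

```javascript
// Strings from the "Not Match" list in the challenge.
const shouldNotMatch = ["example", "super/cool", "Good Morning", "i:can", "hello."];

// Hypothetical positive samples; the challenge text does not list any.
const shouldMatch = [
  "http://example.com",
  "see https://example.org/path?q=1 for details",
];

// Hypothetical candidate (an assumption, not the accepted answer):
// "http", an optional "s", "://", then one or more non-whitespace characters.
const candidate = /https?:\/\/\S+/;

// Returns the strings on which the regex behaves contrary to expectation.
function failures(regex, positives, negatives) {
  const missed = positives.filter((s) => s.match(regex) === null);
  const falseHits = negatives.filter((s) => s.match(regex) !== null);
  return { missed, falseHits };
}

const result = failures(candidate, shouldMatch, shouldNotMatch);
console.log(result);
```

    Empty `missed` and `falseHits` arrays mean the candidate passes these particular samples; that says nothing about URLs outside them.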

    Here is a test that might help clarify a bit

    I apologize for the lack of clarity; I hadn't realized how awful matching URLs was.

    Ahgrrrr! I miss my edit privileges! If you're going to restrict the game to one language, perhaps you should tag it with that language.

    What constitutes a valid URL character? Because I can simply use `\w` for everything. Do you expect backreferences for different URL components?

    "A URI is a sequence of characters from a very limited set, i.e. the letters of the basic Latin alphabet, digits, and a few special characters," according to RFC 2396.

    Mike: I guess there is still some clarification in order. As it stands now, I can just use `/:/` as the regular expression and match valid URIs and not match all your examples on the »Not match« list. As long as you're going that route, it's simply the question: What is the shortest regular expression that will not match any of the example strings but still catch all URIs?
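
    The `/:/` idea is easy to try out. A minimal sketch, run against the Not Match list as it appears in the question now (the assumption being that the list may have looked different when this comment was written):

```javascript
// The "Not Match" strings as currently listed in the question.
const notMatchList = ["example", "super/cool", "Good Morning", "i:can", "hello."];

// The trivial regex from the comment: matches any string containing a colon.
const colonOnly = /:/;

// Collect the listed strings that the trivial regex still matches.
const colonHits = notMatchList.filter((s) => colonOnly.test(s));
console.log(colonHits);
```

    Against this version of the list, only "i:can" contains a colon, so it is the one entry `/:/` fails to reject.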

    I think this question seems to be a "give me teh codez" question.

    @M28 the lack of clarity may make it seem that way, but I did learn a lot from it and I'm still working on my own answer. If you think it should be deleted, we can do that if it is better for the community.

    Just try to write a longer challenge with more details.

    Voting to close as there is no specification of what you count as a URL. Should we allow usernames/passwords in the URL? Ports? Should we validate URL lengths?

    I agree with Redwolf, in that this challenge requires a full spec of what is and isn't a valid URL, rather than that being defined through the test cases.

    Since this challenge is long inactive and I think the clarity points are uncontroversial by today's standards, I am going to be using my power as moderator to close this question. Feel free to flag this for attention if the issues have been resolved, or given the age of the question ask it again fresh (although I would recommend the sandbox first to make sure the spec is solid).

  • www0z0k

    Correct answer

    10 years ago

    doesn't match 3 strings that it shouldn't, matches almost anything else ;)

    upd: it still doesn't match all 5

Licensed under CC-BY-SA with attribution

Content dated before 7/24/2021 11:53 AM