How To Get The Domain From A URL Using JavaScript

Created: August 29th, 2011
JavaScript Development 12 Comments

I’m sure this is a common problem: getting the domain or host name from a URL using JavaScript. There are certainly many solutions to this problem out there. However, the solutions I found weren’t robust enough for my needs, so I ended up writing my own.

The Problem With Most Solutions

Most of the solutions I found suffered from one or more of the following problems:

They rely on window.location.hostname (or similar)

The vast majority of solutions out there work by parsing window.location.hostname or window.location.href. This would be fine if I were working out the domain of the page I’m currently on, but I want to work it out for a URL that I’m not actually visiting at the moment. It has to work for any URL stored in a variable as a string.

They don’t cater for URL parameters or hash URLs

Most solutions parse the URL based only on the forward slash. That works for the most common URLs, such as

  • http://scratch99.com/web-development/javascript/

which will return scratch99.com, but it won’t work with the following URLs:

  • http://scratch99.com#footer
  • scratch99.com?s=web+development

which would return the full URL in both cases. I need it to strip out everything after the ? and # characters.
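To make the problem concrete, here is a rough sketch of the slash-only approach. The function name slashOnlyHost is made up for illustration and is not taken from any particular solution; it simply splits on the forward slash and picks the element where the host would normally sit.

```javascript
// Hypothetical helper: a slash-only parser of the kind described above.
function slashOnlyHost(url) {
  var parts = url.split('/');
  // With a protocol the parts are ["http:", "", "host", ...];
  // without one, the host is expected to come first.
  return url.indexOf('://') !== -1 ? parts[2] : parts[0];
}

console.log(slashOnlyHost('http://scratch99.com/web-development/javascript/')); // scratch99.com
console.log(slashOnlyHost('http://scratch99.com#footer'));     // scratch99.com#footer
console.log(slashOnlyHost('scratch99.com?s=web+development')); // scratch99.com?s=web+development
```

In the last two cases the fragment or query string stays attached to the host, which is exactly what needs to be avoided.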

They don’t strip out the http://

I’m processing the URL so I can store the domain and use it in a URL, in the following way:

http://domain.com/sources/theparseddomain.com/

I do not want the following:

http://domain.com/sources/http://theparseddomain.com/

so I need to strip out the http:// when parsing the URL.

How To Parse A URL And Get The Hostname

My solution, taking all of the problems above into account, is as follows:

[sourcecode language="javascript"]
var url = "http://scratch99.com/web-development/javascript/";
var domain = url.replace('http://','').replace('https://','').split(/[/?#]/)[0];
[/sourcecode]

In the example above, the domain variable will contain the value “scratch99.com”.

It is the second line in the code above that is important. The url variable can be changed to the URL that you need to work with, whether you just change the line of code, or set the variable using a form, or even loop through an array of URLs.
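As a minimal sketch of the "loop through an array" case, the one-liner can be wrapped in a function. The name getHostname and the domains array are assumptions made for this example; the URLs are the ones used elsewhere in this post.

```javascript
// Hypothetical wrapper around the one-liner above.
function getHostname(url) {
  return url.replace('http://', '').replace('https://', '').split(/[/?#]/)[0];
}

var urls = [
  'http://scratch99.com/web-development/javascript/',
  'http://scratch99.com#footer',
  'scratch99.com?s=web+development'
];

// Collect the parsed host for each URL in the list.
var domains = [];
for (var i = 0; i < urls.length; i++) {
  domains.push(getHostname(urls[i]));
}

console.log(domains); // ["scratch99.com", "scratch99.com", "scratch99.com"]
```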

Let’s look at that second line of code a little more closely:

First, the .replace('http://','') strips the http:// off the URL. I do this for the reasons explained above, but it also makes the rest of the code simpler. If we don’t do this, then depending on whether the http:// is present or not, the domain name may be either the 1st or 3rd element of the split result array. If we strip it, it will always be the first element.
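The "1st or 3rd element" point is easier to see with the split results printed out. This is only an illustration; the variable names are invented for the example.

```javascript
var pattern = /[/?#]/;

// Protocol left in place: the two slashes after "http:" produce
// "http:" and an empty string before the host.
var withProtocol = 'http://scratch99.com/web-development/'.split(pattern);
console.log(withProtocol); // ["http:", "", "scratch99.com", "web-development", ""]

// No protocol: the host is already the first element.
var withoutProtocol = 'scratch99.com/web-development/'.split(pattern);
console.log(withoutProtocol); // ["scratch99.com", "web-development", ""]
```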

This doesn’t cater for user data entry problems, such as the user leaving the : out of a URL, eg: http//domain.com/. I’ll take that as acceptable, as in this case the user is going to be me! However, if this was for end users, you’d probably want to check that they didn’t make such a mistake.
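If the input did come from end users, one possible check is sketched below. The function name looksMalformed is hypothetical, and the rule it applies (starts with http or https followed by a colon or slash, but not by exactly ://) is an assumption about what counts as a typo, not a full URL validator.

```javascript
// Hypothetical check for a mistyped http/https prefix such as "http//".
function looksMalformed(url) {
  var startsLikeProtocol = /^https?[:\/]/i.test(url); // "http:", "http/", "https:", "https/"
  var hasProperProtocol = /^https?:\/\//i.test(url);  // exactly "http://" or "https://"
  return startsLikeProtocol && !hasProperProtocol;
}

console.log(looksMalformed('http//domain.com/'));  // true
console.log(looksMalformed('http://domain.com/')); // false
console.log(looksMalformed('domain.com/'));        // false
```

A host that is literally named "http" followed by a path would be flagged as well, so treat this as a starting point only.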

Editorial Note
Thanks to Edward Caissie for pointing out in the comments that the original solution didn’t cater for https:. I’ve now added an extra replace to cater for this. I’m sure there’s a way to do it with a single replace using regex, but will leave it there for now.
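For reference, a single regex replace might look like the sketch below. It is one possible form rather than the version used in this post: the s? makes the "s" optional, and the ^ anchors the match to the start of the string.

```javascript
var url = "https://scratch99.com/web-development/javascript/";

// ^https?:\/\/ matches "http://" or "https://", but only at the very start.
var domain = url.replace(/^https?:\/\//, '').split(/[/?#]/)[0];

console.log(domain); // scratch99.com
```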

Next, the .split(/[/?#]/) splits the resulting string using a regular expression. The string is split into parts, based not just on the forward slash, but also on the ? and # characters. The first item of the resulting array will be the hostname. This works, but I’m far from a regex master, so if anyone has a better way of doing this, let me know in the comments.
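A small sketch of what the split produces. It also shows a variant with the slash escaped inside the character class, which behaves the same in current JavaScript engines and which some older tools and syntax highlighters may cope with better. The variable names are made up for the example.

```javascript
var unescaped = /[/?#]/;
var escaped = /[\/?#]/; // same character class, with the slash escaped

var sample = 'scratch99.com/a/b?c=d#e';

console.log(sample.split(unescaped)); // ["scratch99.com", "a", "b", "c=d", "e"]
console.log(sample.split(escaped));   // ["scratch99.com", "a", "b", "c=d", "e"]

// Whichever separator comes first, the host is always element 0.
console.log('scratch99.com#footer'.split(escaped)[0]); // scratch99.com
```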

Finally, the [0] gets the first element of the array resulting from the split method. For improved readability, you could move this to a separate line of code, eg:

[sourcecode language="javascript"]
var url = "http://scratch99.com/web-development/javascript/";
var urlParts = url.replace('http://','').replace('https://','').split(/[/?#]/);
var domain = urlParts[0];
[/sourcecode]

Hostname Vs Domain Name

Strictly speaking, the code above returns the host name (including the subdomain), rather than the domain. That suits my needs. If you just want to strip the “www.”, then you can use the following code in place of line 2 above (note the extra replace):

[sourcecode language="javascript"]var sourceString = url.replace('http://','').replace('https://','').replace('www.','').split(/[/?#]/)[0];[/sourcecode]

If you need to remove any subdomain, not just www, then you’ll have to do some extra parsing of the result.
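As a rough illustration of that extra parsing, the sketch below keeps only the last two dot-separated labels of the hostname. The function name lastTwoLabels is hypothetical, and the approach is deliberately naive: it assumes a single-label suffix such as .com, so it gives the wrong answer for multi-part suffixes such as .co.uk.

```javascript
// Hypothetical helper: keep only the last two labels of a hostname.
function lastTwoLabels(hostname) {
  var labels = hostname.split('.');
  if (labels.length <= 2) {
    return hostname; // nothing to strip
  }
  return labels.slice(-2).join('.');
}

console.log(lastTwoLabels('blog.scratch99.com')); // scratch99.com
console.log(lastTwoLabels('scratch99.com'));      // scratch99.com
console.log(lastTwoLabels('www.example.co.uk'));  // co.uk (wrong, showing the limitation)
```

Handling suffixes like .co.uk properly needs a list of known suffixes, which is beyond a one-liner.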

Examples

The following examples will all result in scratch99.com being returned:

  • http://scratch99.com/web-development/javascript/
  • http://scratch99.com#footer
  • scratch99.com?s=web+development


If you can break it, let me know!
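In that spirit, here are a few inputs that appear to trip up the one-liner, some of which are raised in the comments below. The getHostname wrapper is a hypothetical name for the code from this post, and the port number and paths are made up for the example.

```javascript
// Hypothetical wrapper around the one-liner from this post.
function getHostname(url) {
  return url.replace('http://', '').replace('https://', '').split(/[/?#]/)[0];
}

// The replace is case sensitive, so an upper-case protocol is not stripped.
console.log(getHostname('HTTP://scratch99.com/'));             // HTTP:

// Other protocols are not stripped either.
console.log(getHostname('ftp://scratch99.com/file'));          // ftp:

// Protocol-relative URLs start with two slashes, so the first element is empty.
console.log(getHostname('//ajax.googleapis.com/ajax/libs/'));  // (empty string)

// A port number stays attached to the host.
console.log(getHostname('http://scratch99.com:8080/path'));    // scratch99.com:8080
```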

12 responses on “How To Get The Domain From A URL Using JavaScript”

  1. Michael

    Or just this:

    var hn = window.location.hostname.split('.').reverse();
    var host = hn[1] + "." + hn[0];
    console.log("Host:", host);

  2. PAEz

    Or you could avoid the replace() with…..
    domain = "http://scratch99.com/web-development/javascript/".split(/\/\/|\//)[1]
    …works on "//ajax.googleapis.com" as well and won’t care what the protocol was

  3. PAEz

    Oops, make that….
    "http://scratch99.com/web-development/javascript/".split(/\/\/|[?#\/]/)[1]
    …must contain a protocol tho or start with //, which is fine in my case

  4. Cristian

    You should strip all protocols from https:// passing through ftp:// and ending in wss://

  5. Gaby

    You can also use this: ^(?:https?\:\/\/)?(?:www\.)?([^\.]+)(?:\.).*$ first group $1 will have it