Domain sharding for all

One of the most frequently discussed performance improvements for web sites is domain sharding. This was one of Steve Souders’ original rules (mentioned in High Performance Web Sites) and is still one of the Yahoo! Exceptional Performance guidelines. The basic problem is that browsers limit the number of parallel connections opened to a particular domain. Although recent browsers have upped the number of parallel connections for HTTP 1.1 from 2 to 6, your page still downloads faster if you have at least a couple of other domains from which to download resources. The guideline is to have at least two different domains but no more than four, since each additional domain adds a DNS lookup and too many lookups negatively impact performance.
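To make the connection limit concrete, here is a rough back-of-the-envelope sketch. It assumes every resource takes the same time to download and that resources are spread evenly across hostnames, and it ignores DNS lookup cost and bandwidth limits entirely. The function name and the numbers are only illustrative, not measurements.

```python
import math


def download_rounds(resource_count: int, hostnames: int, per_host_limit: int = 6) -> int:
    """Estimate how many sequential "rounds" of downloads a page needs.

    Simplifying assumptions: all resources take equally long, they are
    spread evenly across hostnames, and DNS lookups cost nothing.
    """
    if resource_count < 0 or hostnames < 1 or per_host_limit < 1:
        raise ValueError("counts must be non-negative and limits at least 1")
    # Total parallel connections the browser will open across all hostnames.
    parallel_slots = hostnames * per_host_limit
    return math.ceil(resource_count / parallel_slots)


if __name__ == "__main__":
    # A hypothetical page with 30 resources.
    print(download_rounds(30, hostnames=1, per_host_limit=2))  # 15 rounds
    print(download_rounds(30, hostnames=1))                    # 5 rounds
    print(download_rounds(30, hostnames=2))                    # 3 rounds
    print(download_rounds(30, hostnames=4))                    # 2 rounds
```

Under these assumptions, going from one hostname to two cuts the rounds from 5 to 3, while going from two to four only saves one more round. That diminishing return, combined with the extra DNS lookups the model leaves out, is consistent with the two-to-four guideline.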

When I’ve discussed this with other developers, everyone agrees that this is a good idea. Most major web sites use multiple domains to download page components. It’s relatively cheap and easy to set up a new CNAME entry for your domain and direct some resources there. The pain point I usually hear is that for personal or small business web sites, they don’t have access to creating subdomains and certainly don’t have a CDN to use. Fortunately, there’s a variety of places you can store resources that will allow even the smallest site to enjoy the performance gain of using multiple domains.
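If you do control your own hostnames, one common way to direct resources to them is to pick the hostname from a hash of the resource path. The sketch below is a minimal illustration of that idea, not a description of how any particular site does it. The hostnames static1.example.com and static2.example.com are hypothetical placeholders, and shard_url is a name I made up for the example.

```python
import zlib

# Hypothetical shard hostnames, e.g. CNAME entries pointing at the same server.
SHARD_HOSTS = ["static1.example.com", "static2.example.com"]


def shard_url(path: str, hosts: list[str] = SHARD_HOSTS) -> str:
    """Map a resource path to one of the shard hostnames.

    The choice depends only on the path, so the same resource always
    gets the same URL on every page.
    """
    if not hosts:
        raise ValueError("at least one hostname is required")
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return f"http://{hosts[index]}/{path.lstrip('/')}"


if __name__ == "__main__":
    for path in ["/img/logo.png", "/img/photo1.jpg", "/css/site.css"]:
        print(shard_url(path))
```

The important design choice is that the mapping is deterministic. If a resource were assigned to a random hostname on each page view, the browser would see a different URL each time and download the file again instead of using its cached copy.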

Photos – Flickr

If you regularly use photos on your site, then Flickr is a great option. Flickr not only hosts your photo files, but also creates several different sizes of the original image, all of which can be embedded on your page. Once your photos are on Flickr, you can click on “All Sizes” to choose the image size you want and then scroll to the bottom of the page where you’ll see a section with both HTML to embed the image and a direct link. The only thing to keep in mind is that, according to the Flickr terms of service, you’ll need to link that image back to its Flickr page (very easy using the provided HTML). Your photos are then loaded from the Flickr servers, so there’s no need to upload them to your own site.

Flickr comes with both free and pro versions, and photo embedding on your site can be done with either plan. The pro version is $24.95 per year and removes all limits on uploads, among other things.

Graphics – Photobucket

For non-photo images, such as background images or decorative page enhancements, Photobucket is a great alternative. Photobucket gained popularity with the rapid rise of MySpace as people needed free locations to upload photos in order to create those crazy MySpace profile page designs. Photobucket makes it easy to embed images by providing links for various services as well as a direct URL to each image. Unlike Flickr, you can only share the exact same image you uploaded, which makes Photobucket less useful for photos (which are often quite large) but quite useful for other graphics. This is currently where I store the graphics of my books that you see on this site.

Photobucket also comes in free and pro flavors. The free version has a limit on both uploads and data transfer while the pro version ($24.95 per year) removes those restrictions and also allows videos and SWF files to be uploaded and shared.

JavaScript libraries – Yahoo! and Google

If you use a popular JavaScript library (YUI, Dojo, jQuery, MooTools, etc.), there’s no need to have those files on your server since both Yahoo! and Google host various parts on their own CDNs. Yahoo!, of course, does so for the YUI library (both the 2.x and 3.x versions), allowing you to use the same CDN that is used on the Yahoo! network to load YUI. Google offers a much wider selection of libraries, including YUI, that can be loaded from their CDN. Not only do Yahoo! and Google offer other domains from which to load the libraries, but since these are CDN domains, the resources are also loaded from the best possible geographic server location (adding an even bigger performance boost). There is no charge from either Yahoo! or Google to access these libraries.

Presentations – Slideshare

If you’ve been at a tech conference recently, you’re probably familiar with Slideshare. It’s where many notable speakers upload their presentations to share with everyone. Slideshare translates PowerPoint, Keynote, and other presentation formats into a Flash movie that is embeddable anywhere, making it the “YouTube of presentations.” Beyond that, Slideshare stores a copy of the presentation that can be downloaded directly from their site. Since presentations tend to be quite large, it’s a good idea to store those elsewhere so as not to affect your monthly bandwidth. All of my presentations are up on Slideshare.

Anything – Amazon S3

I’d be remiss if I didn’t mention Amazon’s S3. While not a free service, Amazon S3 has some very reasonable pricing for data transfer and storage (we’re talking cents per GB of storage and transfer). Although it isn’t strictly a service for web sites, Amazon S3 allows you to store any file of any type and size in the same location for a low cost. If you want all of your files in the same location on a dependable system, Amazon S3 is a great option, which is why companies like Twitter use Amazon S3.

Others?

I’m sure there are many more free and pay sites to use for web data storage, but these are the ones I’ve personally used and would recommend. What are your favorite sites for storing data?

Comments

  1. Andrew Mattie

    You left out Amazon CloudFront (http://aws.amazon.com/cloud.... It's a pay-as-you-go CDN service that feeds directly off of S3. It's cheap, and it's obviously far faster (albeit more expensive) than simply serving from S3. We use it and we absolutely love it.

  2. Michael Shkutkov

    Google App Engine
    * 1 GB of data transferred in and out per day for free. Also a good price: cheaper than S3 if monthly data transfer out is less than 100 TB.
    * 1 GB of stored data for free per project.

  3. Rob

    +1 for CloudFront. S3 has a ton of latency. I still use it for "content" images, but anything that needs to load quickly is referenced from CloudFront. It's a 2x-3x difference in speed (130ms vs 300-400ms) in the States and must be significantly different in Europe/Asia where the regional servers must make a huge difference.

  4. Joel Cass

    I wouldn't necessarily agree. The real key would be to firstly reduce the number of calls you need to make by reducing the number of assets on the page. Then, consider using multiple domains.

    I'm not really sure what difference multiple domains would make, especially on slow connections where downloading 4 or more assets in parallel would probably take the same amount of time as downloading them in a serial fashion. You got to remember that if a lookup has already been made to one domain, and a connection is already established, there would be less overhead in downloading files.

    Also, using google analytics as a case in point, sometimes your site is only as good as the slowest server. You will notice on many sites, the page load time always blows out due to the lag involved in pulling down google analytics scripts, which are downloaded and executed at load time. Same thing applies for sites that use banner ads and the like from other sources, e.g. doubleclick, yahoo! etc.

    Also, with sites such as flickr, slideshare, youtube etc you got to remember that once your data is in their system it no longer belongs to you. In most cases when you upload assets into their systems you are giving them full rights over content reuse and distribution.

    I think domain sharding would have real merit as long as there is a guarantee that all servers will perform at a high level of standard and that latency between all servers and the client is about the same. I think this is where content distribution networks can provide real benefit as they provide a large number of servers and guarantee that the performance and latency are consistent (and hopefully of a good standard).

  5. Nicholas C. Zakas

    @Joel - I'm not sure there's room to disagree on this. There's been a lot of research in this area that shows pages completely download faster when more resources are downloaded in parallel, which is what domain sharding does for you. You should really check out the links in this post that cover the research in this area, as I think that will make the benefits clearer for you.

  6. Joel Cass

    Just to clarify - I do agree with the principle but disagree with using some external vendors' sites. I'd still stand by my thoughts - your site is only as good as your worst server. If you have more than one domain pointing to the same server, good. But if you consider using services including amazon S3, flickr, youtube et al., you cannot always guarantee that these services will perform well - if they perform badly, your website performance will suffer too.

  7. Mike Hopley

    You could actually get worse performance by doing this. More realistically, many sites would see no real difference either way.

    Domain sharding is great, but only if you spread the resources effectively across shards. If you spread them ineffectively, then you're incurring a DNS lookup for little benefit.

    Let's take the scenario of an image-heavy site with only a small amount of javascript & css. Moving all the images to Flickr will have little effect on performance. The improvement in parallel downloads will only affect images downloading in parallel with javascript/css, and not images in parallel with images.

    For that site, moving half the images to Flickr would be effective. By spreading the images between two hostnames, up to twice as many images can be downloaded at once.

    "for personal or small business web sites, they don’t have access to creating subdomains and certainly don’t have a CDN to use."

    Really? What kind of web host doesn't allow you to create subdomains? You can get this facility from a $5-per-month package.

    If you're talking about people on free web hosting, then I agree. But anyone considering domain sharding is fairly serious about his website, and one has to wonder why he is using free hosting.

    By the same token, anyone serious enough to shard is serious enough to use a CDN. Amazon CloudFront has zero up-front costs. For a low-traffic website, CloudFront's charges are trivial ($0.15 to $0.20 per GB).

    To those site owners who want to shard but "cannot", your advice is, "Try this workaround". My advice would be: "Get a decent web host and try Cloudfront."

    (Note: I've never used Cloudfront myself. It's just that their pricing model makes them a logical choice here.)
