UPDATE: This blog has moved to http://marcelo.sampasite.com/brave-tech-world/default.htm . Please update your subscription.
There is a ton of research on user behavior based on task response time and I'm sure if you search Jakob Nielsen's site you'll find something.
I remember an IBM study made more than 15 years ago that said if it took 1 second for the computer to respond, it would take 1 second for the user to start the next step, but if it takes 10 seconds for the computer to respond, it takes 20 seconds for the user to start the next step of the task. The problem compounds because users get more and more distracted the longer it takes for a command to respond.
Web sites are no different. Google proved with its super fast pages (part real, part perception) that users will perform many more searches if they know they don't have to wait long. If each query on Google took 20 seconds to respond, you would likely spend more time thinking about which search terms to use. But since it takes just 3 seconds, you just keep adding and removing search terms.
So, during the past few weeks I've been collecting tips for Web Developers on how to improve their page speed. These tips affect latency, bandwidth, rendering and/or perception of when a page is ready. They are in no particular order.
Tip #1: Strip spaces, tabs, CR/LF from the HTML
I'm always surprised when I look at the HTML of some large website and find a ton of unnecessary spaces, tabs, new-lines, HTML comments, etc. Just removing those elements can reduce the page size by 5-10%, which in turn can decrease the download latency. I'll go one step further and suggest not using quotes on attributes unless necessary.
Tip #2: Don't use XHTML
This is very controversial. A lot of people will call me crazy, but I see XHTML as a loser technology. It has its benefits, but they are far outweighed by the drawbacks. And the biggest drawback for me is that XHTML makes your page larger. Purists will always build their page on XHTML, but if you are in doubt about using it or not, don't!
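A minimal sketch of such a stripping step, meant to run as a filter on the release version of a page rather than by hand. The function name `stripHtmlWhitespace` is hypothetical, and the approach is deliberately naive: it assumes the page has no `<pre>` blocks, inline scripts, or conditional comments whose content must be preserved, and it removes the space between adjacent inline tags, which can change rendering.

```javascript
// Naive release-time filter: a sketch under the assumptions stated above,
// not a full HTML minifier.
function stripHtmlWhitespace(html) {
  return html
    .replace(/<!--[\s\S]*?-->/g, "") // drop HTML comments
    .replace(/>\s+</g, "><")          // drop whitespace between adjacent tags
    .replace(/\s+/g, " ")             // collapse any remaining whitespace runs
    .trim();
}

const page = "<div>\n  <!-- note -->\n  <p>Hello   world</p>\n</div>\n";
console.log(stripHtmlWhitespace(page));
```

Comparing the string length before and after on a real page is a quick way to check whether the 5-10% estimate holds for your own markup.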
Tip #3: Keep Cookies Small
Your cookie is sent back to your server every single time the user makes a request for anything. Even with Images, JS, CSS requests or XML-over-HTTP (AJAX) the cookie is sent. A typical HTTP request will have between 500 and 1000 bytes. Now, if you have 4 cookies each with names like "user_name" followed by a value with 30 characters, you are adding 15-30% more bytes to the request.
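The arithmetic can be checked with a rough sketch. The cookie names and the helper names below are made up for illustration, and the count assumes ASCII-only names and values so that one character equals one byte.

```javascript
// Size of the "Cookie: a=1; b=2\r\n" request header line, in bytes
// (assumes ASCII-only names and values).
function cookieHeaderBytes(cookies) {
  const pairs = Object.entries(cookies).map(([name, value]) => `${name}=${value}`);
  return "Cookie: ".length + pairs.join("; ").length + 2; // +2 for CRLF
}

// Extra bytes as a rounded percentage of the request size without cookies.
function overheadPercent(extraBytes, baseRequestBytes) {
  return Math.round((100 * extraBytes) / baseRequestBytes);
}

// Four hypothetical cookies: 9-character names, 30-character values.
const cookies = {
  user_name: "x".repeat(30),
  user_lang: "x".repeat(30),
  user_zone: "x".repeat(30),
  user_type: "x".repeat(30),
};

const extra = cookieHeaderBytes(cookies);
console.log(extra, overheadPercent(extra, 1000), overheadPercent(extra, 500));
```

With the full header line counted, the result lands in the same ballpark as the estimate above, slightly higher for the smallest requests.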
Tip #4: Keep JavaScript Small
Who cares if I'm calling my JavaScript function "start_processing_page()" or "slp()"? The download speed cares and the interpreter cares as well, so, use tiny function and variable names, remove comments and unnecessary code.
Tip #5: Use Public Caching
IMHO, this is one of the most under-used features of HTTP. Big websites use it (usually through a CDN, like Akamai), but the vast majority (I dare to say 99%) don't. All those icons, CSS and JS files can, and should, be cached by the browser (Private Cache), but public caching also allows Proxies in-between to cache them. This reduces the load on your server, freeing more CPU and bandwidth to do the important stuff. Now, a lot of people don't use Public caching (or even Private) because their CSS is changing, the JS has bugs that need to be fixed, etc. Well, you can do 3 things to deal with that. 1) Let content be cached for a short period of time (for example, 24h only). 2) Rename the files every time you make a change to them, so that you can let them be cached permanently, or 3) Implement an HTTP Filter that automatically renames the files if they have changed.
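Options 1 and 2 can be sketched as follows. The helper names are hypothetical, and the small FNV-1a-style hash (computed over UTF-16 code units) is only a stand-in for whatever fingerprint a real build step would use. The `Cache-Control` values are standard HTTP: 86400 seconds is 24 hours, and 31536000 seconds is one year, a common stand-in for "permanently".

```javascript
// Stand-in content fingerprint: FNV-1a-style hash, 8 hex digits.
function fnv1aHex(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // 32-bit multiply by the FNV prime
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Option 2: derive the file name from the content, so any change renames it.
function versionedName(fileName, content) {
  const dot = fileName.lastIndexOf(".");
  const stem = dot === -1 ? fileName : fileName.slice(0, dot);
  const ext = dot === -1 ? "" : fileName.slice(dot);
  return `${stem}.${fnv1aHex(content)}${ext}`;
}

// Option 1 vs. option 2: short public caching for stable names,
// long public caching for names that change with the content.
function cacheHeaders(isVersioned) {
  return {
    "Cache-Control": isVersioned ? "public, max-age=31536000" : "public, max-age=86400",
  };
}

console.log(versionedName("site.css", "body{margin:0}"), cacheHeaders(true));
```

Because the name changes whenever the content does, a browser or proxy holding the old file simply never asks for it again.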
Tip #6: Enable HTTP Compression
Your HTML couldn't be a better candidate for compression. It has a very limited character set and lots of repetitions (count the number of "DIV" on your page). That is why HTTP Compression makes so much sense. It can reduce the download by 70% or more. So, instead of having to send 40KB of data, you are sending just 12KB. The user will thank you.
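A quick way to see the effect is to gzip some repetitive markup with Node's built-in `zlib` module. This is only a sketch: the helper name `gzipSavings` and the sample markup are made up, and artificially repetitive input like this compresses far better than a real page will.

```javascript
const zlib = require("zlib");

// Compare the raw size of a string with its gzip-compressed size.
function gzipSavings(text) {
  const original = Buffer.byteLength(text, "utf8");
  const compressed = zlib.gzipSync(text).length;
  return {
    original,
    compressed,
    savedPercent: Math.round(100 * (1 - compressed / original)),
  };
}

// Hypothetical sample: the same row of markup repeated many times.
const sampleHtml = '<div class="item"><span>row</span></div>\n'.repeat(500);
console.log(gzipSavings(sampleHtml));
```

Running the same function over the saved HTML of one of your own pages gives a more honest number.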
Tip #7: Keep all as much as possible in lower case
This actually works in conjunction with HTTP compression. Remember that this type of compression is lossless, that is, decompressing the content will yield exactly the original, which means that the compression algorithm will treat "DIV", "Div" and "div" as different strings. So, always use lower case for tag names and attributes on the HTML and CSS. Also try to be consistent on your JavaScript.
Tip #8: Avoid Tables
Rendering a table is probably the worst nightmare for a browser. If the browser starts showing the table before all the content inside it is loaded, the browser's rendering engine will have to re-render it many times as each piece is loaded. On the other hand, if the browser needs to wait for everything to be loaded, the user will see a blank page (or partially blank) for a few seconds. Browsers usually use a combination of both to reduce the number of re-renderings without leaving the user hanging in there. The point is, don't make your whole page start with a table. It is preferable to have 3 tables (header, body, footer). Whenever possible, just avoid using tables altogether.
Tip #9: Set image size
This is very similar to the table rendering problem. If you add an IMG tag to the middle of your page and don't set "width" and "height", the browser has to wait for the image to be loaded to decide the final size, but, meanwhile it will cost the browser at least 1 re-rendering because it will not wait for all the images to be loaded to show you the page.
Tip #10: Compact your GIF/JPG
So, your page has several GIFs and/or JPGs? It is very likely that those could be compressed even more without any loss! GIF/PNG mainly have a very compact data structure, but most applications like Corel Photo-Paint and Adobe Photoshop don't optimize it at all. Go to http://download.com and find yourself a good set of tools to compact your image files. You will be surprised to find that one of your GIFs was 900 bytes and, after compacting it, ends up being just 80 bytes.
Tip #11: Reduce the number of external elements
If you see a request graph from Keynote (a site perf monitoring service) you would be shocked at how long it takes to download just a few extra files to render a page, like a few images, a CSS and a JS file. If you did a good job with Tip #5 (using caching), the impact will be smaller. A browser can only request an image file after it has detected it while parsing the HTML. A lot of those file requests are serial. Some browsers limit the number of TCP connections to a single server (usually to 2), thus allowing your page to download only 2 files at a time. If you have 1 page, 1 CSS, 1 JS, and 7 images on your page (10 files), you can imagine that a lot has to happen before everything is loaded. The point here is, try to reduce the number of files (mostly images), and, if the CSS/JS are small enough, embed them into the page.
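A back-of-the-envelope model of that serialization, as a sketch only. It assumes the page itself must arrive first, that every other file takes one "round" of equal length, and that the browser keeps at most `maxConnections` downloads in flight. Real browsers are messier (pipelining, caching, files of different sizes), and the function name is hypothetical.

```javascript
// Simplified model: one round for the HTML page, then the remaining files
// are fetched maxConnections at a time.
function downloadRounds(otherFileCount, maxConnections) {
  return 1 + Math.ceil(otherFileCount / maxConnections);
}

// 1 page + 1 CSS + 1 JS + 7 images, 2 connections per server.
console.log(downloadRounds(9, 2));
// Same page after cutting the images from 7 to 1.
console.log(downloadRounds(3, 2));
```

Under this model, the 10-file page needs 6 rounds and the trimmed page needs 3, which is why removing a handful of images is noticeable.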
Tip #12: Use a single DNS Lookup
This is so overlooked. How many Web Developers think about DNS Lookup when they are building a site? I guarantee you, not many. But even before the browser opens a connection to your server, it needs to do a DNS Lookup to resolve the domain name to an IP address. Now, DNS lookups are one of the fastest things on the Internet, because the protocol is tiny and it is cached everywhere, including the user's computer. But sometimes you see sites making "creative" domain names for the same server. Like all images come from "images.mysite.com", the page is coming from "w3.mysite.com" (after a redirect from "www.mysite.com"), and the streaming video comes from "mms.mysite.com". That is 3 DNS lookups more than necessary.
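One way to audit a page for this is to count the distinct hostnames among the URLs it loads. The sketch below uses the standard `URL` class; the function name and the specific paths are made up, with the hostnames taken from the example above.

```javascript
// Collect the distinct hostnames used by a list of absolute URLs.
// Each distinct hostname costs the browser (at most) one DNS lookup.
function distinctHostnames(urls) {
  return new Set(urls.map((u) => new URL(u).hostname));
}

// Hypothetical request list for the example site.
const requests = [
  "http://www.mysite.com/",
  "http://w3.mysite.com/index.html",
  "http://images.mysite.com/logo.gif",
  "http://images.mysite.com/banner.gif",
  "http://mms.mysite.com/intro.wmv",
];

const hosts = distinctHostnames(requests);
console.log(hosts.size, "hostnames,", hosts.size - 1, "more lookups than a single-host site");
```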
Tip #13: Delay Script Starts
If you have a process that renders 100 images per second using 100% CPU, and you add another process doing the same thing, the performance will be less than 100 images per second (less than 50 per process). That is because now the OS has to manage context switches. The same thing applies to the scripts on your page. If the browser is still loading and processing a few images or CSS and you fire a script, it will take longer for that script to execute than if you had waited for the page to be completely loaded. Actually, it gets a little bit more complicated. The browser fires the "onload" event for the page once it has all the elements necessary to render the page, not after the page has really been rendered (there is no "onrendercomplete" event). This means that even after the onload event, the CPU is still being used by the browser to render the page. What I usually do in situations like this is to add two indirections. First, attach a script to the onload event to invoke a function; second, have that function create a timer event that fires a second or so later and does the real initialization. In other words:
<body onload="setTimeout(init, 1000);">
Tip #14: Watch for Memory Leak
The biggest problem with a browser memory leak is that it doesn't affect only the page that created the leak, it affects every single page from any site after that. Internet Explorer is notorious for its massive memory leaks (because of poor JavaScript). There are a few tools on the Internet to find out if your script is causing a memory leak and where. The easiest test is to load your page 100 times and watch PerfMon to see if the Working Set is growing or not. The simplest thing that you should do is to unbind every event that you bound (dynamically), and to release every reference possible (this also helps the JavaScript garbage collector to be faster).
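One way to make the unbinding systematic is to route every dynamic binding through a small registry and release everything in one call, for example from the page's unload handler. This is a sketch with hypothetical names (`createListenerRegistry`, `bind`, `releaseAll`). It assumes targets expose the standard `addEventListener`/`removeEventListener` pair, which old versions of Internet Explorer did not (they used `attachEvent`/`detachEvent`).

```javascript
// Tracks every listener bound through it so all of them can be removed,
// and all references dropped, in a single call.
function createListenerRegistry() {
  const bound = [];
  return {
    bind(target, type, handler) {
      target.addEventListener(type, handler);
      bound.push({ target, type, handler });
    },
    releaseAll() {
      while (bound.length > 0) {
        const { target, type, handler } = bound.pop();
        target.removeEventListener(type, handler);
      }
    },
    count() {
      return bound.length;
    },
  };
}
```

In a page, usage might look like `registry.bind(button, "click", onClick)` during initialization and `registry.releaseAll()` when the page is torn down.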
If you have no clue what I talked about in one of the topics above, either you really don't need to know about it, or, you should immediately go buy some books, and I recommend all books by O'Reilly, like:
- Dynamic HTML: The Definitive Reference - by Danny Goodman
- JavaScript & DHTML Cookbook - by Danny Goodman
- HTTP: The Definitive Guide - by David Gourley & Brian Totty
- Web Caching - by Duane Wessels
A well coded XHTML page with one CSS file takes far less code than the same design built using font tags and nested tables on each page.
A badly coded XHTML page (defining a separate class for each element, for example) can be a larger filesize, but that's down to developer incompetence rather than the coding language used.
Posted by: Nice Paul | January 03, 2006 at 05:56 AM
I wish folks at the company I work at could all read this. I think I'll pass this around. Thanks for the info. I find this extremely useful!
Posted by: michael | January 03, 2006 at 06:22 AM
he is partly right on xhtml:
http://www.hixie.ch/advocacy/xhtml
also use hardware load balancing if you can, products by Foundry Networks are good.
Posted by: unknown comic | January 03, 2006 at 06:27 AM
"do not use quotes on attributes unless necessary"
"Don't use XHTML"
utter, utter bollocks.
I bet you're the sort of person that still uses MS Frontpage
Posted by: unimpressed | January 03, 2006 at 06:27 AM
RE: b100dian
I'd take issue with a lot of what *you* have said...
Let's get down to fundamentals:
Kilobits are not the same as Kilobytes. So 35KB of data would not take 1 second to download on a 35 Kbps connection.
I think this article and the comments point out a common problem with HTML and web developers: no one wants to agree on standards. 'I want to use uppercase tags and I want to use html, I want to use XHTML with uppercase tags'. It's worth bearing in mind that if one develops using standards then their site will be accessible to more people, for example someone reading a web page on a PSP or mobile phone.
Posted by: Ed | January 03, 2006 at 06:36 AM
Don't use XHTML. Don't use spaces and tabs. Ha ha ha.
Ha ha ha.
Ha!
Posted by: hahaha | January 03, 2006 at 06:49 AM
utter tripe
Posted by: anon | January 03, 2006 at 07:04 AM
Speaking as a web professional, who agrees with all of the criticism of this article -- because the critics are absolutely right -- I think the author of the article should do the responsible and right thing, and either take this post down, or better yet include a foreword or afterword in it to acknowledge the article's mistakes.
The author is doing a major disservice to the industry, and to his or herself, by leaving the post as-is.
Posted by: Zach | January 03, 2006 at 07:11 AM
No one should ever use anything other than XHTML on newer pages!!!! We do have to get our standard, or web "development" will always be a pain in the a (and we are where we are because of ALL THE SEMI-PROs who think standards are for wumpuses - and because of companies like Microsoft).
Btw, uppercase in tags and such is forbidden in XHTML.
The best validator (not as broken as the one from the W3C):
http://www.validome.org
Posted by: John Moo | January 03, 2006 at 07:50 AM
Don't use XHTML/standards?
IMHO any speed benefits gained from using noncompliant code will only benefit the lucky users whose browsers happen to interpret your non-standard pages in the way you hope.
Good work getting this debate going :)
Posted by: Carl Hubbers | January 03, 2006 at 08:13 AM
I think the overall idea behind putting a list like this together is to get the developer into an optimizing mindset. These days people take bandwidth for granted but you have to remember that over 70% of Internet users are still using dialup. I agree with some of the tips listed and I disagree with others, however I think the author is taking an important first step in emphasizing the importance of optimization. Just like when you code a standalone application, yes you can skip the optimization step and everything will work but you'll end up with a bloated and slower application in the long run. Just because we have the bandwidth doesn't mean we should not optimize. Also remember that your web page is not the only one a user will look at at one time. I normally have 4 or 5 browser tabs open at one time and 2 or 3 other applications running in the background and I'm sure that's light compared to other people. With how Windows is, optimization is key. Maybe not everything that was mentioned here but like I said, this is a good first step in the right direction.
Posted by: Roy | January 03, 2006 at 08:26 AM
I'm pretty much on the side of Sarah, and I would think most professional web developers would follow.
As stated above, some of the points seem badly researched, especially 11+12:
One of the reasons to use different subdomains for images and so forth is to get around the HTTP connection limit. Also, one of the ideas of CSS is that the CSS files are cached, and that reverses the problem: less data is loaded with each page request. Take away as much whitespace as you want, I'm pretty sure that an external CSS structure is, in the vast majority of cases, much more effective.
As I'm not really sure: Is this to be taken as a serious article, or was that more a case of "let's rile up the developers?"
Posted by: Matthias | January 03, 2006 at 08:31 AM
Hmmmm. It would seem that you don't really know anything about professional web-development.
Use XHTML. Use well-written JavaScript with meaningful variable and function names (in .js files, not in-line). Use CSS (and use it well -- learn how to use float, margins etc -- again in external .css files). Do not use tables for layout -- not ever. Use Photoshop's Save For Web option to create your GIF/JPEG/PNG files.
We aren't living in the 90's anymore.
Posted by: Adrian | January 03, 2006 at 09:05 AM
IN RESPONSE TO ALL THE FLAME:
=============================
I'm sticking to what I said above. This is my opinion and I'll recommend it to anyone.
Specifics:
- HTML vs XHTML: For those that don't know, HTML *is* a standard.
- If you think stripping spaces (HTML,CSS,JS) is a maintenance nightmare, you are right. However who said you need to do it by hand? Write a filter to strip out the spaces on the Release version.
- These are not tips for your 10-page website. Nor to your 1000-page website. These are tips for people building massive websites (a la Amazon, Google, MSN, eBay).
All the tips above worked very well for me, and I learned them by trial and error. You should do your own research and find what works for you.
Cheers,
Marcelo
Posted by: Marcelo Calbucci | January 03, 2006 at 09:30 AM
I must disagree with your Javascript comment. If you can do things such as form validation and sorting on the client side and save a trip to the server, your application will perform much faster/better and your server will thank you for distributing the work load.
Posted by: Matthew Price | January 03, 2006 at 10:37 AM
I know this blog software is based on TypePad and you might not have much control over the code generated. Nevertheless, viewing the source shows that the code does not follow your own tips/guidelines. Some optimization is good but should not be abused, there are always trade-offs.
Posted by: Son Nguyen | January 03, 2006 at 10:46 AM
Funny comments--makes this gloomy day in SF a good one =)
Happy New Year!
Posted by: Sherwin Techico | January 03, 2006 at 11:25 AM
So you're trying to tell me (a comp.sci grad) that it's hard for a computer to parse a well-known and structured XML document, and that HTML tag-soup is faster?
Posted by: Adam | January 03, 2006 at 01:13 PM
Adam: parsing HTML/tag soup almost certainly /is/ faster. The browser doesn't have to check each element for well-formedness, and HTML parsing has been around for much longer and hence is much better optimised than XHTML parsing (hardly anyone serves valid XHTML so there hasn't been nearly as much effort put in to optimising its performance). I've also heard that Firefox in XHTML mode won't start rendering the document until the entire XHTML page has been loaded, but I'm not 100% sure if this is true.
There are a lot of very ill-informed comments attached to this entry. HTML 4.01 is just as much of a "web standard" as XHTML 1.0 - and it will be supported by browsers for decades to come. Think about it: if a browser dropped HTML 4 support it wouldn't be able to render 99% of the web, and would be utterly useless.
I have no idea where the idea that "Ajax only works with XHTML" comes from, but it's not even remotely true.
Posted by: Simon Willison | January 03, 2006 at 02:53 PM
Umm ... to add to all the criticism ... it appears that your own company site (which does not use Typepad and therefore theoretically should be totally under your control) does not follow your own recommendations.
http://www.sampa.com/
Most of which, I agree with many of the other posters, are not great ideas at all.
Just one that I'll point out: embedding your CSS file in each page makes the download bigger, not smaller, since every single page must now contain all the CSS ... whereas if it's linked to, it only needs to be called once, and then is locally cached in the browser.
I notice that your company's site must not be done by you but a smarter web developer .... it calls an external CSS file.
;-)
Posted by: John Koetsier | January 03, 2006 at 03:01 PM
great list. though i do not agree with the xhtml point. xhtml makes code more readable and clean. as a programmer, I am willing to sacrifice a couple of bytes for that.
Posted by: miscblogger | January 03, 2006 at 03:30 PM
one thing that no one has mentioned is to question the whole premise of this post: that transmission and rendering time are the big problems... in many cases it isn't so!
In many cases the delay between request and render is due to server load. If you look at CNN or Google's markup (apparently the site scale this post is aimed at) they don't strip whitespace, they use external CSS, they use different domains etc.
The magic Google and these sites work is in using clustered databases, load-balance server farms, Akamai to reduce network latencies etc etc etc.
My thoughts on faster-loading pages:
1) Use the XHTML/external CSS/external JS approach
2) use gzip compression
3) make sure any dynamic server-side code is decently fast
anything else is marginal marginal marginal
Posted by: nick | January 03, 2006 at 03:48 PM
There's absolutely nothing to stop you from writing HTML 4.01 code that is just as readable and clean as the equivalent XHTML.
Posted by: Simon Willison | January 03, 2006 at 04:02 PM
"parsing HTML/tag soup almost certainly /is/ faster."
No, it absolutely is not. Valid html could be parsed with an SGML parser, which is slower than an XML parser that would be used on xhtml. But no browsers do this because there is so little valid html out there, so they have to use a slow, complicated parsing engine to decipher the mess that most "html" pages are. XML parsing is WAY faster, like 10x faster.
"I've also heard that Firefox in XHTML mode won't start rendering the document until the entire XHTML page has been loaded, but I'm not 100% sure if this is true."
Of course it's true, and not just for firefox. You can't tell if an XML document is well-formed until you have the whole thing. So all browsers rendering xhtml (properly, with the right mime type set) will behave like this.
koenkai, you have no clue what you are talking about. First you say that using xhtml is good, then you say you like using all caps tags. Xhtml requires all tags to be lower case. Xhtml has nothing to do with Ajax, which predates xhtml in the first place, and predates its new and silly acronym. And it doesn't eliminate redundant calls to the server, it just requests less data with each call. 35KB of data will take NINE SECONDS to download on a 33.6K modem connection. That alone is enough time for people to give up on your bloated page and go somewhere else.
Marcelo, you are even more clueless. Your advice is not for sites like google and amazon, they aren't stupid enough to do this crap. You have obviously never worked on a large site, but if you at least look at the source of such sites, you will see they don't follow your terrible advice.
Your points contradict each other (if you are using output compression on the webserver, then stripping whitespace and comments and such gets you nothing), or are completely wrong (javascript and css belong in external files, so they are only requested on the first page view, then cached browser-side) or show your lack of understanding (using multiple subdomains is for spreading the requests for different kinds of content to different servers). At least you got a couple things right (that everyone else has known since 1994).
Ralesk, yes you need to put in image dimensions. The browser doesn't know how big the image is till it starts getting the image, but it's already started rendering the page by then. So it renders the page not knowing how much space to leave for your images, then as it gets the images it keeps re-rendering the page with the right size for each image.
And to all the completely clueless people shouting nonsense like "use standards/xhtml", HTML is a standard. You bandwagon jumping dufuses that insist on pushing xhtml, but use tag soup with an xhtml doctype on your own pages are really annoying. If you have no clue about something, DON'T ADVOCATE IT. All you do is make people think that xhtml is empty hype with no value when you spew nonsense and show that you don't even understand xhtml yourself.
Posted by: Jerry | January 03, 2006 at 05:59 PM
"Marcelo, you are even more clueless. [...] You have obviously never worked on a large site"
You don't count MSN as a large site?
I stand by my statement: modern browsers probably render XHTML slower than HTML, so switching to XHTML because it is faster for browsers to parse is currently a waste of time.
Anyone know of any benchmarks that support or disprove this point?
Posted by: Simon Willison | January 04, 2006 at 02:28 AM