There is a ton of research on user behavior based on task response time and I'm sure if you search Jakob Nielsen's site you'll find something.
I remember an IBM study from more than 15 years ago which found that if the computer took 1 second to respond, the user took about 1 second to start the next step; but if the computer took 10 seconds to respond, the user took 20 seconds to start the next step of the task. The problem compounds because users get more and more distracted the longer it takes for a command to respond.
Web sites are no different. Google proved with its super fast pages (part real, part perception) that users will perform many more searches if they know they don't have to wait long. If each query on Google took 20 seconds to respond, you would likely spend more time thinking about which search terms to use. But since it takes just 3 seconds, you just keep adding and removing search terms.
So, during the past few weeks I've been collecting tips for web developers on how to improve their page speed. These tips affect latency, bandwidth, rendering, and/or the perception of when a page is ready. They are in no particular order.
Tip #1: Strip spaces, tabs, CR/LF from the HTML
I'm always surprised when I look at the HTML of some large websites and find a ton of unnecessary spaces, tabs, new-lines, HTML comments, etc. Just removing those elements can reduce the page size by 5-10%, which in turn can decrease the download latency. I'll go one step further and suggest not quoting attribute values unless necessary.
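As an illustration, a naive whitespace stripper might look like the sketch below. This is a made-up helper intended for a build/publish step, not for your editable source; note it is deliberately simplistic and would mangle whitespace-sensitive content such as pre and textarea elements, which a real tool must skip.

```javascript
// Naive HTML minifier sketch: drop comments, then collapse whitespace
// between tags and indentation. Do NOT run this over <pre>/<textarea>.
function stripHtml(html) {
  return html
    .replace(/<!--[\s\S]*?-->/g, '')      // remove HTML comments
    .replace(/>\s+</g, '><')              // remove whitespace between tags
    .replace(/[ \t]*[\r\n]+[ \t]*/g, ''); // remove line breaks + indentation
}
```

In practice you would keep your readable source files and apply something like this only when publishing.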
Tip #2: Don't use XHTML
This is very controversial. A lot of people will call me crazy, but I see XHTML as a loser technology. It has its benefits, but they are far outweighed by the drawbacks. And the biggest drawback for me is that XHTML makes your page larger. Purists will always build their pages in XHTML, but if you are in doubt about using it or not, don't!
Tip #3: Keep Cookies Small
Your cookies are sent back to your server every single time the user makes a request for anything. Even for image, JS, CSS, or XML-over-HTTP (AJAX) requests, the cookies are sent. A typical HTTP request is between 500-1000 bytes. Now, if you have 4 cookies, each with a name like "user_name" followed by a value of 30 characters, you are adding 15-30% more bytes to every request.
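To make the overhead concrete, here is a small sketch (the cookie names and values are invented) that serializes the same data into a Cookie request header both verbosely and tersely, so the sizes can be compared:

```javascript
// Hypothetical cookies; the verbose names roughly match the example above.
var verbose = {
  user_name: 'marcelo',
  user_theme: 'dark_blue_large',
  user_lang: 'en_us',
  user_tz: 'america_sao_paulo'
};
var terse = { n: 'marcelo', t: 'dbl', l: 'en', z: 'sp' };

// Serialize a cookie map the way the browser sends it on every request.
function cookieHeader(cookies) {
  var pairs = [];
  for (var name in cookies) pairs.push(name + '=' + cookies[name]);
  return 'Cookie: ' + pairs.join('; ');
}
```

Remember the difference in header size is paid again on every image, CSS, JS and AJAX request, not just the page itself.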
Tip #4: Keep JavaScript Small
Who cares whether I call my JavaScript function "start_processing_page()" or "slp()"? The download speed cares, and the interpreter cares as well. So use tiny function and variable names, remove comments, and remove unnecessary code.
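For example (the function and its callback are invented for illustration), the two versions below behave identically; only the shipped byte count differs:

```javascript
// Readable version, for the source you maintain:
function start_processing_page(element_list, process_element) {
  for (var index = 0; index < element_list.length; index++) {
    process_element(element_list[index]);
  }
}

// Hand-minified equivalent, for what you actually serve:
function slp(l, p) { for (var i = 0; i < l.length; i++) p(l[i]); }
```

In practice you would keep the readable source and run a minifier at publish time rather than writing the second version by hand.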
Tip #5: Use Public Caching
IMHO, this is one of the most under-used features of HTTP. Big websites use it (usually through a CDN, like Akamai), but the vast majority (I dare say 99%) don't. All those icons, CSS and JS files can, and should, be cached by the browser (private cache), but public caching also allows proxies in between to cache them. This reduces the load on your server, freeing CPU and bandwidth for the important stuff. Now, a lot of people don't use public caching (or even private) because their CSS is changing, the JS has bugs that need to be fixed, etc. Well, you can do 3 things to deal with that: 1) let content be cached for a short period of time (for example, 24h only); 2) rename the files every time you make a change, so they can be cached permanently; or 3) implement an HTTP filter that automatically renames a file when it has changed.
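As a sketch, the response headers for a publicly cacheable stylesheet might look like this (the filename and values are just examples; max-age is in seconds, so 86400 is the 24h option above):

```
HTTP/1.1 200 OK
Content-Type: text/css
Cache-Control: public, max-age=86400
```

If the filename carries a version instead (say, site_20060103.css, a made-up name), you could cache it for much longer and simply rename the file on every change.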
Tip #6: Enable HTTP Compression
Your HTML couldn't be a better candidate for compression. It has a very limited character set and lots of repetition (count the number of "DIV"s on your page). That is why HTTP compression makes so much sense: it can reduce the download by 70% or more. So, instead of having to send 40KB of data, you send just 15KB. The user will thank you.
Tip #7: Keep as much as possible in lower case
This actually works in conjunction with HTTP compression. Remember that this type of compression is lossless, that is, decompressing the content yields the exact original, which means the compression algorithm must treat "DIV", "Div" and "div" as different strings. So, always use lower case for tag names and attributes in your HTML and CSS. Also try to be consistent in your JavaScript.
Tip #8: Avoid Tables
Rendering a table is probably a browser's worst nightmare. If the browser starts showing the table before all of its content is loaded, the rendering engine has to re-render it many times as each piece arrives. On the other hand, if the browser waits for everything to load, the user will see a blank page (or a partially blank one) for a few seconds. Browsers usually use a combination of both to reduce the number of re-renderings without leaving the user hanging. The point is, don't wrap your whole page in a single table. It is preferable to have 3 tables (header, body, footer). Whenever possible, just avoid tables altogether.
Tip #9: Set image size
This is very similar to the table rendering problem. If you add an IMG tag in the middle of your page and don't set "width" and "height", the browser has to wait for the image to load to determine its final size. Meanwhile, it will cost the browser at least 1 re-rendering, because it will not wait for all the images to load before showing you the page.
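So give the browser the dimensions up front (the filename and sizes here are made up):

```html
<!-- The browser can reserve a 120x40 box immediately, before the bytes
     arrive, so no re-rendering is needed when the image loads. -->
<img src="logo.gif" width="120" height="40" alt="Company logo">
```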
Tip #10: Compact your GIF/JPG
So, your page has several GIFs and/or JPGs? It is very likely that they could be compressed even more without any loss! GIF/PNG already have a very compact data structure, but most applications, like Corel Photo-Paint and Adobe Photoshop, don't fully optimize it. Go to http://download.com and find yourself a good set of tools to compact your image files. You will be surprised when one of your GIFs that was 900 bytes ends up being just 80 bytes after compacting.
Tip #11: Reduce the number of external elements
If you look at a request waterfall from Keynote (a site performance monitoring service), you would be shocked at how long it takes to download just a few extra files to render a page: a few images, a CSS file and a JS file. If you did a good job with Tip #5 (using caching), the impact will be smaller. A browser can only request an image file after discovering it while parsing the HTML, so a lot of those file requests are serial. Some browsers also limit the number of TCP connections to a single server (usually to 2), allowing your page to download only 2 files at a time. If you have 1 page, 1 CSS, 1 JS, and 7 images (10 files), you can imagine that a lot has to happen before everything is loaded. The point here is: try to reduce the number of files (mostly images), and, if the CSS/JS are small enough, embed them into the page.
Tip #12: Use a single DNS Lookup
This is so overlooked. How many web developers think about DNS lookups when they are building a site? I guarantee you, not many. But even before the browser opens a connection to your server, it needs to do a DNS lookup to resolve the domain name to an IP address. Now, a DNS lookup is one of the fastest things on the Internet, because the protocol is tiny and it is cached everywhere, including on the user's computer. But sometimes you see sites inventing "creative" domain names for the same server: all images come from "images.mysite.com", the page comes from "w3.mysite.com" (after a redirect from "www.mysite.com"), and the streaming video comes from "mms.mysite.com". That is 3 DNS lookups more than necessary.
Tip #13: Delay Script Starts
If you have a process that renders 100 images per second using 100% CPU, and you add another process doing the same thing, the combined throughput will be less than 100 images per second (less than 50 per process), because now the OS has to manage context switches. The same thing applies to the scripts on your page. If the browser is still loading and processing a few images or CSS files and you fire a script, that script will take longer to execute than if you had waited for the page to be completely loaded. Actually, it gets a little more complicated: the browser fires the "onload" event for the page once it has all the elements necessary to render it, not after the page has actually been rendered (there is no "onrendercomplete" event). This means that even after onload, the browser is still using CPU to render the page. What I usually do in situations like this is add two indirections: first, attach a script to the onload event that invokes a function, which in turn sets a timer to do the real initialization a moment later. In other words:
<body onload="setTimeout(init, 1000);">
Tip #14: Watch for Memory Leaks
The biggest problem with browser memory leaks is that they don't affect only the page that created the leak; they affect every single page from any site visited afterwards. Internet Explorer is notorious for massive memory leaks (usually triggered by poor JavaScript). There are a few tools on the Internet to find out whether your script is leaking memory and where. The easiest test is to load your page 100 times and watch PerfMon to see whether the Working Set keeps growing. The simplest things you should do are to unbind every event that you bound to dynamically, and to release every reference possible (this also helps the JavaScript garbage collector run faster).
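A sketch of that unbind-and-release discipline (the element id and handler are invented; the classic IE leak is a DOM node and a JavaScript closure holding references to each other):

```html
<button id="save">Save</button>
<script type="text/javascript">
  var btn = document.getElementById("save");
  btn.onclick = function () { /* ... do work ... */ };  // DOM -> closure

  window.onunload = function () {
    btn.onclick = null;  // unbind the handler
    btn = null;          // release the DOM reference so it can be collected
  };
</script>
```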
If you have no clue what I was talking about in one of the topics above, either you really don't need to know about it, or you should immediately go buy some books. I recommend the O'Reilly books, like:
- Dynamic HTML: The Definitive Reference - by Danny Goodman
- JavaScript & DHTML Cookbook - by Danny Goodman
- HTTP: The Definitive Guide - by David Gourley & Brian Totty
- Web Caching - by Duane Wessels
XHTML can actually be used to dramatically speed up a site, if you use it right, but what I am about to describe may be considered "cheating".
Since XHTML is HTML implemented in XML, you can use AJAX techniques with it. You don't even need to learn that much AJAX: there's a super-simplified subset of AJAX techniques called AHAH (Asynchronous HTML and HTTP) that lets you update only the parts of the page that change, by substituting chunks of XHTML into the page; that's a whole lot less to transfer than an entire page. On top of that, since with AHAH you're working with XHTML, there's no XML translation to bother with; don't even bother learning XML stylesheet transformations: the stuff being added to your pages is just XHTML, which needs no translation to render in a browser. Just use some scripting to change the parts of the page that need to be updated from the server, and provide permalinks on the page as necessary. This alone will save you more bandwidth than almost all the techniques you listed combined, other than public caching and optimizing your PNGs and GIFs.
Basically, transmitting only the differences between the pages rather than whole pages is the best way to make content load fast on a site. Everything else is marginal compared to this.
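A minimal sketch of the pattern described above (the URL and element id are hypothetical, and this glosses over error handling and the ActiveX fallback older IE versions need):

```html
<div id="content">initial content</div>
<script type="text/javascript">
  function ahah(url, targetId) {
    var req = new XMLHttpRequest();
    req.onreadystatechange = function () {
      if (req.readyState === 4 && req.status === 200) {
        // Replace just this fragment instead of reloading the whole page.
        document.getElementById(targetId).innerHTML = req.responseText;
      }
    };
    req.open("GET", url, true);
    req.send(null);
  }
  ahah("/fragments/latest.html", "content");
</script>
```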
(I learned about AHAH at: http://microformats.org/wiki/rest/ahah)
Posted by: Berkana | January 02, 2006 at 11:22 PM
I find that the benefits of using XHTML far outweigh those of just doing HTML. There is no reason you cannot write HTML that is perfectly fine XHTML minus the correct declarations. Your main argument was that XHTML makes your pages bigger. Well, maybe by a couple of bytes. An extra couple of quotation marks and some proper structure far outweigh the size increase. I do see your point about increased size, but browsers will have an easier time rendering good code compared to bad code. XHTML also supports growing technologies, and legacy, rather deprecated HTML will only find itself in less and less use. Also, as the web grows (at its extreme rate) we soon won't have to worry at all about squeezing bits and bytes out of our HTML files; more likely we'll be squeezing those bytes out of our MPG and AVI files.
Posted by: Dustin | January 02, 2006 at 11:30 PM
On point #1, you mention getting rid of all spaces, CR/LFs and quotes (where unnecessary). However, I would propose that it would be unwise to go too crazy with this. Well-written, readable but slightly larger code is more valuable than illegible but tiny code. A point worth considering.
Posted by: Josh | January 03, 2006 at 02:29 AM
Eight of your fourteen directives result in maintenance nightmares. A site utilizing all eight of those might load fractionally faster, but no web dev with a life would offer to maintain it.
On the other hand, proper publication of images and avoidance of tables by themselves will account for most of the speed advantage, most of the time.
I'm grateful that someone pointed out DNS lookups and the hazards of requesting sibling media files from multiple/external networks. Poorly implemented sites will fail to render during network hiccups as a result (which to me is the greater hazard).
Posted by: ben | January 03, 2006 at 02:32 AM
what a joker.
I hope you don't tell clients this BS... no wait I hope you do. Some I can fix it for them and get paid.
don't use XHTML... WAKE UP!! it's not 1999 any more you old fart.
You need to catch up to the rest of use because your going to go out of business in the next 3 years.
Posted by: Chris | January 03, 2006 at 02:33 AM
I'd argue that there's no reason to use GIF anymore. The only possible advantage GIF has over PNG is animation, which has no place on modern pages.
And I'm guessing these image compacting tools don't do anything other than index the colours. Reducing the amount of colours in a PNG to what is actually in the image, or even 255, should reduce the size dramatically. Only when PNG is using the same amount of colours is it comparable to GIF.
Posted by: Dean | January 03, 2006 at 02:46 AM
In response:
(1) Strip whitespace.
Bull. See #6 - they're compressed to nearly nothing anyway. And they make your markup that much harder to read in the long run, so the moment you need to make changes to it, guess what - you're spending far longer making the changes because you can't read your own markup.
(2) Don't use XHTML.
I call bull here as well. This falls back into the same category as #6. It also enforces rules on how you write your markup, making it possible for the end user's system to parse the page far more easily, for you to make changes to the markup on the fly, and for alternate methods of transporting the data to be used in the future.
(3) Keep cookies small.
Fair enough - better yet, though, keep data such as the example above on the server. Network security dictates that you don't give the client machine access to data that they don't need - keep the data in a session and only send a session ID.
(4) Keep JS small.
Um. See caching here, and while the interpreter might care and take nanoseconds longer to parse the script, nanoseconds of time isn't worth the time spent actually being able to find your way through your script again.
(5) Caching.
Good idea, but it really only helps if you've got a static site. If the CSS and scripts and images aren't being dynamically generated and aren't being cached, then you've done something wrong.
(6) Enable HTTP compression.
No disputes on this count - when dealing with uncompressed data. Otherwise it's just wasted time on both ends of the connection.
(7) Lower-case HTML, CSS, JS
While it works in conjunction with compression, unless you're using wacky caps everywhere, the difference is minimal.
(8) Avoid tables.
Any purist could tell you that. Of course, that's because "table" implies "tabular data", such as a parts catalog, or a long-format directory listing, etc. Tables got shoehorned into presentation because they were the only thing available for doing any sort of complex layout - that's what CSS is for.
(9) Set image size.
Yup. No disputes.
(10) Compact images.
Might help. If the images are only used once. Once they've been loaded - see caching, above - they don't need to be fetched again.
(11) Reduce the number of external elements.
Which is fair enough - but you lose all benefit of caching if you embed scripts & CSS into the pages, unless they're specific to that one page and you don't intend it to be viewed repeatedly. That's why inline CSS is a bad idea as well as embedded scripts.
(12) Use a single DNS lookup.
Mm, it can help the initial request. It's also a standard practice that I tend to follow - but there are plenty of reasons to avoid following it, one of them being that you have a server spewing up static content aside from the main server, for example.
(13) Delay script starts.
This doesn't do anything for the page load except makes it wait to finish. If you've got an onload handler, which is doing something essential to the usage of the page, making the stuff that happens onload wait up for a setTimeout() to take place is just making the page that much slower.
(14) Watch for memory leaks.
If you're doing anything with complex scripting, you should already be doing this.
In practice, most of the tips above don't really mean much for most developers, because they have various things to do with the markup and other components after they've written the initial page - whether it's maintenance, adding features, fixing bugs, what have you - and readable XHTML, readable CSS, and readable JavaScript go a long way towards maintaining a site.
Posted by: Sarah | January 03, 2006 at 02:53 AM
Yea great tips, maybe the author should use them as his code is filled with un-needed spaces and tabs:)
Posted by: Lsv | January 03, 2006 at 02:56 AM
"You need to catch up to the rest of use because your going to go out of business in the next 3 years."
Chris ... you have no idea what you are talking about do you?
I admit that the few extra bytes of data that are required for an XHTML page are no longer a problem for most people with a DSL connection.
But XHTML does have 0 benefits over HTML. In fact, most people don't even use XHTML properly. That is because Internet Explorer is unable to parse it. XHTML is an XML-based language and thus it requires the application/xml+xhmtl MIME type (try that for fun in IE). So you'll end up sending it as text/html ... having 0 of the 'benefits' of XHTML (the browser just builds an HTML DOM). And more notably, you've spent time on something completely and utterly useless. Read up on some HTML vs. XHTML discussions (there are plenty around the web) before you carelessly shout that one is better than the other.
Posted by: Joël Kuiper | January 03, 2006 at 03:05 AM
Good ideas here, but it would be even more useful if combined with links to online resources explaining how to implement these changes.
Posted by: Sean McManus | January 03, 2006 at 03:37 AM
Tip #2: Don't use XHMTL
Oh my god :-)
You want a clean and small page?
Use Webstandards! => XHTML and CSS!
But use it correct! Keep in mind XHTML is a MARKUP language not a formatting language! Separate content (XHTML) from design (CSS)!
Currently I am working on my new blog; in a few weeks I will show you there how you should use XHTML ;-)
Posted by: Flexchamp | January 03, 2006 at 03:47 AM
About point 6: data compression is not always a good idea, since it requires more CPU load to compress the page. If CPU is your bottleneck, it can even decrease the performance of your website.
Posted by: Sjors Pals | January 03, 2006 at 04:12 AM
If i'm wrong and XHTML is not the future why are Microsoft adding it to IE7... that's right.
so maybe you should read up before you go shoting your mouth off.
HTML is dead and in 3 years we can start to weed out people like you and the author and only try coders will be left.
now go back to your WYSIWYG editor... amature.
Posted by: Chris | January 03, 2006 at 04:24 AM
and yes I am now aware I made some spelling mistakes some don't try and be clever and point them out.
Posted by: Chris | January 03, 2006 at 04:26 AM
Someone may correct me here, but these days browsers render in different modes for different HTML DTDs - if you use XHTML markup, it will render in "strict" mode, and if you use older versions of HTML, it will render in "quirks" mode. If I'm not mistaken, well-formed XHTML will render quicker in strict mode than HTML will in quirks mode. Not to mention that tables and font tags can be replaced for the most part with CSS, saving code and speeding up rendering. The flip side is that unless you're very careful about validating your code (with the W3C's validator or some other tool), it hardly matters anyway, because XHTML that isn't well-formed will still get rendered in quirks mode.
That said, keeping your images compressed and making sure to include size attributes is terrific advice, keeping cookies small is terrific advice, because those are one-time things that you can set and forget. But anybody who has built and maintained more than a couple of web sites (as opposed to web *pages*) will find the usefulness of most of these tips, especially things like taking out all the spaces and linefeeds, are far outweighed by the hassle of maintaining the code. A few extra milliseconds, or even a few extra seconds, of download or rendering time vs. a few extra hours a week maintaining "optimized" code is not even a consideration. Show this list to your project manager, and he'll chuckle indulgently and tell you to stop browsing and get back to work.
Posted by: Ryland | January 03, 2006 at 04:33 AM
Some of these tips are so idiotic and against the idea of the WWW (see http://www.w3.org/Provider/Style/URI for example). It's like someone from Microsoft had written them (smart people with no contextual awareness): just make it work fast now and don't think big. And by no means think about reliability.
Posted by: N Any | January 03, 2006 at 04:44 AM
>it requires the application/xml+xhmtl MIME
This is incorrect. From the XHTML 1.0 spec, sending as text/html is allowed:
http://www.w3.org/TR/xhtml1/#media
It's amusing seeing all the ignorant people claim otherwise without ever having looked it up themselves.
Posted by: anon | January 03, 2006 at 04:45 AM
Nice tips
Posted by: Lloyd | January 03, 2006 at 04:50 AM
Wow, not very bright.
Putting all HTML on one line, yes, nice for maintenance in the future, and how does that affect anything if you have gzip compression on?
Dude... don't write about things you know nothing about.
Posted by: Eric | January 03, 2006 at 05:09 AM
It seems to me that this very page doesn't follow your own advice..
Posted by: b100dian | January 03, 2006 at 05:12 AM
So much in this blog entry is simply wrong that I'd wonder when was the last time you developed for the web... just for the record, I'm a professional web designer/developer and have been developing/creating/administering sites since '94. So with that said...
I'd take issue with a lot of what's been proposed here... first, we live in the age of broadband, so the whole point of the post is somewhat irrelevant. Yes, I remember the age of 300 baud, but we're way beyond that. A few K of savings means zero to most folks out there (even a 33.6Kbps download means we're talking at most one second using the author's example of 35K in savings). And the cost in terms of readability, etc. can be huge (hearing from a former Microsoftee that HTML should be clean is kind of amusing, seeing how well MS products created bloated HTML).
Tip #1
Next, leaving out CR/LF as well as spacing is insane if you ever want to have HTML that is remotely readable.
Tip #2
Don't use XHTML? Um, that's about as inane a proposition as I've heard. AJAX alone is reason enough to do this (if you're really concerned about server performance and would like to eliminate redundant calls to the server).
Tip #4:
Keeping JS small--well, that makes sense...but I'd suggest writing good JS rather than worrying about the ridiculously tiny amount of savings shortening your function names would give you...
Tip #7:
Keep everything in lowercase? Well, granted that could help on the compression, but personally I like all my HTML tags to be in uppercase...again, due to readability issues...
Tip #8:
This is simply incorrect. Browsers (at least today's rendering engines) do not wait to render a properly encoded table until all the table data is downloaded. If you encode your table appropriately, it'll render as rows are downloaded...however, with that said, using CSS instead of tables is likely a better solution in most cases.
Tip #10:
Photoshop doesn't compress? I'm sure the folks at Adobe will be amused, as will PS users everywhere. Have you used it since PS5? Because it does compress - you just have to tell it how much if you use JPEG. GIF is compressed automagically as long as you limit the color palette.
Those are my main issues with what the author suggests. I should note, however, that some of his other points are well taken, and in some cases great advice. But the simple inaccuracies in the items I've noted undermined, for me, the author's credibility to speak with any degree of authority on this matter...
Posted by: koenkai | January 03, 2006 at 05:43 AM
First: thanks Sarah for that summary, you’re spot on!
Re: XHTML vs. HTML, quirks and Standards.
Even with HTML 4.0 and 4.01 it is more than possible to have browsers render it in standards mode — and hereby I’d like to remind you all that even IE6 has a quirks and a not-so-quirks mode.
All you have to do is put a PROPER doctype at the beginning of your file.
And, you know, even if it’s not nice (read: illegal) to serve XHTML 1.1 as text/html (it is legal to serve XHTML 1.0 so, however), you can do something about it still. For example, you serve it as text/html if the HTTP_USER_AGENT is a /^Mozilla\/4.0/ (which, for example, IE6 is), and serve it as application/xhtml+xml (not xml+xhtml) for modern browsers.
Posted by: Ralesk | January 03, 2006 at 05:46 AM
Thanks Dean for explaining it to this n00b web-designer.
PS And I think the compression is quite a bit higher than 3:1 when compressing HTML/XHTML. The sites I have built using Apache as the web server show it is more like 10:1 (or more).
Posted by: Olle | January 03, 2006 at 05:47 AM
Very nice article. I have a tool that can help web designers strip extraneous code out of their HTML, like inline styles and obsolete or deprecated HTML. Although it's targeted as an SEO tool, it's also an excellent web standards tool, designed to help you make smaller pages.
http://www.sitening.com/tools/seo-analyzer/
Posted by: Jon Henshaw | January 03, 2006 at 05:53 AM
Koenkai: #7, yeah, back in the HTML days I would do my elements ALL-CAPS, and the attribute names CamelCaps. It was so good, so easy to read! Then they decided that the XHTML namespace be all lowercase…
Others, question: image sizes. Isn’t that known the very moment the header arrives from the server? I seriously can’t be arsed when I post a picture in my blog to write height and width data in the (X)HTML of my blog entry. Thing is, the browser surely doesn’t have to wait till the whole image file loads — and if you still use images for layout…
PS: Marcelo: the Captcha is extremely unreadable, even for my — presumably human, and not even all that bad — eyes.
Posted by: Ralesk | January 03, 2006 at 05:55 AM