Web Developers: Speed up your pages!
UPDATE: This blog has moved to http://marcelo.sampasite.com/brave-tech-world/default.htm . Please update your subscription.
There is a ton of research on user behavior based on task response time and I'm sure if you search Jakob Nielsen's site you'll find something.
I remember an IBM study made more than 15 years ago that said if it took 1 second for the computer to respond, it would take 1 second for the user to start the next step, but if it took 10 seconds for the computer to respond, it would take 20 seconds for the user to start the next step of the task. The delay compounds because users get more and more distracted the longer it takes for a command to respond.
Web sites are no different. Google proved with its super fast pages (part real, part perception) that users will perform many more searches if they know they don't have to wait long. If each query on Google took 20 seconds to respond, you would likely spend more time thinking about which search terms to use. But since it takes just 3 seconds, you just keep adding and removing search terms.
So, during the past few weeks I've been collecting tips for Web Developers on how to improve their page speed. These tips affect latency, bandwidth, rendering and/or perception of when a page is ready. They are in no particular order.
Tip #1: Strip spaces, tabs, CR/LF from the HTML
I'm always surprised when I look at the HTML of some large website and find that it has a ton of unnecessary spaces, tabs, new-lines, HTML comments, etc. Just removing those elements can reduce the page size by 5-10%, which in turn can decrease the download latency. I'll go one step further and suggest not using quotes on attributes unless necessary.
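As a rough illustration of this tip, here is a minimal sketch of a whitespace-stripping filter. The function name `stripHtmlWhitespace` is hypothetical, and the regular expressions are deliberately naive: they assume the page has no `<pre>`, `<textarea>`, inline scripts or conditional comments, and no layout that depends on whitespace between inline elements.

```javascript
// Hypothetical helper, not a production minifier: removes HTML comments and
// collapses whitespace. Assumes no <pre>, <textarea> or inline <script> content.
function stripHtmlWhitespace(html) {
  return html
    .replace(/<!--[\s\S]*?-->/g, "") // drop HTML comments
    .replace(/>\s+</g, "><")         // drop whitespace between adjacent tags
    .replace(/\s{2,}/g, " ")         // collapse remaining runs of whitespace
    .trim();
}

const sample = "<div>\n  <p>Hi   there</p>\n  <!-- note -->\n</div>";
console.log(stripHtmlWhitespace(sample));
```

Running something like this as a build or release step keeps the source you edit readable while the served page stays small.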
Tip #2: Don't use XHTML
This is very controversial. A lot of people will call me crazy, but I see XHTML as a loser technology. It has its benefits, but they are far outweighed by the drawbacks. And the biggest drawback for me is that XHTML makes your page larger. Purists will always build their pages in XHTML, but if you are in doubt about using it or not, don't!
Tip #3: Keep Cookies Small
Your cookie is sent back to your server every single time the user makes a request for anything on your domain. Even with image, JS and CSS requests or XML-over-HTTP (AJAX) calls, the cookie is sent. A typical HTTP request is between 500 and 1000 bytes. Now, if you have 4 cookies each with names like "user_name" followed by a value with 30 characters, you are adding 15-30% more bytes to the request.
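The arithmetic can be checked with a small sketch. The helper `cookieHeaderBytes` and all cookie names other than "user_name" are made up for illustration; the values are just 30 filler characters.

```javascript
// Hypothetical helper: bytes that a set of cookies adds to every request
// in the form "Cookie: name=value; name2=value2\r\n".
function cookieHeaderBytes(cookies) {
  const pairs = Object.entries(cookies).map(([name, value]) => `${name}=${value}`);
  const header = `Cookie: ${pairs.join("; ")}\r\n`;
  return new TextEncoder().encode(header).length;
}

// Four cookies with 9-character names and 30-character values (illustrative).
const cookies = {
  user_name: "x".repeat(30),
  user_pref: "x".repeat(30),
  user_lang: "x".repeat(30),
  user_seen: "x".repeat(30),
};

const extra = cookieHeaderBytes(cookies);
console.log(`${extra} extra bytes per request`);
console.log(`share of a 1000-byte request: ${((extra / 1000) * 100).toFixed(1)}%`);
console.log(`share of a 500-byte request: ${((extra / 500) * 100).toFixed(1)}%`);
```

The exact share depends on how large the rest of the request is, but the order of magnitude matches the estimate above.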
Tip #4: Keep JavaScript Small
Who cares if I'm calling my JavaScript function "start_processing_page()" or "slp()"? The download speed cares and the interpreter cares as well, so, use tiny function and variable names, remove comments and unnecessary code.
Tip #5: Use Public Caching
IMHO, this is one of the most under-used features of HTTP. Big websites use it (usually through a CDN, like Akamai), but the vast majority (I dare to say 99%) don't. All those icons, CSS and JS files can, and should, be cached by the browser (Private Cache), but public caching also allows proxies in between to cache them. This reduces the load on your server, freeing up CPU and bandwidth for the important stuff. Now, a lot of people don't use Public caching (or even Private) because their CSS is changing, the JS has bugs that need to be fixed, etc. Well, you can do 3 things to deal with that. 1) Let content be cached for a short period of time (for example, 24h only). 2) Rename the files every time you make a change to them, so you can let them be cached permanently, or 3) Implement an HTTP Filter that automatically renames the files if they have changed.
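Options 1 and 2 might look like the sketch below. The helper `versionedName` is hypothetical, and deriving the new name from a hash of the file content is my own assumption about how to "rename on change"; the `Cache-Control` values are standard HTTP/1.1 directives, but the lifetimes are only examples.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: put a short content hash into the file name so that
// any change to the content produces a new URL (option 2).
function versionedName(fileName, content) {
  const hash = crypto.createHash("sha256").update(content).digest("hex").slice(0, 8);
  const dot = fileName.lastIndexOf(".");
  return dot === -1
    ? `${fileName}.${hash}`
    : `${fileName.slice(0, dot)}.${hash}${fileName.slice(dot)}`;
}

// Example Cache-Control header values for the two strategies.
const SHORT_LIVED = "public, max-age=86400";   // option 1: cache for 24 hours
const LONG_LIVED = "public, max-age=31536000"; // option 2: one year, only safe with versioned names

console.log(versionedName("site.css", "body { margin: 0 }"), LONG_LIVED);
```

With versioned names, the HTML that references the file has to be regenerated whenever the hash changes, which is what the filter in option 3 would automate.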
Tip #6: Enable HTTP Compression
Your HTML couldn't be a better candidate for compression. It has a very limited character set and lots of repetitions (count the number of "DIV" on your page). That is why HTTP Compression makes so much sense. It can reduce the download by 70% or more. So, instead of having to send 40KB of data, you are sending just 12KB. The user will thank you.
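To get a feel for the effect, this sketch gzips a sample of markup with Node's built-in `zlib` module and reports the saving. The sample HTML is invented and far more repetitive than a real page, so the number it prints will overstate what you should expect in practice.

```javascript
const zlib = require("node:zlib");

// Invented, highly repetitive sample; real pages compress less dramatically.
const html = '<div class="item"><span>Hello</span></div>\n'.repeat(500);

const original = Buffer.byteLength(html);
const compressed = zlib.gzipSync(html).length;
const savedPercent = Math.round((1 - compressed / original) * 100);

console.log(`original: ${original} bytes, gzipped: ${compressed} bytes, saved: ${savedPercent}%`);
```

On a real site the web server or a filter does this for you when the browser sends an `Accept-Encoding: gzip` request header.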
Tip #7: Keep all as much as possible in lower case
This actually works in conjunction with HTTP compression. Remember that this type of compression is lossless, that is, decompressing the content will yield the exact original, which means that the compression algorithm will treat "DIV", "Div" and "div" as different streams. So, always use lower case for tag names and attributes in the HTML and CSS. Also try to be consistent in your JavaScript.
Tip #8: Avoid Tables
Rendering a table is probably the worst nightmare for a browser. If the browser starts showing the table before all the content inside it is loaded, the browser's rendering engine will have to re-render it many times as each piece is loaded. On the other hand, if the browser needs to wait for everything to be loaded, the user will see a blank page (or partially blank) for a few seconds. Browsers usually use a combination of both to reduce the number of re-renderings without leaving the user hanging. The point is, don't make your whole page start with a table. It is preferable to have 3 tables (header, body, footer). Whenever possible, just avoid using tables altogether.
Tip #9: Set image size
This is very similar to the table rendering problem. If you add an IMG tag to the middle of your page and don't set "width" and "height", the browser has to wait for the image to be loaded to decide the final size, but, meanwhile it will cost the browser at least 1 re-rendering because it will not wait for all the images to be loaded to show you the page.
Tip #10: Compact your GIF/JPG
So, your page has several GIFs and/or JPGs? It is very likely that those could be compressed even more without any loss! GIF/PNG mainly have a very compact data structure, but most applications like Corel Photo-Paint and Adobe Photoshop don't optimize it at all. Go to http://download.com and find yourself a good set of tools to compact your image files. You may be surprised to find that one of your GIFs went from 900 bytes to just 80 bytes after compacting.
Tip #11: Reduce the number of external elements
If you see a request graph from Keynote (a site perf monitoring service) you would be shocked at how long it takes to download just a few extra files to render a page, like a few images, a CSS and a JS file. If you did a good job with Tip #5 (using caching), the impact will be smaller. A browser can only request an image file after it has detected it while parsing the HTML. A lot of those file requests are serial. Some browsers limit the number of TCP connections to a single server (usually to 2), thus allowing your page to download only 2 files at a time. If you have 1 page, 1 CSS, 1 JS, and 7 images on your page (10 files), you can imagine that a lot has to happen before everything is loaded. The point here is, try to reduce the number of files (mostly images), and, if the CSS/JS are small enough, embed them into the page.
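A back-of-the-envelope model of the 10-file example, under simplifying assumptions that are mine: the HTML page must arrive first, every other file is discovered from it at once, and each "round" fetches as many files as there are connections. Real browsers are messier, so treat the function `downloadRounds` as an illustration only.

```javascript
// Simplified model: number of sequential fetch rounds needed for
// `resourceCount` files when at most `maxConnections` download in parallel.
function downloadRounds(resourceCount, maxConnections) {
  return Math.ceil(resourceCount / maxConnections);
}

// 1 round for the page itself, then 1 CSS + 1 JS + 7 images = 9 files
// over 2 connections.
const rounds = 1 + downloadRounds(9, 2);
console.log(`${rounds} sequential rounds for 10 files`);

// Embedding the CSS and JS into the page leaves 7 files to fetch.
const roundsEmbedded = 1 + downloadRounds(7, 2);
console.log(`${roundsEmbedded} sequential rounds with CSS/JS embedded`);
```

Each round costs at least one network round trip, which is why cutting the file count helps even when the files themselves are tiny.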
Tip #12: Use a single DNS Lookup
This is so overlooked. How many Web Developers think about DNS Lookup when they are building a site? I guarantee you, not many. But even before the browser opens a connection to your server, it needs to do a DNS Lookup to resolve the domain name to an IP address. Now, DNS lookups are among the fastest things on the Internet, because the protocol is tiny and the results are cached everywhere, including the user's computer. But sometimes you see sites making "creative" domain names for the same server. For example, all images come from "images.mysite.com", the page comes from "w3.mysite.com" (after a redirect from "www.mysite.com"), and the streaming video comes from "mms.mysite.com". That is 3 DNS lookups more than necessary.
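A quick way to audit a page is to count the distinct host names among the URLs it references. The sketch below uses the standard `URL` class; the function name and the paths are hypothetical, and only the host names come from the example above.

```javascript
// Hypothetical helper: the set of distinct host names in a list of URLs.
// Each distinct host costs the browser (at least) one DNS lookup.
function distinctHosts(urls) {
  return new Set(urls.map((u) => new URL(u).hostname));
}

const hosts = distinctHosts([
  "http://www.mysite.com/",
  "http://w3.mysite.com/default.htm",
  "http://images.mysite.com/logo.gif",
  "http://images.mysite.com/banner.gif",
  "mms://mms.mysite.com/intro.wmv",
]);

console.log(`${hosts.size} DNS lookups:`, [...hosts].join(", "));
```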
Tip #13: Delay Script Starts
If you have a process that renders 100 images per second using 100% CPU, and you add another process doing the same thing, the combined performance will be less than 100 images per second (less than 50 per process). That is because now the OS has to manage context switches. The same thing applies to the scripts on your page. If the browser is still loading and processing a few images or CSS and you fire a script, it will take longer for that script to execute than if you had waited for the page to be completely loaded. Actually, it gets a little bit more complicated. The browser fires the "onload" event for the page once it has all the elements necessary to render the page, not after the page has really been rendered (there is no "onrendercomplete" event). This means that even after the onload event, the CPU is still being used by the browser to render the page. What I usually do in situations like this is to add two indirections. First, attach a script to the onload event to invoke a function that will create a timer event that fires in a second or so and does the real initialization. In other words:
<body onload="setTimeout(init, 1000);">
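The same two indirections can be written in script instead of in the markup. This is only a sketch: `deferInit` is a hypothetical helper, `init` stands for whatever your real initialization function is, and the optional `schedule` parameter exists just so the timer can be swapped out when testing.

```javascript
// Hypothetical helper: wait for the target's "load" event, then wait another
// `delayMs` milliseconds before running `init`, so rendering can settle first.
function deferInit(target, init, delayMs, schedule = setTimeout) {
  target.addEventListener("load", () => schedule(init, delayMs));
}

// In a browser this would be used as:
//   deferInit(window, init, 1000);
```

Passing the function itself rather than a string of code avoids having the browser evaluate a string when the timer fires.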
Tip #14: Watch for Memory Leak
The biggest problem with browser memory leaks is that they don't affect only the page that created the leak; they affect every single page from any site after that. Internet Explorer is notorious for its massive memory leaks (because of poor JavaScript). There are a few tools on the Internet to find out if your script is causing a memory leak and where. The easiest test is to load your page 100 times and watch PerfMon to see if the Working Set is growing or not. The simplest thing that you should do is to unbind every event that you bound (dynamically), and to release every reference possible (this also helps the JavaScript garbage collector to be faster).
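One way to make "unbind everything you bound" hard to forget is to route all dynamic bindings through a small registry. The class below is a sketch of that idea, not an existing library; the name `ListenerRegistry` is invented, and it assumes targets that expose the standard `addEventListener`/`removeEventListener` methods.

```javascript
// Hypothetical registry: remembers every handler it binds so that all of
// them can be removed, and their references dropped, in one call.
class ListenerRegistry {
  constructor() {
    this.entries = [];
  }

  bind(target, type, handler) {
    target.addEventListener(type, handler);
    this.entries.push({ target, type, handler });
  }

  unbindAll() {
    for (const { target, type, handler } of this.entries) {
      target.removeEventListener(type, handler);
    }
    // Drop our own references to targets and handlers as well.
    this.entries.length = 0;
  }
}

// In a browser this might be used as:
//   const registry = new ListenerRegistry();
//   registry.bind(button, "click", onClick);
//   window.addEventListener("unload", () => registry.unbindAll());
```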
If you have no clue what I talked about in one of the topics above, either you really don't need to know about it, or, you should immediately go buy some books, and I recommend all books by O'Reilly, like:
- Dynamic HTML: The Definitive Reference - by Danny Goodman
- JavaScript & DHTML Cookbook - by Danny Goodman
- HTTP: The Definitive Guide - by David Gourley & Brian Totty
- Web Caching - by Duane Wessels
XHTML can actually be used to dramatically speed up a site, if you use it right, but what I am about to describe may be considered "cheating".
Since XHTML is HTML implemented in XML, you can use AJAX techniques with it. You don't even need to learn that much AJAX: there's a super simplified subset of AJAX techniques called AHAH (Asynchronous HTML and HTTP) that permits you to update only the parts of the page that change by substituting in chunks of XHTML into the page; that's a whole lot less to transfer than an entire page. On top of that, since with AHAH, you're working with XHTML, there's no XML translation to bother with; don't even bother learning XML style sheet transformations--the stuff being added to your pages is just XHTML, which needs no translation to render on a browser. Just use some scripting to change the parts of the page that need to be updated from the server, and provide permalinks on the page as necessary. This alone will save you more bandwidth than almost all the techniques you listed combined, other than public caching and optimizing your PNGs and GIFs.
Basically, transmitting only the differences between the pages rather than whole pages is the best way to make content load fast on a site. Everything else is marginal compared to this.
(I learned about AHAH at: http://microformats.org/wiki/rest/ahah)
Posted by: Berkana | January 02, 2006 at 11:22 PM
I find that the benefits of using XHTML far outweigh those of just doing HTML. There is no reason you cannot write HTML that is perfectly fine XHTML without the correct declarations. Your main argument was that XHTML makes your pages bigger. Well, maybe by a couple of bytes. An extra couple of quotation marks and some proper structure far outweigh the size increase. I do see your point about increased size, but browsers will have an easier time rendering good code compared to bad code. XHTML also supports growing technologies, and the legacy and rather deprecated HTML will only find itself in less and less use. Also, as the web grows (at its extreme weight) we soon won't have to worry at all about squeezing bits and bytes out of our HTML files; more likely we will be squeezing those bytes out of our mpg and avi files.
Posted by: Dustin | January 02, 2006 at 11:30 PM
On point #1, you mention getting rid of all spaces, CR/LFs and quotes (if unnecessary). However, I would propose that it would be unwise to go too crazy with this. Well written, readable but slightly larger code is more valuable than illegible but tiny code. A point worth considering.
Posted by: Josh | January 03, 2006 at 02:29 AM
Eight of your fourteen directives result in maintenance nightmares. A site utilizing all eight of those might load fractionally faster, but no web dev with a life would think twice about offering to maintain it.
On the other hand, proper publication of images and avoidance of tables by themselves will account for most of the speed advantage, most of the time.
I'm grateful that someone pointed out DNS lookups and the hazards of requesting sibling media files from multiple/external networks. Poorly implemented sites will fail to render during network hiccups as a result (which to me is the greater hazard).
Posted by: ben | January 03, 2006 at 02:32 AM
what a joker.
I hope you don't tell clients this BS... no wait I hope you do. Some I can fix it for them and get paid.
don't use XHTML... WAKE UP!! it's not 1999 any more you old fart.
You need to catch up to the rest of use because your going to go out of business in the next 3 years.
Posted by: Chris | January 03, 2006 at 02:33 AM
I'd argue that there's no reason to use GIF anymore. The only possible advantage GIF has over PNG is animation, which has no place on modern pages.
And I'm guessing these image compacting tools don't do anything other than index the colours. Reducing the amount of colours in a PNG to what is actually in the image, or even 255, should reduce the size dramatically. Only when PNG is using the same amount of colours is it comparable to GIF.
Posted by: Dean | January 03, 2006 at 02:46 AM
In response:
(1) Strip whitespace.
Bull. See #6 - they're compressed to nearly nothing anyway. And they make your markup that much harder to read in the long run, so the moment you need to make changes to it, guess what - you're spending far longer making the changes because you can't read your own markup.
(2) Don't use XHTML.
I call bull here as well. Falls back into the same category as #6 - it also enforces rules on how you write your markup, making it possible not only for the end user's system to parse the page far easier, and making it possible for you to make changes on the fly to the markup, but makes it easier in the future for alternate methods of transporting the data to be used.
(3) Keep cookies small.
Fair enough - better yet, though, keep data such as the example above on the server. Network security dictates that you don't give the client machine access to data that they don't need - keep the data in a session and only send a session ID.
(4) Keep JS small.
Um. See caching here, and while the interpreter might care and take nanoseconds longer to parse the script, nanoseconds of time isn't worth the time spent actually being able to find your way through your script again.
(5) Caching.
Good idea, but it really only helps if you've got a static site. If the CSS and scripts and images aren't being dynamically generated and aren't being cached, then you've done something wrong.
(6) Enable HTTP compression.
No disputes on this count - when dealing with uncompressed data. Otherwise it's just wasted time on both ends of the connection.
(7) Lower-case HTML, CSS, JS
While it works in conjunction with compression, unless you're using wacky caps everywhere, the difference is minimal.
(8) Avoid tables.
Any purist could tell you that. Of course, that's because "table" implies "tabular data", such as a parts catalog, or a long-format directory listing, etc. Tables got shoehorned into presentation because they were the only thing that used to be available for doing any sort of complex layout - that's what CSS is for.
(9) Set image size.
Yup. No disputes.
(10) Compact images.
Might help. If the images are only used once. Once they've been loaded - see caching, above - they don't need to be fetched again.
(11) Reduce the number of external elements.
Which is fair enough - but you lose all benefit of caching if you embed scripts & CSS into the pages, unless they're specific to that one page and you don't intend it to be viewed repeatedly. That's why inline CSS is a bad idea as well as embedded scripts.
(12) Use a single DNS lookup.
Mm, it can help the initial request. It's also a standard practice that I tend to follow - but there are plenty of reasons to avoid following it, one of them being that you have a server spewing up static content aside from the main server, for example.
(13) Delay script starts.
This doesn't do anything for the page load except makes it wait to finish. If you've got an onload handler, which is doing something essential to the usage of the page, making the stuff that happens onload wait up for a setTimeout() to take place is just making the page that much slower.
(14) Watch for memory leaks.
If you're doing anything with complex scripting, you should already be doing this.
In practice, most of the tips above don't really mean much for most developers because they have various things to do with the markup and other components after they've written the initial page - whether it's maintenance, adding features, fixing bugs, what have you - and readable XHTML, readable CSS, and readable JavaScript go a long way towards maintaining a site.
Posted by: Sarah | January 03, 2006 at 02:53 AM
Yea great tips, maybe the author should use them as his code is filled with un-needed spaces and tabs:)
Posted by: Lsv | January 03, 2006 at 02:56 AM
"You need to catch up to the rest of use because your going to go out of business in the next 3 years."
Chris ... you have no idea what you are talking about do you?
I admit that the few extra bytes of data that are required for an XHTML page are no longer a problem for most people with a DSL connection.
But XHTML does have 0 benefits over HTML. In fact most people don't even use XHTML properly. That is because Internet Explorer is unable to parse it. XHTML is an XML based language and thus it requires the application/xml+xhmtl MIME (try that for fun in IE). So you'll end up sending it as text/html ... having 0 of the 'benefits' of XHTML (the browser just builds a HTML DOM). And more notably you've spent time on doing something completely and utterly useless. Read up on some HTML vs. XHTML discussions (there are plenty of those around the web) before you just carelessly shout that one is better than the other.
Posted by: Joël Kuiper | January 03, 2006 at 03:05 AM
Good ideas here, but it would be even more useful if combined with links to online resources explaining how to implement these changes.
Posted by: Sean McManus | January 03, 2006 at 03:37 AM
Tip #2: Don't use XHTML
Oh my god :-)
You want a clean and small page?
Use Webstandards! => XHTML and CSS!
But use it correctly! Keep in mind XHTML is a MARKUP language not a formatting language! Separate content (XHTML) from design (CSS)!
Currently I am working on my new blog; in a few weeks I will show you there how you should use XHTML ;-)
Posted by: Flexchamp | January 03, 2006 at 03:47 AM
About point 6, data compression is not always a good idea since it requires more CPU load to cache the page, if CPU is your bottleneck it can even decrease the performance of your website.
Posted by: Sjors Pals | January 03, 2006 at 04:12 AM
If i'm wrong and XHTML is not the future why are Microsoft adding it to IE7... that's right.
so maybe you should read up before you go shoting your mouth off.
HTML is dead and in 3 years we can start to weed out people like you and the author and only try coders will be left.
now go back to your WYSIWYG editor... amature.
Posted by: Chris | January 03, 2006 at 04:24 AM
and yes I am now aware I made some spelling mistakes some don't try and be clever and point them out.
Posted by: Chris | January 03, 2006 at 04:26 AM
Someone may correct me here, but anymore, browsers render in different modes for different HTML DTDs - if you use XHTML markup, it will render in "strict" mode, and if you use older versions of HTML, it will render in "quirks" mode. If I'm not mistaken, writing well-formed XHTML code will render quicker in strict mode than HTML will render in quirks mode. Not to mention that tables and font tags can be replaced for the most part with CSS, saving code and speeding up rendering. The flip-side of this is, unless you're very careful about validating your code (with W3C's validator or some other tool), it hardly matters anyway, because XHTML that isn't well-formed will still get rendered in quirks mode.
That said, keeping your images compressed and making sure to include size attributes is terrific advice, keeping cookies small is terrific advice, because those are one-time things that you can set and forget. But anybody who has built and maintained more than a couple of web sites (as opposed to web *pages*) will find that the usefulness of most of these tips, especially things like taking out all the spaces and linefeeds, is far outweighed by the hassle of maintaining the code. A few extra milliseconds, or even a few extra seconds, of download or rendering time vs. a few extra hours a week maintaining "optimized" code is not even a consideration. Show this list to your project manager, and he'll chuckle indulgently and tell you to stop browsing and get back to work.
Posted by: Ryland | January 03, 2006 at 04:33 AM
Some of these tips are so idiotic and against the idea of WWW (see http://www.w3.org/Provider/Style/URI for example.) It's like someone from Microsoft had written them (smart people with no contextual awareness). Just make it work fast now and don't think big. In no means don't think about reliability.
Posted by: N Any | January 03, 2006 at 04:44 AM
>it requires the application/xml+xhmtl MIME
This is incorrect. From the XHTML 1.0 spec, sending as text/html is allowed:
http://www.w3.org/TR/xhtml1/#media
It's amusing seeing all the ignorant people claim otherwise without ever having looked it up themselves.
Posted by: anon | January 03, 2006 at 04:45 AM
Nice tips
Posted by: Lloyd | January 03, 2006 at 04:50 AM
Wow, not very bright.
Putting all HTML on one line, yes, nice for maintenance in the future, and how does that affect anything if you have gzip compression on?
Dude... don't write about things you know nothing about.
Posted by: Eric | January 03, 2006 at 05:09 AM
It seems to me that this very page doesn't follow your advice..
Posted by: b100dian | January 03, 2006 at 05:12 AM
So much in this blog entry is simply wrong that I'd wonder when was the last time you've developed for the web...just for the record, I'm a professional web designer/developer and have been developing/creating/administering sites since '94. So with that said...
I'd take issue with a lot of what's been proposed here...first, we live in the age of broadband, so the whole point of the post is somewhat irrelevant. Yes, I remember the age of 300baud, but we're way beyond that. A few K of savings means zero to most folks out there (33.6Kbps download mean that we're talking at most one second using the author's example of 35K in savings). And the cost in terms of readability, etc. can be huge (hearing from a former Microsoftee that HTML should be clean is kind of amusing seeing how well MS products created bloated HTML).
Tip #1
Next, leaving out CR/LF as well as spacing is insane if you ever want to have HTML that is remotely readable.
Tip #2
Don't use XHTML? Um, that's about as inane a proposition as I've heard. AJAX alone is reason enough to do this (if you're really concerned about server performance and would like to eliminate redundant calls to the server).
Tip #4:
Keeping JS small--well, that makes sense...but I'd suggest writing good JS rather than worrying about the ridiculously tiny amount of savings shortening your function names would give you...
Tip #7:
Keep everything in lowercase? Well, granted that could help on the compression, but personally I like all my HTML tags to be in uppercase...again, due to readability issues...
Tip #8:
This is simply incorrect. Browsers (at least today's rendering engines) do not wait to render a properly encoded table until all the table data is downloaded. If you encode your table appropriately, it'll render as rows are downloaded...however, with that said, using CSS instead of tables is likely a better solution in most cases.
Tip #10:
Photoshop doesn't compress? I'm sure the folks at Adobe will be amused. As well as PS users everywhere. Have you used it since PS5? Because it does compress--you just have to tell it how much if you use JPEG. GIF is compressed automagically as long as you limit the color palette.
Those are my main issues with what the author suggests. I should note, however, that some of his other points are well taken, and in some cases great advice. But the simple inaccuracies in the items I've noted undermined, for me, the author's credibility to speak with any degree of authority on this matter...
Posted by: koenkai | January 03, 2006 at 05:43 AM
First: thanks Sarah for that summary, you’re spot on!
Re: XHTML vs. HTML, quirks and Standards.
Even with HTML 4.0 and 4.01 it is more than possible to have browsers render it in standards mode — and hereby I’d like to remind you all that even IE6 has a quirks and a not-so-quirks mode.
All you have to do is put a PROPER doctype at the beginning of your file.
And, you know, even if it’s not nice (read: illegal) to serve XHTML 1.1 as text/html (it is legal to serve XHTML 1.0 so, however), you can do something about it still. For example, you serve it as text/html if the HTTP_USER_AGENT is a /^Mozilla\/4.0/ (which, for example, IE6 is), and serve it as application/xhtml+xml (not xml+xhtml) for modern browsers.
Posted by: Ralesk | January 03, 2006 at 05:46 AM
Thanks Dean for explaining it to this n00b web-designer.
PS And I think the compression is quite a bit higher than 3:1 when compressing html/xhtml. The sites I have built using Apache as webserver show it is rather 10:1 (or more).
Posted by: Olle | January 03, 2006 at 05:47 AM
Very nice article. I have a tool that can help web designers strip extraneous code out of their HTML, like inline styles, and obsolete and deprecated HTML. Although it's targeted as an SEO tool, it's also an excellent web standards tool, designed to help you make smaller pages.
http://www.sitening.com/tools/seo-analyzer/
Posted by: Jon Henshaw | January 03, 2006 at 05:53 AM
Koenkai: #7, yeah, back in the HTML days I would do my elements ALL-CAPS, and the attribute names CamelCaps. It was so good, so easy to read! Then they decided that the XHTML namespace be all lowercase…
Others, question: image sizes. Isn’t that known the very moment the header arrives from the server? I seriously can’t be arsed when I post a picture in my blog to write height and width data in the (X)HTML of my blog entry. Thing is, the browser surely doesn’t have to wait till the whole image file loads — and if you still use images for layout…
PS: Marcelo: the Captcha is extremely unreadable, even for my — presumably human, and not even all that bad — eyes.
Posted by: Ralesk | January 03, 2006 at 05:55 AM
A well coded XHTML page with one CSS file takes far less code than the same design built using font tags and nested tables on each page.
A badly coded XHTML page (defining a separate class for each element, for example) can be a larger filesize, but that's down to developer incompetence rather than the coding language used.
Posted by: Nice Paul | January 03, 2006 at 05:56 AM
I wish folks at the company I work at could all read this. I think I'll pass this around. Thanks for the info. I find this extremely useful!
Posted by: michael | January 03, 2006 at 06:22 AM
he is partly right on xhtml:
http://www.hixie.ch/advocacy/xhtml
also use hardware load balancing if you can, products by foundry networks are good.
Posted by: unknown comic | January 03, 2006 at 06:27 AM
"do not use quotes on attributes unless necessary"
"Don't use XHTML"
utter, utter bollocks.
I bet you're the sort of person that still uses MS Frontpage
Posted by: unimpressed | January 03, 2006 at 06:27 AM
RE: b100dian
I'd take issue with a lot of what *you* have said...
Let's get down to fundamentals:
Kilobits are not the same as Kilobytes. So 35KB of data would not take 1 second to download on a 35 Kbps connection.
I think this article and the comments point out a common problem with HTML and web developers that no one wants to agree to standards. 'I want to use uppercase tags and I want to use html, I want to use XHTML with uppercase tags'. It's worth bearing in mind that if one develops using standards then their site will be accessible to more people, so for example reading a web-page on a psp, or mobile phone.
Posted by: Ed | January 03, 2006 at 06:36 AM
Don't use XHTML. Don't use spaces and tabs. Ha ha ha.
Ha ha ha.
Ha!
Posted by: hahaha | January 03, 2006 at 06:49 AM
utter tripe
Posted by: anon | January 03, 2006 at 07:04 AM
Speaking as a web professional, who agrees with all of the criticism of this article -- because the critics are absolutely right -- I think the author of the article should do the responsible and right thing, and either take this post down, or better yet include a foreword or afterword in it to acknowledge the article's mistakes.
The author is doing a major disservice to the industry, and to himself or herself, by leaving the post as-is.
Posted by: Zach | January 03, 2006 at 07:11 AM
No one should ever use anything other than XHTML on newer pages!!!! We do have to get our standard, or web "development" will always be pain in the a (and we're where we are because of ALL THE SEMI-PROs who think standards are for wumpuses - and because of companies like Microsoft).
Btw, uppercase as tags and so is forbidden in XHTML.
The best validator (not as broken as the one from the W3C):
http://www.validome.org
Posted by: John Moo | January 03, 2006 at 07:50 AM
Don't use XHTML/standards?
IMHO any speed benefits gained from using noncompliant code will only benefit the lucky users whose browsers happen to interpret your non-standard pages in the way you hope.
Good work getting this debate going :)
Posted by: Carl Hubbers | January 03, 2006 at 08:13 AM
I think the overall idea behind putting a list like this together is to get the developer into an optimizing mindset. These days people take bandwidth for granted but you have to remember that over 70% of Internet users are still using dialup. I agree with some of the tips listed and I disagree with others, however I think the author is taking an important first step in emphasizing the importance of optimization. Just like when you code a standalone application, yes you can skip the optimization step and everything will work but you'll end up with a bloated and slower application in the long run. Just because we have the bandwidth doesn't mean we should not optimize. Also remember that your web page is not the only one a user will look at at one time. I normally have 4 or 5 browser tabs open at one time and 2 or 3 other applications running in the background and I'm sure that's light compared to other people. With how Windows is, optimization is key. Maybe not everything that was mentioned here but like I said, this is a good first step in the right direction.
Posted by: Roy | January 03, 2006 at 08:26 AM
I'm pretty much on the side of Sarah, and I would think most professional web developers would follow.
As stated above, some of the points seem badly researched, especially 11+12:
One of the reasons to use different subdomains for images and so forth is to get around the HTTP connection limit. Also, one of the ideas of CSS is that the CSS files are cached, and that reverses the problem: less data is loaded with each page request. Take away as much whitespace as you want; I'm pretty sure that an external CSS structure is, in the vast majority of cases, much more effective.
As I'm not really sure: Is this to be taken as a serious article, or was that more a case of "let's rile up the developers?"
Posted by: Matthias | January 03, 2006 at 08:31 AM
Hmmmm. It would seem that you don't really know anything about professional web-development.
Use XHTML. Use well-written JavaScript with meaningful variable and function names (in .js files, not in-line). Use CSS (and use it well -- learn how to use float, margins etc -- again in external .css files). Do not use tables for layout -- not ever. Use Photoshop's Save For Web option to create your GIF/JPEG/PNG files.
We aren't living in the 90's anymore.
Posted by: Adrian | January 03, 2006 at 09:05 AM
IN RESPONSE TO ALL THE FLAME:
=============================
I'm sticking to what I said above. This is my opinion and I'll recommend it to anyone.
Specifics:
- HTML vs XHTML: For those that don't know, HTML *is* a standard.
- If you think stripping spaces (HTML,CSS,JS) is a maintenance nightmare, you are right. However who said you need to do it by hand? Write a filter to strip out the spaces on the Release version.
- These are not tips for your 10-page website. Nor to your 1000-page website. These are tips for people building massive websites (a la Amazon, Google, MSN, eBay).
All the tips above worked very well for me, and I learned them by trial and error. You should do your own research and find what works for you.
Cheers,
Marcelo
Posted by: Marcelo Calbucci | January 03, 2006 at 09:30 AM
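The "filter on the Release version" idea in the reply above could look roughly like the following. This is only a minimal sketch, not the author's actual tool: the function name `strip_html` and the three rules are assumptions made for illustration, and the approach is deliberately naive. It would damage `<pre>` blocks, inline scripts, and IE conditional comments, and removing whitespace between inline elements can change how text renders.

```python
import re

# Hypothetical release-time filter; names and rules are illustrative only.
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)   # HTML comments
_RUNS = re.compile(r"\s+")                        # any run of whitespace
_BETWEEN_TAGS = re.compile(r">\s+<")              # whitespace between tags


def strip_html(source: str) -> str:
    """Drop comments, collapse whitespace runs, remove gaps between tags."""
    without_comments = _COMMENT.sub("", source)
    collapsed = _RUNS.sub(" ", without_comments)
    return _BETWEEN_TAGS.sub("><", collapsed).strip()


page = """<html>
  <!-- note -->
  <body>
    <p>Hello   world</p>
  </body>
</html>
"""
compact = strip_html(page)
```

The readable source stays in version control and only the deployed copy is run through the filter, which is the point of the reply: nobody strips the spaces by hand.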
I must disagree with your Javascript comment. If you can do things such as form validation and sorting on the client side and save a trip to the server, your application will perform much faster/better and your server will thank you for distributing the work load.
Posted by: Matthew Price | January 03, 2006 at 10:37 AM
I know this blog software is based on TypePad and you might not have much control over the code generated. Nevertheless, viewing the source shows that the code does not follow your own tips/guidelines. Some optimization is good, but it should not be overdone; there are always trade-offs.
Posted by: Son Nguyen | January 03, 2006 at 10:46 AM
Funny comments--makes this gloomy day in SF a good one =)
Happy New Year!
Posted by: Sherwin Techico | January 03, 2006 at 11:25 AM
So you're trying to tell me (a comp.sci grad) that it's hard for a computer to parse a well-known and structured XML document, and that HTML tag-soup is faster?
Posted by: Adam | January 03, 2006 at 01:13 PM
Adam: parsing HTML/tag soup almost certainly /is/ faster. The browser doesn't have to check each element for well-formedness, and HTML parsing has been around for much longer and hence is much better optimised than XHTML parsing (hardly anyone serves valid XHTML so there hasn't been nearly as much effort put in to optimising its performance). I've also heard that Firefox in XHTML mode won't start rendering the document until the entire XHTML page has been loaded, but I'm not 100% sure if this is true.
There are a lot of very ill-informed comments attached to this entry. HTML 4.01 is just as much of a "web standard" as XHTML 1.0 - and it will be supported by browsers for decades to come. Think about it: if a browser dropped HTML 4 support it wouldn't be able to render 99% of the web, and would be utterly useless.
I have no idea where the idea that "Ajax only works with XHTML" comes from, but it's not even remotely true.
Posted by: Simon Willison | January 03, 2006 at 02:53 PM
Umm ... to add to all the criticism ... it appears that your own company site (which does not use Typepad and therefore theoretically should be totally under your control) does not follow your own recommendations.
http://www.sampa.com/
Most of which, I agree with many of the other posters, are not great ideas at all.
Just one that I'll point out: embedding your CSS file in each page makes the download bigger, not smaller, since every single page must now contain all the CSS ... whereas if it's linked to, it only needs to be called once, and then is locally cached in the browser.
I notice that your company's site must not be done by you but a smarter web developer .... it calls an external CSS file.
;-)
Posted by: John Koetsier | January 03, 2006 at 03:01 PM
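The caching argument in the comment above comes down to simple arithmetic, sketched below. The byte counts are made-up figures, and the model is an assumption: it ignores request headers and revalidation requests, and it assumes the external stylesheet is fetched exactly once and then served from the browser cache.

```python
def total_bytes(pages_viewed: int, html_bytes: int, css_bytes: int, inline: bool) -> int:
    """Bytes downloaded over a visit, with CSS either inlined or external."""
    if pages_viewed <= 0:
        return 0
    if inline:
        # Every page carries its own copy of the stylesheet.
        return pages_viewed * (html_bytes + css_bytes)
    # External stylesheet: downloaded once, then assumed cached.
    return pages_viewed * html_bytes + css_bytes


# Hypothetical site: 20,000-byte pages and an 8,000-byte stylesheet.
one_page_inline = total_bytes(1, 20_000, 8_000, inline=True)
one_page_external = total_bytes(1, 20_000, 8_000, inline=False)
ten_pages_inline = total_bytes(10, 20_000, 8_000, inline=True)
ten_pages_external = total_bytes(10, 20_000, 8_000, inline=False)
```

Under these assumptions the two approaches cost the same bytes on a single page view, and the external file pulls ahead on every page after that. What the model leaves out is the extra request on the first view, which is the latency cost the original tip was worried about.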
great list. though i do not agree with the xhtml point. xhtml makes code more readable and clean. as a programmer, I am willing to sacrifice a couple of bytes for that.
Posted by: miscblogger | January 03, 2006 at 03:30 PM
one thing that no one has mentioned is to question the whole premise of this post: that transmission and rendering time are the big problems... in many cases it isn't so!
In many cases the delay between request and render is due to server load. If you look at CNN or Google's markup (apparently the site scale this post is aimed at) they don't strip whitespace, they use external CSS, they use different domains etc.
The magic Google and these sites work is in using clustered databases, load-balanced server farms, Akamai to reduce network latencies etc etc etc.
My thoughts on faster-loading pages:
1) Use the XHTML/external CSS/external JS approach
2) use gzip compression
3) make sure any dynamic server-side code is decently fast
anything else is marginal marginal marginal
Posted by: nick | January 03, 2006 at 03:48 PM
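How much whitespace stripping still buys once gzip is switched on can be measured rather than argued about. The sketch below uses Python's standard `gzip` module on a synthetic, heavily indented page. The sample markup and the `sizes` helper are assumptions for illustration, and real pages will give different ratios.

```python
import gzip
import re


def sizes(html: str) -> dict:
    """Byte counts for a page: raw vs whitespace-collapsed, plain vs gzipped."""
    stripped = re.sub(r"\s+", " ", html)
    raw = html.encode("utf-8")
    small = stripped.encode("utf-8")
    return {
        "raw": len(raw),
        "stripped": len(small),
        "raw_gzip": len(gzip.compress(raw)),
        "stripped_gzip": len(gzip.compress(small)),
    }


# Synthetic page with lots of indentation (not taken from any real site).
sample = "<ul>\n" + "".join(
    f"    <li>\n        item {i}\n    </li>\n" for i in range(200)
) + "</ul>\n"

result = sizes(sample)
```

Comparing `result["raw_gzip"]` with `result["stripped_gzip"]` on your own pages shows whether stripping is still worth the trouble after compression, which is the "marginal" question raised above.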
There's absolutely nothing to stop you from writing HTML 4.01 code that is just as readable and clean as the equivalent XHTML.
Posted by: Simon Willison | January 03, 2006 at 04:02 PM
"parsing HTML/tag soup almost certainly /is/ faster."
No, it absolutely is not. Valid html could be parsed with an SGML parser, which is slower than an XML parser that would be used on xhtml. But no browsers do this because there is so little valid html out there, so they have to use a slow, complicated parsing engine to decipher the mess that most "html" pages are. XML parsing is WAY faster, like 10x faster.
"I've also heard that Firefox in XHTML mode won't start rendering the document until the entire XHTML page has been loaded, but I'm not 100% sure if this is true."
Of course it's true, and not just for Firefox. You can't tell if an XML document is well-formed until you have the whole thing. So all browsers rendering XHTML (properly, with the right MIME type set) will behave like this.
koenkai, you have no clue what you are talking about. First you say that using xhtml is good, then you say you like using all caps tags. Xhtml requires all tags to be lower case. Xhtml has nothing to do with gayjax, which predates xhtml first of all, and predates its new and retarded acronym. And it doesn't eliminate redundant calls to the server, it just requests less data with each call. 35KB of data will take NINE SECONDS to download on a 33.6K modem connection. That alone is enough time for people to give up on your bloated page and go somewhere else.
Marcelo, you are even more clueless. Your advice is not for sites like google and amazon, they aren't stupid enough to do this crap. You have obviously never worked on a large site, but if you at least look at the source of such sites, you will see they don't follow your terrible advice.
Your points contradict each other (if you are using output compression on the webserver, then stripping whitespace and comments and such gets you nothing), or are completely wrong (javascript and css belong in external files, so they are only requested on the first page view, then cached browser-side) or show your lack of understanding (using multiple subdomains is for spreading the requests for different kinds of content to different servers). At least you got a couple things right (that everyone else has known since 1994).
Ralesk, yes you need to put in image dimensions. The browser doesn't know how big the image is till it starts getting the image, but it's already started rendering the page by then. So it renders the page not knowing how much space to leave for your images, then as it gets the images it keeps re-rendering the page with the right size for each image.
And to all the completely clueless people shouting nonsense like "use standards/xhtml", HTML is a standard. You bandwagon jumping dufuses that insist on pushing xhtml, but use tag soup with an xhtml doctype on your own pages are really annoying. If you have no clue about something, DON'T ADVOCATE IT. All you do is make people think that xhtml is empty hype with no value when you spew nonsense and show that you don't even understand xhtml yourself.
Posted by: Jerry | January 03, 2006 at 05:59 PM
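The nine-second figure in the comment above can be checked with a quick calculation. This is a back-of-the-envelope sketch: it assumes "35KB" means 35 × 1024 bytes, that the modem sustains its nominal 33,600 bits per second, and that there is no protocol overhead or compression, none of which the comment specifies.

```python
def download_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Ideal transfer time: payload bits divided by nominal link speed."""
    return size_bytes * 8 / link_bits_per_second


page_bytes = 35 * 1024          # 35 KB page
modem_bps = 33_600              # 33.6K modem, nominal rate
seconds = download_seconds(page_bytes, modem_bps)   # about 8.5 s
```

The ideal figure comes out at roughly 8.5 seconds, so nine seconds is a fair estimate once real-world overhead is added.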
"Marcelo, you are even more clueless. [...] You have obviously never worked on a large site"
You don't count MSN as a large site?
I stand by my statement: modern browsers probably render XHTML slower than HTML, so switching to XHTML because it is faster for browsers to parse is currently a waste of time.
Anyone know of any benchmarks that support or disprove this point?
Posted by: Simon Willison | January 04, 2006 at 02:28 AM
"You don't count MSN as a large site?"
I don't count him as having worked on it. Anyone can say on their blog that they did anything; that means nothing. Look at MSN, they do not do the completely retarded things Marcelo lists.
"I stand by my statement: modern browsers probably render XHTML slower than HTML, so switching to XHTML because it is faster for browsers to parse is currently a waste of time."
Why do you stand by your statement if you have no idea what you are talking about? HTML is an SGML markup. SGML has been around for a long time, and is very well known. It's also very well known that parsing SGML takes longer than parsing XML, because SGML is less strict, allows more possibilities, and more complex/ambiguous nesting. More options to search for means more time parsing, it's just common sense. Add to that the fact that browsers can't even just use an SGML parser, they have to add a bunch of extra workarounds for the mistakes people make, and it gets even slower.
You are correct however that it is a waste of time to switch to XHTML to improve browser parsing speed. Because although it will speed up parsing, parsing is the tiniest, most unnoticeable part of the process, so speeding it up doesn't matter. It's downloading and rendering that take all the time and benefit from being sped up. The advantage of using XHTML is that I can parse my documents from any language, quickly and easily, and transform them in very powerful ways. This is exceedingly difficult, slow, and unreliable with tag soup.
Posted by: Jerry | January 04, 2006 at 10:18 AM
"Personal Blog of Marcelo Calbucci, founder & CEO of Sampa Corp. (www.sampa.com), and former Microsoftee."
Microsoft. That explains everything. No wonder his advice sucks.
Posted by: Xopl | January 04, 2006 at 10:47 AM
Are you seriously proposing to use HTML instead of XHTML because [br] is shorter than [br/]? Not using quotes for attributes leads to invalid HTML and invalid XML. It might work, but it won't validate. That's a high price to pay for saving two measly bytes for every attribute.
Posted by: Qwerty | January 04, 2006 at 10:54 AM
... because [br] is shorter than [br/]...
Posted by: | January 04, 2006 at 10:56 AM
OK, I'll re-work my performance argument to the following:
The speed difference between parsing HTML and parsing XHTML is small enough to be irrelevant. XHTML does not allow incremental rendering (in current implementations) but HTML does. Hence for any large page and/or slow connection HTML will provide better performance.
So the argument that XHTML is faster to parse than HTML is non-useful.
Posted by: Simon Willison | January 04, 2006 at 01:55 PM
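The well-formedness point from this exchange can be illustrated with any strict XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`. It shows only that a conforming XML parser rejects a document that is cut off partway or left unclosed, and it says nothing about how any particular browser schedules rendering. The helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """True if a strict XML parser accepts the whole string."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


complete = "<html><body><p>Hello</p></body></html>"
truncated = complete[:20]            # "<html><body><p>Hello", as if mid-download
tag_soup = "<p>line one<br>line two</p>"   # fine as HTML, not as XML

checks = (is_well_formed(complete), is_well_formed(truncated), is_well_formed(tag_soup))
```

A tag-soup HTML parser can do something useful with the truncated string right away, which is the incremental-rendering advantage described above.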
you should start by stripping white spaces from your site :-)
seriously, some points are valuable but the rest is more or less worth only for large traffic websites.
Posted by: Nader | January 05, 2006 at 05:33 AM
Amazing, sad and amazing how many wanna-be xhtml advocates do not know basic things about HTML and the very same XHTML they advocate...
"Not using quotes for attributes leads to invalid HTML and invalid XML."
Seriously, read some specs, you will be surprised. :(
Posted by: Rimantas | January 06, 2006 at 01:02 PM
There is one very bad recommendation here: the DNS one.
Web browsers will typically only open two connections to a host at once. Your user will be downloading content two pieces at a time. Their bandwidth will be inefficiently utilized (especially if you have many small files) because a significant percentage of their time will be waiting on roundtrips, not downloading content.
Moving some content to different hostnames increases the number of files they can download at once. If the user is not bandwidth-limited they will achieve a very significant improvement in their page-load time. Even if the user IS bandwidth-limited (for example, an overseas modem user) they will still see a performance benefit because their internet connection will be working closer to their maximum speed. Their download rate while waiting on a response to a GET is zero - that's bandwidth the other simultaneous connections should be using.
Splitting your site into several smaller clusters also allows you to scale more efficiently (a static image server will scale much differently than your PHP servers).
Posted by: c | January 06, 2006 at 08:10 PM
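The effect of the two-connections-per-host limit described above can be approximated with a very crude model, sketched here. Everything in it is an assumption made for illustration: files are the same size, the user is not bandwidth-limited, each request costs one round trip plus a fixed transfer time, and downloads proceed in lockstep "waves". Real browsers and networks are messier.

```python
import math


def load_seconds(files: int, hosts: int, per_host_connections: int = 2,
                 round_trip: float = 0.2, transfer: float = 0.05) -> float:
    """Estimated time to fetch `files` small files spread evenly over `hosts`."""
    parallel = hosts * per_host_connections      # simultaneous downloads
    waves = math.ceil(files / parallel)          # batches needed
    return waves * (round_trip + transfer)


# Hypothetical page with 24 small images and a 200 ms round trip.
single_host = load_seconds(24, hosts=1)   # 12 waves
three_hosts = load_seconds(24, hosts=3)   # 4 waves
```

In this toy model, spreading the same 24 files over three hostnames cuts the wait to a third, because most of the time is spent waiting on round trips rather than moving bytes. The extra DNS lookups that the original tip was concerned about are not modelled.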
Rimantas:
“By default, SGML requires that all attribute values be delimited using either double quotation marks (ASCII decimal 34) or single quotation marks (ASCII decimal 39). Single quote marks can be included within the attribute value when the value is delimited by double quote marks, and vice versa. […]
In certain cases, authors may specify the value of an attribute without any quotation marks. The attribute value may only contain letters (a-z and A-Z), digits (0-9), hyphens (ASCII decimal 45), periods (ASCII decimal 46), underscores (ASCII decimal 95), and colons (ASCII decimal 58). We recommend using quotation marks even when it is possible to eliminate them.”
This is for HTML401. I remembered from somewhere that only numerical values were allowed quoteless, but in that case I was wrong.
Now for XML:
“Literal data is any quoted string not containing the quotation mark used as a delimiter for that string. Literals are used for specifying the content of internal entities (EntityValue), the values of attributes (AttValue), and external identifiers (SystemLiteral). Note that a SystemLiteral can be parsed without scanning for markup.”
I.e. single or double quotation marks MUST be used in XML.
Posted by: Ralesk | January 09, 2006 at 09:07 AM
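The HTML 4.01 rule quoted above translates directly into a character-class check. The sketch below encodes exactly the characters listed in that passage (letters, digits, hyphens, periods, underscores, colons). The function name `may_omit_quotes` is hypothetical, and the check applies to HTML 4.01 only, since XML always requires quotes.

```python
import re

# Characters HTML 4.01 allows in an unquoted attribute value, per the quote above.
_UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def may_omit_quotes(value: str) -> bool:
    """True if `value` may legally appear unquoted in HTML 4.01."""
    return _UNQUOTED_OK.fullmatch(value) is not None


examples = {v: may_omit_quotes(v) for v in ["100", "main-nav", "hello world", "/img/a.png", ""]}
```

Note that a slash is not in the permitted set, so most URLs in `href` and `src` attributes still need their quotes even under the "drop the quotes" tip.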
3 seconds?? since when does google take 3 seconds to perform a search term??
for me it takes a tenth of a second and i have my preferences set to 100 results per page.. (firefox)
Posted by: daniel | January 16, 2006 at 01:31 AM
"Avoid using tables" ? wot the! wot else do you do ?   all over the place??
Posted by: Grundizer | February 20, 2006 at 02:32 AM