Differential Web Pages

Anyone who has used an HTML-based navigation system for a web application of any sort realises that too much redundant information has to be loaded each and every time one navigates to a different part of it.
For example, when you are checking your Yahoo! Mail:

[Image: Yahoo! Mail web application example]

The regions circled in red stay the same throughout our usage of Yahoo! Mail.

Now, each time you click on a link, all those non-transient parts of the web page get loaded again too. After each navigational click, we have actually reloaded all 30 KB or so of the web page. The server treats each page of the application as expired as soon as it is rendered, so repeated browser requests re-load the entire page.

Suppose you are waiting for some important mail (and you don’t have Yahoo! Messenger). You sign in to Yahoo!, wait at the inbox and keep refreshing the page every 5 minutes. On each refresh, you re-download the 30 KB of HTML (the images, CSS and JS come from the browser cache). This is the worst case of re-downloading redundant data.

As a more general case, around 20% of a typical web page is redundant on every load. The redundant part mainly consists of the header, footer and navigation system.

Diff file: here is a direct example to show what a diff file is:


[file1]
The quick brown fox jumps over the lazy dog.


[file2]
The quick brown fox jumps over the lazy dog!


[diff_file:file2-file1]
(43)-(.)+(!)


(Note: this is just an example to highlight the concept. An actual diff file would have a different format.)

The only difference between [file1] and [file2] is the punctuation mark at the end of the line (the 44th character, i.e. zero-based offset 43).
So the diff file only contains the changes that have to be made to [file1] to get [file2].
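
To make the idea concrete, here is a minimal sketch in Python of producing and applying such a diff. It is only an illustration: the function names make_diff and apply_diff and the (offset, removed, added) tuple format are my own assumptions, loosely modelled on the toy notation above, and offsets are counted from zero. It uses the standard library's difflib module to find the changed regions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical edit format: (zero-based offset in old text, removed text, added text)
Edit = Tuple[int, str, str]

def make_diff(old: str, new: str) -> List[Edit]:
    """Return the edits that turn `old` into `new`; empty if they are equal."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only the regions that actually changed
            edits.append((i1, old[i1:i2], new[j1:j2]))
    return edits

def apply_diff(old: str, edits: List[Edit]) -> str:
    """Rebuild the new text from `old` plus the edits (offsets refer to `old`)."""
    pieces = []
    pos = 0
    for offset, removed, added in edits:
        # Sanity check: the diff must have been made against this exact text.
        if old[offset:offset + len(removed)] != removed:
            raise ValueError("diff does not match the old text")
        pieces.append(old[pos:offset])  # unchanged text before the edit
        pieces.append(added)            # replacement text
        pos = offset + len(removed)
    pieces.append(old[pos:])            # unchanged tail
    return "".join(pieces)

file1 = "The quick brown fox jumps over the lazy dog."
file2 = "The quick brown fox jumps over the lazy dog!"

diff = make_diff(file1, file2)
print(diff)                              # [(43, '.', '!')]
print(apply_diff(file1, diff) == file2)  # True
```

The whole diff is one small tuple, while the unchanged 43 characters never need to be sent again.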

That’s how web-pages should be.

The server should just send a diff file rather than the entire web-page.
So only the information needed would be transmitted and the rest of the web-page can remain as it is.

Coming back to the example we started with:
You are refreshing your page every 5 minutes to check for new mail.
If you have no new mail, the server returns a “null” diff file. Finito.
No need to re-download the entire web page.
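
The refresh scenario can be simulated in a few lines of Python. This is a rough sketch under my own assumptions, not a description of how any real server or browser works: PageServer, diff_since and patch are hypothetical names, the server simply keeps every version of the page in memory, and the client says which version it already has.

```python
from difflib import SequenceMatcher

class PageServer:
    """Hypothetical server that remembers every version of a single page."""

    def __init__(self, html):
        self.versions = [html]

    def update(self, html):
        self.versions.append(html)

    def diff_since(self, client_version):
        """Return (latest_version, edits); edits is empty if nothing changed."""
        latest = len(self.versions) - 1
        old, new = self.versions[client_version], self.versions[latest]
        ops = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
        # Each edit: replace old[start:end] with the given text.
        edits = [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]
        return latest, edits

def patch(old, edits):
    """Client side: apply the edits to the page the browser already holds."""
    out, pos = [], 0
    for start, end, text in edits:
        out += [old[pos:start], text]
        pos = end
    out.append(old[pos:])
    return "".join(out)

header = "<div id='nav'>Inbox | Compose | Folders</div>"
server = PageServer(header + "<p>No new mail</p>")
page, version = server.versions[0], 0  # first visit: full page download

# Refresh with no new mail: the server answers with a "null" diff.
version, first_edits = server.diff_since(version)
page = patch(page, first_edits)
print(first_edits)  # []

# New mail arrives; the next refresh carries only the changed text.
server.update(header + "<p>1 new mail</p>")
version, edits = server.diff_since(version)
page = patch(page, edits)
print(page == server.versions[-1])  # True
```

In the second refresh the header is never re-sent; only the few characters that differ travel over the wire.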

Of course, this needs browsers to be able to read and apply diff files. That might take time. But this technology would respect bandwidth.

  1. One Response to “Differential Web Pages”

  2. By Diu on Aug 30, 2004

    Hey, is this being done yet by anyone? Sounds very sensible, but it might be difficult to implement, otherwise it would have been done by now! Enlighten me!
