If you were looking for vulnerabilities on a website, you might open up the original page source looking for commented-out code, JavaScript source, hidden forms, etc. If you suspected an XSS attack on your own site, chances are you would right-click on the page and view the source to check for unwanted scripts. If you needed to register for CTP, hack this site, or read the snarky comments in the HTML of www.defcon.org, you would probably need to view the page source.



This all rests on the assumption that when you right-click on the page and select "View Source", the text you see is the HTML that the server sent when your browser requested the URL in the address bar. Unfortunately, that assumption is wrong. I made it too: I always expected that JavaScript calls such as DOM manipulation could only change the rendered page, and that the "Original Source" window would still display the source as it was originally delivered. Silly me!

When document.write is called after the page has finished loading (for example from an event handler or a timer, rather than from an inline script that runs while the page is being parsed), it implicitly opens a new document and replaces the current content with whatever is written. The new content is finalized when document.close is called. This is normal, documented behavior. Surprisingly, if you then right-click on the page and choose View Source in Internet Explorer, it shows the new content, not the old content. And it still displays the original page URL! The original page source has effectively changed.
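As a rough illustration of the behavior described above, here is a minimal sketch of a script that rewrites the document once the page has loaded. The function name rewriteDocument and the CLEAN_HTML markup are made up for this example; only document.open, document.write, document.close, and the load event are standard DOM features. The function takes the document as a parameter so it can be exercised outside a browser.

```javascript
// Sketch only: replace the live document after load, so the markup that
// carried this script is no longer part of the page.
// `doc` is any object exposing the DOM Document methods open/write/close;
// in a browser you would pass the global `document`.
function rewriteDocument(doc, newHtml) {
  doc.open();          // discards the current document content
  doc.write(newHtml);  // writes the replacement markup
  doc.close();         // signals that the replacement is complete
}

// Hypothetical replacement markup, chosen only for illustration.
const CLEAN_HTML = '<html><body><h1>Nothing to see here</h1></body></html>';

// Only wire this up when a browser environment is actually present.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  // Waiting for the load event means the parser has finished, so the
  // write replaces the page instead of appending to it.
  window.addEventListener('load', function () {
    rewriteDocument(document, CLEAN_HTML);
  });
}
```

Because the write happens after parsing is done, the script tag that triggered it is not part of the rewritten document, which is why it disappears from the source view in the affected browsers.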

This technique could be used in any of the above scenarios. For example, in a persistent (type 2) cross-site scripting attack, the ability to erase the injected script tags from the page would be very convenient for the attacker. To see this in action, visit the basic page (normal) and compare it to the new page (obvious XSS) or the alt page (hidden XSS, adds an iframe). In Win7+IE8, this is what you see:



If you can find the JavaScript in the "Original Source" in that screenshot, see a specialist, because it is not there. IE is the worst of the browsers here, because there is no visible indication that the source has been changed. It lies and says that the displayed source is the source of the URL you visited. In reality, it could have been pulled or generated from anywhere.

I have not seen this behavior documented, but I assume it is not a secret among browser developers. For example, let's see how this page looks in Firefox:



Once again, the script has been erased from the page, the address in the browser bar is unchanged, and no popup windows have been opened. There is one small difference, however: the address in the title bar of the source window is prefixed by "wyciwyg". No, this is not a typo of "wysiwyg"; it stands for "what you cache is what you get." Normally these URLs are not visible to the user, but they are used for local cached copies of pages. Nevertheless, the modification is complete: the original source has been rewritten and the address bar is unchanged.

So what can you do? I was not able to test every browser, but Chrome displays the actual original source:



Alternatively, you can try to capture the pages with Wireshark, but Wireshark cannot read the contents of HTTPS traffic unless you are able to decrypt it. Certain Firefox add-ons can capture the original source as well, such as the Net panel of Firebug.
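Another way to get at what the server really sent is to request the page yourself and inspect the raw response, instead of trusting the source window. The sketch below assumes an environment with a global fetch (a current browser, subject to same-origin rules, or a recent Node.js). The helper names fetchRawSource and listScriptTags are invented for this example, and the regular expression is a quick check, not a real HTML parser.

```javascript
// Sketch only: fetch the page independently and list the script tags
// present in the raw response body.

// Returns a promise for the response body as text.
// Assumes a global fetch is available in the environment.
async function fetchRawSource(url) {
  const response = await fetch(url);
  return response.text();
}

// Returns every opening <script ...> tag found in the given HTML string.
// A regular expression is good enough for a quick look, but it can be
// fooled by unusual markup.
function listScriptTags(html) {
  return html.match(/<script\b[^>]*>/gi) || [];
}

// Example with a literal string instead of a network request:
const sampleHtml =
  '<html><head><script src="a.js"></script></head>' +
  '<body><SCRIPT>alert(1)</SCRIPT></body></html>';
console.log(listScriptTags(sampleHtml));
```

If the list from the raw response contains a script that the browser's source window does not show, the page has probably rewritten itself after loading.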