Yahoo’s cached pages can be distributing malware.
Yahoo, has allowed users, for several years, to use the “cached pages” options displayed along with its search results on Yahoo-Search. Yahoo has partnered with McAfee’s SearchScan to provide safer searches since about May 2008. This is all good. The intention of providing safer searches to visitors is very noble. Google too, has led the pack in this direction by opening up its SafeBrowsing API and by providing visual warnings in search results boldly claiming “Warning visiting this website may harm your computer”.
Stopthehacker.com has tried to communicate with executives at Yahoo since April 2009 about the potential problems that we have been observing in their cached pages. This has not been met with any real response.
The problem is simple, but very important. Cached versions of web pages displayed on Yahoo Search often contain malware code embedded in them. This is a phenomenon that we have observed repeatedly.
Consider one of our many attempts at communicating this issue to Yahoo (message shortened for brevity).
We have found that Yahoo’s cache results, even with SearchScan on, do not detect the presence of malware on its cached copies of webpages. I have attached some screen shots which prove the point.
Our scanners flagged the code in the cached copies right away. The site in question, for which I looked up Yahoo’s cache is http://www.xxxxxxxx.com
More info on our response to this site is available at http://xxxxxxxxxx.xxx/**stripped**
The screen shots attached with this post show an example of a website which was scraped by Yahoo’s spider, indexed and cached and then when accessed via its search results, pops up the malware code. There does not seem to be any kind of sanitization/scrubbing process going on in the background.
Worryingly, this problem gives rise to a very effective attack vector, where a malicious individual can compromise a site or even simply create a site that contains malicious code. Once the site is crawled by Yahoo’s spider, and is loaded in the cache, the link to this cached page becomes an excellent attack vector to use for social engineering, as it carries the sense of security that comes with Yahoo’s brand name. No need to exploit XSS/CSRF, no back-breaking hours of toil and sweat need to be put in discovering flaws in a site. Just get the infected pages cached in Yahoo! and voila, you have a live exploit launched from official Yahoo property.
Consider the fact that Yahoo search has 18% of the search market in October 2009, the number of visitors to the site is non-trivial! Moreover, Yahoo’s brand image can suffer, if this phenomenon becomes more wide spread or well-known.
Given my failed efforts to discuss this with Yahoo, at this point, I can only hope that this does not become more popular.
I cannot understand how Yahoo is employing SearchScan technology to provide safer search results to visitors, yet fails at the back-end to identify cached pages loaded with malware.
Till next time.