Amazon and Microsoft allow Google to index files in Blob Storage

Microsoft and Amazon are giants in the cloud storage business. With Microsoft’s Azure platform and Amazon’s AWS, they dominate the cloud file storage market. Both companies say they take security and privacy seriously, assuring customers that the data they upload to the cloud is safe. This is why it was such a shock to find that they both allow Google to index files held in their blob storage.

Although this was discovered by a security researcher in 2011, it has recently garnered the attention of security professionals on Twitter. Mikko Hypponen of F-Secure encouraged his followers to try out the “bug” by searching Azure Blob Storage for content containing the word “Confidential”:

That search query, site:core.windows.net "confidential", yields some very interesting results.

To search for content in Amazon’s AWS storage, use the query:

site:s3.amazonaws.com "confidential"

You can also search for specific types of files, for example:
site:s3.amazonaws.com filetype:xls password
site:s3.amazonaws.com filetype:xls secret
site:s3.amazonaws.com "TOP SECRET"
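
The queries above all follow one pattern: a site: restriction to the storage host, an optional filetype: filter, and a keyword or quoted phrase. The following is a minimal Python sketch of that pattern, for example to audit what has been indexed from your own storage. The function and parameter names are illustrative, the site: operator is given the bare host name, and the search URL assumes Google's public /search?q= form.

```python
from urllib.parse import urlencode

# Storage hosts named in the article; everything else here is illustrative.
AZURE_BLOB_HOST = "core.windows.net"
S3_HOST = "s3.amazonaws.com"

def build_query(host, phrase=None, filetype=None, keyword=None):
    """Compose a search query restricted to a single storage host."""
    parts = [f"site:{host}"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if keyword:
        parts.append(keyword)
    if phrase:
        # Quotes ask the search engine for an exact phrase match.
        parts.append(f'"{phrase}"')
    return " ".join(parts)

def search_url(query):
    """URL-encode a query (assumes Google's public /search?q= URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})

print(build_query(AZURE_BLOB_HOST, phrase="confidential"))
print(build_query(S3_HOST, filetype="xls", keyword="password"))
print(search_url(build_query(S3_HOST, phrase="TOP SECRET")))
```

The first two lines printed are the queries from the text; the third is the same kind of query encoded so that it can be pasted into a browser.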

Menace of Unicode Domains

Spoofing is the age-old method of tricking people into handing their information to malicious actors. Hackers have long spoofed email messages and domains, pretending to be someone else in the hope of gaining access to someone’s personal information. When ICANN decided to implement a new class of top-level domains, allowing the use of non-ASCII (Unicode) characters in domains, it opened a whole new can of worms. Characters like the Cyrillic letter р are visually indistinguishable from the Latin letter p, allowing hackers to buy domains like “рayрal.com”. Although to the naked eye this looks exactly the same as the payment processor’s PayPal.com, the Cyrillic р makes it a completely different domain, with DNS records that point to a different resource. Because host names are limited to ASCII, these special characters are encoded into what’s called Punycode[1]. The encoded version of the spoofed PayPal domain is http://xn--ayal-f6dc.com.
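
The encoding can be reproduced with Python's standard library. This is a small sketch rather than a full IDN toolkit: it relies on the built-in idna codec (which implements the older IDNA 2003 rules), and it writes the look-alike letter as the escape \u0440 so that it cannot be confused with an ordinary p. The helper name to_ascii is illustrative.

```python
import unicodedata

# U+0440 is written as an escape so the look-alike letter stays unambiguous.
spoofed = "\u0440ay\u0440al.com"
genuine = "paypal.com"

def to_ascii(domain):
    """Encode each label of a domain to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")

print(unicodedata.name(spoofed[0]))  # CYRILLIC SMALL LETTER ER
print(unicodedata.name(genuine[0]))  # LATIN SMALL LETTER P
print(spoofed == genuine)            # False: different code points
print(to_ascii(spoofed))             # xn--ayal-f6dc.com
print(to_ascii(genuine))             # paypal.com (already ASCII, unchanged)

# Decoding the ASCII form recovers the original Unicode domain.
print(b"xn--ayal-f6dc.com".decode("idna") == spoofed)  # True
```

The two strings render almost identically on screen, but they compare as unequal and encode to different host names, which is why they resolve to different servers.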

As of April 2017, most modern browsers display such look-alike Unicode domains in their encoded Punycode form, making it easier for users to avoid falling victim to spoofing attacks. Until a few months ago, however, some browsers left the Unicode in the address bar, making it almost impossible for novice users (and for most tech-savvy users who were not paying attention) to recognise spoofing attempts. The tech community raised this issue on Twitter in March, prompting Chrome to release a fix that shows these Unicode domains as Punycode in the address bar.
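
To illustrate the kind of decision an address bar has to make, here is a deliberately simplified Python sketch that shows a label as Punycode when it mixes letters from more than one script. It is not Chrome's actual policy, which is considerably more involved. The function names are hypothetical, and the script of each letter is only approximated by the first word of its Unicode character name.

```python
import unicodedata

def scripts_in(label):
    """Approximate each letter's script by the first word of its Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def display_form(domain):
    """Show mixed-script labels as Punycode; leave single-script labels as they are."""
    shown = []
    for label in domain.split("."):
        if len(scripts_in(label)) > 1:
            # Mixed scripts are suspicious, so fall back to the ASCII encoding.
            shown.append(label.encode("idna").decode("ascii"))
        else:
            shown.append(label)
    return ".".join(shown)

print(display_form("\u0440ay\u0440al.com"))  # xn--ayal-f6dc.com
print(display_form("paypal.com"))            # paypal.com
```

A label written entirely in a single non-Latin script would pass this check unchanged, so a real implementation needs further rules on top of mixed-script detection.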

 

  1. Punycode is a way to represent Unicode within the limited character set of ASCII used for internet host names.