Cloud computing and the technology platforms behind it are one of my interests, especially how these platforms evolve with customers' needs.
While reading up on them, I came across the Google File System (GFS), which Google uses internally to store and search the huge amounts of data held on its data servers.
One of the things that amazed me is the architecture of the Google File System, which grew out of BigFiles, an earlier effort developed by Larry Page and Sergey Brin in the early days of Google. The idea of splitting huge amounts of data into 64 MB chunks, with a master server (node) that stores all the metadata related to the stored files and their dependencies, is amazing.
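To make the chunk-and-master idea concrete, here is a minimal Python sketch. It is not Google's implementation: the `Master` class, the `name#index` chunk handles, the chunk server names, and the round-robin replica placement are all assumptions made up for illustration. Only the 64 MB chunk size and the idea of a master that holds metadata come from the description above.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as described above


@dataclass
class Master:
    """Toy master node: it holds only metadata, never file contents."""
    # file name -> ordered list of chunk handles (hypothetical naming scheme)
    file_chunks: dict = field(default_factory=dict)
    # chunk handle -> chunk servers holding a replica (hypothetical names)
    chunk_locations: dict = field(default_factory=dict)

    def create_file(self, name, size_bytes, servers, replicas=3):
        # Round up so that a partially filled last chunk is still counted.
        count = (size_bytes + CHUNK_SIZE - 1) // CHUNK_SIZE
        handles = []
        for i in range(count):
            handle = f"{name}#{i}"
            # Round-robin placement is an assumption, chosen for simplicity.
            self.chunk_locations[handle] = [
                servers[(i + r) % len(servers)] for r in range(replicas)
            ]
            handles.append(handle)
        self.file_chunks[name] = handles
        return handles

    def locate(self, name, offset):
        """Translate a byte offset into (chunk handle, replica servers)."""
        index = offset // CHUNK_SIZE
        handle = self.file_chunks[name][index]
        return handle, self.chunk_locations[handle]


master = Master()
master.create_file("web.log", 200 * 1024 * 1024, ["cs1", "cs2", "cs3", "cs4"])
handle, replicas = master.locate("web.log", 130 * 1024 * 1024)
print(handle, replicas)
```

In this sketch a client would ask the master only where a chunk lives and would then read the bytes from one of the listed servers, so the master stays small because it never touches the file data itself.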
You can look at the high-level design components of this file system here:
I also found a research paper (PDF) by a group of researchers who implemented GFS on Linux, with statistics on file replicas, read/write throughput, and other performance aspects of the system. I found it very detailed and a useful resource.
Check out the paper, published by the ACM in 2003:
Hope this helps.