Hi all - Mike Arndt here. I am a NetApp Systems Engineer and have worked with Thomson Reuters in a variety of roles over the past 6+ years.
The Novus system is a distributed search architecture that uses thousands of SUSE Linux servers, each running proprietary Thomson Reuters software. Each search server is responsible for one part of the overall content index, and that part fits in server memory so it can be accessed extremely quickly. When a search is executed, it hits thousands of machines at once. Each machine sends its results back to a controller, which aggregates, sorts, and ranks them and then returns the final result set to the requesting application. This approach is what lets Thomson Reuters deliver subsecond search performance.
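The fan-out and merge described above is commonly called scatter-gather, and a small sketch may make it easier to picture. This is a minimal Python illustration, not Novus code: the `SearchShard` class, the `scatter_gather` function, the term-count scoring, and the sample documents are all hypothetical stand-ins, and threads in one process stand in for separate search servers.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


class SearchShard:
    """Hypothetical search node holding one slice of the index in memory."""

    def __init__(self, docs):
        # docs maps doc_id -> text. Build a tiny inverted index:
        # term -> {doc_id: number of occurrences}.
        self.index = {}
        for doc_id, text in docs.items():
            for term in text.lower().split():
                counts = self.index.setdefault(term, {})
                counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, term, k):
        # Return this shard's local top-k hits as (score, doc_id) tuples.
        hits = self.index.get(term.lower(), {})
        return heapq.nlargest(k, ((score, doc_id) for doc_id, score in hits.items()))


def scatter_gather(shards, term, k=3):
    """Hypothetical controller: query every shard in parallel, then merge."""
    # Scatter: send the same query to all shards at once.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda shard: shard.search(term, k), shards))
    # Gather: combine the partial results and keep the global top-k.
    merged = [hit for part in partials for hit in part]
    return heapq.nlargest(k, merged)


# Two shards, each holding a different part of the (made-up) content.
shards = [
    SearchShard({"a1": "contract law contract", "a2": "tax law"}),
    SearchShard({"b1": "contract contract contract dispute", "b2": "patent"}),
]
top = scatter_gather(shards, "contract", k=2)
print(top)
```

Because each shard returns only its own top-k hits, the controller merges a small amount of data no matter how large the full index is, which is one reason this pattern scales to many nodes.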
To operate at this scale, Novus required a high-performance shared filesystem that any search node can access at any time. NetApp storage accessed via the NFS protocol provides this capability. Other components of the Novus architecture use Oracle RAC databases to manage relationships between pieces of content. NetApp storage accessed via NFS is used there as well, providing a high-performance shared filesystem with very fast and efficient backup and recovery capabilities.