Scott Munger <Scott.Munger@highwinds-software.com> v1.1, September 2004
Usenet volume currently exceeds 1TB/day. Over the past 7 years, Usenet volume has doubled approximately every 9 months, and the growth shows little sign of abating. Storage that held 5 days of binary articles 9 months ago now holds only 2.5 days.
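The halving of retention follows directly from the doubling period. The short sketch below works through the arithmetic; it assumes steady exponential growth and, purely for illustration, treats 1TB/day as the size of the feed being stored. The function names are hypothetical and are not part of any &Highwinds; product.

```python
DOUBLING_PERIOD_MONTHS = 9  # from the text: volume doubles roughly every 9 months


def daily_volume_tb(current_tb_per_day, months_from_now):
    """Projected daily volume, assuming steady exponential growth."""
    return current_tb_per_day * 2 ** (months_from_now / DOUBLING_PERIOD_MONTHS)


def retention_days(capacity_tb, tb_per_day):
    """Days of articles that a fixed amount of storage can hold."""
    return capacity_tb / tb_per_day


# Hypothetical array that was sized to hold 5 days of articles 9 months ago.
capacity = 5 * daily_volume_tb(1.0, -9)

for months in (-9, 0, 9):
    days = retention_days(capacity, daily_volume_tb(1.0, months))
    print(f"{months:+d} months: {days} days of retention")
```

Under these assumptions the same array that holds 2.5 days today would hold only 1.25 days after another 9 months.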
&Highwinds; recognizes firsthand the problems this increase in volume creates and is continually devising solutions to contain the costs of deploying and maintaining a Usenet architecture. &Tornado; significantly reduces storage requirements, and now &StormCellar; drives down the cost of offering extended retention levels.
Anyone who has shopped for storage recently knows that the decision doesn't hinge on the gigabyte alone. Capacity means nothing if the hardware cannot deliver the data at extremely high rates. For a storage solution to be successful, its I/O backplane must support the volume of incoming requests.
Usenet systems of medium to large size require extremely robust random I/O performance. Such performance is found only in the upper echelon of a storage vendor's offerings. Administrators can get burned if they opt for the storage with the lowest cost per GB, as the throughput will be horrible when thousands of Usenet clients begin their requests.
However, what if there were a way to get the cost benefit of the inexpensive, "slow" storage, but still be able to handle the demands of thousands of readers?
&Highwinds; conducted research and determined that, for a given group of Usenet readers, most of the demand is for the most recent few days of content. This may seem obvious, but the rate at which demand drops off is less so. Days 1 and 2 account for over 80% of the read traffic, and less than 10% of article reads access content older than 3 days.
Armed with that information, we can build a storage architecture to exploit it. Fast, expensive storage holds the most recent 3 days and handles the massive read traffic. Cheap storage holds the older articles, which don't need the same read performance. &StormCellar; makes this easy to manage.
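As a rough illustration of the split, the sketch below assigns an article to a tier by its age and sizes each tier for a given retention target. The 3-day boundary comes from the text; the function names, the tier labels, and the 14-day retention figure in the example are assumptions made here for illustration only.

```python
from datetime import datetime, timedelta, timezone

FAST_TIER_DAYS = 3  # from the text: the most recent 3 days live on fast storage


def storage_tier(posted_at, now, fast_days=FAST_TIER_DAYS):
    """Return 'fast' for recent articles and 'cheap' for older ones."""
    age = now - posted_at
    return "fast" if age < timedelta(days=fast_days) else "cheap"


def tier_capacity_tb(tb_per_day, retention_days, fast_days=FAST_TIER_DAYS):
    """Capacity each tier needs to hold retention_days of a steady feed."""
    fast = min(retention_days, fast_days) * tb_per_day
    cheap = max(retention_days - fast_days, 0) * tb_per_day
    return fast, cheap


now = datetime(2004, 9, 15, tzinfo=timezone.utc)
print(storage_tier(now - timedelta(hours=30), now))  # a 30-hour-old article
print(storage_tier(now - timedelta(days=6), now))    # a 6-day-old article

# Hypothetical target: 14 days of retention on a 1TB/day feed.
print(tier_capacity_tb(1.0, 14))
```

In this example only 3TB of the 14TB total needs to be on fast storage; the remaining 11TB can sit on the cheaper tier.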
&StormCellar; is article storage software, somewhat analogous to &TornadoBE;, that has been specially optimized to run on the barest minimum of hardware. All reading indexes, overview databases, and other medium-weight tasks have been eliminated, so the server can concentrate on storing and serving articles.
To the &TornadoFE; servers, &StormCellar; masquerades as another &TornadoBE;, and is not treated specially, requiring no unusual configuration changes.
A &StormCellar; receives a feed of expired articles from a &TornadoBE; (a cascade feed). It therefore does not carry the newest, most-read articles (the first two days); it carries only the older, less frequently accessed articles. Because the load-carrying capacity needed for these older days drops off precipitously, cheaper hardware can be used.
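The cascade idea can be modeled in a few lines. The sketch below is a toy model, not the actual &TornadoBE; or &StormCellar; implementation or configuration: the ArticleStore class, the expire_with_cascade function, and the 2-day cutoff used in the example are all hypothetical. It shows only the principle that an article leaving the fast store on expiry is handed to the cellar rather than discarded.

```python
from dataclasses import dataclass, field


@dataclass
class ArticleStore:
    """Toy article store mapping message-id to article age in days."""
    name: str
    articles: dict = field(default_factory=dict)

    def store(self, message_id, age_days):
        self.articles[message_id] = age_days


def expire_with_cascade(backend, cellar, max_age_days):
    """Move articles at or beyond max_age_days from backend to cellar."""
    expired = [mid for mid, age in backend.articles.items()
               if age >= max_age_days]
    for mid in expired:
        # Hand the article to the cellar instead of dropping it.
        cellar.store(mid, backend.articles.pop(mid))
    return expired


backend = ArticleStore("backend")
cellar = ArticleStore("cellar")
backend.store("<a@example>", 0)
backend.store("<b@example>", 1)
backend.store("<c@example>", 4)

moved = expire_with_cascade(backend, cellar, max_age_days=2)
print(moved)
print(sorted(backend.articles), sorted(cellar.articles))
```

After the expiry pass the backend keeps only the two recent articles, and the 4-day-old article is held by the cellar.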