Tuesday, September 30, 2008

Green IT

The Government Printing Office (GPO) was recently named a finalist by Computerworld magazine for its Green IT initiatives. This is an impressive honor for a 147-year-old agency that is working through several massive information system transformation initiatives.

GPO’s IT systems are similar to any other organization’s: they rely on networks, hardware, trained personnel and, most important, the satisfaction of the end user. In the past several months, the department has eliminated redundancy throughout the agency by consolidating servers, reused or recycled all retired hardware, enhanced end-user capabilities and reduced energy consumption in some areas by more than 50%.

Additionally, the IT organization is specifying low-energy components for all new systems and, where appropriate, equipping areas with low-energy Citrix solutions.

Sustainable computing is a priority for GPO, and every effort is being made to maximize green computing techniques. As a federal agency, GPO takes pride in our mission to Keep America Informed and believes environmental stewardship is not only good business but good government.

One of the fundamental transformations at GPO is to develop and launch a world-class information management system for Federal publications. This is known as FDsys, GPO’s Federal Digital System (www.gpo.gov/fdsys). This and other system initiatives are being developed in line with GPO’s sustainability initiatives. These digital system initiatives will allow GPO to fulfill our mission to Keep America Informed by offering electronic access to Federal publications and by improving our print operations to use raw materials and energy efficiently. All new digital systems are being installed with energy-saving components.

We plan to implement virtualization technology for servers once this technology proves to be reliable for our applications. Virtualization lets a single physical server support multiple applications and operating systems. This will further reduce our Information Technology energy consumption.

Friday, September 12, 2008

Cloud computing

Cloud computing is a style of computing where IT-related capabilities are provided as a service, allowing users to access technology-enabled services without knowledge of, expertise with, or control over the technology infrastructure that supports them.

As we face the need to convert several hundred years of paper documents to digital files for quick and easy access, cloud computing may provide a good solution for us at the Government Printing Office. Efforts are underway at GPO and other Federal agencies to scan documents and produce collections of TIFF images that can later be processed to create accessible versions. To date we have collected several terabytes of TIFF data that need to be converted. But this is just the tip of the iceberg: we anticipate this unprocessed data will grow to multiple petabytes. To process this data, we can build an in-house computing capability to convert these documents, outsource the conversion, or get creative and explore capabilities like cloud computing.

Fortunately, there are some benchmarks emerging that we can use to help model this problem and guide us to a solution. It appears that processing a page of text in TIFF form and converting it to a searchable PDF takes about 1 minute with OCR engines working on a typical computing platform available today. This will certainly improve over time as Moore’s Law continues to benefit us. This simple benchmark will allow us to cost out the options.
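To show how the 1-minute-per-page benchmark could feed a cost model, here is a minimal sketch in Python. Only the 1-minute figure comes from the benchmark above; the function names, the page count, the machine count and the hourly price are all hypothetical placeholders, and the model assumes the work splits evenly across machines with no overhead.

```python
import math

# Benchmark from the post: about 1 minute of OCR per TIFF page.
MINUTES_PER_PAGE = 1.0


def machine_hours(pages, minutes_per_page=MINUTES_PER_PAGE):
    """Total compute time in machine-hours, regardless of how many machines share it."""
    return pages * minutes_per_page / 60.0


def wall_clock_days(pages, machines, minutes_per_page=MINUTES_PER_PAGE):
    """Elapsed days if pages are split evenly across identical machines.

    Assumption: perfect parallelism, no transfer or scheduling overhead.
    """
    if machines < 1:
        raise ValueError("machines must be at least 1")
    pages_per_machine = math.ceil(pages / machines)
    return pages_per_machine * minutes_per_page / (60.0 * 24.0)


def compute_cost(pages, dollars_per_machine_hour, minutes_per_page=MINUTES_PER_PAGE):
    """Compute-only cost; excludes OCR licensing, storage and data transfer."""
    return machine_hours(pages, minutes_per_page) * dollars_per_machine_hour


if __name__ == "__main__":
    pages = 1_000_000  # hypothetical collection size, not a GPO figure
    rate = 0.10        # hypothetical price per machine-hour, in dollars
    for machines in (1, 100, 1000):
        days = wall_clock_days(pages, machines)
        print(f"{machines:>5} machines: {days:8.2f} days")
    print(f"compute cost: ${compute_cost(pages, rate):,.2f}")
```

Under these assumptions the total machine-hours, and so the compute cost, stay the same however many machines are used; only the elapsed time shrinks. That is what makes a short-lived pool of many machines attractive for a one-time conversion job.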

One of the key barriers that we need to anticipate is licensing the specific applications that will be used to accomplish this conversion. In particular, licensing OCR technology for use in a cloud of thousands of virtual machines processing a large collection of TIFFs in parallel appears to be one of our major hurdles to clear.

As virtualization technologies continue to emerge and mature, they will play a big role in solving complex and, at times, short-term computing tasks, minimizing the need to build large, on-site computing facilities.