Hyper-Threading is a technology developed by Intel and released in 2002. It was initially available only on Xeon processors, where it was then known as Super-Threading, and it later reached the mainstream when it was applied to the Pentium 4. Its early code name was Jackson.
INTEL XEON PROCESSOR ARCHITECTURE

Exploring the Impact of Hyper-Threading on Web Workloads

By David J. Morse; Yi-Ming Xiong, Ph.D.; and Ramesh Radhakrishnan, Ph.D.

PowerSolutions, August 2002 (www.dell.com/powersolutions)

The Intel® Hyper-Threading technology, featured in Intel Xeon™ processors, can enhance Web server performance. This article examines the results of two Web server benchmarks, the SPECweb®99 and Ziff Davis™ WebBench™ programs, to determine the benefits of using Hyper-Threading technology for Web server workloads.

Intel® Hyper-Threading technology allows multithreaded operating systems to view a single physical processor as if it were two logical processors. A processor that incorporates this technology shares CPU resources among multiple threads, thereby enabling faster enterprise-server response times and providing additional CPU processing power to handle larger workloads. As a result, server performance can improve.

If using Hyper-Threading technology, a Dell™ PowerEdge™ 6600 server with four CPUs would expose eight logical CPUs to the operating system. Hyper-Threading must be enabled in the BIOS for the operating system to recognize the logical processors. Microsoft® Windows® 2000 Server and Windows .NET Server operating systems support Hyper-Threading with no modifications. Recent versions of the Red Hat® Linux® operating system, including version 7.3 and Advanced Server 2.1, also auto-detect and enable Hyper-Threading support.

For Red Hat Linux 7.1 and 7.2, administrators should first upgrade the kernel to the latest errata package.[1] They can then enable Hyper-Threading by passing the acpismp=force flag through the boot loader. To determine whether Hyper-Threading is enabled under Linux, check /proc/cpuinfo; it should show twice the number of physical CPUs in the server.

This article explores how Hyper-Threading technology affects Web server performance. It examines the impact of Hyper-Threading on two benchmarks that simulate a typical Web server workload: the SPECweb®99 and Ziff Davis™ WebBench™ programs.

Studying the impact on SPECweb99 performance

SPECweb99 is an industry-standard Web server benchmark developed by the Standard Performance Evaluation Corporation (SPEC). This benchmark measures the maximum number of simultaneous connections that a server can sustain. The workload consists of 70 percent static content and 30 percent dynamic content. The dynamic content is composed of dynamic GET commands, Common Gateway Interface (CGI) GET commands, emulation of form submissions (POSTs), and user data tracked through HTTP cookies. The workload file set is a function of the requested number of simultaneous connections; therefore, a more powerful server would be required to serve a proportionately larger amount of content.

Dell was the first vendor to publish SPECweb99 results for servers using Intel Hyper-Threading technology. Figure 1 shows the SPECweb99 performance for Dell PowerEdge servers that use Intel Xeon™ processors, which incorporate the Hyper-Threading technology.

Company/system        Result   HTTP version                                  Processor configuration
Dell PowerEdge 6600   5750     Red Hat Content Accelerator 2                 Four Intel Xeon processors MP at 1.6 GHz
Dell PowerEdge 4600   4460     Red Hat Content Accelerator 2                 Two Intel Xeon processors at 2.2 GHz
Dell PowerEdge 4600   4320     Microsoft Internet Information Services       Two Intel Xeon processors at 2.2 GHz
                               (IIS) 5.0 and Scalable Web Cache (SWC) 3.0
Dell PowerEdge 2650   4130     Red Hat Content Accelerator 2                 Two Intel Xeon processors at 2.4 GHz

Figure 1. SPECweb99 results for Dell platforms that use Hyper-Threading technology

The PowerEdge 6600 server achieved the top four-way SPECweb99 result, and the PowerEdge 4600 servers and PowerEdge 2650 server attained the top two-way results.[2] These servers used industry-standard operating systems and achieved results higher than those of competing systems that used proprietary versions of UNIX®, such as the IBM® AIX®, Compaq® Tru64® UNIX, and HP-UX operating systems. Hyper-Threading contributed significantly to the record-breaking results of the Dell PowerEdge servers.

Tuning systems for optimal performance

The PowerEdge servers’ record-breaking SPECweb99 results were achieved by fine-tuning the operating system and Web server to best take advantage of Hyper-Threading. Since Hyper-Threading allows the operating system to recognize two logical processors per physical processor, efficient CPU binding configurations can be achieved. For example, the PowerEdge 6600 server in the SPECweb99 test used eight Intel Gigabit Ethernet network interface cards (NICs), so eight Web server threads were created and each was bound to a NIC IP address. Each NIC interrupt request (IRQ) was then bound to one CPU.[3] Figure 2 shows this configuration.

CPU    CPU type                     Thread     NIC    IP address        IRQ
CPU0   Physical                     Thread 0   eth0   184.108.40.206    IRQ25
CPU1   Physical                     Thread 1   eth1   220.127.116.11    IRQ26
CPU2   Physical                     Thread 2   eth2   18.104.22.168     IRQ23
CPU3   Physical                     Thread 3   eth3   22.214.171.124    IRQ24
CPU4   Logical (Hyper-Threading)    Thread 4   eth4   126.96.36.199     IRQ29
CPU5   Logical (Hyper-Threading)    Thread 5   eth5   188.8.131.52      IRQ30
CPU6   Logical (Hyper-Threading)    Thread 6   eth6   184.108.40.206    IRQ27
CPU7   Logical (Hyper-Threading)    Thread 7   eth7   220.127.116.11    IRQ28

Figure 2. CPU binding configurations for the PowerEdge 6600 server used in the SPECweb99 tests

To maximize performance when using Hyper-Threading technology for heavily loaded servers, the workload should be distributed across both physical and logical processors. Internal Dell benchmarks have shown that load balancing the system by binding interrupts and threads to specific processors can improve performance significantly.

Studying the impact on WebBench performance

The WebBench benchmark also measures the performance of Web servers.[4] It uses client systems to simulate Web browsers. Upon receiving a response from the server, the client records the information associated with the response and immediately sends another HTTP request to the server. This benchmark uses multiple clients to stress the server with a large number of requests, simulating a real-world scenario. WebBench provides both static suites (consisting of HTML, GIFs, and a few test executables) and dynamic suites (consisting of executable CGI and Internet Server Application Programming Interface, or ISAPI, programs) to test server performance.

The Dell System Performance and Analysis team used WebBench to test the impact of Hyper-Threading.[5] Figures 3 and 4 show the WebBench performance for static and dynamic workloads, respectively, when Hyper-Threading is turned on and off. The testbed consisted of 60 clients to generate a workload sufficient to stress the Web server.

[Figure 3: requests/second versus number of clients (1 to 60); series: 2CPU(HT-on)-static, 2CPU(HT-off)-static, 1CPU(HT-on)-static, 1CPU(HT-off)-static]

Figure 3. WebBench performance with Hyper-Threading technology enabled and disabled for a static workload

[Figure 4: requests/second versus number of clients (1 to 60); series: 2CPU(HT-on)-ISAPI(dynamic), 2CPU(HT-off)-ISAPI(dynamic), 1CPU(HT-on)-ISAPI(dynamic), 1CPU(HT-off)-ISAPI(dynamic)]

Figure 4. WebBench performance with Hyper-Threading technology enabled and disabled for a dynamic workload

On average, Hyper-Threading technology improved performance by 13.4 percent for the static workload on a system with one CPU. The performance improvement was negligible, however, when the workload was run on a two-CPU system. With two CPUs in the server, the disk system became a bottleneck and the WebBench workload did not scale enough to stress the CPUs. Dynamic workloads showed similar results: Hyper-Threading technology provided a 12.3 percent performance boost for the one-CPU configuration but a minimal improvement for the two-CPU system.

The WebBench tests show that, for multiprocessor servers, the CPUs must be stressed sufficiently to obtain an improvement with Hyper-Threading; otherwise, little or no improvement will be seen.

Improving Web server performance

When tested on two Web server benchmarks, Intel Hyper-Threading technology improved the performance of Dell PowerEdge servers. The SPECweb99 tests reveal that system workloads must be balanced effectively with interrupt binding and thread affinity to utilize the additional logical processors provided by Hyper-Threading. The WebBench testing shows that Hyper-Threading technology can have a positive impact on performance for a one-CPU configuration; for multiple CPUs, the benchmark does not stress the server enough to show a noticeable improvement. Bearing these benchmark results in mind, administrators can create the ideal system configurations to maximize the benefits of Hyper-Threading technology.

David J. Morse (email@example.com) is a senior performance engineer for the Dell System Performance and Analysis Lab. He specializes in Web server performance and is responsible for running the industry-standard SPECweb® Web server benchmark across the line of Dell PowerEdge servers. David has a B.S. in Computer Engineering from the University of South Carolina and is a Red Hat Certified Engineer (RHCE).

Yi-Ming Xiong, Ph.D. (email@example.com) is a system performance engineer on the Enterprise System Performance Team at Dell. Prior to Dell, he worked as a senior applications scientist at ThermaWave, Inc. He also held an assistant professorship at Tokyo University of Agriculture and Technology. Yi-Ming has a Ph.D. in Experimental Condensed Matter Physics from the University of Nebraska at Lincoln.

Ramesh Radhakrishnan, Ph.D. (firstname.lastname@example.org) is a design engineer consultant with the Dell System Performance and Analysis Lab. His responsibilities include performance analysis of Dell servers and characterization of enterprise-level benchmarks. Ramesh has a Ph.D. in Computer Engineering from the University of Texas at Austin.

FOR MORE INFORMATION

Intel Hyper-Threading technology: http://www.intel.com/technology/hyperthread
SPECweb99 benchmark: http://www.spec.org/osg/web99
Dell PowerEdge SPECweb99 results: http://www.dell.com/us/en/esg/topics/products_benchmark_pedge_spec_web.htm
WebBench benchmark: http://www.webbench.com

Notes

1 For information about upgrading the Red Hat kernel, refer to the Customization Guides of the Red Hat Linux Manuals at http://www.redhat.com/docs/manuals/linux.
2 These SPEC® results were published on http://www.spec.org as of May 27, 2002. For the latest SPECweb99 results, visit http://www.spec.org/osg/web99.
3 For more information about creating efficient binding configurations, see “Installing and Configuring the TUX Web Server for Optimal Performance” by David J. Morse in Dell Power Solutions, March 2002.
4 For more information about the WebBench benchmark, see http://www.webbench.com.
5 These tests were performed without independent verification by eTesting Labs, Inc.
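
The two Linux-side steps the article describes, checking /proc/cpuinfo for twice the number of physical CPUs and binding each NIC IRQ to one CPU, can be sketched in Python as shown below. This is an illustration under stated assumptions, not the configuration Dell used. The article does not say which mechanism performed the IRQ binding, so the sketch assumes the standard Linux /proc/irq/<n>/smp_affinity interface, which takes a hexadecimal CPU bitmask. The IRQ-to-CPU pairing is read from Figure 2 in column order, and all function names are hypothetical.

```python
"""Sketch: detect Hyper-Threading from /proc/cpuinfo text and build
IRQ-to-CPU binding commands. Names and mechanism are assumptions."""

# IRQ numbers as read from Figure 2, in column order (an assumption about
# the figure layout): the list index is the CPU the IRQ is bound to.
FIGURE2_IRQS = [25, 26, 23, 24, 29, 30, 27, 28]


def count_logical_cpus(cpuinfo_text):
    """Count 'processor : N' entries, one per CPU the kernel exposes."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


def hyper_threading_visible(cpuinfo_text, physical_cpus):
    """True when the OS reports twice the number of installed CPUs."""
    return count_logical_cpus(cpuinfo_text) == 2 * physical_cpus


def affinity_mask(cpu_index):
    """Hex bitmask with only bit `cpu_index` set (smp_affinity format)."""
    return format(1 << cpu_index, "x")


def irq_binding_commands(irqs):
    """Shell commands that bind the i-th IRQ in `irqs` to CPU i."""
    return [
        "echo {} > /proc/irq/{}/smp_affinity".format(affinity_mask(cpu), irq)
        for cpu, irq in enumerate(irqs)
    ]


if __name__ == "__main__":
    # Synthetic cpuinfo text standing in for a four-CPU server with
    # Hyper-Threading enabled; on a real system, read /proc/cpuinfo instead.
    sample = "".join(
        "processor\t: {}\nmodel name\t: Intel(R) Xeon(TM)\n\n".format(i)
        for i in range(8)
    )
    print(hyper_threading_visible(sample, physical_cpus=4))
    for command in irq_binding_commands(FIGURE2_IRQS):
        print(command)
```

The functions take text and return strings rather than touching the system, so the generated commands can be reviewed before anyone runs them as root. Binding the Web server threads themselves to CPUs is a separate step that this sketch does not cover.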