How We Test PCs

At PCWorld, we take the testing of PCs seriously. We aim to use reliable, repeatable tests that provide a broad and accurate picture of how well a laptop, desktop, or all-in-one PC performs. We want you to be able to tell immediately whether one computer is faster than another--and, more important, whether it is fast enough for your needs.

A team of dedicated professionals in the PCWorld Labs tests and benchmarks all of the PCs we receive, so you know that the results won’t vary because one reviewer failed to run the same tests in the same way as another reviewer did. Our philosophy is simple: Use carefully chosen, reliable, repeatable tests based on real-world applications and workloads. We strive to test every PC by running tasks that real users run, with a careful eye toward making sure that we run those tasks the same way on every PC.

The cornerstone of this philosophy is WorldBench 7, our primary PC testing suite. Recently rebuilt from the ground up, WorldBench runs a set of tasks in real-world applications such as Internet Explorer 9, Adobe Photoshop CS5, and VLC, and distills performance down to a single number. A WorldBench 7 score of 100 represents an “average” PC; a higher score indicates a faster system, and a lower score represents a slower machine. You can compare that score across all PCs, from laptops to desktops to all-in-one systems.

In addition to WorldBench 7, we test 3D graphics performance with modern games, as well as battery life on laptops. All these tests combine to form a simple performance score--ranging from 1 to 100--that tells you how well a PC performs relative to other machines in its category. Our Labs analysts also measure each PC's energy consumption during use, at idle, and in sleep mode, to produce a WorldBench Green Score, also on a scale of 1 to 100.

Our goal is to deliver reliable, lab-tested data that gives you the information you need to make the right purchase decision. Does the computer play modern games well? Does it use a lot of energy? Is it a good choice for video and photo editing? Does it boot up quickly? We know that these questions and others matter to PC buyers, and it is the goal of the PCWorld Labs to answer them.

How We Calculate a PC’s Score

Before I describe how we conduct our tests of laptops, desktops, and all-in-one PCs, I'll quickly review how we determine a product’s star rating. For PCs, the final rating is a weighted combination of three scores: Performance, Design and Usability, and Features and Specs. Each of these three scores ranges from 1 to 100, and together they form an overall score of 1 to 100 that we then translate into 1 to 5 stars, including half-stars.

Performance

The Performance score is a combination of the WorldBench 7 score, the results of our 3D game performance tests, and, in the case of laptops, our battery life tests. For each system, we compare the results of these lab tests against the results from other PCs in the same category (desktop replacement laptops or all-in-one PCs under 24 inches, for example). The Performance score is not subjective; it is determined entirely by the results of objective tests.

The weighting of the different tests varies by product category. For instance, for ultraportable laptops we use less-strenuous settings in our 3D game tests, and give the gaming results less weight, than we do for desktop replacement laptops. But no matter who is reviewing a laptop or desktop PC, all the tests that make up the Performance score are performed and calculated by the PCWorld Labs.

Design and Usability

This score is a subjective measurement of a variety of important characteristics. The reviewer determines this score by considering the aesthetics and build materials of the product, the quality of the display and off-axis viewing, how easy it is to type quickly and accurately on the keyboard, how easy it is to use the touchpad, the location of the ports, the sound quality, and so on. The reviewer considers these aspects relative to the attributes of other products in the same category, and the scoring remains subjective and ever-changing as design principles and technology evolve.

Features and Specs

As with the Design and Usability score, the Features and Specs score is up to the reviewer to determine, after he or she has considered the options that other products in the same category offer. Among the features we evaluate are the number and type of ports, the connectivity options, and the size and weight. Time and technology change what components PCs include, and the Features and Specs score adjusts to keep pace. For example, systems with USB 3.0 ports were unusual in early 2011, and earned high marks for that inclusion. Now, since USB 3.0 ports are commonplace, they don’t earn a system as many points in our evaluation.

The Performance, Design and Usability, and Features and Specs scores combine to form the final product score. We weight this combination differently for various product categories: For instance, the Performance score is more important for high-powered desktops and desktop replacement laptops than it is for small all-in-ones and ultraportable laptops.

WorldBench 7 Scoring

Though the mix of applications and workloads in WorldBench 7 differs from that of previous versions of WorldBench, the general principle is the same: We run all PCs through the same set of primarily real-world applications and workloads, timing how long each one takes, and comparing the result against a baseline configuration meant to represent an average PC.

For WorldBench 7, the specs of our baseline configuration are as follows:

  • CPU: Intel Core i5-2500K
  • RAM: 8GB DDR3-1333
  • Hard drive: 1TB 7200-rpm
  • Graphics card: Nvidia GeForce GTX 560 Ti

This machine forms the basis for determining the WorldBench 7 score, with the baseline represented as a score of 100. For example, if a PC completes a test 20 percent faster than the baseline, it earns a score of 120; if it is 20 percent slower, it receives a score of 80. We compute this score for each test, and then combine them all into a weighted average to produce the final WorldBench 7 score. We put all PCs--laptops, desktops, and all-in-ones--through the same WorldBench 7 tests, and score them against the same baseline. This consistency means that you can compare the WorldBench 7 scores of a laptop, a desktop, and an all-in-one against one another.
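
The per-test arithmetic can be sketched as follows. This is a minimal illustration, not the Labs' actual scoring code. It assumes that "faster" is measured as a ratio of speeds, so a timed test scores 100 times the baseline time divided by the measured time, and a higher-is-better test scores 100 times the measured result divided by the baseline result. The function name and the sample numbers are hypothetical.

```python
def relative_score(result, baseline, lower_is_better):
    """Score one test against the baseline, where the baseline equals 100.

    Assumption: performance is compared as a ratio of speeds. The article
    gives only the 120/80 example, not the exact formula.
    """
    if result <= 0 or baseline <= 0:
        raise ValueError("result and baseline must be positive")
    if lower_is_better:
        # Timed test: finishing in less time means a higher score.
        return 100.0 * baseline / result
    # Rate or points test: a larger result means a higher score.
    return 100.0 * result / baseline


# Hypothetical frame-rate test in which the baseline PC renders 50 fps.
print(round(relative_score(60, 50, lower_is_better=False)))  # 120 (20% faster)
print(round(relative_score(40, 50, lower_is_better=False)))  # 80 (20% slower)
```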

We break the WorldBench 7 tests into five sections: Office Productivity, Content Creation, Web Performance, Storage Performance, and Startup Time. Some of these sections comprise multiple tests, while others are a single test. Each section claims a percentage of the total WorldBench 7 score, as follows:

Test                    Percentage of WorldBench 7 score
Office Productivity     30%
Content Creation        25%
Web Performance         20%
Storage Performance     10%
Startup Time            15%
Total                   100%
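
As a rough sketch of how these percentages could combine into one number, the snippet below takes a weighted average of per-section scores. It assumes each section has already been reduced to a single score on the baseline-equals-100 scale. The function name and the sample scores are hypothetical, and how the Labs round or normalize the result is not stated here.

```python
# Section weights from the table above (percent of the final score).
SECTION_WEIGHTS = {
    "Office Productivity": 30,
    "Content Creation": 25,
    "Web Performance": 20,
    "Storage Performance": 10,
    "Startup Time": 15,
}


def worldbench_score(section_scores):
    """Weighted average of per-section scores (each with baseline = 100)."""
    missing = set(SECTION_WEIGHTS) - set(section_scores)
    if missing:
        raise KeyError(f"missing section scores: {sorted(missing)}")
    total_weight = sum(SECTION_WEIGHTS.values())  # 100
    weighted = sum(SECTION_WEIGHTS[name] * section_scores[name]
                   for name in SECTION_WEIGHTS)
    return weighted / total_weight


# Hypothetical section scores for a single PC.
example = {
    "Office Productivity": 110,
    "Content Creation": 120,
    "Web Performance": 90,
    "Storage Performance": 100,
    "Startup Time": 80,
}
print(worldbench_score(example))  # 103.0
```

A PC that matches the baseline in every section scores exactly 100 under this scheme, which agrees with the definition of the baseline.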

WorldBench 7 Tests in Detail

Each of the five categories of tests in WorldBench 7 consists of one or more tests. Here’s a description of each, along with links to the software we use. Note that we use the 64-bit version of each application if such a version is available and if the machine being tested uses the 64-bit version of Windows.

Office Productivity

Our Office Productivity test is the only somewhat synthetic test in the WorldBench 7 suite. It uses PCMark 7, which runs a wide assortment of tests to gauge the performance of various parts of the PC; we use only its Productivity score, which reflects typical office tasks such as editing text, browsing the Web, launching applications, and scanning for viruses. On this test, a higher score is better.

Content Creation

Our Content Creation tests are the most complicated, and consist of three subtests: Audio Encoding, Video Encoding, and Image Editing. The Audio Encoding test makes up 20 percent of the score, while the Video Encoding portion and the Image Editing subtest each make up 40 percent. On each of these tests, a lower score is better.

The Audio Encoding test takes a CD’s worth of uncompressed .wav files and converts them to MP3 format using VLC 1.1.11. (For more about VLC, read our review.)

To test Video Encoding, we use ArcSoft MediaConverter 7.5. This application supports Intel’s QuickSync, AMD’s APP, and Nvidia’s CUDA acceleration; we use whichever is available to the system for best performance. We convert a high-definition video clip (MPEG-4 video with AC-3 audio in an AVI container at 1080p resolution) to iPad format.

The video file we use is the royalty-free short film "Big Buck Bunny."

Our Image Editing test runs through the publicly available HardwareHeaven V3 test script using Adobe Photoshop CS5. This script performs a series of common Photoshop actions and filters on a very large image file, timing how many seconds the system takes to complete each one. We then record the total time as the Image Editing score.

Web Performance

Users spend more time on the Web than ever, and today’s sites and Web applications are increasingly demanding. Our Web Performance test is not concerned with the speed at which pages load, which varies greatly depending on your connection, your choice of browser, and other factors. Rather, we measure how well the system can render highly advanced, dynamic Web content, including HTML5 and JavaScript. To that end, we run the WebVizBench benchmark test using Internet Explorer 9. This test makes good use of multicore CPUs and GPU acceleration, and easily stresses even powerful systems. We run the tests in a 1280-by-720-pixel window and take the final frames-per-second result. On this test, a higher score is better.

Storage Performance

We run two tests to determine how fast a PC's storage subsystem is. The first is our File Operations test, in which we take a very large (6.9GB) directory full of images, videos, music files, and documents, and perform a controlled series of common tasks--copies, moves, and deletes--to another directory on the same drive. The second is our Compression test, in which we use that same data set to conduct a series of zip and unzip operations using 7-Zip 9.20. (For more about 7-Zip, see our review.) We time how long the system takes to complete each of these two sets of operations. The File Operations and Compression tests each count for half of the Storage Performance score. On these tests, a lower score is better.

Startup Time

For this basic test, we insert a command to open a simple text file into the PC’s Startup folder. Then we completely power down the PC, and time how long it takes to go from pressing the power button to the point at which the Windows desktop loads our text file. In other words, this test measures how long a PC needs to go from completely off to a state where the user can launch applications. On this test, a lower score is better.

Other Tests

In addition to WorldBench 7, we conduct two other sets of tests on all PCs: We run them through a series of game performance tests, and we measure their energy consumption. On laptops, we also measure battery life.

Game Performance

To test how well a PC runs modern games, we run benchmarks using Crysis 2 and Dirt 3. We run both games at two quality settings: The Low settings have reduced texture detail and effects, while the High settings turn the detail levels up. On desktops, we run both quality settings at four resolutions: 1024 by 768, 1680 by 1050, 1920 by 1080, and 2560 by 1600. In addition, we run an “Ultra” quality setting at 2560 by 1600 with all the settings--including all DirectX 11 features and antialiasing--cranked up. We measure performance in frames per second; higher scores are better, and a result of at least 30 frames per second is generally required for a game to be smooth enough for a good experience.

Laptops run the same two tests as desktops, but at resolutions more appropriate to the screens you find on laptops: 800 by 600, 1366 by 768, and 1920 by 1080.

We perform all tests that can run on a particular system (naturally, a laptop or all-in-one with a lower-resolution screen can’t run the highest-resolution tests). However, we don't use all tests equally in calculating a PC’s Performance score: On low-end systems, such as smaller all-in-ones and ultraportable laptops, we emphasize scores from the Low settings and lower resolutions. On performance desktops and desktop replacement laptops, we emphasize the High settings and higher resolutions.
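
The resulting set of game runs can be enumerated with a short sketch like the one below. It is illustrative only: the function name is hypothetical, the Ultra run is assumed to apply once per game and to desktops only, and all-in-ones are left out because their resolution list is not spelled out here.

```python
from itertools import product

GAMES = ("Crysis 2", "Dirt 3")
QUALITIES = ("Low", "High")
RESOLUTIONS = {
    "desktop": [(1024, 768), (1680, 1050), (1920, 1080), (2560, 1600)],
    "laptop": [(800, 600), (1366, 768), (1920, 1080)],
}
# 2560 by 1600 is the standard resolution of 30-inch desktop monitors.
ULTRA_RESOLUTION = (2560, 1600)


def game_test_runs(kind, screen=None):
    """List (game, quality, resolution) runs for a 'desktop' or 'laptop'.

    screen is the built-in display's (width, height), or None when the
    system has no fixed display limit.
    """
    def fits(res):
        return screen is None or (res[0] <= screen[0] and res[1] <= screen[1])

    runs = [(game, quality, res)
            for game, quality, res in product(GAMES, QUALITIES, RESOLUTIONS[kind])
            if fits(res)]
    # Assumption: the Ultra run applies to desktops only, once per game.
    if kind == "desktop" and fits(ULTRA_RESOLUTION):
        runs += [(game, "Ultra", ULTRA_RESOLUTION) for game in GAMES]
    return runs


print(len(game_test_runs("desktop")))                      # 18
print(len(game_test_runs("laptop", screen=(1366, 768))))   # 8
```

Filtering by the built-in screen size mirrors the point that a system with a lower-resolution display simply skips the runs it cannot show.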

WorldBench Green Score

The WorldBench Green Score is a measurement on a scale of 1 to 100 that evaluates a system’s energy usage relative to similar types of systems (laptops, desktops, or all-in-ones). For each PC, we measure the watts used and the time taken to complete the PCMark 7 productivity tests, arriving at the total watt-hours needed to handle that moderate workload. Then we let the machine sit for 15 minutes to measure how many watts the system consumes at idle, and convert that measure into watt-hours. We combine these results and compare them with results from similar machines. We weight 75 percent of the score toward the working watt-hours, and weight 25 percent toward the idle watt-hours.
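
The energy arithmetic behind these figures can be sketched as below. This is a simplified illustration under stated assumptions: it treats power draw as a single average wattage per phase, applies the 75/25 weights directly to the two watt-hour figures, and stops short of the 1-to-100 scale, since the comparison against similar machines is not described in enough detail to reproduce. All names and sample numbers are hypothetical.

```python
IDLE_MINUTES = 15  # idle measurement window described above


def watt_hours(average_watts, minutes):
    """Energy used when drawing average_watts for the given minutes."""
    return average_watts * minutes / 60.0


def weighted_energy(work_watts, work_minutes, idle_watts):
    """Blend working and idle energy, 75 percent and 25 percent."""
    working = watt_hours(work_watts, work_minutes)
    idle = watt_hours(idle_watts, IDLE_MINUTES)
    return 0.75 * working + 0.25 * idle


# Hypothetical PC: averages 80 W over a 30-minute workload, 40 W at idle.
print(watt_hours(80, 30))           # 40.0 watt-hours while working
print(watt_hours(40, IDLE_MINUTES))  # 10.0 watt-hours at idle
print(weighted_energy(80, 30, 40))   # 32.5
```

Because the workload is timed, a faster PC can use fewer working watt-hours than a slower one that draws less power, which is why both watts and time enter the calculation.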

We compare all standard desktops against one another, regardless of their category (performance, mainstream, or budget). All-in-one desktops, however, are an exception, as we compare them only against each other; since they have integrated monitors, we can't compare their energy use with that of standard desktop PCs. We compare all categories of laptops against each other. The higher the Green Score on the 1 to 100 scale, the less energy a system uses. We set laptop and all-in-one PC displays to a brightness of 95 cd/m2.

Battery Life

To measure laptop battery life, we first set the display to 65 cd/m2, or as close to that as possible. That’s a “low but readable” brightness setting, similar to what you would use when trying to save battery life. We then run a script that alternates between simulated typing at the command prompt and playing a full-screen high-definition movie (the same "Big Buck Bunny" video we use in our Video Encoding tests). The simulated typing runs for 10 minutes, then the full-screen video plays in VLC; we take care to ensure that hardware video acceleration is enabled in VLC. After 10 minutes of playing video, the script closes the video and returns to the typing test.

This loop repeats until the battery dies. We then fully recharge the laptop and repeat the test at least once to make sure that the results are consistent.
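
The timing of that loop can be modeled in a few lines. The sketch below does not drive any real applications; it only shows which workload is active at a given moment and how a measured battery life splits between typing and video. The function names and the 250-minute example are hypothetical.

```python
TYPING_MINUTES = 10
VIDEO_MINUTES = 10
CYCLE_MINUTES = TYPING_MINUTES + VIDEO_MINUTES


def phase_at(elapsed_minutes):
    """Return the workload active this many minutes into the rundown."""
    if elapsed_minutes < 0:
        raise ValueError("elapsed_minutes must not be negative")
    position = elapsed_minutes % CYCLE_MINUTES
    return "typing" if position < TYPING_MINUTES else "video"


def minutes_per_phase(battery_life_minutes):
    """Split a total battery life into (typing, video) minutes."""
    full_cycles, remainder = divmod(battery_life_minutes, CYCLE_MINUTES)
    typing = full_cycles * TYPING_MINUTES + min(remainder, TYPING_MINUTES)
    video = full_cycles * VIDEO_MINUTES + max(remainder - TYPING_MINUTES, 0)
    return typing, video


print(phase_at(5))             # typing
print(phase_at(15))            # video
print(minutes_per_phase(250))  # (130, 120)
```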

If you are a PC manufacturer and would like more detailed information about how we perform our tests, contact Jason Cross and Tony Leung.
