IBM Visual Performance Analyzer User Guide Version 6.2

IBM STG - Performance
Visual Performance Analyzer

IBM Visual Performance Analyzer User Guide
Version 6.2

Issue Date: 22/08/2008
Revision Status: Final

DLM Alphaworks Page 1 of 374

About this Document

This document describes how to install and use the Visual Performance Analyzer (VPA) tool. It helps you install the tool, collect performance data on your platform, and then analyze that data with the VPA plug-ins.

Table of Contents

1. INTRODUCTION
   1.1 VPA on Alphaworks
   1.2 Release History
2. VPA BASICS
   2.1 Design Objectives
   2.2 Deployment
   2.3 Software Stack Information
3. INSTALLATION
   3.1 Windows
       3.1.1 VPA RCP (Rich Client Platform) Installation
       3.1.2 VPA with IES (IBM Eclipse SDK) Installation
       3.1.3 VPA Update Site Installation
   3.2 Linux
       3.2.1 VPA RCP Installation
       3.2.2 VPA with IES (IBM Eclipse SDK) Installation
       3.2.3 VPA Update Site Installation
   3.3 AIX
       3.3.1 VPA RCP Installation
       3.3.2 VPA with IES (IBM Eclipse SDK) Installation
       3.3.3 VPA Update Site Installation
4. COLLECTING PERFORMANCE DATA
   4.1 Using Platform Tools
   4.2 Setting up Windows to collect Profiling data
       4.2.1 Verify that your Java Runtime is installed on your system
       4.2.2 Verify that the Windows performance tools are installed
       4.2.3 Verify PI Tprof
       4.2.4 Copying data files
   4.3 Setting up AIX to collect Profiling data
       4.3.1 Verify that your Java Runtime is installed on your system
       4.3.2 Verify that the AIX performance tools are installed
       4.3.3 Verify AIX Tprof
       4.3.4 Verify AIX Gprof
       4.3.5 Copying data files
   4.4 Collecting Profiling Data on Linux platform
       4.4.1 Linux Cell BE
       4.4.2 Linux PowerPC
       4.4.3 Linux X86
   4.5 Using Remote Connections View
       4.5.1 Create a remote connection
       4.5.2 Create a profile configuration
       4.5.3 Define CPC event sets
       4.5.4 SSH public key authentication
       4.5.5 Launch remote profiling on AIX or Linux system
       4.5.6 Launch remote hpmcount or hpmstat profiling tool
       4.5.7 Launch remote CPC
       4.5.8 Launch PDT on remote Linux Cell BE system
   4.6 Collecting Pipeline data on PowerPC
5. USING THE VPA ANALYSIS TOOLS
   5.1 Profile Analyzer
       5.1.1 Load an Existing Profile
       5.1.2 Profile Navigation
       5.1.3 Profile Comparison
       5.1.4 Profile Merge
       5.1.5 Symbol Analysis
       5.1.6 Couple with Code Analyzer
       5.1.7 Create/Configure/Refresh/Discard a Connection
       5.1.8 Configure database connections and manage cached database files
       5.1.9 View call graph in Call Graph view
       5.1.10 Create and Use Custom Counter
       5.1.11 Load symbols into the Inlined Calls View
   5.2 Code Analyzer
       5.2.1 Load an executable application for analysis
       5.2.2 Run instrumented executable file and collect profile data remotely
       5.2.3 Adding profiling information
       5.2.4 Navigate the Executable
       5.2.5 Instruction Properties Analysis
       5.2.6 Statistic Analysis
   5.3 Pipeline Analyzer
       5.3.1 Load an existing pipeline file
       5.3.2 Navigating the scroll pipeline view
       5.3.3 Navigating the resource view
       5.3.4 Manage Assignments
       5.3.5 Tie Cycle Controls
   5.4 Counter Analyzer
       5.4.1 Basic concepts for Counter Analyzer
       5.4.2 Load an existing counter data file
       5.4.3 Navigate the Counter Analyzer Perspective
       5.4.4 Watch Properties via Properties Sheet
   5.5 Trace Analyzer
       5.5.1 Basic concepts
       5.5.2 Load an existing trace file
       5.5.3 Navigate the Trace Analyzer Perspective
       5.5.4 Select Event
   5.6 Call Tree Analyzer
       5.6.1 Basic Concepts
       5.6.2 Load Call Trace Data File
       5.6.3 Call Tree Analyzer Perspective Introduction
       5.6.4 View Call Trace Data in Execution Flow Editor Page
       5.6.5 View Call Trace Data in Call Tree Editor Page
       5.6.6 Locate Invocation by Method Name
       5.6.7 Filter Invocations
       5.6.8 Filter the method in Method Overview
       5.6.9 Filter the method in Type Overview
       5.6.10 Drill Down one Invocation
       5.6.11 Setup Column Properties
       5.6.12 Show Color Bar Table
       5.6.13 Save Call Tree into Text File
       5.6.14 Save Execution Flow Graph as Image
       5.6.15 Show JProf Information
       5.6.16 Show Invocation in Invocation View
       5.6.17 View Call Stack
       5.6.18 View method in a table
       5.6.19 View call graph in Call Graph view
       5.6.20 Ways to invoke Invocation View
       5.6.21 Analyze memory information
       5.6.22 Change time stamp in execution flow graph
       5.6.23 Highlight in call tree table
       5.6.24 Detect repetition in execution flow graph
6. APPENDIX A - SAMPLE PROFILING SESSION

1. Introduction

What is Visual Performance Analyzer?

Visual Performance Analyzer (VPA) is an Eclipse-based performance visualization toolkit. It consists of six major components: Profile Analyzer, Code Analyzer, Pipeline Analyzer, Counter Analyzer, Trace Analyzer, and Call Tree Analyzer. As of version 6.2, VPA also supports the Call Graph view, which is shared by Profile Analyzer and Call Tree Analyzer.

Profile Analyzer

Profile Analyzer, a profile analysis tool, provides a powerful set of graphical and text-based views that allow users to narrow down performance problems to a particular process, thread, module, symbol, offset, instruction, or source line.
Profile Analyzer supports the AIX tprof profiling tool, the OProfile profiling tool, and IBM Performance Inspector (a kind of tprof). To load huge profile data files and reduce its memory footprint, Profile Analyzer now caches profile files in a database. Since version 2.0.3, Profile Analyzer can integrate with Code Analyzer for better navigation and comparison of module information.

Code Analyzer

Code Analyzer examines executable files and displays detailed information about functions, basic blocks, and assembly instructions. It is built on top of FDPR-Pro (Feedback Directed Program Restructuring) technology and can add FDPR-Pro and Tprof profile information. Code Analyzer can show statistics for navigating the code, display performance comments and grouping information about the executable files, and map back to source code.

Pipeline Analyzer

Pipeline Analyzer displays the pipeline execution of POWER series processors, as generated by tools such as Sim_GX. It provides graphical views in two modes, scroll mode and resource mode, and you can change their visual settings.

Counter Analyzer

Counter Analyzer is a common tool for analyzing hardware performance counter data across many IBM eServer platforms, including systems running AIX, i5/OS, Linux on POWER, and Linux on Cell BE. Counter Analyzer accepts hardware performance counter data generated by the AIX tools hpmcount and hpmstat in a cross-platform XML file format. The tool stores the raw performance counter data in either the built-in hsqldb database engine or an external DB2 instance. It provides multiple views to help you identify and eliminate performance bottlenecks by examining the hardware performance counter values, computed performance metrics, and CPI breakdown models.
Trace Analyzer

Trace Analyzer is an Eclipse-based tool that reads traces generated by the Performance Debugging Tool for Cell BE and displays a time-based graphical visualization of the program execution, together with a list of trace contents and the details of the selected event.

Call Tree Analyzer

Call Tree Analyzer analyzes call trace data collected by tools such as the Performance Inspector JProf profiling tool. The call trace data records information such as when one method calls another and how much time is spent in each invocation. Call Tree Analyzer provides two main visualizations of call trace data: the execution flow graph and the call tree table.

How does it work?

Profile Analyzer parses system profiles into an internal profiling data model that supports the profile hierarchy, offset locations, tick counts, CPU counter data, source line information, and disassembly. The plug-in then displays this data model using various Eclipse views. The supported system profiles are those produced by Performance Inspector, AIX® Tprof, and Linux OProfile. However, Visual Performance Analyzer can be extended to support almost any platform by converting a system profile to an XML schema that it understands.

Code Analyzer reads profiling information generated by the AIX Tprof or FDPR-Pro performance tools. It reads in executable files and shared libraries and analyzes them using FDPR-Pro, a post-link analyzer and performance optimization tool that can perform accurate static and dynamic analysis of executable files.

Pipeline Analyzer reads the .pipe and .config input files that are produced by the IBM Performance Simulator for Linux on POWER. An instruction trace is first collected and analyzed by a processor model. The two output files are then produced for viewing with either the Performance Simulator or Visual Performance Analyzer.
Counter Analyzer reads the XML output from hardware data collection tools. The XML is parsed and then displayed or graphed for viewing. If a CPI breakdown model is available, the data can be broken down into individual components and viewed in the CPI tab. The CPI breakdown shows where the workload is spending its processing cycles.

Trace Analyzer reads traces generated by the Performance Debugging Tool for Cell BE and displays a time-based graphical visualization of the program execution, together with a list of trace contents and the details of the selected event.

Call Tree Analyzer reads the call trace data file and displays an execution flow graph and a call tree to help you analyze when and where a method invocation happens and how long it runs.

1.1 VPA on Alphaworks

Visual Performance Analyzer was released on Alphaworks to explore the use of Eclipse-based performance tools with IBM customers. VPA is built as an Eclipse Rich Client Platform (RCP) package, and there are versions for AIX, Linux, and Windows. An RCP release contains the IBM JRE, the Eclipse runtime files, all required plug-ins, and the VPA plug-ins.

1.2 Release History

Date          Description
09/14/2006    Initial release of VPA to Alphaworks
06/08/2007    VPA 5.0 Release
09/28/2007    VPA 6.0 Release
01/08/2008    VPA 6.1 Release
08/22/2008    VPA 6.2 Release

2. VPA Basics

Visual Performance Analyzer is an Eclipse-based tool set that includes Profile Analyzer, Code Analyzer, Pipeline Analyzer, Counter Analyzer, Trace Analyzer, and Call Tree Analyzer. All of these tools are Eclipse plug-ins.
Figure 1 System Architecture of Visual Performance Analyzer
(Diagram: the Profile, Code, Pipeline, Counter, Trace, and Call Tree Analyzer plug-ins run on Eclipse and take input from platform tools: AIX tprof, Linux oProfile, PI tprof, AIX gprof, FDPR-Pro, Sim_GX, AIX hpmcount, AIX hpmstat, Cell BE cpc, Cell BE PDT, and PI jprof.)

Profile Analyzer

Profile Analyzer is a system profile analysis tool. This plug-in obtains profile information from various platform-specific tools and provides analysis views that help you identify performance bottlenecks.

Pipeline Analyzer

Pipeline Analyzer gets pipeline information for POWER processors from the Sim_GX tool and provides two analysis views, one in scroll mode and one in resource mode.

Code Analyzer

Code Analyzer reads XCOFF files (the AIX binary file format) or ELF files for Linux on POWER, and displays the program structure with block information. With related profile information, it can provide analysis views of the hottest program blocks as well as some optimization suggestions.

Counter Analyzer

Counter Analyzer reads counter data files generated by the AIX hpmcount or hpmstat performance tools. It provides multiple views to help you identify and eliminate performance bottlenecks by examining the hardware performance counter values, computed performance metrics, and CPI breakdown models.

Trace Analyzer

Trace Analyzer reads traces generated by the Performance Debugging Tool for Cell BE and displays a time-based graphical visualization of the program execution, together with a list of trace contents and the details of the selected event.

Call Tree Analyzer

Call Tree Analyzer reads the call trace data file and displays an execution flow graph and a call tree to help you analyze when and where a method invocation happens and how long it runs.
2.1 Design Objectives

The basic objective of Visual Performance Analyzer is to extend the capabilities of Eclipse by adding plug-in support for system profile, code, pipeline, performance counter, and trace analysis. VPA is a collection of performance data analysis tools that can be used to identify performance bottlenecks. VPA does not supply performance data collection tools. Instead, it relies on platform-specific tools, such as AIX Tprof, to collect the performance data. When necessary, multi-platform support is provided by converting data into XML that follows a schema VPA understands; VPA then parses and loads the XML for analysis. VPA is extensible: additional plug-ins can be added, and plug-ins can be integrated with each other, for example through shared internal data models and linked views.

Information about VPA data files:

- The .etm file is the XML file for Profile Analyzer.
- The .etz file is the zipped XML profile data.
- The .opm file is the oProfile XML file for Profile Analyzer.
- The .opz file is the zipped oProfile XML file for Profile Analyzer.
- The Java profile data from the IBM JRE Java profiling tools is merged by the TProf tools into a single .etm file. No additional post-processing is needed.
- The pipeline files are the .pipe data file and the .config file, which is the default configuration file.
- The .pmf file is the XML file for Counter Analyzer.
- The .pex file is the XML configuration file and the .trace file is the binary data file for Trace Analyzer.
- The .jprof file is from Performance Inspector JProf.
- The gprof remote file and the gmon.out file are from AIX gprof.

2.2 Deployment

As a performance analysis tool, Visual Performance Analyzer typically runs as a client application on the user's ThinkPad or desktop. Visual Performance Analyzer can get performance-related data from servers through the Remote Connection plug-in (SSH), by copying the files over FTP, or by some other means.
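Once data files have been copied to the client, the file extension indicates which analyzer reads them. As an illustration only, the extensions listed in section 2.1 can be collected into a small lookup. The function name vpa_analyzer_for is hypothetical and not part of VPA. The .etz and .jprof entries assume that those files open in Profile Analyzer and Call Tree Analyzer respectively, which the guide implies but does not state outright. The gprof files are left out because the list does not name their analyzer.

```shell
#!/bin/sh
# Hypothetical helper (not part of VPA): print the analyzer that reads a
# data file, based only on the extensions listed in section 2.1.
vpa_analyzer_for() {
    case "$1" in
        *.etm|*.etz|*.opm|*.opz) echo "Profile Analyzer" ;;    # .etz: assumed
        *.pipe|*.config)         echo "Pipeline Analyzer" ;;
        *.pmf)                   echo "Counter Analyzer" ;;
        *.pex|*.trace)           echo "Trace Analyzer" ;;
        *.jprof)                 echo "Call Tree Analyzer" ;;  # assumed
        *)                       echo "unknown"; return 1 ;;   # not in the list
    esac
}
```

For example, calling vpa_analyzer_for with a file named run1.etm prints Profile Analyzer, and any extension outside the list prints unknown and returns a non-zero status.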
Figure 2 System Deployment of Visual Performance Analyzer

2.3 Software Stack Information

The product stack, from top to bottom:

Visual Performance Analyzer
Eclipse Rich Client Platform (with dependency plug-ins)
Java Runtime Environment
Operating System (AIX, Windows, Linux …)

Figure 3 Product Stack of Visual Performance Analyzer

VPA runs on the following operating systems:
(1) Windows XP with SP2 or later
(2) IBM AIX 5.3 at the latest maintenance level
(3) Linux/x86: RHEL 5.1 and SUSE 10.1

Profile Analyzer, Pipeline Analyzer, Trace Analyzer, Counter Analyzer, and Call Tree Analyzer are Eclipse plug-ins written entirely in Java, so they can run on all the supported platforms above. Code Analyzer is also an Eclipse plug-in, but it depends on the FDPR-Pro libraries, which are platform-dependent. In this release, Code Analyzer can run on Windows, AIX 5.3, and Linux x86.

Although VPA runs only on the operating systems above, it can analyze data collected on any platform, provided the data is supplied in a format that VPA understands.

VPA supports only IBM J9 JRE 5, which is included in the VPA RCP distribution.

VPA supports the following Eclipse platforms:
(1) Eclipse 3.3

3. Installation

No installer is required for VPA. Installation is as simple as:
1. Download the newest VPA RCP, VPA with IES, or VPA Update Site release. It is usually a zip archive or a compressed tar archive.
2. Extract the archive.
3. Run the application by executing the vpa binary or the vpa.sh script.

The RCP application does not include the following products: Performance Inspector for Windows and DB2 UDB. If you want to use the features that depend on them, you must install the corresponding product manually.
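For a compressed tar archive, the extract step above can be sketched as a small shell function. This is a sketch under assumptions, not an official install script: the function name install_vpa and the destination directory are illustrative, the archive is assumed to be downloaded already, and the tar options assume a tar that supports -z and -C (such as GNU tar).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with VPA): extract a downloaded VPA
# archive into a directory of your choice.
install_vpa() {
    archive=$1    # path to the downloaded .tgz archive
    dest=$2       # directory to extract into; created if it does not exist

    # Step 1 (download) is assumed to be done already.
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }

    # Step 2: extract the archive into the destination directory.
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest"

    # Step 3 is left to the user: run the vpa binary or the vpa.sh script
    # from the extracted tree (the exact path depends on the archive layout).
}
```

A call would pass the archive path and a target directory as the two arguments. Choosing a target directory whose name includes the VPA version keeps several releases installed side by side.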
Configuration

No configuration is required for the VPA application installation. Advanced configuration information is provided in the online help. It addresses special requirements, such as setting a larger JVM heap size for Eclipse when you analyze a large profile.

Uninstallation

No special uninstallation action is required. To uninstall a VPA RCP application, simply delete the directory that VPA was installed to.

3.1 Windows

These steps walk you through the installation of VPA on your Windows workstation.

3.1.1 VPA RCP (Rich Client Platform) Installation

1. Download the latest VPA (Visual Performance Analyzer) from http://www.alphaworks.ibm.com/tech/vpa, and save vpa-rcp-${version}-win32.zip to your preferred download directory.
2. Install VPA RCP.
   • Right-click the file and select Extract All to open the Extraction Wizard.
     Figure 4 Extract the VPA RCP package
   • Select Next.
     Figure 5 Select Next to continue extraction
   • New versions of VPA will be released over time. To avoid problems when upgrading, create a new folder whose name contains the version number and install VPA to that folder. If each version is installed this way, you will have multiple working versions, and when there is a problem you can go back to an older version.
   • Select a root directory and folder to extract files to.
You must create the folder yourself, because the wizard does not create it automatically.
Figure 6 Select a root directory and a folder to extract files to
• Click Finish.
Figure 7 Click Finish to complete extraction
3. Create a shortcut.
• If you selected Show extracted files, a window with the folder and its contents opens. Create a shortcut on the desktop.
Figure 8 Create a shortcut on the desktop
• Run vpa.exe.
Figure 9 The splash screen of Visual Performance Analyzer
• If you see the Welcome view when you start VPA, Eclipse is not running any of the VPA tools. Click the links in the Welcome view to switch to the tool perspective you want, or close the Welcome view and switch to one of the VPA tool perspectives yourself.
Figure 10 The Welcome view of Visual Performance Analyzer

3.1.2 VPA with IES (IBM Eclipse SDK) Installation

1. Download the latest Windows version of VPA IES from http://www.alphaworks.ibm.com/tech/vpa.
2. Unzip the downloaded vpa-ies-${version}-win32.zip file to your VPA working directory, e.g. c:\vpa-ies.
3. Launch Eclipse.

3.1.3 VPA Update Site Installation

The installation instructions for Windows assume that VPA will be installed in the c:\vpa-update-site directory; however, you can install VPA in any directory you want. Substitute your installation directory for c:\vpa-update-site in the commands that follow.

Prerequisites: You must install IBM JRE 1.5.x, Eclipse SDK 3.3.x, GEF 3.3.x, and CDT 4.0.x before you start the installation.

1. Download the Windows version of the VPA update site from http://www.alphaworks.ibm.com/tech/vpa.
2. Unzip the vpa-update-site-${version}-win32.zip file that you downloaded to c:\vpa-update-site.
3. Launch Eclipse.
4. Select Help -> Software Updates -> Find and Install ...
from the Eclipse menu.
Figure 11 Find and install an application
5. Select Search for new features to install and click Next.
Figure 12 Select Search for new features to install
6. Click New Local Site.
Figure 13 New local site
7. Select c:\vpa-update-site and click OK, and then click Finish.
Figure 14 Select a new local site
8. Select the VPA plug-ins to install.
Figure 15 Select VPA plugins to install
9. Restart Eclipse.

3.2 Linux

These steps walk you through the installation of VPA on your Linux workstation. The supported Linux platform is Linux/x86: RHEL 5.1 and SUSE 10.1.

3.2.1 VPA RCP Installation

1. Download the latest VPA (Visual Performance Analyzer) from http://www.alphaworks.ibm.com/tech/vpa.
2. Install VPA RCP.
• Go to the directory where the gz file is: cd /favdir
• Change the file attributes: chmod 755 vpa-rcp-${version}-linux-x86.tgz
• Decompress the file: tar -xzvf vpa-rcp-${version}-linux-x86.tgz

3.2.2 VPA with IES (IBM Eclipse SDK) Installation

These installation instructions assume that VPA will be installed in the /opt/vpa directory; however, you can install VPA in any directory you want. Substitute your installation directory for /opt/vpa in the commands that follow.

1. Create a VPA directory with the command mkdir /opt/vpa.
2. Download the Linux version of VPA to the created directory.
3. Go to the created directory with the command cd /opt/vpa.
4. Decompress the installation package with the command tar -zxvf vpa-ies-6.2.0-linux-x86.tgz.
5.
Launch Eclipse with the following commands, in order:
• cd /opt/vpa/vpa-ies
• ./eclipse or ./vpa.sh

3.2.3 VPA Update Site Installation

3.2.3.1 Prerequisites

• Install Eclipse 3.3 (download Eclipse 3.3 from Eclipse.org and unzip it).
• Install IBM Java 5 SDK.
• Install CDT 4.0. Refer to http://www.eclipse.org/cdt/
• Install GEF 3.3. Refer to http://www.eclipse.org/gef/

3.2.3.2 Install Update Site RPM

To install the VPA features into Eclipse, follow these steps.

1. Install vpa-update-site-6.2.0-1.noarch.rpm: type the command rpm -ivh vpa-update-site-6.2.0-1.noarch.rpm to install the VPA update site to the /opt/ibm/vpa directory.

The remaining steps to install the VPA plug-ins are similar to the steps on Windows; refer to steps 3 to 9 of the Windows VPA Update Site Installation (section 3.1.3).

3.3 AIX

These steps walk you through the installation of VPA on your AIX workstation.

3.3.1 VPA RCP Installation

1. Download the latest VPA (Visual Performance Analyzer) from http://www.alphaworks.ibm.com/tech/vpa and save vpa-rcp-${version}-aix-ppc.tgz to your preferred download directory.
2. Install VPA RCP.
• Go to the directory where the gz file is: cd /favdir
• Change the file attributes: chmod 755 vpa-rcp-${version}-aix-ppc.tgz
• Decompress the file: gzip -dc vpa-rcp-${version}-aix-ppc.tgz | tar xvf -

3.3.2 VPA with IES (IBM Eclipse SDK) Installation

The installation instructions assume that VPA will be installed in the /opt/vpa directory; however, you can install VPA in any directory you want. Substitute your installation directory for /opt/vpa in the commands that follow.

The steps to install VPA with IES on AIX are similar to the steps on Linux, except for step 4. On AIX, decompress the installation package vpa-ies-6.2.0-aix-ppc.zip with the unzip application, and then make the extracted files executable with the command chmod -R +x /opt/vpa/vpa-ies. Refer to the Linux VPA with IES (IBM Eclipse SDK) Installation (section 3.2.2) for the other steps.
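The AIX variant of the IES installation can be sketched as follows. The sketch rests on assumptions: the package is taken to be a zip archive that has already been downloaded into the target directory, the unzip application is taken to be installed, and the function names install_vpa_ies_aix and run as well as the DRY_RUN variable are hypothetical helpers rather than part of VPA.

```shell
#!/bin/sh
# Sketch of the AIX "VPA with IES" installation steps.
# Assumptions: unzip is installed and the package is already in the target directory.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# install_vpa_ies_aix [DIR] [PACKAGE]
# The body runs in a subshell, so the cd does not change the caller's directory.
install_vpa_ies_aix() (
    dir=${1:-/opt/vpa}
    pkg=${2:-vpa-ies-6.2.0-aix-ppc.zip}
    run mkdir -p "$dir" &&
    run cd "$dir" &&
    run unzip "$pkg" &&
    run chmod -R +x "$dir/vpa-ies"
)
```

Running the function with DRY_RUN=1 set lists the commands for review before anything is changed on the system. Eclipse is then launched from the vpa-ies directory, as in the Linux steps.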
Note: You can download the unzip application from the IBM AIX Toolbox (http://www-03.ibm.com/systems/p/os/aix/linux/toolbox/download.html). After you download the unzip package, for example unzip-x.xx-x.aix5.1.ppc.rpm, upload it to AIX and install it with the command rpm -ivh unzip-x.xx-x.aix5.1.ppc.rpm. For further information, installation tips, and news, refer to the AIX Toolbox for Linux Applications ReadMe (ftp://ftp.software.ibm.com/aix/freeSoftware/aixtoolbox/README.txt).

3.3.3 VPA Update Site Installation

These installation instructions assume that VPA will be installed in the /opt/vpa directory and that Eclipse is installed in the /opt/eclipse directory; however, you can install VPA in any directory you want and substitute your installation directory for /opt/vpa in the commands that follow.

Prerequisites: You must install IBM JRE 1.5.x, Eclipse SDK 3.3.x, GEF 3.3.x, and CDT 4.0.x before the installation.

1. Create a VPA directory with the command mkdir /opt/vpa.
2. Download the AIX version of the VPA update site from http://www.alphaworks.ibm.com/tech/vpa.
3. Go to the directory with the command cd /opt/vpa.
4. Decompress the file with the command gunzip vpa-update-site-6.2.0-aix-ppc.tgz, and then extract it with the command tar -xvf vpa-update-site-6.2.0-aix-ppc.tar.

The remaining steps to install the VPA plug-ins are similar to the steps on Windows; refer to steps 3 to 9 of the Windows VPA Update Site Installation (section 3.1.3).

Note: If you install the Code Analyzer plug-in, you must type the following command before running Eclipse:
export LIBPATH=/opt/eclipse/plugins/com.ibm.vpa.ca.fdprpro.aix_2.2.2/os/aix

4. Collecting Performance Data

VPA is a collection of performance data analysis tools. It relies on the platforms to provide the tools for collecting data and converting the data into a format that VPA supports.

4.1 Using Platform Tools

Visual Performance Analyzer works with the following tools for collecting profile data.
• AIX Tprof
• Performance Inspector for Windows Tprof
• IBM JRE Java profiling tools
• Linux oProfile on x86, ppc and Cell BE
• AIX hpmcount or hpmstat
• Linux Cell Performance Counter on Cell BE
• Linux Performance Debugging Tools on Cell BE
• AIX Gprof

Profile data from AIX tprof is converted into an XML file by using the -X flag. The .etm file is the XML file for Profile Analyzer; the .etz file is the zipped XML profile data.

Profile data from PI Tprof is in the .out format, which Profile Analyzer supports directly.

Java profile data from the IBM JRE Java profiling tools is merged into the profiles of the preceding tools.

Pipeline data is generated by tools found in the IBM Performance Simulator for Linux on POWER™ project on Alphaworks. The .pipe file is the pipeline data file and the .config file is its default configuration file.

Figure 16 The performance tools and input files of VPA (diagram relating the collection tools AIX hpmcount/hpmstat, Cell BE Linux cpc, Linux oProfile, Windows Performance Inspector Tprof, AIX Tprof, Performance Simulator Project, Performance Inspector JProf, Cell BE PDT, and AIX gprof to the input files, namely binary files, .opm, .pmf, .out, .pipe and .config, .etm, and .remote, and to the Code, Counter, Profile, Pipeline, Call Tree, and Trace Analyzers)

4.2 Setting up Windows to collect Profiling data

4.2.1 Verify that your Java Runtime is installed on your system

Run the following command:
java -version

You should see something similar to the following:
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2)
Classic VM (build 1.4.2, J2RE 1.4.2 IBM Windows 32 build cn142-20050609 (JIT enabled: jitc))

Note: You need version 1.4.1 or higher.

4.2.2 Verify that the Windows performance tools are installed

VPA runs with the Performance Inspector for Windows
performance tools. Run the following command:
Swtrace -?

You should see something similar to the following:
D:\>swtrace -?
SWTRACE Version: 7.1.1
Valid SWTRACE commands:
…

The Performance Inspector for Windows package can be downloaded from http://www.alphaworks.ibm.com/tech/pi

4.2.3 Verify PI Tprof

You can verify the operation of the PI tools by capturing a system trace with these steps:
1. Swtrace init
2. Swtrace enable Tprof
3. Swtrace on
4. Swtrace off
5. Swtrace get
6. Swtrace post
7. Post

You should then find a PI profile (.out file) in your working directory. Refer to the PI documentation for details on the PI tools. You can capture traces yourself, or you can configure VPA to collect traces. Refer to the Profile Analyzer plug-in section in this document.

4.2.4 Copying data files

Running the Performance Inspector for Windows Tprof produces an ASCII profile (.out) file. You can use FTP to transfer the file to the system running VPA, or open the profile locally if VPA is installed on the same system. See section 4.5 about using the Remote Connections view.

4.3 Setting up AIX to collect Profiling data

4.3.1 Verify that your Java Runtime is installed on your system

Run the following command:
java -version

You should see something similar to the following:
java version "1.4.1"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1)
Classic VM (build 1.4.1, J2RE 1.4.1 IBM build cxppc321411-20040301 (JIT enabled: jitc))

Note: You need version 1.4.1 or higher.

4.3.2 Verify that the AIX performance tools are installed

Recent versions of AIX Tprof can generate XML profiles. AIX 5.3 TL5 or higher is required. The utility that produces a VPA profile from the Tprof output is bundled with the bos.perf.tools package. The package includes an updated version of Tprof, Symlib, and the added tprof2xml utility.
Verify the installation of the bos.perf.tools package:
lslpp -L bos.perf.tools | grep "bos.perf"

If the package is not installed, you can install it with smitty or installp.

4.3.3 Verify AIX Tprof

You can verify the operation of the AIX tools by capturing a system trace with this command:
tprof -eukj -X -A -F -r vpa_test -x sleep 5

You should then find a Tprof profile (vpa_test.etm file) in your working directory. You can capture traces yourself, or you can configure VPA to collect traces. Refer to the Profile Analyzer plug-in section in this document.

4.3.4 Verify AIX Gprof

You can verify the operation of AIX Gprof by profiling a program with the following steps.

Example: To profile the sample program dhry.c:
1. Recompile the application program with the cc -pg command. Type cc -pg dhry.c -o dhry to recompile the program so that it produces gprof output.
2. Run the recompiled program. A file named gmon.out is created in the current working directory (not the directory in which the program executable resides). Type dhry to execute the program and generate the gmon.out file.
3. Run the gprof command in that directory to produce the call graph file. Type gprof -c to generate the gprof.remote file.

4.3.5 Copying data files

All versions of AIX support FTP. Once AIX Tprof has produced the XML profile file (.etm file), you can use FTP to transfer the file to the system running VPA. If VPA is installed on the same AIX system, you can open the profile locally. See section 4.5 about using the Remote Connections view.
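The AIX verification steps in sections 4.3.3 and 4.3.4 can be wrapped in a short script that also checks that the expected output file was produced. This is a sketch only: the function names profile_ready, verify_tprof, and verify_gprof are hypothetical, the tprof and cc commands are taken from the text above, and whether the sample program needs interactive input is not covered here.

```shell
#!/bin/sh
# Sketch: run the AIX verification commands and confirm the output file exists.
# All function names are illustrative helpers, not part of AIX or VPA.

# profile_ready FILE: succeed only if FILE exists and is not empty.
profile_ready() {
    [ -s "$1" ]
}

# verify_tprof [NAME]: run the tprof verification command, then check NAME.etm.
verify_tprof() {
    name=${1:-vpa_test}
    tprof -eukj -X -A -F -r "$name" -x sleep 5 || return 1
    profile_ready "$name.etm"
}

# verify_gprof SOURCE PROGRAM: compile with -pg, run the program, then check
# that gmon.out was written to the current working directory.
verify_gprof() {
    cc -pg "$1" -o "$2" || return 1
    "./$2" || return 1
    profile_ready gmon.out
}
```

On an AIX system, verify_tprof followed by a check of its exit status tells you whether vpa_test.etm is ready to transfer to the system running VPA.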
4.4 Collecting Profiling Data on Linux platform

4.4.1 Linux Cell BE

• Hardware: Cell BE blade
• Software Requirement: Fedora Core 7 with Cell BE SDK 3.0 installed
• Verify oprofile 0.9.3: opcontrol and opreport -X
• Tool Usage: After verifying that oprofile is installed, collect a profile as follows:
1. Use "opcontrol --init" to initialize the oprofile module.
2. Use "opcontrol --event=event:count" to add an event to measure with the hardware performance counters (use "opcontrol -l" to list the event names and minimum counts).
3. Use "opcontrol --separate=all" to separate the samples based on the given separator. This step is not optional; VPA requires it.
4. Use "opcontrol --start" to start collecting profiling data, and then start the user application.
5. Use "opcontrol --stop" to stop collecting profiling data.
6. Use "opcontrol --dump" to force a flush of the collected profiling data to the oprofile daemon process.
7. Use "opreport -X -g -l -d -o xxx.opm" to generate the XML output, which can be imported into Profile Analyzer.
The XML output file must have the extension '.opm', which identifies it as an acceptable file. The '-g' argument is not compulsory, but when the user application is built with debug information, '-g' adds source lines and other debug information to the XML file. You can use "opcontrol --reset" to clear the collected data, and "opcontrol --deinit" to unload the oprofile module.

• Event mode and timer mode
Oprofile has two profiling modes for taking process samples: event mode and timer mode. For a number of processors, oprofile is aware of the hardware events and records samples when the events take place; this is called event mode. For processors that oprofile does not support, only timer mode is available. In timer mode, oprofile takes its profiling samples based on the system timer.
You can manually set oprofile to timer mode even if it is installed on a Linux system with a supported processor; this involves configuring the oprofile module parameters.

In event mode, some parameters must be specified for each hardware event. You set the event count and masks to control the profiling actions. The parameters also affect the sample frequency, the data accumulation speed, and sample filtering. In timer mode, you do not have to configure any timer parameter; oprofile accepts the system default timer settings, and no mask or filter settings are valid. For detailed event mode and timer mode specifications, refer to the oprofile online documents.

Some important command usages:

opcontrol --init : Load the oprofile module and oprofilefs.

opcontrol --event=event_name:count:unit_mask:kernel-space_count:user-space_count : Choose an event with the specified event_name, count, unit_mask, kernel-space counting, and user-space counting. The unit_mask, kernel-space counting, and user-space counting are optional. A default event can be specified with the command "opcontrol --event=default". Generally, the default event is the system timer of the OS and hardware.

opcontrol -l : List event types and unit masks.

opcontrol --callgraph=#depth (short form -c) : Enable call graph sample collection with a maximum depth.

opcontrol --start/--stop/--reset/--deinit : Start oprofile, stop oprofile, reset the profile data in the default session, and unload the oprofile module, respectively.

opreport -X -g -d -l -o xxx.opm : Generate the XML output. The options are:
-X: write the output in XML format.
-g: show source file and line for each symbol.
-l: list per-symbol information instead of a binary image summary.
-d: show per-instruction details for all selected symbols.
-c: show call graph.
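The collection sequence described in this section can be scripted. The sketch below follows the command order given in the text; the function names collect_opm and run and the DRY_RUN variable are hypothetical, and the event specification and application command are placeholders that you supply. The opcontrol and opreport options used are the ones listed above.

```shell
#!/bin/sh
# Sketch: collect an oprofile profile for one application and write it as .opm.
# collect_opm, run, and DRY_RUN are illustrative names, not part of oprofile or VPA.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# collect_opm EVENT OUTPUT COMMAND [ARGS...]
# EVENT is an opcontrol event specification (for example "default"),
# OUTPUT must end in .opm, and COMMAND is the application to profile.
collect_opm() {
    event=$1
    out=$2
    shift 2
    case "$out" in
        *.opm) ;;
        *) echo "output file must end in .opm: $out" >&2; return 1 ;;
    esac
    run opcontrol --init &&
    run opcontrol --event="$event" &&
    run opcontrol --separate=all &&   # required by VPA according to the text
    run opcontrol --start &&
    run "$@" &&
    run opcontrol --stop &&
    run opcontrol --dump &&
    run opreport -X -g -l -d -o "$out"
}
```

With DRY_RUN=1 set, collect_opm default demo.opm ./my_app prints the eight commands in order, which is a convenient way to review the sequence before running it on the target system. The sketch leaves out "opcontrol --reset" and "opcontrol --deinit"; add them if you want to clear old samples or unload the module afterwards.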
4.4.2 Linux PowerPC

• Hardware: System p servers or POWER blade
• Software Requirement: Linux, oprofile (oprofile download link: http://oprofile.sourceforge.net)
• Verify oprofile: the same as Cell BE
• Tool Usage: the same as Cell BE

4.4.3 Linux X86

• Hardware: X86 based machine
• Software Requirement: Linux, oprofile (oprofile download link: http://oprofile.sourceforge.net)
• Verify oprofile: the same as Cell BE
• Tool Usage: the same as Cell BE

4.5 Using Remote Connections View

You can configure VPA to use the Remote Connections view to collect profile data remotely and to transfer files. In the Remote Connections view, you first create a connection to the remote system, and then create configurations that launch profile data collection remotely for different systems and profiling tools.

4.5.1 Create a remote connection

The following steps illustrate how to configure a remote connection.

• Open the Remote Connections view
Launch VPA and open the Profile Analyzer or Counter Analyzer perspective; the Remote Connections view appears by default. Alternatively, choose Window -> Show view -> Other -> Visual Performance Analyzer -> Remote Connections to open it.
Figure 17 Open the Remote Connections view

• Configure connection parameters
1. Click the Create Connection button at the upper right of the Remote Connections view. A Connections Properties dialog opens.
2. Fill in the parameters in the Connections Properties dialog.
Figure 18 Configure the remote connection properties
Host: Specify the IP address of the remote system.
Port: The Remote Connections view connects to the remote system with the SSH protocol, so the default port value is "22".
User: Specify a user name on the remote system.
(Note: You can enter root for an AIX or Linux system, or administrator for a Windows system, because a user with root or administrator rights can run most system performance counters and tools.)
System: Specify the type of the remote system. There are three system types: AIX, Linux, and Windows.
3. Specify a public key file if you set up authentication between the remote SSH server and the local VPA. The option Use public-key authentication is optional; it is required only if you want to set up public-key authentication between the remote OpenSSH server and the local SSH client. Refer to the topic on how to set up public-key authentication.
4. Click the OK button to finish creating the connection.

4.5.2 Create a profile configuration

4.5.2.1 Create a profile configuration for remote AIX system: Profile Analyzer

Right-click the Profile Analyzer node under the remote AIX connection node, and choose the Create Configuration... menu. A wizard opens and leads you step by step through creating a remote profiling configuration:
Figure 19 Create a profile configuration for Profile Analyzer

1. The Profiling tool type and its working directory, Profiling tool location, are set by default. Select a CPU type in the CPU drop-down list, and click Next.
Figure 20 Specify the kind of CPU

2. Define the related parameters in the following dialog if an application must be launched along with the profiling. If the application is a Java application running on an IBM Virtual Machine for Java, select the Enable Java profiling checkbox. This defines the IBM_JAVA_OPTIONS environment variable for the Java process being started, so that JIT-compiled Java methods are profiled.
Note: The fields on this wizard page are optional; leaving them blank runs the profiling in system-wide mode.
Figure 21 Define related parameters

Click the ... button to launch an SSH session with the remote system and browse the file system on the remote site.
Figure 22 Launch an SSH session with the remote system

3. Configure the start and stop qualifier options for profiling.
Figure 23 Configure the start and stop qualifier options for profiling

• Start qualifier options: Choose to profile manually by selecting the Manual radio button, or to profile with the application by selecting the With application radio button. (Only the Manual option is available unless you specified an application to launch on the second wizard page.)
o For fully automatic profiling, choose With application. If you leave the entry fields Delay profiling start by (seconds) and Profile for at the default value 0, profiling starts immediately when you click the Run icon. If you enter a time in the Delay profiling start by (seconds) field, your application runs automatically but gets a predetermined time to "warm up" before profiling begins; profiling is delayed for the specified time after you click the Run icon. You can also enter a time in the Profile for field to specify how long profiling lasts; profiling halts automatically after that period. The default value of the Profile for field is 0, which means that profiling runs without a time limit.
o For fully manual profiling, choose the Manual option. You can enter a time in the Profile for field to specify how long profiling lasts. The default value is also 0. Profiling starts immediately when you click the Run icon.
• Stop qualifier options: You can choose the With application option, enter the number of successive runs in the Number of successive run(s) field, and choose either to terminate profiling when the application exits or to terminate the application when profiling exits.

Important note: If you run profiling with an application, you must ensure that the tprof profiling tool can locate the application. You can achieve this by setting the PATH environment variable on the remote system.

4. Configure the time or event based sampling options. Choose a supported CPU event to profile in the CPU event drop-down list, or keep the default choice, System timer.
Figure 24 Configure time or event based sampling options
In addition, some options of the AIX tprof XML converter can be configured. These options affect the collected profiling data, which is output in XML format.

5. Click Finish. A new configuration node appears under the Profile Analyzer product node of the remote connection to the AIX system.

4.5.2.2 Create a profile configuration for remote AIX system: Counter Analyzer

Right-click the Counter Analyzer node under the remote AIX connection node, and choose Create Configuration... on the context menu, as shown in the following picture. A wizard opens and leads you step by step through creating a configuration for remote performance counter data collection:
Figure 25 Create a profile configuration for Counter Analyzer

1. Specify a profiling tool in the Profiling Tool drop-down list and an appropriate processor type for the remote AIX system in the Processor Type drop-down list. AIX supports two performance profiling tools for Power processors: hpmcount and hpmstat. The tmp directory is the default directory in which the performance profiling tool runs.
The File Prefix text box is optional; you can use it to define a custom prefix for the collected performance data file. Then click Next.
Figure 26 Specify the tool and processor type

2. Define the related parameters for an application launched with the performance profiling tool. You must define an application in the Application text box when you select hpmcount as the profiling tool; if you select hpmstat, the Application field is optional. Then click Next.
Figure 27 Define related parameters for an application launched with the profiling tool

3. Configure other options. If you selected the hpmcount tool in the preceding configuration dialog, the following dialog appears:
Figure 28 Specify the count mode and time base for the hpmcount tool
If you selected the hpmstat tool in the preceding configuration dialog, the following dialog appears:
Figure 29 Specify the options for the hpmstat tool

• hpmcount
o Count Mode
Specify a set of Event Groups or Event Sets to count, as names or numbers separated by commas. Click Show Event Groups or Show Event Sets to get the event group list or event set list from the remote system.
o Time Base
Select a base for data normalization. Timebase mode is not visible on the preceding wizard page; however, timebase is the default if neither the Enable PURR mode nor the Enable SPURR mode checkbox is selected. The available bases are:
time     timebase
purr     PURR time (when available)
spurr    SPURR time (when available)

• hpmstat
The hpmstat configuration is the same as the hpmcount configuration for Count Mode and Time Base.
Interval and Count are two additional options required by hpmstat. Click Show Help to get the event group list or event set list from the remote system.
o Interval
Enter the counting time interval and select the unit of time. The default value is 1 second.
o Count
Specify the number of iterations for which the event groups and event sets are counted.

4. Click Finish. A new configuration node appears under the Counter Analyzer product node of the remote connection to the AIX system.

4.5.2.3 Create a profile configuration for remote AIX system: Code Analyzer

The process of creating a configuration on an AIX system is basically the same as the process on a Linux system.

Right-click the Code Analyzer node under the remote AIX connection node, and choose Create Configuration... on the context menu. A wizard opens and leads you step by step through creating a configuration for remote profile data collection:
Figure 30 Create a profile configuration for Code Analyzer

1. Specify a configuration name in the Configuration Name field, select a processor type in the Processor Type drop-down list, and then click Next. Currently only the FDPR-Pro performance tool is supported for Code Analyzer configuration, so FDPR-Pro is the only selection for Performance Tool. In addition, only ppc processors are currently supported for Code Analyzer.
Figure 31 Specify the name, tool, and processor

2. Specify the application to launch, its parameters, and its working directory in the Application, Parameters, and Working Directory fields respectively.
Note: All three fields must be completed.
Figure 32 Specify the application, parameters and working directory

3. Select the option Add workload data. The Add file... button becomes enabled.
Click the button to add a remote or local file. The name of the added file is then displayed in the first text area of the following dialog:
Figure 33 Add workload data
The following picture shows the layout of the remote file system after you click the Add file... button:
Figure 34 The layout of the remote file system
You do not need to complete the Optimization Options text area on an AIX system, because the area is disabled. You can click the Options button to get the following instructions on how to write the options:
Figure 35 The optimization options help

4. Click Finish. A new configuration appears under the Code Analyzer product node of the remote connection to the AIX system.

4.5.2.4 Create a profile configuration for Linux system: Profile Analyzer

Right-click the Profile Analyzer node under the remote Linux connection node, and choose the Create Configuration... menu. A wizard opens and leads you step by step through creating a remote profiling configuration:
Figure 36 Create a profile configuration for Profile Analyzer

1. Select a CPU type in the CPU drop-down list, and click Next. The Profiling tool type and its working directory, Profiling tools location, are set by default.
Note: Linux systems use OProfile as the profiling tool.
Figure 37 Specify a system and CPU type

2. Define the related parameters if an application must be launched along with the profiling.
Note: The fields on this wizard page are optional; leaving them blank runs the profiling in system-wide mode.
Figure 38 Define related parameters for the application launched with profiling
Click the ... button to launch an SSH session with the remote system and browse the file system on the remote site.

3.
Configure the start and stop qualifier options for profiling.
Figure 39 Configure the start and stop qualifier options for profiling

• Start qualifier options: Choose to profile manually by selecting the Manual radio button, or to profile with the application by selecting the With application radio button. (Only the Manual option is available unless you specified an application to launch on the second wizard page.)
o For fully automatic profiling, choose With application. If you leave the entry fields Delay profiling start by (seconds) and Profile for at the default value 0, profiling starts immediately when you click the Run icon. If you enter a time in the Delay profiling start by (seconds) field, your application runs automatically but gets a predetermined time to "warm up" before profiling begins; profiling is delayed for the specified time after you click the Run icon. You can also enter a time in the Profile for field to specify how long profiling lasts; profiling halts automatically after that period. The default value of the Profile for field is 0, which means that profiling runs without a time limit.
o For fully manual profiling, choose the Manual option. You can enter a time in the Profile for field to specify how long profiling lasts. The default value is also 0. Profiling starts immediately when you click the Run icon.

• Stop qualifier options: You can choose the With application option, enter the number of successive runs in the Number of successive run(s) field, and choose either to terminate profiling when the application exits or to terminate the application when profiling exits.

4. Configure time or event based sampling options.
Choose a supported CPU event to profile in the CPU event drop-down list, or keep the default choice, System timer. In addition, some options for the Linux OProfile XML converter can be configured. These options affect the collected profiling data, which is output in XML format.

• x86 and AMD64 CPUs

Figure 40 Configure time or event based sampling options for x86 and AMD64 CPUs

• Cell BE

Figure 41 Configure time or event based sampling options for Cell BE

5. Click Finish. A new configuration node appears under the Profile Analyzer product node of the remote connection to the Linux system.

4.5.2.5 Create a profile configuration for Linux system—Counter Analyzer

Right-click the Counter Analyzer node under the remote Linux connection node, and choose Create Configuration... in the context menu. A wizard opens and leads you step by step through creating a configuration for remote performance counter data collection:

Figure 42 Create a profile configuration for Counter Analyzer

1. Currently, for a Counter Analyzer configuration on a remote Linux system, only the Cell BE performance counter (CPC) tool is supported, so the Profiling Tool field has only one option, CellPerfCount. By default, the tmp directory is set as the directory in which performance counting runs, so the Tool Location field is greyed out in the following picture. File Prefix is optional; you can use it to define a custom prefix for the collected performance data files.

Figure 43 Specify the configuration name and processor type

2. Choose a monitor mode for CPC in the Monitor Mode group.
If Monitor CPUs with application is selected, you must define an application and its arguments to be executed. If Monitor CPUs on system wide mode is selected, you must specify the duration to monitor in the Monitor for (seconds) field and which Cell BE nodes to monitor (Monitor all Cell BE nodes is selected by default). Please refer to the concept Cell BE nodes.

Figure 44 Specify a monitor mode

3. Configure the remaining options:

Figure 45 Specify the events to count

• Event list
There are 4 counters available in CPC, so 4 events are grouped as an event set. You can define several event sets in Event list. Please refer to the task Define CPC event sets.

• Switch timeout
After you define some event sets, all of them are loaded into the kernel. The kernel runs each event set for the amount of time defined by Switch timeout. You can specify the timeout value and its unit in the fields of Switching event set for timeout value.

• Count interval
To specify the sampling interval for CPC, select the Interval check box and specify a value and type accordingly. The default interval value is 100000000.
Note: A small interval value causes a large amount of performance counter data to be collected.

• Sampling mode
The sampling mode indicates the type of data stored in the sampling buffer. If Interval is selected and specified, the Sampling mode drop-down list is greyed out and the sampling mode defaults to Store counter values.

• Count mode
Select the mode in which CPC counts from the Count mode drop-down list.
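The interaction between the event list and the switch timeout can be pictured with a small sketch. It is not VPA or CPC code: the event names are hypothetical, and the round-robin switching order is an assumption, since the text above only says that the kernel runs each event set for the switch timeout.

```python
COUNTERS_PER_SET = 4  # the text above states that CPC has four counters


def group_into_event_sets(events, size=COUNTERS_PER_SET):
    """Split a flat list of event names into event sets of at most `size` events."""
    return [events[i:i + size] for i in range(0, len(events), size)]


def rotation_schedule(event_sets, switch_timeout, duration):
    """Return (start_time, set_index) pairs for the time window [0, duration).

    Assumption: event sets are switched in round-robin order, each running
    for `switch_timeout` time units before the next one takes over.
    """
    if not event_sets or switch_timeout <= 0:
        return []
    schedule = []
    start, index = 0, 0
    while start < duration:
        schedule.append((start, index))
        index = (index + 1) % len(event_sets)
        start += switch_timeout
    return schedule


# Hypothetical event names, used only for illustration.
events = ["EV_A", "EV_B", "EV_C", "EV_D", "EV_E", "EV_F"]
event_sets = group_into_event_sets(events)
schedule = rotation_schedule(event_sets, switch_timeout=10, duration=35)
```

With six events, the sketch produces two event sets, and each set is only observed for part of the monitoring window. This is why the choice of switch timeout matters when several event sets are defined.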
4.5.2.6 Create a profile configuration for Linux system—Trace Analyzer

Trace Analyzer currently aims to support the Performance Debugging Tool (PDT) on Cell BE. For the Remote Connection of Trace Analyzer, only a prototype is implemented in VPA 6.0.

Right-click the Trace Analyzer node under the remote Linux connection node, and choose Create Configuration... in the context menu. A wizard opens and leads you step by step through creating a configuration for remote trace data collection:

1. Specify a name in the Configuration Name field. For a Trace Analyzer configuration on a remote Linux system, only the Cell BE performance debugging tool (PDT) is supported, so there is only one choice in the Performance Tool drop-down list.

Figure 46 Specify the tool and CPU type

2. Specify the application to run with PDT tracing and its parameters in the Application and Parameters text boxes. If the application is a 64-bit application, select the Is 64bit application check box.

Figure 47 Specify the application to run with PDT

3. To run the PDT tracing enabled application remotely, you must set several environment variables for the remote system in the following dialog. These environment variables include:

• LD_LIBRARY_PATH
Click the Add a library... button to browse the remote system and locate the libraries required for PDT tracing; they are listed in the Traced library location field. You can specify several libraries in this field; their names are separated by ":" automatically.

• PDT_KERNEL_MODULE
The default location path /usr/lib/modules/pdt.ko, displayed in the following picture, is specified in the PDT Kernel module location field. You can change it to match your own setup.
• PDT_CONFIG_FILE
This variable requires a PDT configuration file. You can specify a file in the PDT configuration file field.

• PDT_OUTPUT_PREFIX
Through this variable you can specify a custom file prefix in the Prefix of the output files field.

Figure 48 Specify the path to configuration file of running PDT

4.5.2.7 Create a profile configuration for Linux system—Code Analyzer

Right-click the Code Analyzer node under the remote Linux connection node, and choose Create Configuration... in the context menu. A wizard opens and leads you step by step through creating a configuration for remote profile data collection:

Figure 49 Specify the profile configuration for Code Analyzer

1. Specify a configuration name in the Configuration Name field, select a processor type in the Processor Type drop-down list, and then click Next. Currently only the FDPR-Pro performance tool is supported for Code Analyzer configurations, so Performance Tool has only one selection, FDPR-Pro. In addition, only ppc processors are currently supported for Code Analyzer.

Figure 50 Specify the configuration name and processor type

2. Specify the application to launch, its parameters, and its working directory in the Application, Parameters, and Working Directory fields respectively. Note that all three fields must be completed.

Figure 51 Specify the application to execute

3. Select the Add workload data option; the Add file... button becomes enabled. Click the button to add a file remotely or locally; the name of the added file is then displayed in the first text area of the following dialog:

Figure 52 Add workload data

The following picture shows the layout of the remote file system after clicking Add file...
button:

Figure 53 The layout of the remote file system

Note: You don't need to complete the Optimization Options text area on a Linux system because this area is disabled on Linux.

4. Click Finish. A new configuration node appears under the Code Analyzer product node of the remote connection to the Linux system.

4.5.2.8 Create a profile configuration for remote Windows system

Right-click the Profile Analyzer node under the remote Windows connection node, and choose Create Configuration... in the context menu. A wizard opens and leads you step by step through creating a remote profiling configuration:

Figure 54 Create a profile configuration for Profile Analyzer

1. Performance Inspector tprof is defined as the default profiling tool for Windows systems. Press the button to browse the remote Windows file system and choose the installation location of Performance Inspector tprof. Then select a CPU type in the CPU drop-down list and click Next.

Figure 55 Specify the CPU type and the path where the tprof tool is installed

2. Define the related parameters if an application must be launched along with the profiling. If the application is a Java application running on an IBM Virtual Machine for Java, select the Enable Java profiling check box. This defines the IBM_JAVA_OPTIONS environment variable for the Java process being started, so that JIT-compiled Java methods are profiled.
Note: The fields on this wizard page are optional. Leaving them blank runs profiling in system-wide mode.

Figure 56 Define related parameters for the application to be launched with profiling

3.
Configure the start and stop qualifier options for profiling.

Figure 57 Configure the start and stop qualifier options for profiling

• Start qualifier options: Only the Manual profiling option is available on Windows systems. You can enter a time in the Profile for field to specify how long profiling lasts. The default value is 0, in which case profiling starts immediately when you click the Run icon.

4. Configure time or event based sampling options. Choose a supported CPU event to profile in the CPU event drop-down list, or keep the default choice, System timer.

Figure 58 Configure time or event based sampling options

5. Click Finish. A new configuration node appears under the Profile Analyzer product node of the remote connection to the Windows system.

4.5.3 Define CPC event sets

To open the following dialog, please refer to section 4.5.2.5 (Counter Analyzer configuration for a remote Linux system).

1. Click the Add an event set... button to open the CPC event set creation dialog. Assign an event to each CPC counter. To skip a counter, leave its fields blank.
Note: The input character C means counting all clock cycles without regard to events.

Figure 59 Specify the events to count

2. Click OK in the Create a CPC event set dialog. The CPC event set is added to the Event list. Multiple event sets can be added. You can also select an existing event set and edit it by clicking the Edit... button or remove it by clicking the Remove... button.

Figure 60 Specify the events to count

4.5.4 SSH public key authentication

Follow these steps to set up SSH public key authentication:

1. On the AIX or Linux system, go to the .ssh folder
2. Execute command: ssh-keygen -t rsa
3.
Execute command: cat id_rsa.pub >> authorized_keys

4.5.5 Launch remote profiling on AIX or Linux system

1. Choose a Profile Analyzer configuration node in the Remote Connections view. The start tool button is then enabled and the progress bar is ready. If the configuration defines an application to run with the profiling tool, the launch application button is also enabled.

Figure 61 Select a configuration node for launching

2. Start profiling

• Click the start tool button to launch remote profiling. Depending on the configuration, profiling can stop automatically after running for a specified period of time; you can also click the stop tool button to halt remote profiling manually.
• Click the launch application button to start the application manually if an application is defined with the profiling. Depending on the configuration, the profiling tool might start after the application has been running for a predefined time.

3. When profiling is finished, the result data files are downloaded to the local VPA automatically. Double-click the result data file in the Remote Connections view to open it in Profile Analyzer.

Figure 62 Double-click the result data file to open it in Profile Analyzer

4.5.6 Launch remote hpmcount or hpmstat profiling tool

1. Choose a Counter Analyzer configuration node in the Remote Connections view. The start tool button is then enabled and the progress bar is ready. If the configuration defines an application to run with the performance counter, the launch application button is also enabled.

Figure 63 Select a configuration node for launching

2. Start performance counter

• Click the start tool button to launch the remote performance counter.
Depending on the configuration, the performance counter can stop automatically after running for a specified period of time; you can also click the stop tool button to halt the remote performance counter manually.
• Click the start application button to start the application manually if an application is defined with the performance counter.

3. When performance data counting is finished, the result data files are downloaded to the local VPA automatically. Double-click the result data file in the Remote Connections view to open it in Counter Analyzer.

Figure 64 Double-click the result data file to open it in Counter Analyzer

4.5.7 Launch remote CPC

1. Choose a Counter Analyzer configuration node in the Remote Connections view. The start tool button is then enabled and the progress bar is ready.
2. Click the start tool button to launch the remote performance counter.
3. When performance data counting is finished, the result data files are downloaded to the local VPA automatically. Double-click the result data file in the Remote Connections view to open it in Counter Analyzer.

Figure 65 Double-click the result data file to open it in Counter Analyzer

4.5.8 Launch PDT on remote Linux Cell BE system

1. Choose a Trace Analyzer configuration node in the Remote Connections view. The launch application button is then enabled and the progress bar is ready.
2. Start the trace-enabled application

• Click the launch application button to start the application manually.

3. After the application has finished, the result data files are downloaded to the local VPA automatically. Double-click the result data file in the Remote Connections view to open it in Trace Analyzer.

4.6 Collecting Pipeline data on PowerPC

Pipeline Analyzer is a port of the IBM Performance Simulator for Linux on POWER™, another alphaWorks technology. Please refer to the directions given by this project for collecting pipeline data.
While VPA provides the Pipeline data analysis tool, the project provides the tools necessary for collecting and generating Pipeline data files.

5. Using the VPA analysis tools

This section describes the use of each plug-in. It first focuses on typical usage scenarios in which various tasks are performed, then outlines the major components of the system and their interactions. You can also find this information by selecting Help - Help Contents within VPA. To get context-sensitive help, press F1 on Windows and AIX or Ctrl+F1 on Linux.

5.1 Profile Analyzer

Profile Analyzer is a tool that allows you to navigate through a system profile, looking for performance bottlenecks. It provides a powerful set of graphical and text-based views that let you narrow down performance problems to a particular process, thread, module, symbol, offset, instruction, or source line. It supports the profiles generated by Performance Inspector (tprof) and AIX tprof. It also handles IBM JRE Java profile data when that data is merged into the preceding profiles. To load huge profile data files and reduce the memory footprint, Profile Analyzer now uses a database to cache profile files. The current version supports DB2 and an embedded database.

You can also find the Profile Analyzer User Guide within VPA by selecting Help - Help Contents. To get context-sensitive help, press F1 on Windows and AIX or Ctrl+F1 on Linux.

5.1.1 Load an Existing Profile

When you first start Visual Performance Analyzer, close the Welcome view and then select Profile Analyzer to load the plug-in.

Click Tools, then select Profile Analyzer.

Figure 66 Click on the toolbar to select Profile Analyzer

You can also load the Profile Analyzer perspective by choosing Window - Open Perspective - Other - Profile Analyzer.
If you already have profiles generated by TPROF or a Profile Analyzer compatible XML-based profile generator, you can select File - Open File to open a profile that Profile Analyzer supports. Profile Analyzer profiles must have one of the following extensions:

o .out, .etn, or .etm
o .opm and .opz

In VPA, loading a profile data file can run as a background job. While VPA is loading a file, you can click the Run in Background button to move the loading job to the background. While the loading job runs in the background, you can use Profile Analyzer to view profile data files that are already loaded, or even start another loading job at the same time.

As stated earlier, profile data files are loaded into database tables and kept there until you delete them. Once a profile data file has been successfully loaded into a database, further attempts to load the same data file result in the data being reloaded directly from the database tables. Profile Analyzer does not need to read and parse the original file again, which makes loading profile data into VPA much faster after the initial database caching.

Note: Although further use of a profile data file results in loading from the database, the original file is still required for Profile Analyzer to work properly, because not all of the content of the original file is loaded into the database tables. For example, time data is kept in the original file and only its offset and length are stored in the database tables. When needed, this data is read from the original file on demand.

5.1.2 Profile Navigation

The following tasks are those you can perform to navigate the profiles within Profile Analyzer.

5.1.2.1 Navigate process hierarchy

The Process hierarchy view appears by default in the top center pane. It shows an expandable list of all processes within the current profile.
You can expand a process to view its modules, threads, and so on. You can also view the profile organized by thread, module, and so on. You can define your own hierarchy view by right-clicking the profile and choosing Hierarchy Management. Thread data is not available in merged profiles (.etm extension). For more information about Hierarchy Management, refer to Navigate generic hierarchy model.

The following screen capture shows a process hierarchy in its unexpanded state:

Figure 67 The layout of Profile Analyzer

As in most Profile Analyzer views, objects are sorted from the most to the fewest ticks. In this view you can see that IdleProcess was the process with the most ticks, indicating either I/O delays or actual idle time during the profile (for example, if the application being profiled ran on one CPU and the system had a second, mostly idle CPU).

You can expand a process to view the threads or modules beneath it. As you select a process, thread, or module, the Symbol view updates to display the list of symbols that belong to that process, thread, or module. The Samples Distribution Chart also changes as you select different processes or threads, displaying the proportion of ticks used by the most important modules within the selected process or thread. The following two views show part of the preceding process in the Process > Thread > Module hierarchy:

Figure 68 The hierarchy editors

5.1.2.2 Navigate generic hierarchy model

A process may have several threads, and each thread can visit several modules (for instance, DLLs) and call procedures or methods (symbols) in these modules. By default, you can observe systems in the hierarchies Process > Core > Thread > Module, Process > Thread > Module, Process > Module, and Module. With the generic hierarchy model, you can create your own hierarchy view.
For example, if you want to group threads that use a common module, you can display the hierarchy Process > Module > Thread by creating it in Hierarchy Management.

To attach a hierarchy file to a profile file, follow these steps:

1. Right-click in the process hierarchy view and choose Hierarchy Management
2. Click the … button to open a new hierarchy file to attach to the profile file
3. Select the check box to attach the new hierarchy file to the profile file

The following pictures show how to attach a new hierarchy file, "testing.xml":

Figure 69 Click … to open a new hierarchy file

Figure 70 Attach the new hierarchy file to the profile file

Figure 71 Click Yes to attach the selected file

Figure 72 The selected file is attached

To create your own hierarchy view, follow these steps:

1. Right-click in the process hierarchy view and choose Hierarchy Management
2. In the Hierarchy Management wizard, click New
3. Give your hierarchy a specific name if you like; otherwise the system generates a name for you.
4. Select the elements you want to have in your view. You can reorder your hierarchy by choosing Move up or Move down
5.
Click Apply or OK

Here is a new hierarchy view created to show the threads under each module:

Figure 73 The Hierarchy Management wizard for creating a hierarchy view

Click the New button to create a new hierarchy view. In the dialog, type the hierarchy name and add or remove the available levels:

Figure 74 Specify the hierarchy name and add or remove the levels

When you click Apply or OK, you can see the change in the Process Hierarchy view.

Figure 75 The new created hierarchy view

You can add other hierarchy views to the Process Hierarchy view as you like.

Figure 76 The new hierarchy view added with Thread->Module hierarchy

To limit the number of lines in the editor symbol table, set the symbol threshold of the hierarchy. The symbol table contains symbols of the selected hierarchy node, but by default it does not list all of them; it lists no more than a certain number, called the threshold. After you set the threshold to another value, the symbol table is refreshed and lists no more symbols than the new threshold. The default threshold of the editor symbol table is 100; that is, no more than 100 symbols are listed in the table, whichever hierarchy node is selected.

Figure 77 Change the threshold for the profile symbol

If you select the Change Profile Symbol Threshold item on the context menu, the threshold dialog opens.

Figure 78 The Profile Symbol Threshold dialog

The default symbol threshold is 100, so no more than 100 rows are listed in the symbol table.

Figure 79 Change the profile symbol threshold

If you set the threshold to a new value of 200, the editor symbol table is refreshed and lists no more than 200 rows.
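The effect of the threshold can be sketched in a few lines. This is an illustration only, not Profile Analyzer code: the symbol data is made up, sorting by ticks is assumed from the statement that most views sort from the most to the fewest ticks, and `None` is used here to stand in for the All setting of the threshold dialog.

```python
def visible_symbols(symbols, threshold=100):
    """Return at most `threshold` (name, ticks) pairs, hottest first.

    `threshold=None` stands in for the All setting and returns every symbol.
    """
    ranked = sorted(symbols, key=lambda item: item[1], reverse=True)
    return ranked if threshold is None else ranked[:threshold]


# Hypothetical symbol data: 250 symbols with distinct tick counts.
symbols = [(f"sym_{i}", i) for i in range(250)]
default_rows = visible_symbols(symbols)
wider_rows = visible_symbols(symbols, threshold=200)
all_rows = visible_symbols(symbols, threshold=None)
```

In this sketch the default call keeps 100 rows, a threshold of 200 keeps 200 rows, and `None` keeps all 250.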
Figure 80 The refreshed editor with new threshold no more than 200

If you set the threshold to All, the symbol table is refreshed and lists all symbols of the selected hierarchy node.

Figure 81 Change the threshold to All

5.1.2.3 Bucket Management

You can manage your bucket settings as follows:

1. Right-click in the process hierarchy view and choose Buckets -> Bucket Management
2. Attach a new bucket management file to the profile file
3. Create a new bucket, or edit or remove an existing bucket, by clicking the corresponding buttons

Figure 82 The Bucket Management wizard

If you change a bucket definition, the currently open profile is reloaded automatically. Any other open profiles that are affected by the new bucket definition are not reloaded automatically; you must reload them manually.

5.1.2.3.1 Add new bucket to existing one

If existing buckets already group some components, you can add one bucket to another to filter those components further and see only the ones you really want.

Figure 83 Add a new bucket

If you click Bucket Management…, the bucket selection dialog opens.

Figure 84 Select a bucket to add with

After you select a bucket, the bucket property dialog opens so that you can modify the filters as follows:

Figure 85 Specify the bucket properties

You can add the filter requirement in this wizard and give the bucket a new name. To view this bucket in the Process Hierarchy view, you may need to move it up in Bucket Management. You can also edit, delete, enable, or disable buckets, or create a new bucket group, in Bucket Management.
5.1.2.3.2 Create new bucket to group common components

In a system, certain groups of modules, threads, or symbols share common components. For example, in a WebSphere Java process, threads can be logically grouped by name, with one group containing threads with names like tid_WebContainer*, another with names like tid_Gc_Slave_Thread_*, and another with names like _tid_Alarm_*. The Buckets function in Profile Analyzer lets you group different components, such as objects, together into buckets. A bucket acts as a new layer in the Process Hierarchy view.

To create a new bucket, follow these steps:

1. Right-click in the process hierarchy view and choose Buckets -> Create new bucket
2. Enter the Bucket Name and choose the type of components you want to group: thread, process, or module
3. Enter the name pattern of the components you want to filter in the corresponding field, such as tid* in the thread filter
4. Choose the Bucket Group into which you want to put your new bucket.

The following is an example of creating a bucket that filters tid* threads:

Figure 86 Create a bucket filtering tid* thread

You can now define multiple filters for a bucket. A new layer called test2 appears in the thread hierarchy view. All the threads whose names begin with tid remain, as follows:

Figure 87 The new layer test2 created in the thread hierarchy view

5.1.2.4 Navigate Java package hierarchy

You can view the Java package hierarchy for a process, thread, or module that contains JITted methods using the Java/Hierarchy view. The process or thread must contain a JITCODE module, and the module must be the JITCODE module.
Note: The Java/Classes hierarchy view is normally displayed at the bottom right of VPA, along with the Disassembly/Offsets view, the Temporal Profiling view, and the Profiling Configurations Console view. If you cannot see it, you can open it through Window -> Show View -> Other -> Profile Analyzer -> Java/Classes hierarchy.

To view the Java package hierarchy for a process, thread, or JITCODE module, click the Java/Classes hierarchy tab, then navigate the Process hierarchy view. As you select different processes, threads, or modules, the Java/Classes hierarchy is updated to show the Java package hierarchy for any active methods. The following screen shot shows an initial view of the Java package hierarchy for a JITCODE module on an AIX profile:

Figure 88 The Java/Classes hierarchy tab

You can select a top-level name to view all methods in all packages that match that name (for example, java). Or you can expand a top-level name to display the packages and classes beneath it. By selecting a package or class in the hierarchy, you can limit the list of displayed methods to those in that package or class. The following picture shows the active methods of the BufferedWriter class in the java.io package:

Figure 89 The methods of the BufferedWriter class in the java.io package

When you double-click a method in the table at the right, the following views are updated to display information for that method:

• The OffsetAsm Information view
• The Disassembly Resolved Call Information view

5.1.2.4.1 Notes on appropriate use of this view

The Java/Classes Hierarchy view is useful if you are tuning the code of specific classes under your control and are trying to determine what bottlenecks exist in those classes. However, you should avoid the pitfall of focusing only on the classes you have control over (classes that you can make source code changes to).
For example, tuning the hottest method in a class you own may provide some benefit, but if that method takes only 1% of total ticks while java/io/BufferedWriter.write takes 20%, you should look at which methods are calling java/io/BufferedWriter.write (using the Disassembly Resolved Call Information view on Windows, Linux-IA32, and Linux-x86-64, or using a call graph profiling tool such as ITRACE on other platforms). Conversely, the fact that the methods in your packages do not show significant CPU usage does not mean that they are efficiently written or have no impact on performance. For instance, if a significant percentage of profile time is spent in the JVM garbage collection library (e.g. libj9gc23.so or j9gc23.dll), this may indicate that you are using memory inefficiently by allocating too many objects or by failing to make them available to garbage collection when they are no longer needed.

5.1.2.5 View counters

In the Counters view, you can view the ticks of the process, thread, module, or bucket you selected in the Process Hierarchy view. To open this view, choose Window -> Show View -> Other -> Profile Analyzer -> Counters, or find it in the left pane.

For example, if you select a process in the Process Hierarchy view, you can see the total ticks of this process in the Counters view.

Figure 90 The ticks of the selected process in Counters view

5.1.2.6 Select default counter

Whenever a profile contains more than one counter, Profile Analyzer allows you to choose the default counter used for sorting. This is the "sort by counter" feature: you can select any available counter as the default sort counter. Once the default counter is selected, all editors and views that understand this selection sort or format their output according to the currently active sort counter.
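As a rough sketch of what sorting by a selected counter means, the snippet below orders the same rows by two different counters. It is not Profile Analyzer code; the row layout, the process names, and the counter values are hypothetical, and treating a missing counter as 0 is an assumption made for the example.

```python
def sort_rows(rows, sort_counter):
    """Order rows from most to fewest counts of `sort_counter`.

    Rows that lack the counter are ranked as if their count were 0.
    """
    return sorted(rows, key=lambda row: row["counters"].get(sort_counter, 0), reverse=True)


# Hypothetical rows with two counters each (values invented for illustration).
rows = [
    {"name": "proc_a", "counters": {"TICKS": 500, "IFC": 10}},
    {"name": "proc_b", "counters": {"TICKS": 200, "IFC": 90}},
    {"name": "proc_c", "counters": {"TICKS": 300}},
]
by_ticks = [row["name"] for row in sort_rows(rows, "TICKS")]
by_ifc = [row["name"] for row in sort_rows(rows, "IFC")]
```

The same three rows come out in a different order depending on which counter is active, which is the behaviour a view shows after the default counter is changed.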
The following picture shows this feature:

Figure 91 Right-click a process and select a counter

Figure 92 The default counter is changed to IFC

5.1.2.7 View module sample distribution

The Samples distribution chart shows a tick distribution for modules in the currently selected process or thread in the process hierarchy.

Note: If you cannot see this view within the Profile Analyzer perspective, select Windows -> Open view -> Other -> Profile Analyzer -> Samples Distribution Chart.

This view provides a starting point for determining where you should focus your attention. For example, the following screenshot shows the graph for a java process using 23.4% of total profile time:

Figure 93 The sample distribution of a java process

This graph shows that the JITCODE module (the module containing JIT-compiled Java methods) was the busiest module for this process, which suggests that some tuning of Java methods may be advisable.

The following graph (for a different Java program on a different system) shows heavy activity both in j9gc22 and in JITCODE. j9gc22 is the garbage collection library of the IBM Virtual Machine for Java (J9); heavy activity there indicates that the application may be memory constrained, or may be creating new objects too frequently. The third column (j9jit22) is the JIT compiler library; its activity indicates that the profile may not have run for a very long time, because long-running applications typically spend only a small percentage of time in the JIT compiler library. (A high percentage of time in the JIT compiler library may also indicate excessive use of invoke_interface calls, which require JIT library runtime support even when executing JITted methods.)
Figure 94 The sample distribution of a java program

5.1.2.8 View profile details

You can see the detailed information of a profile file when you select it in the Process Hierarchy view, as in the following example:

Figure 95 The Profile Details tab

The file A50421C3.out is opened, and its corresponding information shows up in the Profile Details view.

5.1.2.9 View resolved call information

When you open an IA32 profile, Profile Analyzer can analyze the disassembly in it to identify all call sites that have an immediate address as a target, and can attempt to connect those call sites to target symbols. This is done automatically if the Resolved Call Information view is visible. The following picture shows the Resolved Call Information view for a method selected from the jvm.dll module:

Figure 96 The Resolved Call Information view showing a method selected from the jvm.dll module

5.1.2.10 View basic blocks

A basic block is a block of instructions that contains a single entry point and at most two exit points. Basic blocks are a concept used by compilers to perform dataflow analysis and effective optimizations. Profile Analyzer attempts to detect basic blocks by analyzing the targets of all branch instructions within the disassembly for a symbol. Note that the basic blocks detected by Profile Analyzer may not match the basic blocks indicated in a compiler listing, as the compiler may use a higher-level basic block structure that includes internal branches. For example, a single source or intermediate-language instruction does not span multiple basic blocks from a compiler perspective. However, some source or intermediate-language instructions may result in multiple basic blocks at the disassembly level.
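The idea of detecting basic blocks from branch targets can be sketched in a few lines of Python. This is only an illustration of the general "leader" technique, not Profile Analyzer's actual implementation; the function name find_basic_blocks and the (offset, branch_target) input format are assumptions made for this example.

```python
def find_basic_blocks(instructions):
    """Split a symbol's instructions into basic blocks.

    `instructions` is a hypothetical input format: a list of
    (offset, branch_target) pairs in address order, where branch_target
    is None for non-branch instructions and otherwise the offset of the
    branch target inside the same symbol.
    """
    if not instructions:
        return []

    offsets = [offset for offset, _ in instructions]
    known = set(offsets)

    # A "leader" starts a new block: the first instruction, every branch
    # target, and every instruction that follows a branch.
    leaders = {offsets[0]}
    for index, (_, target) in enumerate(instructions):
        if target is None:
            continue
        if target in known:
            leaders.add(target)
        if index + 1 < len(offsets):
            leaders.add(offsets[index + 1])

    # Walk the instructions in order and cut a new block at each leader.
    blocks, current = [], []
    for offset in offsets:
        if offset in leaders and current:
            blocks.append(current)
            current = []
        current.append(offset)
    blocks.append(current)
    return blocks


# A conditional branch at offset 4 jumps forward to 16,
# and a branch at offset 20 jumps back to 4.
demo = [(0, None), (4, 16), (8, None), (12, None), (16, None), (20, 4)]
print(find_basic_blocks(demo))
```

Each inner list holds the offsets of one block. Because the cuts are made purely from branches in the disassembly, a single source statement that compiles to several conditional branches ends up in several blocks.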
An array assignment operation in Java is one instance of a source statement that results in multiple basic blocks: the assignment is a single source statement, but may require both a null check and an array bounds check, each of which is an intermediate-language instruction that may result in multiple conditional branches in the resulting disassembly.

You can see basic block information by choosing Windows -> Show View -> Other... -> Profile Analyzer -> Basic Block, or find it in the left panes. When you open a process, you can see its basic block information as follows:

Figure 97 The Basic Block view

Each basic block has a number (BB1, BB2 etc.), a tick count, zero or more incoming edges, and one or two outgoing edges (a terminating basic block does not have any outgoing edges). Each block with ticks is colored red, magenta or blue according to the same rules used to determine symbol tick color, and shaded according to the relative tick count of the basic block as compared to the symbol as a whole. You can click a basic block to highlight its incoming and outgoing edges in red:

Figure 98 The selected block with two incoming and outgoing edges highlighted in red

In this view, BB2 was selected, and its outgoing edges to BB3 (the "fallthrough" basic block) and BB4 (the target basic block) are highlighted.

5.1.3 Profile Comparison

The following are tasks you can do to compare profiles within Profile Analyzer:

5.1.3.1 Compare two profiles

1. Click the Compare two profiles button in the Profile Comparison view toolbar. Alternatively, for any open profile, right-click in the process hierarchy view and select Compare this profile with another.
2. In the Profile Comparison wizard, select the two profiles you want to compare. The wizard supports compressed (.etz) profiles.
   The comparison of tprof profiles is still limited in that it does not expose compilation levels.
3. Click Next. Enter the following values for each profile:
   o Transaction rate: The transaction rate is the number of transactions completed per elapsed second. What counts as a transaction varies from workload to workload, but it is a consistent indicator of the amount of work done per second. For background jobs, the number of transactions might be the number of jobs. For an HTTP server, the number of transactions might be the number of HTTP requests served. For each type of workload, the transaction is clearly defined.
   o CPU utilization: CPU utilization is a percentage that describes how busy the server is. It is defined as the average utilization of all CPUs in a server.
   o Number of CPUs: This is the number of CPUs available to the system in each of the profile runs being compared. It is very common to compare runs with different numbers of CPUs. For example, you may compare a 1-CPU run against a 4-CPU run to determine how well a workload scales up as CPUs are added to the configuration.
4. Click Finish. The two files are loaded and opened.
5. In the process hierarchy view of the first file, right-click the process, thread or module that you want to compare and select Mark for comparison.

Figure 99 Select a process for comparison

6. Go to the other profile, right-click the process, thread or module of the second file that you want to compare with, and select Compare with.

Figure 100 Compare the selected process with another one

7. In the Profile Comparison view, the detailed information of the modules you have selected to compare is listed as follows:

Figure 101 The information of modules to compare listed in Profile Comparison view

5.1.3.2 Understand the calculations

The comparison tool uses the normalization factors entered in the Profile Comparison wizard to calculate the microseconds of CPU consumed per transaction (us/Tx). Since the us/Tx values are computed on a per-transaction basis, they can be compared directly from profile to profile. The us/Tx values are calculated as follows:

1. Calculate the percentage of total ticks in the specific symbol:
   CPU% = Ticks in the specific symbol / Total ticks in the profile run
2. Calculate transactions per busy second:
   ITR = Transaction rate / CPU utilization
3. Calculate total CPU microseconds per transaction:
   Total microseconds per transaction = (1,000,000 / ITR) * Number of CPUs
4. Calculate average CPU microseconds per transaction in the specific symbol:
   us/Tx = Total microseconds per transaction * CPU%

5.1.3.3 Save a profile comparison

1. Click the Save Comparison button in the Comparison view toolbar. Alternatively, right-click anywhere in the view and select Save Comparison from the pop-up menu.
2. The Save As dialog opens. Browse to the desired directory and enter a file name.
3. Click Save. The comparison is saved as a Profile Analyzer comparison (.etc) file. This file contains both compared profiles (zipped) and the normalization factors used in the comparison.

5.1.3.4 Open a profile comparison

1. In the Navigator view, double-click the Profile Analyzer comparison (.etc) file.
2. Both compared profiles are opened automatically in the Profile Analyzer editor as temporary files, and the Comparison view opens.
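The us/Tx calculation described in section 5.1.3.2 can be sketched as a small Python function. This is an illustration of the four formulas, not code taken from Profile Analyzer. The function and parameter names are made up for this example, and it assumes that CPU utilization and CPU% are expressed as fractions (0.5 for 50%) so that the result comes out in microseconds.

```python
def us_per_tx(symbol_ticks, total_ticks, transaction_rate,
              cpu_utilization, num_cpus):
    """Return the average CPU microseconds per transaction for one symbol.

    Assumption for this sketch: cpu_utilization is a fraction between
    0 and 1 (for example 0.5 for 50%), not a number between 0 and 100.
    """
    # Step 1: share of all profile ticks that landed in the symbol.
    cpu_fraction = symbol_ticks / total_ticks
    # Step 2: transactions per busy second.
    itr = transaction_rate / cpu_utilization
    # Step 3: total CPU microseconds spent per transaction on all CPUs.
    total_us_per_tx = 1_000_000 / itr * num_cpus
    # Step 4: portion of that time attributed to the symbol.
    return total_us_per_tx * cpu_fraction


# Hypothetical run: 200 of 10,000 ticks in the symbol, 500 transactions
# per second at 50% utilization on a 4-CPU system.
print(round(us_per_tx(200, 10_000, 500, 0.5, 4), 3))
```

With these made-up inputs, ITR is 1,000 transactions per busy second, the total is 4,000 microseconds per transaction across the four CPUs, and the symbol accounts for 2% of that. Because the value is normalized per transaction, the same symbol can be compared between a 1-CPU run and a 4-CPU run.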
5.1.4 Profile Merge

If you have profiled a benchmark several times using the TPROF profiling tool, you can merge the .etm or .opm files for these runs using the Merge Wizard in Profile Analyzer. This can be useful in several situations:
• If you are measuring a short-run application (or a short-run phase such as the startup of a JVM during a benchmark), each individual profile may have too few ticks to draw meaningful results, but after merging the files a pattern may begin to emerge.
• If you are measuring different CPU events (ticks, data cache misses, branch mispredicts) on different runs of a benchmark, you can merge these runs and see the data for all the counters in a single profile.
• If you want to compare two runs, you can use profile merge to see which processes, modules, symbols, and symbol offsets are active in both runs.

To merge several profiles, you must open at least one of them. Then follow these steps:

1. Right-click in the process hierarchy view and choose Merge this profile with another.

Figure 102 Launch the Merge Profiles wizard

2. In the Merge Profiles Wizard, select one or more profiles from the Current project and click Add >, or use the Browse button to open a file dialog and select profiles from other locations. Note that you can only add profiles whose platform matches that of the profile already added to the list. Then click Next.

Figure 103 Select profiles to merge

3. On the Select processor type from platform family page, you do not have to select a CPU type because it is defined by the CPU type of the primary profile. Click Next.

Figure 104 Click Next to continue

The merged file is created after you click the Finish button.
The merged file is loaded immediately into the Profile Analyzer editor. A successful merge always creates a new file. If you merge an .etm file with other .etm files, Profile Analyzer creates a file with the .etm extension (ETM = e-Tune Merged). If you merge an .opm file with other .opm files, you get a file with the .opm extension. Merging only applies to files in the same format; the output file is given the corresponding extension and is in the same format. The output file is in the Profile Analyzer-supported XML format.

The merged profile looks much like an ordinary TPROF or OProfile profile when it is viewed inside Profile Analyzer, with the following differences:
• The processes, modules, symbols, and offsets that were merged from more than one source profile are colored in green.
• Multiple counter columns may appear in the Offsets view in addition to ticks, if you chose different counters for each source profile.
• No thread data is available, because it does not make sense to merge thread data from separate runs.

5.1.5 Symbol Analysis

You can perform the following tasks to analyze the symbols within Profile Analyzer:

5.1.5.1 Code Miner support

The Code Miner support in Visual Performance Analyzer enables you to populate an SQL database with information from Profile Analyzer profiles, and then perform SQL queries on the database to detect performance patterns that are not easily detected by traditional profilers. The database tables allow you to associate profile counter information, symbols, and disassembly instructions so that you can find inefficient or highly active patterns of instructions, instruction sequences, symbols, register usage, and so on.
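To make the association between counters, symbols, and instructions more concrete, the following Python sketch runs one pattern query against a tiny in-memory table. It is only an illustration: Code Miner stores its data in DB2 and this guide does not list its real table or column names, so the example uses Python's built-in sqlite3 module as a stand-in, and the table demo_instructions with its columns (symbol, seq, mnemonic, ticks) is hypothetical.

```python
import sqlite3

# Hypothetical schema for illustration only; the real Code Miner tables
# live in DB2 and are created by the Populate Code Miner Database wizard.
CREATE_SQL = """
CREATE TABLE demo_instructions (
    symbol   TEXT,     -- symbol (method or function) name
    seq      INTEGER,  -- position of the instruction within the symbol
    mnemonic TEXT,     -- instruction mnemonic
    ticks    INTEGER   -- profile ticks attributed to the instruction
)
"""

# Join each instruction to the one that follows it in the same symbol,
# then total the ticks of every (first, second) mnemonic pair.
PAIR_SQL = """
SELECT a.mnemonic AS first_insn,
       b.mnemonic AS second_insn,
       SUM(a.ticks + b.ticks) AS pair_ticks
FROM demo_instructions AS a
JOIN demo_instructions AS b
  ON b.symbol = a.symbol AND b.seq = a.seq + 1
GROUP BY a.mnemonic, b.mnemonic
ORDER BY pair_ticks DESC, first_insn, second_insn
LIMIT ?
"""


def hottest_instruction_pairs(rows, limit=3):
    """Return the hottest pairs of sequential instructions."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(CREATE_SQL)
        conn.executemany(
            "INSERT INTO demo_instructions VALUES (?, ?, ?, ?)", rows)
        return conn.execute(PAIR_SQL, (limit,)).fetchall()
    finally:
        conn.close()


# Made-up sample data for two symbols.
sample = [
    ("foo", 0, "load", 10), ("foo", 1, "add", 30), ("foo", 2, "store", 5),
    ("bar", 0, "load", 20), ("bar", 1, "add", 15), ("bar", 2, "branch", 1),
]
for row in hottest_instruction_pairs(sample):
    print(row)
```

The point of the sketch is that the pair total is aggregated across symbols: an instruction pair that is not remarkable in any one symbol can still dominate once its cost is summed over a whole module or profile.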
A Code Miner query can be used, for example, to find the hottest pairs of sequential instructions, or all symbols that contain a particular instruction sequence, or all symbols that are hotter than a certain threshold and have a certain pattern in their name.

Code Miner is ideal for analyzing flat profiles, namely profiles where no single symbol uses more than a fractional percentage of total ticks. In flat profiles, the objective is to find patterns of disassembly code, or usage patterns of certain types of symbols, that are inefficient. Without Code Miner it is extremely difficult to determine which patterns are worth investigating. Because Code Miner lets you determine the overall cost of a particular pattern within an entire module or an entire profile, you can use it to detect the patterns that will yield the maximum benefit when optimized, replaced, or eliminated.

Code Miner user interface support within Profile Analyzer includes a Code Miner wizard for populating data from a profile, and two views:
• A Code Miner Query view, through which you can query the Code Miner tables for a particular profile to find patterns of interest. It displays the results in a sortable column-based table.
• A Query Tree view that saves queries and database configurations so that you can easily locate, edit, and re-run past queries or queries imported from another user, such as Compiler listings.

5.1.5.1.1 Populate Code Miner Database

In Profile Analyzer, you can store profile or trace file data in a DB2 database by using the Populate Code Miner Database wizard. Each time you open the wizard, you can choose to create new tables, append to existing tables, or clean tables. If you store data in the database, you must first define the prefix of the tables that will hold the data.
On the next page of the wizard, given the name, host, port and admin password, Profile Analyzer can access the database and populate the data automatically. Follow these steps to populate a profile file into new tables in an existing database:

1. Open a profile file and right-click the file to choose Populate CodeMiner Database.
2. Define a table prefix, create new tables in the database and include the fields you need.

Figure 105 Specify the profile settings

3. Enter the DB2 connection information on the following wizard page. Make sure that the connection can pass through the firewall so that the database can be reached.

Figure 106 Pass the firewall in db2 connection

4. Click Finish.

5.1.5.1.2 Code Miner Database Queries

To check the data in the database, run a query in Code Miner Queries. Follow these steps:

1. Create a new connection to the preceding DB2 database through the CodeMiner Queries view or the Query Tree view:
   • Press the Edit button in the CodeMiner Queries view
   • Right-click the connection node and choose Define Database Connection

Figure 107 The Query Tree view

2. Input SORT in the Prefix box and click to list the fields and tables with that prefix.

Figure 108 Input Sort in Prefix box

3. Choose a line item and right-click it to add the selected statement to the query.

Figure 109 Add the selected statement to query

4. Press to run this statement in the database.

Figure 110 Run query

5. Double-click an instruction item in the new pop-up tab to view the profile this instruction belongs to.

Figure 111 View the profile this instruction belongs to

5.1.6 Couple with Code Analyzer

Profile Analyzer can integrate with Code Analyzer for better navigation and comparison of module information between the profiling file and the executable file. This function is initiated from the generic hierarchy view. The following views and editor are then synchronized with a selection made in any one of them:
• Profile Analyzer
   o Disassembly/Offsets View
   o Compiler Listing View
   o Source Code View
• Code Analyzer
   o Instruction Editor

To couple with Code Analyzer, make sure that the profile and the binary file have at least one module in common. Follow these steps:

1. Open a profile file.

Figure 112 The layout of the Profile Analyzer

2. Navigate the generic hierarchy view, choose a symbol in a module, and load its disassembly and offset information by double-clicking the symbol or pressing ENTER on it.

Figure 113 Select a symbol and load its disassembly and offset information

3. Open the Compiler listing view, double-click or press ENTER on the selected symbol, and choose the appropriate listing file to load. The loaded compiler listing information of the selected symbol is as follows:

Figure 114 The loaded compiler listing information of the selected symbol

4. Open the Source code view, double-click or press ENTER on the selected symbol, and choose an appropriate source file to load.
   The loaded source line information of the selected symbol is as follows:

Figure 115 The loaded source line information of the selected symbol

5. Right-click this module symbol in the generic hierarchy view. On the context menu, choose Open in CodeAnalyzer.

Figure 116 Open the selected module symbol in Code Analyzer

6. Choose the corresponding binary file of this module.

Figure 117 Choose a corresponding binary file

7. The Code Analyzer perspective then opens automatically with this binary file. To synchronize Profile Analyzer with Code Analyzer, be sure to open the Disassembly/Offsets view, the Compiler Listing view and the Source Code view in the Code Analyzer perspective. Now, when you select a table row in those Profile Analyzer views, the Code Analyzer Instructions editor highlights the instructions that have the same addresses as the selected disassembly lines, listing lines or source lines.

Figure 118 Select a line in any of the three views to highlight the lines in Code Analyzer

Conversely, after you select an instruction in the Code Analyzer Instruction editor, the views in Profile Analyzer highlight the corresponding disassembly lines, listing lines and source lines.

Figure 119 Select an instruction in Code Analyzer editor to highlight the corresponding lines in the three views

5.1.6.1 View disassembly comparison

In this release, you can compare the offsets, ticks and disassembly of two profiles in the Disassembly Comparison view. You can open the Disassembly Comparison view by choosing Windows -> Show View -> Other... -> Profile Analyzer -> Disassembly Comparison, or find it in the bottom right panes.
To view the disassembly comparison, follow these steps:

1. Compare two profiles through the Profile Comparison Wizard, or open an .etc file.
2. Open the Profile Comparison view by choosing Windows -> Show View -> Other... -> Profile Analyzer -> Profile Comparison.

Figure 120 The Profile Comparison view

3. Right-click the line in the Profile Comparison view and choose Show disassembly comparison.

Figure 121 Show disassembly comparison

4. The corresponding Disassembly Comparison view opens.

Figure 122 The Disassembly Comparison view

There are two hotness bars, one for each of the two profiles being compared. You can navigate the Disassembly Comparison view by clicking them as follows:

Figure 123 Click on the hotness bar to navigate Disassembly Comparison view

By right-clicking and selecting a menu item, you can sort columns by source line number or offset.

Figure 124 Sort columns by source line number or offset

5.1.6.2 View offsets and disassembly

Profile Analyzer can disassemble the instruction stream for any symbol for which such a stream is available. Disassembly support is available for the following platforms:
• Intel IA32
• AMD-64 or EM64T (same instruction set)
• PowerPC
• Cell BE (both PPE and SPE)

Whether the profile contains an instruction stream depends on the profiling tools used to create it. For JITCODE (JIT-compiled Java methods), instruction streams are available if the JPROF library was loaded with the JVM (using the -Xrunjprof option), the jints sub-option was specified as part of this option, and the log-jita2n* files produced were available at the time that merge tprof was run. Only the IBM Virtual Machine for Java supports the JPROF library.
When disassembly can be generated for a symbol, Profile Analyzer displays a table containing instruction addresses, the bytes for each instruction, the instruction sequence, and tick information. The following view shows the disassembly for a Java method on an Intel IA32 system:

Figure 125 The disassembly information in OffsetAsm Information view

If no disassembly is available, Profile Analyzer displays an Offsets view containing ticks for each offset. The following view shows the offsets for the NTOSKRNL.EXE module of the same profile. This module has a single symbol, referred to as NoSymbols, because symbol data (and by extension, the instruction stream of a symbol) could not be obtained:

Figure 126 The offset information in OffsetAsm Information view

If you are expecting to see disassembly data for a symbol and instead see only offset data, check the following list:
• There should be a tprof.micro section (for static-compiled methods) or a log-jita2n section (for JIT-compiled methods) in the profile you have loaded.
• The appropriate section should contain instruction data. In the tprof.micro section, the symbol must have a sequence of lines beginning with C:; if no such lines exist in the tprof.micro section, the -off option may not have been specified in the POST options (if you were profiling manually). In the JITA2N section, after each symbol there should be a sequence of binary bytes or hex data.
• You may be able to view the disassembly by switching to the Disassembly view. Click the pull-down menu at the top right of the view and ensure that the Show disassembly item is checked.
5.1.6.2.1 Navigating the Disassembly/Offsets View

You can quickly navigate to areas of high activity in this view using either the hotness bar or a combination of sort and selection actions.

Navigate to hot areas by sorting and selecting

You can click any column in the OffsetAsm Information view to sort by that column. Repeated clicks on the same column reverse the previous sorting order. To navigate to hot areas in a symbol, follow these steps:

1. Click a column heading that relates to CPU activity to sort the view by that column (e.g. Ticks, %CPU activity, or a CPU counter if the profile contains CPU counter counts). The lines with the most events are sorted to the top.
2. Select a line of interest; the top line should be the one with the most events in the column you selected.
3. Click the Offset column heading to sort by offsets again. The busy line you selected in the previous step remains selected and remains in the visible area.

If your platform supports symbol call resolution (currently only the x86 and x86-64 platforms, because these are the only platforms in which direct relative or absolute branch instructions are used to make calls to other symbols), you can also quickly find calls to resolved targets by sorting by the Remarks column. The following shows disassembly for a JIT-compiled Java method, sorted by the Remarks column so that lines containing call targets are displayed at the top:

Figure 127 Sort by the columns

You can use the same three-step sorting technique for offsets, to find calls to a particular symbol:

1. Sort by the Remarks column. You might need to select the column header twice if no call targets are visible the first time you select the column.
2. Select the line containing a call target of interest.
3. Sort by the Offsets column.
   The line you previously selected is now in the visible area and instructions are displayed in offset order.

You can double-click a line containing a call target to switch the current symbol in the OffsetAsm Information view to the target symbol.

Loop nest detection and branch target detection

When Profile Analyzer loads the disassembly for a symbol, it analyzes internal direct branches in the disassembly to determine loop patterns. Any backward branch may be considered the end of a loop, provided certain other parameters are met. Any block of code detected to be within a loop is indented by one space; if multiple nested loops are detected, sections of code may appear more deeply indented. In some cases the level of indentation may be extreme, as in the following example:

Figure 128 The indented loop in OffsetAsm Information view

Here the indentation shows at least 15 levels of nesting. While it is unlikely that a programmer would have written a loop nest 15 layers deep, this level of nesting may occur where a compiler has inlined calls that occur within loops, and the inlined calls themselves contain other nested loops or further inlined calls. You can remove loop nest indenting by clicking the icon at the top of the view. If the icon is displayed, clicking it displays loop nest indenting for a symbol whose disassembly was not indented.

Reordering columns

You can reorder the columns of the Disassembly/Offsets view to hide or show particular columns or change the order in which columns are displayed. This is one of the features of Eclipse 3.1. To reorder a column, click the column name and drag it to the place where you want it to be.
The result is shown as follows:

Figure 129 Reorder the columns by clicking the Bytes column

Branch and call navigation

Any disassembly line that contains a branch to a known target within the current symbol, or a call to another symbol, is indicated by an arrow icon in the left margin. The icons distinguish three cases:
• A forward branch, one whose target is a subsequent instruction.
• A backward branch, one whose target is a previous instruction.
• A call or branch to another profiled symbol. This is only available for x86 and x86-64 platforms.

When you double-click a line containing one of these icons, the view changes to show the target of the branch (a different location in the current symbol, or the target symbol of a call). To navigate back to the last in-symbol branch you selected, after you have followed that branch, press the button on the OffsetAsm Information view toolbar.

The hotness bar

Clicking an area in the hotness bar takes you to the corresponding disassembly instructions or offsets. For lengthy disassembled methods, you may need to page up or down to find the hot area in question, as a line in the hotness bar that is one pixel high might relate to several pages of disassembly. When you select a line in the Disassembly/Offset Information view, a yellow square appears in the hotness bar to show the currently selected area of the symbol.

5.1.6.3 View source code

When an executable or library has been compiled with line number information (for example, with the -g option on some compilers), the platform profiler, such as TPROF on AIX, may be able to obtain line number information for profiled symbols in that executable or library. You can then view source code for these symbols within Profile Analyzer. Line number support is available when the TPROF post-processing command includes the -off option.
This option is enabled by default when you use the run.tprof_e script or run the profiling session from the Profiling Configurations view.

When you first select a symbol for which line numbers are available from the Symbols view, a dialog asks whether you want to view source code for the symbol:

Figure 130 Open a source file

If you choose Yes, a File dialog lets you navigate to the path containing the file. The name of the file you choose from this dialog does not have to match the name in the TPROF output, but if the line numbers do not match those of the file from which the code was compiled (for example, if the file has been edited since it was compiled), the tick information may not map to the correct source line numbers.

If you choose No, you are not prompted to enter source for any other symbols in the current profile, but when you load a different profile containing line number information, you may again be prompted to locate source files.

If you choose Don't ask me again, you will not be asked to open a source file until you exit and restart Profile Analyzer.

The following view shows source code for a symbol:

Figure 131 The Source code view

If you choose No or Don't ask me again, no source code is shown. The Source code view then looks as follows:

Figure 132 The Source code view with no line number information

You can associate a source file again by pressing the button in the center of the view, or by clicking the Associate Source File icon on the toolbar of the view.

When you click different areas of the hotness bar on the right side of the Source code view, the corresponding line in the source file you selected is highlighted.
You can see it in the following view:

Figure 133 Navigate the Source code view through the hotness bar

You can export the source code to a file by selecting Export to Files... on the menu of the toolbar as follows:

Figure 134 Export the source code to files

5.1.6.4 Open Profile Summary

The Symbol Summary view displays summarized information about the profile. This function is enabled when a profile is active in the Profile Analyzer editor. The following steps show a typical scenario for examining a profile's summary.

1. Load the profiling file into Profile Analyzer, or bring an opened profile to the top in the Profile Analyzer editor.
2. Load the profile summary by symbol. In the Symbol Summary view, select Load Summary by Symbol on the context menu, or press the icon on the toolbar.

Figure 135 Load summary by symbol

3. Load the profile summary by bucket. In the Symbol Summary view, select Load Summary by Process Bucket, Load Summary by Thread Bucket, or Load Summary by Module Bucket on the context menu or on the toolbar drop-down menu.

Figure 136 Load summary by module bucket

4. Choose custom counters to explore the profile summary. This function is enabled only when viewing the profile summary by symbol.
   • Select Choose Counters... on the context menu of the Symbol Summary view.

Figure 137 Choose counters to explore profile summary

   • In the pop-up dialog, select the custom counters to be added.

Figure 138 Select the custom counters to be shown

   • Additional columns representing the custom counters are added.
Figure 139 The custom counter columns are added into the view

• To remove the displayed custom counter columns, select Choose Counters... on the context menu and deselect the previously chosen custom counters.

5.1.6.5 View temporal profiling

When a Profile Analyzer profile contains a trace buffer section, Profile Analyzer attributes buffer events to the appropriate symbol offsets, symbols, modules, threads, and processes. When you select a profile object (click on a process, thread, or module in the process tree, double-click on a symbol in the symbol list, or double-click on a disassembly or offset line with tick information in the OffsetAsm Information view), Profile Analyzer can display, in a temporal graph, how the tick count of an event changes over the course of the profile run. This version of the Temporal Profiling view is called Tick intensity over time. The following screen capture shows the Temporal Profiling view for a java.exe process in a profile that runs for about 7.5 seconds:

Figure 140 The Temporal Profiling view

You can change the number of intervals of a temporal profile by dragging the interval slider (on the left) to the left or right. This action changes the number of intervals used to display the temporal graph. Changing the granularity of a 25-second trace run to 50 intervals results in each bar showing the events for a particular half-second. For the same trace run, a granularity of 10 would have 2.5 seconds attributed to each bar.
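The interval arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name bucket_ticks and the representation of tick events as a list of timestamps in seconds are hypothetical and are not part of Profile Analyzer.

```python
def bucket_ticks(timestamps, duration, intervals):
    """Count tick events per equal-width time interval.

    Hypothetical helper for illustration only.
    timestamps: event times in seconds from the start of the run
    duration:   total length of the run in seconds
    intervals:  number of bars (the granularity chosen with the slider)
    Returns (seconds_per_bar, list_of_counts).
    """
    if duration <= 0 or intervals <= 0:
        raise ValueError("duration and intervals must be positive")
    width = duration / intervals
    counts = [0] * intervals
    for t in timestamps:
        if 0 <= t <= duration:
            # An event exactly at the end of the run goes into the last bar.
            index = min(int(t / width), intervals - 1)
            counts[index] += 1
    return width, counts


events = [0.1, 0.4, 0.6, 24.9, 25.0]
width_50, counts_50 = bucket_ticks(events, 25.0, 50)
width_10, counts_10 = bucket_ticks(events, 25.0, 10)
print(width_50, width_10)
```

For a 25-second run, 50 intervals give a bar width of 0.5 seconds and 10 intervals give 2.5 seconds, which matches the figures quoted above.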
The following screen capture shows the same information as the preceding one, but with a granularity of 8 (that is, 8 equal-time intervals):

Figure 141 The temporal profiling chart displayed in 8 intervals

When the selected object is a Profile, Process, Thread, or Module, up to six of the "natural" children of that object are shown in a line graph superimposed on the bar graph, as can be seen in the preceding images. A child is only shown if it is a significant contributor to the parent tick count. You can hide the tick information of a child by deselecting the check box beside its name below the bar graph. The following screen capture shows the same profile, with only the top two children selected:

Figure 142 The temporal chart of the top two children of the java.exe process

The "natural children" of a profile object are as follows:

Parent        Natural children
Profile       Processes
Process       • Modules, if the "Ignore thread data" checkbox in the process tree is selected
              • Threads, if the "Ignore thread data" checkbox is not selected
Thread        Modules
Module        Symbols
Symbol        No children
Offset tick   No children

5.1.6.5.1 Changing the tick scale for children

In some profiles, it may be hard to distinguish the lines of the "natural children" of a particular profile object. You can drag the Zoom children vertically slide bar to exaggerate or diminish the scale for the line graphs that are superimposed on the bar graph. The following picture shows the preceding view but with all children selected and their vertical scale exaggerated to 400% of the parent object scale. The lines for the highest-contributing children extend off the top edge of the chart, but the difference in relative contribution of the three lesser modules is easier to distinguish because their lines are further apart.
Figure 143 The temporal chart exaggerated by 400% of the original chart

5.1.6.5.2 Zooming in on a time range

To zoom in on a particular time range, hold the left mouse button at one end of the time range, and drag the mouse cursor left or right. A rectangle shows the time range selected (the vertical dimensions of the rectangle are not relevant to the result of the selection). The following picture shows the initial selection of a time range within a profile:

Figure 144 Zoom in the chart on a particular time range

When you release the left mouse button, the selected area is zoomed:

Figure 145 The zoomed chart of the selected area

You can drag a zoomed view forward or backward in time by holding down mouse button 3 (the right mouse button for right-handed users) and dragging. Note that the chart bars and lines move up and down while you are dragging. This is because each time Profile Analyzer handles an increment of the drag event, it redistributes the trace events according to the current coordinates.

You can zoom further into an already zoomed image by selecting a new zoom area:

Figure 146 Zoom an already zoomed image

This zooms to the time range 5.76 to 9.10 seconds:

Figure 147 The zoomed area in figure 146

To restore the time view to the full duration of the profile, right-click over the bar chart.

5.1.6.5.3 Time cursor

Since VPA 5.0, the Temporal Profiling view has two time cursors, one for the start time and one for the end time. The Temporal Profiling view displays a column chart of the time intervals of all the profile data. Time labels appear below the column chart, once every few time intervals. The time cursors help you position the time line of the column chart accurately.
When you select a cursor and drag it along the column line, the cursor time is displayed as it moves. Initially, the start cursor is at the bottom left and the end cursor is at the top right.

Figure 148 The start and end time cursor in the Temporal Profiling view

The preceding figure shows the Temporal Profiling view for an .etm profile data file with time data. The two cursors are displayed at their initial positions in the column chart.

Figure 149 Drag the cursors to new positions

When you drag a cursor, a time line is displayed and the time value changes as the cursor moves. Here the start cursor is labeled 11.26 and the end cursor has been dragged to 23.78, in seconds.

Figure 150 Range the area of time intervals according to the time cursors

The time cursors help you focus on a range of time intervals and position the mouse to select that range. In the previous graph, the cursors mark a range of time intervals; if you are interested in the profile data for that range, you can drag a rectangle aligned with the cursors. This helps you select the range of time intervals more accurately.

Figure 151 The exaggerated chart in a new range

When you select a range of time intervals with the mouse, the Temporal Profiling view refreshes the graph and displays a new column chart of the profile data inside the selected range. The preceding chart results from selecting the range from 8.91 seconds to 32.75 seconds. The start cursor is at the far left and the end cursor is at the far right. In this graph, the time cursors can be moved as well.
5.1.6.5.4 Time/memory profiling

The Temporal Profiling view also lets you view how tick events for a selected object are distributed within a matrix of memory and time. This is mainly of interest to compiler writers, or other specialists concerned with how well busy sections of a symbol or module are distributed within a processor's instruction cache. To switch to the Time/memory profiling view, select the Show memory usage over time icon. Select a profile object for which this view makes sense, typically a symbol or module. (It is not normally productive to view the entire profile or a particular process in this view, because individual libraries within the profile or even a single process may occupy widely different memory ranges.) The following picture shows a time/memory view of the JITCODE module of a java process (the module containing JIT-compiled Java methods):

Figure 152 The Temporal Profiling view with memory usage information

Individual colored rectangles in this view represent profile intensity for a given time and a given address range. To produce this view, Profile Analyzer divides the profile ticks for the selected object (profile, process, module or symbol) into equal time intervals (determined by the left-hand or time interval slider, as for the Tick intensity over time view) and divides the memory range of the selected object into equal memory intervals (determined by the right-hand or memory interval slider). Individual rectangles that contain ticks represent a region of memory that was busy at a particular time. If there is sufficient space, Profile Analyzer displays the tick count for each active intersection within the rectangle.
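The binning described above can be sketched as follows. This is a minimal illustration, not Profile Analyzer's implementation: the function name time_memory_matrix and the representation of ticks as (time, address) pairs are assumptions made for the example.

```python
def time_memory_matrix(ticks, t_start, t_end, addr_start, addr_end,
                       time_bins, mem_bins):
    """Bin ticks into a grid of equal time and equal memory intervals.

    Hypothetical helper for illustration only.
    ticks: iterable of (time_in_seconds, address) pairs
    Returns matrix[memory_row][time_column] holding tick counts.
    """
    t_width = (t_end - t_start) / time_bins
    a_width = (addr_end - addr_start) / mem_bins
    matrix = [[0] * time_bins for _ in range(mem_bins)]
    for t, addr in ticks:
        # Skip ticks outside the selected object's time or address range.
        if not (t_start <= t <= t_end and addr_start <= addr < addr_end):
            continue
        col = min(int((t - t_start) / t_width), time_bins - 1)
        row = min(int((addr - addr_start) / a_width), mem_bins - 1)
        matrix[row][col] += 1
    return matrix


sample = [(0.5, 0x1000), (0.6, 0x1004), (3.9, 0x10F0)]
grid = time_memory_matrix(sample, 0.0, 4.0, 0x1000, 0x1100, 4, 4)
for row in grid:
    print(row)
```

Each non-zero cell corresponds to one colored rectangle: a memory interval that was busy during one time interval. Raising time_bins or mem_bins mirrors moving the two sliders.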
If you increase the number of intervals, the tick counts may disappear, but the color scheme still indicates which memory ranges are busy at what time, with darker shades denoting higher tick counts:

Figure 153 The memory usage information in increased intervals

The table at the bottom of the view displays information about the active objects in a particular rectangle in the view. To use this capability, make sure the Temporal Profiling view is the active view (click on the tab), then hold down the Shift key while moving the mouse. As the pointer moves over different areas of the graph, the objects that occupy the memory range for the current area are shown in the table. Two tick counts are shown for each object: In range identifies the number of ticks the indicated object contributes to the current time or space interval; Total represents the number of ticks the indicated object contributes to the profile as a whole. In the following view, the Shift key is being held down and the pointer is over the rectangle with a tick count of 426. The busiest symbols for that time or memory range are displayed at the top of the table:

Figure 154 The bottom table with the objects occupying the selected memory range

Once you release the Shift key, the table contents do not change. In this way you can use the mouse to scroll through the table after you have chosen a particular rectangle, without mouse movements changing what is displayed in the table.
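The difference between the In range and Total columns can be illustrated with a small sketch. The function name in_range_and_total and the (time, address, symbol) tuple layout are hypothetical and chosen only for this example.

```python
from collections import Counter


def in_range_and_total(ticks, t_lo, t_hi, a_lo, a_hi):
    """Return (symbol, in_range, total) rows for one rectangle.

    Hypothetical helper for illustration only.
    ticks: iterable of (time, address, symbol) tuples
    The rectangle covers times [t_lo, t_hi) and addresses [a_lo, a_hi).
    Rows are sorted so that the busiest symbols in the rectangle come first.
    """
    total = Counter()
    in_range = Counter()
    for t, addr, symbol in ticks:
        total[symbol] += 1  # contribution to the profile as a whole
        if t_lo <= t < t_hi and a_lo <= addr < a_hi:
            in_range[symbol] += 1  # contribution to this rectangle only
    rows = [(s, n, total[s]) for s, n in in_range.items()]
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows


sample = [
    (0.1, 100, "a"), (0.2, 110, "a"), (0.3, 120, "b"),
    (5.0, 100, "a"), (0.4, 900, "b"),
]
table = in_range_and_total(sample, 0.0, 1.0, 100, 200)
print(table)
```

Symbols that have no ticks inside the rectangle are left out of the result, in the same way that the table lists only objects occupying the current area.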
5.1.6.5.5 How to use the time/memory view

The following time/memory view is for a statically compiled function within the garbage collection module of the J9 virtual machine for Java:

Figure 155 The time/memory view

In this view, each row represents 69 bytes (as shown by the legend beside the right-hand slider bar). You can see that there are two fairly busy ranges: the first range consisting of the top two rows, and the second range consisting of the fifth row (the row whose first displayed value is 21 ticks).

One use of the time/memory view is to show whether the code is properly ordered within busy symbols. For instance, the above function might provide better performance if the areas of code that are busy were closer together in memory, as they would likely use fewer I-cache lines if grouped together. Note that you should only attempt code reordering based on the time/memory view after analyzing the same symbol in several profiles, and there is no guarantee that your reordering yields improvements. For example, compilers might completely reorder the sections of your code when they generate the machine code for a symbol. However, this view can help you identify the symbols or modules with time/memory usage patterns that warrant further investigation.

5.1.7 Create/Configure/Refresh/Discard a Connection

Create Connection

In the Database Connection view, select New Connection on the context menu to create a new repository. Local Repository is created by default and can only be configured and refreshed. The Description view lists basic information about the repository.

Figure 156 The Database Connections view with repositories

Configure Connection

The following picture shows the Configure Dialog of Connection.
Figure 157 The DB2 connection configuration window

Refresh Connection

Double-click a connection or click on the + icon to expand the tree. A Password Dialog opens when you open the repository for the first time. If the password is saved, you no longer have to re-enter it when VPA is closed and restarted later. If the password is not saved, you are prompted for it every time you start VPA.

Figure 158 The Password dialog

You can also choose to clear the password on the context menu.

Figure 159 Clear the saved password

Discard Connection

You can choose to discard the connection on the context menu. Be sure to delete all its children first.

Figure 160 Delete a connection

5.1.8 Configure database connections and manage cached database files

To improve performance, Profile Analyzer loads its files into a database, either hsqldb or DB2, instead of keeping them in memory (where they may spill into page files). Each action then executes a query that retrieves only the data it needs. To open this view, click Window - Show View - Other, and then select Database Connections under the Visual Performance Analyzer category, as follows.

Figure 161 Open the Database Connections view

When you start VPA for the first time, a default hsqldb connection is created, together with the product supports under this connection node, as follows.

Figure 162 The default connection and product supports

The single Profile Analyzer support is set as active, so you can open files without any further setup. However, you can create other connections or edit this default connection as you like, for example by modifying its path.
Figure 163 Create hsqldb connection

Figure 164 Create DB2 connection

After you have created these two connections, the view looks as follows.

You can create Profile Analyzer support under each connection. For ordinary use, hsqldb is sufficient; when performance matters, DB2 is the better choice.

Figure 165 Create Profile Analyzer support under hsqldb connection

Figure 166 Create Profile Analyzer support under DB2 connection

Hsqldb is an embedded database system and therefore has some limitations. To prevent it from occupying too much disk space as more files are loaded, set a size limitation. The Size Limitation value is the upper limit for the hsqldb data file. If the size of the database files exceeds this limit, the system deletes the oldest files until the size of the database file is smaller than the limit. During this automatic deletion, any file whose own size is larger than the size limitation is deleted first.

Figure 167 Set hsqldb's size limitation for Profile Analyzer support

You can also delete files manually to release disk space. Multiple selection is allowed. See the following picture.

Figure 168 Delete cached files

To set a Profile Analyzer support as active, switch to the Profile Analyzer perspective so that the related menu named Set Active appears. Click it to set the connection of your choice as active.

Figure 169 Select active connection

The active Profile Analyzer support cannot be deleted. To delete it, set another support as active first. A connection cannot be deleted until you delete all the product supports under it.

The following view supports sorting.
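The size-limit cleanup rule described earlier in this section can be sketched as follows. This is only one reading of that rule, not the product's code: the function name files_to_delete, the (name, size, loaded_at) tuple layout, and the choice to stop as soon as the total no longer exceeds the limit are all assumptions.

```python
def files_to_delete(files, size_limit):
    """Choose cached files to delete so the total fits the size limit.

    Hypothetical helper for illustration only.
    files: list of (name, size, loaded_at) tuples; a smaller loaded_at
           value means the file was loaded earlier (is older)
    Returns the names to delete, in deletion order.
    """
    total = sum(size for _, size, _ in files)
    # Files larger than the limit go first, then the oldest files.
    order = sorted(files, key=lambda f: (f[1] <= size_limit, f[2]))
    deleted = []
    for name, size, _ in order:
        if total <= size_limit:
            break  # assumption: stop once the limit is no longer exceeded
        deleted.append(name)
        total -= size
    return deleted


cache = [("old", 30, 1), ("huge", 200, 5), ("mid", 40, 2), ("new", 50, 3)]
print(files_to_delete(cache, 100))
```

In this sample the oversized file is removed first even though it is the newest, and then the oldest remaining file is removed to bring the total under the limit.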
You can sort the files by name, size or date. To sort, select the sort mode on the action bar, or click the column title directly.

Figure 170 Click the column title to sort the files

Figure 171 Select a sort mode

5.1.9 View call graph in Call Graph view

The Call Graph view is a common view shared by VPA tools including Profile Analyzer and Call Tree Analyzer. In Profile Analyzer, the Call Graph view supports output files with call graph information from the following tools:

Platform      Tools
Linux/X86     oprofile 0.9.3 (--callgraph option)

To start the Call Graph view, choose Window -> Show View -> Other. In the Show View dialog, choose Visual Performance Analyzer -> Call Graph View, or type Call Graph View in the textbox shown in the following picture and then select Call Graph View. Click OK to open the view.

Figure 172 Open the Call Graph view

The Call Graph view has three display modes: Expansion mode, Overall mode and Compound mode.

Expansion mode lets you choose one function as the starting point, and then click the caller button or callee button to view the functions that call this function or that this function calls. The callee functions always appear to the right of or below the function you are working on, depending on whether the expansion direction is horizontal or vertical. As a result, more than one node for the same function can appear in the graph. This mode is useful when you want to explore function calling relationships in depth. The following picture shows the graph in Expansion mode.

Overall mode displays the overall calling relationships among functions by displaying the whole graph in the view. You can use this mode to get an overview of the calling relationships of the functions. In this mode, each function appears only once in the graph.
The following picture shows the graph in Overall mode.

Compound mode mainly displays the calling relationships between modules such as a library, Java class, or Java package. All the functions of the same module are grouped in a rectangular box which represents the module. The connections between modules are drawn as blue arrows, and the blue text at the top of the rectangle is the name of the module. The following picture shows the graph in Compound mode.

The following example shows how to view the graph in Expansion mode.

Example: View the graph generated by opm file

First, launch Profile Analyzer by clicking its icon, and select an opm file. In the editor, right-click a function item, and select Show Symbol in Call Graph in the context menu as shown in the following picture:

Figure 173 Display the selected symbol in call graph

The function you selected is then displayed in the Call Graph view:

Figure 174 The selected symbol main displayed in call graph

The following examples show how to view the graph in Overall mode.

Example: View the graph generated by opm file

First, launch Profile Analyzer by clicking its icon. Then choose File -> Open File... and select an opm file. In the editor, select a module item under the Modules hierarchy:

Figure 175 Select a module

Right-click the selected item and choose Show Call Graph ... in the context menu.

Note: To open this dialog, you must select a module item under the hierarchy tree that starts with the name Modules.

Figure 176 Open the Show Call Graph dialog

In the Show Call Graph dialog, choose Overall Call Graph. Click OK and the graph is displayed in the view.
Figure 177 Select the overall display mode

Figure 178 The call graph in overall mode

The following example shows how to group insignificant function nodes into an oval node in Overall mode.

Example: Group insignificant function nodes

Launch the Show Call Graph dialog, choose Overall Call Graph in Display Mode, and then select the option "You can specify base time threshold to group insignificant nodes". Specify a value in Base time threshold. You can click the Update button to position the scroll bar below the field according to the specified base time. You can also drag the scroll bar below the field, and the base time value in the field changes accordingly.

Note: The option "You can specify base time threshold to group insignificant nodes" is enabled only in Overall mode.

Figure 179 Specify a base time threshold to group insignificant nodes

Then click OK to close this dialog. If you specify 500 in Base time threshold, function nodes whose base time is smaller than 500, and which have no child function nodes with a base time larger than 500, are grouped in an oval node. The following picture shows that the functions test1 and test11, which have a base time smaller than 500, are grouped in an oval node. 772.0 is the total base time of the two functions. You can hover the mouse over the oval node to get the list of names of the grouped function nodes.

Figure 180 The nodes with the value smaller than the base time threshold grouped in an oval

The following example shows how to view the graph in Compound mode.

Example: View the graph generated by opm file

First, launch Profile Analyzer by clicking its icon. Then choose File -> Open File... and select an opm file.
In the editor, select a module item under the Modules hierarchy:

Figure 181 Select a module node

Right-click the selected item and choose Show Call Graph ... in the context menu.

Note: To open this dialog, you must select a module item under the hierarchy tree that starts with the name Modules.

Figure 182 Open the Show Call Graph dialog

In the Show Call Graph dialog, choose Group by Module. Click OK and the graph is displayed in the view.

Figure 183 Select the call graph in compound mode

The following picture shows the call graph in Compound mode. The functions of the same module are grouped in the same rectangular box. The blue arrows show the calling relationships between modules. The blue text is the name of the module. Each rectangular box has a collapse icon. You can click this icon to collapse all the function nodes in the rectangular box.

Figure 184 The call graph in compound mode

The following picture shows the collapsed module:

Figure 185 The call graph with a collapsed module

5.1.10 Create and Use Custom Counter

You can define custom counters from native counters and profile details to gain more insight into the data. For example, you can define Ticks / IFC as CPI, which calculates a CPI value for all modules or symbols that have Ticks and IFC values.

Open Custom Counter Management Dialog

Right-click in the hierarchy tree of the Profile Analyzer editor and choose Custom Counter Management...

Figure 186 Open the Custom Counter Management dialog

The Custom Counter Management dialog opens, in which you can add, edit, or delete custom counters in one configuration file.
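As a rough illustration of the Ticks / IFC example above, the following sketch derives a custom counter column from native counter values and rounds it to a chosen precision. The function name custom_counter, the dictionary layout of each row, the sample values, and the handling of a zero divisor are assumptions made for the example; they do not describe how Profile Analyzer evaluates formulas.

```python
def custom_counter(rows, name, formula, precision):
    """Add a derived counter column to each row.

    Hypothetical helper for illustration only.
    rows:      list of dicts holding native counter values per symbol
    name:      column name of the custom counter, for example "My_CPI"
    formula:   callable that computes the value from one row
    precision: number of digits kept after the decimal point
    """
    result = []
    for row in rows:
        try:
            value = round(formula(row), precision)
        except ZeroDivisionError:
            value = None  # assumption: no value when the divisor is zero
        result.append({**row, name: value})
    return result


symbols = [
    {"symbol": "main", "Ticks": 1200, "IFC": 800},
    {"symbol": "idle", "Ticks": 50, "IFC": 0},
]
with_cpi = custom_counter(symbols, "My_CPI",
                          lambda r: r["Ticks"] / r["IFC"], 2)
for row in with_cpi:
    print(row["symbol"], row["My_CPI"])
```

The original rows are left unchanged and the custom counter appears as one extra key per row, in the same way that a custom counter appears as an added column in the symbol table.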
Figure 187 The Custom Counter Management dialog

Create New Custom Counter

1. Click the New... button to open the Custom Counter Properties dialog.

Figure 188 Create a new custom counter

2. Give a Name for the custom counter, for example "My_CPI". This property is required.

3. Give a Description for the custom counter. This property is optional.

4. Be sure to create a correct Formula. You can open the Standard Formula edit dialog by clicking the Edit button. It helps you build a correct formula.

Figure 189 Build a formula

5. Choose a Unit for the custom counter if needed.

6. Define Precision for the custom counter if needed.

Figure 190 Define the number of precision for the custom counter

Label Example shows an example number with the specified precision.

Select Custom Counter to Display

After you create a custom counter, you can select it from the counter list to display it. Right-click in the symbol table of the Profile Analyzer editor and choose Select Counters to Display. All counters are listed, including native counters and custom counters.

Figure 191 Select the counters from counter list to display it

Select My_CPI. It is then shown as an added column.

Figure 192 The counter My_CPI is added and displayed in the symbol view

5.1.11 Load symbols into the Inlined Calls View

The Inlined Calls view displays two tables related to inlining: a Symbols with inlining table that shows which profiled symbols inline other symbols, and an Inlined symbols table that lists inlined symbols and, for each such inlined symbol, which profiled symbols inline it.
Once a profile is loaded that contains JIT inlining information or for which inlining XML files were subsequently imported, select an item in the profile hierarchy, or one or more symbols in the symbol list, to display inlining information for that hierarchy branch or symbol selection. The selection action results in a two-stage operation:

1. A list of symbols is created internally within Profile Analyzer, and provided to the Symbols with inlining table in the view. The Symbols with inlining table displays all symbols in the selection that contain inlined symbols or macros, and the inlined symbols or macros beneath each such symbol. If the selection in the profile hierarchy or symbol list consists of ten or fewer symbols, all symbols are displayed. If the selection consists of more than ten symbols, only those containing inlined symbols or macros are displayed.

Figure 193 The Inlined Calls view displaying the inlined symbol information of the selected symbol

Troubleshooting FAQ

Q: No inlined children display for a parent symbol

A: If no "+" or "-" symbol is displayed, then no information about inlined symbols or macros is available for that symbol, which can have any of the following causes:

• The symbol did not inline any symbols or macros.

• For Java methods compiled by the IBM Virtual Machine for Java* (J9), the inlining information was not provided in the line number buffer, or no line number buffer was provided (for example, -Xjit:enableJVMPLineNumbers was not specified).

• For listing-handler generated inlining XML files, the XML file was not imported, or the XML file was generated for a different profile or for a profile hierarchy selection that did not result in inclusion of the symbol in question.

5.2 Code Analyzer

You can also find the Code Analyzer User Guide from within VPA.
Select Help - Help Contents within VPA. To get context-sensitive help, press F1 on Windows and AIX or press Ctrl+F1 on Linux.

5.2.1 Load an executable application for analysis

When you first start Visual Performance Analyzer after installation, you see the Welcome view. To open Code Analyzer, choose Windows -> Show Perspective -> Other -> CodeAnalyzer. The following screen capture shows the default workbench window of Code Analyzer.

Figure 194 The workbench of Code Analyzer

Choose File -> CodeAnalyzer -> Analyze Executable.

Figure 195 Open a binary executable file

In the pop-up wizard, select the executable application you want to analyze.

Figure 196 Select an executable file

When you click Open, the executable application is loaded in Code Analyzer.

Figure 197 The Code Analyzer perspective loaded with an executable file

When an executable application is loaded, a wizard opens as follows so that you can load profile information.

Figure 198 The wizard to load profile information

If you want to view the profile information of the loaded executable application, specify the location of the profile file in the preceding wizard. After you click OK, the profile information is added to the Code Analyzer workbench window.

Figure 199 The Code Analyzer perspective loaded with a profile file

5.2.2 Run instrumented executable file and collect profile data remotely

Code Analyzer allows you to run the instrumented executable application on a remote target host, and collect the generated profile data. This is one of the typical steps in creating an optimized version of an application. Follow these steps to perform this task:

1. Load an executable file to be analyzed
2.
Instrument the executable file
3. Run the instrumented executable file and collect profile data

This function is enabled only after you have successfully instrumented the executable file. To launch the action, choose File -> CodeAnalyzer -> Actions -> Run Instrumented or just click the button on the CodeAnalyzer toolbar. A set of dialogs and a wizard then lead you through running the instrumented executable file remotely and loading the collected profile data, as follows:

• In the following dialog, specify the output path in which the instrumented executable file is generated.

Figure 200 Specify the output path for the instrumented file to be generated

Create a new remote connection:

Figure 201 Create a remote connection

Select an existing remote connection:

Figure 202 Select an existing remote connection

• Configure the instrumented file execution. Basically, you need to specify the working directory on the remote host and the command to be executed. In addition, if the instrumented file needs workload data in order to run, select the workload data to upload.

Figure 203 Configure the instrumented file execution

• Press the Finish button to go ahead. Confirm that you want to continue, because the following process normally takes a while.

Figure 204 Confirm to continue

• Authenticate on the remote host.
Usually you must input a password.

Figure 205 Input a password for authentication

• After successful authentication and connection to the remote host, a Progress Information dialog shows the steps being executed.

• After the instrumented executable file has finished running remotely, confirm that you want to load the collected profile downloaded from the remote host.

Figure 206 Confirm to load the collected profile

• If you choose to continue, CodeAnalyzer reloads the original executable file and loads the collected profile file into it. You can then examine the frequency of each basic block in the loaded executable file.

Figure 207 The Instructions editor loaded with the executable file and collected profile file

5.2.3 Adding profiling information

If you have been working with an executable application without profiling information, you can add the profiling information by choosing File -> CodeAnalyzer -> Add Profile Information. Once you have located the profile information file for the loaded executable application, choose it in the wizard. After you press OK, the profile information is added to the Code Analyzer workbench window. The following screen capture shows the workbench window after the profile information is loaded.

Figure 208 The Code Analyzer workbench window loaded with a profile file

5.2.4 Navigate the Executable

5.2.4.1 Navigate the Program Tree

The Program Tree displays a hierarchical view of the loaded executable. It is automatically opened on the left side of the workbench window when you first use CodeAnalyzer. You can open the Program Tree view by choosing Windows -> Show View -> Program Tree. First, you must load an executable.
Then navigate the program tree by expanding all, collapsing all, sorting, going into an item, opening source code and getting control flow.
5.2.4.1.1 Expand All
To show all the functions under each file in the loaded executable, press the corresponding button on the toolbar. The program tree expands as follows:
Figure 209 The expanded program tree
5.2.4.1.2 Collapse All
To close all sub-items of each file in the executable, press the corresponding button on the toolbar. The program tree then appears as follows:
Figure 210 The collapsed program tree
5.2.4.1.3 Sort
There are three kinds of sorting operations in the program tree view: alphabetic order, ascending order and descending order. You can press the corresponding buttons on the view's toolbar, or right-click any object in the view and choose these items.
5.2.4.1.4 Go Into
To get detailed information about the items under a certain object, select the object and click the Go Into button. The program tree then displays all its sub-items. To go up one level, click the corresponding toolbar button. A further toolbar button leads you back to the root view.
For example, to see all the functions in the file os.c, select the os.c file in the program tree as follows:
Figure 211 Select a tree node
The resulting view displays all the functions within this file.
Figure 212 The subfunctions of the os.c file
You can also navigate forward or backward as you like.
5.2.4.1.5 Open source code
To open source code, right-click any object in the program tree and select Open Source Code.
Figure 213 View the source code of the function
5.2.4.1.6 Get Control Flow
To get the calling functions and called functions, select a function in the program tree, right-click and choose Control Flow.
Figure 214 Get the calling and called function
The called functions of the selected function in the preceding screen capture are shown as follows:
Figure 215 The called functions of the selected function in Figure 214
The calling functions of the selected function in the preceding screen capture are shown as follows:
Figure 216 The calling functions of the selected function in Figure 214
5.2.4.2 View Static Color Bar
The static color bar gives an overview of the frequency distribution of basic blocks in the loaded executable. You can open this view by choosing Window -> Show View -> Static Color Bar. To obtain this information, do the following steps:
1. Load an executable
2. Add profile information
After the profile information is added, a yellow pointer appears on the static color bar. It indicates the position of the basic block you have selected. The yellow pointer moves in accordance with the basic block you select in the Instructions Table or Program Tree. The following screen capture shows the selection of a specific color bar.
Figure 217 Navigate the Instructions editor through the Static Color Bar view
5.2.4.3 Navigate Instructions View
The Instructions view is the default view of the Instructions Table. It shows the contents of an executable or shared object as a table of assembly instructions, with its control flow graph drawn vertically at its side. To open this view, select Switch to instructions in the Instructions Table's title bar.
The functions of some buttons in the title bar are explained in Instructions Table. In the Instructions view, when a new file begins, there is a table line that begins with an icon and is followed by the file name. When a new function starts, there is also a declaration on a separate line. Instructions that belong to the same basic block are grouped together, with blank lines between basic blocks. The last basic block in a function begins with a line as follows.
Figure 218 The UNREACHED line indicating the beginning of the last basic block
After adding FDPR-Pro profile information, you can see the frequency of each instruction in color. The meaning of each color is shown in the lower part of the workbench window. Instruction groups are indicated in front of each instruction. In the Comment column, red triangles mark performance comments on specific instructions.
Figure 219 The Instructions editor with frequency
If you right-click any line item in the Instructions view, a menu opens. Typically, this menu has four sections. In the first section, you can get PPC (PowerPC) assembly reference help, branch profile information and dispatch information, and set points to collect the values of important resources during profiling. In the second section, you can find a specific instruction or open the source code of the selected instruction. In the third section, the menu shows the targets of the basic block to which the selected instruction belongs (Fall thru means the next basic block). The last section shows the callers of the basic block to which the selected instruction belongs. The first section and the last two sections might vary according to the selected instruction.
5.2.4.3.1 Get PPC help
To get the assembly reference for an instruction, right-click the instruction and choose Show PPC Help. You can also select the instruction and click the corresponding button on the toolbar.
5.2.4.3.2 Show Dispatch Information
To get the dispatch information for an instruction, right-click the instruction and choose Show Dispatch Info. You can also select the instruction and click the corresponding button on the toolbar. For a more detailed description, refer to View dispatch information.
5.2.4.3.3 Show Branch Information
To get the branch information for a basic block, right-click the last instruction of the block and choose Show Branch Info. You can also select the last instruction of the basic block and click the corresponding button on the toolbar. For a more detailed description, refer to View branch profile.
5.2.4.3.4 Collect Value Profiling
Before the profile information of the loaded executable is added, you can collect the values of certain resources for specific instructions. To do so, first mark the instructions by right-clicking each instruction and choosing Collect Value Profiling. A wizard then opens for you to choose the resource values you want to collect. You can also do this by clicking the corresponding button on the toolbar. For more detailed information, refer to View value profile.
5.2.4.3.5 Find Instructions
To find specific instructions in this view, click the corresponding button on the toolbar. You can also open the wizard by right-clicking any instruction and choosing Find. The following screen capture shows the initial activation of the Find wizard.
Figure 220 Find an instruction
A Find Instruction wizard opens.
Figure 221 The Find Instruction wizard
The default instruction to be sought is the selected instruction.
The initial scope of this wizard is set between the start address and the end address of the loaded executable application or shared object. You can change the scope, but it must not exceed these boundaries. To choose the search direction, select Forward or Backward. Error and information messages are displayed at the bottom of the dialog.
5.2.4.3.6 Open Source Code
To get the source code of the selected instruction, right-click the instruction and choose Open Source Code. You can also use the button on the toolbar. For more information, refer to View source code.
5.2.4.3.7 Go to callers and callees
Right-click the first instruction of a basic block. The menu shows its callers, their addresses and the names of the functions to which they belong. A caller is defined as the first instruction of a basic block that calls the selected instruction.
Figure 222 Go to caller
To get the target basic blocks of the selected instruction, right-click the last instruction of a basic block. The menu then shows its target basic blocks, with the addresses of their first instructions and the names of the functions to which they belong. If its target is the next basic block, the menu shows Fall thru only.
Figure 223 Go to the next basic block
The preceding screen capture shows that the selected basic block has two target basic blocks. One is the following basic block, and the other is in the function .main. You can jump to either basic block by clicking it. You can verify this relationship by referring to the graph beside it.
5.2.4.3.8 Navigate along with program tree
In the program tree on the left side of the workbench window, a hierarchical organization of the loaded executable application or shared object is displayed.
You can navigate the Instructions Table by selecting objects in the Program Tree, or vice versa. For example, if you select the first file in the Program Tree, the first instruction of this file is highlighted in the instructions view, the first basic block of this file is highlighted in the blocks view and the first function of this file is highlighted in the functions view. The following screen capture shows the selection of the first file in the Program Tree.
Figure 224 The selection of the first file in Program Tree view
The selected instruction in the instructions view changes correspondingly.
5.2.4.4 Navigate the Blocks View
The Blocks view shows detailed information about the loaded executable or shared object in the form of basic blocks. To open this view, select Switch to blocks on the Instructions Table's toolbar. The functions of some toolbar buttons are explained in Instructions Table. In the Blocks view, a button is displayed in front of the first basic block of each file in the loaded executable or shared object. When a new function begins, a button is displayed in front of its first basic block. The following screen capture shows the blocks view of a loaded executable.
Figure 225 The Basic Blocks editor loaded with executable file
After adding FDPR-Pro profile information, you can see the frequency of each basic block in color. The meaning of each color is shown in the lower part of the window.
Figure 226 The Basic Blocks editor loaded with profile file
If you right-click a basic block, a menu is displayed. Typically, this menu has three sections. In the first section, you can search for a specific basic block or open the source code of the selected basic block.
In the second section, the menu shows the targets of this basic block (Fall thru means the next basic block). The last section lists all the callers of this basic block. The last two sections might vary according to the selected basic block.
5.2.4.4.1 Find basic blocks
To find specific basic blocks in this view, press the corresponding button on the toolbar. You can also open the wizard by right-clicking any basic block and choosing Find. The following screen capture shows the initial activation of the Find wizard.
Figure 227 Find a basic block
A Find BB wizard opens.
Figure 228 The Find BB (Basic Block) wizard
The default basic block to be sought is the selected one. The initial scope of this wizard is set between the start address and the end address of the loaded executable or shared object. You can change the scope, but it must not exceed these boundaries. To choose the search direction, select Forward or Backward. Error and information messages are displayed at the bottom of the dialog.
5.2.4.4.2 Open Source Code
To get the source code of the selected basic block, right-click the basic block and choose Open Source Code. You can also use the button on the toolbar. For more information, refer to View source code.
5.2.4.4.3 Go to callers and callees
Right-click any basic block in this view. The menu shows all its callers and their addresses. The address of a caller is defined as the address of the first instruction in the basic block that calls the selected basic block. The menu also displays its target basic blocks, along with the addresses of their first instructions and the names of the functions to which they belong. If its target is the next basic block, the menu shows Fall thru only. By choosing a callee or caller, you can jump directly to that basic block.
In the following screen capture, the selected basic block has two callees and one caller.
Figure 229 Go to the caller basic block
You can verify these relationships by referring to the Graph column on the right side.
5.2.4.5 Navigate Functions View
The Functions view shows detailed information about the loaded executable or shared object in the form of functions. To open this view, select Switch to functions in the Instructions Table's title bar. The functions of some buttons in the title bar are explained in Instructions Table. In the Functions view, a button is displayed in front of the first function of each file in the loaded executable or shared object. The following screen capture shows the functions view of a loaded executable.
Figure 230 The Functions editor loaded with executable file
After adding FDPR-Pro profile information, you can see the frequency of each function in color. The meaning of each color is shown in the lower part of the workbench window.
Figure 231 The Functions editor loaded with profile file
If you right-click a function, a menu opens. Typically, this menu has three sections. In the first section, you can find functions or open the source code of the selected function. In the second section, the menu shows the target functions of this function. The third section displays the callers of this function. The last two sections might vary according to the selected function.
5.2.4.5.1 Find functions
To find specific functions in this view, press the corresponding button on the toolbar. You can also open the wizard by right-clicking any function and choosing Find. The following screen capture shows the initial activation of the Find wizard.
Figure 232 Find a function
A Find Function wizard opens.
Figure 233 The Find Function wizard
The default function to be sought is the selected one. The initial scope of this wizard is set between the start address and the end address of the loaded executable or shared object. You can change the scope, but it must not exceed these boundaries. To choose the search direction, select Forward or Backward. Error and information messages are displayed at the bottom of the dialog.
5.2.4.5.2 Open Source Code
To get the source code of the selected function, right-click the function and choose Open Source Code. You can also use the button on the toolbar. For more information, refer to View source code.
5.2.4.5.3 Go to callers and callees
Right-click any function in this view. The menu shows all its callers and their addresses. The address of a caller is defined as the address of the first instruction in the basic block that calls the selected function. The menu also displays its target basic blocks, along with the addresses of their first instructions and the names of the functions to which they belong. If its target is the next basic block, the menu shows Fall thru only. By choosing a callee or caller, you can jump directly to that function.
In the following screen capture, the selected function has one callee and six callers.
Figure 234 Go to caller and callee functions
You can verify these relationships by referring to the Graph column on the right side of the view.
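The caller and callee menus in the instructions, blocks and functions views all describe the same relation: each block has a list of targets, and the callers of a block are the blocks that list it as a target. The following Python sketch illustrates that relation only. It is not CodeAnalyzer code; the names build_callers and menu_labels and the addresses used are hypothetical, and it assumes that "the next basic block" means the next block in address order.

```python
from collections import defaultdict


def build_callers(targets):
    """Invert a block -> targets map into a block -> callers map."""
    callers = defaultdict(list)
    for block, outs in targets.items():
        for target in outs:
            callers[target].append(block)
    return dict(callers)


def menu_labels(block, targets, address_order):
    """Label each target of `block`; the next block in address order is 'Fall thru'."""
    position = address_order.index(block)
    following = address_order[position + 1] if position + 1 < len(address_order) else None
    return ["Fall thru" if t == following else hex(t) for t in targets.get(block, [])]


# Hypothetical basic blocks, keyed by the address of their first instruction.
ORDER = [0x100, 0x110, 0x120]
TARGETS = {0x100: [0x110, 0x120], 0x110: [0x120], 0x120: []}
CALLERS = build_callers(TARGETS)
```

In this made-up example the block at 0x100 has two targets, one of which is the fall-through block, which mirrors the two-target case shown for the instructions view.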
5.2.5 Instruction Properties Analysis
You can perform the following tasks to analyze the instruction properties within Code Analyzer:
5.2.5.1 View Branch Profile
The Branch Profile table shows detailed information about the targets of the instruction at the end of an instruction group. This information is available only after loading a profile file. To get the information, follow these steps: load an executable, add profiling information, open the Instructions Table, set the instructions view in the Instructions Table, select the last instruction of an instruction group, right-click it and choose Show Branch Profile.
Figure 235 Display branch profile information
You can also click the corresponding button on the toolbar of the Instructions Table.
Figure 236 Click on the toolbar to display branch information
The Branch Profile tab of the Instruction Properties view then opens at the bottom right of the workbench window. It displays the addresses (including function names) and counts of the target basic blocks of the selected instruction group.
Figure 237 The Branch Profile tab with branch profile information
To display branch profile information continuously while scrolling through the Instructions Table, click the corresponding button in the Instruction Properties view's toolbar. If you select an instruction that is not the last one in its instruction group, the Branch Profile tab shows the following information:
Figure 238 No branch profile information available for the nonbranch instruction selected
5.2.5.2 View Dispatch Information
In the Power5 and Power6 architectures, instructions are tracked in groups of one to five instructions rather than as individual instructions. Groups are formed that contain up to five internal instructions, each occupying an internal instruction slot (numbered 0 through 4).
Each internal instruction slot in a group feeds separate issue queues for the floating-point units, the branch execution unit, the CR execution unit, the logical CR execution unit, the fixed-point execution units and the load/store execution units. With profile information, CodeAnalyzer can display this information in the Dispatch Information tab in the Instruction Properties view. To get the dispatch information of an executable file, follow these steps: load an executable file, add the profiling information, open the Instructions Table, set the instructions view and click the corresponding button in the Code Analyzer toolbar. Select the Power architecture on which your executable runs in the following wizard.
Figure 239 Group the instructions in specific architecture
In the Instructions Table view, select an instruction, right-click and choose Show Dispatch Info. The architecture you chose in the previous step is displayed alongside the menu item.
Figure 240 Show dispatch information
5.2.5.3 View Value Profile
Value Profile shows the resources, their values and their counts for specific instructions in the loaded executable. To collect these values, follow these steps: load an executable, open the Instructions Table view, set the instructions view, and then mark the instructions whose resources you want to collect by right-clicking each of them and choosing Collect Value Profiling.
Figure 241 Collect value profiling
Choose resources and their types in the following wizard.
Figure 242 Select the resources and their types
Click OK. The following screen capture shows an icon that may be displayed on the selected instructions after the previous steps.
Figure 243 The instructions to be collected value profiling with grey icons
1. Click the corresponding button on the Code Analyzer toolbar to run instrumentation.
2. Click the corresponding button to write the instrumented file to disk. If you are running on Windows, make sure to set the output of the profile-file value to the directory ./
3. Upload the instrumented executable and the profile file you have created to an AIX machine. Make sure to put them in the same directory.
4. Run the instrumented executable with some training data. It writes the collected values to your original profile file. You can get relevant training data from SPEC2000.
5. Copy the resulting profile file back to the Windows machine.
6. Load the original executable in Code Analyzer again.
7. Add profile file you've created in --profile--file.
8. The following screen capture shows that the original grey icon in front of the selected instruction might change to a green icon after the previous steps.
Figure 244 Grey icons of the instructions changed to green icons
To view the values of the collected resources, right-click the marked instruction and choose Show Value Profile. The resource values are displayed in the Value Profile table. You can also click the corresponding button on the Instructions Table view's toolbar.
Figure 245 The Value Profile tab with value profile information
To display value profile information continuously while scrolling through the Instructions Table, click the corresponding button on the Instruction Properties view's toolbar. If you select an instruction without a green icon, the Value Profile tab shows the following:
Figure 246 No value profile information available for the selected instruction without green icon
5.2.5.4 Collect Comments
Code Analyzer can display comments generated by the FDPR-Pro engine. Different types of files have different comments available for collection. So far there are three sources of comments: Power 5, Power 6 and general comments.
All the general comments depend on profile information. To view comments, you need to collect them first. Do the following steps:
1. Load an executable
2. Add profile information of this executable
3. Be sure to open the Instructions Table
4. Select File -> Code Analyzer -> Collect Hazard Info or click the corresponding button
5. Click the corresponding button to open the Comments view and choose the type of comments to collect
Figure 247 The Comments view
6. Click the previous or next button on the toolbar to navigate to each comment in the Instructions Table
7. Click the corresponding button to display statistics for the currently collected comments
8. Note that if you have restricted grouping to a particular architecture, the Comments view might remove comments that do not apply to it.
5.2.6 Statistic Analysis
Code Analyzer provides a graphical display of statistics gathered on the loaded executable at different degrees of granularity. There are three perspectives of analysis: file, function and instruction mix. They are shown as tabs in the Statistics view. All the graphs are drawn to scale as cylinders. Each column is colored according to its frequency heat. The Statistics view normally displays the top (hottest) files, functions or instructions. At the top of each tab there is a filter button. Only items that pass the average threshold you set in the filter value are displayed. Therefore, if you enter 0% and click Refresh, the view shows all the files. If you enter 100% and click Refresh, a single column is displayed, representing the hottest execution unit. To open these views, you need to load an executable and add profile information first. Then click the corresponding buttons on the toolbar of Code Analyzer. You can also open them from File -> Code Analyzer -> Statistics.
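The effect of the filter value can be pictured with a small Python sketch. It is an illustration only, not the Code Analyzer implementation: it assumes the threshold is a percentage of the hottest item's value, which is consistent with the 0% and 100% cases described above, and the function name filter_by_heat and the sample counts are hypothetical.

```python
def filter_by_heat(counts, min_percent):
    """Keep items whose value is at least min_percent of the hottest item's value."""
    if not counts:
        return {}
    hottest = max(counts.values())
    if hottest == 0:
        # No profile counts at all: nothing can be ranked, so keep everything.
        return dict(counts)
    cutoff = hottest * min_percent / 100.0
    return {name: value for name, value in counts.items() if value >= cutoff}


# Hypothetical per-file heat values after applying one of the average methods.
FILE_HEAT = {"os.c": 400, "main.c": 100, "util.c": 50}
```

With these sample values, a threshold of 0 keeps all three files, a threshold of 25 keeps os.c and main.c, and a threshold of 100 keeps only os.c.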
When you hover the cursor over a column in the graph, reference information is displayed as follows:
Figure 248 The file heat graph
5.2.6.1 View File Heat Graph
To open the File Heat Graph, click the corresponding button on the Code Analyzer toolbar or choose File -> CodeAnalyzer -> Statistics -> Files Heat. You can also select the tab within the Statistics view. After that, load an executable and add the necessary profile information. The following screen capture shows a typical file heat graph of a loaded executable.
Figure 249 The file heat graph of a loaded executable file
The x-axis shows the names of the files in the loaded executable. The y-axis denotes the execution count distribution. Each column of the graph refers to a file in the loaded executable. Its color shows how frequently the file is called. You can set graph options to customize the view. There are three methods in the average method box: simple average, weighted average and highest function value. By entering the minimum percentage in the second box and clicking Refresh, you can get the count distribution of those files whose percentages, as calculated by the average method, are above this value. For example, if you use the highest function value as the average method and filter files at 25%, the graph options should be set as follows:
Figure 250 The graph options of file heat graph
Click Refresh. The files whose highest function value percentages are at least 25% of the maximum value are then listed in the graph as follows:
Figure 251 The result file heat graph with value percentage under 25% and sorted by highest function value
5.2.6.2 View Function Heat Graph
To open the Function Heat Graph, click the corresponding button on the Code Analyzer toolbar or choose File -> CodeAnalyzer -> Statistics -> Functions Heat. You can also select the tab in the Statistics view.
After that, load an executable and add the necessary profile information. The following screen capture shows a typical function heat graph of a loaded executable.
Figure 252 The function heat graph
The x-axis shows the names of the functions in the loaded executable. The y-axis denotes the execution count distribution. Each column of the graph refers to a function in the loaded executable. Its color shows how frequently the function is called. You can set graph options to customize the view. There are four average methods: simple average, weighted average, highest BB value and prolog value. By entering the minimum percentage in the second box and clicking Refresh, you can get the count distribution of those functions whose percentages, as calculated by the average method, are above this value. For example, if you use the highest BB value as the average method and filter functions at 25%, the graph options should be set as follows:
Figure 253 The graph options of function heat graph
Click Refresh. The functions whose highest BB value percentages are at least 25% of the maximum value are then listed in the graph as follows:
Figure 254 The result function heat graph with value percentage under 25% and sorted by highest basic block value
5.2.6.3 View Instruction Graph
To open the Instruction Graph, click the corresponding button on the Code Analyzer toolbar or choose File -> CodeAnalyzer -> Statistics -> Instruction Mix. You can also select the tab in the Statistics view. There are two modes of the instruction mix graph: count and percentage. You can click the button Show Executions or Show Count to switch between them. The following screen capture shows a typical instruction count graph of a loaded executable.
Figure 255 The instruction count graph
The x-axis shows the names of the instructions in the loaded executable. The y-axis denotes the count distribution of instructions. Each column of the graph refers to an instruction. Its color shows how frequently the instruction appears in the loaded executable. The following screen capture shows a typical instruction executions graph of a loaded executable.
Figure 256 The instruction executions graph
5.2.6.4 View Comments Graph
To get the Comments Graph, be sure to collect comments first. Then click the corresponding button on the toolbar or choose File -> CodeAnalyzer -> Statistics -> Comments. You can also select its tab in the Statistics view directly. There are two graph options: show graph for comment and show graph for function. In the graph for comments, you can select the type of comments to display.
Figure 257 The comments graph
In the preceding graph, the x-axis shows the names of the functions that have Load After Store comments for Power 5. The y-axis denotes the number of occurrences of this comment in each function. Each column of the graph refers to a function.
Figure 258 The comments graph for the .checkfneg function
In the graph for function, you can select any function that has comments and display the number of each kind of comment it contains. In the preceding graph, the x-axis shows the names of the comments collected in the first step. The y-axis denotes the number of these comments in the .checkfneg function. You can filter the column values by clicking Refresh.
5.3 Pipeline Analyzer
Pipeline Analyzer is a port of the IBM Performance Simulator for Linux on POWER™, another alphaWorks technology.
Pipeline Analyzer joins the VPA toolkit to provide VPA users with a means of examining how code is executed on various IBM POWER processors. Pipeline Analyzer displays the pipeline execution of instruction traces generated by a POWER series processor. It does so by providing a scroll view and a resource view of the instruction execution. You can also find the Pipeline Analyzer User Guide within VPA. Select Help -> Help Contents within VPA. To get context-sensitive help, press F1 on Windows and AIX or press Ctrl+F1 on Linux.
5.3.1 Load an existing pipeline file
The IBM Performance Simulator for Linux on POWER™ project has directions for capturing an instruction trace and generating Pipeline data files. Once you have made a run and generated a .pipe file and a .config file, you can use Pipeline Analyzer to look at them. When you start Visual Performance Analyzer, the Welcome view is displayed. To open Pipeline Analyzer, choose Window -> Open Perspective -> Other -> Pipeline Analyzer. The following screen capture shows the initial layout of Pipeline Analyzer.
Figure 259 The initial layout of Pipeline Analyzer
Choose File -> Open File, and in the Open File dialog select a .pipe file with scroll mode information for inspection.
Figure 260 Open a scroll-mode pipeline file
Note that if your .config file has a different name from the .pipe file, a second dialog opens for you to choose the corresponding .config file.
Next, the Pipeline Analyzer perspective loads the data of the .pipe file. A scroll editor opens automatically. General information about the .pipe file is displayed in the panel of the Pipeline view. The following screen capture shows the data loading of this file.
Figure 261 Loading the pipeline data

Use File -> Open File to open a .pipe file with resource mode information. A resource editor is opened as follows:

Figure 262 The Pipeline Analyzer perspective with resource information

5.3.2 Navigating the scroll pipeline view

Each time the Pipeline Analyzer perspective is opened, a Pipeline View appears in the left part of the perspective. It shows detailed information about the currently active editor. To open the Pipeline View manually, choose Window -> Show View -> Pipeline Category -> Pipeline View. Note that every perspective contains only one Pipeline View.

To view a pipeline file containing scroll mode information, select File -> Open File and choose a corresponding file. A scroll editor opens in the Pipeline Analyzer editor area, and the Pipeline View displays its information at the same time.

5.3.2.1 Using the overview graph

In the overview graph, the green box indicates the boundary of the data displayed in the currently active scroll editor. To display data elsewhere, click its location in the graph. The green box moves to where you clicked, and the scroll editor displays the data there in detail.

Figure 263 Navigate the scroll editor through overview graph

5.3.2.2 Zoom in or out

To zoom in or out of this graph, click the zoom buttons on the toolbar. To fit both the width and the height of the graph to the overview panel, click the corresponding button on the toolbar. Note that no scroll bars are displayed in the overview panel after this operation.

5.3.2.3 The event message and offset panel

The Event Message and Offset panel displays the meaning of the instruction event under your mouse cursor in the scroll editor. The following screen shot shows how this information changes when the mouse moves from symbol D to symbol M in the same line of the table.
Figure 264 The information of the selected symbol displayed in the Event Message and Offset panel

If two or more events in an instruction execution occur in the same time cycle, their corresponding symbol is highlighted and all the events are listed in the Event Message panel. The following screen shot shows the event message of the highlighted symbol "F" in the preceding picture.

Figure 265 The denotation of the symbol F

If there are too many event messages to display, you can resize this panel.

To scroll simultaneously with the currently active editor, click the corresponding button on the toolbar. The following screen shot shows this effect.

Figure 266 Scroll simultaneously with the current active editor

Note that this function ensures that the green box is always within the overview graph of the Pipeline View.

5.3.3 Navigating the resource view

To view a pipeline file containing resource mode information, select File -> Open File and choose a corresponding file. A resource editor opens in the Pipeline Analyzer editor area, and the Pipeline View displays its information at the same time.

The following screen shot shows the appearance of the resource editor after a pipeline file is loaded.

Figure 267 The Pipeline Analyzer perspective with resource editor opening

In the resource editor, a side bar named Resource Name on the right lists all the resources recorded in the pipeline file. Each line of the table on the left shows the usage distribution of one resource over time. Each symbol in the table stands for an instruction event; its meaning is shown in the Event Message panel of the Pipeline View. If two or more events compete for one resource at the same time, the symbol in the table turns red automatically.
In this case, all the event messages are listed in the Event Message panel of the Pipeline View.

From the General panel of the Pipeline View, you can see that this file has a total of 507 cycles and 11648 lines. The current time divider is 1, which means that each symbol in the table represents one time cycle.

The two red sliders in the table are called the slider bar. It follows the cell under your mouse cursor, and its coordinate is shown as the Cursor value in the Offset panel of the Pipeline View. The grey sliders in the table are called the base axes. They mark the position of your latest left mouse click, and their coordinate is shown as the Base value in the Offset panel of the Pipeline View. The distance between the slider bar and the base axes is calculated and shown as the Offset value in the Offset panel of the Pipeline View.

To navigate the editor, press the left, right, up, and down arrow keys or the 'H', 'L', 'K', and 'J' keys.

5.3.3.1 Zoom in or out

To zoom in or out of the table in the scroll editor, click the zoom buttons on the toolbar or select the command in the Pipeline menu.

5.3.3.2 Show Dots

To show dots in the table, click the corresponding button on the toolbar or select the command in the Pipeline menu. The following picture shows a table with dots.

Figure 268 Show dots in the table

5.3.3.3 Assign Automatically

To assign symbols to the events automatically, click the Assign Automatically button on the toolbar or select the command in the Pipeline Analyzer menu. The following two pictures show the editor with the Assign Automatically button not pressed and pressed, respectively.
No manual assignment or automatic assignment:

Figure 269 The resource editor without manual assignment or automatic assignment

The Assign Automatically button is pressed:

Figure 270 The resources assigned automatically in the editor

5.3.3.4 Assign Manually

To assign symbols to events manually, select Assign Manually in the Pipeline Analyzer menu to open the Manage Assignments dialog. In the resource editor, events are shown as symbols according to the following rules:

• If the event matches an assignment that was set manually, it is shown as the symbol in that assignment.
• If you press the Assign Automatically button, the event is shown as the symbol that is assigned automatically.
• Otherwise, the event is shown as blank.

The following pictures show the editor when some assignments are set manually.

No manual assignment and no automatic assignment:

Figure 271 The resource editor without manual assignment or automatic assignment

Then the Assign Automatically button is pressed:

Figure 272 Assign the resources automatically

Then an assignment (event="Deprefetch", value="1*", symbol="G") is added:

Figure 273 An assignment (event="Deprefetch", value="1*", symbol="G") is added

Then the Assign Automatically button is released:

Figure 274 Release the Assign Automatically button

5.3.3.5 Show Hover

To show the meaning of each symbol in the table, click the corresponding button on the toolbar or select the command in the Pipeline menu. When your mouse cursor rests on a line in the table for a moment, a label that explains this line of resource usage is displayed. The following screen shot shows the symbol message for a conflict point in the resource editor.
Figure 275 The hover tag of the selected resource

5.3.3.6 Show Slider

To show the slider bar while scrolling around the table, click the corresponding button on the toolbar or select the command in the Pipeline menu.

5.3.3.7 Hide Unvisited Resources

When resource usage in the pipeline file is sparse, you can hide the unvisited resources. To do so, click the corresponding button on the toolbar or select the command in the Pipeline menu. The following screen capture shows the condensed resource table for the file opened above.

Figure 276 The condensed resource editor

5.3.3.8 Change Color for symbol

To change the color of each symbol in the table, choose Pipeline -> Settings... -> Trace.

Figure 277 The Trace tab of the Preferences dialog

For example, if we set the Dprefetch event in the resource file opened in the first screen shot to yellow, the resource editor changes as follows:

Figure 278 The color of the Dprefetch event changed into yellow

Note that the red lines in this view indicate a conflict in resource use. This is a system-defined color.

5.3.3.9 Change Time Divider

To change the number of time cycles represented by each symbol in the table, choose Pipeline -> Change Time Divider.

Figure 279 The Change time divider dialog

The default value of this box is the current time divider. The following screen capture shows the resource editor after the time divider is set to three.

Figure 280 The resource editor with the time divider changed to 3

5.3.4 Manage Assignments

Select Assign Manually in the Pipeline Analyzer menu to open the Manage Assignments dialog.
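The three display rules from section 5.3.3.4, together with the Pattern and Symbol constraints of the Manage Assignments dialog, can be sketched in Python as follows. This is only an illustration and not the Pipeline Analyzer implementation: the names `Assignment` and `resolve_symbol` are hypothetical, a symbol is assumed to be a single character, event values are assumed to be digit strings, and '?' and '*' are assumed to behave like shell wildcards (one character and any run of characters), which this guide does not state.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase

PATTERN_RE = re.compile(r"[0-9?*]+")    # digits, '?' and '*' only
SYMBOL_RE = re.compile(r"[0-9a-zA-Z]")  # assumed: exactly one character


@dataclass
class Assignment:
    """Hypothetical model of one row in the Manage Assignments dialog."""
    enabled: bool
    event: str
    pattern: str
    symbol: str

    def __post_init__(self):
        # Reject values the dialog would not accept.
        if not PATTERN_RE.fullmatch(self.pattern):
            raise ValueError(f"invalid pattern: {self.pattern!r}")
        if not SYMBOL_RE.fullmatch(self.symbol):
            raise ValueError(f"invalid symbol: {self.symbol!r}")


def resolve_symbol(event, value, manual, auto_symbols, auto_enabled):
    """Return the symbol shown for an event, following the three rules."""
    # Rule 1: an enabled manual assignment that matches wins.
    for a in manual:
        if a.enabled and a.event == event and fnmatchcase(value, a.pattern):
            return a.symbol
    # Rule 2: otherwise use the automatic symbol if the button is pressed.
    if auto_enabled and event in auto_symbols:
        return auto_symbols[event]
    # Rule 3: otherwise the event is shown as blank.
    return " "
```

With the assignment from the figures (event="Deprefetch", value="1*", symbol="G"), a value such as "12" resolves to "G" whether or not Assign Automatically is pressed, while a non-matching value falls back to the automatic symbol or to blank.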
Figure 281 The Manage Assignments dialog

Add an Assignment

1. Click the Add button to add a new assignment.
2. Complete the new assignment with the following field values.
   Enabled: Indicates whether the assignment is enabled.
   Event: You can either type an event or choose one from the drop-down list. The events in the drop-down list are obtained from the pipeline config file.
   Pattern: You can enter only digits, '?' and '*' in this field.
   Symbol: The symbol to be assigned. The symbol must come from the character set [0-9a-zA-Z].
3. Press the OK button to confirm the assignments.

Delete an Assignment

1. Select an assignment in the table.
2. Press the Delete button to delete the selected assignment.

Enable or Disable an Assignment

1. Select an assignment in the table.
2. Select or clear the Enabled check box to enable or disable the assignment.
3. Press the OK button to confirm.

5.3.5 Tie Cycle Controls

When two or more editors are open, you can scroll them at the same time. To control several editors together, you need to open two or more pipeline files. The files can be scroll mode or resource mode files. If you open just one editor, a warning box appears.

If you open two editors in the Pipeline Analyzer editor area, you can drag one to the bottom to arrange the two editors in a vertical layout, as in the following screen shot. Similarly, if you open three editors, you can drag two of them to arrange the three editors in a vertical layout. The following picture shows the vertical layout of two editors: one shows a scroll mode file, and the other shows a resource mode file.

Figure 282 The editors in a vertical layout

Note: When you drag an editor to the bottom, be sure to drop it where the mouse pointer turns into a black arrow.
Note that each editor has a horizontal scroll bar. Choose the preceding editor as the active editor. Select Pipeline -> Tie Cycle Controls...

Figure 283 The Tie Cycle Control dialog

The list box in the Tie Cycle Controls dialog shows all the other editors. In the preceding picture, there is only one choice: scroll-mode. Choose this editor and click OK. The following screen capture shows the result.

Figure 284 The two editors tied in cycle control

The scroll bar of the selected editor is disabled. You can scroll the two editors horizontally at the same time by using the scroll bar of the editor that you chose as the active editor. The following screen capture shows the effect of controlling two editors.

Figure 285 Scroll the two editors at the same time cycle

To clear this setting, close any editor or choose Pipeline -> Tie Cycle Controls... and select Clear.

5.4 Counter Analyzer

The Counter Analyzer tool is a common tool for analyzing hardware performance counter data across many IBM eServer platforms, including systems running AIX, i5/OS, z/OS, Linux on POWER, and Linux on Cell BE. The tool accepts hardware performance counter data in a cross-platform XML file format. It uses either the built-in hsqldb database engine or an external DB2 instance to store the raw performance counter data.

The tool provides multiple views to help you examine the data. The views can be divided into two categories. One category is the “table” views, which are basically two-dimensional tables displaying data. The data can be raw performance counter values, derived metrics, counter comparison results, and so on. The other category is the “plot” views. In these views, data are represented by different kinds of plots.
The data can also be raw performance counter values, derived metrics, comparison results, and so on. Besides these “table” views and “plot” views, there are also some “utility” views that help you configure and customize the tool.

You can also find the Counter Analyzer User Guide within VPA: select Help -> Help Contents. To get context-sensitive help, press F1 on Windows and AIX, or press Ctrl+F1 on Linux.

5.4.1 Basic concepts for Counter Analyzer

• Performance Monitoring Counter
The performance monitor counter provides comprehensive reports of events that are critical to performance on IBM systems. It can gather critical hardware events, such as the number of misses on all cache levels, the number of floating point instructions executed, and the number of instruction loads that cause TLB misses.

• Metrics
A metric is calculated from a user-defined formula and the event counts from the performance monitor counter. It provides performance information such as the CPU utilization rate or millions of instructions per second. This helps the algorithm designer or programmer identify and eliminate performance bottlenecks.

• CPI Breakdown Model
Cycles per instruction (CPI) is the measurement for analyzing the performance of a workload. CPI is defined as the number of processor clock cycles needed to complete an instruction. It is calculated as CPI = Total Cycles / Number of Instructions Completed. A high CPI value usually implies underutilization of machine resources. On a POWER5 system, you can break down your workload CPI into individual components, because the POWER5 has several programmable counters that count the events needed to calculate the components of CPI. The breakdown helps you determine how to improve performance on a given workload.
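As a worked illustration of the CPI formula above, the following minimal Python sketch computes the total CPI and the contribution of each component. The function names, the component names, and the cycle counts are made up for the example; they are not taken from a real counter data file or from the VPA tool.

```python
def cpi(total_cycles, instructions_completed):
    """CPI = Total Cycles / Number of Instructions Completed."""
    if instructions_completed <= 0:
        raise ValueError("instructions_completed must be positive")
    return total_cycles / instructions_completed


def cpi_breakdown(component_cycles, instructions_completed):
    """Split the total CPI into one CPI contribution per component.

    component_cycles maps a component name to the cycles attributed to it.
    """
    if instructions_completed <= 0:
        raise ValueError("instructions_completed must be positive")
    return {
        name: cycles / instructions_completed
        for name, cycles in component_cycles.items()
    }


# Illustrative numbers only.
cycles = {
    "completion": 400,
    "gct_empty": 100,
    "completion_stall": 500,
}
instructions = 500

total = cpi(sum(cycles.values()), instructions)
parts = cpi_breakdown(cycles, instructions)
print(total)
print(parts)
```

Because every component is divided by the same instruction count, the component values add up to the total CPI, which is what makes a stacked breakdown meaningful.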
The following is an instance of a CPI breakdown model:

Table 1 The CPI Breakdown Model

Total cycle
    Completion cycles
        PowerPC Base completion cycles <# cycles>
        overhead of cracking/microcoding and grouping restriction
    Completion Table empty (GCT empty) cycles
        I-cache miss penalty
        Branch redirection (branch misprediction) penalty
        others (Flush penalty etc.)
    Completion Stall cycles
        Stall by LSU instruction
            Stall by reject
                Stall by translation (rejected by ERAT miss)
                Other reject
            Stall by D-cache miss
            Stall by LSU basic latency, LSU Flush penalty
        Stall by FXU instruction
            Stall by any form of DIV/MTSPR/MFSPR instruction
            Stall by FXU basic latency
        Stall by FPU instruction
            Stall by any form of FDIV/FSQRT instruction
            Stall by FPU basic latency
        others (Stall by BRU/CRU instruction, flush penalty (except LSU flush), etc.)

The preceding table represents a CPI breakdown model in which the total cycles of a workload are divided into three components: Completion cycles, Completion Table empty (GCT empty) cycles, and Completion Stall cycles. The base completion cycles are the number of cycles that would be needed if grouping were perfect. Otherwise, stalls happen, and they can be attributed to either Completion Table empty or Completion Stall cycles. A Completion Table empty condition occurs when no groups are completing on a given cycle because of either an I-cache miss or a branch misprediction. The Completion Stall cycles are those stalls caused by any of the following instructions: LSU, FXU, FXU long (all forms of div, mtspr, mfspr), FPU, and FPU long (all forms of fsqrt, fdiv); or by other events such as D-cache miss, Reject, and Reject by translation (ERAT miss).

5.4.2 Load an existing counter data file

5.4.2.1 Open Counter Analyzer Perspective

1. When you first start Visual Performance Analyzer after installation, you see the Welcome view. To start Counter Analyzer, choose Window -> Open Perspective -> Other.
2. In the Open Perspective dialog, choose Counter Analyzer and click OK.

Figure 286 Open the Counter Analyzer

3. The following screen capture shows the initial layout of Counter Analyzer.

Figure 287 The initial layout of Counter Analyzer

4. You can also open the Counter Analyzer perspective by clicking the corresponding button on the toolbar.

5.4.2.2 Load in counter data file

Choose File -> Open File, and in the Open File dialog select a counter data file with the suffix ".pmf".

Figure 288 Open a counter data file

You can also load counter data from repositories, which is not covered here.

5.4.2.3 Brief Introduction to Counter Analyzer Perspective

After the counter data of the .pmf file is loaded, the Counter Analyzer perspective displays the data in its views and editors. The primary information (details, metrics, and CPI breakdown) is displayed in the Counter Editor. The resource statistics of the file (if available) are shown in the tabular Resource Statistics view. The Graph view illustrates the details, metrics, and CPI breakdown graphically. The Database Connection view lists local and remote repositories, and their basic information is displayed in the Description view.

Figure 289 The Counter Analyzer perspective with loaded counter data

5.4.3 Navigate the Counter Analyzer Perspective

You can perform the following tasks to navigate around counter data within Counter Analyzer.

5.4.3.1 Open Counter Data

There are two ways to open counter data in Counter Analyzer: from counter data files or from a database repository.

5.4.3.1.1 Open Counter Data from PMF Files

1. Choose File -> Open File, and in the Open File dialog select a counter data file with the suffix ".pmf".
2. If a counter file is selected, it is opened in the editor, and its resource statistics (if available) are shown in the tabular Resource Statistics view.

Figure 290 The counter data with its resource statistics

By default, a local repository is provided by the VPA tool. You can also connect to remote DB2 repositories. Remote repositories can be created, configured, refreshed, and discarded. Files in these repositories can be opened in the counter editor and deleted. In addition, you can import counter data files into these repositories and query counter data from them.

5.4.3.1.2 Open Counter Data from Connection View

By default, a local repository is provided by the VPA tool. You can also connect to remote DB2 repositories. Under the local connection, you can add Counter Analyzer support as well as other supports. A remote connection is a DB2 connection. It can be created, configured, refreshed, and deleted.

• Create/Configure/Refresh/Discard a Connection

Create Connection
In the Database Connection view, select New Connection on the context menu to create a new repository. The Local Repository is created by default and can only be configured and refreshed. The Description view lists basic information about the repository.

Figure 291 The Database Connections view

Configure Connection
The following dialog is the configuration dialog of the connection.

Figure 292 Configure a connection

Add Product Support
Right-click the connection node and choose the product support you want to add to it. A confirmation dialog then opens. You can create a new repository, or select an existing repository from the Existing list.
Figure 293 Specify a repository

Refresh Connection
Double-click a connection or click the + button to expand the tree. A Password dialog opens when you open the repository for the first time. If the password is saved, you do not have to re-enter it when Counter Analyzer is closed and restarted later. If the password is not saved, you are asked for it every time you start Counter Analyzer.

Figure 294 Input a password for authentication

You can also choose to clear the password on the context menu.

Figure 295 Clear the saved password

Discard Connection
You can choose to discard the connection on the context menu. Be sure to delete all its children first.

Figure 296 Delete an existing connection

Discard Support
You can choose to discard the Product Support on the context menu.

Figure 297 Discard a support node

• Import File
Right-click the support node, select the Import File action, and then choose the file you want to import in the file dialog.

Figure 298 Import a file

• Open File in Editor
Right-click the file node and select Open in Editor, or double-click the file.

Figure 299 Open the counter data file in editor

5.4.3.2 View Counter Data

You can see raw counter data, metrics data, and CPI breakdown data in different pages of the editor. All these data are retrieved from one counter file.

5.4.3.2.1 View Raw Counter Data

Raw counter data are shown in the Details tab. On the context menu, you can switch the display mode.

Figure 300 Switch the display mode of the raw counter data

In all tabs, Filter is used to filter processors and events.
Figure 301 The Filter dialog

Raw counter data can be displayed in three modes: Event, Group/Event/Time Slice, and Group/Time Slice/Event. If the counter data has time slice information, all three modes are supported. If not, only Event mode is enabled, and the other two modes are disabled.

"Event" Mode
In this mode, all events and their counter data are listed in the editor. The data here are normalized event counts instead of the actual data in the counter data file.

Figure 302 The raw counter data in Event mode

"Group/Event/Time Slice" Mode
If the counter data has time slice information, two more modes are available that display more information. In Group/Event/Time Slice mode, the data are grouped first by group, and then by event. The data here are the actual event counts in the counter data file.

Figure 303 The raw counter data in Group/Event/Time Slice mode

"Group/Time Slice/Event" Mode
If the counter data has time slice information, the other mode that displays more information is Group/Time Slice/Event mode, in which counter data are grouped first by group, and then by time slice. The data here are also the actual event counts in the counter data file.

Figure 304 The raw counter data in Group/Time Slice/Event mode

5.4.3.2.2 View Metrics Data

Metrics data are shown in the Metrics tab.

Display Metric Group
The Metrics editor page shows all metric groups by default. Expand a group to see all metrics under this group.

Figure 305 The Metrics tab of the counter editor

If you want to see all metrics without grouping, choose Show All Metrics on the context menu.

Figure 306 Show all metrics in the editor

All metrics are then listed without grouping.
Figure 307 The metrics without grouping

Edit/Load/Save Metrics Variables
In the right part of the editor, all variables associated with the current metrics are listed in a table with their names and values. Click the Value column of a variable to modify its value. On the context menu, you can choose Load Variables to load a variables file, or choose Save Variables to save the current variables in the editor to a variables file. When a variables file is loaded, the values of all variables are updated, and all metrics are calculated again. (Only the variable total_time is read-only and cannot be overwritten.)

Figure 308 The context menu of the variable table

Change Metrics
You can choose to change the metrics file on the context menu and apply it to the active counter data.

Figure 309 Change the metrics file

The metrics file to be selected can be either an external file or a built-in file. The Change Derived Metrics dialog is as follows:

Figure 310 The Change Derived Metrics dialog

5.4.3.2.3 View CPI Breakdown Data

CPI breakdown data are shown in the CPI Breakdown tab.

Change CPI Breakdown Model
You can change the CPI Breakdown Model on the context menu and apply it to the active counter data.

Figure 311 Change the current CPI Breakdown Model

The CPI Breakdown Model to be selected can be either an external file or a built-in file. Your selection history of external files is offered in the combo box. If you choose a built-in XML file, you can click Copy to copy its absolute path; by pasting the path into a browser, you can easily access the file.
The Select CPI Breakdown Model dialog is as follows:

Figure 312 Select a CPI Breakdown Model file

Export as HTML
Choose Export as HTML on the context menu to export the CPI breakdown data into an HTML file.

Figure 313 Export the current CPI breakdown data into an HTML file

The CPI breakdown HTML file is as follows:

Figure 314 The exported CPI breakdown HTML file

5.4.3.3 View Temporal Chart

The temporal chart displays the counter data by samples. Each sample has one value represented in the chart. You can aggregate several samples into one by using the Aggregation Scale, which sets the aggregation rate of the samples. When the Graph view in temporal mode contains too many samples to read comfortably, the aggregation scale makes it easy to choose the number of samples to display. For example, you can display 20 samples in the Graph view by dragging the button on the aggregation scale to the top:

Figure 315 The Graph view showing the temporal chart

You can also display 10 samples by dragging the button on the aggregation scale to the bottom:

Figure 316 The temporal chart displaying the minimum 10 samples

Filter is used to filter processors.

Figure 317 Filter the processors

In temporal mode, when you select one event in the Details tab, one metric in the Metrics tab, or one CPI component in the CPI Breakdown tab, the temporal information of this item is displayed in the Graph view. The chart type can be switched between Multiple Bar Chart and Line Chart.
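The effect of the aggregation scale described in this section can be pictured with the following minimal Python sketch, which merges consecutive samples until a target number of values remains. It is an assumption-laden illustration: the function name `aggregate_samples` is hypothetical, and summing the merged samples is an assumed choice (reasonable for event counts), since this guide does not say how Counter Analyzer combines them.

```python
def aggregate_samples(samples, target_count):
    """Merge consecutive samples so that at most target_count values remain.

    Assumption: merged samples are summed. The real tool may combine
    them differently (for example, by averaging).
    """
    if target_count <= 0:
        raise ValueError("target_count must be positive")
    n = len(samples)
    if n <= target_count:
        return list(samples)
    result = []
    for i in range(target_count):
        # Integer arithmetic spreads the samples as evenly as possible.
        start = i * n // target_count
        end = (i + 1) * n // target_count
        result.append(sum(samples[start:end]))
    return result


# 20 raw samples reduced to 10 displayed values (pairs are merged).
raw = list(range(1, 21))
print(aggregate_samples(raw, 10))
```

Summation keeps the overall total unchanged, so the aggregated chart still reflects the same total event count as the raw samples.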
• Line chart of one event:

Figure 318 The line chart of an event

• Multiple bar chart of one event:

Figure 319 The multiple bar chart of an event

5.4.3.4 View temporal comparison chart

When a Comparison Editor is opened, comparison data are displayed in the Graph view. You can view the temporal chart of the comparison data by selecting Display Temporal Chart to enter the temporal mode, which is the default chart mode. To display a temporal chart, first check whether the button at the top right of the Graph view is grey or released. If not, click it and select a CPI component in the comparison editor.

The following screen capture displays the temporal comparison line chart of one event. The temporal comparison chart of one metric or one CPI breakdown component is much the same as that of one event. The comparison is based on the sum of all processors, the average of all processors, or both. You can specify it in Filter.

Figure 320 The temporal comparison line chart

Note: This graph mode is supported by details data, metrics data, and CPI breakdown data.

5.4.3.5 View CPI Breakdown chart

In CPI breakdown mode, when you select one CPI component in the CPI Breakdown tab, CPI breakdown data can be displayed in the Graph view.

Display CPI leaves breakdown stacked bar chart

1. Choose Create Chart Graph... on the context menu.

Figure 321 Create a chart graph

2. Select the CPI data type and click Next.

Figure 322 Select a data type to be displayed

3. Click the All Leaves button and click Next.

Figure 323 Select the CPI components to be displayed

4. Choose Events/Metrics in the Choose Series group and click Next.

Figure 324 Select the processors and series

5. Choose Stacked Bar Chart or Multiple Bar Chart according to your requirements to finish creating the chart graph.

The following graph shows the stacked bar chart:

Figure 325 The result CPI breakdown stacked bar chart

Display CPI breakdown multiple bar chart

Figure 326 The result CPI breakdown multiple bar chart

Note: This graph mode is only supported by CPI breakdown data.

5.4.3.6 View CPI Breakdown comparison chart

When a Comparison Editor is opened, comparison data are displayed in the Graph view. In this mode, the chart type can be switched between Multiple Bar Chart and Stacked Bar Chart. The comparison is based on the sum of all processors, the average of all processors, or both. You can specify it in Filter.

Multiple bar chart is the default, so when you select a CPI component in a comparison editor, you see the multiple bar chart in the Graph view.

Figure 327 Change to a type of chart

To display a stacked bar chart:

1. Choose Create Chart Graph... on the context menu.

Figure 328 Create a chart graph

2. Select the CPI data type and click Next.

Figure 329 Select a data type

3. Choose All Leaves or All Children and click Next.

Figure 330 Select the CPI components to be displayed

4. Choose CPI and click Next.

Figure 331 Select a data organization form

5. Choose Stacked Bar Chart or Stacked Bar Chart (%) to finish creating the comparison chart.

Figure 332 Select a chart type

6. The following picture shows a CPI breakdown comparison chart:

Figure 333 The CPI breakdown comparison stacked bar chart

Note: This graph mode is only supported by CPI breakdown data.

5.4.3.7 Create Chart Graph
You can use the chart wizard to build more complex custom charts. The following two examples show how:
Example one: Show the average of some event data.
Example two: Compare some CPI data.

5.4.3.7.1 Example one: Show the average of some event data
1. Open a counter file. Choose Create Chart Graph... on the context menu of the Graph view to enter the chart wizard.
Figure 334 Create a chart graph
2. In the first step of the wizard, choose what kind of data you want to display, Events/Metrics or CPI. This demonstration uses Events/Metrics.
Figure 335 Select a data type
3. Choose some events and metrics and click Next.
Figure 336 Select the events and metrics to be displayed
4. Decide which processor to display. You can also decide the data organization in the Choose Series group.
Figure 337 Select the processor and series
5. In the last step, choose a chart type.
Figure 338 Select a chart type
6. The result is a chart that shows the average of the selected event data.
Figure 339 The result bar chart displaying data in average

5.4.3.7.2 Example two: Compare some CPI data
1. Open a comparison editor. Choose Create Chart Graph... on the context menu of the Graph view to enter the chart wizard.
2. Choose what kind of data you want to display, Events/Metrics or CPI. This demonstration uses CPI.
3. Choose some CPI components and click Next.
Figure 340 Select the CPI components to be displayed
4. The wizard now shows the Group By page.
Choosing File as series means that each multiple bar group shows one specified CPI value for all files. The form is "FileA.CPIA FileB.CPIA FileA.CPIB FileB.CPIB". Choosing CPI as series means that each multiple bar group shows all of the selected CPIs of one file. The form is "FileA.CPIA FileA.CPIB FileB.CPIA FileB.CPIB".
Figure 341 Select a type of series
5. The last step is choosing the Compare Mode. In short, Compare Side By Side shows all files, whereas Compare Against Baseline omits the baseline file.
Figure 342 Select a mode to compare
6. Because a multiple bar chart is the only form this chart can take, the Choose Chart Type page is hidden. The result is a chart that compares some CPI data of different files.
Figure 343 The result CPI comparison multiple bar chart

5.4.3.8 Compare Counter Data
You can click the compare button on the toolbar to select several files to compare. In the pop-up dialog, you must select 2 to 4 target files to compare. You can select files either from the explorer or from repositories (you have to refresh the repositories in the Counter Repository view in advance). Select one file, and click the Add button to add it. The first file you add is regarded as the base file. You can further specify the metrics and CPI Breakdown Model to apply to the counter data, or do it later. The following screen snapshot displays the Select To Compare dialog:
Figure 344 The Select To Compare dialog

5.4.3.8.1 View Comparison Data
The Comparison Editor opens to display the comparison result side by side. The following items can be compared:
• The total counts of one event are compared.
• The derived metrics values are compared.
• The CPI breakdown values are compared.
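As a rough illustration of how a target file relates to the base file in such a comparison, the Python sketch below computes, for each event, the difference from the base file and the target's proportion of the base file. The event names and counts are hypothetical, and the exact arithmetic VPA applies is an assumption here.

```python
def compare_to_base(base, target):
    """Return {event: (difference, percent_of_base)} for events present in both files."""
    result = {}
    for event, base_count in base.items():
        if event not in target:
            continue
        diff = target[event] - base_count
        # The proportion is undefined when the base count is zero.
        percent = 100.0 * target[event] / base_count if base_count else None
        result[event] = (diff, percent)
    return result


# Hypothetical total event counts; the first file added acts as the base file.
base_file = {"PM_CYC": 2000, "PM_INST_CMPL": 1000}
target_file = {"PM_CYC": 1500, "PM_INST_CMPL": 1200}

for event, (diff, percent) in compare_to_base(base_file, target_file).items():
    print(f"{event}: difference={diff:+d}, percent of base={percent:.1f}%")
```

With more than one target file, the same function would simply be applied to each target in turn against the same base file.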
The following screen snapshot displays the comparison data of each event from different data sources. The triangle symbol (△) denotes the difference between the target file and the base file, and the symbol % denotes the target file's proportion of the base file.
Figure 345 The comparison editor with its context menu open
The comparison is based on the sum of all processors, the average of all processors, or both. You can specify this in Filter.
Figure 346 The Filter dialog

5.4.3.9 Edit Metrics/CPI Breakdown Model
5.4.3.9.1 Edit Metrics
You can use the toolbar to open a derived metrics file for editing. You can view and edit derived metric formulas in the Metrics Editing dialog. The supported operations are:
• Add new derived metric
• Edit derived metric
• Delete derived metric
After modification, you can save the file or save it as another derived metrics file.
The following example shows how to create new metrics. The Edit Metrics dialog has three tab pages: General Info, Metrics List and Metrics Group.
1. On the General Info page, enter general information for the derived metrics file. Enter the name of the metrics in the Metrics Name text box and its description in the Description text box. You can also specify the processor and OS family from the Processor and OS Family drop-down lists.
Figure 347 The General Info tab of the Edit Metrics dialog
2. On the Metrics List page, use the New and Remove buttons to add and delete metrics. Be careful: clicking the Clear button removes all the metrics.
Figure 348 The Metrics List tab of the Edit Metrics dialog
Click the "..." button to open the formula editor, which helps you write the formula; you can then edit the formula of the metric in the Formula text box. The supported operators are "+", "-", "*", "/", "(", ")". You can either specify your own operands or select operands from Events, Variables and Derived Metrics. The available operand candidates depend on the processor defined previously.
Figure 349 The formula editor
3. On the Metrics Group page, you can define metrics groups to put related metrics together:
Figure 350 The Metrics Group tab of the Edit Metrics dialog

5.4.3.9.2 Edit CPI Breakdown Model
1. Use the toolbar to open a CPI Breakdown Model definition file.
2. View and edit the CPI Breakdown Model in the Edit CPI Breakdown Model dialog. The supported operations are:
o Add new component
o Edit component
o Delete component
Figure 351 The Edit CPI Breakdown Model window
3. After modification and validation, you can save the file or save it as another CPI breakdown model definition file.

5.4.3.10 Import Counter Data File into Repository
To create Counter Analyzer Support, you must first create a database connection. Refer to section 5.1.7 for details on how to create and manage database connections. Then right-click the connection and select Create Counter Analyzer Support. If you are using an hsqldb connection, all existing databases under this path are listed; you can create a new one or attach an existing one. If you are using a db2 connection, all existing schemas under this database are listed; you can create a new one or attach an existing one.
Figure 352 Create Counter Analyzer support
After Counter Analyzer Support has been created, you can import a file into it, open the file in an editor, or delete it.
Figure 353 Import a file into the Counter Analyzer support and open the file in editor
You can delete files manually to release disk space. Multiple selection is allowed. See the picture below.
Figure 354 Delete imported files from Counter Analyzer support

5.4.4 Watch Properties via Properties Sheet
5.4.4.1 Open Properties Sheet
Choose the Window -> Show View -> Other menu item. You can find "Properties" in the "General" folder.
Figure 355 Open the Properties view
Choose it and click "OK".

5.4.4.2 Watch Properties in Properties Sheet
Click a tree node in the System Information View to see all the properties of that node in the Properties Sheet.
Figure 356 The Properties view showing the information of the selected node in System Information view

5.5 Trace Analyzer
Trace Analyzer visualizes Cell BE traces containing information such as DMA communication, locking/unlocking activities, and mailbox messages. Trace Analyzer provides several views that help the user make sense of the trace data. The trace can be plotted in a graphical view, organized by core, along a common timeline. Alternatively, the user can traverse the trace records in a textual table. Another view provides the detailed data for each kind of record, for example, the lock identifier for lock operations or the accessed address for DMA transfers.
You can also find the Trace Analyzer User Guide within VPA. Select Help - Help Contents within VPA. To get context-sensitive help, press F1 on Windows and AIX or Ctrl+F1 on Linux.

5.5.1 Basic concepts
• Events
Events are records that have no duration, for example, records describing non-stalling operations, such as releasing a lock. The impact of events on performance is normally insignificant, but they may be important for understanding the application and tracking down sources of performance problems.
• Intervals
Intervals are records that may have non-zero duration. They normally come from stalling operations, such as acquiring a lock. Intervals are often a very significant performance factor, and identifying long stalls and their sources is an important task in performance debugging. A special case of an interval is the live interval, which starts when an SPE thread begins to execute and ends when the thread exits.

5.5.2 Load an existing trace file
5.5.2.1 Open Trace Analyzer Perspective
When you first start Visual Performance Analyzer after installation, you see the Welcome View. To start Trace Analyzer, choose Window -> Open Perspective -> Other. In the Select Perspective dialog choose Trace Analyzer and click OK. The following screen capture shows the initial layout of Trace Analyzer.
Figure 357 The Trace Analyzer perspective loaded with trace data

5.5.2.2 Load in trace file
Choose File -> Open File..., and in the Open File dialog select a trace file with the suffix ".pex".

5.5.2.3 Brief introduction to Trace Analyzer Perspective
After loading the trace data, the Trace Analyzer Perspective displays the data in its views and editors.
Going from the top left clockwise, we see:
• Navigator View
• Trace Editor, showing the trace visualization by core
• Details View, showing the details of the selected record (if any)
• Color Map View, allowing the user to view and modify the color mapping for different kinds of events
• Trace Table View, which shows all the events in the trace in the order of their occurrence

5.5.3 Navigate the Trace Analyzer Perspective
The following are tasks that you can perform to navigate around trace data within Trace Analyzer.

5.5.3.1 View Trace Data Graph
A graphical presentation of the trace is shown in the editor window.
Figure 358 The trace editor loaded with trace data
Data from each core is displayed in a separate row, and each trace record is represented by a rectangle. Time is represented on the horizontal axis, so that the location and size of a rectangle on the horizontal axis represent the corresponding event's time and duration. The color of the rectangle represents the type of event, as defined by the Color Map View. In the rows corresponding to the SPEs, note the full-height green rectangles. They show the live intervals, each starting with the context switch that puts the thread on the CPU and ending with the context switch that takes it off the CPU. The events that occurred during the thread's execution are painted on top of them.
Trace Editor Components
The following figure shows the different components of the Graphical View.
Figure 359 The trace editor components
Going from top to bottom, we have:
• The marker ruler shows where the selected record (if any) is located on the trace's timeline (look for the orange-and-white selection marker). Clicking on the marker ruler scrolls the view to make the selected event visible. Note that there are also vertical marker rulers, located in each row between the core id and the graph.
These rulers show on which core the selected event occurred.
• The scrollbar can be used to scroll back and forth in time.
• The ruler bar shows the time values, in the same units as those used in the trace.
Trace Editor Tools
When a trace is open in the Graphical View, a Trace Editor toolbar is added to the standard Eclipse toolbars. This toolbar is only active when the focus is on the Trace Editor. The following tools are available:
• Selection tool. Pick this tool and click a record in the Graphical View to select the record. This scrolls the Record List View to the selected record and displays its details in the Record Details View.
• Zoom-in point tool. Pick this tool and click one of the graphs to zoom in while keeping the time value at the click point at the same location.
• Zoom-out point tool. Pick this tool and click one of the graphs to zoom out while keeping the time value at the click point at the same location.
• Zoom-all tool. Pick this tool and click anywhere in the graph to fit the whole trace into the view.
• Zoom-in area tool. To fit a specific region into the view, pick this tool and, in one of the rows, mark the area you want to fit into the view.
• Drag tool. To scroll the view back and forth along the time axis, pick this tool and hold the right mouse button pressed while dragging the graph.
Trace Editor Coloring Conventions
When analyzing a trace, it is often important to distinguish between a large number of short intervals and a single long interval, which may be a good target for optimization. To aid in this analysis, Trace Analyzer emphasizes intervals with a border whose color is a darker hue of the event's color. Refer to the Color Map View for the exact color mapping of any particular editor instance.
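The distinction between one long interval and many short ones can also be checked numerically once trace records are available outside the tool. The following Python sketch is not part of Trace Analyzer. It assumes the records have been exported as (kind, start, end) tuples, which is a made-up layout, and as a simplification it treats every zero-duration record as an event rather than an interval.

```python
from collections import defaultdict


def summarize_intervals(records):
    """Group interval records by kind and report count, total and longest duration."""
    summary = defaultdict(lambda: {"count": 0, "total": 0, "longest": 0})
    for kind, start, end in records:
        duration = end - start
        if duration == 0:
            continue  # simplification: zero-duration records are treated as events
        entry = summary[kind]
        entry["count"] += 1
        entry["total"] += duration
        entry["longest"] = max(entry["longest"], duration)
    return dict(summary)


# Hypothetical records: (kind, start time, end time) in trace time units.
records = [
    ("lock_acquire", 0, 90),    # one long stall
    ("dma_wait", 100, 110),     # several short stalls
    ("dma_wait", 120, 130),
    ("dma_wait", 140, 150),
    ("lock_release", 95, 95),   # an event: no duration
]

for kind, stats in summarize_intervals(records).items():
    print(kind, stats)
```

A kind whose longest duration is close to its total points to a single long stall, while a high count with a small longest duration points to many short intervals.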
Figure 360 The trace editor coloring conventions (labels: Long interval; Multiple short intervals)

5.5.3.2 View List of Trace Records
You can view the list of the records in the trace in the Trace Table View. Click on a row to select a record and see its details in the Record Details View. The Trace Table View selection is also synchronized with the selection in the Trace Editor, so that each scrolls to and highlights the selection made in the other. Controls at the top of the view let you change the start and size of the trace chunk visible in the table, or scroll to the next or previous chunk. To save the trace in a text file, click the icon at the top of the view, or select "Save as Text File" in the view menu. The file is placed next to its corresponding .pex file, with the extension .txt.
Figure 361 The Trace Table view

5.5.3.3 View Record Details
The Record Details View shows the names and values of all the fields defined for the selected record.
Figure 362 The Record Details view

5.5.3.4 Change Colors used in Trace Editor
The Color Map View holds a hierarchical list of record types and their color mapping. To change the color assigned to a particular type, double-click the corresponding row in the color map to open the color chooser dialog. Changing the color of a category changes the color of all record types in this category. To change the color of an individual type within a category, click the plus sign at the left of the category row to expand it, and then double-click the line that corresponds to the desired type to change its color mapping.
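The category and type behavior of the color map can be modeled as a two-level lookup. The Python sketch below only illustrates the idea and is not Trace Analyzer's implementation: the class, the record type names, and the colors are hypothetical, and it assumes that recoloring a category discards any per-type colors set earlier in that category.

```python
class ColorMap:
    """Two-level color mapping: categories contain record types."""

    def __init__(self, categories):
        # categories: {category name: (default color, [record type names])}
        self._category_of = {}
        self._category_color = {}
        self._type_color = {}
        for category, (color, types) in categories.items():
            self._category_color[category] = color
            for record_type in types:
                self._category_of[record_type] = category

    def set_category_color(self, category, color):
        """Recolor a category, which recolors every record type in it."""
        self._category_color[category] = color
        for record_type, owner in self._category_of.items():
            if owner == category:
                self._type_color.pop(record_type, None)

    def set_type_color(self, record_type, color):
        """Override the color of one record type within its category."""
        self._type_color[record_type] = color

    def color_of(self, record_type):
        category = self._category_of[record_type]
        return self._type_color.get(record_type, self._category_color[category])


color_map = ColorMap({
    "DMA": ("blue", ["dma_get", "dma_put"]),
    "Lock": ("red", ["lock_acquire", "lock_release"]),
})
color_map.set_type_color("dma_put", "cyan")
print(color_map.color_of("dma_get"), color_map.color_of("dma_put"))
color_map.set_category_color("DMA", "green")
print(color_map.color_of("dma_get"), color_map.color_of("dma_put"))
```

Keeping per-type overrides separate from category defaults means a type without an override automatically follows its category.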
Figure 363 The Color Map View

5.5.4 Select Event
An event can be selected by one of the following means:
• In the graphical view, by selecting the selection (arrow) tool and clicking on the event
• In the trace overview, by clicking on the event
These views are synchronized with regard to selection. For example, if the selection is made in the graphical view, the trace overview also shows the selection, scrolling the table if necessary. Likewise, if the selection is made in the tabular view, the graphical view scrolls to the selection, provided the event is shown in the graphical view at all. Every selection causes the selected record to be shown in the record detail view. In addition, the graphical view's selection marker ruler is updated to show the selection's location in the trace. Clicking on this ruler causes the graphical view to scroll to make the selection visible.

5.6 Call Tree Analyzer
Call Tree Analyzer is a tool for analyzing call trace data collected by tools such as Performance Inspector JProf. The call trace data records information such as when one method calls another and how much time is spent in every invocation. Call Tree Analyzer provides two major ways to visualize call trace data: the execution flow graph and the call tree table.
You can also find the Call Tree Analyzer User Guide within VPA. Select Help - Help Contents within VPA. To get context-sensitive help, press F1 on Windows and AIX or Ctrl+F1 on Linux.

5.6.1 Basic Concepts
5.6.1.1 Call Tree and Call Context Tree
The two common representations of call trace data are the call tree and the call context tree. In a call tree, each method invocation is represented as one tree node. The caller's invocation is the parent node, and the callee's invocation is the node itself.
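To make the call tree definition concrete, here is a minimal Python sketch that builds one from a stream of call and return records. The record format (a method name for a call, None for a return), the Node class, and the sample trace are hypothetical and unrelated to the JProf file format. The sketch also merges invocations of the same method under the same caller, which is the step that turns a call tree into the call context tree discussed next.

```python
class Node:
    def __init__(self, method, calls=1):
        self.method = method
        self.calls = calls      # number of invocations this node stands for
        self.children = []


def build_call_tree(records, root_method="A"):
    """Build a call tree: one node per invocation, with the caller as parent."""
    root = Node(root_method)
    stack = [root]
    for record in records:
        if record is None:      # return to the caller
            stack.pop()
        else:                   # the method on top of the stack calls `record`
            node = Node(record)
            stack[-1].children.append(node)
            stack.append(node)
    return root


def to_context_tree(node):
    """Merge child invocations of the same method under the same caller."""
    merged = Node(node.method, node.calls)
    groups = {}
    for child in node.children:
        group = groups.setdefault(child.method, Node(child.method, 0))
        group.calls += child.calls
        group.children.extend(child.children)
    merged.children = [to_context_tree(group) for group in groups.values()]
    return merged


def render(node, depth=0):
    lines = ["  " * depth + f"{node.method} x{node.calls}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical trace: A calls B (which calls C twice), then A calls B again (C once).
records = ["B", "C", None, "C", None, None, "B", "C", None, None]
call_tree = build_call_tree(records)
print("\n".join(render(call_tree)))
print("\n".join(render(to_context_tree(call_tree))))
```

The call tree keeps six nodes for this trace, while the context tree collapses them into three, recording how many invocations each merged node represents.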
In a call context tree, all child nodes of the same method are merged into one node, which is likewise attached under the node of the caller's invocation. Suppose we have the following call sequence ("A -> B" means A calls B, and "<-" means the callee returns to its caller):
A -> B, B -> C, C -> D, <-, C -> E, <-, <-, B -> C, C -> E, <-, <-, <-, A -> B, B -> F, <-, [...]
[...] Open Perspective -> Other. In the Open Perspective dialog choose Call Tree Analyzer and click OK. The following screen capture shows the initial layout of Call Tree Analyzer.
Figure 366 The Call Tree Analyzer loaded with call tree data
Choose File -> Open File..., and in the Open File dialog select a call trace data file with the suffix ".jprof".

5.6.3 Call Tree Analyzer Perspective Introduction
The Call Tree Analyzer perspective contains one editor and several views. Going from the top left clockwise, we see:
• Navigator view
• Call Tree Analyzer editor, containing the Execution Flow editor page and the Call Tree editor page, which display the execution flow graph and the call tree.
• Call Stack view, showing the call stack of the selected invocation.
• Information view, showing the overall information of the opened call trace data file.
• Invocation view, showing the parent invocation and child invocations of the selected invocation.

5.6.4 View Call Trace Data in Execution Flow Editor Page
The Execution Flow editor page provides a powerful way to visualize how an application executes. It consists of the execution flow graph and the call tree table.
Figure 367 The Execution Flow page of call tree editor
The execution flow graph visualizes the application's execution thread by thread or process by process, depending on the call trace data. Each thread or process has one rectangular area, in which all its method invocations are visualized. At the rightmost side, there is a time axis.
The unit of the time axis can be cycles, instructions, or seconds, depending on the call trace data.
Figure 368 The execution flow graph
A color bar represents the life cycle of one method invocation, and its height represents how long this invocation lasts from entry to exit. A red line between color bars indicates that the method on the left side invokes the method on the right side. From this graph, you can therefore see how long an invocation lasts and when one invocation initiates another.
Figure 369 The exaggerated execution flow graph
Toolbar buttons let you zoom in or out, navigate, and save an image of the execution flow graph.
Table 3 The toolbar button of Call Tree Analyzer perspective
Icon | Description
[icon] | Zoom in.
[icon] | Zoom out.
[icon] | Zoom to fit the current window.
[icon] | Save the visible graph of the current window as an image.
[icon] | Save the whole graph as an image.
[icon] | Select one invocation.
[icon] | Select all invocations of the same method.
[icon] | Drag and move the graph.
The call tree table has the following attributes for each method invocation.
Figure 370 The call tree table
Table 4 The attribute types of the call tree table
Attribute | Description
Invocation | Method name.
Method Set | Method Set name. This can be the class name for a Java method, or the module name for a C/C++ method.
Starting Time | The time stamp when the current invocation is initiated.
Base Time | The time spent on the current invocation itself. It doesn't include the time its child invocations spend.
Cumulative Time | The time spent on the current invocation itself and its child invocations.
Cumulative % | The percentage of the cumulative time relative to the time period of the whole trace.

5.6.5 View Call Trace Data in Call Tree Editor Page
The Call Tree editor page is used to analyze the relationship between caller and callee.
It consists of one call tree table and a set of invocation relationship tables. The call tree table displays how and when a method calls another method, and it is the same as the one on the Execution Flow editor page. When you double-click a method invocation in the call tree table, the invocation relationship tables display the parent invocation and child invocations of the selected one.
Figure 371 The Call Tree page of call tree editor
The call tree table has the following attributes for each method invocation.
Figure 372 The call tree table of the Call Tree page
Table 5 The attribute types of the call tree table
Attribute | Description
Invocation | Method name.
Method Set | Method Set name. The name can be the class name for a Java method, or the module name for a C/C++ method.
Starting Time | The time stamp when the current invocation is initiated.
Base Time | The time spent on the current invocation itself. It doesn't include the time its child invocations spend.
Cumulative Time | The time spent on the current invocation itself and its child invocations.
Cumulative % | The percentage of the cumulative time relative to the time period of the whole trace.
The invocation relationship tables consist of three tables that display the parent invocation, the selected invocation, and the child invocations. Double-click an invocation in the call tree table to display its parent invocation, the invocation itself and its child invocations in the invocation relationship tables. You can navigate through the invocation selection history by using the history buttons.
Figure 373 The invocation relation tables

5.6.6 Locate Invocation by Method Name
Sometimes you know a method's name and want to locate it in a large amount of call trace data. Follow these steps to locate it.
Select one runnable (thread/process) in the call tree table, and choose Find... on the context menu.
Figure 374 Find an invocation by method name
For example, suppose you want to locate the method "startsWith(String)". In the Find dialog, enter the method name "startsWith", and click Find Next.
Figure 375 The Find dialog
The first invocation whose name contains "startsWith" is found and selected in the call tree table.
Figure 376 The string containing the name “startsWith” found in the call tree table
Select this invocation, and choose Highlight in Execution Flow on the context menu.
Figure 377 Highlight the selected invocation in execution flow graph
The invocation is highlighted in the execution flow graph.
Figure 378 The selected invocation highlighted in execution flow graph

5.6.7 Filter Invocations
A call trace data file contains a large number of runnables (threads and processes) and invocations. You can filter some of them out by using the invocation filter. Two kinds of filtering are supported: filtering the runnables and filtering the methods.
Figure 379 Filter the runnables and invocations
You can select the runnables to display on the Runnable tab of the Filters dialog.
Figure 380 Select the runnables and methods to be displayed
You can define rules to include or exclude methods whose names match a pattern. In the following example, all methods whose names start with "java/" are excluded.
Figure 381 Define a rule to filter with

5.6.8 Filter the method in Method Overview
When you enter some key words in the Filter combo, the OK button is enabled.
Click OK, and the filter results are shown in Method Overview. There are two cases: filtering with a regular expression and filtering with a common string.
If you enter a common string (for example, "java/lang/string") and click OK, the filtered results are displayed in the view.
Figure 382 Filter with a common string
The following picture shows the filtered result when using a common string.
Figure 383 The filtered result using a common string
If you enter a regular expression (for example, ".*java/lang.*"), select the ReEx checkbox, and click OK. The filtered results are then displayed in the view.
Figure 384 Filter with a regular expression
The following picture shows the filtered results when using a regular expression:
Figure 385 The filtered results using a regular expression
If you enter an illegal expression (for example, "????a"), select the ReEx checkbox and click OK, an error dialog pops up.
Figure 386 Filter with an illegal expression
The error dialog pops up:
Figure 387 The warning dialog
Then click the Reset button, and filter the methods again.

5.6.9 Filter the method in Type Overview
Double-click a type in Type Overview to filter Method Overview down to the methods that allocate this object type.
Before the filtering:
Figure 388 Double-click a type to filter the methods allocating it
Double-click a type and the methods are filtered as follows:
Figure 389 The methods allocating the selected type filtered in Method Overview

5.6.10 Drill Down one Invocation
When you open a call trace data file, you can view all invocations in the execution flow graph.
If you want to narrow the scope of the analysis, you can select one invocation and choose to drill down from the execution flow graph or the call tree table.
Figure 390 Drill down an invocation in execution flow graph
Figure 391 Drill down an invocation in call tree page
After you choose Drill down in Execution Flow Graph, the selected invocation is displayed in a new Execution Flow editor page.
Figure 392 The invocation drilled down in execution flow graph and call tree table
After you choose Drill Down in Call Tree, the selected invocation is displayed in a new Call Tree editor page.
Figure 393 The drilled down invocation in Call Tree page

5.6.11 Setup Column Properties
You can set up column properties in Method Overview, the Call Stack view and the call tree table by using the Select Column dialog. There are three properties that you can set:
• Show or hide columns
• Change the order of columns
• Set the width of columns
You can click the button in Method Overview to open the Select Column dialog.
Figure 394 Setup the column properties of the Method Overview
You can click the button in the Call Stack view to open the Select Column dialog.
Figure 395 Setup the column properties of the Call Stack view
You can click the button in the Invocation view to open the Select Column dialog.
Figure 396 Set up the column properties of the Invocation View
You can select the context menu item Select Columns... of the call tree table to open the Select Column dialog.
Figure 397 Open the Select Columns dialog
The following examples use the columns of the Call Stack view.
Show or hide columns
When you first open the dialog, only the names of the displayed columns are selected.
You can select more items to display these columns in the view. Note that the first column must always be selected.
Select more column items in the dialog.
Figure 398 Select the columns to be displayed in Call Stack view
If a memory jprof file is opened, as in this example, the column labels have the following meanings:
Column Label | Description
CALLS | Calls to this method (Callers)
AO | Allocated Objects
AB | Allocated Bytes
LO | Live Objects
LB | Live Bytes
More columns are displayed.
Figure 399 The selected columns displayed in Call Stack view
The Select All button selects all items. The Deselect All button deselects every item except the first one.
Change Columns Order
Open the dialog and select any item. Click the Move Up or Move Down button. Each click moves the selected item up or down by one position.
Select the item and choose the direction to move it.
Figure 400 Change the columns order by clicking Move up or Move down
Move the selected item.
Figure 401 The reordered column list
The column order is changed in the Call Stack view.
Figure 402 The reordered columns in Call Stack view
Set the Width of Columns
Open the dialog and select any item. Enter a new number in the text box after the label Selected column width (pixel):.
Figure 403 Set the width of columns by specifying a number
The width of the selected column is changed.
Figure 404 The column width changed in Call Stack view

5.6.12 Show Color Bar Table
The columns of Method Overview and the Call Stack view can be displayed as color bars.
Open the context menu of the view and select an item under Display as Color Bar (the submenu lists all displayed columns that hold numeric values).

Select an item under Display as Color Bar in Method Overview.

Figure 405 Display the Calls column as color bar

Select an item under Display as Color Bar in Call Stack view.

Figure 406 Display the CYCLES column as color bar

The following steps use the context menu of Call Stack view as an example.

Open a .jprof file (or another JPROF file) and look at the items under Display as Color Bar on the context menu. Select an item under Display as Color Bar.

Figure 407 Display CYCLES column in Call Stack view as color bar

The selected column displays color bars instead of numbers.

Figure 408 The CYCLES column displayed as color bars

Open the Select Column dialog, and then check more column items to be shown in Call Stack view.

Figure 409 Select more columns to be displayed in Call Stack view

Open the context menu again; more items are now shown under Display as Color Bar.

Figure 410 More items shown under Display as Color Bar

The selected column displays color bars instead of numbers.

Figure 411 The AO column displayed as color bars

All numeric columns can be displayed as color bars.

Figure 412 Display all the columns as color bars

5.6.13 Save Call Tree into Text File

Before you save the call tree, you can use Expand Next Three Levels on the context menu to expand the children of an invocation.

Figure 413 Expand the next three levels of the selected invocation

After that, the children are shown down to three levels.
Figure 414 The children of the selected invocation expanded to the third layer

Select the invocation you want to save, and select Copy Selection to Clipboard on the context menu.

Figure 415 Copy the selection to clipboard

Finally, paste the copied text into any text editor.

Figure 416 The selected information pasted into a text editor

5.6.14 Save Execution Flow Graph as Image

Open a call trace data file. You can save the visible area of the execution flow graph by clicking the corresponding button.

Figure 417 Save the visible execution flow graph

You can also save the whole execution flow graph by clicking the corresponding button.

5.6.15 Show JProf Information

If a jprof file is opened, you can consult the Information view. The Information view displays information for jprof files only.

Open a .jprof file. The information of this file is displayed in the Information view.

Figure 418 The Jprof information shown in the Information view

The text box contains descriptions and explanations of some concepts. It explains the Column Labels, which include the base value types, and the Method Types of all methods listed in Method Overview.

Figure 419 The explanation of the column labels and method types

5.6.16 Show Invocation in Invocation View

When you select an invocation in the call tree table, the selected invocation, its parent invocation and its child invocations are shown in Invocation view.

Figure 420 The Invocation view showing the relationship of the selected invocation in call tree table

When you select an invocation in Call Stack view, the selected invocation, its parent invocation and its child invocations are shown in Invocation view.
Figure 421 The Invocation view showing the relationship of the selected invocation in Call Stack view

Double-click the parent invocation or a child invocation to make it the selection: it is then shown in the Selected table, and its own parent and child invocations are displayed.

5.6.17 View Call Stack

5.6.17.1 Interact with views or pages

Call Stack view can interact with the Call Tree editor page, Method Overview and Invocation view.

Select in Call Tree Table

Select one invocation in the call tree table. Call Stack view lists all ancestor invocations of the selected one. The top entry is the furthest ancestor; the bottom entry is the nearest one. The first base value of the invocations is shown in Call Stack view.

Figure 422 The call stack of the selected invocation in call tree table

Select in Method Overview

Select one method in Method Overview. Call Stack view lists the invocation trace of the selected method. The bottom entry is the first invocation of the selected method; the entries above it are its predecessors, in sequence.

Figure 423 The call stack of the selected method in Method Overview

Show Invocation Relationship in Invocation View

Open Invocation view, and select one invocation in Call Stack view. The selected invocation, its parent and its children are displayed in Invocation view.

Figure 424 The invocation relationship of the selected invocation in Call Stack view

Select one and synchronize with Method of Object View

Select one invocation in Call Stack view; its type objects are then displayed in Method of Object View.
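The ancestor list that Call Stack view shows for a call tree selection (furthest ancestor at the top, nearest at the bottom) can be pictured as a walk along parent links. The following is only an illustrative sketch, not VPA code: the `Invocation` class and the `ancestor_stack` function are hypothetical names, and the sketch assumes that the selected invocation itself is not part of the list, which the guide does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Invocation:
    """Hypothetical stand-in for one node of the call tree."""
    name: str
    parent: Optional["Invocation"] = None


def ancestor_stack(selected: Invocation) -> List[Invocation]:
    """Return the ancestors of `selected`, furthest ancestor first."""
    ancestors = []
    node = selected.parent
    while node is not None:
        ancestors.append(node)   # nearest ancestor is collected first
        node = node.parent
    ancestors.reverse()          # put the furthest ancestor at the top
    return ancestors


# Small example tree: main -> run -> initialize
root = Invocation("main")
middle = Invocation("run", parent=root)
leaf = Invocation("initialize", parent=middle)

for inv in ancestor_stack(leaf):
    print(inv.name)
```

Selecting `initialize` in this toy tree lists `main` first and `run` last, matching the top-to-bottom order described for Call Stack view.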
Figure 425 The method information of the selected invocation displayed in Method of Object View

Select one and synchronize with Instruction View

Select one invocation in Call Stack view; if it has instruction information, its instructions are displayed in the Instruction View.

Figure 426 The instruction information of the selected invocation displayed in Instructions editor

Call Stack view is used to display the ancestor invocations of the selected method invocation.

Figure 427 The Call Stack view

The first column, Level, shows the depth of the invocation in the call stack. The second column, CYCLES, shows the Cycles value of the invocation. The third column, Call Stack, lists all invocations in the call stack. The Occurrence label shows how many invocations of the selected method exist and which one is currently shown.

5.6.17.2 The context menu of Call Stack view

There are six action items on the context menu:

Figure 428 The context menu of Call Stack view

Drill Down in Execution Flow Graph

Select an invocation and choose Drill Down in Execution Flow Graph. The selected invocation is displayed in a new Execution Flow editor page.

Figure 429 Drill down an invocation in execution flow graph

Drill Down in Call Tree

Select an invocation and choose Drill Down in Call Tree. A new tab named after the selected invocation is opened in the editor.

Figure 430 Drill down an invocation in call tree

Display as Color Bar

Choose Display as Color Bar, and choose a column item (only columns that are displayed and hold numeric values are listed). The column displays color bars instead of numbers.
Figure 431 Display a column as color bar

Select All

Choose Select All, and all items in Call Stack view are selected.

Figure 432 Select all the invocations in Call Stack view

Copy Selection to Clipboard

Select an invocation, and choose Copy Selection to Clipboard. Open a new text file, then press CTRL+V or select Paste in the context menu. All the information of the selected invocation is pasted into the file.

Find...

Choose Find... and a Find dialog pops up.

Figure 433 Find an invocation

For example, suppose you want to locate the method initialize. In the Find dialog, input the method name initialize, choose Backward and click Find Next.

Figure 434 Find the method containing the string “initialize”

The invocation whose name contains initialize is found and selected in Call Stack view.

Figure 435 The method containing the string “initialize” found

5.6.17.3 Show different paths and levels

Open a jprof file (or another JPROF file). Select a method in Method Overview, and the invocation trace of the selected method is listed in Call Stack view.

Figure 436 The invocations of the selected method shown in Call Stack view

Change the occurrence path to the second one by clicking the button.

Figure 437 Change the occurrence path to the second one

Show all the successor levels of the selected method invocation by clicking the button.

Figure 438 Display all the successor levels of the selected invocation

Collapse the last level.

Figure 439 Collapse the last level

Open the Select Column dialog, and change the first base value type that is shown.
Figure 440 Change the first base level type

The cumulative value is now computed from the first displayed value type, and the list of successor levels changes accordingly.

Figure 441 The changed cumulative value and successor level list

Select an invocation in the call tree table, and all the ancestor invocations of the selected one are listed. In this situation, the corresponding buttons of the view are enabled.

Figure 442 Display the ancestor levels of the selected invocation in Call Stack view

Show all the successor levels of the selected method invocation by clicking the button.

Figure 443 Display all the successor levels of the selected invocation

The list in Call Stack view is the same as the corresponding list in the call tree table.

5.6.18 View method in a table

When you open a .jprof file, a .ctb file, or a similar file, Method Overview lists all method names in the file. The methods are sorted by their Calls number in descending order. The first base value type is shown in Method Overview. In the following example, CYCLES values are shown by default.

Figure 444 The Method Overview

The first column, Calls, shows how many calls to the method occurred. The second column, CYCLES, shows the Cycles value of the method. The third column, Name, lists all method names in Method Overview.

When you input key words in the Filter Combo, the OK button is enabled. Click OK, and the filter results are shown in Method Overview. There are two cases: filtering by a common string and filtering by a regular expression.

If you input a common string (for example, "java/lang/string"), click OK directly; the filter results are then displayed.
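The difference between the two filter cases can be illustrated with a short sketch. This is not VPA's implementation: the function name `filter_methods` and the sample method names are made up for illustration. Two behaviors are assumptions that the guide does not state: the plain string is treated as a case-insensitive substring (so that the guide's example "java/lang/string" would match java/lang/String), and the regular expression must match the whole method name (which is why the guide's example is written as ".*java/lang.*").

```python
import re


def filter_methods(names, pattern, use_regex=False):
    """Return the method names that pass the filter.

    Assumption: plain strings are case-insensitive substrings and
    regular expressions must match the entire name.
    """
    if use_regex:
        compiled = re.compile(pattern)
        return [n for n in names if compiled.fullmatch(n)]
    needle = pattern.lower()
    return [n for n in names if needle in n.lower()]


# Hypothetical method names in the style used by the guide
methods = [
    "java/lang/String.length()I",
    "java/util/HashMap.addEntry",
    "java/lang/Thread.Thread",
]

print(filter_methods(methods, "java/lang/string"))
print(filter_methods(methods, ".*java/lang.*", use_regex=True))
```

With these sample names, the plain string keeps only the String method, while the regular expression keeps both java/lang methods.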
Figure 445 Filter by a common string

The following picture shows the results of filtering by a common string.

Figure 446 The filtered results using a common string

If you input a regular expression (for example, ".*java/lang.*"), select the ReEx checkbox and click OK; the filter results are then displayed.

Figure 447 Filter by a regular expression

The following picture shows the results of filtering by a regular expression:

Figure 448 The filtered results using a regular expression

5.6.19 View call graph in Call Graph view

The Call Graph view, which is shared with Call Tree Analyzer, supports output files with call graph information from the following tools:

Table 6 The performance tools on different platforms

Platform     Tools
Linux/X86    oprofile 0.9.3 (--callgraph option)
AIX          gprof (remote file and gmon.out file)
Java         PI JProf (rt-log file, or generic file)

For the general concepts of the Expansion, Overall and Compound modes of Call Graph view, refer to 5.1.9 View call graph in Call Graph view.

Load an existing gprof remote file

To open a gprof remote file, choose File -> Open File..., select a gprof file type in the Open File dialog, and click Open.

Figure 449 Open a gprof remote file

VPA automatically opens a gmon.out file if one exists in the same directory as the gprof remote file. If not, a dialog named Method Overview Editor pops up and asks you to locate the gmon.out file, as shown in the following picture.

Figure 450 Confirm to open a gmon.remote file

Click Yes and locate the corresponding file. The default name of the out file is gmon.out. You can also rename the out file according to your needs.
Figure 451 Locate a corresponding gmon.out file

The following picture shows the details of the remote file in the editor.

Figure 452 The editor displaying the remote file data

Viewing the graph in Expansion, Overall and Compound mode works the same way in Call Tree Analyzer as in Profile Analyzer. However, you must first launch Call Tree Analyzer by clicking its icon. Call Tree Analyzer supports input files such as gprof remote files and jprof files. For examples of how to view the graph, refer to 5.1.9 View call graph in Call Graph view.

5.6.20 Ways to invoke Invocation View

Select an invocation in the call tree table, and the selected invocation, its parent invocation and its child invocations are shown in Invocation View.

Figure 453 Invoke the Invocation view through call tree table

Select an invocation in Call Stack view, and the selected invocation, its parent invocation and its child invocations are shown in Invocation View.

Figure 454 Invoke the Invocation view through Call Stack view

Select a method in Method Overview, and the selected method, the methods of its invocations' parents and the methods of its invocations' children are shown in Invocation View.

Figure 455 Invoke the Invocation view through Method Overview

Select an invocation node in Call Graph view, and the selected node together with its caller and callee invocations is shown in Invocation View.

Figure 456 Invoke the Invocation view through Call Graph view

Select a gprof call object in Method Overview Editor, and the selected object, its parent and its children are shown in Invocation View.
Figure 457 Invoke the Invocation view through Method Overview editor

5.6.21 Analyze memory information

Open a jprof file containing memory information. For example, a method invocation might have memory objects as follows.

    i:java/security/AccessController.getContext()Ljava/security/AccessControlContext;
    -- 1 32 1 32 java/lang/Object
    -- 1 24 1 24 java/security/AccessControlContext

The method AccessController.getContext() allocates memory for two objects; the second and third lines show what is allocated for each of them. This memory information is called Type (in Type Overview) or Object (in Object View).

Browse all the memory information

In Type Overview, all the memory information is summarized and displayed.

Figure 458 Browse the memory information through Type Overview

Click the column AO, and the types are sorted by AO (Allocated Objects). This shows which type has the largest AO value.

Figure 459 Click the AO column title to sort by allocated objects

Double-click a type in Type Overview, and Method Overview is filtered to show only the methods that allocate objects of this type. See the description of filtering methods in Type Overview.

Browse memory information of each method or each invocation

1. When you select a method in Method Overview, the type objects are displayed in Object View. If the selected method has been invoked several times, the objects allocated by each invocation are displayed together.

Figure 460 Display the object types in Object View through Method Overview

You can see that the method I:java/util/HashMap.addEntry is invoked twice and each invocation allocates the object java/util/HashMap$Entry in memory.

2. When you select an invocation in the call tree table or Call Stack view, the allocated type objects are displayed in Object View.

Figure 461 Display the object types in Object View through call tree table

You can see that the invocation i:java/security/AccessController.getContext() allocates the objects java/lang/Object and java/security/AccessControlContext in memory.

Figure 462 Display the object types in Object View through Call Stack view

You can see that the invocation I:java/lang/Thread.Thread allocates the object java/lang/Object in memory.

5.6.22 Change time stamp in execution flow graph

Select a time stamp type under Select Time Stamp Type on the context menu of the execution flow graph, and the graph is redrawn using this time stamp type.

Select a time stamp type (for example, choose INSTRUCTIONS).

Figure 463 Change the time stamp of execution flow graph

The execution flow graph is then redrawn.

Figure 464 The time stamp of execution flow graph changed to INSTRUCTIONS

5.6.23 Highlight in call tree table

Select a thread in the execution flow graph and choose Highlight in Call Tree in the context menu.

Figure 465 Highlight a thread in call tree table

The corresponding invocation is then highlighted in the call tree table.

Figure 466 The highlighted invocation in call tree table

5.6.24 Detect repetition in execution flow graph

Some methods may be invoked repeatedly in the same mode. In other words, a method is invoked many times in the same way.

Figure 467 The execution flow graph

Choose Detect Repetition on the context menu.
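As a rough mental model of what Detect Repetition does, consecutive identical invocations can be folded into a single entry with a repeat count. The sketch below is only that simplified model, not VPA's algorithm: it compares method names only, whereas the guide speaks of the same "invocation mode", which may also involve the callees. The function name `fold_repetitions` and the sample call sequence are hypothetical.

```python
from itertools import groupby


def fold_repetitions(calls):
    """Fold runs of consecutive identical calls into (name, count) pairs."""
    return [(name, len(list(run))) for name, run in groupby(calls)]


# Hypothetical flat call sequence: one method repeated ten times in a row
sequence = ["init"] + ["work"] * 10 + ["done"]

for name, count in fold_repetitions(sequence):
    # A count above 1 corresponds to a folded entry such as X10
    print(name if count == 1 else f"{name} (X{count})")
```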
Figure 468 Detect the repetition of the invocations in execution flow graph

Invocations of the same method in the same invocation mode are then folded.

Figure 469 The folded methods in the same invocation mode

You can see that a method is invoked ten times (X10) in the same mode.

6. Appendix A - sample profiling session

This appendix walks through the typical usage of VPA to analyze problems, using the Profile Analyzer plug-in and some existing data.

Where can you download the existing data? Look for the Sample ETM File at the bottom of the web page: http://www.alphaworks.ibm.com/tech/vpa/download. The following picture shows the file being saved to a workspace directory.

Figure 470 The file saved to the workspace directory

If you don’t have the Profile Analyzer directory, try this:

Click the Profiling Resources tab; this causes VPA to create the missing directory.

Figure 471 Click the Profiling Resources tab to create the missing directory

Then click the Navigator tab to return to the view we want.

Figure 472 Click the Navigator tab to return to the wanted view

Once we have the Profile Analyzer folder in view, it is sometimes necessary to refresh it. Just right-click and select Refresh.

Figure 473 Click Refresh on the context menu

Figure 474 Click the expand icon to expand the file

If you don’t have a Navigator tab, adding a new view is easy with VPA. Just select Window => Show View => Other…

Figure 475 Display a view

Select Navigator.
It opens the Navigator tab in the bottom right window. Just drag it to the top left, where we want it.

Figure 476 Open the Navigator view

Then navigate to the directory D:\eclipse\workspace\Profile Analyzer to see your data.

Open the sample.etm file by double-clicking it in the Navigator view.

Figure 477 Open the sample.etm file

The initial view looks something like this:

Figure 478 The initial Profile Analyzer perspective

Expand Process > Module in the top right window by clicking the + icon:

Figure 479 Expand the Process > Module node

Once expanded, we can see that on the 4-way test system, our single-threaded Java test case left 3 of the processors idling, leaving the ticks nicely divided amongst the processors. The Java test case (Process ID 410f4) took 23.24% of the total ticks. A tick is a sampled address recording where the system was executing code. JITCODE is where the work for this Java test case is being done, and we expect this because the Java methods are compiled by the JIT.

Figure 480 The Profile Analyzer editor

The bar chart at the bottom left of VPA shows this well. Within the Java process, a full 96.87% of the ticks were taken in Jitted code. (If the chart is not shown, just click the bar chart icon on the Samples Distribution tab.)

Figure 481 The Samples Distribution view

Almost everything with VPA is a mouse click away! Double-clicking any tab changes it to a maximized view of that tab. Double-clicking the same tab changes it back. Try it by double-clicking the tab of the window on the top right:

Figure 482 Double-click to maximize the Profile Analyzer editor

This makes it easier to see everything.
Figure 483 The maximized editor

Double-click the same tab to go back to the 4-pane view:

Figure 484 Double-click the editor tab to restore the editor

Select PID 410f4 and click the + icon to expand it and see the modules within this process:

Figure 485 The expanded process

As we noted earlier, the JITCODE module shows the most ticks, as expected for the Java process. On the right-hand side we see a breakdown of ticks for the JITCODE module.

Figure 486 The symbol view displaying the symbols of the selected process in the editor

Clicking the JITCODE module expands the window on the right, showing a breakdown of ticks in JITCODE. The hottest symbols are shown in descending order. Notice that the thirtytwo() symbol is the biggest contributor at almost 52%:

Figure 487 The symbols of the selected module

If you look at the Samples Distribution view in the lower left corner, you will see that the information now corresponds to the JITCODE symbols. You can also switch between pie and bar charts.

Figure 488 Switch to a pie chart

Double-click the thirtytwo() symbol and notice that the Disassembly/Offsets information in the bottom right window is now filled in for this method.
Clicking any other method in the top right window changes everything we see in the graph and in this disassembly view to correspond to the selected method:

Figure 489 Display the disassembly and offset information of the symbol in Disassembly/Offsets view

We want to see the Disassembly/Offsets information at full screen, so double-click that tab:

Figure 490 Double-click to maximize the Disassembly/Offsets view

We now see the view at full window size:

Figure 491 The maximized Disassembly/Offsets view

Notice the colored bar on the right. This is the Hotness Bar. Red is used for any symbol that represents at least 20% of the total tick count. Magenta is used for any symbol that represents between 5% and 20% of the total tick count. Blue is used for any symbol that uses less than 5% of the total tick count. Darker shades show higher activity than lighter shades.

Figure 492 The hotness bar for the symbol thirtytwo()

We want to see the code with the most activity, so click on the % column to do a quick sort, which brings the highest percentage of ticks to the top:

Figure 493 Click the column title to sort by the percentage of ticks

If you would like to use other Profile Analyzer views that are not currently shown, simply go to Window -> Show View -> Other…

Figure 494 Open a view

Expand Profile Analyzer to see all the available views.

Figure 495 Expand the Profile Analyzer node to view all available views

As an example, click Temporal Profiling to load the view.

Figure 496 Open the Temporal Profiling view

You should now have a screen that looks like this, with Temporal Profiling appearing as a view.

Figure 497 The open Temporal Profiling view
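The Hotness Bar color rules given earlier in this appendix (red for at least 20% of the total tick count, magenta between 5% and 20%, blue below 5%) can be written down as a small function. This is a sketch for illustration only: the name `hotness_color` is hypothetical, shading within a color is not modeled, and treating exactly 5% as magenta is an assumption, since the guide only says "between 5% and 20%".

```python
def hotness_color(percent_of_ticks: float) -> str:
    """Map a symbol's share of total ticks (in percent) to a Hotness Bar color."""
    if percent_of_ticks >= 20.0:
        return "red"        # at least 20% of the total tick count
    if percent_of_ticks >= 5.0:
        return "magenta"    # between 5% and 20% (5% itself is an assumption)
    return "blue"           # less than 5%


# thirtytwo() was reported at almost 52% of the JITCODE ticks
print(hotness_color(52.0))
print(hotness_color(7.5))
print(hotness_color(1.2))
```

Under these rules the thirtytwo() symbol from the sample session falls in the red band.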
