User’s Guide for VirtualGL 2.3

Intended audience: System Administrators, Graphics Programmers, Researchers, and others with knowledge of the Linux or Solaris operating systems, OpenGL and GLX, and the X Window System.



1 Legal Information

This document and all associated illustrations are licensed under the Creative Commons Attribution 2.5 License. Any works that contain material derived from this document must cite The VirtualGL Project as the source of the material and list the current URL for the VirtualGL website.

The VirtualGL server components include software developed by the FLTK Project and distributed under the terms of the FLTK License.

The package for the VirtualGL Client for Exceed includes PuTTY, which is released under the PuTTY license (an MIT-style license.)

VirtualGL includes portions of X.org, which are released under the X.org license (also MIT-style.)

VirtualGL is licensed under the wxWindows Library License, v3.1, a derivative of the GNU Lesser General Public License (LGPL).



2 Overview

VirtualGL is an open source package that gives any Unix or Linux remote display software the ability to run OpenGL applications with full 3D hardware acceleration. Some remote display software, such as VNC, lacks the ability to run OpenGL applications at all. Other remote display software forces OpenGL applications to use a slow software-only OpenGL renderer, to the detriment of performance as well as compatibility. The traditional method of displaying OpenGL applications to a remote X server (indirect rendering) supports 3D hardware acceleration, but this approach causes all of the OpenGL commands and 3D data to be sent over the network to be rendered on the client machine. This is not a tenable proposition unless the data is relatively small and static, unless the network is very fast, and unless the OpenGL application is specifically tuned for a remote X-Windows environment.

With VirtualGL, the OpenGL commands and 3D data are instead redirected to a 3D graphics accelerator on the application server, and only the rendered 3D images are sent to the client machine. VirtualGL thus “virtualizes” 3D graphics hardware, allowing it to be co-located in the “cold room” with compute and storage resources. VirtualGL also allows 3D graphics hardware to be shared among multiple users, and it provides “workstation-like” levels of performance on even the most modest of networks. This makes it possible for large, noisy, hot 3D workstations to be replaced with laptops or even thinner clients. More importantly, however, VirtualGL eliminates the workstation and the network as barriers to data size. Users can now visualize huge amounts of data in real time without needing to copy any of the data over the network or sit in front of the machine that is rendering the data.

Normally, a Unix OpenGL application would send all of its drawing commands and data, both 2D and 3D, to an X-Windows server, which may be located across the network from the application server. VirtualGL, however, employs a technique called “split rendering” to force the 3D commands from the application to go to a 3D graphics card in the application server. VGL accomplishes this by pre-loading a dynamic shared object (DSO) into the application at run time. This DSO intercepts a handful of GLX, OpenGL, and X11 commands necessary to perform split rendering. Whenever a window is created by the application, VirtualGL creates a corresponding 3D pixel buffer (“Pbuffer”) on a 3D graphics card in the application server. Whenever the application requests that an OpenGL rendering context be created for the window, VirtualGL intercepts the request and creates the context on the corresponding Pbuffer instead. Whenever the application swaps or flushes the drawing buffer to indicate that it has finished rendering a frame, VirtualGL reads back the Pbuffer and sends the rendered 3D image to the client.

The beauty of this approach is its non-intrusiveness. VirtualGL monitors a few X11 commands and events to determine when windows have been resized, etc., but it does not interfere in any way with the delivery of 2D X11 commands to the X server. For the most part, VGL does not interfere with the delivery of OpenGL commands to the graphics card, either (there are some exceptions, such as its handling of color index rendering.) VGL merely forces the OpenGL commands to be delivered to a server-side graphics card rather than a client-side graphics card. Once the OpenGL rendering context has been established in a server-side Pbuffer, everything (including esoteric OpenGL extensions, fragment/vertex programs, etc.) should “just work.” In most cases, if an application runs locally on a 3D server/workstation, that same application will run remotely from that same machine using VirtualGL. However, if it were really as simple as that, we could all turn out the lights and go home. Most of the time spent developing VirtualGL has been spent working around “stupid application tricks.”

VirtualGL has two built-in “image transports” which can be used to send rendered 3D images to the client machine:

1. VGL Transport
The VGL Transport is most often used whenever the 2D X server (the X server used to draw the application’s GUI and transmit keyboard and mouse events back to the application server) is located across the network from the application server, for instance if the 2D X server is running on the user’s desktop machine. VirtualGL uses its own protocol on a dedicated TCP socket to send the rendered 3D images to the client machine, and the VirtualGL Client application decodes the images and composites them into the appropriate X window. The VGL Transport can either deliver uncompressed images (RGB encoded), or it can compress images in real time using a high-speed JPEG codec. It also supports the delivery of stereo image pairs, which can be reconstructed into a stereo image by the VirtualGL Client.

Figure 2.1: The VGL Transport with a Remote 2D X Server

2. X11 Transport
The X11 Transport simply draws the rendered 3D images into the appropriate X window using XPutImage() and similar X-Windows commands. This is most useful in conjunction with an “X Proxy”, which can be one of any number of Unix remote display applications, such as VNC. These X proxies are essentially “virtual” X servers. They appear to the application as a normal X server, but they perform X11 rendering to a virtual framebuffer in main memory rather than to a real framebuffer on a graphics card. This allows the X proxy to send only images to the client machine rather than chatty X-Windows rendering commands. When using the X11 Transport, VirtualGL does not perform any image compression or encoding itself. It instead relies upon an X proxy to encode and deliver the images to the client(s). Since the use of an X proxy eliminates the need to send X-Windows commands over the network, this is the best means of using VirtualGL over high-latency or low-bandwidth networks.

Figure 2.2: The X11 Transport with an X Proxy

VirtualGL also provides an API which can be used to develop custom image transport plugins.



3 System Requirements

3.1 Linux/x86

Recommended CPU
  Server (x86): Pentium 4, 1.7 GHz or faster (or equivalent)
    • For optimal performance, the processor should support SSE2 extensions.
    • Dual processors or dual cores recommended
  Server (x86-64): Pentium 4/Xeon with EM64T, or AMD Opteron or Athlon64, 1.8 GHz or faster
    • Dual processors or dual cores recommended
  Client: Pentium III or Pentium 4, 1.0 GHz or faster (or equivalent)

Graphics
  Server: Any decent 3D graphics card that supports Pbuffers
    • If the manufacturer of your 3D adapter provides proprietary drivers for Linux, then it is recommended that you install these. Many of the drivers that ship with Linux do not provide full 3D acceleration or Pbuffer support.
  Client: Any graphics card with decent 2D performance
    • If using a 3D graphics card, install the vendor drivers for that 3D graphics card.

Recommended O/S
  • Any distribution in the Red Hat or SuSE families that contains GLIBC 2.3.2 or later (including Fedora, CentOS 3 and later, and White Box)
  • Ubuntu Linux v6.0 or later

Other Software
  X server configured to export True Color (24-bit or 32-bit) visuals

3.2 Solaris/x86

Recommended CPU
  Server (x86): Pentium 4, 1.7 GHz or faster (or equivalent)
    • For optimal performance, the processor should support SSE2 extensions.
    • Dual processors or dual cores recommended
  Server (x86-64): Pentium 4/Xeon with EM64T, or AMD Opteron or Athlon64, 1.8 GHz or faster
    • Dual processors or dual cores recommended
  Client: Pentium III or Pentium 4, 1.0 GHz or faster (or equivalent)

Graphics
  Server: nVidia 3D graphics card
  Client: Any graphics card with decent 2D performance

O/S
  • Solaris 10 Update 1 or newer (Update 3 or newer recommended)
  • OpenSolaris 2008.11 (or newer)

Other Software
  X server configured to export True Color (24-bit or 32-bit) visuals

3.3 Mac/x86

Client
  Recommended CPU: Any Intel-based Mac
  O/S: OS X 10.4 (“Tiger”) or later
  Other Software:
    • VGL Transport Only: Mac X11 application (in the “Optional Installs” package on the OS X install discs)

3.4 Windows

Client
  Recommended CPU: Pentium III or Pentium 4, 1.0 GHz or faster (or equivalent)
  Graphics: Any graphics card with decent 2D performance
  O/S: Windows 2000 or later
  Other Software:
    • VGL Transport Only: Cygwin/X or OpenText Exceed v8 or newer
    • Client display must have a 24-bit or 32-bit color depth (True Color.)

3.5 Additional Requirements for Stereographic Rendering

The client requirements do not apply to anaglyphic stereo. See Chapter 16 for more details.

Linux and Solaris/x86
  Server and Client: 3D graphics card that supports stereo (examples: nVidia Quadro, ATI FirePro) and is configured to export stereo visuals
Mac/x86
  Server: N/A
  Client: 3D graphics card that supports stereo (example: nVidia Quadro) and is configured to export stereo visuals
Windows
  Server: N/A
  Client:
    • 3D graphics card that supports stereo (examples: nVidia Quadro, ATI FirePro) and is configured to export stereo pixel formats
    • OpenText Exceed 3D v8 or newer

3.6 Additional Requirements for Transparent Overlays

Client

Linux, Solaris/x86, and Mac/x86
  3D graphics card that supports transparent overlays (examples: nVidia Quadro, ATI FirePro) and is configured to export overlay visuals
Windows
  • 3D graphics card that supports transparent overlays (examples: nVidia Quadro, ATI FirePro) and is configured to export overlay pixel formats
  • OpenText Exceed 3D v8 or newer



4 Obtaining and Installing VirtualGL

VirtualGL must be installed on any machine that will act as a VirtualGL server or as a client for the VGL Transport. It is not necessary to install VirtualGL on the client machine if using VNC or another type of X proxy.

4.1 Installing VirtualGL on Linux

  1. Download the appropriate VirtualGL binary package for your system from the Files area of the VirtualGL SourceForge project page.

    If you wish to run both 32-bit and 64-bit OpenGL applications using VirtualGL on 64-bit Linux systems, you will need to install both the i386 and x86_64 VirtualGL RPMs or both the “VirtualGL” and “VirtualGL32” amd64 DEBs. (The VirtualGL32 DEB is a supplementary package that contains only the 32-bit server components.)

  2. Log in as root, cd to the directory where you downloaded the binary package, and issue the following commands:
    RPM-based systems
    rpm -e VirtualGL
    rpm -i VirtualGL*.rpm
    
    Debian-based systems
    dpkg -r VirtualGL
    dpkg -i VirtualGL*.deb
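
    The remove-then-install sequence in step 2 can be previewed with a small dry-run helper. This is only a sketch and is not part of VirtualGL: the function name vgl_install_cmds is hypothetical, the file name in the demo is a placeholder, and the helper merely prints the commands listed above so that they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not shipped with VirtualGL): prints the
# commands from step 2 that match the type of the downloaded package.
vgl_install_cmds() {
    case "$1" in
        *.rpm) printf 'rpm -e VirtualGL\nrpm -i %s\n' "$1" ;;
        *.deb) printf 'dpkg -r VirtualGL\ndpkg -i %s\n' "$1" ;;
        *)     echo "unrecognized package type: $1" >&2; return 1 ;;
    esac
}

# Demo with a placeholder file name; nothing is removed or installed.
vgl_install_cmds "VirtualGL-{version}.rpm"
```

    If no older version of VirtualGL is installed, the removal command reports that the package is not installed, and that message can be ignored.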
    

4.2 Installing VirtualGL on Solaris

  1. Download the VirtualGL Solaris package (VirtualGL-{version}-solarisx86.pkg.bz2 for 32-bit systems or VirtualGL-{version}-solarisx64.pkg.bz2 for 64-bit systems) from the Files area of the VirtualGL SourceForge project page.

    The 64-bit package also contains 32-bit VirtualGL server components, which can be used to run 32-bit OpenGL applications on 64-bit servers.

  2. Log in as root, cd to the directory where you downloaded the package, and issue the following commands:
    If VirtualGL 2.1.x or older is installed
    pkgrm SUNWvgl
    
    If VirtualGL 2.2 or newer is installed
    pkgrm VirtualGL
    
    (answer “Y” when prompted.)
    bzip2 -d VirtualGL*.pkg.bz2
    pkgadd -d VirtualGL*.pkg
    
    Select the VirtualGL package (usually option 1) from the menu.

4.3 Installing the VirtualGL Client on OS X

  1. Download the VirtualGL Mac disk image (VirtualGL-{version}.dmg) from the Files area of the VirtualGL SourceForge project page.
  2. Open the disk image, then open VirtualGL-{version}.pkg inside the disk image. Follow the instructions to install the Mac client.

4.4 Installing the VirtualGL Client on Windows (Exceed)

  1. Download the VirtualGL Client for Exceed installer package (VirtualGL[64]-{version}-exceed.exe) from the Files area of the VirtualGL SourceForge project page.
  2. Run the installer. The installation of the VirtualGL Client should be self-explanatory. The only configuration option is the directory into which you want the files to be installed.

NOTE: The VirtualGL Client for Exceed installer does not remove any previous versions of the VirtualGL Client that may be installed on your machine. If you wish, you can remove these older versions manually by using the “Add or Remove Programs” applet in the Control Panel (or the “Programs and Features” applet if you are running Windows Vista.)

4.5 Installing the VirtualGL Client on Windows (Cygwin/X)

  1. Make sure that the following Cygwin packages are installed:

    libGL
    libGLU
    libstdc++
    libX11
    libXext
    openssh
    xorg-xserver
    xauth
  2. Download the VirtualGL Cygwin package (VirtualGL-{version}-cygwin.tar.bz2) from the Files area of the VirtualGL SourceForge project page.
  3. Run the Cygwin Setup application (the same application you used to install Cygwin.)
  4. On the “Choose a Download Source” page, select “Install from Local Directory”.
  5. On the “Select Root Install Directory” page, use the same options that you used when installing Cygwin.
  6. On the “Select Local Package Directory” page, enter the directory containing the VirtualGL Cygwin package.
  7. On the “Select Packages” page, change “View” to “Partial” and verify that the VirtualGL package with the correct version is selected for install. Click “Next>” to install.

4.6 Installing VirtualGL from Source

If no pre-built VirtualGL binary package is available for your platform, then log in as root, download the VirtualGL source tarball (VirtualGL-{version}.tar.gz) from the Files area of the VirtualGL SourceForge project page, uncompress it, and cd into VirtualGL-{version}. The BUILDING.txt file in that directory contains further instructions on how to build and install VirtualGL from source.

4.7 Uninstalling VirtualGL

Linux

As root, issue one of the following commands:

RPM-based systems
rpm -e VirtualGL
Debian-based systems
dpkg -r VirtualGL

Solaris

As root, issue the following command:

pkgrm VirtualGL

Answer “yes” when prompted.

OS X

Use the “Uninstall VirtualGL” application provided in the VirtualGL disk image, or issue the following command from the Terminal:

sudo /opt/VirtualGL/bin/uninstall

Windows (Exceed)

Use the “Add or Remove Programs” applet in the Control Panel (or the “Programs and Features” applet if you are running Windows Vista), or select “Uninstall VirtualGL Client” in the “VirtualGL Client” Start Menu group.

Windows (Cygwin/X)

  1. Run the Cygwin Setup application (the same application you used to install Cygwin.)
  2. On the “Choose a Download Source” page, select “Install from Local Directory”.
  3. On the “Select Root Install Directory” page, use the same options that you used when installing Cygwin.
  4. On the “Select Local Package Directory” page, enter the directory containing the VirtualGL Cygwin package.
  5. On the “Select Packages” page, change “View” to “Full”, find the “VirtualGL” package in the list, and change its status from “Keep” to “Uninstall”. Click “Next>” to uninstall.



5 Configuring a Linux or Solaris Machine as a VirtualGL Server

5.1 Granting Access to the 3D X Server

VirtualGL requires access to the application server’s 3D graphics card so that it can create off-screen pixel buffers (Pbuffers) and redirect the 3D rendering from applications into these Pbuffers. Unfortunately, accessing a 3D graphics card on Linux and Solaris/x86 systems requires going through an X server. On such systems, the only way to share the application server’s 3D graphics card among multiple users is to grant those users access to the 3D X server, which is the X server attached to the application server’s 3D graphics card (refer to the figures in Chapter 2.)

It is important to understand the security risks associated with this. Once a user has access to the 3D X server, there is nothing that would prevent the user from logging keystrokes or reading back images from that X server. Using xauth, one can obtain “untrusted” X authentication keys which prevent such exploits, but unfortunately, those untrusted keys also disallow access to the 3D hardware. Thus, it is necessary to grant full trusted access to the 3D X server for any users that will need to run VirtualGL. Unless you fully trust the users to whom you are granting this access, you should avoid logging in locally to the 3D X server (particularly as root) unless absolutely necessary.

This section will explain how to configure a VirtualGL server such that selected users can run VirtualGL, even if the server is sitting at the login prompt.

  1. Shut down the display manager:
    Ubuntu Linux servers
    /etc/init.d/gdm stop
    
    SuSE Linux servers
    /etc/init.d/xdm stop
    
    Red Hat/Fedora Linux servers
    init 3
    
    Solaris 10 servers running GDM
    svcadm disable gdm2-login
    
    Solaris 11/OpenSolaris servers running GDM
    svcadm disable gdm
    
    Solaris servers running dtlogin
    /etc/init.d/dtlogin stop
    
  2. Log in as root from the text console (or remotely using SSh.)
  3. Run
    /opt/VirtualGL/bin/vglserver_config
    
  4. Select option 1 (Configure server for use with VirtualGL in GLX mode.)
  5. Restrict 3D X server access to vglusers group (recommended)?
    [Y/n]
    
    Yes
    Only users in the vglusers group can use VirtualGL (the configuration script will create the vglusers group if it doesn’t already exist.) This is the most secure option, since it prevents any users outside of the vglusers group from accessing (and thus exploiting) the 3D X server.
    No
    VirtualGL can be used by any user that successfully logs into the VirtualGL server. The 3D X server can also be accessed (and potentially exploited) by any user who is logged into the VirtualGL server. If you choose this option, it is recommended that you also disable the XTEST extension (see below.)
  6. Restrict framebuffer device access to vglusers group (recommended)?
    [Y/n]
    
    Yes
    Only users in the vglusers group can run OpenGL applications on the VirtualGL server (the configuration script will create the vglusers group if it doesn’t already exist.) This limits the possibility that an unauthorized user could snoop the 3D framebuffer device(s) and thus see (or alter) the output of a 3D application that is being used with VirtualGL.
    No
    Any authenticated user can run OpenGL applications on the VirtualGL server. If it is necessary for users outside of the vglusers group to log in locally to this server and run OpenGL applications, then this option must be selected.
  7. Disable XTEST extension (recommended)?
    [Y/n]
    
    Yes
    Disabling XTEST will not prevent a user from logging keystrokes or reading images from the 3D X server, but if a user has access to the 3D X server, disabling XTEST will prevent them from inserting keystrokes or mouse events and thus hijacking local X sessions on that X server.

    Certain Linux distributions do not have the X server command-line entries in their GDM configuration files. On these distributions, it will be necessary to run gdmsetup and manually add an argument of -tst to the X server command line to disable XTEST for the first time. After this, vglserver_config should be able to disable and enable XTEST properly. This is known to be necessary for openSUSE 10 and Red Hat Enterprise Linux 5.

    No
    x11vnc and x0vncserver both require XTEST, so if you need to attach a VNC server to the 3D X server, then it is necessary to answer “No” (and thus leave XTEST enabled.)
  8. If you chose to restrict X server or framebuffer device access to the vglusers group, then edit /etc/group and add root to the vglusers group. If you choose, you can also add additional users to the group at this time. Note that any user you add to vglusers must log out and back in again before their new group permissions will take effect.
  9. Restart the display manager:
    Ubuntu Linux servers
    /etc/init.d/gdm start
    
    SuSE Linux servers
    /etc/init.d/xdm start
    
    Red Hat/Fedora Linux servers
    init 5
    
    Solaris 10 servers running GDM
    svcadm enable gdm2-login
    
    Solaris 11/OpenSolaris servers running GDM
    svcadm enable gdm
    
    Solaris servers running dtlogin
    /etc/init.d/dtlogin start
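
For step 8, it can be handy to confirm that a login session has actually picked up the vglusers group. The following is a minimal sketch rather than a VirtualGL tool: the function name in_vglusers is hypothetical, and the demo assumes an id command that accepts -Gn (on Solaris this may be the one in /usr/xpg4/bin.)

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the space-separated group list in $1
# contains the vglusers group.
in_vglusers() {
    case " $1 " in
        *" vglusers "*) return 0 ;;
        *)              return 1 ;;
    esac
}

# Check the groups of the current login session.
if in_vglusers "$(id -Gn)"; then
    echo "current session is in vglusers"
else
    echo "current session is not in vglusers (log out and back in after being added)"
fi
```

A user who was just added to the group will keep failing this check until they log out and back in, as noted in step 8.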
    

Sanity Check

To verify that the application server is ready to run VirtualGL, log out of the server, log back into the server using SSh, and execute the following commands in the SSh session:

If you restricted 3D X server access to vglusers
xauth merge /etc/opt/VirtualGL/vgl_xauth_key
xdpyinfo -display :0
/opt/VirtualGL/bin/glxinfo -display :0 -c

NOTE: xauth and xdpyinfo are in /usr/openwin/bin on Solaris systems.

If you did not restrict 3D X server access
xdpyinfo -display :0
/opt/VirtualGL/bin/glxinfo -display :0 -c

Both commands should output a list of visuals and should complete with no errors. If you chose to disable the XTEST extension, then check the output of xdpyinfo to verify that XTEST does not show up in the list of extensions.

You should also examine the output of glxinfo to ensure that at least one of the visuals is 24-bit or 32-bit TrueColor and has Pbuffer support (the latter is indicated by a “P” in the last column.) Example:

    visual  x  bf lv rg d st colorbuffer ax dp st accumbuffer  ms  cav  drw
  id dep cl sp sz l  ci b ro  r  g  b  a bf th cl  r  g  b  a ns b eat  typ
---------------------------------------------------------------------------
0x151  0 tc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None PXW

If none of the visuals has Pbuffer support, then the most likely cause is a lack of 3D acceleration, which in turn usually means that the correct 3D drivers are not installed (or are misconfigured.) Lack of 3D acceleration is also typically indicated by the word “Mesa” in the client GLX vendor string and/or the OpenGL vendor string, and the words “Software Rasterizer” in the OpenGL renderer string.
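
Scanning a long visual list by eye is error-prone, so the check described above can be approximated with a small filter. This is a sketch under stated assumptions, not part of VirtualGL: the function name pbuffer_visuals is hypothetical, and it assumes the column layout of the example above (visual class in the third column, buffer size in the fifth, drawable types in the last.)

```shell
#!/bin/sh
# Hypothetical filter: reads "glxinfo -c" output on stdin and prints the IDs
# of TrueColor ("tc") visuals with a 24-bit or 32-bit buffer size whose last
# column contains "P" (Pbuffer support).
pbuffer_visuals() {
    awk '$1 ~ /^0x/ && $3 == "tc" && ($5 == 24 || $5 == 32) && $NF ~ /P/ { print $1 }'
}

# Demo on the example line shown above.
echo '0x151  0 tc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None PXW' |
    pbuffer_visuals    # prints 0x151
```

On a configured server, the filter would be fed from the real command, for example /opt/VirtualGL/bin/glxinfo -display :0 -c | pbuffer_visuals. Empty output suggests that no suitable visual was found.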

5.2 Using VirtualGL with Multiple Graphics Cards

VirtualGL can redirect the OpenGL commands from a 3D application to any 3D graphics card in the server machine. In order for this to work, however, each 3D graphics card must be attached to its own screen on the same X server. The screens must remain separate X screens (that is, they must not be merged into a single logical screen with Xinerama), so that the cards can be individually addressed by setting VGL_DISPLAY to (or invoking vglrun -d with) :0.0, :0.1, :0.2, etc.
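
As an illustration of this addressing scheme, the sketch below prints one vglrun command line per screen. It is not part of VirtualGL: the function name vgl_screen_cmds is hypothetical, {application} is a placeholder, and the screen numbering assumes that the cards are attached to consecutive screens of display :0.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints a vglrun command line for each of the
# first N screens of display :0 (N is the first argument).
vgl_screen_cmds() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "/opt/VirtualGL/bin/vglrun -d :0.$i {application}"
        i=$((i + 1))
    done
}

# Two graphics cards, attached to screens :0.0 and :0.1.
vgl_screen_cmds 2
```

Setting VGL_DISPLAY to the same value (for example, :0.1) before invoking vglrun selects the same card as the -d argument.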

5.3 SSh Server Configuration

The application server’s SSh daemon should have the X11Forwarding option enabled and the UseLogin option disabled. This is configured in sshd_config, which is usually located under /etc/ssh.
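
The two settings can be checked mechanically before users start connecting. The following is a simplistic sketch, not an OpenSSH or VirtualGL tool: the function name check_sshd_for_vgl is hypothetical, and it assumes whitespace-separated keyword/value pairs, ignores Match blocks, and treats the first occurrence of a keyword as the effective one.

```shell
#!/bin/sh
# Hypothetical check: succeeds if the sshd_config file named by $1 sets
# "X11Forwarding yes" and does not set "UseLogin yes". Commented-out lines
# are ignored because their first field starts with "#".
check_sshd_for_vgl() {
    awk '
        tolower($1) == "x11forwarding" && x11 == "" { x11 = tolower($2) }
        tolower($1) == "uselogin"      && ul  == "" { ul  = tolower($2) }
        END { exit !(x11 == "yes" && ul != "yes") }
    ' "$1"
}

# Only inspect the usual location if it is readable on this machine.
if [ -r /etc/ssh/sshd_config ]; then
    if check_sshd_for_vgl /etc/ssh/sshd_config; then
        echo "X11Forwarding is enabled and UseLogin is not enabled"
    else
        echo "review X11Forwarding and UseLogin in /etc/ssh/sshd_config"
    fi
fi
```

The SSh daemon must be restarted or told to reload its configuration before edits to sshd_config take effect.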

5.4 Un-Configuring the Server

You can use the vglserver_config script to restore the security settings that were in place before VirtualGL was installed. Option 2 (Unconfigure server for use with VirtualGL in GLX mode) will remove any shared access to the 3D X server and thus prevent VirtualGL from accessing the 3D hardware in that manner. Additionally, this option will re-enable the XTEST extension on the 3D X server and will restore the framebuffer device permissions to their default (by default, only root or the user that is currently logged into the application server locally can access the framebuffer devices.)

NOTE: Unconfiguring the server does not remove the vglusers group.

After selecting Option 2, you must restart the display manager before the changes will take effect.



6 Configuring a Windows Machine as a VGL Transport Client

6.1 Configuring and Optimizing Exceed

If using the VirtualGL Client for Exceed, then add the Exceed path (example: C:\Program Files\Hummingbird\Connectivity\9.00\Exceed) to the system PATH environment if it isn’t already there.

Disabling Pixel Format Conversion (Exceed 2006 and earlier)

  1. Load Exceed XConfig (right-click on the Exceed taskbar icon, then select Tools->Configuration.)
  2. Open the “X Server Protocol” applet in XConfig.

    If you are using the “Classic View” mode of XConfig, open the “Protocol” applet instead.

  3. In the “X Server Protocol” applet, select the “Protocol” tab and make sure that “Use 32 bits per pixel for true color” is not checked.

  4. Click “Validate and Apply Changes.” If XConfig asks whether you want to perform a server reset, click “Yes.”

Disabling Backing Store

  1. Load Exceed XConfig (right-click on the Exceed taskbar icon, then select Tools->Configuration.)
  2. Open the “Other Server Settings” applet in XConfig.

    If you are using the “Classic View” mode of XConfig, open the “Performance” applet instead.

  3. Select the “Performance” tab and make sure that “Default Backing Store” is set to “None.”

  4. Click “Validate and Apply Changes.” If XConfig asks whether you want to perform a server reset, click “Yes.”

Enabling MIT-SHM

VirtualGL has the ability to take advantage of the MIT-SHM extension in OpenText Exceed to accelerate image drawing on Windows. This can significantly improve the overall performance of the VirtualGL pipeline when running over a local-area network.

The bad news is that this extension is not consistently implemented across all versions of Exceed. In particular, Exceed 8, Exceed 9, and Exceed 2008 require patches to make it work properly. If you are using one of these versions of Exceed, you will need to obtain the following patches from the OpenText support site:

Exceed 8.0
  Patches required:
    • hclshm.dll v9.0.0.1 (or higher)
    • xlib.dll v9.0.0.3 (or higher)
    • exceed.exe v8.0.0.28 (or higher)
  How to obtain: Download all patches from the OpenText support site (OpenText WebSupport account required.)

Exceed 9.0
  Patches required:
    • hclshm.dll v9.0.0.1 (or higher)
    • xlib.dll v9.0.0.3 (or higher)
    • exceed.exe v9.0.0.9 (or higher)
  How to obtain: exceed.exe can be patched by running Hummingbird Update. All other patches must be downloaded from the OpenText support site (OpenText WebSupport account required.)

Exceed 2008
  Patches required:
    • xlib.dll v13.0.1.235 (or higher), or install the latest Connectivity 2008 Service Pack
  How to obtain: Download all patches from the OpenText support site (OpenText WebSupport account required.)

No patches should be necessary for Exceed 10, 2006, or 2007.

Next, you need to enable the MIT-SHM extension in Exceed:

  1. Load Exceed XConfig (right-click on the Exceed taskbar icon, then select Tools->Configuration.)
  2. Open the “X Server Protocol” applet in XConfig.

    If you are using the “Classic View” mode of XConfig, open the “Protocol” applet instead.

  3. Select the “Extensions” tab and make sure that “MIT-SHM” is checked.

  4. Click “Validate and Apply Changes.” If XConfig asks whether you want to perform a server reset, click “Yes.”

6.2 Optimizing Cygwin/X

VirtualGL has the ability to take advantage of the MIT-SHM extension in Cygwin/X to accelerate image drawing on Windows. This can significantly improve the overall performance of the VirtualGL pipeline when running over a local-area network.

To enable MIT-SHM in Cygwin/X:

  1. Open a Cygwin Bash shell
  2. Run cygserver-config
  3. Answer “yes” when asked “Do you want to install cygserver as service?”
  4. Run net start cygserver
  5. Add server to the CYGWIN system environment variable (create this environment variable if it doesn’t already exist)
  6. Start or re-start Cygwin/X
  7. Run xdpyinfo and verify that MIT-SHM appears in the list of X extensions
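
The verification in step 7 can be scripted. This is a minimal sketch, not part of VirtualGL or Cygwin: the function name has_x_extension is hypothetical, and it assumes that xdpyinfo lists each extension name on a line of its own, as in the hand-written excerpt used for the demo.

```shell
#!/bin/sh
# Hypothetical helper: reads xdpyinfo output on stdin and succeeds if the
# extension named by $1 appears on a line of its own.
has_x_extension() {
    grep -q "^[[:space:]]*$1[[:space:]]*\$"
}

# Demo on a hand-written excerpt in the layout xdpyinfo uses for extensions.
printf 'number of extensions:    3\n    BIG-REQUESTS\n    MIT-SHM\n    XTEST\n' |
    has_x_extension MIT-SHM && echo "MIT-SHM is available"
```

Against a running X server, the helper would be used as xdpyinfo | has_x_extension MIT-SHM.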



7 Using VirtualGL with the VGL Transport

Advantages of the VGL Transport

Disadvantages of the VGL Transport

7.1 VGL Transport with X11 Forwarding

This mode is recommended for use on secure local-area networks. The X11 traffic is encrypted, but the VGL Transport is left unencrypted to maximize performance.

Procedure for Linux/Solaris/Mac/Cygwin Clients

  1. Start the X server if it isn’t started already.
    Mac clients: start the Mac X11 application.
    Cygwin clients: start Cygwin/X.
  2. Open a new terminal window.
    Mac clients: in the X11 application, start a new xterm [Command-N] if one isn’t already started.
    Cygwin clients: start a new xterm if one isn’t already started (right-click on the Cygwin/X taskbar icon, then select Applications->xterm.)
  3. In the same terminal/xterm window, open a Secure Shell (SSh) session into the VirtualGL server:
    /opt/VirtualGL/bin/vglconnect {user}@{server}
    
    Replace {user} with your user account name on the VirtualGL server and {server} with the hostname or IP address of that server.
  4. In the SSh session, start a 3D application using VirtualGL:
    /opt/VirtualGL/bin/vglrun [vglrun options] {application_executable_or_script} {arguments}
    
    Consult Chapter 19 for more information on vglrun command-line options.

Procedure for Windows Clients Running Exceed

  1. Start Exceed if it isn’t already started. Hover the mouse pointer over the Exceed taskbar icon and make a note of the display number on which Exceed is listening (Example: “Exceed 0.0 Multiwindow Mode”.)
  2. Open a new Command Prompt.
  3. In the same Command Prompt window, set the DISPLAY environment variable to match the display on which Exceed is listening. Example:
    set DISPLAY=:0.0
    

    If you only ever plan to use one Exceed session at a time, then you can set the DISPLAY environment variable in your global user environment.

  4. Open a Secure Shell (SSh) session into the VirtualGL server:
    cd /d "c:\program files\virtualgl-{version}-{build}"
    vglconnect {user}@{server}
    
    Replace {user} with your user account name on the VirtualGL server and {server} with the hostname or IP address of that server.
  5. In the SSh session, start a 3D application using VirtualGL:
    /opt/VirtualGL/bin/vglrun [vglrun options] {application_executable_or_script} {arguments}
    
    Consult Chapter 19 for more information on vglrun command-line options.

7.2 VGL Transport with a Direct X11 Connection

As with the previous mode, this mode performs optimally on local-area networks. However, it is less secure, since both the X11 traffic and the VGL Transport are unencrypted. This mode is primarily useful in grid environments where you may not know ahead of time which server will execute a VirtualGL job. It is assumed that the “submit host” (the machine into which you connect with SSh) and the “execute hosts” (the machines that will run VirtualGL jobs) share the same home directories and reside in the same domain.

Some newer Linux and Solaris distributions ship with default settings that do not allow TCP connections into the X server. Such systems cannot be used as clients with this procedure unless they are reconfigured to allow X11 TCP connections.

Procedure

The procedure for this mode is identical to the procedure for X11 forwarding, except that you should pass a -x argument to vglconnect when connecting to the server:

/opt/VirtualGL/bin/vglconnect -x {user}@{server}

7.3 VGL Transport with X11 Forwarding and SSh Tunneling

Both the VGL Transport and the X11 traffic are tunneled through SSh when using this mode, and thus it provides a completely secure solution. It is also useful when either the VirtualGL server or the client machine is behind a restrictive firewall and only SSh connections are allowed through. Using SSh tunneling on wide-area networks should not affect performance significantly. However, using SSh tunneling on a local-area network can reduce VirtualGL’s performance by anywhere from 20-40%.

Procedure

The procedure for this mode is identical to the procedure for X11 forwarding, except that you should pass a -s argument to vglconnect when connecting to the server:

/opt/VirtualGL/bin/vglconnect -s {user}@{server}

vglconnect will make two SSh connections into the server, the first to find an open port on the server and the second to create the SSh tunnel for the VGL Transport and open the secure shell. If you are not using an SSh agent to create password-less logins, then this mode will require you to enter your password twice.

vglconnect -s can be used to create multi-layered SSh tunnels. For instance, if the VirtualGL server is not directly accessible from the Internet, you can use vglconnect -s to connect to a gateway server, then use vglconnect -s again on the gateway server to connect to the VirtualGL server. Both the X11 traffic and the VGL Transport will be forwarded from the VirtualGL server through the gateway and to the client.
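The connection modes described so far differ only in the flag passed to vglconnect: no flag for X11 forwarding, -x for a direct X11 connection, and -s for SSh tunneling. As a minimal sketch, the shell function below prints the command line for a given mode rather than running it. The function name vglconnect_cmd, its mode names, and the {gateway} placeholder are hypothetical and are not part of VirtualGL.

```shell
#!/bin/sh
# Hypothetical helper (not part of VirtualGL): print the vglconnect command
# line for a connection mode instead of executing it.
#   forward = X11 forwarding (no flag), direct = -x, tunnel = -s
vglconnect_cmd() {
    mode=$1
    target=$2
    case "$mode" in
        forward) flag="" ;;
        direct)  flag="-x" ;;
        tunnel)  flag="-s" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # $flag is deliberately unquoted so that an empty flag adds no extra word
    echo /opt/VirtualGL/bin/vglconnect $flag "$target"
}

# Two-hop tunnel: the first command is run on the client,
# the second inside the resulting session on the gateway.
vglconnect_cmd tunnel "{user}@{gateway}"
vglconnect_cmd tunnel "{user}@{server}"
```

For the multi-layered case, the second command only works if VirtualGL is also installed on the gateway server.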

sshtunnel

7.4 VGL Transport over Gigabit Networks

When using the VGL Transport over Gigabit Ethernet or faster networks, it may be desirable to disable image compression. This can be accomplished by passing an argument of -c rgb to vglrun or setting the VGL_COMPRESS environment variable to rgb on the VirtualGL server. Disabling image compression will reduce VirtualGL’s server and client CPU usage by 50% or more, but the tradeoff is that it will also increase VirtualGL’s network usage by a factor of 10 or more. Thus, disabling image compression is not recommended unless you are using switched Gigabit Ethernet (or faster) infrastructure and have plenty of bandwidth to spare.
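The trade-off above can be sketched as a small decision helper. The function name suggest_vgl_compress and the 1000 Megabits/second threshold are assumptions made for illustration (the threshold simply restates the "Gigabit Ethernet or faster" guidance), and jpeg is used here as the name of the compressed setting. The script prints the vglrun command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: suggest a VGL_COMPRESS value from the link speed
# in Megabits/second. The 1000 threshold is an assumption based on the
# "Gigabit Ethernet or faster" guidance.
suggest_vgl_compress() {
    if [ "$1" -ge 1000 ]; then
        echo rgb     # uncompressed: less CPU, much more bandwidth
    else
        echo jpeg    # compressed: more CPU, much less bandwidth
    fi
}

# Print (rather than run) the resulting vglrun command for a 1000 Mbit/s link.
echo "/opt/VirtualGL/bin/vglrun -c $(suggest_vgl_compress 1000) {application_executable_or_script}"
```

Link speed alone is not sufficient in practice, since the recommendation also depends on the network being switched and having bandwidth to spare.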

7.5 VGL Transport with XDMCP

XDMCP is very insecure and is not recommended as a means of running VirtualGL, in general. This section is provided mainly for completeness and should not be construed as an endorsement of XDMCP. In general, using an X proxy is a much better approach for getting a remote desktop session on the 3D application server.

Using the VGL Transport with XDMCP is conceptually similar to using the VGL Transport with a direct X11 connection. The major difference is that, rather than remotely displaying individual X windows to the 2D X server, XDMCP remotely displays a complete desktop session from the application server. Any applications that are started within this desktop session will run on the application server, not the client. Thus, vglconnect cannot be used in this case. Instead, it is necessary to start vglclient manually on the client machine.

Procedure

  1. Configure the server machine to accept XDMCP connections. This may require opening specific ports in its firewall.
  2. Configure the client machine to make XDMCP connections. This may require enabling X11 TCP connections and opening specific ports in its firewall.
  3. Once you have established an XDMCP connection from the client to the server, open a terminal inside the XDMCP session and type:
    xhost +LOCAL:
    

    This grants access to the 2D X server for any user that is currently logged into the client machine. This is not very secure, but neither is using XDMCP. If you are concerned, then see below for a discussion of how to use xauth to provide 2D X server access in a slightly more secure manner.

  4. If you are using a Mac or Windows client, or if you are using a nested X server (such as Xephyr or XNest) on a Linux/Unix client to make the XDMCP connection, then the next step is easy. Simply open a new terminal/command prompt on the client machine, set the DISPLAY environment variable to the display name of the X server that is running the XDMCP session (usually :0 or :1), and type:
    vglclient -detach
    
    You can now close the terminal/command prompt, if you wish.
  5. If you are running a full-screen XDMCP session on a Linux/Unix client (for instance, using GDM Chooser), then starting vglclient is a bit trickier. You will need to use SSh to connect back into the client machine from inside the XDMCP session. Then, in the client SSh session, set the DISPLAY environment variable to the display name of the X server that is running the XDMCP session (usually :0 or :1), and type:
    vglclient -detach
    
    You can now close the SSh session, if you wish.

Security

Typing xhost +LOCAL: in step 3 above opens the 2D X server to all current users of the client machine. This shouldn’t pose any significant risk if the client is a Windows or a Mac machine. However, Linux/Unix clients might have multiple simultaneous users, so in these cases, it may be desirable to use a more secure method of granting access to the 2D X server.

Instead of typing xhost +LOCAL:, you can type the following:

xauth nextract - $DISPLAY | sed "s/.*[ ]//g" | xargs ssh {client} xauth add {display} .

where {client} is the hostname or IP address of the client machine and {display} is the display name of the 2D X server, from the point of view of the client machine (usually :0 or :1).

This extracts the XAuth key for the 2D X server, then remotely adds it to the XAuth keyring on the client machine.
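To see what each stage of that pipeline contributes, the sketch below runs the same sed expression on a made-up line that stands in for the output of xauth nextract. The sample line and its key are invented for illustration; real output depends on your X server. The final ssh command is printed rather than executed.

```shell
#!/bin/sh
# Made-up sample standing in for one line of `xauth nextract - $DISPLAY`.
# Real output depends on your X server; only the field layout matters here.
sample='0100 0006 636c69656e74 0001 30 0012 4d49542d4d414749432d434f4f4b49452d31 0010 0123456789abcdef0123456789abcdef'

# Same sed expression as the pipeline above: the greedy ".*[ ]" deletes
# everything up to and including the last space, leaving only the key.
key=$(printf '%s\n' "$sample" | sed "s/.*[ ]//g")

# xargs appends the key to the end of the ssh command; print the result
# instead of running it.
echo "ssh {client} xauth add {display} . $key"
```

The lone period in the xauth add command is xauth shorthand for the MIT-MAGIC-COOKIE-1 protocol name.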

7.6 The VirtualGL Client Application: Nuts and Bolts

The VirtualGL Client application (vglclient) receives encoded and/or compressed images on a dedicated TCP socket, decodes and/or decompresses the images, and draws the images into the appropriate X window. The vglconnect script wraps both vglclient and SSh to greatly simplify the process of creating VGL Transport connections.

vglconnect invokes vglclient with an argument of -detach, which causes vglclient to completely detach from the console and run as a background daemon. It will remain running silently in the background, accepting VGL Transport connections for the X server on which it was started, until that X server is reset or until the vglclient process is explicitly killed. Logging out of the X server will reset the X server and thus kill all vglclient instances that are attached to it. You can also explicitly kill all instances of vglclient running under your user account by invoking

vglclient -kill

(vglclient for Linux/Mac/Solaris/Cygwin is in /opt/VirtualGL/bin, and vglclient for Exceed is in c:\program files\virtualgl-{version}-{build}.)

vglconnect instructs vglclient to redirect all of its console output to a log file named {home}/.vgl/vglconnect-{hostname}-{display}.log, where {home} is the path of the current user’s home directory (%USERPROFILE% if using the VirtualGL Client for Exceed), {hostname} is the name of the computer running vglconnect, and {display} is the name of the current X display (read from the DISPLAY environment variable or passed to vglconnect using the -display argument). If something goes wrong, this log file is the first place to check.

When vglclient successfully starts on a given X display, it stores its listener port number in a root window property on the X display. If other vglclient instances attempt to start on the same X display, they read the X window property, determine that another vglclient instance is already running, and exit to allow the first instance to retain control. vglclient will clean up the X property under most circumstances, even if it is explicitly killed. However, under rare circumstances (if sent a SIGKILL signal on Unix, for instance), a vglclient instance may exit uncleanly and leave the X property set. In these cases, it may be necessary to add an argument of -force to vglconnect the next time you use it. This tells vglconnect to start a new vglclient instance, regardless of whether vglclient thinks that there is already an instance running on this X display. Alternately, you can simply reset your X server to clear the orphaned X window property.

7.6.1 The VirtualGL Client and Firewalls

To retain compatibility with previous versions of VirtualGL, the first vglclient instance on a given machine will attempt to listen on port 4242 for unencrypted connections and 4243 for SSL connections (if VirtualGL was built with OpenSSL support.) If it fails to obtain one of those ports, because another application or another vglclient instance is already using them, then vglclient will try to obtain a free port in the range of 4200-4299. Failing that, it will request a free port from the operating system.

In a nutshell: if you only ever plan to run one X server at a time on your client machine, which means that you’ll only ever need one instance of vglclient at a time, then it is sufficient to open inbound port 4242 (and 4243 if you plan to use SSL) in your client machine’s firewall. If you plan to run multiple X servers on your client machine, which means that you will need to run multiple vglclient instances, then you may wish to open ports 4200-4299. Similarly, if you are running vglclient on a multi-user X proxy server that has a firewall, then you may wish to open ports 4200-4299 in the server’s firewall. Opening ports 4200-4299 will accommodate up to 100 separate vglclient instances (50 if OpenSSL support is enabled.) More instances than that cannot be accommodated on a firewalled machine, unless the firewall is able to create rules based on application executables instead of listening ports.

Note that it is not necessary to open any inbound ports in the firewall to use the VGL Transport with SSh Tunneling.
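The port selection order described in this section can be sketched as a small simulation. This is not vglclient's actual code: the function name pick_vglclient_port is hypothetical, the SSL port (4243) is ignored, and scanning the 4200-4299 range from lowest to highest is an assumption, since the text above only says that a free port in that range is chosen.

```shell
#!/bin/sh
# pick_vglclient_port "BUSY PORTS": print the port that a new vglclient
# instance would get, given a space-separated list of ports already in use.
# Simplifications (assumptions): SSL port 4243 is ignored, and the
# 4200-4299 range is scanned from lowest to highest.
pick_vglclient_port() {
    busy=" $1 "
    case "$busy" in
        *" 4242 "*) ;;                       # default port taken, keep looking
        *) echo 4242; return 0 ;;
    esac
    port=4200
    while [ "$port" -le 4299 ]; do
        case "$busy" in
            *" $port "*) ;;                  # in use, try the next one
            *) echo "$port"; return 0 ;;
        esac
        port=$((port + 1))
    done
    echo "os-assigned"                       # fall back to any free port
}

pick_vglclient_port ""            # prints 4242
pick_vglclient_port "4242"        # prints 4200
pick_vglclient_port "4242 4200"   # prints 4201
```

The fallback case is the reason that more than 100 instances cannot be accommodated behind a port-based firewall: the operating system may assign a port outside the opened range.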



8 Using VirtualGL with X Proxies Such as VNC

The VGL Transport is a good solution for using VirtualGL over a fast network. However, the VGL Transport is not generally suitable for high-latency or low-bandwidth networks, due to its reliance on the X11 protocol to send the non-3D elements of the 3D application’s GUI. The VGL Transport also requires an X server to be running on the client machine, which makes it generally more difficult to deploy and use, particularly on Windows clients. VirtualGL can be used with an “X proxy” to overcome these limitations. An X proxy acts as a virtual X server, receiving X11 commands from the application (and from VirtualGL), rendering the X11 commands into images, compressing the resulting images, and sending the compressed images over the network to a client or clients. X proxies perform well on all types of networks, including high-latency and low-bandwidth networks. They often provide rudimentary collaboration capabilities, allowing multiple clients to simultaneously view the same X session and pass around control of the keyboard and mouse. X proxies are also stateless, meaning that the client can disconnect and reconnect at will from any machine on the network, and the 3D application will remain running on the server.

Since VirtualGL is sending rendered 3D images to the X proxy at a very fast rate, the proxy must be able to compress the images very quickly in order to keep up. Unfortunately, however, most X proxies can’t. They simply aren’t designed to compress, with any degree of performance, the large and complex images generated by 3D applications. Therefore, the VirtualGL Project provides an optimized X proxy called “TurboVNC”, a variant of TightVNC that is designed specifically to achieve high levels of performance with VirtualGL. More information about TurboVNC, including instructions for using it with VirtualGL, can be found in the TurboVNC User’s Guide.

TigerVNC is a next-generation VNC project based on the RealVNC and Xorg code bases. TigerVNC spun off from the TightVNC project in early 2009, and the VirtualGL Project now actively participates in its development. TigerVNC uses the same high-speed JPEG codec as VirtualGL and TurboVNC (libjpeg-turbo), but as of this writing, TigerVNC’s performance is not quite yet on par with TurboVNC. The ultimate goal, however, is to replace TurboVNC with TigerVNC. TigerVNC is available in Fedora 11 or later.

Other solutions, such as RealVNC and NX, also work well with VirtualGL. Generally, none of these other solutions will provide anywhere near the performance of TurboVNC, but some of them have capabilities that TurboVNC lacks (NX, for instance, can do seamless windows.)

8.1 Using VirtualGL with an X Proxy on the Same Server

The most common (and optimal) way to use VirtualGL with an X proxy is to set up both on the same server. This allows VirtualGL to send its rendered 3D images to the X proxy through shared memory rather than sending them over a network.

x11transport

With this configuration, you can usually invoke

/opt/VirtualGL/bin/vglrun {application_executable_or_script}

from within an X proxy session, and it will “just work.” VirtualGL reads the value of the DISPLAY environment variable to determine whether to enable the X11 Transport by default. If DISPLAY begins with a colon (“:”) or with “unix:”, then VirtualGL will assume that the X server connection is local and will enable the X11 Transport as the default. In some cases, however, the DISPLAY environment variable within the X proxy may not begin with a colon or “unix:”. In these cases, it is necessary to manually enable the X11 Transport by setting the VGL_COMPRESS environment variable to proxy or by passing an argument of -c proxy to vglrun.
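The DISPLAY check described above can be sketched as a shell pattern match. The function name default_vgl_compress is hypothetical, and the sketch only approximates VirtualGL's behavior: it reports jpeg for any display that does not look local (VirtualGL enables JPEG compression by default when it detects a remote X server), and it does not model the Sun Ray special cases.

```shell
#!/bin/sh
# default_vgl_compress DISPLAY_VALUE: print the transport setting that
# would be chosen by default, following the rule described above.
# The Sun Ray special cases are not modeled.
default_vgl_compress() {
    case "$1" in
        :*|unix:*) echo proxy ;;   # local X server: X11 Transport
        *)         echo jpeg ;;    # remote X server: VGL Transport with JPEG
    esac
}

default_vgl_compress ":1"            # prints proxy
default_vgl_compress "unix:2"        # prints proxy
default_vgl_compress "localhost:10"  # prints jpeg
```

The last case shows why a manual override is sometimes needed: a display name such as localhost:10 refers to the local machine but does not begin with a colon or "unix:", so the X11 Transport would have to be requested explicitly with VGL_COMPRESS=proxy or -c proxy.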

8.2 Using VirtualGL with an X Proxy on a Different Machine

vgltransportservernetwork

If the X proxy and VirtualGL are running on different servers, then it is desirable to use the VGL Transport to send images from the VirtualGL server to the X proxy. It is also desirable to disable image compression in the VGL Transport. Otherwise, the images would have to be compressed by the VirtualGL server, decompressed by the VirtualGL Client, then recompressed by the X proxy, which is a waste of CPU resources. However, sending images uncompressed over a network requires a fast network (generally, Gigabit Ethernet or faster), so there needs to be a fast link between the VirtualGL server and the X proxy server for this procedure to perform well.

The procedure for using the VGL Transport to display 3D applications from a VirtualGL server to a remote X proxy is the same as the procedure for using the VGL Transport to display 3D applications from a VirtualGL server to a remote 2D X server, with the following exceptions:

  1. The “client” in this case is really the X proxy server.
  2. The “X server” is really the X proxy.
  3. Once connected to the VirtualGL server with SSh, it is recommended that you disable image compression in the VGL Transport by either setting the VGL_COMPRESS environment variable to rgb or passing an argument of -c rgb to vglrun when launching VirtualGL. Otherwise, VirtualGL will detect that the connection to the X server is remote, and it will automatically try to enable JPEG compression.



9 Support for the X Video Extension

The X Video extension allows applications to pre-encode or pre-compress images and send them through the X server to the graphics card, which presumably has on-board video decoding capabilities. This approach greatly reduces the amount of CPU resources used by the X server, which can be beneficial if the X server is running on a different machine than the application.

In the case of VirtualGL, what this means is that the VirtualGL client machine no longer has to decode or decompress images from the 3D application server. It can simply pass the images along to the graphics card for decoding.

VirtualGL supports the X Video extension in two ways:

9.1 YUV Encoding with the VGL Transport

Setting the VGL_COMPRESS environment variable to yuv or passing an argument of -c yuv to vglrun enables YUV encoding with the VGL Transport. When this mode is enabled, VirtualGL encodes images as YUV420P (a form of YUV encoding which uses 4X chrominance subsampling and separates Y, U, and V pixel components into separate image planes) instead of RGB or JPEG. The YUV420P images are sent to the VirtualGL Client, which draws them using the X Video extension.

On a per-frame basis, YUV encoding uses about half as much server CPU time as JPEG compression and only slightly more server CPU time than RGB encoding. It uses about 1/3 as much client CPU time as JPEG compression and about half as much client CPU time as RGB encoding. YUV encoding also uses about half as much network bandwidth (per frame) as RGB encoding.

However, since YUV encoding uses 4X chrominance subsampling, the resulting images may contain some visible artifacts. In particular, narrow, aliased lines and other sharp features may appear “soft”.
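The bandwidth figure quoted above follows from the number of bits needed per pixel. The sketch below assumes 8-bit samples (an assumption, not stated in this section): every pixel carries one Y sample, while the U and V planes each carry one sample per group of subsampled pixels. The function name yuv_bits_per_pixel is hypothetical.

```shell
#!/bin/sh
# yuv_bits_per_pixel FACTOR: average bits per pixel for planar YUV with
# 8-bit samples (assumed) and chrominance subsampled by FACTOR.
# Per FACTOR pixels: FACTOR Y samples + 1 U sample + 1 V sample.
yuv_bits_per_pixel() {
    echo $(( (8 * $1 + 16) / $1 ))
}

echo "YUV420P (4X subsampling): $(yuv_bits_per_pixel 4) bits/pixel"
echo "RGB (no subsampling):     $(yuv_bits_per_pixel 1) bits/pixel"
```

With 4X subsampling the result is 12 bits per pixel, half of the 24 bits per pixel needed when all three components are sent at full resolution.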

9.2 The XV Transport

Setting the VGL_COMPRESS environment variable to xv or passing an argument of -c xv to vglrun enables the XV Transport. The XV Transport is a special version of the X11 Transport which encodes images as YUV420P and draws them directly to the 2D X server using the X Video extension. This is mainly useful in conjunction with X proxies, such as the Sun Ray Server Software, that support the X Video extension. The idea is that if the X proxy is going to have to transcode the image to YUV anyhow, VirtualGL may be faster at doing this, since it has a SIMD-accelerated YUV encoder.



10 Using VirtualGL in a Sun Ray Environment

The Sun Ray technology from Sun Microsystems consists of a software component (the Sun Ray Server Software, or SRSS) and an ultra-thin hardware client (the Sun Ray Desktop Unit, or DTU). The SRSS receives connection requests from multiple DTUs and creates a separate instance of the Sun Ray X Server (an X proxy) for each user. These X proxy sessions can be seamlessly suspended/resumed or migrated from one DTU to another by the use of smartcards (every Sun Ray DTU has a built-in smartcard reader).

As with most X proxies, each instance of the Sun Ray X Server creates a virtual desktop with which the user can interact to launch applications. The Sun Ray X Server is responsible for compressing the images from the virtual desktop and sending the compressed images across the network to the user’s DTU using the proprietary ALP protocol. The images are converted to YUV with up to 16X chrominance subsampling and can be further compressed using Differential Pulse Code Modulation (DPCM) or wavelets.

If VirtualGL is installed and run on a Sun Ray Server, then the 3D images from VirtualGL can be displayed to a Sun Ray X Server instance using the X11 Transport, as with any other X proxy. However, most Sun Ray Servers are deployed as 2D application servers, and thus they may have dozens of users connecting to them at any given time. 3D applications are very demanding of system resources, so running these applications on a Sun Ray Server that is shared by many 2D application users is not a recommended approach.

To get around this problem, VirtualGL 2.0.x and 2.1.x used a proprietary plugin provided by Sun which sent images directly from the 3D application server to the Sun Ray DTU, thus bypassing the Sun Ray Server altogether.

Figure 10.1: The Sun Ray Transport (Provided by a Proprietary Plugin to VirtualGL 2.0.x and 2.1.x)


When the Sun Visualization System product was discontinued in early 2009, the proprietary VirtualGL Sun Ray plugin was discontinued along with it. Although sending pre-compressed images directly to the DTU was advantageous in that it eliminated almost all of the CPU and network load from the Sun Ray Server, it was not without its problems. Handling window clipping and throttling the bandwidth to avoid dropped UDP packets were among these. With the release of SRSS 4.1, Sun implemented the X Video extension in the Sun Ray X Server, which allows applications to pre-encode YUV images and send these through the Sun Ray X Server to the DTU without any transcoding.

VirtualGL 2.2 supports this mechanism through the use of the XV Transport and YUV encoding with the VGL Transport (see Chapter 9.) If VirtualGL detects that the 2D X server is a Sun Ray X Server instance, then it will automatically enable the XV transport if the X server connection is local or YUV encoding with the VGL Transport if the X server is remote. The remote case is illustrated in the figure below.

Figure 10.2: YUV Encoding with the VGL Transport in a Sun Ray Environment


There are trade-offs with using X Video vs. using the Sun Ray Transport. Whereas the X Video mechanism still uses very little CPU time on the Sun Ray Server, it requires a great deal of network bandwidth. The Sun Ray X Video implementation only supports up to 4X chrominance subsampling, which means that a 2:1 compression ratio is the best that one can achieve. This means that each pixel sent through the Sun Ray server will require approximately 24 bits to represent it on the network (12 bits coming in from the 3D server, and another 12 going out to the Sun Ray DTU.) To sustain a 1280x1024 image stream at 8 frames/second (generally the maximum sustainable performance for that size image on Sun Ray 2 hardware) would require 250 Megabits/second of bandwidth. In general, the Sun Ray Server would need to be provisioned with multiple Gigabit Ethernet adapters in this case.
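The 250 Megabits/second figure can be checked with shell arithmetic. The function name stream_bits is hypothetical; the inputs (1280x1024 pixels, 8 frames/second, 24 bits per pixel through the Sun Ray Server) are taken from the paragraph above.

```shell
#!/bin/sh
# stream_bits WIDTH HEIGHT FPS BITS_PER_PIXEL: bits per second needed to
# carry an image stream of the given size and frame rate.
stream_bits() {
    echo $(( $1 * $2 * $3 * $4 ))
}

bits=$(stream_bits 1280 1024 8 24)
echo "$bits bits/second, or about $((bits / 1000000)) Megabits/second"
```

This prints 251658240 bits/second, or about 251 Megabits/second, which matches the rounded figure above. Half of that load is inbound from the 3D server and half is outbound to the DTU.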



11 Transport Plugins

VirtualGL 2.2 includes an API which allows you to write your own image transports. Thus, you can use VirtualGL for doing split rendering and pixel readback but then use your own library for delivering the pixels to the client.

When the VGL_TRANSPORT environment variable (or the -trans option to vglrun) is set to {t}, then VirtualGL will look for a DSO (dynamic shared object) with the name libtransvgl_{t}.so in the dynamic linker path and will attempt to access a set of API functions from this library. The functions that the plugin library must export are defined in /opt/VirtualGL/include/rrtransport.h, and an example of their usage can be found in rr/testplugin.cpp and rr/testplugin2.cpp in the VirtualGL source distribution. The former wraps the VGL Transport as an image transport plugin, and the latter does the same for the X11 Transport.
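As a small illustration of the naming rule above, the sketch below derives the library name that VirtualGL would look for from a transport name. The function name transport_plugin_name and the transport name "mytransport" are hypothetical; the vglrun command is printed rather than run.

```shell
#!/bin/sh
# transport_plugin_name NAME: print the DSO name that corresponds to a
# VGL_TRANSPORT (or vglrun -trans) value of NAME.
transport_plugin_name() {
    echo "libtransvgl_$1.so"
}

# "mytransport" is a made-up transport name used only for illustration.
echo "Library searched for: $(transport_plugin_name mytransport)"
echo "Command: /opt/VirtualGL/bin/vglrun -trans mytransport {application_executable_or_script}"
```

The library must be somewhere in the dynamic linker path for the lookup to succeed.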



12 Using VirtualGL with setuid/setgid Executables

vglrun can be used to launch either binary executables or shell scripts, but there are a few things to keep in mind when using vglrun to launch a shell script. When you vglrun a shell script, the VirtualGL faker library will be preloaded into every executable that the script launches. Normally this is innocuous, but if the script calls any executables that have the setuid and/or setgid permission bits set, then the dynamic linker will refuse to preload the VirtualGL faker library into those executables. One of the following warnings will be printed out for each setuid/setgid executable that the script tries to launch:

Linux
ERROR: ld.so: object 'librrfaker.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libdlfaker.so' from LD_PRELOAD cannot be preloaded: ignored.
Solaris
ld.so.1: warning: librrfaker.so: open failed: No such file in secure directories
ld.so.1: warning: libdlfaker.so: open failed: No such file in secure directories

On Solaris and on versions of Linux with GLIBC 2.3 and later, the executable will continue to run, but without VirtualGL preloaded into it. This is a problem if the setuid/setgid executable is a 3D application that was intended to be used with VirtualGL.

There are a couple of ways to work around this issue. If the 3D application that you intend to run with VirtualGL is not itself a setuid/setgid executable, then probably the safest workaround is to edit the application script so that it saves and clears the LD_PRELOAD environment variable(s) before the setuid/setgid executable is launched and restores them right before the application executable is launched. For instance, consider the following application script:

Initial contents of application.sh:

#!/bin/sh
some_setuid_executable
some_3D_application_executable

You could modify the script as follows:

Solaris
Modified application.sh:
#!/bin/sh
LD_PRELOAD_32_SAVE=$LD_PRELOAD_32
LD_PRELOAD_64_SAVE=$LD_PRELOAD_64
LD_PRELOAD_32=
LD_PRELOAD_64=
export LD_PRELOAD_32 LD_PRELOAD_64

some_setuid_executable

LD_PRELOAD_32=$LD_PRELOAD_32_SAVE
LD_PRELOAD_64=$LD_PRELOAD_64_SAVE
export LD_PRELOAD_32 LD_PRELOAD_64

some_3D_application_executable
Linux
Modified application.sh:
#!/bin/sh
LD_PRELOAD_SAVE=$LD_PRELOAD
LD_PRELOAD=
export LD_PRELOAD

some_setuid_executable

LD_PRELOAD=$LD_PRELOAD_SAVE
export LD_PRELOAD

some_3D_application_executable

You can also force VirtualGL to be preloaded into setuid/setgid executables, but please be aware of the security ramifications of this before you do it. By applying one of the following workarounds, you are essentially telling the operating system that you trust the security and stability of the VirtualGL code as much as you trust the security and stability of the operating system. And while we’re flattered, we’re not sure that we’re necessarily deserving of that accolade, so if you are in a security critical environment, apply the appropriate level of paranoia here.

To force VirtualGL to be preloaded into setuid/setgid executables on Linux, make librrfaker.so and libdlfaker.so setuid executables. To do this, run the following commands as root:

chmod u+s /usr/lib/librrfaker.so
chmod u+s /usr/lib/libdlfaker.so

On 64-bit Linux systems, also run:

chmod u+s /usr/lib64/librrfaker.so
chmod u+s /usr/lib64/libdlfaker.so

On Solaris, you can force VirtualGL to be preloaded into setuid/setgid executables by adding the VirtualGL library directories to the Solaris “secure path.” Solaris keeps a tight lid on what goes into /usr/lib and /lib, and by default, it will only allow libraries in those paths to be preloaded into an executable that is setuid and/or setgid. Generally, 3rd party packages are forbidden from installing anything into /usr/lib or /lib, but you can use the crle utility to add other directories to the operating system’s list of secure paths. In the case of VirtualGL, you would execute the following commands (as root):

crle -u -s /opt/VirtualGL/lib
crle -64 -u -s /opt/VirtualGL/lib/64

vglrun on Solaris has two additional options that are relevant to launching scripts:

vglrun -32 {script}

will preload VirtualGL only into 32-bit executables called by a script, whereas

vglrun -64 {script}

will preload VirtualGL only into 64-bit executables. So if, for instance, the setuid executable that the script is invoking is 32-bit and the application executable is 64-bit, then you could use vglrun -64 to launch the application script.



13 Using VirtualGL with Chromium

Chromium is a powerful framework for performing various types of parallel OpenGL rendering. It is usually used on clusters of commodity Linux PCs to divide up the task of rendering scenes with large geometries or large pixel counts (such as when driving a display wall). Chromium is most often used in one of three configurations:

  1. Sort-First Rendering (Image-Space Decomposition)
  2. Sort-First Rendering (Image-Space Decomposition) with Readback
  3. Sort-Last Rendering (Object-Space Decomposition)

13.1 Configuration 1: Sort-First Rendering (Image-Space Decomposition)

chromium-displaywall

Sort-First Rendering (Image-Space Decomposition) is used to overcome the fill rate limitations of individual graphics cards. When configured to use sort-first rendering, Chromium divides up the scene based on which polygons will be visible in a particular section of the final image. It then instructs each node of the cluster to render only the polygons that are necessary to generate the image section (“tile”) for that node. This is primarily used to drive high-resolution displays that would be impractical to drive from a single graphics card due to limitations in the card’s framebuffer memory, processing power, or both. Configuration 1 could be used, for instance, to drive a CAVE, video wall, or even an extremely high-resolution monitor. In this configuration, each Chromium node generally uses all of its screen real estate to render a section of the multi-screen image.

VirtualGL is generally not very useful with Configuration 1. You could theoretically install a separate copy of VirtualGL on each display node and use it to redirect the output of each crserver instance to a separate VirtualGL Client instance running on a multi-screen 2D X server elsewhere on the network. However, synchronizing the frames on the remote end would require extensive modifications to VirtualGL and perhaps to Chromium as well. Such is left as an exercise for the reader.

13.2 Configuration 2: Sort-First Rendering (Image-Space Decomposition) with Readback

chromium-sortfirst

Configuration 2 uses the same sort-first principle as Configuration 1, except that each tile is only a fraction of a single screen, and the tiles are recombined into a single window on Node 0. This configuration is perhaps the least often used of the three, but it is useful in cases where the scene contains a large number of textures (such as in volume rendering) and thus rendering the whole scene on a single node would be prohibitively slow due to fill rate limitations.

In this configuration, the application is allowed to choose a visual, create an X window, and manage the window as it would normally do. However, all other OpenGL and GLX activity is intercepted by the Chromium App Faker (CrAppFaker) so that the 3D rendering can be split up among the rendering nodes. Once each node has rendered its section of the final image, the image tiles are passed back to a Chromium Server (CrServer) process running on Node 0. This CrServer process attaches to the previously-created application window and draws the pixels into the window using glDrawPixels().

The general strategy for making this work with VirtualGL is to first make it work without VirtualGL and then insert VirtualGL only into the processes that run on Node 0. VirtualGL must be inserted into the CrAppFaker process to prevent CrAppFaker from sending glXChooseVisual() calls to the 2D X server (which would fail if this X server was a VNC session or otherwise did not support GLX.) VirtualGL must be inserted into the CrServer process on Node 0 to prevent it from sending glDrawPixels() calls to the 2D X server (which would similarly fail if the 2D X server didn’t support GLX and which would create a performance issue if the 2D X server was remote.) Instead, VirtualGL forces CrServer to draw into a Pbuffer, and VGL then takes charge of transmitting the pixels from the Pbuffer to the 2D X server in the most efficient way possible.

As with any normal OpenGL application, CrServer can be launched using vglrun. However, because CrAppFaker also interposes OpenGL and GLX functions, it must be handled differently in order to avoid interference with VirtualGL. Chromium provides an environment variable, CR_SYSTEM_GL_PATH, which allows one to specify an alternate path to be searched for libGL.so. The VirtualGL packages for Linux and Solaris include a symbolic link named libGL.so, which points to the VirtualGL faker library (librrfaker.so). This symbolic link is located in its own isolated directory, so that directory can be passed to Chromium in the CR_SYSTEM_GL_PATH environment variable, and this will cause Chromium to load VirtualGL rather than the “real” OpenGL library. Refer to the following table:

32-bit Applications        64-bit Applications
/opt/VirtualGL/fakelib     /opt/VirtualGL/fakelib/64

Table: CR_SYSTEM_GL_PATH setting required to use VirtualGL with Chromium

To run CrAppFaker, it is necessary to set this environment variable to the appropriate value so that Chromium will load the interposed versions of OpenGL and GLX functions from VirtualGL. It is also necessary to set VGL_GLLIB to the location of the “real” OpenGL library (example: /usr/lib/libGL.so.1). CrAppFaker creates its own fake version of libGL.so, which is really just a copy of Chromium’s libcrfaker.so. Thus, if left to its own devices, VirtualGL will unwittingly try to load libcrfaker.so instead of the “real” OpenGL library. Chromium’s libcrfaker.so will, in turn, try to load VirtualGL, and an endless loop will occur.

Therefore, we must use the CR_SYSTEM_GL_PATH environment variable to tell Chromium to pass OpenGL commands into VirtualGL, then we must use the VGL_GLLIB environment variable to tell VirtualGL not to pass OpenGL commands into Chromium. For example:

export CR_SYSTEM_GL_PATH=/opt/VirtualGL/fakelib
export VGL_GLLIB=/usr/lib/libGL.so.1
crappfaker

CrAppFaker will copy the application into a temporary directory and then copy libcrfaker.so to that same directory, renaming it as libGL.so. So, when the application is started, it loads libcrfaker.so instead of libGL.so. libcrfaker.so will then load VirtualGL instead of the “real” OpenGL library, because we’ve overridden CR_SYSTEM_GL_PATH to point to VirtualGL’s fake libGL.so. VirtualGL will then use the library specified in VGL_GLLIB to make any “real” OpenGL calls that it needs to make.

NOTE: crappfaker should not be invoked with vglrun.

So, putting this all together, here is an example of how you might start a sort-first rendering job using Chromium and VirtualGL:

  1. Start the mothership on Node 0 with an appropriate configuration for performing sort-first rendering with readback
  2. Start crserver on each of the rendering nodes

    NOTE: crserver should be run on display :0 (or whichever display is attached to the 3D hardware.)

  3. On Node 0, vglrun crserver &
  4. On Node 0, set the CR_SYSTEM_GL_PATH environment variable to the appropriate value based on whether crappfaker was compiled as a 32-bit or a 64-bit app (see table above)
  5. On Node 0, set VGL_GLLIB to the location of the “real” OpenGL library (example: /usr/lib/libGL.so.1 or /usr/lib64/libGL.so.1).
  6. On Node 0, launch crappfaker (do not use vglrun here)

Again, it’s always a good idea to make sure this works without VirtualGL before adding VirtualGL into the mix.
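The Node 0 portion of the procedure above (steps 3 through 6) can be sketched as a small Python helper. This is only an illustrative sketch: the function names chromium_vgl_env and node0_commands are hypothetical, the fakelib directories are the defaults from the table above, and the location of the “real” OpenGL library is an assumption that varies from system to system.

```python
import os

# Default VirtualGL fakelib directories, as listed in the table above.
FAKELIB = {32: "/opt/VirtualGL/fakelib", 64: "/opt/VirtualGL/fakelib/64"}


def chromium_vgl_env(bits, real_libgl, base_env=None):
    """Return a copy of the environment with the variables CrAppFaker needs.

    bits       -- 32 or 64, matching how crappfaker was compiled
    real_libgl -- path to the "real" OpenGL library (an assumption; it
                  differs between systems, e.g. /usr/lib/libGL.so.1)
    """
    if bits not in FAKELIB:
        raise ValueError("bits must be 32 or 64")
    env = dict(os.environ if base_env is None else base_env)
    # Tell Chromium to load VirtualGL's fake libGL.so ...
    env["CR_SYSTEM_GL_PATH"] = FAKELIB[bits]
    # ... and tell VirtualGL where the real OpenGL library lives, so that it
    # does not pick up Chromium's renamed libcrfaker.so by mistake.
    env["VGL_GLLIB"] = real_libgl
    return env


def node0_commands():
    """Commands for Node 0 in launch order (hypothetical helper).

    crserver is wrapped in vglrun; crappfaker deliberately is not.
    """
    return [["vglrun", "crserver"], ["crappfaker"]]


# Print the equivalent shell commands for a 64-bit crappfaker.
example_env = chromium_vgl_env(64, "/usr/lib64/libGL.so.1", base_env={})
for name in ("CR_SYSTEM_GL_PATH", "VGL_GLLIB"):
    print(f"export {name}={example_env[name]}")
for command in node0_commands():
    print(" ".join(command))
```

To actually launch the processes, the returned environment could be passed to subprocess.Popen(command, env=env) for the crappfaker command only.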

Using VirtualGL to Force Pbuffer Rendering

In the procedure above, VirtualGL can also be used on the rendering nodes to redirect the rendering commands from crserver into a Pbuffer instead of a window. If you wish to do this, then perform the following procedure in place of step 2 above:

On each of the rendering nodes,

13.3 Configuration 3: Sort-Last Rendering (Object-Space Decomposition)

Sort-Last Rendering is used when the scene contains a huge number of polygons and the rendering bottleneck is processing all of that geometry on a single graphics card. In this case, each node runs a separate copy of the application, and for best results, the application needs to be aware that it is running in a parallel environment so that it can give Chromium hints as to how to distribute the various objects to be rendered. Each node generates an image of a particular portion of the object space, and these images must be composited in such a way that the front-to-back ordering of pixels is maintained. This is generally done by collecting Z buffer data from each node to determine whether a particular pixel on a particular node is visible in the final image. The rendered images from each node are often composited using a “binary swap”, whereby the nodes combine their images in a cascading tree so that the overall compositing time is proportional to log2(N) rather than N.
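The depth-based compositing described above can be illustrated with a short Python sketch. It is not Chromium's implementation: for simplicity it uses a pairwise tree reduction of whole images rather than a true binary swap (which exchanges image halves between nodes), and the function names and the (color, depth) pixel layout are assumptions made for the example. It does show why the number of compositing rounds grows with log2(N).

```python
def composite_pair(a, b):
    """Merge two partial renderings pixel by pixel.

    Each rendering is a list of (color, depth) tuples. The pixel with the
    smaller depth (closer to the viewer) wins.
    """
    return [pa if pa[1] <= pb[1] else pb for pa, pb in zip(a, b)]


def composite_tree(images):
    """Combine N renderings pairwise. Returns (final_image, rounds)."""
    rounds = 0
    while len(images) > 1:
        merged = [composite_pair(images[i], images[i + 1])
                  for i in range(0, len(images) - 1, 2)]
        if len(images) % 2:  # the odd image out advances unchanged
            merged.append(images[-1])
        images = merged
        rounds += 1
    return images[0], rounds


FAR = float("inf")  # depth of a pixel that a node did not draw

# Four nodes, each rendering part of the object space into a 3-pixel image.
nodes = [
    [("red", 1.0), (None, FAR), (None, FAR)],
    [("green", 0.5), ("green", 2.0), (None, FAR)],
    [(None, FAR), ("blue", 1.5), (None, FAR)],
    [(None, FAR), (None, FAR), ("white", 3.0)],
]
final, rounds = composite_tree(nodes)
print([color for color, _ in final], "in", rounds, "rounds")
```

With four nodes the tree needs two rounds, and with eight nodes it needs three, whereas collecting every image on one node would take N - 1 sequential merges.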

To make this configuration work with VirtualGL:

  1. Start the mothership on Node 0 with an appropriate configuration for performing sort-last rendering
  2. Start crappfaker on each of the rendering nodes

    NOTE: crappfaker should be run on display :0 (or whichever display is attached to the 3D hardware.)

  3. On Node 0, vglrun crserver

CRUT

The Chromium Utility Toolkit provides a convenient way for graphics applications to specifically take advantage of Chromium’s sort-last rendering capabilities. Such applications can use CRUT to explicitly specify how their object space should be decomposed. CRUT applications require an additional piece of software, crutserver, to be running on Node 0. Therefore, the following procedure should be used to make these applications work with VirtualGL:

  1. Start the mothership on Node 0 with an appropriate configuration for performing sort-last rendering
  2. Start crappfaker on each of the rendering nodes

    NOTE: crappfaker should be run on display :0 (or whichever display is attached to the 3D hardware.)

  3. On Node 0, vglrun crutserver &
  4. On Node 0, vglrun crserver

13.4 A Note About Performance

Chromium’s use of X11 is generally far from optimal. It assumes a very fast connection between the 2D X server and the Chromium Server. In certain modes, Chromium polls the 2D X server on every frame to determine whether windows have been resized, etc. Thus, we have observed that, even on a fast network, Chromium tends to perform much better with VirtualGL running in an X proxy as opposed to using the VGL Transport.



14 Using VirtualGL with VirtualBox

VirtualBox is an enterprise-class, open source virtualization solution provided by Sun Microsystems. With the release of VirtualBox 2.1.0, experimental support was added for hardware-accelerated OpenGL in Windows and Linux guests running on Windows, MacIntel, Linux, and Solaris/x86 hosts. 3D acceleration in VirtualBox is accomplished by installing a special driver in the guest which uses a subset of Chromium to transmit OpenGL calls through a local connection to the VirtualBox process running on the host. When used in conjunction with VirtualGL on a Linux or Solaris/x86 host, this solution provides a means of displaying Windows 3D applications remotely.

To use VirtualGL with VirtualBox, perform the following procedures:

Configuring the System

  1. Launch VirtualBox and use the VirtualBox GUI to create and test your virtual machine.
  2. Follow the procedures outlined in the VirtualBox User’s Manual to enable 3D acceleration on the virtual machine. Verify that 3D acceleration works without VirtualGL before adding VirtualGL to the mix.
  3. Follow the procedure described in Chapter 12 to make librrfaker.so and libdlfaker.so setuid executables (Linux) or to add the VirtualGL library directory to the list of secure paths (Solaris).

Launching VirtualBox (Method 1)

This should work on most Linux systems.

  1. vglrun VirtualBox -startvm {VM name or UUID}

Launching VirtualBox (Method 2)

If the above does not work, then try the following alternate method:

  1. If running 32-bit VirtualBox,
    export CR_SYSTEM_GL_PATH=/opt/VirtualGL/fakelib
    
    If running 64-bit VirtualBox,
    export CR_SYSTEM_GL_PATH=/opt/VirtualGL/fakelib/64
    
  2. vglrun -nodl VirtualBox -startvm {VM name or UUID}

NOTES



15 Other Application Recipes

Application: Abaqus v6
Platform: Linux
Recipe: It is necessary to add

import os
os.environ['ABAQUS_EMULATE_OVERLAYS'] = "1"

to /{abaqus_install_dir}/{abaqus_version}/site/abaqus_v6.env to make Abaqus v6 work properly with VirtualGL in an X proxy environment. If this is not done, then the application may fail to launch, fail to display the 3D pixels, or the 3D pixels may become corrupted whenever other windows obscure them.
Notes: VirtualGL does not redirect the rendering of transparent overlays, since those cannot be rendered in a Pbuffer. Thus, in order to use transparent overlays, the 2D X Server must be able to render them, which is never the case for X proxies (see Section 16.2 for more details.) Setting ABAQUS_EMULATE_OVERLAYS to 1 causes the application to emulate overlay rendering instead of using actual transparent overlays.

Application: Abaqus v6
Platform: Linux
Recipe: vglrun -nodl {abaqus_path}/abaqus
Notes: User reports indicate that Abaqus 6.9 will not work properly if libdlfaker.so from VirtualGL is preloaded into it. This may be true for other versions of Abaqus as well.

Application: Animator 4
Platform: Linux
Recipe: Comment out the line that reads

unsetenv LD_PRELOAD

in the a4 script, then launch Animator 4 using

vglrun -ge a4

Notes: When the a4 script unsets LD_PRELOAD, this prevents VirtualGL from being loaded into the application. Animator 4 additionally checks the value of LD_PRELOAD and attempts to unset it from inside the application. Using vglrun -ge to launch the application fools Animator 4 into thinking that LD_PRELOAD is unset.

Application: ANSA v12.1.0
Platform: Linux
Recipe: Add

LD_PRELOAD_SAVE=$LD_PRELOAD
export LD_PRELOAD=

to the top of the ansa.sh script, then add

export LD_PRELOAD=$LD_PRELOAD_SAVE

just prior to the ${ANSA_EXEC_DIR}bin/ansa_linux${ext2} line.
Notes: The ANSA startup script directly invokes /lib/libc.so.6 to query the glibc version. Since the VirtualGL faker depends on libc, preloading VirtualGL when directly invoking libc.so.6 creates an infinite loop. So it is necessary to disable the preloading of VirtualGL in the application script and then re-enable it prior to launching the actual application.

Application: Ansoft HFSS, Roxar RMS
Platform: Linux
Recipe: Set the VGL_SPOILLAST environment variable to 0 prior to launching the application with vglrun.
Notes: These applications use double buffering and draw geometry to the back buffer, but node highlighting and rubber banding are drawn directly to the front buffer. In order for these front-buffer operations to be displayed properly, it is necessary to use the “spoil first” frame spoiling algorithm whenever the application calls glFlush(). See Section 19.1 for more details.

Application: AutoForm v4.0x
Platform: All
Recipe: vglrun +sync xaf_{version}
Notes: AutoForm relies on mixed X11/OpenGL rendering, and thus certain features (particularly the “Dynamic Section” dialog and “Export Image” feature) do not work properly unless VGL_SYNC is enabled. Since VGL_SYNC automatically enables the X11 transport and disables frame spoiling, it is highly recommended that you use an X proxy when VGL_SYNC is enabled. See Section 19.1 for more details.

Application: Cedega v6.0.x
Platform: Linux
Recipe: Add

export LD_PRELOAD=librrfaker.so

to the top of ~/.cedega/.winex_ver/winex-{version}/bin/winex3, then run Cedega as you would normally (without vglrun.) Since vglrun is not being used, it is necessary to use environment variables or the VirtualGL Configuration dialog to modify VirtualGL’s configuration.
Notes: The actual binary (WineX) which uses OpenGL is buried beneath several layers of Python and shell scripts. The LD_PRELOAD variable does not get propagated down from the initial shell that invoked vglrun.

Application: Heretic II
Platform: Linux
Recipe: vglrun heretic2 +set vid_ref glx

Application: Mathematica 7
Platform: Linux
Recipe: Set the VGL_ALLOWINDIRECT environment variable to 1 prior to launching the application with vglrun.
Notes: Mathematica 7 will not draw the axis numbers on 3D charts correctly unless it is allowed to create an indirect OpenGL context. See Section 19.1 for more details.



16 Advanced OpenGL Features

16.1 Stereographic Rendering

Stereographic rendering is a feature of OpenGL that creates separate rendering buffers for the left and right eyes and allows the application to render a different image into each buffer. How the stereo images are subsequently displayed depends on the particulars of the 3D hardware and the user’s environment. VirtualGL can support stereographic applications in one of two ways: (1) by sending the stereo image pairs to the client to be displayed in stereo by the client’s 3D graphics card, or (2) by combining each stereo image pair into a single anaglyph that can be viewed with traditional red/cyan 3D glasses.

16.1.1 Quad-Buffered Stereo

The name “quad-buffered” stereo derives from the fact that OpenGL uses four buffers (left front, right front, left back, and right back) to support stereographic rendering with double buffering. 3D graphics cards with quad-buffered stereo capabilities generally provide some sort of synchronization signal that can be used to control various types of active stereo 3D glasses. Some also support “passive stereo”, which requires displaying the left and right eye buffers to different monitor outputs. VirtualGL supports true quad-buffered stereo by rendering the stereo images on the server and sending the image pairs across the network to be displayed on the client.

In most cases, VirtualGL does not require a 3D graphics card to be present in the client machine. However, a 3D graphics card is required to display stereo image pairs, so such a card must be present in any client machine that will use VirtualGL’s quad-buffered stereo feature. Since the 3D graphics card is only being used to draw images, it need not necessarily be a high-end card. Generally, the least expensive 3D graphics card that has stereo capabilities will work fine in a VirtualGL client machine. The VirtualGL server must also have a 3D graphics card that supports stereo, since this is the only way that VirtualGL can obtain a stereo Pbuffer.

When an application tries to render something in stereo, VirtualGL will default to using quad-buffered stereo rendering if the 2D X server supports OpenGL and has stereo visuals available (Exceed 3D is required for Windows clients.) Otherwise, VirtualGL will fall back to using anaglyphic stereo (see below.) It is usually necessary to explicitly enable stereo in the graphics driver configuration for both the client and server machines. The Troubleshooting section below lists a way to verify that both the 3D X server and the 2D X server have stereo visuals available.

In quad-buffered mode, VirtualGL reads back both the left and right eye buffers on the server and sends the contents as a pair of compressed images to the VirtualGL Client. The VirtualGL Client then decompresses both images and draws them as a single stereo frame to the client machine’s X display using glDrawPixels(). It should thus be no surprise that enabling quad-buffered stereo in VirtualGL decreases performance by 50% or more and uses twice the network bandwidth to maintain the same frame rate as mono.

Quad-buffered stereo requires the VGL Transport. If any other image transport is used, then VGL will fall back to anaglyphic stereo mode.

16.1.2 Anaglyphic Stereo

Anaglyphic stereo is the type of stereographic display used by old 3D movies. It generally relies on a set of 3D glasses consisting of red transparency film over the left eye and cyan transparency film over the right eye. To generate a 3D anaglyph, the red color data from the left eye buffer is combined with the green and blue color data from the right eye buffer, thus allowing a single monographic image to contain stereo data. From the point of view of VirtualGL, an anaglyphic image is the same as a monographic image, so anaglyphic stereo images can be sent using any image transport to any type of client, regardless of the client’s capabilities.
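The channel mixing described above is simple enough to sketch in a few lines of Python. This is an illustration of the red/cyan combination only, not VirtualGL's code. The function name make_anaglyph and the representation of each eye buffer as a list of (r, g, b) tuples are assumptions made for the example.

```python
def make_anaglyph(left, right):
    """Combine two eye buffers into one red/cyan anaglyph.

    left, right -- equal-length lists of (r, g, b) tuples.
    The red channel comes from the left eye; green and blue come from
    the right eye.
    """
    if len(left) != len(right):
        raise ValueError("eye buffers must be the same size")
    return [(l[0], r[1], r[2]) for l, r in zip(left, right)]


left_eye = [(255, 10, 20), (100, 100, 100)]
right_eye = [(1, 2, 3), (50, 60, 70)]
print(make_anaglyph(left_eye, right_eye))
```

The result has the same size and format as a single monographic image, which is why it can travel over any image transport.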

VirtualGL uses anaglyphic stereo if it detects that an application has rendered something in stereo but quad-buffered stereo is not available, either because the client doesn’t support it or because a transport other than the VGL Transport is being used. Anaglyphic stereo provides a cheap and easy way to view stereographic applications in X proxies and on clients that do not support quad-buffered stereo. Additionally, anaglyphic stereo performs much faster than quad-buffered stereo, since it does not require sending twice the data to the client.

As with quad-buffered stereo, anaglyphic stereo requires that the VirtualGL server have stereo rendering capabilities. However, anaglyphic stereo does not require any 3D rendering capabilities (stereo or otherwise) on the client machine.

16.1.3 Selecting a Stereo Mode

A particular stereo mode can be selected by setting the VGL_STEREO environment variable or by using the -st argument to vglrun. See Section 19.1 for more details.

16.2 Transparent Overlays

Transparent overlays have requirements and restrictions similar to those of quad-buffered stereo. In this case, VirtualGL completely bypasses its own GLX faker and uses indirect OpenGL rendering to render the transparent overlay on the client machine’s 3D graphics card. The underlay is still rendered on the server, as always. Using indirect rendering to render the overlay is unfortunately necessary, because there is no reliable way to draw to an overlay using 2D (X11) functions, there are severe performance issues (on some cards) with using glDrawPixels() to draw to the overlay, and there is no reasonable way to composite the overlay and underlay on the VirtualGL server.

The use of overlays is becoming more and more infrequent, and when they are used, it is generally only for drawing small, simple, static shapes and text. We have found that it is often faster to ship the overlay geometry over to the client rather than to render it as an image and send the image. Thus, even if it were possible to implement overlays without using indirect rendering, it is likely that indirect rendering of overlays would still be the fastest approach for most applications.

As with quad-buffered stereo, overlays must be explicitly enabled in the graphics driver configuration. In the case of overlays, however, they need only be supported and enabled on the client machine. Some graphics drivers are known to disallow using both quad-buffered stereo and overlays at the same time.

Indexed color (8-bit) overlays have been tested and are known to work with VirtualGL. True color (24-bit) overlays work, in theory, but have not been tested. Use glxinfo (see Troubleshooting below) to verify whether your client’s X display supports overlays and whether they are enabled. In Exceed 3D, make sure that the “Overlay Support” option is checked in the “Exceed 3D and GLX” applet.

Overlays do not work with X proxies. VirtualGL must be displaying to a “real” X server.

16.3 Indexed (PseudoColor) Rendering

In a PseudoColor visual, each pixel is represented by an index which refers to a location in a color table. The color table stores the actual color values (256 of them in the case of 8-bit PseudoColor) that correspond to each index. An application merely tells the X server which color index to use when drawing, and the X server takes care of mapping that index to an actual color from the color table. OpenGL allows for rendering to PseudoColor visuals, and it does so by being intentionally ignorant of the relationship between indices and actual colors. As far as OpenGL is concerned, each color index value is just a meaningless number, and it is only when the final image is drawn by the X server that these numbers take on meaning. As a result, many pieces of OpenGL’s core functionality either have undefined behavior or do not work at all with PseudoColor rendering. PseudoColor rendering used to be a common technique for visualizing scientific data, because such data often only contained 8 bits per sample to begin with. Applications could manipulate the color table to allow the user to dynamically control the relationship between sample values and colors. As more and more graphics cards drop support for PseudoColor rendering, however, the applications that use it are becoming a vanishing breed.

VirtualGL supports PseudoColor rendering if a PseudoColor visual is available on the 2D X server or X proxy. A PseudoColor visual need not be present on the application server. On the application server, VirtualGL uses the red channel of a standard RGB Pbuffer to store the color index. Upon receiving an end of frame trigger, VirtualGL reads back the red channel of the Pbuffer and uses XPutImage() to draw the color indices into the appropriate X window. To put this another way, PseudoColor rendering in VirtualGL always uses the X11 Transport. However, since there is only 1 byte per pixel in a PseudoColor “image”, the images can still be sent to the client reasonably quickly even though they are uncompressed.
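The following Python sketch illustrates the idea of carrying color indices in the red channel and resolving them through a color table only at display time. It models the data flow only. The function names, the (r, g, b) tuple layout, and the tiny four-entry color table are assumptions made for the example, and the table lookup stands in for what the X server does when it displays a PseudoColor window.

```python
def read_indices(rgb_pixels):
    """Keep only the red channel, which carries the color index."""
    return bytes(pixel[0] for pixel in rgb_pixels)


def apply_color_table(indices, color_table):
    """Map each index to an actual color, as the display side would."""
    return [color_table[i] for i in indices]


# A rendered "Pbuffer" of four pixels. Only the red channel is meaningful.
pbuffer = [(0, 0, 0), (2, 0, 0), (1, 0, 0), (3, 0, 0)]

# Hypothetical color table that the application could change at any time.
color_table = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]

indices = read_indices(pbuffer)
print(len(indices), "bytes sent for", len(pbuffer), "pixels")
print(apply_color_table(indices, color_table))
```

Because only one byte per pixel is transferred, the index image is a third the size of the equivalent uncompressed 24-bit RGB image.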

VirtualGL’s PseudoColor rendering mode works with X proxies, provided that the X proxy is configured to use an 8-bit color depth. Note, however, that VNC cannot provide both PseudoColor and TrueColor visuals at the same time.

16.4 Troubleshooting

VirtualGL includes a modified version of glxinfo that can be used to determine whether or not the client and server have stereo, overlay, or PseudoColor visuals enabled.

Run the following command sequence on the VirtualGL server to determine whether the 3D X server has a suitable visual for stereographic rendering:

xauth merge /etc/opt/VirtualGL/vgl_xauth_key
/opt/VirtualGL/bin/glxinfo -display :{n} -c -v

(where {n} is the display number of the 3D X server.) One or more of the visuals should say “stereo=1” and should list “Pbuffer” as one of the “Drawable Types.”

Run the following command on the VirtualGL server to determine whether the 2D X server has a suitable visual for stereographic rendering, transparent overlays, or PseudoColor rendering:

/opt/VirtualGL/bin/glxinfo -v

In order to use stereo, one or more of the visuals should say “stereo=1”. In order to use transparent overlays, one or more of the visuals should say “level=1”, should list a “Transparent Index” (non-transparent visuals will say “Opaque” instead), and should have a class of “PseudoColor.” In order to use PseudoColor (indexed) rendering, one of the visuals should have a class of “PseudoColor.”



17 Performance Measurement

17.1 VirtualGL’s Built-In Profiling System

The easiest way to uncover bottlenecks in VirtualGL’s image pipeline is to set the VGL_PROFILE environment variable to 1 on both server and client (passing an argument of +pr to vglrun on the server has the same effect.) This will cause VirtualGL to measure and report the throughput of the various stages in the pipeline. For example, here are some measurements from a dual Pentium 4 server communicating with a Pentium III client on a 100 Megabit LAN:

Server
Readback   - 43.27 Mpixels/sec - 34.60 fps
Compress 0 - 33.56 Mpixels/sec - 26.84 fps
Total      -  8.02 Mpixels/sec -  6.41 fps - 10.19 Mbits/sec (18.9:1)
Client
Decompress - 10.35 Mpixels/sec -  8.28 fps
Blit       - 35.75 Mpixels/sec - 28.59 fps
Total      -  8.00 Mpixels/sec -  6.40 fps - 10.18 Mbits/sec (18.9:1)

The total throughput of the pipeline is 8.0 Megapixels/sec, or 6.4 frames/sec, indicating that our frame is 8.0 / 6.4 = 1.25 Megapixels in size (a little less than 1280 x 1024 pixels.) The readback and compress stages, which occur in parallel on the server, are obviously not slowing things down, and we’re only using 1/10 of our available network bandwidth. Looking at the client, however, we discover that its slow decompression speed (10.35 Megapixels/second) is the primary bottleneck. Decompression and blitting on the client cannot be done in parallel, so the time spent on each pixel is the sum of the two stage times, and the aggregate rate is the reciprocal of the sum of the reciprocals of the decompression and blitting rates (half their harmonic mean): 1 / (1/10.35 + 1/35.75) = 8.0 Mpixels/sec. In this case, we could improve the performance of the whole system by simply using a client with a faster CPU.
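The arithmetic in the paragraph above can be reproduced with a few lines of Python. This is a sketch for checking the numbers in the sample profile, not part of VirtualGL. The function name serial_rate is an assumption made for the example.

```python
def serial_rate(*stage_rates):
    """Aggregate rate of stages that run one after another.

    Each stage takes 1/rate seconds per megapixel, so the per-megapixel
    times add up and the combined rate is the reciprocal of their sum.
    """
    return 1.0 / sum(1.0 / rate for rate in stage_rates)


# Client-side figures from the sample profile (Mpixels/sec).
decompress, blit = 10.35, 35.75
client_total = serial_rate(decompress, blit)
print(f"client total: {client_total:.2f} Mpixels/sec")

# Frame size implied by the reported totals.
frame_mpixels = 8.00 / 6.40
print(f"frame size: {frame_mpixels:.2f} Mpixels "
      f"(1280 x 1024 = {1280 * 1024 / 1e6:.2f} Mpixels)")
```

The computed client total agrees with the reported 8.00 Mpixels/sec to within rounding, which confirms that the client stages run serially and that decompression dominates the per-frame time.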

17.2 Frame Spoiling

By default, VirtualGL will only send a frame to the client if the client is ready to receive it. If a rendered frame arrives at the server’s queue and there are frames waiting in the queue to be processed, then those unprocessed frames are dropped (“spoiled”) and the new frame is promoted to the head of the queue. This prevents a backlog of frames on the server, which would cause a perceptible delay in the responsiveness of interactive applications. However, when running non-interactive applications, particularly benchmarks, frame spoiling should always be disabled. With frame spoiling disabled, the server will render frames only as quickly as VirtualGL can send those frames to the client, which will conserve server resources as well as allow OpenGL benchmarks to accurately measure the frame rate of the VirtualGL system. With frame spoiling enabled, OpenGL benchmarks will report meaningless data, since the rate at which the server can render frames is decoupled from the rate at which VirtualGL can send those frames to the client.

In a VNC environment, there is another layer of frame spoiling, since the server only sends updates to the client when the client requests them. Thus, even if frame spoiling is disabled in VirtualGL, OpenGL benchmarks will still report inaccurate data if they are run in a VNC session. TCBench, described below, provides a limited solution to this problem.

To disable frame spoiling, set the VGL_SPOIL environment variable to 0 on the VirtualGL server or pass an argument of -sp to vglrun. See Section 19.1 for more details.
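The queue behavior described in this section can be modeled with a short Python sketch. This is a toy model and not VirtualGL's implementation: the class name SpoilingQueue and the simulate helper are hypothetical, and the model assumes a renderer that produces three frames for every frame the client accepts. With spoiling disabled, the real system holds the renderer back rather than letting frames pile up. The sketch lets the backlog grow only to show what spoiling avoids.

```python
from collections import deque


class SpoilingQueue:
    """Frame queue that optionally drops unsent frames when a new one arrives."""

    def __init__(self, spoil=True):
        self.spoil = spoil
        self.frames = deque()
        self.dropped = 0

    def push(self, frame):
        if self.spoil and self.frames:
            # Frames still waiting are "spoiled" in favor of the new one.
            self.dropped += len(self.frames)
            self.frames.clear()
        self.frames.append(frame)

    def pop(self):
        return self.frames.popleft() if self.frames else None


def simulate(spoil, frames=9, renders_per_send=3):
    """Render `frames` frames, sending one per `renders_per_send` renders.

    Returns (frames sent, frames dropped, frames left in the queue).
    """
    queue = SpoilingQueue(spoil)
    sent = []
    for n in range(frames):
        queue.push(n)
        if (n + 1) % renders_per_send == 0:
            sent.append(queue.pop())
    return sent, queue.dropped, len(queue.frames)


print("spoiling on: ", simulate(True))
print("spoiling off:", simulate(False))
```

With spoiling on, the client always receives the most recent frame and nothing accumulates. With spoiling off in this toy model, the client receives stale frames while a backlog builds, which is the delay that an interactive user would perceive.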

17.3 VirtualGL Diagnostic Tools

VirtualGL includes several tools which can be useful in diagnosing performance problems with the system.

NetTest

NetTest is a network benchmark that uses the same network I/O classes as VirtualGL. It can be used to test the latency and throughput of any TCP/IP connection, with or without SSL encryption. nettest can be found in /opt/VirtualGL/bin on Linux/Mac/Solaris/Cygwin VirtualGL installations or in c:\program files\VirtualGL-{version}-{build} if using the VirtualGL Client for Exceed.

To use NetTest, first start up the NetTest server on one end of the connection:

nettest -server [-ssl]

(Use -ssl if you want to test the performance of SSL encryption over this particular connection. VirtualGL must have been compiled with OpenSSL support for this option to be available.)

Next, start the client on the other end of the connection:

nettest -client {server} [-ssl]

Replace {server} with the hostname or IP address of the machine where the NetTest server is running. (Use -ssl if the NetTest server is running in SSL mode. VirtualGL must have been compiled with OpenSSL support for this option to be available.)

The NetTest client will produce output similar to the following:

TCP transfer performance between localhost and {server}:

Transfer size  1/2 Round-Trip      Throughput      Throughput
(bytes)                (msec)        (MB/sec)     (Mbits/sec)
1                    0.093402        0.010210        0.085651
2                    0.087308        0.021846        0.183259
4                    0.087504        0.043594        0.365697
8                    0.088105        0.086595        0.726409
16                   0.090090        0.169373        1.420804
32                   0.093893        0.325026        2.726514
64                   0.102289        0.596693        5.005424
128                  0.118493        1.030190        8.641863
256                  0.146603        1.665318       13.969704
512                  0.205092        2.380790       19.971514
1024                 0.325896        2.996542       25.136815
2048                 0.476611        4.097946       34.376065
4096                 0.639502        6.108265       51.239840
8192                 1.033596        7.558565       63.405839
16384                1.706110        9.158259       76.825049
32768                3.089896       10.113608       84.839091
65536                5.909509       10.576174       88.719379
131072              11.453894       10.913319       91.547558
262144              22.616389       11.053931       92.727094
524288              44.882406       11.140223       93.450962
1048576             89.440702       11.180592       93.789603
2097152            178.536997       11.202160       93.970529
4194304            356.754396       11.212195       94.054712

We can see that the throughput peaks at about 94 megabits/sec, which is pretty good for a 100 Megabit connection. We can also see that, for small transfer sizes, the round-trip time is dominated by latency. The “latency” is the same thing as the one-way (1/2 round-trip) transit time for a zero-byte packet, which is about 93 microseconds in this case.
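The two throughput columns in the sample output can be recomputed from the transfer size and the one-way time. The Python sketch below does so under an assumption inferred from the sample numbers rather than stated by the tool: that “MB” means 2^20 bytes and “Mbits” means 10^6 bits. The function name throughput is hypothetical.

```python
def throughput(transfer_bytes, half_rtt_msec):
    """Return (MB/sec, Mbits/sec) for one row of NetTest output.

    Assumes MB = 2**20 bytes and Mbit = 10**6 bits, which is what the
    sample output appears to use.
    """
    seconds = half_rtt_msec / 1000.0
    bytes_per_sec = transfer_bytes / seconds
    mb_per_sec = bytes_per_sec / (1024 * 1024)
    mbits_per_sec = bytes_per_sec * 8 / 1e6
    return mb_per_sec, mbits_per_sec


# Smallest and largest rows from the sample output above.
for size, msec in ((1, 0.093402), (4194304, 356.754396)):
    mb, mbits = throughput(size, msec)
    print(f"{size:>8} bytes: {mb:.6f} MB/sec, {mbits:.6f} Mbits/sec")
```

For the 1-byte row nearly all of the time is latency, so the computed throughput is tiny. For the 4 MB row the fixed latency is negligible and the result approaches the capacity of the link.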

CPUstat

CPUstat is available only in the VirtualGL Linux packages and is located in the same place as NetTest (/opt/VirtualGL/bin.) It measures the average, minimum, and peak CPU usage for all processors combined and for each processor individually. On Windows, this same functionality is provided in the Windows Performance Monitor, which is part of the operating system. On Solaris, the same data can be obtained through vmstat.

CPUstat measures the CPU usage over a given sample period (a few seconds) and continuously reports how much the CPU was utilized since the last sample period. Output for a particular sample looks something like this:

ALL :  51.0 (Usr= 47.5 Nice=  0.0 Sys=  3.5) / Min= 47.4 Max= 52.8 Avg= 50.8
cpu0:  20.5 (Usr= 19.5 Nice=  0.0 Sys=  1.0) / Min= 19.4 Max= 88.6 Avg= 45.7
cpu1:  81.5 (Usr= 75.5 Nice=  0.0 Sys=  6.0) / Min= 16.6 Max= 83.5 Avg= 56.3

The first column indicates what percentage of time the CPU was active since the last sample period (this is then broken down into what percentage of time the CPU spent running user, nice, and system/kernel code.) “ALL” indicates the average utilization across all CPUs since the last sample period. “Min”, “Max”, and “Avg” indicate a running minimum, maximum, and average of all samples since CPUstat was started.
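For logging or post-processing, a sample line in the format shown above can be parsed with a regular expression. The Python sketch below is based only on the three sample lines in this section, so the exact spacing it tolerates is an assumption, and the names LINE and parse_line are hypothetical.

```python
import re

# Pattern derived from the sample CPUstat lines above.
LINE = re.compile(
    r"^(?P<cpu>\S+)\s*:\s*(?P<total>[\d.]+)\s*"
    r"\(Usr=\s*(?P<usr>[\d.]+)\s+Nice=\s*(?P<nice>[\d.]+)\s+"
    r"Sys=\s*(?P<sys>[\d.]+)\)"
    r"\s*/\s*Min=\s*(?P<min>[\d.]+)\s+Max=\s*(?P<max>[\d.]+)\s+"
    r"Avg=\s*(?P<avg>[\d.]+)\s*$"
)


def parse_line(line):
    """Return a dict with the CPU name and its figures as floats."""
    match = LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized CPUstat line: {line!r}")
    fields = match.groupdict()
    name = fields.pop("cpu")
    return {"cpu": name, **{key: float(value) for key, value in fields.items()}}


sample = [
    "ALL :  51.0 (Usr= 47.5 Nice=  0.0 Sys=  3.5) / Min= 47.4 Max= 52.8 Avg= 50.8",
    "cpu0:  20.5 (Usr= 19.5 Nice=  0.0 Sys=  1.0) / Min= 19.4 Max= 88.6 Avg= 45.7",
    "cpu1:  81.5 (Usr= 75.5 Nice=  0.0 Sys=  6.0) / Min= 16.6 Max= 83.5 Avg= 56.3",
]
rows = {row["cpu"]: row for row in map(parse_line, sample)}
for row in rows.values():
    parts = row["usr"] + row["nice"] + row["sys"]
    print(f"{row['cpu']}: total={row['total']} (user+nice+sys={parts:.1f})")
```

In the sample, the user, nice, and system figures add up to the first column on every line, and the “ALL” figure equals the mean of the two per-CPU figures.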

Generally, if an application’s CPU usage is fairly steady, you can run CPUstat for a bit and wait for the Max. and Avg. values in the “ALL” category to stabilize. Those values then indicate the application’s peak and average CPU utilization, in percent.

TCBench

TCBench was born out of the need to compare VirtualGL’s performance to that of other thin client packages, some of which had frame spoiling features that couldn’t be disabled. TCBench measures the frame rate of a thin client system as seen from the client’s point of view. It does this by attaching to one of the client windows and continuously reading back a small area at the center of the window. While this may seem to be a somewhat non-rigorous test, experiments have shown that if care is taken to ensure that the application is updating the center of the window on every frame (such as in a spin animation), TCBench can produce quite accurate results. It has been sanity checked with VirtualGL’s internal profiling mechanism and with a variety of system-specific techniques, such as monitoring redraw events on the client’s windowing system.

TCBench can be found in /opt/VirtualGL/bin on Linux/Mac/Solaris/Cygwin VirtualGL installations or in c:\program files\VirtualGL-{version}-{build} if using the VirtualGL Client for Exceed. Run tcbench from the command line, and it will prompt you to click in the window you want to benchmark. That window should already have an automated animation of some sort running before you launch TCBench. Note that GLXSpheres (see below) is an ideal benchmark to use with TCBench, since GLXSpheres draws a new sphere to the center of its window on every frame.

TCBench can also be used to measure the frame rate of applications that are running on the local display, although for extremely fast applications (those that exceed 40 fps on the local display), you may need to increase the sampling rate of TCBench to get accurate results. The default sampling rate of 50 samples/sec should be fine for measuring the throughput of VirtualGL and other thin client systems.

tcbench -?

lists the relevant command-line switches, which can be used to adjust the benchmark time, the sampling rate, and the x and y offset of the sampling area within the window.

GLXSpheres

GLXSpheres is a benchmark that produces very similar images to nVidia’s (long discontinued) SphereMark benchmark. Back in the early days of VirtualGL’s existence, it was discovered (quite by accident) that SphereMark was a pretty good test of VirtualGL’s end-to-end performance, because that benchmark generated images with about the same proportion of solid color and similar frequency components to the images generated by volume visualization applications.

Thus, the goal of GLXSpheres was to create an open source Unix version of SphereMark (the original SphereMark was for Windows only) completely from scratch. GLXSpheres does not use any code from the original benchmark, but it does attempt to mimic the image output of the original as closely as possible. GLXSpheres lacks some of the advanced rendering features of the original, such as the ability to use vertex arrays, but since it was primarily designed as a benchmark for VirtualGL, display lists are more than fast enough for that purpose.

GLXSpheres has some additional modes which its predecessor lacked, modes which are designed specifically to test the performance of various VirtualGL features:

Stereographic rendering (glxspheres -s)
Color index rendering (glxspheres -c)
In color index mode, GLXSpheres will draw the spheres using an 8-bit color map and will change the color map periodically.
Overlay rendering (glxspheres -o)
This renders text, a moving crosshair cursor, and a block of pixels to an 8-bit transparent overlay while animating the spheres on the underlay. The color map of the overlay is changed periodically.
Immediate mode rendering (glxspheres -m)
Want to really see the benefit of VirtualGL? Run glxspheres -m over a remote X connection, then run vglrun -sp glxspheres -m over the same connection and compare. Immediate mode does not use display lists, so when immediate mode OpenGL is rendered indirectly (over a remote X connection), this causes every OpenGL command to be sent as a separate network request to the X server … on every frame. Many applications cannot use display lists, because the geometry they are rendering is dynamic, so this models how such applications might perform when displayed remotely without VirtualGL.
Interactive mode (glxspheres -i)
In interactive mode, GLXSpheres will wait to draw a frame until it receives a mouse event. Continuously dragging the mouse in the window should produce a steady frame rate, and this frame rate is a reasonable model of the frame rate that you can achieve when running interactive applications in VirtualGL. Comparing this interactive frame rate (vglrun glxspheres -i) with the non-interactive frame rate (vglrun -sp glxspheres) allows you to quantify the effect of X latency on the performance of interactive applications in a VirtualGL environment.
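
One simple way to express the comparison between the interactive and non-interactive frame rates is as extra time per frame. The sketch below is only arithmetic on two frame rates that you would measure yourself. The function name is hypothetical, and the sample readings are made-up values, not benchmark results.

```python
def per_frame_overhead_ms(interactive_fps, noninteractive_fps):
    """Extra time per frame (in ms) observed in interactive mode."""
    if interactive_fps <= 0 or noninteractive_fps <= 0:
        raise ValueError("frame rates must be positive")
    return 1000.0 / interactive_fps - 1000.0 / noninteractive_fps


# Hypothetical readings: 20 fps with "vglrun glxspheres -i",
# 40 fps with "vglrun -sp glxspheres".
print(per_frame_overhead_ms(20.0, 40.0))
```

Under the assumption that the difference is dominated by X latency, the result approximates the latency added to each interactive frame.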

GLXSpheres is installed in /opt/VirtualGL/bin on Linux and Solaris VirtualGL servers. 64-bit VirtualGL packages name this program glxspheres64 so as to allow both a 64-bit and a 32-bit version of GLXSpheres to be installed on the same system.



18 The VirtualGL Configuration Dialog

Several of VirtualGL’s configuration parameters can be changed on the fly once a 3D application has been started. This is accomplished by using the VirtualGL Configuration dialog, which can be popped up by holding down the CTRL and SHIFT keys and pressing the F9 key while any one of the 3D application’s windows is active. This displays the following dialog box:

[Figure: the VirtualGL Configuration dialog]

You can use this dialog to adjust various image compression and display parameters in VirtualGL. Changes are communicated immediately to VirtualGL.

Image Compression (Transport)
This is a drop-down gadget with the following options:

None (X11 Transport) : equivalent to setting VGL_COMPRESS=proxy. This option can be activated at any time, regardless of which transport was active when VirtualGL started.

JPEG (VGL Transport) : equivalent to setting VGL_COMPRESS=jpeg. This option is only available if the VGL Transport was active when VirtualGL started.

RGB (VGL Transport) : equivalent to setting VGL_COMPRESS=rgb. This option is only available if the VGL Transport was active when VirtualGL started.

YUV (XV Transport) : equivalent to setting VGL_COMPRESS=xv. This option is only available if the 2D X server has the X Video extension and the X Video implementation supports the YUV420P (AKA “I420”) pixel format.

YUV (VGL Transport) : equivalent to setting VGL_COMPRESS=yuv. This option is only available if the 2D X server has the X Video extension, the X Video implementation supports the YUV420P (AKA “I420”) pixel format, and the VGL Transport was active when VirtualGL started.

See Section 19.1 for more information about the VGL_COMPRESS configuration option.

If an image transport plugin is loaded, then this gadget’s name changes to “Image Compression”, and it has options “0” through “10”.

Chrominance Subsampling
This drop-down gadget is active only when using JPEG compression or an image transport plugin. It has the following options:

Grayscale : equivalent to setting VGL_SUBSAMP=gray

1X : equivalent to setting VGL_SUBSAMP=1x

2X : equivalent to setting VGL_SUBSAMP=2x

4X : equivalent to setting VGL_SUBSAMP=4x

See Section 19.1 for more information about the VGL_SUBSAMP configuration option.

If an image transport plugin is loaded, then this gadget has two additional options, “8X” and “16X”.

JPEG Image Quality
This slider gadget is active only when using JPEG compression or an image transport plugin. It is the equivalent of setting VGL_QUAL. See Section 19.1 for more information about the VGL_QUAL configuration option.

If an image transport plugin is loaded, then this gadget’s name changes to “Image Quality”.

Connection Profile
This drop-down gadget is active only if the VGL Transport was active when VirtualGL started. It has the following options:

Low Qual (Low-Bandwidth Network) : Sets the image compression type to JPEG (VGL Transport), sets the Chrominance Subsampling to 4X, and sets the JPEG Image Quality to 30.

Medium Qual : Sets the image compression type to JPEG (VGL Transport), sets the Chrominance Subsampling to 2X, and sets the JPEG Image Quality to 80.

High Qual (High-Bandwidth Network) : Sets the image compression type to JPEG (VGL Transport), sets the Chrominance Subsampling to 1X, and sets the JPEG Image Quality to 95.
Gamma Correction Factor
This floating point input gadget is the equivalent of setting VGL_GAMMA. This enables VirtualGL’s internal gamma correction system with the specified gamma correction factor. See Section 19.1 for more information about the VGL_GAMMA configuration option.
Frame Spoiling
This toggle button is the equivalent of setting VGL_SPOIL. See Section 17.2 and Section 19.1 for more information about the VGL_SPOIL configuration option.
Interframe Comparison
This toggle button is the equivalent of setting VGL_INTERFRAME. See Section 19.1 for more information about the VGL_INTERFRAME configuration option.
Stereographic Rendering Method
This drop-down gadget has the following options:

Send Left Eye Only : equivalent to setting VGL_STEREO=left

Send Right Eye Only : equivalent to setting VGL_STEREO=right

Quad-Buffered (if available) : equivalent to setting VGL_STEREO=quad

Anaglyphic (Red/Cyan) : equivalent to setting VGL_STEREO=rc

See Section 19.1 for more information about the VGL_STEREO configuration option.
Limit Frames/second
This floating point input gadget is the equivalent of setting VGL_FPS. See Section 19.1 for more information about the VGL_FPS configuration option.
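
The Connection Profile presets described earlier in this chapter can be written out as the equivalent environment variable settings. The sketch below is a lookup table built from those descriptions. The PROFILES table and the export_lines helper are hypothetical names used for illustration and are not part of VirtualGL.

```python
# Settings taken from the Connection Profile descriptions in this chapter.
PROFILES = {
    "Low Qual (Low-Bandwidth Network)": {
        "VGL_COMPRESS": "jpeg", "VGL_SUBSAMP": "4x", "VGL_QUAL": "30",
    },
    "Medium Qual": {
        "VGL_COMPRESS": "jpeg", "VGL_SUBSAMP": "2x", "VGL_QUAL": "80",
    },
    "High Qual (High-Bandwidth Network)": {
        "VGL_COMPRESS": "jpeg", "VGL_SUBSAMP": "1x", "VGL_QUAL": "95",
    },
}


def export_lines(profile_name):
    """Return shell export statements equivalent to a profile."""
    settings = PROFILES[profile_name]
    return [f"export {name}={value}" for name, value in settings.items()]


for line in export_lines("Medium Qual"):
    print(line)
```

Unlike the dialog, which applies changes to a running application, environment variables such as these take effect only for applications launched after they are set.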

You can set the VGL_GUI environment variable to change the key sequence used to pop up the VirtualGL Configuration dialog. If the default of CTRL-SHIFT-F9 is not suitable, then set VGL_GUI to any combination of ctrl, shift, alt, and one of f1, f2,..., f12 (these are not case sensitive.) For example:

export VGL_GUI=CTRL-F9

will cause the dialog box to pop up whenever CTRL-F9 is pressed.

To disable the VirtualGL dialog altogether, set VGL_GUI to none.
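
The rules for a valid VGL_GUI value can be summarized in a short validation sketch. It assumes that the tokens are separated by hyphens, as in the examples in this chapter, and it treats none case-insensitively, which is an assumption. The parsing code inside VirtualGL may differ, and parse_vgl_gui is a hypothetical helper.

```python
MODIFIERS = {"ctrl", "shift", "alt"}
FUNCTION_KEYS = {f"f{n}" for n in range(1, 13)}
KNOWN_TOKENS = MODIFIERS | FUNCTION_KEYS


def parse_vgl_gui(value):
    """Return (modifiers, function key), or None if the dialog is disabled."""
    text = value.strip().lower()  # key names are not case sensitive
    if text == "none":
        return None
    tokens = text.split("-")
    unknown = [t for t in tokens if t not in KNOWN_TOKENS]
    keys = [t for t in tokens if t in FUNCTION_KEYS]
    if unknown or len(keys) != 1:
        raise ValueError(f"invalid key sequence: {value!r}")
    return frozenset(t for t in tokens if t in MODIFIERS), keys[0]


print(parse_vgl_gui("CTRL-F9"))
print(parse_vgl_gui("none"))
```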

VirtualGL monitors the application’s X event loop to determine when a particular key sequence has been pressed. If an application is not monitoring key press events in its X event loop, then the VirtualGL Configuration dialog might not pop up at all. There is unfortunately no workaround for this, but it should be a rare occurrence.



19 Advanced Configuration

19.1 Server Settings

You can control the operation of the VirtualGL faker in four different ways. Each method of configuration takes precedence over the previous method:

  1. Setting a configuration environment variable globally (for instance, in /etc/profile)
  2. Setting a configuration environment variable on a per-user basis (for instance, in ~/.bashrc)
  3. Setting a configuration environment variable only for the current shell session (for instance, export VGL_XXX={whatever})
  4. Passing a configuration option as an argument to vglrun. This effectively overrides any previous environment variable setting corresponding to that configuration option.
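
The precedence order above can be modeled as a sequence of dictionary merges in which each later source overrides the earlier ones. This is only a model of the outcome. In practice, the shell performs the first three steps through ordinary environment variable handling, and vglrun performs the last. The function name and the sample values are hypothetical.

```python
def effective_settings(global_env, user_env, session_env, vglrun_options):
    """Merge settings so that later sources override earlier ones."""
    merged = {}
    for source in (global_env, user_env, session_env, vglrun_options):
        merged.update(source)
    return merged


settings = effective_settings(
    global_env={"VGL_QUAL": "80"},                        # e.g. /etc/profile
    user_env={"VGL_QUAL": "90", "VGL_SUBSAMP": "2x"},     # e.g. ~/.bashrc
    session_env={},                                       # nothing exported
    vglrun_options={"VGL_QUAL": "95"},                    # vglrun -q 95
)
print(settings)
```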

If “Custom (if supported)” is listed as one of the available Image Transports, then this means that image transport plugins are free to handle or ignore this option as they see fit.

Environment Variable VGL_ALLOWINDIRECT = 0 | 1
Summary Allow applications to request an indirect OpenGL context
Image Transports All
Default Value 0 (all OpenGL contexts use direct rendering, unless rendering to a transparent overlay)
Description
Normally, when VirtualGL maps a Pbuffer to a window and establishes an OpenGL rendering context with the Pbuffer, it forces direct rendering to be used with this context. Some 3D applications erroneously try to create indirect OpenGL contexts because they detect that the X display is remote and assume that the 3D rendering commands will be sent over the network. Thus, VirtualGL normally forces all contexts to be direct in order to prevent severe readback performance degradation with such apps (even on modern 3D adapters, glReadPixels() can perform very slowly if an indirect OpenGL context is used.)

However, some applications intentionally try to create indirect contexts so that these contexts can be shared, and those apps may not work properly when the contexts are forced to be direct. For such apps, setting VGL_ALLOWINDIRECT to 1 will cause VirtualGL to honor the application’s request for an indirect OpenGL context.
Environment Variable VGL_CLIENT = {c}
vglrun argument -cl {c}
Summary {c} = the hostname or IP address of the VirtualGL client
Image Transports VGL, Custom (if supported)
Default Value Automatically set by vglconnect or vglrun
Description
When using the VGL Transport, VGL_CLIENT should be set to the hostname or IP address of the machine on which vglclient is running. Normally, VGL_CLIENT is set automatically by the vglconnect or vglrun script, so don’t override it unless you know what you’re doing.

Environment Variable VGL_COMPRESS = proxy | jpeg | rgb | xv | yuv
vglrun argument -c proxy | jpeg | rgb | xv | yuv
Summary Set image transport and image compression type
Image Transports All
Default Value (See description)
Description
proxy = Send images uncompressed using the X11 Transport. This is useful when displaying to a local 2D X server or X proxy (see Section 8.1.)

jpeg = Compress images using JPEG and send using the VGL Transport. This is useful when displaying to a remote 2D X server (see Chapter 7.)

rgb = Encode images as uncompressed RGB and send using the VGL Transport. This is useful when displaying to a remote 2D X server or X proxy across a very fast network (see Section 8.2.)

xv = Encode images as YUV420P (planar YUV with 4X chrominance subsampling) and display them to the 2D X server using the XV Transport. This transport is designed for use with X proxies, such as the Sun Ray Server Software, that support the X Video extension (see Chapter 9.)

yuv = Encode images as YUV420P, send using the VGL Transport, and display on the client machine using the X Video extension. This greatly reduces the CPU usage on both server and client and uses only about half the network bandwidth of RGB, but the use of 4X chrominance subsampling does produce some visible artifacts (see Chapter 9.)

If this option is not specified, then the default is set as follows:

If the DISPLAY environment variable begins with : or unix:, then VirtualGL assumes that the X display connection is local. If it detects that the 2D X server is a Sun Ray X Server instance, then it will default to using xv compression. Otherwise, it will default to proxy compression.

If VirtualGL detects that the 2D X server is remote, then it will default to using yuv compression if that X server is a Sun Ray X Server instance or jpeg compression otherwise.

If an image transport plugin is being used, then you can set VGL_COMPRESS to any numeric value >= 0 (Default value = 0.) How the plugin responds to this value is implementation-specific.
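
The default selection rules for VGL_COMPRESS when no plugin is in use can be sketched as a small decision function. How VirtualGL detects a Sun Ray X Server instance is not described here, so the sketch takes that as a boolean input. It also assumes that any DISPLAY value that does not begin with : or unix: is remote. The function name is hypothetical.

```python
def default_compress(display, is_sun_ray):
    """Return the default VGL_COMPRESS value for a given 2D X display."""
    local = display.startswith(":") or display.startswith("unix:")
    if local:
        return "xv" if is_sun_ray else "proxy"
    return "yuv" if is_sun_ray else "jpeg"


print(default_compress(":0.0", is_sun_ray=False))
print(default_compress("myclient:0.0", is_sun_ray=False))
```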

Environment Variable VGL_DISPLAY = {d}
vglrun argument -d {d}
Summary {d} = the X display to use for 3D rendering
Image Transports All
Default Value :0
Description
If the VirtualGL server has multiple 3D graphics cards and you want the OpenGL commands from the 3D application to be redirected to the second or subsequent graphics cards in the system, then you can set VGL_DISPLAY to (or invoke vglrun -d with) :0.1, :0.2, etc.

Environment Variable VGL_FORCEALPHA = 0 | 1
Summary Force the Pbuffers used for 3D rendering to have an 8-bit alpha channel
Image Transports All
Default Value VGL_FORCEALPHA=1 if PBO readback mode is used, VGL_FORCEALPHA=0 otherwise
Description
Normally, VirtualGL will create a Pbuffer whose pixel format matches the specifications of the visual requested by the 3D application. However, on some 3D hardware, this may produce suboptimal performance when reading back the 3D pixels. Thus, this option can be used to force VirtualGL to create a 32-bit-per-pixel Pbuffer when the application requests a 24-bit-per-pixel visual.

On some 3D hardware, it is necessary to use a 32-bit-per-pixel Pbuffer to realize any performance improvement from the use of pixel buffer objects (PBO’s). Thus, unless this option is explicitly disabled, it will be enabled by default whenever PBO readback mode is enabled. See the VGL_READBACK option for further information.

Environment Variable VGL_FPS = {f}
vglrun argument -fps {f}
Summary Limit the client/server frame rate to {f} frames/second, where {f} is a floating point number > 0.0
Image Transports VGL, X11, XV, Custom (if supported)
Default Value 0.0 (No limit)
Description
This option prevents VirtualGL from sending frames at a rate faster than the specified limit. It can be used, for instance, as a crude way to control network bandwidth or CPU usage in multi-user environments where those resources are constrained.

If frame spoiling is disabled, then setting VGL_FPS effectively limits the server’s 3D rendering frame rate as well.

Environment Variable VGL_GAMMA = {g}
vglrun argument -gamma {g}
Summary Specify gamma correction factor
Image Transports All
Default Value 1.00 (no gamma correction)
Description
“Gamma” refers to the relationship between the intensity of light that your computer’s monitor is instructed to display and the intensity that it actually displays. The curve is a power-law curve of the form Y = X^G, where X is between 0 and 1. G is called the “gamma” of the monitor. PC monitors and TV’s usually have a gamma of around 2.2.

Some of the math involved in 3D rendering assumes a linear gamma (G = 1.0), so technically speaking, 3D applications will not display with mathematical correctness unless the pixels are “gamma corrected” to counterbalance the non-linear response curve of the monitor. However, some systems do not have any form of built-in gamma correction, so the applications developed for such systems have usually been designed to display properly without gamma correction. Gamma correction involves passing pixels through a function of the form X = W^(1/G), where G is the “gamma correction factor” and should be equal to the gamma of the monitor. So the final output is Y = X^G = (W^(1/G))^G = W, which describes a linear relationship between the intensity of the pixels drawn by the application and the intensity of the pixels displayed by the monitor.

If VGL_GAMMA is set to an arbitrary floating point value, then VirtualGL performs gamma correction internally using the specified value as the gamma correction factor. You can also specify a negative value to apply a “de-gamma” function. Specifying a gamma correction factor of G (where G < 0) is equivalent to specifying a gamma correction factor of -1/G.
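
The arithmetic behind gamma correction can be checked numerically. The sketch below works on normalized intensities between 0 and 1 and is not VirtualGL's implementation. The function names are hypothetical, and rejecting a factor of 0 is a choice made for this sketch.

```python
def effective_factor(g):
    """Map a negative gamma correction factor to its positive equivalent."""
    if g == 0:
        raise ValueError("gamma correction factor must be nonzero")
    return -1.0 / g if g < 0 else g


def gamma_correct(w, g):
    """Apply the correction: w raised to the power 1/G."""
    return w ** (1.0 / effective_factor(g))


def monitor_response(x, monitor_gamma):
    """Model the monitor: x raised to the power G."""
    return x ** monitor_gamma


corrected = gamma_correct(0.5, 2.2)
print(monitor_response(corrected, 2.2))  # approximately 0.5 (linear overall)
print(effective_factor(-2.0))            # a factor of -2.0 behaves like 0.5
```

When the correction factor matches the gamma of the monitor, the displayed intensity equals the intensity that the application drew, up to floating point rounding.
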
Environment Variable VGL_GLLIB = {l}
Summary {l} = the location of an alternate OpenGL library
Image Transports All
Description
Normally, VirtualGL tries to call any “real” GLX and OpenGL functions it needs from the OpenGL library against which it is linked (usually /usr/lib/libGL.so.1, /usr/lib64/libGL.so.1, or /usr/lib/64/libGL.so.1). Failing this, VirtualGL will then try to call these functions from the first compatible library named libGL.so.1 that is found in the dynamic loader path. You can use the VGL_GLLIB environment variable to override this behavior and specify a dynamic library from which VirtualGL will call the “real” GLX and OpenGL functions.

You shouldn’t need to muck with this unless something doesn’t work. However, setting this environment variable is necessary when using VirtualGL with Chromium. It is also potentially useful if one wishes to insert another OpenGL interposer between VirtualGL and the system’s OpenGL library.
Environment Variable VGL_GUI = {k}
Summary {k} = the key sequence used to pop up the VirtualGL Configuration dialog, or none to disable the dialog
Image Transports All
Default Value shift-ctrl-f9
Description
VirtualGL will normally monitor an application’s X event queue and pop up the VirtualGL Configuration dialog whenever CTRL-SHIFT-F9 is pressed. If this interferes with a key sequence that a 3D application is already using, you can redefine the key sequence used to pop up the VirtualGL Configuration dialog by setting VGL_GUI to some combination of shift, ctrl, alt, and one of f1, f2, ..., f12. You can also set VGL_GUI to none to disable the configuration dialog altogether. See Chapter 18 for more details.

Environment Variable VGL_INTERFRAME = 0 | 1
Summary Enable or disable interframe image comparison
Image Transports VGL (JPEG, RGB), Custom (if supported)
Default Value Enabled
Description
The VGL Transport will normally compare each frame with the previous frame and send only the portions of the image that have changed. Setting VGL_INTERFRAME to 0 disables this behavior.

This setting was introduced in order to work around a specific application interaction issue, but since a proper fix for that issue was introduced in VirtualGL 2.1.1, this option isn’t really useful anymore.

When using the VGL Transport, interframe comparison is affected by the VGL_TILESIZE option.

Environment Variable VGL_LOG = {l}
Summary Redirect all messages from VirtualGL to a log file specified by {l}
Image Transports All
Default Value Print all messages to stderr
Description
Setting this environment variable to the pathname of a log file on the VirtualGL server will cause VirtualGL to redirect all of its messages (including profiling and trace output) to the specified log file rather than to stderr.
Environment Variable VGL_LOGO = 0 | 1
Summary Enable or disable the display of a VGL logo in the 3D window
Image Transports All
Default Value Disabled
Description
Setting VGL_LOGO to 1 will cause VirtualGL to display a small logo in the bottom right-hand corner of the 3D window. This is meant as a debugging tool to allow users to determine whether or not VirtualGL is active.

Environment Variable VGL_NPROCS = {n}
vglrun argument -np {n}
Summary {n} = the number of CPUs to use for multi-threaded compression
Image Transports VGL (JPEG, RGB), Custom (if supported)
Default Value 1
Description
The VGL Transport can divide the task of compressing each frame among multiple server CPUs. This might speed up the overall throughput in rare circumstances where the server CPUs are significantly slower than the client CPUs.

VirtualGL will not allow more than 4 processors total to be used for compression, nor will it allow you to set this parameter to a value greater than the number of processors in the system.

When using the VGL Transport, multi-threaded compression is affected by the VGL_TILESIZE option.

Environment Variable VGL_PORT = {p}
vglrun argument -p {p}
Summary {p} = the TCP port to use when connecting to the VirtualGL Client
Image Transports VGL, Custom (if supported)
Default Value Read from X property stored by VirtualGL Client
Description
The connection port for the VGL Transport is normally determined by reading an X property that vglclient stores on the 2D X server, so don’t override this unless you know what you’re doing.
Environment Variable VGL_PROFILE = 0 | 1
vglrun argument -pr / +pr
Summary Disable/enable profiling output
Image Transports VGL, X11, XV, Custom (if supported)
Default Value Disabled
Description
If profiling output is enabled, then VirtualGL will continuously benchmark itself and periodically print out the throughput of various stages in its image pipeline.

See Chapter 17 for more details.

Environment Variable VGL_QUAL = {q}
vglrun argument -q {q}
Summary {q} = the JPEG compression quality, 1 <= {q} <= 100
Image Transports VGL (JPEG), Custom (if supported)
Default Value 95
Description
In digital images, “frequency” refers to how quickly the color changes between light and dark as you move either horizontally or vertically in the image. Images with very sharp, bright features on a dark background, for instance, consist of both low-frequency and high-frequency components, whereas images with smooth transitions between neighboring pixels contain only low-frequency components. JPEG compression works by breaking down the image into its constituent frequencies and then throwing out the highest of these frequencies. The JPEG image “quality” determines which frequencies are thrown out. A JPEG quality of 1 throws out all but the lowest frequencies and thus produces a very impressionistic, but generally not very useful, compressed image. A JPEG quality of 100 retains all frequencies in the original image (but, due to roundoff errors, the compressed image is still not completely lossless.)

Because the human eye usually can’t detect the highest frequencies in the image, and often because the image lacks those high-frequency elements to begin with, a sufficiently high JPEG quality setting can produce a “perceptually lossless” image. A “perceptually lossless” image contains a small amount of mathematical error when compared to the original image, but this error is so small that, under normal circumstances, the human visual system cannot distinguish it. The threshold quality level at which JPEG compression becomes perceptually lossless is different for each image, but experiments with various visual difference benchmarks (such as HDR-VDP) suggest that a JPEG quality of 95 is sufficient to guarantee perceptual losslessness for the types of applications (volume visualization apps, in particular) in which image quality is critical. As with any benchmarks, Your Mileage May Vary. If image quality is of paramount concern, consider setting the JPEG quality to 100 or using RGB encoding.

If using an image transport plugin, then this setting need not necessarily correspond to JPEG image quality. How the plugin responds to the VGL_QUAL option is implementation-specific.

Environment Variable VGL_READBACK = none | pbo | sync
Summary Specify the method used by VirtualGL to read back the 3D pixels from the 3D graphics hardware
Image Transports All
Default Value sync
Description
Environment Variable VGL_SAMPLES = {s}
vglrun argument -ms {s}
Summary Force OpenGL multisampling to be enabled with {s} samples. {s} = 0 to force OpenGL multisampling to be disabled.
Image Transports All
Default Value Allow the 3D application to determine the level of multisampling
Description
This option was added primarily because certain vendor-specific methods of enabling full-scene antialiasing at a global level (such as nVidia’s __GL_FSAA_MODE environment variable) do not work with Pbuffers and, consequently, do not work with VirtualGL. If VGL_SAMPLES is > 0, then VirtualGL will attempt to redirect 3D rendering to multisample-enabled Pbuffers with the specified number (or a greater number) of samples. This effectively forces the 3D application to render with the specified multisampling level, as if the application had explicitly passed parameters of GLX_SAMPLES, {s} to glXChooseVisual(). If VGL_SAMPLES is 0, then VirtualGL forces multisampling to be disabled, even if the 3D application explicitly tries to enable it.

Environment Variable VGL_SPOIL = 0 | 1
vglrun argument -sp / +sp
Summary Disable/enable frame spoiling
Image Transports VGL, X11, XV, Custom (if supported)
Default Value Enabled
Description
In remote display environments, the mouse movement is generally sampled at least 40 and sometimes 60 times per second. Therefore, unless VirtualGL is able to deliver at least this number of frames per second to the client, the movement of a 3D scene will appear to drag behind the mouse motion. VirtualGL’s default behavior is to compensate for this by dropping frames. This ensures that every mouse motion event will result in a new frame being rendered on the server, even though not all of these frames will actually be delivered to the client.

Frame spoiling should produce the best results with interactive applications, but it should be turned off when running benchmarks or other non-interactive applications. Turning off frame spoiling will force every frame rendered on the server to be sent through VirtualGL, and thus the frame rate reported by OpenGL benchmarks will accurately reflect the frame rate of VirtualGL’s image pipeline (though, in X proxy environments, this may still not accurately reflect the frame rate seen by the user. See Section 17.2.) Disabling frame spoiling also prevents non-interactive applications from wasting graphics resources by rendering frames that will never be seen.

Environment Variable VGL_SPOILLAST = 0 | 1
Summary Disable/enable “spoil last” frame spoiling algorithm for frames triggered by glFlush()
Image Transports VGL, X11, XV, Custom (if supported)
Default Value Enabled
Description
VirtualGL will commonly read back a rendered 3D image if the 3D application calls glXSwapBuffers() while rendering to the back buffer or if the 3D application calls glFinish(), glFlush(), or glXWaitGL() while rendering to the front buffer. When frame spoiling is enabled and the frame queue is busy compressing/sending a frame, the newly-rendered frame is normally promoted to the head of the frame queue, and the rest of the frames in the queue are “spoiled” (discarded.) This algorithm, called “spoil first”, ensures that when a frame is actually sent to the client (rather than spoiled), the sent frame will be the most recently rendered frame. However, this algorithm requires that VirtualGL read back every frame that the application renders, even if the frame is ultimately discarded.

Some applications call glFlush() many thousands of times per frame while rendering to the front buffer. Thus, VirtualGL’s default behavior is to use a different spoiling algorithm, “spoil last”, to process frames triggered by glFlush() calls. “Spoil last” discards the most recently rendered frame if the frame queue is busy. Thus, the only frames that are read back from the graphics card are the frames that are actually sent to the client. However, there is no guarantee in this case that the frame sent to the client will be the most recently rendered frame, so applications that perform front buffer rendering and call glFlush() in response to an interactive operation may not display properly. For such applications, setting the VGL_SPOILLAST environment variable to 0 prior to launching the application with vglrun will cause the “spoil first” algorithm to be used for all frame triggers, including glFlush(). This should fix the display problem, at the expense of increased load on the graphics card (because VirtualGL is now reading back the rendered 3D image every time glFlush() is called.) See Application Recipes for a list of applications that are known to require this.
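
The difference between the two spoiling algorithms can be seen in a toy simulation. The model below assumes a single transport that takes a fixed time to send each frame and a queue that holds at most one waiting frame. These are simplifications chosen for illustration and do not describe VirtualGL's internals, and the simulate function is hypothetical.

```python
def simulate(render_times, send_time, spoil_last):
    """Return (frames sent, number of readbacks) for a toy frame queue."""
    busy_until = 0   # time at which the transport becomes idle
    pending = None   # newest frame waiting for the transport
    sent, readbacks = [], 0
    for frame, now in enumerate(render_times):
        # The transport picks up a waiting frame as soon as it is idle.
        if pending is not None and busy_until <= now:
            sent.append(pending)
            busy_until += send_time
            pending = None
        busy = busy_until > now
        if spoil_last:
            if busy:
                continue          # discard the new frame without a readback
            readbacks += 1
            sent.append(frame)
            busy_until = now + send_time
        else:
            readbacks += 1        # "spoil first" reads back every frame
            if busy:
                pending = frame   # replaces (spoils) any older waiting frame
            else:
                sent.append(frame)
                busy_until = now + send_time
    if pending is not None:
        sent.append(pending)      # the newest frame is delivered eventually
    return sent, readbacks


times = list(range(0, 90, 10))    # nine frames rendered 10 time units apart
print(simulate(times, 25, spoil_last=False))
print(simulate(times, 25, spoil_last=True))
```

In this run, "spoil first" reads back all nine frames and delivers the final frame, whereas "spoil last" reads back only the three frames that it sends and never delivers the final frame. This mirrors the tradeoff described above.
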
Environment Variable VGL_SSL = 0 | 1
vglrun argument -s / +s
Summary Disable/enable SSL encryption of the image transport
Image Transports VGL, Custom (if supported)
Default Value Disabled
Description
Enabling this option causes the VGL Transport to be tunneled through a secure socket layer (SSL).

This option has no effect unless both the VirtualGL server and client were built with OpenSSL support.

Environment Variable VGL_STEREO = left | right | quad | rc
vglrun argument -st left | right | quad | rc
Summary Specify the delivery method for stereo images
Image Transports All
Default Value quad
Description
left = When an application renders a stereo frame, send only the left eye buffer

right = When an application renders a stereo frame, send only the right eye buffer

quad = Attempt to use quad-buffered stereo, which will result in a pair of images being sent to the VirtualGL Client for every frame. If using the VGL Transport and quad-buffered stereo is not available on the client, or if using the X11 Transport, then fall back to using anaglyphic stereo. Using quad-buffered stereo requires the VGL Transport (or a transport plugin that can handle stereo image pairs.) Using quad-buffered stereo with the VGL Transport also requires that the 2D X server support OpenGL and be connected to a 3D accelerator that supports stereo rendering. Quad-buffered stereo is not supported when using the VGL Transport with YUV encoding.

rc = Use Red/Cyan (anaglyphic) stereo, even if quad-buffered is available

See Chapter 16 for more details.

Environment Variable VGL_SUBSAMP = gray | 1x | 2x | 4x | 8x | 16x
vglrun argument -samp gray | 1x | 2x | 4x | 8x | 16x
Summary Specify the level of chrominance subsampling in the JPEG image compressor
Image Transports VGL (JPEG), Custom (if supported)
Default Value 1x
Description
When an image is encoded using JPEG, each pixel in the image is first converted from RGB (Red/Green/Blue) to YUV. An RGB pixel has three values that specify the amounts of red, green, and blue that make up the pixel’s color. A YUV pixel has three values that specify the overall brightness of the pixel (Y, or “luminance”) and the overall color of the pixel (U and V, or “chrominance”.)

Since the human eye is less sensitive to changes in color than it is to changes in brightness, the chrominance components for some of the pixels can be discarded without much noticeable loss in image quality. This technique, called “chrominance subsampling”, significantly reduces the size of the compressed image.

1x = no chrominance subsampling

2x = discard the chrominance components for every other pixel along the image’s X direction (this is also known as “4:2:2” or “2:1” subsampling.) All else being equal, 2x subsampling generally reduces the image size by about 20-25% when compared to no subsampling.

4x = discard the chrominance components for every other pixel along both the X and Y directions of the image (this is also known as “4:2:0” or “2:2” subsampling.) All else being equal, 4x subsampling generally reduces the image size by about 35-40% when compared to no subsampling.

8x = discard the chrominance components for 3 out of every 4 pixels along the image’s X direction and half the pixels along the image’s Y direction (this is also known as “4:1:0” or “4:2” subsampling.) This option is available only when using an image transport plugin that supports it.

16x = discard the chrominance components for 3 out of every 4 pixels along both the X and Y directions of the image (this is also known as “4:4” subsampling.) This option is available only when using an image transport plugin that supports it.

gray = discard all chrominance components. This is useful when running applications (such as medical visualization applications) that are already generating grayscale images.
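
The subsampling levels above can be compared by counting the average number of uncompressed samples per pixel: one luminance sample plus whatever fraction of the two chrominance samples is kept. The sketch below does this counting. The names are hypothetical, and the figures describe raw sample counts before JPEG's frequency-based compression is applied.

```python
from fractions import Fraction

# Fraction of pixels that keep their chrominance (U and V) values.
CHROMA_KEPT = {
    "1x": Fraction(1),
    "2x": Fraction(1, 2),    # every other pixel along X
    "4x": Fraction(1, 4),    # every other pixel along X and Y
    "8x": Fraction(1, 8),    # 1 of 4 along X, 1 of 2 along Y
    "16x": Fraction(1, 16),  # 1 of 4 along both X and Y
    "gray": Fraction(0),
}


def samples_per_pixel(subsamp):
    """Average uncompressed samples per pixel: one Y plus the kept U and V."""
    return 1 + 2 * CHROMA_KEPT[subsamp]


for level in ("1x", "2x", "4x", "8x", "16x", "gray"):
    print(level, float(samples_per_pixel(level)))
```

The raw sample count drops by one third with 2x and by one half with 4x. The compressed-size reductions quoted above are smaller than these raw figures.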

Subsampling artifacts are less noticeable with volume data, since it usually only contains 256 colors to begin with, but narrow, aliased lines and other sharp features on a black background will tend to produce very noticeable artifacts when subsampling is enabled.

The axis indicator from a popular visualization app displayed with 1x, 2x, and 4x chrominance subsampling (respectively):
[Figure: three renderings of the axis indicator, with 1x, 2x, and 4x chrominance subsampling]

If using an image transport plugin, then this setting need not necessarily correspond to JPEG chrominance subsampling. How the plugin responds to the VGL_SUBSAMP option is implementation-specific.

Environment Variable VGL_SYNC = 0 | 1
vglrun argument -sync / +sync
Summary Disable/enable strict 2D/3D synchronization
Image Transports VGL, X11, XV, Custom (if supported)
Default Value Disabled
Description
Normally, VirtualGL’s operation is asynchronous from the point of view of the application. The application swaps the buffers or calls glFinish() or glFlush() or glXWaitGL(), and VirtualGL reads back the framebuffer and sends the pixels to the client’s display … eventually. This will work fine for the vast majority of applications, but it does not strictly conform to the GLX spec. Technically speaking, when an application calls glXWaitGL() or glFinish(), it is well within its rights to expect the 3D image to be immediately available in the X window. Fortunately, very few applications actually do expect this, but on rare occasions, an application may try to use XGetImage() or other X11 functions to obtain a bitmap of the pixels that were rendered by OpenGL. Enabling VGL_SYNC is a somewhat extreme measure that may be needed to make such applications work properly. It was developed initially as a way to pass the GLX conformance suite (conformx, specifically), but at least one commercial application is known to require it as well (see Application Recipes.)

When VGL_SYNC is enabled, every call to glFinish(), glXWaitGL(), and glXSwapBuffers() will cause the contents of the Pbuffer to be read back and synchronously drawn into the application’s window using the X11 Transport and no frame spoiling. The call to glFinish(), glXWaitGL(), or glXSwapBuffers() will not return until VirtualGL has verified that the pixels have been delivered into the application’s window. As such, this mode can have potentially dire effects on performance when used with a remote 2D X server. It is strongly recommended that VGL_SYNC be used only in conjunction with an X proxy running on the same server as VirtualGL.

If an image transport plugin is being used, then VirtualGL does not automatically enable the X11 Transport or disable frame spoiling when VGL_SYNC is set. This allows the plugin to handle synchronous image delivery as it sees fit (or simply to ignore this option).

Environment Variable VGL_TILESIZE = {t}
Summary {t} = the image tile size ({t} x {t} pixels) to use for multi-threaded compression and interframe comparison (8 <= {t} <= 1024)
Image Transports VGL (JPEG, RGB), Custom (if supported)
Default Value 256
Description
Normally, the VGL Transport will divide an OpenGL window into equal-sized square tiles, compare each tile with the same tile in the previous frame, then compress and send only the tiles that have changed (assuming interframe comparison is enabled). If multi-threaded compression is enabled (see VGL_NPROCS), the VGL Transport will also divide the task of compressing these tiles among the available CPUs in a round-robin fashion.

There are several tradeoffs that must be considered when choosing a tile size:

Parallel scalability:
Compression efficiency:
Inter-frame optimization:
Network efficiency:
256x256 was chosen as the default because, in experiments, it provided the best balance between scalability and efficiency on the platforms that VirtualGL supports.
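
The tiling scheme described above can be illustrated with a minimal Python sketch. This is not VirtualGL's implementation: the function names, the frame representation (a list of pixel rows), and the clipping of tiles at the right and bottom edges are all assumptions made for illustration.

```python
def tile_rects(width, height, t=256):
    """Yield (x, y, w, h) for each t x t tile, scanning row by row.
    Assumption: tiles at the right/bottom edge are clipped to the frame."""
    for y in range(0, height, t):
        for x in range(0, width, t):
            yield (x, y, min(t, width - x), min(t, height - y))


def changed_tiles(prev, cur, width, height, t=256):
    """Interframe comparison: return the tiles whose pixels differ
    between the previous frame and the current frame."""
    changed = []
    for (x, y, w, h) in tile_rects(width, height, t):
        if any(prev[r][x:x + w] != cur[r][x:x + w] for r in range(y, y + h)):
            changed.append((x, y, w, h))
    return changed


def assign_round_robin(tiles, nprocs):
    """Distribute tiles among nprocs compression workers, round-robin."""
    buckets = [[] for _ in range(nprocs)]
    for i, tile in enumerate(tiles):
        buckets[i % nprocs].append(tile)
    return buckets
```

With a 512x512 frame and the default tile size, there are four tiles. If a single pixel changes, only the tile containing it would need to be compressed and sent, which is why the choice of tile size affects both interframe optimization and parallel scalability.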
Environment Variable VGL_TRACE = 0 | 1
vglrun argument -tr / +tr
Summary Disable/enable tracing
Image Transports All
Default Value Disabled
Description
When tracing is enabled, VirtualGL will log all calls to the GLX and X11 functions it is intercepting, as well as the arguments, return values, and execution times for those functions. This is useful when diagnosing interaction problems between VirtualGL and a particular OpenGL application.
Environment Variable VGL_TRANSPORT = {t}
vglrun argument -trans {t}
Summary Use an image transport plugin
Default Value None
Description
If this option is specified, then VirtualGL will attempt to load an image transport plugin contained in a dynamic library named libtransvgl_{t}.so located in the dynamic linker path.
Environment Variable VGL_TRAPX11 = 0 | 1
Summary Disable/enable VirtualGL’s X11 error handler
Image Transports All
Default Value Disabled
Description
If an application does not install its own X11 error handler, then the default X11 error handler is used. The default X11 error handler will cause the application to exit if an X11 error occurs. Enabling the VGL_TRAPX11 option will cause VirtualGL to install its own X11 error handler, which prints a warning message but allows the application to continue running.
Environment Variable VGL_VERBOSE = 0 | 1
vglrun argument -v / +v
Summary Disable/enable verbose VirtualGL messages
Image Transports All
Default Value Disabled
Description
When verbose mode is enabled, VirtualGL will reveal some of the decisions it makes behind the scenes, such as which code path it is using to compress images, which type of X11 drawing it is using, etc. This can be helpful when diagnosing performance problems.
Environment Variable VGL_X11LIB = {l}
Summary {l} = the location of an alternate X11 library
Image Transports All
Description
Normally, VirtualGL tries to call any “real” X11 functions it needs from the X11 library against which it is linked (usually /usr/lib/libX11.so.6, /usr/lib/64/libX11.so.6, /usr/X11R6/lib/libX11.so.6, or /usr/X11R6/lib64/libX11.so.6). Failing this, VirtualGL will then try to call these functions from the first compatible library named libX11.so.6 that is found in the dynamic loader path. You can use the VGL_X11LIB environment variable to override this behavior and specify a dynamic library from which VirtualGL will call the “real” X11 functions.

You shouldn’t need to muck with this unless something doesn’t work. However, it is potentially useful if one wishes to insert another X11 interposer between VirtualGL and the system’s X11 library.
Environment Variable VGL_XVENDOR = {v}
Summary {v} = a fake X11 vendor string to return when the application calls XServerVendor() or ServerVendor()
Image Transports All
Description
Some applications expect the X11 vendor string to contain a particular value, which the application (sometimes erroneously) uses to figure out whether it is being displayed to a local or a remote X server. This setting allows you to fool such applications into thinking that they are being displayed to a “local” X server rather than a remote one.
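
Because the server settings in this section are ordinary environment variables, a launcher script can assemble them before starting an application. The following Python sketch shows one way to do so. The helper name vgl_server_env and its parameters are hypothetical; only the environment variable names and the tile size limits are taken from this section.

```python
import os


def vgl_server_env(sync=None, tilesize=None, trace=None, trapx11=None,
                   verbose=None, xvendor=None, base=None):
    """Return a copy of `base` (default: os.environ) with VGL_* settings
    applied. Arguments left as None are not touched."""
    env = dict(os.environ if base is None else base)
    # Boolean options use the "0 | 1" convention.
    for name, value in (("VGL_SYNC", sync), ("VGL_TRACE", trace),
                        ("VGL_TRAPX11", trapx11), ("VGL_VERBOSE", verbose)):
        if value is not None:
            env[name] = "1" if value else "0"
    if tilesize is not None:
        # Range documented for VGL_TILESIZE: 8 <= t <= 1024.
        if not 8 <= tilesize <= 1024:
            raise ValueError("VGL_TILESIZE must be between 8 and 1024")
        env["VGL_TILESIZE"] = str(tilesize)
    if xvendor is not None:
        env["VGL_XVENDOR"] = xvendor
    return env
```

The resulting dictionary could then be passed as the env argument of subprocess.run() when invoking vglrun, for example subprocess.run(["vglrun", "myapp"], env=vgl_server_env(sync=True)), where myapp stands in for the application being launched.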

19.2 Client Settings

These settings control the VirtualGL Client, which is used only with the VGL Transport. vglclient is normally launched automatically from vglconnect and should not require any further configuration except in exotic circumstances. These settings are meant only for advanced users or those wishing to build additional infrastructure around VirtualGL.

Environment Variable VGLCLIENT_DRAWMODE = ogl | x11
vglclient argument -gl / -x
Summary Specify the method used to draw pixels into the application window
Default Value x11
Description
If the client machine has a 3D accelerator, then it may be faster in some rare instances to draw pixels using OpenGL rather than using 2D (X11) commands.
Environment Variable VGLCLIENT_LISTEN = sslonly | nossl
vglclient argument -sslonly / -nossl
Summary Accept only SSL connections (sslonly) or only unencrypted connections (nossl) from the VirtualGL server
Default Value Accept both SSL and unencrypted connections

This option is available only if the VirtualGL client was built with OpenSSL support.

Environment Variable VGLCLIENT_PORT = {p}
vglclient argument -port {p}
Summary {p} = TCP port on which to listen for unencrypted connections from the VirtualGL server
Default Value Automatically select a free port
Description
The default behavior for vglclient is to first try listening for unencrypted connections on port 4242, to maintain backward compatibility with VirtualGL v2.0.x. If port 4242 is not available, then vglclient will try to find a free port in the range of 4200-4299. If none of those ports is available, then vglclient will request a free port from the operating system.

Setting this option circumvents the automatic behavior described above and causes vglclient to listen only on the specified TCP port.
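
The default port selection described above can be sketched in Python as follows. This is an illustration only, not vglclient's actual code: the function names are hypothetical, and scanning the 4200-4299 range in ascending order is an assumption, since the text does not state the order.

```python
import socket


def try_listen(port):
    """Return a listening socket bound to `port`, or None if the port is
    unavailable. Port 0 asks the operating system for any free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("", port))
        s.listen(1)
        return s
    except OSError:
        s.close()
        return None


def choose_port(preferred=4242, low=4200, high=4299, listen=try_listen):
    """Try the preferred port, then the rest of the range, then let the
    operating system pick (port 0). Returns whatever `listen` returned."""
    candidates = [preferred]
    candidates += [p for p in range(low, high + 1) if p != preferred]
    candidates.append(0)
    for port in candidates:
        result = listen(port)
        if result is not None:
            return result
    raise OSError("no port available")
```

Calling choose_port(preferred=4243) would model the equivalent behavior for SSL connections. The listen parameter exists only so that the fallback order can be exercised without opening real sockets.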
Environment Variable VGL_PROFILE = 0 | 1
Summary Disable/enable profiling output
Default Value Disabled
Description
If profiling output is enabled, then VirtualGL will continuously benchmark itself and periodically print out the throughput of various stages in its image pipelines.

See Chapter 17 for more details.
Environment Variable VGLCLIENT_SSLPORT = {p}
vglclient argument -sslport {p}
Summary {p} = TCP port on which to listen for SSL connections from the VirtualGL server
Default Value Automatically select a free port
Description
The default behavior for vglclient is to first try listening for SSL connections on port 4243, to maintain backward compatibility with VirtualGL v2.0.x. If port 4243 is not available, then vglclient will try to find a free port in the range of 4200-4299. If none of those ports is available, then vglclient will request a free port from the operating system.

Setting this option circumvents the automatic behavior described above and causes vglclient to listen only on the specified TCP port.

This option is available only if the VirtualGL client was built with OpenSSL support.

Environment Variable VGL_VERBOSE = 0 | 1
Summary Disable/enable verbose VirtualGL messages
Default Value Disabled
Description
When verbose mode is enabled, VirtualGL will reveal some of the decisions it makes behind the scenes, such as which code path it is using to decompress images, which type of X11 drawing it is using, etc. This can be helpful when diagnosing performance problems.
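
For those building additional infrastructure around VirtualGL, the client settings above can also be supplied as vglclient arguments. The following Python sketch assembles such a command line. The helper name vglclient_args is hypothetical; the flags and setting values are those listed in this section.

```python
def vglclient_args(drawmode=None, listen=None, port=None, sslport=None):
    """Build a vglclient argument list from the client settings.
    Arguments left as None are omitted, so vglclient uses its defaults."""
    args = ["vglclient"]
    if drawmode is not None:
        # VGLCLIENT_DRAWMODE = ogl | x11  ->  -gl / -x
        args.append({"ogl": "-gl", "x11": "-x"}[drawmode])
    if listen is not None:
        # VGLCLIENT_LISTEN = sslonly | nossl  ->  -sslonly / -nossl
        args.append({"sslonly": "-sslonly", "nossl": "-nossl"}[listen])
    if port is not None:
        args += ["-port", str(port)]
    if sslport is not None:
        args += ["-sslport", str(sslport)]
    return args
```

An unrecognized drawmode or listen value raises KeyError rather than being passed through, which keeps a typo from silently producing an invalid command line.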