Nvidia Geforce 6 Series Manual

    Alternatively, conditional writes (that is, write if a condition code is set) can be used
    when branching is not performance-effective. In practice, the compiler will use the
    method that delivers higher performance when possible.
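
The difference between branching and conditional writes can be sketched on the CPU. This is only an illustrative model, not GPU code: the function names, the threshold test, and the two arithmetic "sides" are hypothetical and do not come from the chapter.

```python
def shade_with_branch(values, threshold):
    """Branching model: each fragment executes only one side of the test."""
    out = []
    for v in values:
        if v > threshold:
            out.append(v * 0.5)
        else:
            out.append(v + 1.0)
    return out


def conditional_write(dest, src, condition_code):
    """Model of a conditional write: src replaces dest only if the code is set."""
    return src if condition_code else dest


def shade_with_conditional_write(values, threshold):
    """Conditional-write model: both sides are computed for every fragment,
    and the condition code decides which result is kept."""
    out = []
    for v in values:
        condition_code = v > threshold   # hypothetical condition code
        result = v + 1.0                 # "else" side, always computed
        candidate = v * 0.5              # "if" side, always computed
        result = conditional_write(result, candidate, condition_code)
        out.append(result)
    return out


fragments = [0.0, 0.25, 0.75, 1.0]
branch_result = shade_with_branch(fragments, 0.5)
cond_result = shade_with_conditional_write(fragments, 0.5)
```

Both versions produce the same values. The conditional-write version always pays for both sides, which is why it tends to suit short branches, while real branching can skip work when one side is expensive.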
    30.5.4 Use fp16 Intermediate Values Wherever Possible
    GeForce 6 Series GPUs support a full-speed fp16 normalize instruction in parallel
    with the multiplies and adds, and fp16 intermediate values reduce internal storage
    and datapath requirements. Using fp16 intermediate values wherever possible can
    therefore be a performance win; reserve fp32 intermediate values for cases where
    the precision is needed.
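
To get a feel for when fp16 precision is sufficient, a value can be rounded through a 16-bit float on the CPU. The sketch below assumes the GPU's fp16 format behaves like IEEE 754 half precision (1 sign bit, 5 exponent bits, 10 mantissa bits) and uses Python's `struct` format code `'e'` as a stand-in. The helper name `to_fp16` and the sample values are illustrative only.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest half-precision value (assumed
    to approximate the GPU's fp16 format) and return it as a float."""
    return struct.unpack('<e', struct.pack('<e', x))[0]


def fp16_error(x):
    """Absolute error introduced by storing x as fp16."""
    return abs(to_fp16(x) - x)


# A component of a normalized vector: values in [-1, 1] keep about
# three decimal digits, which is usually enough for lighting math.
unit_component_error = fp16_error(0.7071)

# A large coordinate: between 2048 and 4096 the spacing between
# representable fp16 values is 2, so the fractional part is lost.
large_coordinate = to_fp16(2049.3)
large_coordinate_error = fp16_error(2049.3)
```

Under these assumptions, small-magnitude quantities such as colors and unit vectors survive with little error, while large magnitudes such as wide-ranging texture coordinates are candidates for fp32.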
    Excessive internal storage requirements can adversely affect performance in the following
    way: The shader pipeline is optimized to keep hundreds of fragments in flight given
    a fixed amount of register space per fragment (four fp32×4 registers or eight fp16×4
    registers). If the register space is exceeded, then fewer fragments can remain in flight,
    reducing the latency tolerance for texture fetches, and adversely affecting performance.
    The GeForce 6 Series fragment processor will have the maximum number of fragments
    in flight when shader programs use up to four fp32×4 temporary registers (or eight
    fp16×4 registers). That is, at any one time, a maximum of four temporary fp32×4 (or
    eight fp16×4) registers are in use. This decision was based on the fact that for the
    overwhelming majority of analyzed shaders, four or fewer simultaneously active fp32×4
    registers proved to be the sweet spot during the shaders’ execution. In addition, the
    architecture is designed so that performance degrades slowly if more registers are used.
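
The register budget described above can be expressed as a small calculation. The budget of four fp32×4 (or eight fp16×4) registers comes from the text. Two things in the sketch are assumptions: that fp32 and fp16 temporaries share one budget with two fp16×4 registers counting as one fp32×4 register, and the inverse-proportional occupancy model. The text only says that fewer fragments stay in flight and that performance degrades slowly, so treat the numbers as a rough mental model, not a hardware specification.

```python
FP32X4_BUDGET = 4  # from the text: four fp32x4 (or eight fp16x4) registers


def fp32x4_equivalents(num_fp32x4, num_fp16x4):
    """Register footprint in fp32x4 units.
    ASSUMPTION: two fp16x4 registers occupy the space of one fp32x4."""
    return num_fp32x4 + num_fp16x4 / 2


def fits_full_occupancy(num_fp32x4, num_fp16x4):
    """True if the shader stays within the budget for maximum fragments in flight."""
    return fp32x4_equivalents(num_fp32x4, num_fp16x4) <= FP32X4_BUDGET


def relative_fragments_in_flight(num_fp32x4, num_fp16x4):
    """ASSUMPTION: a fixed register pool shared evenly, so the number of
    fragments in flight scales as budget / footprint once over budget."""
    used = fp32x4_equivalents(num_fp32x4, num_fp16x4)
    if used <= FP32X4_BUDGET:
        return 1.0
    return FP32X4_BUDGET / used


# Six live fp32x4 temporaries exceed the budget; converting four of
# them to fp16x4 brings the footprint back to 2 + 4/2 = 4.
before = fits_full_occupancy(6, 0)
after = fits_full_occupancy(2, 4)
```

In this model, demoting temporaries that do not need full precision is what recovers the in-flight fragment count, which matches the chapter's advice to prefer fp16 intermediates.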
    Similarly, the register file has enough read and write bandwidth to keep all the units
    busy if reading fp16×4 values, but it may run out of bandwidth to feed all units if
    using fp32×4 values exclusively. NVIDIA’s compiler technology is smart enough to
    reduce this effect’s impact substantially, but fp16 intermediate values are never slower
    than fp32 values; because of the resource restrictions and the fp16 normalize hardware,
    they can often be much faster.
    30.6 Conclusion
    GeForce 6 Series GPUs provide the GPU programmer with unparalleled flexibility and
    performance in a product line that spans the entire PC market. After reading this chapter,
    you should have a better understanding of what GeForce 6 Series GPUs are capable
    of, and you should be able to use this knowledge to develop applications—either
    graphical or general purpose—in a more efficient way.
    Excerpted from GPU Gems 2
    Copyright 2005 by NVIDIA Corporation  
    						
    							
    Copyright © NVIDIA Corporation 2004
    GPU Gems 2
    Programming Techniques for High-Performance Graphics and General-Purpose Computation
    880 full-color pages, 330 figures
    Hard cover
    $59.99
    Available at GDC 2005 (March 7, 2005)
    Experts from universities and industry
    Graphics Programming:
        Geometric Complexity
        Shading, Lighting, and Shadows
        High-Quality Rendering
    GPGPU Programming:
        General Purpose Computation on GPUs: A Primer
        Image-Oriented Computing
        Simulation and Numerical Algorithms
    Sign up for e-mail notification when the book is available at:
    http://developer.nvidia.com/object/gpu_gems_2_notification.html
    For more information, please visit:
    http://developer.nvidia.com/object/gpu_gems_2_home.html 