After spending most of the past decade without a decent computer, only laptops whose GPUs were better at toasting bread than at proper gaming, I finally cracked open the spare Bitcoin piggy bank and built my dream machine around an i7-4790k and an Nvidia 970 GPU.

I could play Witcher 3 at last, so many great games to catch up on. :-)

But before that I had to get the maximum performance the hardware can provide through overclocking.

The point is, I’m a nerd and nerds like to tweak things. We’re not the kind that puts up with bloated closed-source software and crappy Xmas-tree GUIs. I thus needed a simple and snappy tool to overclock my brand new GPU and bring it on par with a 980 model.

Let’s be honest here, the main contenders (offenders, rather) would make the eyes of any sane person bleed instantly:

Those tools are respectively from MSI, EVGA, Gigabyte and Asus.

A quick look at the interfaces and features suggests they are very similar and probably all built upon the same toolkit; it’s called “RTHAL” in MSI Afterburner.

Anyways this is bad software and their authors should feel bad. Not everyone buying those graphics cards is a 14yo xXX_l33thaxor1ny0ma|\/|4_XXx who wants dragons and giant robots on their packaging.

There is no lightweight, open-source overclocking software for power users these days, mostly because GPU makers won’t publish their docs. The situation needs a fix.

Where do we start?

Nvidia has an API to talk to their driver, at least under Windows. It’s conveniently named NvAPI and it is documented here: GPU Performance State Interface.

What hits the hopeful coder square in the face when reading that is the very reduced set of functions available:

    NVAPI_INTERFACE NvAPI_GPU_GetPstates20 (__in NvPhysicalGpuHandle hPhysicalGpu, __inout NV_GPU_PERF_PSTATES20_INFO *pPstatesInfo)
    NVAPI_INTERFACE NvAPI_GPU_GetCurrentPstate (NvPhysicalGpuHandle hPhysicalGpu, NV_GPU_PERF_PSTATE_ID *pCurrentPstate)
    NVAPI_INTERFACE NvAPI_GPU_GetDynamicPstatesInfoEx (NvPhysicalGpuHandle hPhysicalGpu, NV_GPU_DYNAMIC_PSTATES_INFO_EX *pDynamicPstatesInfoEx)

Yep, that also sucks, plenty of functions with “Get” in their names but almost none with “Set”.

Indeed this public API is incomplete. After lurking around the interwebs, it seems the full-featured API, headers, libs and docs are provided under an NDA, and I suppose there is not the slightest chance I could access that information legitimately.

I don’t have time to waste jumping through those hoops, creating accounts and whatnot either. I just want to overclock my GPU, and for once the Internet doesn’t have anything of that sort readily available since RivaTuner, which was never open source in the first place.

So let’s grab a shovel and go deeper.

What do we know, what do we need?

We know for sure that the tools from card makers can overclock through NvAPI by calling undocumented functions.

Probably if there’s some “public” Get…

NvAPI_GPU_GetPstates20

then there’s a “private” Set… hiding somewhere:

NvAPI_GPU_SetPstates20

Maybe it is more complicated than that for all we know, so let’s start by running MSI Afterburner inside OllyDbg. Browsing the string references quickly lands us here:

“nvapi.dll” definitely gets loaded here using LoadLibrary/GetModuleHandle. We’re on the right track.

Now where exactly is that lib used? There could be thousands of occurrences.

That’s simple: with the program running and the realtime graph disabled (it polls NvAPI constantly, adding noise to the mass of API calls), we place a memory breakpoint on the .text segment of nvapi.dll inside MSI Afterburner’s process (just hit F2 in the segments window when NvAPI is highlighted…).

Then we set the sliders in the MSI tool to get some negligible GPU underclock and hit the “apply” button. It breaks inside NvAPI… magic!

But wait, this isn’t the “overclocking” function (SetPstates20()) there: the symbol for the return pointer on top of the stack shows something along the lines of “QueryInterface”.

Long story short, this “NvAPI_QueryInterface” function is the only function exported from nvapi.dll.

Its purpose is to take the ID of a function in the API and return a pointer to the actual code of the function in the mapped process. It probably serves as a convenient layer for not breaking the API across updates and also for obfuscating the entry points where the goods are to be found.

Actually, if you get the NvAPI SDK from Nvidia’s website you’ll find a linkable module inside the archive. It does no real work of its own and just acts as an “exports proxy”: it exports the names of all the public functions of the API, and when one of them is called it retrieves the real pointer with the ID it holds and executes the real function.

Ultimately the end user/programmer doesn’t have to be aware of all those ID things: they just call the public functions, linking the module from the public SDK and using the public headers.
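To make the "exports proxy" idea concrete, here is a minimal sketch in portable C. Everything in it is made up for illustration: `fake_query_interface`, `real_get_value`, `proxy_get_value` and the ID `0x1234ABCD` are hypothetical stand-ins, and this is not how Nvidia's module is actually written, only the general shape of a stub that resolves a pointer by ID and forwards the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic pointer type returned by the resolver (hypothetical). */
typedef void (*generic_fn)(void);
/* Signature shared by the public stub and the hidden function. */
typedef int (*get_value_fn)(int *out);

/* Made-up ID, standing in for the 32-bit IDs the driver uses. */
#define ID_GET_VALUE 0x1234ABCDu

/* Stand-in for the real implementation hidden inside the DLL. */
static int real_get_value(int *out) { *out = 42; return 0; }

static int resolve_count = 0; /* how many times the resolver was asked */

/* Stand-in for QueryInterface: maps an ID to a function pointer. */
static generic_fn fake_query_interface(uint32_t id)
{
    resolve_count++;
    return id == ID_GET_VALUE ? (generic_fn)real_get_value : NULL;
}

/* Public stub: the only thing the linkable module would export. */
int proxy_get_value(int *out)
{
    static get_value_fn cached = NULL; /* resolved once, then reused */

    assert(out != NULL);
    if (cached == NULL)
        cached = (get_value_fn)fake_query_interface(ID_GET_VALUE);
    if (cached == NULL)
        return -1; /* this "driver" does not know the ID */
    return cached(out);
}
```

Calling `proxy_get_value(&v)` twice leaves 42 in `v` both times but only hits the resolver once, which is the whole point of holding the ID inside the stub.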

You may already have guessed, I don’t want to proceed that way.

Fortunately, if you look again at the previous screenshot of OllyDbg inside the QueryInterface() function, you’ll find the sole INT argument to the function on top of the stack, just under the return pointer: it’s 0xF4DAE6B. We’re getting closer!

Let’s continue running the program in Olly and break a second time on NvAPI. We learn from the symbols floating around that MSI Afterburner just initiated a call to “Nv_SetPStates20()”, so 0xF4DAE6B is certainly the ID of the function we’re looking for.

Good, we just need its prototype and arguments to be able, in turn, to declare and use it inside our own code.

Also a quick web search for 0xF4DAE6B yielded this very interesting result where an amazing Russian dude with only 2 messages on his stackoverflow.com profile still found a way to drop this sweet piece of data which looks disturbingly like what the NDA version of the API would be:

_NvAPI_Initialize 150E828h

_NvAPI_Unload 0D22BDD7Eh

_NvAPI_GetErrorMessage 6C2D048Ch

_NvAPI_GetInterfaceVersionString 1053FA5h

_NvAPI_GetDisplayDriverVersion 0F951A4D1h

_NvAPI_SYS_GetDriverAndBranchVersion 2926AAADh

_NvAPI_EnumNvidiaDisplayHandle 9ABDD40Dh

_NvAPI_EnumNvidiaUnAttachedDisplayHandle 20DE9260h

_NvAPI_EnumPhysicalGPUs 0E5AC921Fh

_NvAPI_EnumLogicalGPUs 48B3EA59h

_NvAPI_GetPhysicalGPUsFromDisplay 34EF9506h

_NvAPI_GetPhysicalGPUFromUnAttachedDisplay 5018ED61h

_NvAPI_CreateDisplayFromUnAttachedDisplay 63F9799Eh

_NvAPI_GetLogicalGPUFromDisplay 0EE1370CFh

_NvAPI_GetLogicalGPUFromPhysicalGPU 0ADD604D1h

_NvAPI_GetPhysicalGPUsFromLogicalGPU 0AEA3FA32h

_NvAPI_GetAssociatedNvidiaDisplayHandle 35C29134h

_NvAPI_DISP_GetAssociatedUnAttachedNvidiaDisplayHandle 0A70503B2h

_NvAPI_GetAssociatedNvidiaDisplayName 22A78B05h

_NvAPI_GetUnAttachedAssociatedDisplayName 4888D790h

_NvAPI_EnableHWCursor 2863148Dh

_NvAPI_DisableHWCursor 0AB163097h

_NvAPI_GetVBlankCounter 67B5DB55h

_NvAPI_SetRefreshRateOverride 3092AC32h

_NvAPI_GetAssociatedDisplayOutputId 0D995937Eh

_NvAPI_GetDisplayPortInfo 0C64FF367h

_NvAPI_SetDisplayPort 0FA13E65Ah

_NvAPI_GetHDMISupportInfo 6AE16EC3h

_NvAPI_DISP_EnumHDMIStereoModes 0D2CCF5D6h

_NvAPI_GetInfoFrame 9734F1Dh

_NvAPI_SetInfoFrame 69C6F365h

_NvAPI_SetInfoFrameState 67EFD887h

_NvAPI_GetInfoFrameState 41511594h

_NvAPI_Disp_InfoFrameControl 6067AF3Fh

_NvAPI_Disp_ColorControl 92F9D80Dh

_NvAPI_DISP_GetVirtualModeData 3230D69Ah

_NvAPI_DISP_OverrideDisplayModeList 291BFF2h

_NvAPI_GetDisplayDriverMemoryInfo 774AA982h

_NvAPI_GetDriverMemoryInfo 2DC95125h

_NvAPI_GetDVCInfo 4085DE45h

_NvAPI_SetDVCLevel 172409B4h

_NvAPI_GetDVCInfoEx 0E45002Dh

_NvAPI_SetDVCLevelEx 4A82C2B1h

_NvAPI_GetHUEInfo 95B64341h

_NvAPI_SetHUEAngle 0F5A0F22Ch

_NvAPI_GetImageSharpeningInfo 9FB063DFh

_NvAPI_SetImageSharpeningLevel 3FC9A59Ch

_NvAPI_D3D_GetCurrentSLIState 4B708B54h

_NvAPI_D3D9_RegisterResource 0A064BDFCh

_NvAPI_D3D9_UnregisterResource 0BB2B17AAh

_NvAPI_D3D9_AliasSurfaceAsTexture 0E5CEAE41h

_NvAPI_D3D9_StretchRectEx 22DE03AAh

_NvAPI_D3D9_ClearRT 332D3942h

_NvAPI_D3D_CreateQuery 5D19BCA4h

_NvAPI_D3D_DestroyQuery 0C8FF7258h

_NvAPI_D3D_Query_Begin 0E5A9AAE0h

_NvAPI_D3D_Query_End 2AC084FAh

_NvAPI_D3D_Query_GetData 0F8B53C69h

_NvAPI_D3D_Query_GetDataSize 0F2A54796h

_NvAPI_D3D_Query_GetType 4ACEEAF7h

_NvAPI_D3D_RegisterApp 0D44D3C4Eh

_NvAPI_D3D9_CreatePathContextNV 0A342F682h

_NvAPI_D3D9_DestroyPathContextNV 667C2929h

_NvAPI_D3D9_CreatePathNV 71329DF3h

_NvAPI_D3D9_DeletePathNV 73E0019Ah

_NvAPI_D3D9_PathVerticesNV 0C23DF926h

_NvAPI_D3D9_PathParameterfNV 0F7FF00C1h

_NvAPI_D3D9_PathParameteriNV 0FC31236Ch

_NvAPI_D3D9_PathMatrixNV 0D2F6C499h

_NvAPI_D3D9_PathDepthNV 0FCB16330h

_NvAPI_D3D9_PathClearDepthNV 157E45C4h

_NvAPI_D3D9_PathEnableDepthTestNV 0E99BA7F3h

_NvAPI_D3D9_PathEnableColorWriteNV 3E2804A2h

_NvAPI_D3D9_DrawPathNV 13199B3Dh

_NvAPI_D3D9_GetSurfaceHandle 0F2DD3F2h

_NvAPI_D3D9_GetOverlaySurfaceHandles 6800F5FCh

_NvAPI_D3D9_GetTextureHandle 0C7985ED5h

_NvAPI_D3D9_GpuSyncGetHandleSize 80C9FD3Bh

_NvAPI_D3D9_GpuSyncInit 6D6FDAD4h

_NvAPI_D3D9_GpuSyncEnd 754033F0h

_NvAPI_D3D9_GpuSyncMapTexBuffer 0CDE4A28Ah

_NvAPI_D3D9_GpuSyncMapSurfaceBuffer 2AB714ABh

_NvAPI_D3D9_GpuSyncMapVertexBuffer 0DBC803ECh

_NvAPI_D3D9_GpuSyncMapIndexBuffer 12EE68F2h

_NvAPI_D3D9_SetPitchSurfaceCreation 18CDF365h

_NvAPI_D3D9_GpuSyncAcquire 0D00B8317h

_NvAPI_D3D9_GpuSyncRelease 3D7A86BBh

_NvAPI_D3D9_GetCurrentRenderTargetHandle 22CAD61h

_NvAPI_D3D9_GetCurrentZBufferHandle 0B380F218h

_NvAPI_D3D9_GetIndexBufferHandle 0FC5A155Bh

_NvAPI_D3D9_GetVertexBufferHandle 72B19155h

_NvAPI_D3D9_CreateTexture 0D5E13573h

_NvAPI_D3D9_AliasPrimaryAsTexture 13C7112Eh

_NvAPI_D3D9_PresentSurfaceToDesktop 0F7029C5h

_NvAPI_D3D9_CreateVideoBegin 84C9D553h

_NvAPI_D3D9_CreateVideoEnd 0B476BF61h

_NvAPI_D3D9_CreateVideo 89FFD9A3h

_NvAPI_D3D9_FreeVideo 3111BED1h

_NvAPI_D3D9_PresentVideo 5CF7F862h

_NvAPI_D3D9_VideoSetStereoInfo 0B852F4DBh

_NvAPI_D3D9_SetGamutData 2BBDA32Eh

_NvAPI_D3D9_SetSurfaceCreationLayout 5609B86Ah

_NvAPI_D3D9_GetVideoCapabilities 3D596B93h

_NvAPI_D3D9_QueryVideoInfo 1E6634B3h

_NvAPI_D3D9_AliasPrimaryFromDevice 7C20C5BEh

_NvAPI_D3D9_SetResourceHint 905F5C27h

_NvAPI_D3D9_Lock 6317345Ch

_NvAPI_D3D9_Unlock 0C182027Eh

_NvAPI_D3D9_GetVideoState 0A4527BF8h

_NvAPI_D3D9_SetVideoState 0BD4BC56Fh

_NvAPI_D3D9_EnumVideoFeatures 1DB7C52Ch

_NvAPI_D3D9_GetSLIInfo 694BFF4Dh

_NvAPI_D3D9_SetSLIMode 0BFDC062Ch

_NvAPI_D3D9_QueryAAOverrideMode 0DDF5643Ch

_NvAPI_D3D9_VideoSurfaceEncryptionControl 9D2509EFh

_NvAPI_D3D9_DMA 962B8AF6h

_NvAPI_D3D9_EnableStereo 492A6954h

_NvAPI_D3D9_StretchRect 0AEAECD41h

_NvAPI_D3D9_CreateRenderTarget 0B3827C8h

_NvAPI_D3D9_NVFBC_GetStatus 0BD3EB475h

_NvAPI_D3D9_IFR_SetUpTargetBufferToSys 55255D05h

_NvAPI_D3D9_GPUBasedCPUSleep 0D504DDA7h

_NvAPI_D3D9_IFR_TransferRenderTarget 0AB7C2DCh

_NvAPI_D3D9_IFR_SetUpTargetBufferToNV12BLVideoSurface 0CFC92C15h

_NvAPI_D3D9_IFR_TransferRenderTargetToNV12BLVideoSurface 5FE72F64h

_NvAPI_D3D10_AliasPrimaryAsTexture 8AAC133Dh

_NvAPI_D3D10_SetPrimaryFlipChainCallbacks 73EB9329h

_NvAPI_D3D10_ProcessCallbacks 0AE9C2019h

_NvAPI_D3D10_GetRenderedCursorAsBitmap 0CAC3CE5Dh

_NvAPI_D3D10_BeginShareResource 35233210h

_NvAPI_D3D10_BeginShareResourceEx 0EF303A9Dh

_NvAPI_D3D10_EndShareResource 0E9C5853h

_NvAPI_D3D10_SetDepthBoundsTest 4EADF5D2h

_NvAPI_D3D10_CreateDevice 2DE11D61h

_NvAPI_D3D10_CreateDeviceAndSwapChain 5B803DAFh

_NvAPI_D3D11_CreateDevice 6A16D3A0h

_NvAPI_D3D11_CreateDeviceAndSwapChain 0BB939EE5h

_NvAPI_D3D11_BeginShareResource 121BDC6h

_NvAPI_D3D11_EndShareResource 8FFB8E26h

_NvAPI_D3D11_SetDepthBoundsTest 7AAF7A04h

_NvAPI_GPU_GetShaderPipeCount 63E2F56Fh

_NvAPI_GPU_GetShaderSubPipeCount 0BE17923h

_NvAPI_GPU_GetPartitionCount 86F05D7Ah

_NvAPI_GPU_GetMemPartitionMask 329D77CDh

_NvAPI_GPU_GetTPCMask 4A35DF54h

_NvAPI_GPU_GetSMMask 0EB7AF173h

_NvAPI_GPU_GetTotalTPCCount 4E2F76A8h

_NvAPI_GPU_GetTotalSMCount 0AE5FBCFEh

_NvAPI_GPU_GetTotalSPCount 0B6D62591h

_NvAPI_GPU_GetGpuCoreCount 0C7026A87h

_NvAPI_GPU_GetAllOutputs 7D554F8Eh

_NvAPI_GPU_GetConnectedOutputs 1730BFC9h

_NvAPI_GPU_GetConnectedSLIOutputs 680DE09h

_NvAPI_GPU_GetConnectedDisplayIds 78DBA2h

_NvAPI_GPU_GetAllDisplayIds 785210A2h

_NvAPI_GPU_GetConnectedOutputsWithLidState 0CF8CAF39h

_NvAPI_GPU_GetConnectedSLIOutputsWithLidState 96043CC7h

_NvAPI_GPU_GetSystemType 0BAAABFCCh

_NvAPI_GPU_GetActiveOutputs 0E3E89B6Fh

_NvAPI_GPU_GetEDID 37D32E69h

_NvAPI_GPU_SetEDID 0E83D6456h

_NvAPI_GPU_GetOutputType 40A505E4h

_NvAPI_GPU_GetDeviceDisplayMode 0D2277E3Ah

_NvAPI_GPU_GetFlatPanelInfo 36CFF969h

_NvAPI_GPU_ValidateOutputCombination 34C9C2D4h

_NvAPI_GPU_GetConnectorInfo 4ECA2C10h

_NvAPI_GPU_GetFullName 0CEEE8E9Fh

_NvAPI_GPU_GetPCIIdentifiers 2DDFB66Eh

_NvAPI_GPU_GetGPUType 0C33BAEB1h

_NvAPI_GPU_GetBusType 1BB18724h

_NvAPI_GPU_GetBusId 1BE0B8E5h

_NvAPI_GPU_GetBusSlotId 2A0A350Fh

_NvAPI_GPU_GetIRQ 0E4715417h

_NvAPI_GPU_GetVbiosRevision 0ACC3DA0Ah

_NvAPI_GPU_GetVbiosOEMRevision 2D43FB31h

_NvAPI_GPU_GetVbiosVersionString 0A561FD7Dh

_NvAPI_GPU_GetAGPAperture 6E042794h

_NvAPI_GPU_GetCurrentAGPRate 0C74925A0h

_NvAPI_GPU_GetCurrentPCIEDownstreamWidth 0D048C3B1h

_NvAPI_GPU_GetPhysicalFrameBufferSize 46FBEB03h

_NvAPI_GPU_GetVirtualFrameBufferSize 5A04B644h

_NvAPI_GPU_GetQuadroStatus 0E332FA47h

_NvAPI_GPU_GetBoardInfo 22D54523h

_NvAPI_GPU_GetRamType 57F7CAACh

_NvAPI_GPU_GetFBWidthAndLocation 11104158h

_NvAPI_GPU_GetAllClockFrequencies 0DCB616C3h

_NvAPI_GPU_GetPerfClocks 1EA54A3Bh

_NvAPI_GPU_SetPerfClocks 7BCF4ACh

_NvAPI_GPU_GetCoolerSettings 0DA141340h

_NvAPI_GPU_SetCoolerLevels 891FA0AEh

_NvAPI_GPU_RestoreCoolerSettings 8F6ED0FBh

_NvAPI_GPU_GetCoolerPolicyTable 518A32Ch

_NvAPI_GPU_SetCoolerPolicyTable 987947CDh

_NvAPI_GPU_RestoreCoolerPolicyTable 0D8C4FE63h

_NvAPI_GPU_GetPstatesInfo 0BA94C56Eh

_NvAPI_GPU_GetPstatesInfoEx 843C0256h

_NvAPI_GPU_SetPstatesInfo 0CDF27911h

_NvAPI_GPU_GetPstates20 6FF81213h

_NvAPI_GPU_SetPstates20 0F4DAE6Bh

_NvAPI_GPU_GetCurrentPstate 927DA4F6h

_NvAPI_GPU_GetPstateClientLimits 88C82104h

_NvAPI_GPU_SetPstateClientLimits 0FDFC7D49h

_NvAPI_GPU_EnableOverclockedPstates 0B23B70EEh

_NvAPI_GPU_EnableDynamicPstates 0FA579A0Fh

_NvAPI_GPU_GetDynamicPstatesInfoEx 60DED2EDh

_NvAPI_GPU_GetVoltages 7D656244h

_NvAPI_GPU_GetThermalSettings 0E3640A56h

_NvAPI_GPU_SetDitherControl 0DF0DFCDDh

_NvAPI_GPU_GetDitherControl 932AC8FBh

_NvAPI_GPU_GetColorSpaceConversion 8159E87Ah

_NvAPI_GPU_SetColorSpaceConversion 0FCABD23Ah

_NvAPI_GetTVOutputInfo 30C805D5h

_NvAPI_GetTVEncoderControls 5757474Ah

_NvAPI_SetTVEncoderControls 0CA36A3ABh

_NvAPI_GetTVOutputBorderColor 6DFD1C8Ch

_NvAPI_SetTVOutputBorderColor 0AED02700h

_NvAPI_GetDisplayPosition 6BB1EE5Dh

_NvAPI_SetDisplayPosition 57D9060Fh

_NvAPI_GetValidGpuTopologies 5DFAB48Ah

_NvAPI_GetInvalidGpuTopologies 15658BE6h

_NvAPI_SetGpuTopologies 25201F3Dh

_NvAPI_GPU_GetPerGpuTopologyStatus 0A81F8992h

_NvAPI_SYS_GetChipSetTopologyStatus 8A50F126h

_NvAPI_GPU_Get_DisplayPort_DongleInfo 76A70E8Dh

_NvAPI_I2CRead 2FDE12C5h

_NvAPI_I2CWrite 0E812EB07h

_NvAPI_I2CWriteEx 283AC65Ah

_NvAPI_I2CReadEx 4D7B0709h

_NvAPI_GPU_GetPowerMizerInfo 76BFA16Bh

_NvAPI_GPU_SetPowerMizerInfo 50016C78h

_NvAPI_GPU_GetVoltageDomainsStatus 0C16C7E2Ch

_NvAPI_GPU_ClientPowerTopologyGetInfo 0A4DFD3F2h

_NvAPI_GPU_ClientPowerTopologyGetStatus 0EDCF624Eh

_NvAPI_GPU_ClientPowerPoliciesGetInfo 34206D86h

_NvAPI_GPU_ClientPowerPoliciesGetStatus 70916171h

_NvAPI_GPU_ClientPowerPoliciesSetStatus 0AD95F5EDh

_NvAPI_GPU_WorkstationFeatureSetup 6C1F3FE4h

_NvAPI_SYS_GetChipSetInfo 53DABBCAh

_NvAPI_SYS_GetLidAndDockInfo 0CDA14D8Ah

_NvAPI_OGL_ExpertModeSet 3805EF7Ah

_NvAPI_OGL_ExpertModeGet 22ED9516h

_NvAPI_OGL_ExpertModeDefaultsSet 0B47A657Eh

_NvAPI_OGL_ExpertModeDefaultsGet 0AE921F12h

_NvAPI_SetDisplaySettings 0E04F3D86h

_NvAPI_GetDisplaySettings 0DC27D5D4h

_NvAPI_GetTiming 0AFC4833Eh

_NvAPI_DISP_GetMonitorCapabilities 3B05C7E1h

_NvAPI_EnumCustomDisplay 42892957h

_NvAPI_TryCustomDisplay 0BF6C1762h

_NvAPI_RevertCustomDisplayTrial 854BA405h

_NvAPI_DeleteCustomDisplay 0E7CB998Dh

_NvAPI_SaveCustomDisplay 0A9062C78h

_NvAPI_QueryUnderscanCap 61D7B624h

_NvAPI_EnumUnderscanConfig 4144111Ah

_NvAPI_DeleteUnderscanConfig 0F98854C8h

_NvAPI_SetUnderscanConfig 3EFADA1Dh

_NvAPI_GetDisplayFeatureConfig 8E985CCDh

_NvAPI_SetDisplayFeatureConfig 0F36A668Dh

_NvAPI_GetDisplayFeatureConfigDefaults 0F5F4D01h

_NvAPI_SetView 957D7B6h

_NvAPI_GetView 0D6B99D89h

_NvAPI_SetViewEx 6B89E68h

_NvAPI_GetViewEx 0DBBC0AF4h

_NvAPI_GetSupportedViews 66FB7FC0h

_NvAPI_GetHDCPLinkParameters 0B3BB0772h

_NvAPI_Disp_DpAuxChannelControl 8EB56969h

_NvAPI_SetHybridMode 0FB22D656h

_NvAPI_GetHybridMode 0E23B68C1h

_NvAPI_Coproc_GetCoprocStatus 1EFC3957h

_NvAPI_Coproc_SetCoprocInfoFlagsEx 0F4C863ACh

_NvAPI_Coproc_GetCoprocInfoFlagsEx 69A9874Dh

_NvAPI_Coproc_NotifyCoprocPowerState 0CADCB956h

_NvAPI_Coproc_GetApplicationCoprocInfo 79232685h

_NvAPI_GetVideoState 1C5659CDh

_NvAPI_SetVideoState 54FE75Ah

_NvAPI_SetFrameRateNotify 18919887h

_NvAPI_SetPVExtName 4FEEB498h

_NvAPI_GetPVExtName 2F5B08E0h

_NvAPI_SetPVExtProfile 8354A8F4h

_NvAPI_GetPVExtProfile 1B1B9A16h

_NvAPI_VideoSetStereoInfo 97063269h

_NvAPI_VideoGetStereoInfo 8E1F8CFEh

_NvAPI_Mosaic_GetSupportedTopoInfo 0FDB63C81h

_NvAPI_Mosaic_GetTopoGroup 0CB89381Dh

_NvAPI_Mosaic_GetOverlapLimits 989685F0h

_NvAPI_Mosaic_SetCurrentTopo 9B542831h

_NvAPI_Mosaic_GetCurrentTopo 0EC32944Eh

_NvAPI_Mosaic_EnableCurrentTopo 5F1AA66Ch

_NvAPI_Mosaic_SetGridTopology 3F113C77h

_NvAPI_Mosaic_GetMosaicCapabilities 0DA97071Eh

_NvAPI_Mosaic_GetDisplayCapabilities 0D58026B9h

_NvAPI_Mosaic_EnumGridTopologies 0A3C55220h

_NvAPI_Mosaic_GetDisplayViewportsByResolution 0DC6DC8D3h

_NvAPI_Mosaic_GetMosaicViewports 7EBA036h

_NvAPI_Mosaic_SetDisplayGrids 4D959A89h

_NvAPI_Mosaic_ValidateDisplayGridsWithSLI 1ECFD263h

_NvAPI_Mosaic_ValidateDisplayGrids 0CF43903Dh

_NvAPI_Mosaic_EnumDisplayModes 78DB97D7h

_NvAPI_Mosaic_ChooseGpuTopologies 0B033B140h

_NvAPI_Mosaic_EnumDisplayGrids 0DF2887AFh

_NvAPI_GetSupportedMosaicTopologies 410B5C25h

_NvAPI_GetCurrentMosaicTopology 0F60852BDh

_NvAPI_SetCurrentMosaicTopology 0D54B8989h

_NvAPI_EnableCurrentMosaicTopology 74073CC9h

_NvAPI_QueryNonMigratableApps 0BB9EF1C3h

_NvAPI_GPU_QueryActiveApps 65B1C5F5h

_NvAPI_Hybrid_QueryUnblockedNonMigratableApps 5F35BCB5h

_NvAPI_Hybrid_QueryBlockedMigratableApps 0F4C2F8CCh

_NvAPI_Hybrid_SetAppMigrationState 0FA0B9A59h

_NvAPI_Hybrid_IsAppMigrationStateChangeable 584CB0B6h

_NvAPI_GPU_GPIOQueryLegalPins 0FAB69565h

_NvAPI_GPU_GPIOReadFromPin 0F5E10439h

_NvAPI_GPU_GPIOWriteToPin 0F3B11E68h

_NvAPI_GPU_GetHDCPSupportStatus 0F089EEF5h

_NvAPI_SetTopologyFocusDisplayAndView 0A8064F9h

_NvAPI_Stereo_CreateConfigurationProfileRegistryKey 0BE7692ECh

_NvAPI_Stereo_DeleteConfigurationProfileRegistryKey 0F117B834h

_NvAPI_Stereo_SetConfigurationProfileValue 24409F48h

_NvAPI_Stereo_DeleteConfigurationProfileValue 49BCEECFh

_NvAPI_Stereo_Enable 239C4545h

_NvAPI_Stereo_Disable 2EC50C2Bh

_NvAPI_Stereo_IsEnabled 348FF8E1h

_NvAPI_Stereo_GetStereoCaps 0DFC063B7h

_NvAPI_Stereo_GetStereoSupport 296C434Dh

_NvAPI_Stereo_CreateHandleFromIUnknown 0AC7E37F4h

_NvAPI_Stereo_DestroyHandle 3A153134h

_NvAPI_Stereo_Activate 0F6A1AD68h

_NvAPI_Stereo_Deactivate 2D68DE96h

_NvAPI_Stereo_IsActivated 1FB0BC30h

_NvAPI_Stereo_GetSeparation 451F2134h

_NvAPI_Stereo_SetSeparation 5C069FA3h

_NvAPI_Stereo_DecreaseSeparation 0DA044458h

_NvAPI_Stereo_IncreaseSeparation 0C9A8ECECh

_NvAPI_Stereo_GetConvergence 4AB00934h

_NvAPI_Stereo_SetConvergence 3DD6B54Bh

_NvAPI_Stereo_DecreaseConvergence 4C87E317h

_NvAPI_Stereo_IncreaseConvergence 0A17DAABEh

_NvAPI_Stereo_GetFrustumAdjustMode 0E6839B43h

_NvAPI_Stereo_SetFrustumAdjustMode 7BE27FA2h

_NvAPI_Stereo_CaptureJpegImage 932CB140h

_NvAPI_Stereo_CapturePngImage 8B7E99B5h

_NvAPI_Stereo_ReverseStereoBlitControl 3CD58F89h

_NvAPI_Stereo_SetNotificationMessage 6B9B409Eh

_NvAPI_Stereo_SetActiveEye 96EEA9F8h

_NvAPI_Stereo_SetDriverMode 5E8F0BECh

_NvAPI_Stereo_GetEyeSeparation 0CE653127h

_NvAPI_Stereo_IsWindowedModeSupported 40C8ED5Eh

_NvAPI_Stereo_AppHandShake 8C610BDAh

_NvAPI_Stereo_HandShake_Trigger_Activation 0B30CD1A7h

_NvAPI_Stereo_HandShake_Message_Control 315E0EF0h

_NvAPI_Stereo_SetSurfaceCreationMode 0F5DCFCBAh

_NvAPI_Stereo_GetSurfaceCreationMode 36F1C736h

_NvAPI_Stereo_Debug_WasLastDrawStereoized 0ED4416C5h

_NvAPI_Stereo_ForceToScreenDepth 2D495758h

_NvAPI_Stereo_SetVertexShaderConstantF 416C07B3h

_NvAPI_Stereo_SetVertexShaderConstantB 5268716Fh

_NvAPI_Stereo_SetVertexShaderConstantI 7923BA0Eh

_NvAPI_Stereo_GetVertexShaderConstantF 622FDC87h

_NvAPI_Stereo_GetVertexShaderConstantB 712BAA5Bh

_NvAPI_Stereo_GetVertexShaderConstantI 5A60613Ah

_NvAPI_Stereo_SetPixelShaderConstantF 0A9657F32h

_NvAPI_Stereo_SetPixelShaderConstantB 0BA6109EEh

_NvAPI_Stereo_SetPixelShaderConstantI 912AC28Fh

_NvAPI_Stereo_GetPixelShaderConstantF 0D4974572h

_NvAPI_Stereo_GetPixelShaderConstantB 0C79333AEh

_NvAPI_Stereo_GetPixelShaderConstantI 0ECD8F8CFh

_NvAPI_Stereo_SetDefaultProfile 44F0ECD1h

_NvAPI_Stereo_GetDefaultProfile 624E21C2h

_NvAPI_Stereo_Is3DCursorSupported 0D7C9EC09h

_NvAPI_Stereo_GetCursorSeparation 72162B35h

_NvAPI_Stereo_SetCursorSeparation 0FBC08FC1h

_NvAPI_VIO_GetCapabilities 1DC91303h

_NvAPI_VIO_Open 44EE4841h

_NvAPI_VIO_Close 0D01BD237h

_NvAPI_VIO_Status 0E6CE4F1h

_NvAPI_VIO_SyncFormatDetect 118D48A3h

_NvAPI_VIO_GetConfig 0D34A789Bh

_NvAPI_VIO_SetConfig 0E4EEC07h

_NvAPI_VIO_SetCSC 0A1EC8D74h

_NvAPI_VIO_GetCSC 7B0D72A3h

_NvAPI_VIO_SetGamma 964BF452h

_NvAPI_VIO_GetGamma 51D53D06h

_NvAPI_VIO_SetSyncDelay 2697A8D1h

_NvAPI_VIO_GetSyncDelay 462214A9h

_NvAPI_VIO_GetPCIInfo 0B981D935h

_NvAPI_VIO_IsRunning 96BD040Eh

_NvAPI_VIO_Start 0CDE8E1A3h

_NvAPI_VIO_Stop 6BA2A5D6h

_NvAPI_VIO_IsFrameLockModeCompatible 7BF0A94Dh

_NvAPI_VIO_EnumDevices 0FD7C5557h

_NvAPI_VIO_QueryTopology 869534E2h

_NvAPI_VIO_EnumSignalFormats 0EAD72FE4h

_NvAPI_VIO_EnumDataFormats 221FA8E8h

_NvAPI_GPU_GetTachReading 5F608315h

_NvAPI_3D_GetProperty 8061A4B1h

_NvAPI_3D_SetProperty 0C9175E8Dh

_NvAPI_3D_GetPropertyRange 0B85DE27Ch

_NvAPI_GPS_GetPowerSteeringStatus 540EE82Eh

_NvAPI_GPS_SetPowerSteeringStatus 9723D3A2h

_NvAPI_GPS_SetVPStateCap 68888EB4h

_NvAPI_GPS_GetVPStateCap 71913023h

_NvAPI_GPS_GetThermalLimit 583113EDh

_NvAPI_GPS_SetThermalLimit 0C07E210Fh

_NvAPI_GPS_GetPerfSensors 271C1109h

_NvAPI_SYS_GetDisplayIdFromGpuAndOutputId 8F2BAB4h

_NvAPI_SYS_GetGpuAndOutputIdFromDisplayId 112BA1A5h

_NvAPI_DISP_GetDisplayIdByDisplayName 0AE457190h

_NvAPI_DISP_GetGDIPrimaryDisplayId 1E9D8A31h

_NvAPI_DISP_GetDisplayConfig 11ABCCF8h

_NvAPI_DISP_SetDisplayConfig 5D8CF8DEh

_NvAPI_GPU_GetPixelClockRange 66AF10B7h

_NvAPI_GPU_SetPixelClockRange 5AC7F8E5h

_NvAPI_GPU_GetECCStatusInfo 0CA1DDAF3h

_NvAPI_GPU_GetECCErrorInfo 0C71F85A6h

_NvAPI_GPU_ResetECCErrorInfo 0C02EEC20h

_NvAPI_GPU_GetECCConfigurationInfo 77A796F3h

_NvAPI_GPU_SetECCConfiguration 1CF639D9h

_NvAPI_D3D1x_CreateSwapChain 1BC21B66h

_NvAPI_D3D9_CreateSwapChain 1A131E09h

_NvAPI_D3D_SetFPSIndicatorState 0A776E8DBh

_NvAPI_D3D9_Present 5650BEBh

_NvAPI_D3D9_QueryFrameCount 9083E53Ah

_NvAPI_D3D9_ResetFrameCount 0FA6A0675h

_NvAPI_D3D9_QueryMaxSwapGroup 5995410Dh

_NvAPI_D3D9_QuerySwapGroup 0EBA4D232h

_NvAPI_D3D9_JoinSwapGroup 7D44BB54h

_NvAPI_D3D9_BindSwapBarrier 9C39C246h

_NvAPI_D3D1x_Present 3B845A1h

_NvAPI_D3D1x_QueryFrameCount 9152E055h

_NvAPI_D3D1x_ResetFrameCount 0FBBB031Ah

_NvAPI_D3D1x_QueryMaxSwapGroup 9BB9D68Fh

_NvAPI_D3D1x_QuerySwapGroup 407F67AAh

_NvAPI_D3D1x_JoinSwapGroup 14610CD7h

_NvAPI_D3D1x_BindSwapBarrier 9DE8C729h

_NvAPI_SYS_VenturaGetState 0CB7C208Dh

_NvAPI_SYS_VenturaSetState 0CE2E9D9h

_NvAPI_SYS_VenturaGetCoolingBudget 0C9D86E33h

_NvAPI_SYS_VenturaSetCoolingBudget 85FF5A15h

_NvAPI_SYS_VenturaGetPowerReading 63685979h

_NvAPI_DISP_GetDisplayBlankingState 63E5D8DBh

_NvAPI_DISP_SetDisplayBlankingState 1E17E29Bh

_NvAPI_DRS_CreateSession 694D52Eh

_NvAPI_DRS_DestroySession 0DAD9CFF8h

_NvAPI_DRS_LoadSettings 375DBD6Bh

_NvAPI_DRS_SaveSettings 0FCBC7E14h

_NvAPI_DRS_LoadSettingsFromFile 0D3EDE889h

_NvAPI_DRS_SaveSettingsToFile 2BE25DF8h

_NvAPI_DRS_CreateProfile 0CC176068h

_NvAPI_DRS_DeleteProfile 17093206h

_NvAPI_DRS_SetCurrentGlobalProfile 1C89C5DFh

_NvAPI_DRS_GetCurrentGlobalProfile 617BFF9Fh

_NvAPI_DRS_GetProfileInfo 61CD6FD6h

_NvAPI_DRS_SetProfileInfo 16ABD3A9h

_NvAPI_DRS_FindProfileByName 7E4A9A0Bh

_NvAPI_DRS_EnumProfiles 0BC371EE0h

_NvAPI_DRS_GetNumProfiles 1DAE4FBCh

_NvAPI_DRS_CreateApplication 4347A9DEh

_NvAPI_DRS_DeleteApplicationEx 0C5EA85A1h

_NvAPI_DRS_DeleteApplication 2C694BC6h

_NvAPI_DRS_GetApplicationInfo 0ED1F8C69h

_NvAPI_DRS_EnumApplications 7FA2173Ah

_NvAPI_DRS_FindApplicationByName 0EEE566B2h

_NvAPI_DRS_SetSetting 577DD202h

_NvAPI_DRS_GetSetting 73BF8338h

_NvAPI_DRS_EnumSettings 0AE3039DAh

_NvAPI_DRS_EnumAvailableSettingIds 0F020614Ah

_NvAPI_DRS_EnumAvailableSettingValues 2EC39F90h

_NvAPI_DRS_GetSettingIdFromName 0CB7309CDh

_NvAPI_DRS_GetSettingNameFromId 0D61CBE6Eh

_NvAPI_DRS_DeleteProfileSetting 0E4A26362h

_NvAPI_DRS_RestoreAllDefaults 5927B094h

_NvAPI_DRS_RestoreProfileDefault 0FA5F6134h

_NvAPI_DRS_RestoreProfileDefaultSetting 53F0381Eh

_NvAPI_DRS_GetBaseProfile 0DA8466A0h

_NvAPI_Event_RegisterCallback 0E6DBEA69h

_NvAPI_Event_UnregisterCallback 0DE1F9B45h

_NvAPI_GPU_GetCurrentThermalLevel 0D2488B79h

_NvAPI_GPU_GetCurrentFanSpeedLevel 0BD71F0C9h

_NvAPI_GPU_SetScanoutIntensity 0A57457A4h

_NvAPI_GPU_SetScanoutWarping 0B34BAB4Fh

_NvAPI_GPU_GetScanoutConfiguration 6A9F5B63h

_NvAPI_DISP_SetHCloneTopology 61041C24h

_NvAPI_DISP_GetHCloneTopology 47BAD137h

_NvAPI_DISP_ValidateHCloneTopology 5F4C2664h

_NvAPI_GPU_GetPerfDecreaseInfo 7F7F4600h

_NvAPI_GPU_QueryIlluminationSupport 0A629DA31h

_NvAPI_GPU_GetIllumination 9A1B9365h

_NvAPI_GPU_SetIllumination 254A187h

_NvAPI_D3D1x_IFR_SetUpTargetBufferToSys 473F7828h

_NvAPI_D3D1x_IFR_TransferRenderTarget 9FBAE4EBh

“_NvAPI_GPU_SetPstates20 0F4DAE6Bh”: it checks out, we’re definitely heading the correct way!

In IDA Pro we can also check the xrefs from NvQueryInterface, and we land at the start of the data section on a big array of INTs grouped in pairs, each pair comprising the address of a function and the associated Nvidia function ID:

Once again, the IDs and addresses match the information we already have.

It means we are now sure of the location of “GetPstates20” and “SetPstates20”, and we can break directly inside them at will. Let’s do that in IDA after importing the nvapi.h headers so IDA knows about the structs in use: grab the pointer for the second argument on the stack just when entering “GetPstates20”, dereference it and apply the type of an “NV_GPU_PERF_PSTATES20_INFO_V1” struct to it.
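For readers who prefer code to disassembly, the table of pairs can be pictured roughly like this. It is only a sketch: the two IDs are the ones found above, but `fake_get_pstates20`, `fake_set_pstates20`, `api_table` and `lookup_by_id` are hypothetical names, and I am assuming a plain linear scan since I have not checked how the driver actually walks its table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*api_fn)(void);

/* One table entry, in the order seen in the data section:
 * address of the function first, then its ID. */
struct api_entry {
    api_fn   address;
    uint32_t id;
};

/* Dummy implementations standing in for the real driver code. */
static int fake_get_pstates20(void) { return 1; }
static int fake_set_pstates20(void) { return 2; }

/* The IDs are real, the functions behind them are not. */
static const struct api_entry api_table[] = {
    { fake_get_pstates20, 0x6FF81213u },
    { fake_set_pstates20, 0x0F4DAE6Bu },
};

/* Return the function registered for an ID, or NULL if there is none. */
api_fn lookup_by_id(uint32_t id)
{
    size_t n = sizeof api_table / sizeof api_table[0];

    for (size_t i = 0; i < n; i++) {
        assert(api_table[i].address != NULL); /* no empty slots */
        if (api_table[i].id == id)
            return api_table[i].address;
    }
    return NULL;
}
```

An unknown ID simply yields NULL, so a caller has to check the pointer before using it.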

Now all those apparently garbage values are starting to make sense.

We can confirm everything is as expected by comparing one of the values to an authoritative measurement. Here 0xD8ACC stands for the GPU vCore expressed in µV. It is 887500 in base 10, meaning 887.5 mV or 0.8875 V. The GPU-Z tool reports a similar value.
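The conversion is easy to double-check in a few lines of C. This is just a convenience sketch: `format_microvolts` is a hypothetical helper of mine, not something from NvAPI, and it sticks to integer arithmetic so no rounding surprises sneak in.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write a microvolt reading as volts with four decimals into buf.
 * Anything below 100 uV is truncated. Returns snprintf's result. */
int format_microvolts(uint32_t uv, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%u.%04u V",
                    (unsigned)(uv / 1000000u),           /* whole volts */
                    (unsigned)((uv % 1000000u) / 100u)); /* 1e-4 V steps */
}
```

Feeding it the raw 0xD8ACC from the struct gives the same 0.8875 V figure as above.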

It seems we’re doing fine:

For good measure, let’s go back a bit and put a conditional logging breakpoint in Olly at the beginning of the “QueryInterface” function in order to log ALL the function IDs successively requested by MSI Afterburner, just in case things don’t go smoothly and we run into a difficult pipeline setup before being able to overclock the GPU.

_NvAPI_Initialize = 0150E828

718006A0 COND: offset = 33C7358C

718006A0 COND: offset = 593E8644

_NvAPI_SYS_GetDriverAndBranchVersion = 2926AAAD

_NvAPI_EnumPhysicalGPUs = E5AC921F

718006A0 COND: offset = 6533EA3E

_NvAPI_GPU_GetBusId = 1BE0B8E5

_NvAPI_GPU_GetBusSlotId = 2A0A350F

_NvAPI_GPU_GetPCIIdentifiers = 2DDFB66E

_NvAPI_DRS_EnumAvailableSettingIds = F020614A

_NvAPI_DRS_CreateSession = 0694D52E

_NvAPI_DRS_LoadSettings = 375DBD6B

_NvAPI_DRS_GetBaseProfile = DA8466A0

_NvAPI_DRS_GetSetting = 73BF8338

_NvAPI_DRS_DestroySession = DAD9CFF8

_NvAPI_GPU_ClientPowerTopologyGetStatus = EDCF624E

_NvAPI_GPU_GetThermalSettings = E3640A56

_NvAPI_GPU_GetDynamicPstatesInfoEx = 60DED2ED

_NvAPI_GPU_GetCoolerSettings = DA141340

_NvAPI_GPU_GetAllClockFrequencies = DCB616C3

_NvAPI_GPU_GetPstates20 = 6FF81213

718006A0 COND: offset = 07F9B368

718006A0 COND: offset = 409D9841

_NvAPI_GPU_ClientPowerPoliciesGetInfo = 34206D86

718006A0 COND: offset = 0D258BB5

_NvAPI_GPU_GetSystemType = BAAABFCC

_NvAPI_GPU_GetFullName = CEEE8E9F

718006A0 COND: offset = 3D358A0C

718006A0 COND: offset = D988F0F3

_NvAPI_GPU_GetVbiosVersionString = A561FD7D

_NvAPI_GPU_SetPstates20 = 0F4DAE6B

_NvAPI_Unload = D22BDD7E

Despite the stackoverflow post being somewhat outdated or incomplete, we can still name the majority of the functions called.

Reading through that quickly shows the obvious things one would expect: init NvAPI, get various pieces of information, finally call SetPstates20 (when we hit a breakpoint attempting some small underclock) and clean up the API.

Considering “GetVbiosVersionString” is called just before the overclocking-related function and is purely a GUI/dashboard info feature, we can safely assume that no particular setup is required: we just need the correct arguments to call said function.
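Putting names on a logged sequence of IDs is a simple table lookup. The sketch below uses five entries copied from the ID list; `known_ids`, `name_for_id` and `annotate_ids` are hypothetical helpers, and the "unknown" label is my own choice rather than what Olly prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct known_id {
    uint32_t    id;
    const char *name;
};

/* A few entries from the ID list; the full list works the same way. */
static const struct known_id known_ids[] = {
    { 0x0150E828u, "_NvAPI_Initialize" },
    { 0xE5AC921Fu, "_NvAPI_EnumPhysicalGPUs" },
    { 0x6FF81213u, "_NvAPI_GPU_GetPstates20" },
    { 0x0F4DAE6Bu, "_NvAPI_GPU_SetPstates20" },
    { 0xD22BDD7Eu, "_NvAPI_Unload" },
};

/* Return the name for an ID, or NULL when the list does not cover it. */
const char *name_for_id(uint32_t id)
{
    for (size_t i = 0; i < sizeof known_ids / sizeof known_ids[0]; i++)
        if (known_ids[i].id == id)
            return known_ids[i].name;
    return NULL;
}

/* Print one line per requested ID and return how many stayed unnamed. */
size_t annotate_ids(const uint32_t *ids, size_t count, FILE *out)
{
    size_t unnamed = 0;

    assert(ids != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        const char *name = name_for_id(ids[i]);
        if (name != NULL) {
            fprintf(out, "%s = %08X\n", name, (unsigned)ids[i]);
        } else {
            fprintf(out, "unknown = %08X\n", (unsigned)ids[i]);
            unnamed++;
        }
    }
    return unnamed;
}
```

The IDs left unnamed are the interesting ones: they are requested by the tool yet missing from the leaked list, so they are either newer or private functions.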

Reversing the function’s arguments

This one was supposed to be hell but it actually went better than previously thought:

– Most NvAPI functions including GetPstates20 take a physical GPU handle as their first argument.

– GetPstates20’s second argument is a struct for storing Pstates and it is documented in the public NvAPI headers.

– A quick look at the code calling “SetPstates20” in MSI Afterburner shows 2 arguments pushed before the call. The first one is “0x100” for both the GET and SET functions; it is the handle for our GPU#0.

It is then highly likely that “SetPstates20” will take the same kind of struct as “GetPstates20” for its second argument, with a few edited values.

Let’s get coding

First, let’s isolate the data structures we need from the NvAPI headers because there are far too many lines in that file and I’m lazy to the point that scrolling hurts my finger. Also those are mostly ints so we’ll get rid of all the fancy names and make them regular uint/int for readability.

    typedef unsigned long NvU32; /* 32 bits wide on Windows, the only target here */

    typedef struct {
        NvU32 version;
        NvU32 ClockType:2;
        NvU32 reserved:22;
        NvU32 reserved1:8;
        struct {
            NvU32 bIsPresent:1;
            NvU32 reserved:31;
            NvU32 frequency;
        } domain[32];
    } NV_GPU_CLOCK_FREQUENCIES_V2;

    typedef struct {
        int value;
        struct {
            int mindelta;
            int maxdelta;
        } valueRange;
    } NV_GPU_PERF_PSTATES20_PARAM_DELTA;

    typedef struct {
        NvU32 domainId;
        NvU32 typeId;
        NvU32 bIsEditable:1;
        NvU32 reserved:31;
        NV_GPU_PERF_PSTATES20_PARAM_DELTA freqDelta_kHz;
        union {
            struct {
                NvU32 freq_kHz;
            } single;
            struct {
                NvU32 minFreq_kHz;
                NvU32 maxFreq_kHz;
                NvU32 domainId;
                NvU32 minVoltage_uV;
                NvU32 maxVoltage_uV;
            } range;
        } data;
    } NV_GPU_PSTATE20_CLOCK_ENTRY_V1;

    typedef struct {
        NvU32 domainId;
        NvU32 bIsEditable:1;
        NvU32 reserved:31;
        NvU32 volt_uV;
        /* a full delta struct: with a bare int here the root struct would be
           0x1a94 bytes instead of the 0x1c94 the driver expects */
        NV_GPU_PERF_PSTATES20_PARAM_DELTA voltDelta_uV;
    } NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1;

    typedef struct {
        NvU32 version;
        NvU32 bIsEditable:1;
        NvU32 reserved:31;
        NvU32 numPstates;
        NvU32 numClocks;
        NvU32 numBaseVoltages;
        struct {
            NvU32 pstateId;
            NvU32 bIsEditable:1;
            NvU32 reserved:31;
            NV_GPU_PSTATE20_CLOCK_ENTRY_V1 clocks[8];
            NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1 baseVoltages[4];
        } pstates[16];
    } NV_GPU_PERF_PSTATES20_INFO_V1;

Then we need prototypes for the functions we’ll use. Remember we won’t call the provided exports from the NvAPI lib inside the SDK but rather retrieve the function pointers directly from the running nvapi.dll and execute them as such.

A handful of convenient function prototypes to get some info, retrieve the clocks and set them up. Some of them can be found in the public API, the others are probably from the NDA version. We use the same techniques as mentioned earlier to learn about them:

typedef void *(*NvAPI_QueryInterface_t)(unsigned int offset);
typedef int (*NvAPI_Initialize_t)();
typedef int (*NvAPI_Unload_t)();
typedef int (*NvAPI_EnumPhysicalGPUs_t)(int **handles, int *count);
typedef int (*NvAPI_GPU_GetSystemType_t)(int *handle, int *systype);
typedef int (*NvAPI_GPU_GetFullName_t)(int *handle, char *sysname);
typedef int (*NvAPI_GPU_GetPhysicalFrameBufferSize_t)(int *handle, int *memsize);
typedef int (*NvAPI_GPU_GetRamType_t)(int *handle, int *memtype);
typedef int (*NvAPI_GPU_GetVbiosVersionString_t)(int *handle, char *biosname);
typedef int (*NvAPI_GPU_GetAllClockFrequencies_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_GetPstates20_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_SetPstates20_t)(int *handle, int *pstates_info);

NvAPI_QueryInterface_t NvQueryInterface = 0;
NvAPI_Initialize_t NvInit = 0;
NvAPI_Unload_t NvUnload = 0;
NvAPI_EnumPhysicalGPUs_t NvEnumGPUs = 0;
NvAPI_GPU_GetSystemType_t NvGetSysType = 0;
NvAPI_GPU_GetFullName_t NvGetName = 0;
NvAPI_GPU_GetPhysicalFrameBufferSize_t NvGetMemSize = 0;
NvAPI_GPU_GetRamType_t NvGetMemType = 0;
NvAPI_GPU_GetVbiosVersionString_t NvGetBiosName = 0;
NvAPI_GPU_GetAllClockFrequencies_t NvGetFreq = 0;
NvAPI_GPU_GetPstates20_t NvGetPstates = 0;
NvAPI_GPU_SetPstates20_t NvSetPstates = 0;

Time for the main() function, the code should be fairly short and this is just a PoC or whatever, brace yourself for screaming KNF nazis.

We’ll need these variables. Most of them are of lesser importance; only the last line is a hard requirement.

“NV_GPU_PERF_PSTATES20_INFO_V1” is the root struct holding all the clocking and power data for the selected gpu handle. The size of this struct is 0x1c94, for some reason Nvidia decided to use that as the “version” field after adding 0x10000 to it so we set that field to 0x11c94 or the subsequent calls using the structure will return a cryptic error code.

int nGPU=0, userfreq = 0, systype=0, memsize=0, memtype=0;
int *hdlGPU[64]={0}, *buf=0;
char sysname[64]={0}, biosname[64]={0};
NV_GPU_PERF_PSTATES20_INFO_V1 pstates_info;
pstates_info.version = 0x11c94;
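The 0x11c94 constant does not have to be typed in by hand. Here is a minimal sketch, assuming the version word really is just the struct size OR-ed with a version number shifted left by 16 bits, which is what the value above suggests. `pstates20_info_t`, the other `_t` names and `make_version` are made up for this example, and `uint32_t`/`int32_t` stand in for the Windows `unsigned long`/`int` so the sizes hold on any platform. Note that the voltage delta is declared as a full delta struct here: that is what brings the total to 0x1c94 bytes, whereas a plain int in that slot gives 0x1a94.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int32_t value;
    struct { int32_t mindelta, maxdelta; } valueRange;
} param_delta_t;                       /* 12 bytes */

typedef struct {
    uint32_t domainId;
    uint32_t typeId;
    uint32_t bIsEditable:1;
    uint32_t reserved:31;
    param_delta_t freqDelta_kHz;
    union {
        struct { uint32_t freq_kHz; } single;
        struct { uint32_t minFreq_kHz, maxFreq_kHz, domainId,
                          minVoltage_uV, maxVoltage_uV; } range;
    } data;
} clock_entry_t;                       /* 44 bytes */

typedef struct {
    uint32_t domainId;
    uint32_t bIsEditable:1;
    uint32_t reserved:31;
    uint32_t volt_uV;
    param_delta_t voltDelta_uV;
} base_voltage_entry_t;                /* 24 bytes */

typedef struct {
    uint32_t version;
    uint32_t bIsEditable:1;
    uint32_t reserved:31;
    uint32_t numPstates;
    uint32_t numClocks;
    uint32_t numBaseVoltages;
    struct {
        uint32_t pstateId;
        uint32_t bIsEditable:1;
        uint32_t reserved:31;
        clock_entry_t clocks[8];
        base_voltage_entry_t baseVoltages[4];
    } pstates[16];
} pstates20_info_t;                    /* 20 + 16 * 456 = 7316 = 0x1c94 bytes */

/* Assumed encoding: low 16 bits = struct size, upper bits = version number. */
uint32_t make_version(size_t struct_size, uint32_t ver)
{
    return (uint32_t)struct_size | (ver << 16);
}
```

With that in place, `make_version(sizeof(pstates20_info_t), 1)` yields the same magic number as the hard-coded one, and a wrong struct layout shows up as a different value instead of a cryptic error from the driver.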

Now we actually load “nvapi.dll” into our program’s memory space and retrieve the “nvapi_QueryInterface” export that will provide us with the pointers to all the other functions. We then call it successively with all the IDs we need and assign the results to our function pointers.

NvQueryInterface = (void*)GetProcAddress(LoadLibrary("nvapi.dll"), "nvapi_QueryInterface");

NvInit = NvQueryInterface(0x0150E828);
NvUnload = NvQueryInterface(0xD22BDD7E);
NvEnumGPUs = NvQueryInterface(0xE5AC921F);
NvGetSysType = NvQueryInterface(0xBAAABFCC);
NvGetName = NvQueryInterface(0xCEEE8E9F);
NvGetMemSize = NvQueryInterface(0x46FBEB03);
NvGetMemType = NvQueryInterface(0x57F7CAAC);
NvGetBiosName = NvQueryInterface(0xA561FD7D);
NvGetFreq = NvQueryInterface(0xDCB616C3);
NvGetPstates = NvQueryInterface(0x6FF81213);
NvSetPstates = NvQueryInterface(0x0F4DAE6B);
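The PoC trusts every lookup to succeed; if the DLL is missing or an ID is unknown to the installed driver, the first call through a null pointer crashes. Here is a small sketch of a table-driven variant that reports what could not be resolved. `query_fn`, `nv_entry`, `resolve_all` and `fake_query` are hypothetical names for this example; `fake_query` is only a stand-in for the driver so the sketch runs without nvapi.dll.

```c
#include <stddef.h>
#include <stdio.h>

/* Same shape as nvapi_QueryInterface: takes an ID, returns a pointer or NULL. */
typedef void *(*query_fn)(unsigned int id);

typedef struct {
    unsigned int id;     /* interface ID handed to the query function */
    const char  *name;   /* label used only in the error message */
    void        *ptr;    /* resolved address, NULL if the lookup failed */
} nv_entry;

/* Resolves every entry and returns how many lookups failed. */
size_t resolve_all(query_fn query, nv_entry *entries, size_t count)
{
    size_t missing = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        /* A NULL query function (DLL or export not found) fails every entry. */
        entries[i].ptr = query ? query(entries[i].id) : NULL;
        if (entries[i].ptr == NULL) {
            fprintf(stderr, "could not resolve %s (0x%08X)\n",
                    entries[i].name, entries[i].id);
            missing++;
        }
    }
    return missing;
}

/* Stand-in for the driver: it only "knows" the two IDs the post uses
   for NvAPI_Initialize and NvAPI_Unload. */
void *fake_query(unsigned int id)
{
    static int dummy;
    return (id == 0x0150E828u || id == 0xD22BDD7Eu) ? (void *)&dummy : NULL;
}
```

On Windows, `query` would be the pointer returned by GetProcAddress, and each `ptr` would be cast to its matching typedef before being called; bailing out when `resolve_all` returns non-zero avoids the crash.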

We have all the required bits for our big plot so let’s just assemble the bricks together to get the information and data we need and display them in an ugly fashion.

NvInit();
NvEnumGPUs(hdlGPU, &nGPU);
NvGetSysType(hdlGPU[0], &systype);
NvGetName(hdlGPU[0], sysname);
NvGetMemSize(hdlGPU[0], &memsize);
NvGetMemType(hdlGPU[0], &memtype);
NvGetBiosName(hdlGPU[0], biosname);
NvGetPstates(hdlGPU[0], &pstates_info);

switch(systype){
    case 1: printf("\nType: Laptop\n"); break;
    case 2: printf("\nType: Desktop\n"); break;
    default: printf("\nType: Unknown\n"); break;
}

printf("Name: %s\n", sysname);
printf("VRAM: %dMB GDDR%d\n", memsize/1024, memtype<=7?3:5);
printf("BIOS: %s\n", biosname);
printf("\nGPU: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).data.range.maxFreq_kHz)/1000);
printf("RAM: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).data.single.freq_kHz)/1000);
printf("\nCurrent GPU OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).freqDelta_kHz.value)/1000);
printf("Current RAM OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).freqDelta_kHz.value)/1000);

It should already be enough for a simple monitoring program like the well known GPUz but we can do better than that and get to the delicious megahertz… /omnomnomz.
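The display code above takes it for granted that clocks[0] is a "range" entry and clocks[1] a "single" one. Here is a hedged sketch of a helper that looks at typeId instead, assuming 0 means single and 1 means range. Those two values are my assumption and are not stated anywhere in this post, so check them against the header before relying on them. `clock_entry_t` and `clock_max_mhz` are illustrative names, and `uint32_t` stands in for NvU32.

```c
#include <stdint.h>

/* Assumed typeId values, not confirmed by the post. */
enum { CLOCK_TYPE_SINGLE = 0, CLOCK_TYPE_RANGE = 1 };

typedef struct {
    uint32_t domainId;
    uint32_t typeId;
    uint32_t bIsEditable:1;
    uint32_t reserved:31;
    struct {
        int32_t value;
        struct { int32_t mindelta, maxdelta; } valueRange;
    } freqDelta_kHz;
    union {
        struct { uint32_t freq_kHz; } single;
        struct { uint32_t minFreq_kHz, maxFreq_kHz, domainId,
                          minVoltage_uV, maxVoltage_uV; } range;
    } data;
} clock_entry_t;

/* Top frequency of a clock entry in MHz, whichever union member is in use. */
uint32_t clock_max_mhz(const clock_entry_t *c)
{
    uint32_t khz = (c->typeId == CLOCK_TYPE_RANGE) ? c->data.range.maxFreq_kHz
                                                   : c->data.single.freq_kHz;
    return khz / 1000;
}
```

Called on each of the numClocks entries of a Pstate, this would print sensible numbers even on a card whose clock domains come back in a different order.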

Here’s for the GPU overclocking:

if(argc > 1){
    userfreq = atoi(argv[1])*1000;
    if(-250000 <= userfreq && userfreq <= 250000) {
        buf = malloc(0x1c94);
        memset(buf, 0, 0x1c94);
        buf[0] = 0x11c94;
        buf[2] = 1;
        buf[3] = 1;
        buf[10] = userfreq;
        NvSetPstates(hdlGPU[0], buf)
            ? printf("\nGPU OC failed!\n")
            : printf("\nGPU OC OK: %d MHz\n", userfreq/1000);
        free(buf);
    } else {
        printf("\nGPU Frequency not in safe range (-250MHz to +250MHz).\n");
        return 1;
    }
}

And almost the same block of code for the VRAM overclocking:

if(argc > 2){
    userfreq = atoi(argv[2])*1000;
    if(-250000 <= userfreq && userfreq <= 250000) {
        buf = malloc(0x1c94);
        memset(buf, 0, 0x1c94);
        buf[0] = 0x11c94;
        buf[2] = 1;
        buf[3] = 1;
        buf[7] = 4;
        buf[10] = memtype<=7?userfreq:userfreq*2;
        NvSetPstates(hdlGPU[0], buf)
            ? printf("VRAM OC failed!\n")
            : printf("RAM OC OK: %d MHz\n", userfreq/1000);
        free(buf);
    } else {
        printf("\nRAM Frequency not in safe range (-250MHz to +250MHz).\n");
        return 1;
    }
}

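What it does is simple. Frequencies in the struct are expressed in kilohertz, so we multiply the offset provided by the user by 1000 and check the result against a ±250MHz window, trying not to let an integer overflow slip through and fry the core. ;-) Strictly speaking the check runs after the multiplication, so an absurdly large argument can still overflow before it gets rejected.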
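The snippets above convert with atoi(), which cannot report errors, and range-check only after multiplying. Here is a minimal sketch of a stricter parser that validates the value in MHz first and multiplies afterwards. `parse_offset_khz` and `MAX_OFFSET_MHZ` are hypothetical names, not part of the original program.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_OFFSET_MHZ 250   /* same +/-250MHz window as the post */

/* Parses a signed MHz offset and stores it in kHz.
   Returns 0 on success, -1 if the text is not a number or is out of range. */
int parse_offset_khz(const char *text, int *out_khz)
{
    char *end = NULL;
    long mhz;

    if (text == NULL || out_khz == NULL)
        return -1;

    errno = 0;
    mhz = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;                   /* empty, trailing junk, or too big for a long */
    if (mhz < -MAX_OFFSET_MHZ || mhz > MAX_OFFSET_MHZ)
        return -1;                   /* checked in MHz, before multiplying */

    *out_khz = (int)mhz * 1000;      /* at most 250000 in magnitude, cannot overflow */
    return 0;
}
```

In the PoC this would replace the `atoi(argv[1])*1000` line, with the "not in safe range" message printed whenever the function returns -1.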

No, seriously: we “allocate” an “NV_GPU_PERF_PSTATES20_INFO_V1” yet again, but filling in the whole bloated thing is a pain and we only want a few values changed, so we use a zeroed buffer of the same size as the struct.

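We then fill the first int with the magic 0x11c94 version number, and the ints at indices 2 and 3 with 1. Those two slots line up with numPstates and numClocks in the struct above, so we are telling SetPstates20() that we provide only one Pstate profile containing only one clock domain. The offset itself goes at index 10, which lines up with freqDelta_kHz.value of the first clock entry.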

But if we do that… it works for the GPU but not for the VRAM, how do we overclock the damn VRAM?

At this point I was saying to myself “why can those guys get an overclock in their crappy soft and I cannot, that’s just unfair”. But this approach never gives any noteworthy result so I got back in IDA and diffed my struct with the struct that MSI Afterburner provides to the same call when overclocking the RAM.

And the trick was there before my eyes: the int at index 7 of the struct had changed from 0 to 4. Counting through the struct above, that slot is the domainId of the first clock entry, so it selects which clock the offset applies to: 0 is the GPU, 2 may or may not be the separate shader clock domain of previous GPU generations, and 4 would then be the VRAM.
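The magic indices can be avoided altogether by filling in the struct fields by name. Here is a sketch under the same assumptions as the rest of the post (domain 0 for the GPU, 4 for the VRAM, version word 0x11c94). `pstates20_info_t`, `build_offset_request` and the `DOMAIN_*` constants are illustrative names, `uint32_t`/`int32_t` stand in for the Windows types, and the voltage delta is declared as a full delta struct so that the size comes out at 0x1c94.

```c
#include <stdint.h>
#include <string.h>

enum { DOMAIN_GPU = 0, DOMAIN_VRAM = 4 };   /* values observed in the post */

typedef struct {
    int32_t value;
    struct { int32_t mindelta, maxdelta; } valueRange;
} param_delta_t;

typedef struct {
    uint32_t domainId;
    uint32_t typeId;
    uint32_t bIsEditable:1;
    uint32_t reserved:31;
    param_delta_t freqDelta_kHz;
    union {
        struct { uint32_t freq_kHz; } single;
        struct { uint32_t minFreq_kHz, maxFreq_kHz, domainId,
                          minVoltage_uV, maxVoltage_uV; } range;
    } data;
} clock_entry_t;

typedef struct {
    uint32_t domainId;
    uint32_t bIsEditable:1;
    uint32_t reserved:31;
    uint32_t volt_uV;
    param_delta_t voltDelta_uV;
} base_voltage_entry_t;

typedef struct {
    uint32_t version;
    uint32_t bIsEditable:1;
    uint32_t reserved:31;
    uint32_t numPstates;
    uint32_t numClocks;
    uint32_t numBaseVoltages;
    struct {
        uint32_t pstateId;
        uint32_t bIsEditable:1;
        uint32_t reserved:31;
        clock_entry_t clocks[8];
        base_voltage_entry_t baseVoltages[4];
    } pstates[16];
} pstates20_info_t;

/* Fills req so that it matches the raw buffer built in the post. */
void build_offset_request(pstates20_info_t *req, uint32_t domain, int32_t delta_khz)
{
    memset(req, 0, sizeof *req);
    req->version = (uint32_t)sizeof *req | (1u << 16);      /* buf[0] = 0x11c94 */
    req->numPstates = 1;                                    /* buf[2] */
    req->numClocks = 1;                                     /* buf[3] */
    req->pstates[0].clocks[0].domainId = domain;            /* buf[7] */
    req->pstates[0].clocks[0].freqDelta_kHz.value = delta_khz;  /* buf[10] */
}
```

A call such as `build_offset_request(&req, DOMAIN_VRAM, 100000)` produces the same bytes as the hand-filled buffer for a +100MHz VRAM offset, and `&req` can then be handed to the SetPstates20 function pointer with a cast.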

Finally, we unload NvAPI and end our program:

NvUnload(); return 0;

What’s left to be done?

Testing our new toy obviously! We compile that thing and run it:

C:\>overclock.exe [+/- GPU MHz offset] [+/- RAM MHz offset]

Here we run two benchmarks using a basic MD5 bruteforcing so that we can be sure the modified clock speed is effective and we didn’t just change some funny numbers for display only.

First at stock frequency (950MHz) and then with a 100MHz underclock (the dev machine is a laptop and I don’t intend to make it faster, so subtracting 100 will suffice).

And… it works! We’re done here guys.

I am not aware of any other open source implementation of such a tool. It might only be a very simple C program in the end, but it exists, and the minimum details required for overclocking an Nvidia GPU programmatically are now public and in plain text.

This code, for what it’s worth, is free as in free beer: take it, polish it, make a LIGHT (not the usual stellar poop, we already have those) GUI for your own needs and enjoy.

Here’s a binary build of the program, rename it as .exe as wordpress.com will only let users upload media files:

Full code for your convenience, it should compile without warnings on pretty much everything and has no dependencies:

/*
            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
                    Version 2, December 2004

 Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>

 Everyone is permitted to copy and distribute verbatim or modified
 copies of this license document, and changing it is allowed as long
 as the name is changed.

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. You just DO WHAT THE FUCK YOU WANT TO.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* memset */
#include <windows.h>

typedef unsigned long NvU32;

typedef struct {
    NvU32 version;
    NvU32 ClockType:2;
    NvU32 reserved:22;
    NvU32 reserved1:8;
    struct {
        NvU32 bIsPresent:1;
        NvU32 reserved:31;
        NvU32 frequency;
    } domain[32];
} NV_GPU_CLOCK_FREQUENCIES_V2;

typedef struct {
    int value;
    struct {
        int mindelta;
        int maxdelta;
    } valueRange;
} NV_GPU_PERF_PSTATES20_PARAM_DELTA;

typedef struct {
    NvU32 domainId;
    NvU32 typeId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NV_GPU_PERF_PSTATES20_PARAM_DELTA freqDelta_kHz;
    union {
        struct {
            NvU32 freq_kHz;
        } single;
        struct {
            NvU32 minFreq_kHz;
            NvU32 maxFreq_kHz;
            NvU32 domainId;
            NvU32 minVoltage_uV;
            NvU32 maxVoltage_uV;
        } range;
    } data;
} NV_GPU_PSTATE20_CLOCK_ENTRY_V1;

typedef struct {
    NvU32 domainId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 volt_uV;
    /* a full delta struct, not a bare int: with an int the whole
       structure is only 0x1a94 bytes instead of the 0x1c94 the driver expects */
    NV_GPU_PERF_PSTATES20_PARAM_DELTA voltDelta_uV;
} NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1;

typedef struct {
    NvU32 version;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 numPstates;
    NvU32 numClocks;
    NvU32 numBaseVoltages;
    struct {
        NvU32 pstateId;
        NvU32 bIsEditable:1;
        NvU32 reserved:31;
        NV_GPU_PSTATE20_CLOCK_ENTRY_V1 clocks[8];
        NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1 baseVoltages[4];
    } pstates[16];
} NV_GPU_PERF_PSTATES20_INFO_V1;

typedef void *(*NvAPI_QueryInterface_t)(unsigned int offset);
typedef int (*NvAPI_Initialize_t)();
typedef int (*NvAPI_Unload_t)();
typedef int (*NvAPI_EnumPhysicalGPUs_t)(int **handles, int *count);
typedef int (*NvAPI_GPU_GetSystemType_t)(int *handle, int *systype);
typedef int (*NvAPI_GPU_GetFullName_t)(int *handle, char *sysname);
typedef int (*NvAPI_GPU_GetPhysicalFrameBufferSize_t)(int *handle, int *memsize);
typedef int (*NvAPI_GPU_GetRamType_t)(int *handle, int *memtype);
typedef int (*NvAPI_GPU_GetVbiosVersionString_t)(int *handle, char *biosname);
typedef int (*NvAPI_GPU_GetAllClockFrequencies_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_GetPstates20_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_SetPstates20_t)(int *handle, int *pstates_info);

NvAPI_QueryInterface_t NvQueryInterface = 0;
NvAPI_Initialize_t NvInit = 0;
NvAPI_Unload_t NvUnload = 0;
NvAPI_EnumPhysicalGPUs_t NvEnumGPUs = 0;
NvAPI_GPU_GetSystemType_t NvGetSysType = 0;
NvAPI_GPU_GetFullName_t NvGetName = 0;
NvAPI_GPU_GetPhysicalFrameBufferSize_t NvGetMemSize = 0;
NvAPI_GPU_GetRamType_t NvGetMemType = 0;
NvAPI_GPU_GetVbiosVersionString_t NvGetBiosName = 0;
NvAPI_GPU_GetAllClockFrequencies_t NvGetFreq = 0;
NvAPI_GPU_GetPstates20_t NvGetPstates = 0;
NvAPI_GPU_SetPstates20_t NvSetPstates = 0;

int main(int argc, char **argv)
{
    int nGPU=0, userfreq = 0, systype=0, memsize=0, memtype=0;
    int *hdlGPU[64]={0}, *buf=0;
    char sysname[64]={0}, biosname[64]={0};
    NV_GPU_PERF_PSTATES20_INFO_V1 pstates_info;
    pstates_info.version = 0x11c94;

    /* Resolve every NvAPI entry point through nvapi_QueryInterface. */
    NvQueryInterface = (void*)GetProcAddress(LoadLibrary("nvapi.dll"), "nvapi_QueryInterface");
    NvInit = NvQueryInterface(0x0150E828);
    NvUnload = NvQueryInterface(0xD22BDD7E);
    NvEnumGPUs = NvQueryInterface(0xE5AC921F);
    NvGetSysType = NvQueryInterface(0xBAAABFCC);
    NvGetName = NvQueryInterface(0xCEEE8E9F);
    NvGetMemSize = NvQueryInterface(0x46FBEB03);
    NvGetMemType = NvQueryInterface(0x57F7CAAC);
    NvGetBiosName = NvQueryInterface(0xA561FD7D);
    NvGetFreq = NvQueryInterface(0xDCB616C3);
    NvGetPstates = NvQueryInterface(0x6FF81213);
    NvSetPstates = NvQueryInterface(0x0F4DAE6B);

    /* Query the first GPU and print what we found. */
    NvInit();
    NvEnumGPUs(hdlGPU, &nGPU);
    NvGetSysType(hdlGPU[0], &systype);
    NvGetName(hdlGPU[0], sysname);
    NvGetMemSize(hdlGPU[0], &memsize);
    NvGetMemType(hdlGPU[0], &memtype);
    NvGetBiosName(hdlGPU[0], biosname);
    NvGetPstates(hdlGPU[0], &pstates_info);

    switch(systype){
        case 1: printf("\nType: Laptop\n"); break;
        case 2: printf("\nType: Desktop\n"); break;
        default: printf("\nType: Unknown\n"); break;
    }

    printf("Name: %s\n", sysname);
    printf("VRAM: %dMB GDDR%d\n", memsize/1024, memtype<=7?3:5);
    printf("BIOS: %s\n", biosname);
    printf("\nGPU: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).data.range.maxFreq_kHz)/1000);
    printf("RAM: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).data.single.freq_kHz)/1000);
    printf("\nCurrent GPU OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).freqDelta_kHz.value)/1000);
    printf("Current RAM OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).freqDelta_kHz.value)/1000);

    /* First argument: GPU core offset in MHz. */
    if(argc > 1){
        userfreq = atoi(argv[1])*1000;
        if(-250000 <= userfreq && userfreq <= 250000) {
            buf = malloc(0x1c94);
            memset(buf, 0, 0x1c94);
            buf[0] = 0x11c94;
            buf[2] = 1;
            buf[3] = 1;
            buf[10] = userfreq;
            NvSetPstates(hdlGPU[0], buf)
                ? printf("\nGPU OC failed!\n")
                : printf("\nGPU OC OK: %d MHz\n", userfreq/1000);
            free(buf);
        } else {
            printf("\nGPU Frequency not in safe range (-250MHz to +250MHz).\n");
            return 1;
        }
    }

    /* Second argument: VRAM offset in MHz (clock domain 4). */
    if(argc > 2){
        userfreq = atoi(argv[2])*1000;
        if(-250000 <= userfreq && userfreq <= 250000) {
            buf = malloc(0x1c94);
            memset(buf, 0, 0x1c94);
            buf[0] = 0x11c94;
            buf[2] = 1;
            buf[3] = 1;
            buf[7] = 4;
            buf[10] = memtype<=7?userfreq:userfreq*2;
            NvSetPstates(hdlGPU[0], buf)
                ? printf("VRAM OC failed!\n")
                : printf("RAM OC OK: %d MHz\n", userfreq/1000);
            free(buf);
        } else {
            printf("\nRAM Frequency not in safe range (-250MHz to +250MHz).\n");
            return 1;
        }
    }

    NvUnload();
    return 0;
}

Now I have a stack of games to play with mind blowing framerate, hence this post is over.