Author Message

Eric Driver



Project developer

Project tester

Project scientist

Project administrator

Joined: 8 Jul 11

Posts: 1045

Credit: 126,230,578

RAC: 111,386

Message 2424 - Posted: 15 May 2019, 22:05:56 UTC

It's been over a month since our last update, but I now have some good news. I have made some improvements to the GPU code and am ready to start deploying the new GPU apps.



I will start with the AMD OpenCL version for Linux. This will be a beta version. I have had a hell of a time with the AMD implementation of OpenCL, and this app still doesn't work on my Fedora system; I strongly believe that is due to the graphics driver. But I have had the help of a volunteer named Wiktor, and it runs fine for him (I believe he runs Ubuntu). Please keep in mind that AMD officially supports only RHEL and Ubuntu, so I will be interested to hear whether this app works for anyone else with an "unsupported" Linux distro like mine.



I also have OpenCL Windows apps that were cross-compiled using MinGW. I have no means of testing these, so I am not ready to deploy them just yet. But if anyone would like to take them for a spin offline, please let me know, and I can send them to you.

Eric Driver

Message 2426 - Posted: 16 May 2019, 2:58:52 UTC - in response to Message 2425.

> How about writing the AMD GPU app so it works with the ROCm opencl driver. The ROCm driver works great for Einstein@home.

As far as I know, it has nothing to do with the app. The OpenCL code works perfectly on Nvidia and with AMD on Ubuntu. I think what you are suggesting is that I try the ROCm driver on my Fedora system. I did try that early on with no success, but perhaps I should try again now that I have more experience with video drivers.

Diffident

Joined: 30 Apr 18

Posts: 2

Credit: 1,439,467

RAC: 0

Message 2430 - Posted: 16 May 2019, 9:21:02 UTC - in response to Message 2426.

Last modified: 16 May 2019, 9:21:24 UTC

>> How about writing the AMD GPU app so it works with the ROCm opencl driver. The ROCm driver works great for Einstein@home.

> As far as I know, it has nothing to do with the app. The OpenCL code works perfectly on Nvidia and with AMD on Ubuntu. I think what you are suggesting is that I try the ROCm driver on my Fedora system. I did try that early on with no success, but perhaps I should try again now that I have more experience with video drivers.

There must be something different. When using the ROCm driver I can run Einstein@home, but MilkyWay@home will instantly stop with a computation error. I think the ROCm driver should be preferred, since AMD is moving everything to open source instead of using the OpenCL bits from the closed Pro driver.

Eric Driver

Message 2434 - Posted: 16 May 2019, 16:38:26 UTC - in response to Message 2430.

> There must be something different. When using the ROCm driver I can run Einstein@home, but MilkyWay@home will instantly stop with a computation error. I think the ROCm driver should be preferred, since AMD is moving everything to open source instead of using the OpenCL bits from the closed Pro driver.

I agree. From what I've read, ROCm is the way to go. When I get a chance I will look into that again.

Eric Driver

Message 2435 - Posted: 16 May 2019, 18:40:40 UTC

Thanks to the successful testing by Speedy51, I will be able to deploy the Windows Nvidia OpenCL app. I should get to that in the next couple of hours.

In the meantime, has anyone with an AMD card on Linux tried to test that version? I deployed it ~12 hours ago and no tasks have been sent out yet. My own system can't seem to download tasks for it either, so I think something might be wrong with how I set up the plan class.

Eric Driver

Message 2436 - Posted: 16 May 2019, 20:45:09 UTC

I just deployed the Windows Nvidia version as a beta app. Please test and report any suspicious behavior.

Eric Driver

Message 2437 - Posted: 16 May 2019, 20:54:12 UTC - in response to Message 2429.

> Feel free to send me the OpenCL Windows app via mail. Would like to give it a try. :)

I just sent you the AMD version, since I now have confidence in the Nvidia version. Thanks!

Aurel

Joined: 25 Feb 13

Posts: 211

Credit: 8,882,763

RAC: 0

Message 2439 - Posted: 17 May 2019, 8:32:02 UTC - in response to Message 2436.

Runtime for an sf5 task: 1 hour and 3 minutes, which is ~3 times faster than a CPU task.

Runtime for an sf6 DS7x10 task: ~7 minutes, which I can't compare to a CPU task at this time.

No errors while running the task.

It seems to work fine, at least for me. :)
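As a quick check of the arithmetic in the report above, here is a minimal sketch that turns the quoted GPU runtime and the approximate 3x speedup into an implied CPU runtime. The variable names are illustrative, and the factor of 3 is the poster's rough figure, not a measured value.

```python
from datetime import timedelta

# GPU runtime reported for the sf5 task.
gpu_runtime = timedelta(hours=1, minutes=3)

# Rough speedup quoted in the post ("~3 times faster"); an approximation.
approx_speedup = 3

# Implied CPU runtime if the speedup were exactly 3x.
implied_cpu_runtime = gpu_runtime * approx_speedup

print(f"GPU runtime:         {gpu_runtime}")
print(f"Implied CPU runtime: {implied_cpu_runtime}")
```

Under that assumption the CPU task would take a little over three hours, so the result is only as precise as the "~3 times" figure.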

Eric Driver

Message 2440 - Posted: 17 May 2019, 14:23:08 UTC

The Nvidia Windows version seems to be doing pretty well. Many successful results from multiple users. Only 2 compute errors.

I am going on a road trip (vacation) for a week. I will have cell phone coverage, but will be unable to do any major project maintenance. I will look into the above compute errors when I return.

Chooka

Joined: 3 May 18

Posts: 12

Credit: 7,012,355

RAC: 0

Message 2446 - Posted: 19 May 2019, 8:18:57 UTC - in response to Message 2435.

> Thanks to the successful testing by Speedy51, I will be able to deploy the Windows Nvidia OpenCL app. I should get to that in the next couple of hours.
>
> In the meantime, has anyone with an AMD card on Linux tried to test that version? I deployed it ~12 hours ago and no tasks have been sent out yet. My own system can't seem to download tasks for it either, so I think something might be wrong with how I set up the plan class.

I have AMD cards but use Windows, so I can't help, sorry.

Eric Driver

Message 2451 - Posted: 27 May 2019, 1:14:05 UTC - in response to Message 2440.

> The Nvidia Windows version seems to be doing pretty well. Many successful results from multiple users. Only 2 compute errors.
>
> I am going on a road trip (vacation) for a week. I will have cell phone coverage, but will be unable to do any major project maintenance. I will look into the above compute errors when I return.

I found the bug that was causing the compute errors. It's minor and affects less than 1% of the WUs. I will get a fix out there later this evening.

Henk Haneveld

Joined: 12 Oct 17

Posts: 2

Credit: 231,548

RAC: 0

Message 2452 - Posted: 27 May 2019, 15:12:34 UTC

The estimated runtime for results with the new version is very, very wrong.

They show a runtime of 22 seconds on my host but take about an hour to finish.

Eric Driver

Message 2453 - Posted: 27 May 2019, 15:38:47 UTC - in response to Message 2452.

> The estimated runtime for results with the new version is very, very wrong.
>
> They show a runtime of 22 seconds on my host but take about an hour to finish.

That's because CreditNew restarts the stats calculations with each new app version. I'm not sure exactly how to change its initial value. I too saw this last night, but by this morning it was estimating 25 minutes per task, which is accurate for my GPU.
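The behavior described above, where a fresh app version starts from a seeded estimate and then converges as real results come in, can be illustrated with a toy estimator. This is a simplified sketch and not BOINC's or CreditNew's actual algorithm: the class `RuntimeEstimator`, the `min_samples` threshold, and the individual sample runtimes are all hypothetical. Only the 22-second seed and the 25-minute figure come from the thread.

```python
from statistics import mean


class RuntimeEstimator:
    """Toy estimator: use a seed until enough real runtimes are recorded."""

    def __init__(self, seed_estimate_s: float, min_samples: int = 3) -> None:
        self.seed_estimate_s = seed_estimate_s
        self.min_samples = min_samples  # hypothetical threshold
        self.samples: list[float] = []

    def record(self, runtime_s: float) -> None:
        """Store the measured runtime of a finished task."""
        self.samples.append(runtime_s)

    def estimate(self) -> float:
        """Return the seed until min_samples results exist, then their mean."""
        if len(self.samples) < self.min_samples:
            return self.seed_estimate_s
        return mean(self.samples)


# A new app version resets the statistics, so the estimator starts from the seed.
estimator = RuntimeEstimator(seed_estimate_s=22.0)
initial = estimator.estimate()

# Hypothetical finished tasks averaging 25 minutes (1500 s).
for runtime_s in (1450.0, 1500.0, 1550.0):
    estimator.record(runtime_s)
converged = estimator.estimate()

print(f"initial estimate:   {initial:.0f} s")
print(f"converged estimate: {converged / 60:.0f} min")
```

The point of the sketch is only that a bad seed matters most right after a new version is released and stops mattering once measured runtimes take over.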

Eric Driver

Message 2454 - Posted: 27 May 2019, 17:20:54 UTC - in response to Message 2453.

>> The estimated runtime for results with the new version is very, very wrong.
>>
>> They show a runtime of 22 seconds on my host but take about an hour to finish.

> That's because CreditNew restarts the stats calculations with each new app version. I'm not sure exactly how to change its initial value. I too saw this last night, but by this morning it was estimating 25 minutes per task, which is accurate for my GPU.

So I believe CreditNew uses rsc_fpops_est as its initial starting point. This was at least 10x too low. I have now fixed this, so going forward the initial flops estimates should be better.
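To see why a low rsc_fpops_est leads to a short initial estimate, here is a rough sketch assuming the initial runtime estimate is simply the estimated operation count divided by the device speed. The function name, the device speed of 1e12 FLOPS, and the operation counts are assumptions chosen for illustration; only the 22-second figure and the factor of 10 come from the thread.

```python
def estimated_runtime_seconds(rsc_fpops_est: float, device_flops: float) -> float:
    """Return estimated operations divided by assumed device speed."""
    if device_flops <= 0:
        raise ValueError("device_flops must be positive")
    return rsc_fpops_est / device_flops


# Hypothetical device speed, chosen only to make the numbers easy to follow.
DEVICE_FLOPS = 1.0e12

# Hypothetical operation count that reproduces the 22-second estimate.
old_fpops_est = 2.2e13

# The thread says the old value was at least 10x too low.
new_fpops_est = old_fpops_est * 10

old_estimate = estimated_runtime_seconds(old_fpops_est, DEVICE_FLOPS)
new_estimate = estimated_runtime_seconds(new_fpops_est, DEVICE_FLOPS)

print(f"old estimate: {old_estimate:.0f} s")
print(f"new estimate: {new_estimate:.0f} s")
```

Under this assumption the estimate scales linearly with rsc_fpops_est, so a value that is 10x too low gives an initial estimate that is 10x too short.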

Eric Driver

Message 2459 - Posted: 7 Jun 2019, 1:55:05 UTC - in response to Message 2458.

> The beta apps are functioning as intended, yes?
>
> Will we see a Windows app for opencl_amd?

Yes, thanks for reminding me. The Nvidia apps have been working well, so I just promoted them to normal app status. There is still some room for optimization, but the apps are stable, so I think this is a good idea.

There are a couple of people helping with the AMD OpenCL versions. It is the exact same OpenCL code that works perfectly on Nvidia, but AMD cards are very finicky. I believe it comes down to inconsistent drivers. The AMD OpenCL app on Linux had about half a dozen successful results, which is a good sign.