Congrats, AMDave, on getting your Linux/OpenCL/ATI GPU platform crunchin' (I know how fun that journey can be).
I'm still looking at the source and files to see how difficult it would be to rebuild the app, on the chance there's a way to squeeze more performance out of it. Since the build procedure is pretty non-standard, one of the mods I'm looking at is using autoconf/automake to bring it more in line with the GNU standard build. That's just the first step in what I envision as a long process of optimizing for the Stream architecture.
In the meantime, I've experimented with the app_info.xml file in order to coax multiple threads of execution out of the adapter:
Code:
<app_info>
    <app>
        <name>pps_sr2sieve</name>
        <user_friendly_name>Proth Prime Search (Sieve)</user_friendly_name>
    </app>
    <file_info>
        <name>primegrid_tpsieve_2.3c_x86_64-ATI-linux-gnu</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>pps_sr2sieve</app_name>
        <version_num>130</version_num>
        <plan_class>ati13ati</plan_class>
        <avg_ncpus>0.55</avg_ncpus>
        <max_ncpus>1</max_ncpus>
        <flops>1.0e11</flops>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
        <cmdline></cmdline>
        <file_ref>
            <file_name>primegrid_tpsieve_2.3c_x86_64-ATI-linux-gnu</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>
NOTE: I've renamed the executable to make it clearer what's actually being run.
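I've changed avg_ncpus to 0.55 so that each task is budgeted a bit more than half a core on my dual-core processor (as far as I can tell this only steers BOINC's scheduling rather than setting real thread affinity), and set the coproc count to 0.5 so that two (2) tasks run on the adapter at once. It seems to be working.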
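For anyone who wants to try other values, here's a rough sketch of the arithmetic as I understand it. It assumes BOINC simply packs tasks onto the adapter until their fractional coproc counts add up to one whole GPU, and that avg_ncpus is budgeted once per running task. The helper names and the rounding are my own inventions for illustration, not anything taken from BOINC itself.

```python
import math


def concurrent_gpu_tasks(coproc_count, num_gpus=1):
    """Hypothetical helper: tasks that fit if each claims `coproc_count` of a GPU.

    Assumes simple packing up to 1.0 per GPU; only fractions in (0, 1] are handled.
    """
    if not 0 < coproc_count <= 1:
        raise ValueError("coproc_count must be in the range (0, 1]")
    # The small epsilon guards against float error when 1/count should be a whole number.
    per_gpu = math.floor(1.0 / coproc_count + 1e-9)
    return per_gpu * num_gpus


def cpu_reserved(avg_ncpus, tasks):
    """Total CPU cores budgeted when `tasks` tasks each claim `avg_ncpus`."""
    return avg_ncpus * tasks


if __name__ == "__main__":
    # Values from the app_info.xml above.
    tasks = concurrent_gpu_tasks(0.5)
    print("tasks on the adapter:", tasks)
    print("CPU cores budgeted:", cpu_reserved(0.55, tasks))
```

With the values from my file this gives two tasks on the adapter and a little over one core budgeted in total, which leaves most of the second core free on a dual-core box. Treat it as a back-of-the-envelope check only; the real scheduler takes more into account than this.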