SkyBoost (topic 6)

Post » Sun May 20, 2012 11:34 am

DariusD
Argh, the R4 sources have a bug. I was in a hurry replacing the asi files, so I forgot about the sources, lol.
And I just checked: the highest instruction set in your compilation is SSE2.
Marie Maillos
 
Posts: 3403
Joined: Wed Mar 21, 2007 4:39 pm

Post » Sun May 20, 2012 6:18 am

Tested r3 and r4 (SSE2 and FPU).
Got the lowest FPS when going out the back of Honeyside in Riften and looking at the mountains: about 22 FPS with r4 (both versions), 20-21 with r3.
No crashes at all with the r4 version.
Got some mods installed; apart from Skyrim HD Lite and a lot of ini tweaks, nothing heavy.
Phenom II X3 720 OC to 3.2 GHz stable,
8 GB DDR3 RAM,
PowerColor 6870 at stock, 12.1 preview driver,
Win7 x64.

I noticed that GPU load hovers around 65-80%, and my cpu load too...
isn't that a bit low?

to Alexander:
aren't there any newer optimisations for the Phenom II than SSE2 and/or FPU?
noticed this link:
http://board.flatassembler.net/topic.php?t=5122&start=180
I seem to recall that some applications compiled with the Intel compiler didn't run very efficiently on AMD CPUs.

don't know if this could be the case here?

btw, what are the differences between your 7 test builds?
Robyn Lena
 
Posts: 3338
Joined: Mon Jan 01, 2007 6:17 am

Post » Sun May 20, 2012 12:35 pm

Bonusbartus
One of those 7 builds must be correct and must run without CTDs; all the others must CTD. The task is to find the correct one.
And your link doesn't open for me.
Stephanie Valentine
 
Posts: 3281
Joined: Wed Jun 28, 2006 2:09 pm

Post » Sun May 20, 2012 2:32 pm

Bonusbartus
One of those 7 builds must be correct and must run without CTDs; all the others must CTD. The task is to find the correct one.
And your link doesn't open for me.
doesn't open here either.... worked a few minutes ago...
link is about optimisations for mandelbrot test on different architectures...
not at home right now, so can't test for you....

what gpu/cpu loads is everyone else getting?
Mashystar
 
Posts: 3460
Joined: Mon Jul 16, 2007 6:35 am

Post » Sun May 20, 2012 4:02 am

Any point targeting SSSE3 (seems to be the highest for the Q6600)? I may grab Visual Studio and give it a try. Is the source fixed now?
Cody Banks
 
Posts: 3393
Joined: Thu Nov 22, 2007 9:30 am

Post » Sun May 20, 2012 3:09 pm

baker99
SSSE3 is not supported by the MSVC code generator.
Ann Church
 
Posts: 3450
Joined: Sat Jul 29, 2006 7:41 pm

Post » Sun May 20, 2012 7:26 am

DariusD
Argh, the R4 sources have a bug. I was in a hurry replacing the asi files, so I forgot about the sources, lol.
And I just checked: the highest instruction set in your compilation is SSE2.

Do we need to reinstall r4?
Harry-James Payne
 
Posts: 3464
Joined: Wed May 09, 2007 6:58 am

Post » Sun May 20, 2012 3:29 pm

Thanks Alexander, I won't download it then. Looks like SSE3 doesn't add much over SSE2 anyway, just 13 new instructions. I haven't got enough coding skill to tweak the code itself like you, so I'll leave it to the experts.
vanuza
 
Posts: 3522
Joined: Fri Sep 22, 2006 11:14 pm

Post » Sun May 20, 2012 5:28 pm

For everyone who is asking about an Oblivion boost: a quick look confirmed that basic optimizations were turned on there by default, so there's no sense in trying to boost it so far. I should probably install the game and check everything else, but not until I finish with SkyBoost.

NICK ALTMAN
nope
Astargoth Rockin' Design
 
Posts: 3450
Joined: Mon Apr 02, 2007 2:51 pm

Post » Sun May 20, 2012 4:03 am

Wonder why they turned it off for Skyrim.
Scarlet Devil
 
Posts: 3410
Joined: Wed Aug 16, 2006 6:31 pm

Post » Sun May 20, 2012 1:12 pm

DariusD
Argh, the R4 sources have a bug. I was in a hurry replacing the asi files, so I forgot about the sources, lol.
And I just checked: the highest instruction set in your compilation is SSE2.

I know - it's more about newer compiler and processor target than about instructions :)
__forceinline float sub_64DF40(int _this, int a2) // dot product
{
    return (float)(*(float *)_this * *(float *)a2
                 + *(float *)(_this + 4) * *(float *)(a2 + 4)
                 + *(float *)(_this + 8) * *(float *)(a2 + 8));
}

It's hard for the compiler to chew code like this and see opportunities for SSE overall: as _this and a2 could point to overlapping memory, code like this introduces nasty dependencies on register loads etc. On the other hand, this is pretty much optimal for AMD CPUs.
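
For illustration only, here is a minimal sketch of the same dot product written against a typed vector instead of raw integer addresses. `Vec3`, `dot` and `load_vec3` are made-up names, not taken from the SkyBoost or Skyrim sources, and nothing here is guaranteed to produce better SSE code on any particular compiler; it just gives the optimizer explicit types to work with. Note that the decompiled form stores addresses in `int`, which assumes 32-bit pointers.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical 3-component vector. Three consecutive floats, matching the
// byte offsets 0, 4 and 8 used by the decompiled function.
struct Vec3 {
    float x, y, z;
};

static_assert(sizeof(Vec3) == 3 * sizeof(float), "Vec3 must be tightly packed");

// Dot product on typed operands: same arithmetic, no integer-to-pointer casts.
inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Bridge for code that only has a raw float pointer: copies three floats
// into a local Vec3, so later math works on a private copy.
inline Vec3 load_vec3(const float* p) {
    assert(p != nullptr);
    Vec3 v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

With `Vec3 a{1, 2, 3}` and `Vec3 b{4, 5, 6}`, `dot(a, b)` evaluates to 32.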

Alex - any chance of fixed r4 sources?
Justin Bywater
 
Posts: 3264
Joined: Tue Sep 11, 2007 10:44 pm

Post » Sun May 20, 2012 3:39 am

r3 gave me problems with loading distant textures.
r4 seems to have fixed that.

Even after adding larger textures and tweaking the settings to 'higher than ultra' in the .ini files, I can't seem to find any problems with it either. Good job. Not bad.
CRuzIta LUVz grlz
 
Posts: 3388
Joined: Fri Aug 24, 2007 11:44 am

Post » Sun May 20, 2012 5:01 am

I know - it's more about newer compiler and processor target than about instructions :)
__forceinline float sub_64DF40(int _this, int a2) // dot product
{
    return (float)(*(float *)_this * *(float *)a2
                 + *(float *)(_this + 4) * *(float *)(a2 + 4)
                 + *(float *)(_this + 8) * *(float *)(a2 + 8));
}

It's hard for the compiler to chew code like this and see opportunities for SSE overall: as _this and a2 could point to overlapping memory, code like this introduces nasty dependencies on register loads etc. On the other hand, this is pretty much optimal for AMD CPUs.

Alex - any chance of fixed r4 sources?

Does that hold for all AMD targets?
I can imagine the Phenoms behaving differently from Bulldozers, with their new FP architecture.
Erika Ellsworth
 
Posts: 3333
Joined: Sat Jan 06, 2007 5:52 am

Post » Sun May 20, 2012 5:25 pm

Any point targeting SSSE3 (seems to be the highest for the Q6600)? I may grab Visual Studio and give it a try. Is the source fixed now?

HADDPS is also pretty useless for this game (it's the only SSE3 instruction that is useful for games).
Charity Hughes
 
Posts: 3408
Joined: Sat Mar 17, 2007 3:22 pm

Post » Sun May 20, 2012 4:04 pm

HADDPS is also pretty useless for this game (it's the only SSE3 instruction that is useful for games).

And even more useless for the Q6600, as its SSE3 performance was not that great; it took Penryn and Nehalem for SSE3 to start to make sense.
Melanie Steinberg
 
Posts: 3365
Joined: Fri Apr 20, 2007 11:25 pm

Post » Sun May 20, 2012 6:39 pm

Some people seem to get more performance with SSE2 on Phenom II, but not me. I'm also using a Phenom II X6 1100T, but I get more performance with FPU: about 2 FPS more in Markarth. But I noticed stuttering in dungeons or caves... It wasn't that bad in r3...
Tania Bunic
 
Posts: 3392
Joined: Sun Jun 18, 2006 9:26 am

Post » Sun May 20, 2012 12:20 pm

Guys, let's test r4 test 2. I blocked some function optimizations, so we can try to figure out the bad one that causes CTDs.
http://dl.dropbox.com/u/1237747/r4_test_2.zip

We need to confirm that there is no CTD with this version.
Crystal Birch
 
Posts: 3416
Joined: Sat Mar 03, 2007 3:34 pm

Post » Sun May 20, 2012 4:39 pm

Alexander, is test 2 FPU or SSE2?
Rob Davidson
 
Posts: 3422
Joined: Thu Aug 02, 2007 2:52 am

Post » Sun May 20, 2012 9:31 am

sse2
Nicole Kraus
 
Posts: 3432
Joined: Sat Apr 14, 2007 11:34 pm

Post » Sun May 20, 2012 12:44 pm

Thanks, I'll test tonight, but so far I haven't noticed any more crashes with r4 than vanilla, which isn't many at all.
Jason White
 
Posts: 3531
Joined: Fri Jul 27, 2007 12:54 pm

Post » Sun May 20, 2012 9:58 am

I've been using R4 SS4 now for maybe around 8 hours of in-game time, and not a single CTD with Intel C2D E6550.
Minako
 
Posts: 3379
Joined: Sun Mar 18, 2007 9:50 pm

Post » Sun May 20, 2012 9:43 am

I've been using R4 SS4 now for maybe around 8 hours of in-game time, and not a single CTD with Intel C2D E6550.

Same here, but with the FPU version: I played for 12 hours, finished the main quest, and not a single CTD.
Stryke Force
 
Posts: 3393
Joined: Fri Oct 05, 2007 6:20 am

Post » Sun May 20, 2012 4:54 pm

I am guessing that sse2 test2 is mainly for those who had crashes with sse2 test1 right?
Just asking because I never had any CTDs with r4 test 1 sse2 or fpu.

Edit: Well, I tested it anyway, and here is a bench comparing it to the rest. Keep in mind the graph's axis starts at a minimum of 44.
http://oi41.tinypic.com/hvqseb.jpg

I found that it was actually smoother: as shown on the graph, the dips are smaller than with SSE2 test 1 and FPU.
Any chance for an AVX or FMA4 test version?
Emma louise Wendelk
 
Posts: 3385
Joined: Sat Dec 09, 2006 9:31 pm

Post » Sun May 20, 2012 9:17 am

Some people seem to get more performance with SSE2 on Phenom II, but not me. I'm also using a Phenom II X6 1100T, but I get more performance with FPU: about 2 FPS more in Markarth. But I noticed stuttering in dungeons or caves... It wasn't that bad in r3...

Scratch that, I just tried SSE2 test 2 and I get about 36 FPS in Markarth near the Watermill. With test1 SSE2 I had 29-30 FPS and with FPU 30-31 FPS:

Markarth (near Watermill):

r3: 27 FPS
r4 test1 SSE2: 29-30 FPS
r4 test1 FPU: 30-31 FPS
r4 test2 SSE2: 36 FPS

Using Phenom II X6 1100T...
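
To put the numbers above on a common scale, here is a small sketch that expresses a result as a percentage gain over the r3 baseline. `percent_gain` is a hypothetical helper written for this comparison, and the inputs are just the single-spot readings listed above, not a proper benchmark.

```cpp
#include <cassert>

// Percentage change of `fps` relative to `baseline`, both in frames per second.
// A positive result means `fps` is faster than the baseline.
inline double percent_gain(double baseline, double fps) {
    assert(baseline > 0.0);  // a zero or negative baseline makes no sense here
    return (fps - baseline) / baseline * 100.0;
}
```

`percent_gain(27.0, 36.0)` comes out at roughly 33% for r4 test2 SSE2, while the two test1 builds at 29-31 FPS land between roughly 7% and 15% over r3.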
Shae Munro
 
Posts: 3443
Joined: Fri Feb 23, 2007 11:32 am

Post » Sun May 20, 2012 7:19 pm

Thanks, Alexander, for the FPU version. I have an SSE-only processor, an Athlon XP.
I'll post results.
Katie Pollard
 
Posts: 3460
Joined: Thu Nov 09, 2006 11:23 pm
