
Build Thread 2.0


464 replies to this topic

#1
samm

    (8) Warlock

  • Members
  • 1162 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

Build Thread 1.0

 

An Nvidia engineer has stated that the 512MB is still around 4 times faster than system memory, and the GPU still has 224 GB/s bandwidth if memory is being accessed from both pools. The only time bandwidth is 28 GB/s is when the GPU is only accessing the 512MB pool, and it's possible for the OS, drivers, and game engine to stop that happening.

Hm, as far as I've gathered, if the GPU is accessing the 512MB in question, it has to do so exclusively. So it *will* impact performance significantly under these specific conditions, because not only is the bandwidth impaired for these 512MB, but the memory controller also has to switch modes between accessing parts of the other 3.5 GB and these 512MB. But it is hard to construct settings where this pool is accessed exclusively: people testing for this issue specifically will have to create conditions where the memory allocation is *not* handled by the driver, which would try its best to keep the usage below 3.5 or above 4GB to mask the issue, and where the memory allocated is strictly between 3.5 and 4GB, with the upper part being accessed most...
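For reference, the numbers being thrown around follow directly from the 970's memory layout. A back-of-the-envelope sketch, assuming the commonly reported 256-bit bus and 7 Gbps effective GDDR5 data rate:

```python
# GTX 970 bandwidth arithmetic (theoretical peaks; assumes the widely
# reported 256-bit bus and 7 Gbps effective GDDR5 data rate).
bus_width_bits = 256
data_rate_gbps = 7                                 # effective GT/s per pin

total_gbs = bus_width_bits * data_rate_gbps / 8    # aggregate: 224 GB/s
per_controller_gbs = total_gbs / 8                 # eight 32-bit controllers

fast_pool_gbs = per_controller_gbs * 7             # 3.5 GB pool: 7 controllers
slow_pool_gbs = per_controller_gbs * 1             # 0.5 GB pool: 1 controller

print(total_gbs, fast_pool_gbs, slow_pool_gbs)     # 224.0 196.0 28.0
```

This is why the slow pool shows up as exactly 28 GB/s in the reporting: it hangs off a single 32-bit controller, while the full 224 GB/s figure requires both pools to be in flight.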
 
In normal usage, a user will probably just feel a little more stutter when some of the impaired memory blocks are accessed every now and then, or when the driver swaps data to/from main memory to avoid the weak 512MB. But if the "sour spot" is hit, something like the following can result:
GTX 980 underclocked to match the GTX 970 in compute, texture and pixel fillrate ([edit] it's unclear to me whether this equality in theoretical horsepower is achieved taking into account nVidia's original claims regarding the 970's ROP specs or the newly revised actual specs): http://www.pcgamesha...SAApng-pcgh.png
 
Frametimes and RAM usage (usage as reported by tools that are, admittedly, known not to cope well with the GTX 970's unique setup):
http://www.pcgamesha...TX_970-pcgh.png
vs. said underclocked GTX 980
http://www.pcgamesha...TX_980-pcgh.png
 
(if deep links fail, please manually copy / paste the links into the address bar)
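The underclocking methodology PCGH used can be sketched numerically. A minimal example using the public reference specs (2048 CUDA cores for the 980, 1664 for the 970, 1050 MHz base clock for the 970); the ROP ambiguity mentioned above is exactly why this equivalence is only approximate:

```python
# Matching a GTX 980's shader throughput to a GTX 970's by underclocking.
# Core counts and clock are the public reference specs; memory clocks
# would have to be matched separately.
cores_980, cores_970 = 2048, 1664   # CUDA cores
clock_970 = 1050                    # MHz, GTX 970 reference base clock

# Clock at which the 980's cores * clock product equals the 970's:
clock_980_matched = clock_970 * cores_970 / cores_980
print(round(clock_980_matched))     # 853 MHz
```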
 
 
As for the possibilities to avoid the situation: game engines almost certainly will not account for one GPU's particular and very specific weaknesses, unless nVidia pays the developers for the extra effort and layer of abstraction they'd have to insert for that. The same is true for an OS, especially one developed before this issue became common knowledge - which is every OS available now or appearing soon. So if this is handled anywhere, I'd expect it to be at the driver level.
 
 
[edit]
And regarding that engineer's claim that 28GB/s is four times faster than main memory bandwidth: if he was not talking about access times or anything like that, I'd be ashamed if I were him ;) Even a very common case of, let's say, a dual-channel pair of DDR3-1600 (a.k.a. PC3-12800, which states the MB/s for one channel) offers similar bandwidth - at least between CPU and RAM, while the bandwidth available between GPU and system RAM is capped by a PCIe 3.0 x16 link at a theoretical maximum of about 16GB/s.
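The system-side figures invoked here can be checked the same way. A sketch of the theoretical peaks (the PCIe number accounts only for 128b/130b line coding, not protocol overhead, so real-world throughput is lower still):

```python
# System-memory vs. PCIe bandwidth, theoretical peaks in GB/s.
# DDR3-1600 dual channel: 1600 MT/s * 8 bytes per channel * 2 channels.
ddr3_1600_dual = 1600e6 * 8 * 2 / 1e9            # CPU <-> RAM

# PCIe 3.0 x16: 8 GT/s per lane, 16 lanes, 128b/130b encoding.
pcie3_x16 = 8e9 * 16 * (128 / 130) / 8 / 1e9     # GPU <-> system RAM

print(ddr3_1600_dual)           # 25.6
print(round(pcie3_x16, 2))      # 15.75
```

So a plain dual-channel DDR3-1600 setup already sits at 25.6 GB/s on the CPU side, close to the 28 GB/s of the slow pool, while the GPU's path to that memory is bottlenecked well below it.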
 
 
[edit 2]
To clarify: I still think the 970 offers quite competitive performance and even better efficiency for its price. It's just the misleading PR (down to actually false technical specs) surrounding it that I'm trying to debunk here.


Edited by Fionavar, 28 January 2015 - 09:06 PM.
Closed Build Thread 1.0


#2
AwesomeOcelot

    (9) Sorcerer

  • Members
  • 1225 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

Hm, as far as I've gathered, if the GPU is accessing the 512MB in question, it has to do so exclusively.


That's what the Anandtech article says, but this is what the PC perspective article says:
 

To those wondering how peak bandwidth would remain at 224 GB/s despite the division of memory controllers on the GTX 970, Alben stated that it can reach that speed only when memory is being accessed in both pools.


As I was writing this I noticed an update further down:

I wanted to clarify a point on the GTX 970's ability to access both the 3.5GB and 0.5GB pools of data at the same time. Despite some other outlets reporting that the GPU cannot do that, Alben confirmed to me that because the L2 has multiple request busses, the 7th L2 can indeed access both memories that are attached to it at the same time.


To my mind, if the 970 couldn't access both memory pools at the same time then the card would not perform anywhere near as well as it does.

In normal usage, a user will probably just feel a little more stutter when some of the impaired memory blocks are accessed every now and then. But if the "sour spot" is achieved, something like that can result:


Which hasn't been shown in game benchmarks.

And regarding the claim of that engineer that 28GB/s is four times faster than main memory bandwidth: if he was not talking about access times or anything like that, I'd be ashamed if I were him ;) Even a very common case of, let's say, a dual channel pair of 1600 DDR 3 (aka. PC3-12800, which shows the MB/s for one channel) offers bandwidth similar to that (at least between CPU and RAM, while the bandwidth available between GPU and system RAM is capped by a PCI-e 3 x16 at a theoretical maximum of about 16GB/s)


What I assume he's referring to is the scenario where the PC would use the system RAM instead of the VRAM, the PCIe link being around 4 times slower. In this context, I don't think anyone who knows a little about this would assume he's referring to the bandwidth of the memory itself; he's obviously referring to the bandwidth of system memory in practice with this application. Also, in practice the bandwidth will be less than 16GB/s.

As for the possibilities to avoid the situation: game engines almost certainly will not account for one GPU's particular and very specific weaknesses, unless nVidia pays the developers for the extra effort and layer of abstraction they'd have to insert for that. The same is true for an OS, especially if developed prior to the common knowledge of this issue - which is every OS that is available now or appearing soon. So if anywhere, I'd expect this to be handled at driver level.


Going forward, Nvidia might employ this method for their second-"tier" GPU, so it may affect how game engines allocate memory, especially since the GTX 970 is pretty popular and Nvidia has the most market share. PC Perspective seems to think the OS, at least a modern OS like Windows 8, already sees the different pools and allocates memory based on their speed. Obviously the drivers are going to handle this as best they can.

I think that people are overreacting to this, but Nvidia have shot themselves in the foot. The GTX 970 was a win in performance and cost for them; the partial-disable technology allowed this card to exist, but now that's soured.

#3
samm

    (8) Warlock

  • Members
  • 1162 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer
Thanks for the clarification from the Update to the PCPer article, sounds reasonable.
 

Which hasn't been shown in game benchmarks.

Yes it has - I posted links to game benchmark results showing an occurrence of such problems.
 

What I assume he's refering to is the scenario where the PC would use the system RAM instead of the VRAM, the PCIe link being around 4 times slower. In this context, I don't think anyone who knows a little about this would assume he's refering to the bandwidth of the memory itself, he's obviously refering to the bandwidth of the system memory in practice with this application. Also in practice the bandwidth will be less than 16GB/s.

Disregarding the taunt "anyone who knows a little about this", you basically confirm what I wrote ;) 28GB/s / 4 would mean 7 GB/s. That could be correct for PCIe 2.0, or an x8 PCIe 3.0 slot, or a lane-constrained CPU with more than one PCIe slot occupied. So all right, in many cases he would be correct.
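The division, and the link configurations that land in that range, work out as follows (theoretical peaks; actual link width depends on the platform):

```python
# Does "4x faster than system memory" hold for the 28 GB/s pool?
slow_pool = 28.0                # GB/s, the 970's 0.5 GB pool
implied = slow_pool / 4         # implied GPU <-> system RAM bandwidth: 7 GB/s

# Theoretical PCIe peaks that land near 7-8 GB/s:
pcie2_x16 = 5e9 * 16 * (8 / 10) / 8 / 1e9     # PCIe 2.0: 5 GT/s, 8b/10b coding
pcie3_x8 = 8e9 * 8 * (128 / 130) / 8 / 1e9    # PCIe 3.0 at only 8 lanes

print(implied, pcie2_x16, round(pcie3_x8, 2))  # 7.0 8.0 7.88
```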
 

I think that people are over reacting to this, and that Nvidia have shot themselves in the foot. The GTX 970 was a win in performance and cost for them, the partial disable technology allowed this card to exist, but now that's soured.

Agreed. The card performs well, it just has some points where its compromises show.

What's not OK about the situation, in my opinion, is the false advertising regarding the GPU's specs - I don't believe nVidia never looked at any reviews for the 970 containing wrong spec sheets or schematics.


[edit]

PC Perspective seem to think the OS already, at least a modern OS like Windows 8, sees the different pools and allocates memory based on their speed.

To my knowledge, this is not the case. The OS is not really aware of the GPU memory, let alone how it's organized. It is up to programmers to implement memory access in an efficient manner.

Edited by samm, 27 January 2015 - 05:59 PM.


#4
AwesomeOcelot

    (9) Sorcerer

  • Members
  • 1225 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

Yes it has, I posted links to game benchmark results showing an occurence of such problems.


a) The graph is reporting the engine using less than 3.5GB of VRAM which should only use the one pool anyway, some articles claim certain tools read the amount of allocated VRAM incorrectly.

b) There's a 3rd graph of the R290x using 4GB and having lots of spikes. Which only means that the GTX 980 is a better GPU than a GTX 970 or Radeon R290x. Spikes are what happen when the GPU starts falling over due to excessive demand on it. Many cards have spikes at those settings, many cards have worse spikes than the GTX 970, we've already seen benchmarks of frame times for the GTX 970 at ultra settings with games like Watch Dogs, the GTX 970 didn't perform worse than similar class cards with a single pool of 4GB.

c) A GTX 970 is not an underclocked GTX 980, as you yourself acknowledged.

d) I don't understand the argument, after knowing that the pools can be accessed simultaneously, that this could cause significant stuttering. A game like Watch Dogs is only using one pool for some reason, and it's the slower 512MB pool? That's not going to happen. Stuttering can be caused by a lot of things, and people latched onto Nai's benchmark even though it's meaningless in terms of real-world game performance. If the partial disable is causing stuttering, it's not for the reasons stated; it's not because the 512MB is a slower pool. I seem to recall people complaining about stuttering in Watch Dogs with a variety of cards, including the various Titans.

#5
Bartimaeus

    (8) Warlock

  • Members
  • 1173 posts
Your image link leads to here: http://www.pcgamesha.../nodeeplink.gif

#6
AwesomeOcelot

    (9) Sorcerer

  • Members
  • 1225 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

You can't hotlink them. All 4 of the image links are from the same article.



#7
samm

    (8) Warlock

  • Members
  • 1162 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

a) The RAM allocation on the 980 is quite probably reported correctly, so it should be used as a reference point for how much RAM the 970 would use if its memory allocation worked the same way (same architecture, same settings, same scene, same driver, and the game most probably unaware of internal differences between 970 and 980 due to its age). Thus it's reasonable to assume that the memory use of the 970 is between 3.5 and 4GB in this case.

 

 

b) My assumptions from a) do not apply to the 290X, so the first point in section b) is irrelevant (different architecture, different driver, game very likely aware of architectural differences between the 290X and the 9xx series). Regarding other benches showing the 970 with normal frametimes: the linked benchmarks have been explicitly selected to show the problems under the given constraint of using exactly between 3.5 and 4GB, and to exhibit noticeably different behaviour between a 980 and a 970 that are clocked to have identical processing power. That is not normally the case in other benchmarks. This whole discussion is about "no benchmarks for the problem", yet you seem to refuse to see that this is a benchmark specifically designed for the GPU to exhibit its problematic behaviour.

 

c) Yes, which is the point of the discussion: where and how does a 970 behave differently from a 980 at the same settings?

 

d) After your PCPer link, I no longer uphold the argument that switching between the 512MB and the 3.5GB pool could further increase stuttering. The argument is that if that small part of the memory is used as well, the card is prone to exhibit stuttering. Yes, Watch Dogs is not a prime example of a fluid gaming experience on many cards and many settings - but at the same settings, cards of the same architecture with the same theoretical horsepower and the same driver could be expected to behave identically. Which they do not.



#8
AwesomeOcelot

    (9) Sorcerer

  • Members
  • 1225 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

Thus it's reasonable to assume that the memory use of the 970 is between 3.5 and 4GB in this case.

This is wrong: a) there are tools that will show you what the 970 is using, and b) the card is designed so that the game will only allocate 3.5GB if that's all it needs. On a 980 it doesn't really matter whether it allocates 3.5GB or 4GB.

My assumptions from a) do not apply for the 290X, so the first point in section b) is irrelevant (different architecture, different driver, game very likely aware of architectural differences between 290X and 9xx).


It shows that this behaviour is not indicative of having two pools of memory, but of the GPU hitting a wall, and the 290x is a similarly priced and performing GPU.

between a 980 and a 970 that are clocked to have identical processing power.


That is not possible with the differences in L2, ROP, and SM. Underclocking a 980 does not give you a 970.

the same theoretical horsepower of the same architecture with the same driver it could be expected to behave identically.


But they don't have the same theoretical horsepower.

#9
AwesomeOcelot

    (9) Sorcerer

  • Members
  • 1225 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

Guru3d failing to find stutter after going over 3.5GB.



#10
samm

    (8) Warlock

  • Members
  • 1162 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

That is not possible with the differences in L2, ROP, and SM. Underclocking a 980 does not give you a 970.

(snip)
But they don't have the same theoretical horsepower.

Underclocking a 980 and its memory could give you a 970, if the GPU were actually designed as it was marketed in the first place. But it is not.

Just as PCGH picked an example where it does matter, Guru3D picked an example where it does not... And the point, to me, of this whole posting history on the subject was to show that one allegation is wrong: that there'd be no test that could show the difference between the actual vs. the initially communicated specs.

This is my last posting on the matter because, basically, I do not care further than that:

The card performs well, it just has some points where its compromises show. What's not OK about the situation, in my opinion, is the false advertising regarding the GPU's specs - I don't believe nVidia never looked at any reviews for the 970 containing wrong spec sheets or schematics.



#11
AwesomeOcelot

    (9) Sorcerer

  • Members
  • 1225 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer
Anandtech was also unable to find performance problems with the GTX 970 over 3.5GB.
 

Underclocking an 980 and its memory could give you a 970, if the GPU would actually be designed as it was marketed in the first place. But it is not.


This is true, but it also contradicts what you posted, and it invalidates the methodology of the benchmark. If that site is claiming that this shows a performance problem with using two pools of memory, then I'd suggest you stop using that site.

Edited by AwesomeOcelot, 28 January 2015 - 12:13 PM.


#12
AwesomeOcelot

    (9) Sorcerer

  • Members
  • 1225 posts
  • Pillars of Eternity Silver Backer
  • Kickstarter Backer

Hardware Canucks and PC Perspective also fail to find the stutter. The GTX 970 does not have issues going over 3.5GB. People experience problems with games all the time, and this is a nice scapegoat for them.



#13
Humanoid

    Arch-Mage

  • Members
  • 3611 posts
From a practical standpoint, it at least looks like nV have done the right thing in talking to retailers who were refusing to take returns (Newegg being one of them) and instituting a returns program, even if only semi-officially. Legally it's the right thing to do for a product not sold as advertised, though one would hope this information gets published shortly through an official channel, rather than as a low-key "talk to your retailer if you're unhappy" message through social media or whatever.

That said, the card is still the best buy above $300 out there, and the news doesn't realistically make the 980 any more attractive. Yes, the deceptive behaviour is rather galling regardless, and whether intentional or not it rather colours the public perception of the company, particularly following on the heels of other scandals like Bumpgate and Crysis 2's tessellation. Whether this would change a potential buyer's decision is a matter of personal principle: objective buying advice is still unchanged for now, though one might want to wait and see what the R9 380 brings in March-ish.

#14
LadyCrimson

    Obsidian VIP

  • Members
  • 8682 posts
  • Location:Candyland
  • Pillars of Eternity Gold Backer
  • Kickstarter Backer

This is why I don't like buying tech anymore. You never know which "tests" or which "gurus" are going to fit your personal parameters. I just buy (from a walk-in store), and if I don't like it, I take it back.

 

Maybe I should just get the less expensive Titan, because even tho I swore I wouldn't do something like that again, sometimes I'm still a fool. :cat:



#15
Humanoid

    Arch-Mage

  • Members
  • 3611 posts

When talking about hardware inextricably linked to gaming, it's no surprise that the hardware reviews often have about as much integrity as games reviews.



#16
Fionavar

    Community Manager

  • Global Moderators
  • 12034 posts
  • Location:Manitoba, Canada
  • Pillars of Eternity Gold Backer
  • Kickstarter Backer
  • Deadfire Gold Backer
  • Fig Backer
  • Black Isle Bastard!

Build Thread 1.0 closed ... let 2.0 inspire!

 

FWIW: I am holding off my next gaming notebook upgrade as long as I can, to get as close to the Windows 10 release as possible!



#17
Humanoid

    Arch-Mage

  • Members
  • 3611 posts

Trying to get back to a positive note too: I'm pondering a new monitor, a decision prompted, oddly enough, by a bout of back pain. I don't actually think it's the cause of the problem, but with a dual-screen setup (a pair of 27s) I find I'm twisting my torso alternately a little to the left when gaming and a little to the right when surfing, which isn't the best ergonomically. So I think it'd be better for me (or at least an excuse as good as any) to get one more display as a centrepiece, to be flanked by the existing screens. Logistically it's going to be a bit of a problem: I probably do have the desk space, but just barely, and it'd mean moving my bookshelf speakers off the desk as well. There's also the temptation to go bigger - 30-32" for the centre - but it'd probably be even less practical.

 

With the power of hindsight, I'd have purchased a smaller pair of displays for the flanks, to be used in portrait mode, but that ship has long sailed. And these Dell U2711s don't have pivoting stands so even trying to use them in portrait mode wouldn't be feasible.

 

On reflection it feels good that even my 5-year old system with 3-year old GPU is still holding up easily well enough such that I can think about spending on stuff like this instead. People who like to surf the bleeding edge of technology might be somewhat frustrated at the glacial pace of improvements over the past few years, but not me.



#18
LadyCrimson

    Obsidian VIP

  • Members
  • 8682 posts
  • Location:Candyland
  • Pillars of Eternity Gold Backer
  • Kickstarter Backer

Not technically related, but my thing with monitors is their height vs. my chair/sitting height. I need the center of the screen to be at least eye-level, if not higher. If I place them directly on the desk, they're too low (even if they have height adjustments, it's usually not enough), and I either get bad neck pain from looking down-ish, or back pain from constant slouching. :lol: So I always have them on a pedestal or a few books or whatever.

 

I built my current rig 6 years ago and it's been very nice indeed not to have to build a new one for so long, but I'm starting to really itch for shiny upgrades by now. Hopefully what I build at year's end will last me just as long, however.



#19
Humanoid

    Arch-Mage

  • Members
  • 3611 posts

People keep stealing the phone books at the office....



#20
LadyCrimson

    Obsidian VIP

  • Members
  • 8682 posts
  • Location:Candyland
  • Pillars of Eternity Gold Backer
  • Kickstarter Backer

People keep stealing the phone books at the office....

I thought those were for sitting on.





