View Full Version : [Technical Talk] - FAQ: Game art optimisation (do polygon counts really matter?)
CrazyButcher
11-25-2007, 09:15 AM
after some heated discussion came up in the "what are you working on" thread about how anal one should be about shaving those tris down, I guess it's better to move the topic here.
this was the presented model, it's fairly low already, and has beveled edges in a fashion that allows UV mirroring and gives nice smooth normals (therefore it is good for baking and runtime interpolation).
http://i65.photobucket.com/albums/h201/boostermoose/Tacklebox.gif
the proposed version, while taking less tris, will have some issues on interpolation side of things. (paintover!!)
http://img2.freeimagehosting.net/uploads/c33e83818e.jpg
now the major issue was "less = better", which is not always right. Basically, because of the way the driver/API works, batches (that is, drawcalls) are the limiting factor in a single frame. The fewer the better. The triangle count per batch doesn't really matter, and below a certain threshold it makes no difference at all. a very good paper explaining the side effects and phenomena is this one by nvidia
http://developer.nvidia.com/docs/IO/8230/BatchBatchBatch.pdf
be aware that the "high-end" card was a gf5 back then, and even the gf2 was saturated at 130 tris, i.e. it makes no difference if you send fewer triangles per batch. a slightly newer version of that paper (wow radeon9600)
http://http.download.nvidia.com/developer/presentations/2005/GDC/Direct3D_Day/D3DTutorial03_Optimization.pdf
shows how basically there is no difference at all if you send 10 or 200 triangles. And this "threshold" where it doesn't matter anymore is constantly rising (those numbers are from 2004/2005). today's engines are all about crunching as much as possible in as few drawcalls as possible, so the triangle count of single objects becomes less of an issue. there are certain types of objects that are rendered using "instancing" setups or "skinning", which are supposed to have fewer vertices than other stuff. but even there the technical variety is just too huge.
in short, make it look good, dont go insane about taking away tris, unless you do PSP/NDS or whatever mobile, or RTS instanced units. Any modern card can crunch so many triangles that this is not the real limiting factor.
if you run the crysis sandbox editor, you will see there are more than a million tris per frame, and it still runs rather smoothly (the editor is quite a bit faster than the game). In EnemyTerritory the whole 3d terrain is rendered fully... what matters is the total drawcall count per frame, and not just meshes but postfx and so on are part of that, yes even gui elements. A lot of the limiting comes from shading the pixels, and no matter how lowres your box is, if texture + size on screen remain the same, the costs through shading will be identical. The few extra vertices that need to be calculated are negligible.
There are reasons it takes crytek, epic and other guys with tons of experience to get the most out of hardware, they do a lot of performance hunting and profile how cpu/gpu are stressed in a scene. I mean Rorshach wasn't telling you guys this for no reason, after all their studio works on the real "limits", and gets more out of the hardware than most. Hence some "veterans'" views will surprise you, simply because they are better informed and work for the "top studios". I don't want to diss anyone's opinion, it's not like people made them up themselves, it's just that times move on very quickly. Modern cards can crunch some insane workload, but it's all about the feeding mechanism...
so yes you might do some mmo which runs on every crap hardware, but even there hardware principles stay the same and looking at steam's hardware survey, there is a lot of capable hardware around...
After all this should not imply that optimization is bad, or not needed. If you can get the work done with less, and not sacrifice quality, then do it. But there is a certain level where the work/effect payoff just isn't there anymore. (this will differ for older platforms (ps2, mobiles))
----
there is another caveat though, that is the problem with "micro / thin" triangles. If you have triangles that hardly ever occupy many pixels on screen, it kills the parallelism of the GPU. This should be taken into account, i.e. do not go overboard with micro-bevelling.
-----
make sure to read Kevin Johnstone's posts
http://boards.polycount.net/showpost.php?p=762412&postcount=5
nice writeup and info here crazybutcher, but i notice you're only referring to "runtime" performance of stuff.
what about loading things into memory, or collision calculations etc.
if you have 100 duplicates of an object and it's not instanced, that is going to add up quite quickly on the memory side of things if you have extra polygons.
any more info on stuff like that rather than just brute force rendering of tris would be cool :)
you also have to bear in mind stencil shadows, more silhouette polys lead to greater stress there. not every engine uses lightmaps.
good thread tho :D
perna
11-25-2007, 09:29 AM
What this translates to is that even on ancient hardware you can actually ADD polies to that box and maintain the exact same performance rendering it... which in turn means that, yes, there IS such a thing as "insignificant optimization".
Now, Ror has the brains to guard his words, but I don't, so I'll say to you earlier optimization-worshippers: When a whole bloody torrent of highly experienced professionals tell you you're wrong, maybe you should stop being so bullheaded and actually try to research the validity of your claims.
CrazyButcher
11-25-2007, 09:40 AM
yeah mop you are right, stencil shadows are definitely an ugly case where more tris = more work. But I don't think stencil shadows will have much of a future, personal opinion though ;)
or if moved to GPU to do silhouette extraction the tris count will not be soooo important anymore.
loading into memory, well vertex counts and texture memory are the limiting factor. I remember having made a similar thread once, stating that vertices weigh significantly less, too. A triangle in a strip weighs just 4 or 2 bytes, and in the non-strip case three times as much.
collision is software stuff and might use dedicated lowest lod, geometry. Its a whole different story I would not want to touch, but yes here also less is better. However often dynamic objects are approximated with primitives such as boxes, spheres... to keep costs down. And static environment stuff is also fairly optimized to cope with a few more tris.
about "instancing". Engines will instance stuff anyway, I mostly meant higher-level techniques that do the rendering in fewer drawcalls. You will always load that box into memory only once, and have a lightweight representation for every time you use it (pos,rot,scale in world + handle to actual geometry). It's just that non-instanced rendering means a drawcall every time you render the box at another position.
of course you might do totally unique box variants, like a mesh deformed permanently, but that would be exactly like modelling two boxes.
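As a rough illustration of the "load once, place many times" idea described above, here is a minimal Python sketch. The class names, the 32-byte vertex, and the 44-byte placement record (nine floats plus an 8-byte handle) are assumptions made up for this example, not the layout of any particular engine. Index data is ignored to keep it short.

```python
from dataclasses import dataclass

BYTES_PER_VERTEX = 32          # assumption: a "small" vertex format
INSTANCE_BYTES = 9 * 4 + 8     # assumption: pos/rot/scale as 9 floats + 8-byte mesh handle


@dataclass(frozen=True)
class Mesh:
    """Geometry that is stored in memory exactly once."""
    name: str
    vertex_count: int


@dataclass(frozen=True)
class Instance:
    """Lightweight placement record that only refers to shared geometry."""
    mesh: Mesh
    position: tuple
    rotation: tuple
    scale: tuple = (1.0, 1.0, 1.0)


def scene_bytes(instances):
    """Vertex memory for the unique meshes plus one small record per placement."""
    unique_meshes = {inst.mesh for inst in instances}
    geometry = sum(m.vertex_count * BYTES_PER_VERTEX for m in unique_meshes)
    return geometry + len(instances) * INSTANCE_BYTES


def draw_calls_without_batching(instances):
    """Non-instanced rendering: one drawcall per placed object."""
    return len(instances)


box = Mesh("tacklebox", 200)
boxes = [Instance(box, (float(i), 0.0, 0.0), (0.0, 0.0, 0.0)) for i in range(100)]

print(scene_bytes(boxes))                              # shared geometry
print(len(boxes) * box.vertex_count * BYTES_PER_VERTEX)  # if every box were a unique copy
print(draw_calls_without_batching(boxes))
```

With these made-up numbers the shared version is far smaller than 100 unique copies, while the drawcall count stays at 100 unless the engine uses a hardware instancing or batching path.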
After all we are talking about a very optimized model already, its not like the box is 1000 tris. Its just the question that the tradeoff of quality/speed for < 300 or so, simply isnt worth it, considering the rendering pipeline.
Kevin Johnstone
11-25-2007, 09:49 AM
I spent a solid few months optimizing polys, lightmap UV channels, collision meshes for everything in UT and the act of stripping 2 million polys out of a level generally improved the FPS by 2 or 3 frames.
Polycount is not the huge issue people were rightly sure it was previously. The bigger issue now is texture resolution because all assets carry 3 textures as standard, normal map, diffuse and spec, and that's before you have additional mask light textures for emissives and reflection or whatever other stuff you are supporting in the shader.
Shader complexity is also a bigger issue now because it requires longer rendering time.
Section counts are a bigger issue, meshes carrying 2 of each texture and thus requiring a 2nd rendering pass.
I can't explain things technically enough for people but the coders have explained to me a couple times that the problem with just beating everything down with optimization on things like polycount doesn't affect things as much because of different things being CPU or GPU bound.
Mesh count is a big issue now that everything is a static mesh rather than the majority being BSP. BSP is terribly inefficient compared to mesh rendering also.
A mesh is pretty much free for the 1st 600 polys, beyond that its cost can be reduced dramatically by using lightmaps for generating self shadowing and so on rather than vertex lighting.
The reason I was saying I wouldn't take out the horizontal spans on this piece was also largely because as an environment artist you have to be thinking about the crimes against scale the level designers will often make with your work to make a scene work.
Just because I know it's a box doesn't mean it won't get used as something else much larger, so I always try to make sure it can hold up, whatever it is, at 4 times the scale!
Butcher mentioned instancing, this is another feature that we relied upon much more heavily to gain performance.
Due to textures / BSP being more expensive now and polycounts cheaper we made things modular, very very modular.
For instance, I made a square 384/384 straight wall using a BSP tiling texture and generated about 60 modular lego pieces that use the same texture and all fit with themselves and each other to replace BSP shelling out of levels.
This led to lots of optimizations in general and quick easy shelling of levels and it gave our levels a baseline for addition of new forms to the base geometry level.
I doubt I'm changing anyone's opinion here, maybe making a normal map driven next gen game will convince you though.
And by next gen, I just mean the current new technology like ID's new engine, crysis engine, UE3 etc because they are normal map driven and the press likes fancy names for simple progression of technology.
r.
Heh, Rorshach, in our engine you can't scale static models so everything stays the same size unless someone explicitly exports a different version of the model file :)
And I would expect a good level designer would know not to madly scale up an object obviously designed to be used as a small prop :/
But yes good info on all fronts, cheers guys.
CrazyButcher
11-25-2007, 10:26 AM
I should add that triangles are stored as "indexed" lists. that means you get a huge array of all vertices, and a triangle is made from either 3 or 1 index into that array.
say for a quad you have verts = [A,B,C,D]. And now you store triangle info (index starting at 1):
as list: [1,2,3, 2,3,4]
as strip: [1,2,3, 4]
each index normally weighs as much as a half uncompressed pixel. And thats it, triangles are really just adding some more indices to that index list. And their memory is sooo tiny compared to the rest... (unless for collision stuff, or stencils, there you need per face normals as well)
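To put numbers on the list versus strip comparison above, here is a small Python sketch. It assumes 16-bit indices (2 bytes, i.e. half of a 4-byte uncompressed pixel) and a single unbroken strip. Real meshes usually break into several strips, so treat the strip figure as a best case.

```python
INDEX_BYTES = 2  # assumption: 16-bit indices


def list_index_count(triangle_count):
    """A triangle list stores three indices per triangle."""
    return 3 * triangle_count


def strip_index_count(triangle_count):
    """One unbroken strip: three indices for the first triangle, one for each further one."""
    return triangle_count + 2 if triangle_count > 0 else 0


def index_bytes(index_count):
    return index_count * INDEX_BYTES


# The quad from the post (indices start at 1 there).
quad_list = [1, 2, 3, 2, 3, 4]
quad_strip = [1, 2, 3, 4]

print(index_bytes(len(quad_list)), index_bytes(len(quad_strip)))
print(index_bytes(list_index_count(300)), index_bytes(strip_index_count(300)))
```

Even as a plain list, a 300 triangle prop needs under 2 KB of index data under these assumptions, which is tiny next to its textures.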
The texture memory mentioned by ror is the one thing to really optimize for, as you can only crunch x megs of texmemory into your graphics card per frame.
The other thing that may reside in memory are static meshes. Note that static here means we dont want to change vertices individually "by hand". But we change thru shaders (bones), or thru spatial placement in total. Which is nearly all vertices we normally see on a frame. The opposite is data that is generated/manipulated more fundamentally every frame, think particles.
Now a game vertex is mostly around 32 or 64 bytes. Which means a 512x512 compressed texture costs about as much as 8192 "small" vertices or 4096 "fat" vertices. Their weight depends on how accurate and how much extra data you need per vertex (second UVs, vertex color...), so it may be a bit less than 4k if you use an even fatter vertex format.
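The vertex versus texture comparison above can be checked with a few lines of Python. The figures in the post work out if the compressed texture takes 1 byte per texel (as in an 8 bits per texel block format such as DXT5) and has no mipmap chain. Both of those are assumptions of this sketch, not something stated in the thread.

```python
def compressed_texture_bytes(width, height, bytes_per_texel=1.0):
    """Size of a block-compressed texture without mipmaps (assumed 1 byte per texel)."""
    return int(width * height * bytes_per_texel)


def vertices_that_fit(budget_bytes, bytes_per_vertex):
    """How many vertices of a given size take the same memory as the budget."""
    return budget_bytes // bytes_per_vertex


texture = compressed_texture_bytes(512, 512)
print(texture)                         # bytes for one 512x512 texture
print(vertices_that_fit(texture, 32))  # "small" vertices
print(vertices_that_fit(texture, 64))  # "fat" vertices
```

At 0.5 bytes per texel (a 4 bits per texel format such as DXT1) the same texture would be worth half as many vertices, so the exact ratio depends on the formats in use.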
Now the third texturememory eating thing are the special effects textures, which are uncompressed and can suck up quite some megs depending on window resolution.
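For a feel of how effect textures scale with window resolution, here is a short Python calculation. The resolutions and the 4 and 8 bytes per pixel formats are example values picked for this sketch, and the function name is hypothetical.

```python
def render_target_bytes(width, height, bytes_per_pixel=4):
    """Uncompressed screen-sized texture: every pixel is stored in full."""
    return width * height * bytes_per_pixel


def megabytes(byte_count):
    return byte_count / (1024 * 1024)


for width, height in [(1024, 768), (1280, 720), (1920, 1200)]:
    rgba8 = render_target_bytes(width, height, 4)   # 8 bits per channel
    fp16 = render_target_bytes(width, height, 8)    # 16-bit float per channel
    print(f"{width}x{height}: {megabytes(rgba8):.2f} MB (4 B/px), {megabytes(fp16):.2f} MB (8 B/px)")
```

An effect chain that keeps several such buffers alive at once multiplies these numbers, which is why they can add up to "quite some megs".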
Now the "other" memory costs are on the application side, in regular RAM, like the collision info, and "where object x is", relationships... that memory is mostly never a big problem on PC. Just on console you need to be more clever about not loading everything into RAM. As they are more limited.
So we have X memory of what fits into the graphics card which is "texturememory" "framebuffer memory + effect textures" "static meshes" and "shaders" (which are like ultra tiny compared to rest).
Now of course say we have a giant world and want to move around, it will be impossible to preload every texture/mesh into graphics memory. So we must cleverly unload/load some while we move around, so no one notices. The amount we can send over without hiccups is not a lot (keep in mind that the rest of your scene is still rendered). So one must be very clever about swapping out textures. Hence the modularity and "reuse" rorschach mentioned is even more important.
not only per-frame will it allow batching of "same state" objects. but even in the long run means less swapping occurs.
now what happens if I have more active textures in the scene than my graphics card can actually load per frame? then the driver will kick in and send a cached copy over (very ugly stalls). The driver also optimizes the moments when to reload stuff, as most loading from RAM to video RAM is done asynchronously (i.e. calling the function doesn't mean it happens right now, you let the driver do it when it wants to). So now we got the driver in the memory equation as well. Now some clever strategies the driver optimization gurus at AMD/NVidia use might create hiccups in "non common" situations. But what is "common"? If some new major game comes out and has a very specific new way to tackle a problem, we see new drivers magically appearing making for smoother rides. of course they might optimize a lot more in those drivers, but anyway...
you get a brief idea of the complexity of all this, and why the most common reply on these boards regarding polycounts and whatever is "it depends"
Kevin Johnstone
11-25-2007, 10:40 AM
Mop: sure, but theres always cases where they need to fill in gaps and anything will do when you can scale things.
Clearly a box prop isn't the best example, I was trying to detail a general attitude toward environment asset creation to substitute the 'everything must go' all purpose optimization attitude.
Also another reason for extra polys in UE3 is smoothing groups. We try to use one smoothing group because it renders a better normal map.
The optimization at all costs method led me to finding out that reducing polycount and controlling the smoothing with more smoothing groups costs as much as using more polys to create a better single smoothing group, because the engine doesn't actually have smoothing groups, it just doubles the edge count where the smoothing group changes.
That costs as much or more than adding additional chamfered edges to support a cleaner smoothing group.
I still go back and forth on this issue myself but generally the consensus at Epic is that it's better to use more polys and process out a normal map that will render well with light hitting it from any angle.
If you go the purist optimization route (as I did at the beginning of the project) and optimize with smoothing groups to control things and have half the polycount, you end up with normals that look good only when hit by light from certain angles and it's still just as expensive.
Again, I doubt anyone who has a different view is going to be changed by this information. I didn't change my opinions until I had it reproved to me dozens of times.
r.
Yep, I totally agree that for normalmapped stuff it's way better to have some extra polys and keep a single smoothing group, than to use separate smoothing angles or groups, since as you say they pretty much amount to the same vertex memory anyway, and the former gives a better normalmap.
One of our senior programmers told me that some graphics cards can have a harder time rendering long thin tris (such as those you'd get by having thin bevels instead of smoothing groups) at long range, but I don't know to what extent or how much this would impact performance, not much it seems.
Rick Stirling
11-25-2007, 10:54 AM
Great to see this all written down with some proper technical backing to it - while there is no excuse whatsoever to piss away polygons for the sake of using them, and those verts do add up, its ALWAYS textures, textures, textures that are the main bottleneck.
As for collision information, in most cases there is little issue with optimising that mesh down. I know we don't use the full resolution mesh for collision. In many/most cases, the LOD is used for collision. For characters its pretty much a series of boxes, because let's be honest - we might have modelled fingers and the inside of the mouth, but when it comes to collision all you care about is the hand and the head.
This bit interested me:
[ QUOTE ]
A mesh is pretty much free for the 1st 600 polys, beyond that its cost can be reduced dramatically by using lightmaps for generating self shadowing and so on rather than vertex lighting.
[/ QUOTE ]
Is that because you don't have to store that data on a per vert basis?
perna
11-25-2007, 11:05 AM
It only makes sense that the objects you're likely to scale in a map are also the ones you instance a lot. Rocks, ruins, vegetation, etc. You're going to see good performance on that so the polycount isn't of the highest importance. Especially taking into account that most of those objects will be lowpoly to start with.
I never use cages for rendering normal maps anymore. It turns out to be horrible workflow-wise. They most often break when you start editing the original mesh and in the case of 3ds max restrict the processing rays to straight paths instead of bending them nicely creating a smoother result. Instead of ray cages, you're mostly better off simply adding more geo to the object. You can always make that geo count visually as well as cleaning up your normal map results.
As for temporarily adding geo to a copy of the object for baking, then using the object without that added geo for your exports, I haven't done that in a long time, but am sure it would be useful in cases still.
I'll use smoothing groups at times because clients can be very insistent on respecting polycounts and nothing else.
Basically I believe a lot of people are so insistent on the anal poly-reduction because:
-they're comfortable with it
-it's very straight forward and easy to understand
-they've developed poly optimization skills they're proud of and don't like to hear that those skills aren't as important as they've believed ;)
I think most of us that have been modeling for a while mastered poly optimization a long time ago. What defines you as a 3d artist now is how good you can make stuff look. It's not like the "old days" (of just a few years ago) when pretty much all games looked really bad and your success was defined by how well you could get stuff to run.
Take the polycount your art leads gives you and use ALL of them. Don't try to impress anyone by saying you used half your budget. If you NEEDED to use half the budget.. then he would have given you half!
perna
11-25-2007, 11:19 AM
I'm not sure this has really been given much attention: It's always possible that an engine will join meshes and send in one batch. This depends upon texture use/reuse, to which degree the overhead is worth it, and so on. What that results in is pushing the polycount of the final object past the safe point (600 used by Epic, as reference). Then you've broken the barrier and now start seeing polycount relevance in rendertimes - you may be up to 10000 tris in one batch now. However, the fact that you're merging several batches is going to save performance anyway.. that is, after all, why you do it in the first place.. which cancels out the fact that you're now operating with a higher polycount.
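The trade-off described above (fewer batches, but more triangles per batch) can be sketched in Python. This is only an illustration under simplified assumptions: meshes are merged purely by a shared material name, and the material names and triangle counts are invented. A real engine would also have to bake transforms and respect buffer size limits.

```python
from collections import defaultdict


def merge_into_batches(meshes):
    """Group (material, triangle_count) pairs so each material becomes one batch."""
    batches = defaultdict(int)
    for material, triangle_count in meshes:
        batches[material] += triangle_count
    return dict(batches)


# Hypothetical props placed in one room.
props = [
    ("crate_wood", 250),
    ("crate_wood", 250),
    ("metal_trim", 120),
    ("crate_wood", 300),
    ("metal_trim", 80),
]

batches = merge_into_batches(props)
print(len(props), "drawcalls unmerged")
print(len(batches), "drawcalls merged:", batches)
```

The merged "crate_wood" batch ends up above the per-mesh triangle count of any single prop, yet the frame needs two drawcalls instead of five, which is the saving the merge is done for.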
It gets complicated, and the performance result is individual to each scenario, each engine, each game. Nobody will expect a 3d artist to know these things intimately. But it pretty much boils down to this: if it'll benefit your model a lot to go 200 tris beyond the budget you were given, go ahead and do it. You're not going to make performance drop to 15 FPS. You got to remember that the budget you were given is pretty much pure guess work to begin with ;)
That doesn't mean you shouldn't optimize, it means you should know what gives you the most bang for your buck.
Joao Sapiro
11-25-2007, 11:31 AM
amazing info here guys, keep it coming :) i have a question:
since smoothing groups are basically a detaching of faces (hence the increase of vert count since the verts were duplicated), if you have one continuous mesh and one with smoothing groups, which one would be faster to render? my assumption is the continuous one since there aren't any overlapping vertices, but i would like to know more about the implementation of smoothing groups on assets, when they are a must do and when it's better to make good smoothing via polygons.
i dont make sense.
Kevin Johnstone
11-25-2007, 11:41 AM
You only learn when and where to break the rules once you've spent months doing it. I am sorry that this sounds like a cheap answer but it's the truth.
I've seen my processing, with the cage, with multiple smoothing groups, then switching to 1 smoothing group on a basic non chamfered wall shape so the smoothing is REALLY stretched and showing lots of horrible black to white gradients in max.
When I take that ingame, the smoothing forces the engine to bend the normals so when the level is rebuilt you get a LOT more normals popping out.
Rick: that might be it, I don't remember all the technical reasons for each thing working as it does, I remember more what works simply from habit now as there are so many more rules and what not to bear in mind.
k.
perna
11-25-2007, 11:45 AM
Johny. You make sense.
3d hardware has no concept of smoothing groups. A vertex can only contain one normal, so what we call "smoothing groups" is actually just "split geometry".
So, here are the real vertex counts, with the object in the middle having smoothing groups.
http://www.per128.com/pub/crap/per128_smoothinggroup_vertexcount.jpg
Keep in mind that the same goes for UV coordinates. Just one coordinate per vertex.
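To make the "one normal per vertex" rule concrete, here is a small Python sketch that counts the vertices the hardware would see for a cube, once with fully smoothed normals and once with hard edges on every face. The cube and the helper names are assumptions for the example, and for the smooth case the corner direction is used unnormalised as a stand-in for the averaged normal.

```python
from itertools import product

# The 8 corner positions of a cube centred on the origin.
CORNERS = list(product((-1, 1), repeat=3))

# Six faces, each identified by the axis it is perpendicular to and its side.
FACES = [(axis, sign) for axis in range(3) for sign in (-1, 1)]


def face_corners(axis, sign):
    """The four corners lying on one face."""
    return [c for c in CORNERS if c[axis] == sign]


def face_normal(axis, sign):
    normal = [0, 0, 0]
    normal[axis] = sign
    return tuple(normal)


def hardware_vertex_count(smooth):
    """Count unique (position, normal) pairs, which is what the GPU treats as a vertex."""
    unique = set()
    for axis, sign in FACES:
        for corner in face_corners(axis, sign):
            # Smooth: every face agrees on one normal per corner.
            # Hard: each face brings its own normal, so the corner is split.
            normal = corner if smooth else face_normal(axis, sign)
            unique.add((corner, normal))
    return len(unique)


print(hardware_vertex_count(smooth=True))
print(hardware_vertex_count(smooth=False))
```

The modelling package reports 8 verts in both cases. Adding UVs to the tuple would split vertices along UV seams in exactly the same way.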
EarthQuake
11-25-2007, 11:45 AM
More split edges (smoothing groups) are going to give you a higher # of verts, this will always be slower. How much slower in reality it actually is I have no idea, probably not much.
perna
11-25-2007, 12:03 PM
UVmapping. In this example, the 3d model for the leg on the left will use less verts than the one on the right. It'll have more uv-distortion, but with uv-relax this is distributed throughout the UV-island and not an issue, especially when normalmapping.
edit: lesson learned is: Keep your UVmaps as continuous as possible. If you're mapping a box, all six sides should be connected. Also keep in mind that if you bevel instead of using smoothing groups, it will still increase your vert count if there's a UV seam on the beveled area.
http://www.per128.com/pub/crap/per128_uvmapping_vertexcount.jpg
CrazyButcher
11-25-2007, 12:04 PM
[ QUOTE ]
A mesh is pretty much free for the 1st 600 polys, beyond that its cost can be reduced dramatically by using lightmaps for generating self shadowing and so on rather than vertex lighting.
[/ QUOTE ]
do you mean a unique prebaked AO map? In theory if you use a second UV for a unique map, that's 2 shorts = 4 bytes, which is as much as a vertex color. And therefore more expensive (you send the same per vertex but still need to sample the AO map). it has to do with internal ut3 specific setups.
looking at the ut3 demo shaders, I actually found that you guys probably have some very complex baked lighting stuff, which is better to store in textures. It seems to be more than just a color value. In fact if the effect is done per vertex, 3 x float4 are sent, compared to the single float2 for a texture coordinate. Which is a lot more, that is "not normal" for oldschool lightmapping ;) but probably some fancy quality thing you do. haven't really reverse engineered the effect, but as a per-vertex effect it's indeed very fat. But maybe you mean realtime shadows and not baked stuff at all. ... edit: after some more diving into it, it's directional lightmapping like in hl2.
anyway this example shows that "it depends", and a magic value like the 600, has to do with vertex formats effects, ie very engine specific.
what the batch article by nvidia showed however, is that there are engine independent "limits", ie say less than 300 today makes no difference if its 1 tris or 300. (the numbers back then were around 200 I simply guessed the 300 for todays cards)
EarthQuake
11-25-2007, 12:13 PM
I think thats referring to using lightmaps as opposed to stencil shadows? Not actually ambocc type lightmaps.
hobodactyl
11-25-2007, 12:18 PM
Really cool thread! Per I had a question:
[ QUOTE ]
I never use cages for rendering normal maps anymore. It turns out to be horrible workflow-wise. They most often break when you start editing the original mesh and in the case of 3ds max restrict the processing rays to straight paths instead of bending them nicely creating a smoother result. Instead of ray cages, you're mostly better off simply adding more geo to the object. You can always make that geo count visually as well as cleaning up your normal map results.
[/ QUOTE ]
I was confused by you saying you never use cages for rendering normal maps; I thought that was the only way to render normal maps? Sorry if this is a retarded question; do you just mean you don't use Max's cages?
perna
11-25-2007, 12:28 PM
hobo: Terminology is a bit loose, it can be confusing. The cages I'm talking about are usually copies of your lowpoly mesh which are deformed to control the length and direction of the normal map processing rays. You can generate a normal map just fine without a cage, it'll use the vertex normals and a configurable ray length.
edit: in max, you can turn off the cage being shown in the projection modifier rollout, and disable it entirely in the render-to-texture menu (click Options, then Use Cage, now define a ray length)
EarthQuake
11-25-2007, 12:28 PM
You can use a cage, or you can simply use the "offset" function in max.
oXYnary
11-25-2007, 01:10 PM
Can we sticky this or add it to a PC wiki or something?
One question:
[ QUOTE ]
edit: lesson learned is: Keep your UVmaps as continuous as possible. If you're mapping a box, all six sides should be connected. Also keep in mind that if you bevel instead of using smoothing groups, it will still increase your vert count if there's a UV seam on the beveled area.
[/ QUOTE ]
So anytime you have a texture seam it will detach the vertices in the engine?
Xenobond
11-25-2007, 01:16 PM
[ QUOTE ]
Can we sticky this or add it to a PC wiki or something?
One question:
[ QUOTE ]
edit: lesson learned is: Keep your UVmaps as continuous as possible. If you're mapping a box, all six sides should be connected. Also keep in mind that if you bevel instead of using smoothing groups, it will still increase your vert count if there's a UV seam on the beveled area.
[/ QUOTE ]
So anytime you have a texture seam it will detach the vertices in the engine?
[/ QUOTE ]
Yes. UV splits & smoothing group edges will split the vertices. I remember reading a pretty good article about this in a gd mag some years ago. I'll try and dig up that article on gamasutra.
perna
11-25-2007, 01:19 PM
oxy: yes unfortunately. To help you visualize why, you can imagine the data associated with a vertex.. the structure is always the same size, so you'll only get one UV coordinate. In max, it seems you can have several uv's and several normals per vert, but even 3d modeling programs break stuff up, it's just done transparently to you. When you select and move one vert like that, you're actually moving several.
Well, I'll ask CB to give an example of such a structure. It ties together with what he said earlier about triangles just indexing a list of verts.
Rick Stirling
11-25-2007, 01:22 PM
A C&P from a half written tech doc I was working on about uvs (and smoothing groups) breaking the tri-strips
[ QUOTE ]
Many artists take the number of polygons in the model as the basis for model performance, but this is only a guideline. The real factor is the number of vertices in the model. As an artist your 3d software will count the number of verts in the model, however this is rarely the same number of verts that a game engine thinks there are.
Put simply, certain modeling techniques break the triangle stripping routine, making the vert count in the game engine be higher than the one reported in your 3d software. These attributes physically break the mesh into separate parts, and thus break triangle stripping algorithms.
The most common of these are:
Smoothing groups
Material IDs
UV seams
[/ QUOTE ]
Xenobond
11-25-2007, 01:29 PM
Haha. Why am I not surprised.
http://www.ericchadwick.com/examples/provost/byf1.html
http://www.ericchadwick.com/examples/provost/byf2.html
Part 2 talks more on the whole uv/smoothing/mat splits issue.
perna
11-25-2007, 01:30 PM
Rick: Tristrips aren't relevant to your vertex count. They work differently. The idea is that a tri can be defined by one vert, "re-using" two from the previously drawn tri. This just reduces data traffic, the amount of vertex data remains exactly the same as without tristripping.
Material IDs mean state changes, treating the "broken off" chunk as a separate object, which yes, will break the border verts.
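A short Python sketch of how a strip "re-uses" two verts from the previous tri, as described above. It expands a strip of indices back into individual triangles using the common convention of flipping every second triangle to keep the winding consistent. The function name is made up for this example.

```python
def strip_to_triangles(strip):
    """Expand a triangle strip (list of vertex indices) into separate triangles."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if a == b or b == c or a == c:
            continue  # degenerate triangle, often used to stitch strips together
        # Every second triangle swaps its first two indices to keep the winding.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


quad = strip_to_triangles([1, 2, 3, 4])
print(quad)

# Both forms reference exactly the same vertices: only the index data shrinks.
used_by_strip = set([1, 2, 3, 4])
used_by_list = {index for tri in quad for index in tri}
print(used_by_strip == used_by_list)
```

The vertex array is untouched by stripping. A strip of n indices simply yields up to n - 2 triangles instead of n / 3.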
CrazyButcher
11-25-2007, 01:33 PM
eric hosts that gamedesign paper, too. I am sure we are just minutes or hours away from him posting the links again ;)
you get a fixed set of vertex attributes. Think position,color,normal + some extras like UV channels, tangent stuff. Simply due to pipelining each vertex has no knowledge about the triangle it is part of nor other stuff (okay untrue for latest geometry shaders). So the vertex cannot have 2 normals, or 2 uvs for the same UV channel, hence the split. There might be more splits that are not visible to you (like mirrored UVs might be connected in max, but broken for tangentspace stuff). Whenever such a split occurs all other attributes are copied over, so the normal will stay the same, color... but costs are raised by a full new vertex.
A good deal of "viewport" performance depends on taking the internal 3d data (which is organized different) to those graphics hardware vertices. Hence pure modelling apps or "less complex" on vertex/triangle level, can shortcut more, and benefit from speed.
monkeyscience
11-25-2007, 02:13 PM
Before you go optimizing smoothing groups, consult your friendly neighborhood engine programmer, shared vertices may or may not be used at all by the engine and your model exporter may ignore your hard work. The engineering term is Vertex Indexing and there are some reasons not to always use it. ALL vertex data has to be the same for a vertex to be shared. Position, normal, uvs, any shader parameters all have to match. If this doesn't happen often enough, indexing is wasteful.
Also, graphics hardware still renders triangles with no regard to shared data. Per's example meshes would get rendered as 4, 4, and 6 triangles, or 12, 12, and 18 vertices. Indexing is only a way to compress data in memory and help transfer rates of meshes to the GPU. If transfer rate isn't the limiting factor but the computed vertex count is, smoothing groups won't help. Neither will converting to quads or triangle strips. This usually happens with expensive vertex shaders like skeletal animation skinning or stencil shadow edge finding.
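Here is a minimal Python sketch of vertex indexing as a memory compression step, showing both the case where it pays off and the case where it is wasteful. The 32-byte vertex, the 2-byte index, and the toy vertex records (a position label plus a normal label) are assumptions for illustration only.

```python
def build_index_buffer(corners):
    """Deduplicate triangle corners; a vertex is shared only if ALL its data matches."""
    vertices, indices, lookup = [], [], {}
    for vertex in corners:
        if vertex not in lookup:
            lookup[vertex] = len(vertices)
            vertices.append(vertex)
        indices.append(lookup[vertex])
    return vertices, indices


def indexed_bytes(vertex_count, index_count, bytes_per_vertex=32, bytes_per_index=2):
    return vertex_count * bytes_per_vertex + index_count * bytes_per_index


def unindexed_bytes(corner_count, bytes_per_vertex=32):
    return corner_count * bytes_per_vertex


# A quad as two triangles. Each corner is (position label, normal label).
smooth_quad = [("A", "n"), ("B", "n"), ("C", "n"), ("B", "n"), ("C", "n"), ("D", "n")]
split_quad = [("A", "n1"), ("B", "n1"), ("C", "n1"), ("B", "n2"), ("C", "n2"), ("D", "n2")]

for name, corners in [("smooth", smooth_quad), ("split", split_quad)]:
    vertices, indices = build_index_buffer(corners)
    print(name, len(vertices), "verts,",
          indexed_bytes(len(vertices), len(indices)), "bytes indexed vs",
          unindexed_bytes(len(corners)), "bytes unindexed")
```

When the two triangles disagree on the normal, nothing is shared, so the index list is pure overhead. Either way the vertex shader workload is a separate question from this storage cost.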
For everything else though, vertex count optimization just won't get you as far as it used to. Most normal mapped games with fancy-pants shaders are fill rate or texture lookup limited. The three little computations to figure out where a triangle ends up on the screen are just prep work for the potentially thousands of pixels that need to be computed.
"Fill rate limited" btw means fill rate is way slower than other work the graphics card is doing, so it's best to start optimizing there. It does NOT mean all other optimization work should be neglected. That's common n00b programmer talk.
If you do optimize polycount, do it only on shit that matters. Optimize either your half million poly models or models that will be visible in large counts all at once. Spending time on a 10 poly reduction to a tool chest is only justified if somewhere in your game there's a big stack of tool chests visible all at once and you actually shave thousands of polys from that scene.
JKMakowka
11-25-2007, 02:32 PM
Awesome info, thanks guys!
One thing is still confusing me a bit... how does an engine differentiate between "regular" triangles and quadstrips?
CB already explained that they are stored much more efficiently, but how can I influence that?
Sorry if that is a stupid question ;)
perna
11-25-2007, 02:38 PM
JK: you can't, leave it to the engine/programmers :) Well, actually you can... make strips! You can look up on Wikipedia how they work, and then you'll understand how to make geometry that'll split up better into strips.
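For anyone curious what a strip actually is, here is a small Python sketch of unpacking one into ordinary triangles. The winding flip on every second triangle follows the usual strip convention. Skipping degenerate triangles is how strips are commonly stitched together, but whether a given engine does it that way is an assumption.

```python
def strip_to_triangles(strip):
    """Expand a triangle strip (list of vertex indices) into a triangle list."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if a == b or b == c or a == c:
            continue  # degenerate triangle, only there to join two strips
        # Every second triangle is flipped so all faces keep the same winding.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


# 6 indices describe 4 triangles; a plain list would need 12 indices.
print(strip_to_triangles([0, 1, 2, 3, 4, 5]))
```

Each new index after the first two adds a whole triangle, which is why long clean rows of quads strip well.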
But in general, try to keep things nice and clean with quads. That's sufficient, and has many other advantages anyway. I mean, I think even those of us who understand this stuff intimately still don't go too far out of our way to create super-efficient meshes. It's just not a good way to spend your time. Focus on the main issues (don't make any of the huge basic mistakes), and make good looking art. That's really all you need to do. If the programmers want strips out of you, or anything else, they should tell you.
Kevin Johnstone
11-25-2007, 03:00 PM
Per: 'Just make good art' lol
In the end this stuff gets so damn anus bleedingly technical
that 'Just make good art' 'and leave me the hell alone!' is
really what this thread will boil down to for anyone attempting to see it through :)
Bottom line for me at this point is that UT3 is out and
you can see exactly what I did there to work around things.
Though obviously there's a lot of things I messed up as some
of that stuff is 3 years old to me now and pretty embarrassing.
One key thing I feel I will have to point out about editing
UT3 environment assets is that lightmaps are crucial.
Lightmaps are a uniquely unwrapped 2nd set of UV coordinates that you unwrap for the engine to calculate self shadowing on objects and reduce the cost of anything over 600 tris.
They are required because most meshes have optimized texture UV layouts that reuse mirrored sections, and if the engine used those to calculate the self shadowing it would look like ass, because it would try to apply shadows on both sides of a mirrored section when only 1 was in darkness.
The lightmap UVs need huge amounts of space around each chunk because the resolution of the lightmap will generally be 32x32 or 64x64 instead of the 1024x1024 resolution that the actual textures are.
You also need to leave a large space around the edge of the unwrap, 1 texel I am told. This is because when the lighting is rebuilt, all those assets' 32 or 64 lightmap squares are compiled into large 1024 or 2048 sheets of lightmap information. If you do not leave a space around the perimeter of the lightmap UVs, different lightmaps compiled onto the big sheet will bleed subtly into each other and create a subtle shadow gradient artifact leading out from the edges where the bleed occurs.
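To put numbers on that border, here is a small Python sketch. It assumes the border meant above is one texel, that lightmaps are square, and that they are packed into the sheet edge to edge. The helper names are made up, and this is not how the engine's packer is actually implemented.

```python
def texel_padding_uv(resolution, texels=1):
    """Width of `texels` texels expressed in 0..1 UV space."""
    return texels / resolution


def lightmaps_per_sheet(sheet_size, lightmap_size):
    """How many square lightmaps fit into one square sheet, packed edge to edge."""
    per_row = sheet_size // lightmap_size
    return per_row * per_row


print(texel_padding_uv(32))     # border for a 32x32 lightmap
print(texel_padding_uv(64))     # border for a 64x64 lightmap
print(texel_padding_uv(1024))   # the same 1 texel on a 1024 texture, for comparison
print(lightmaps_per_sheet(1024, 64))
```

One texel at 32x32 is about 3% of the whole UV square, which is why the gaps look so wasteful compared to a normal texture unwrap. With hundreds of neighbours on one sheet, any missing border has plenty to bleed into.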
You also need to split the UVs in the lightmap at each location where the normals are mirrored, so it doesn't bleed between each mirrored half.
When mirroring normals on the unwrap you need to have the center point be mirrored over the X axis horizontally, like a Rorschach, rather than mirroring vertically like a calendar page.
This is because the normals are calculated from the combination of 3 tangents in code.
r.
CrazyButcher
11-25-2007, 03:10 PM
JK: you don't need to worry about quads, strips and all that; the exporter or engine pipeline tools will take care of that. I just wanted to show the principle of how it's sent to the graphics card (i.e. as vertex indexed lists).
MonkeyScience: when would you actually not use indexed lists? I can only think of very chunked meshes, like particle billboards or classic BSP brush sides, with a very low "sharing" ratio, but other than that it's kinda unlikely to not benefit from reuse, I think. Also the performance papers I read suggest using indexed primitives (like this one, even if a bit aged; I think triangle lists are the most optimized way of rendering: http://ati.amd.com/developer/gdc/PerformanceTuning.pdf ). Of course the lists have to be optimized for "order" to make best use of the vertex cache, but drawing non-indexed will take out the benefit of the vertex cache completely.
so for most "artist" created triangles, I think it will always be indexed lists, no?
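A toy Python model of why index order matters for the vertex cache. The FIFO policy and the cache sizes are assumptions for illustration; real hardware differs between generations and vendors.

```python
from collections import deque


def cache_misses(indices, cache_size=16):
    """Count how many indices miss a simple FIFO post-transform vertex cache."""
    cache = deque(maxlen=cache_size)  # oldest entry drops out when full
    misses = 0
    for idx in indices:
        if idx not in cache:
            misses += 1  # vertex shader has to run for this vertex
            cache.append(idx)
    return misses


# Three triangles; the first and the third share vertices 1 and 2.
scattered = [0, 1, 2,  3, 4, 5,  2, 1, 6]
reordered = [0, 1, 2,  2, 1, 6,  3, 4, 5]

print(cache_misses(scattered, cache_size=4))
print(cache_misses(reordered, cache_size=4))
```

With a tiny 4-entry cache the scattered order runs the vertex shader 8 times and the reordered one 7 times. Drawn non-indexed, the same three triangles would always cost 9.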
hobodactyl
11-25-2007, 04:57 PM
Per: Thanks for the quick response! I thought that might have been what you were talking about since I'd seen it in Mudbox. I can see how that would be more time-efficient.
Stickied because this thread is really good.
(on the titles: triangle optimization is not, but vertex is always ;) )
I always work my optimizations with vertices, and I always consider other things: drawcalls, splits, and when you have to do more draw calls due to different materials/textures.
For console especially it's memory versus the vertex drawing power, then you have to, to a slight degree, keep fillrate in mind (overlapping geometry).
As for baking with a cage, as mentioned above: if you're having a hard time with the cage bake and your fingers are itching for the bevel buttons, just do a combination. Bake one with a cage, and one without for the straight rays.
It's just a texture, so you can combine the parts of the different renders that you like, so that you get correct edge normals and then straight renders on surface details that might've gone perspective-skewed.
And now again, memory, which should be a big part of this thread too, since you usually only have a small part of the 360 memory to work with :) not even half of it in some cases.
Reuse surfaces, don't mirror, but ROTATE!
perna
11-26-2007, 06:05 AM
[ QUOTE ]
(on the titles: triangle optimization is not, but vertex is always ;) )
[/ QUOTE ]
Careful now buddy :) Read the whole thread, particularly the opening post. It's nowhere near as simple as that, either, which is something we're trying to communicate here.
Why we started this topic in the first place is a lot of people forget that the job of an artist is to make assets look great. Nobody will pat your back for making extreme optimizations, whether polygon or vertex based. With games now running more than one million polygons on-screen, it's all about priorities.
The hard to accept fact for a lot of you is that you simply won't get by with mostly technical skills for long. The more time passes, the less technical restrictions there are for artists. Some of you may have struggled a long time to learn to optimize a few polies off a mesh, and are now told it's not so important anymore? Of course you're not going to like that.
I see a great deal of threads and discussions on technical topics on these boards. But it gets silly... people go on about edgeflow... for a model that looks absolutely terrible. How is better edgeflow going to help? How is better use of polies going to help? The model will still look bad.
It's just that those technical things can be learned by anyone, so it's easy for people to shoot off at the mouth about them, but learning to make stuff look great is a major challenge, one that far fewer are prepared for.
It should give some food for thought that the people here who are the most technically capable and knowledgeable are also the people that care the least about those skills.
Per, for me optimization means looking the same but costing less. As in, I wouldn't sacrifice the looks for a bit of juice, but there's a lot that can be done without actually removing any looks but making it cost less. THAT's optimization.
It's about knowing how stuff works as well, knowing a bit of tech as an artist.
There's always a visible hardware barrier, and we're always hitting it, way more in some games than others, and games on modern consoles are still struggling to maintain framerate.
While you're fully correct, Per, I can still see the headache that can come from a big team with only a few persons thinking about optimizations :)
Noren
11-26-2007, 06:31 AM
[ QUOTE ]
In case of 3ds max (the cage ) restricts the processing rays to straight paths instead of bending them nicely creating a smoother result.
[/ QUOTE ]
Hi Per, can you elaborate on this please? Sounds wrong to me, but I might have misunderstood you.
perna
11-26-2007, 06:51 AM
Noren: There are two types of normal map processing cages. One that limits only length, one that controls direction and length (like in max).
Controlling direction like that means the generator is not able to tweak the results ideally.
Here's a test you can try: Push a ray cage in max X units and render out the normal maps... then disable the cage and use an offset/raylength of the same X units instead. You should get the exact same result, right? But you don't, the non-cage output is going to be significantly better in most cases. Someone may have time to provide some screenshot examples.
In the first post the optimised version still looked pretty good, so what's the problem?
Surely it's also about modelling 'just' enough detail to support the extra detail you are trying to bring out with the normal map.
More about efficient modelling really.
If you can make something look good with 1000 polys, why make it with 1200?
JordanW
11-26-2007, 07:14 AM
Ruz, I'm not sure that the optimized version is an actual mesh with those changes made. I think it's a paintover, so the implications that would be seen from the inaccurate normals are not shown.
perna
11-26-2007, 07:23 AM
Ruz: please bother to actually read the full thread. It's kind of dispiriting when a lot of us put in all this work to share some info and it goes in one ear and out the other.
Did you read the bit that said a 100 poly mesh will not render faster than a 200 poly mesh? There's a limit where, if you optimize below it, you are just wasting your time.
Noren
11-26-2007, 07:33 AM
Per: I'm a 3dsmax user myself. That's why I got curious in the first place because my experience with the cage has been different.
And even now, if I render the test case you proposed I get two exactly identical normal maps. Max 8 and supersampling activated. I used a simple box here (one SG) and almost always use the cage, except for occasions like those described by eld. So it can very well be that something slipped under my radar here, and I would be very interested if someone could provide an example of cage vs. no cage not matching up. (Cage just pushed, of course, not manipulated further.)
A big plus for me with the cage is that if you happen to work with smoothing groups it will still interpolate the casting rays and you don't wind up with missing parts in the map, while the normals are still correct.
[ QUOTE ]
...the non-cage output is going to be significantly better in most cases. Someone may have time to provide some screenshot examples...
[/ QUOTE ]
You can use both though, as a combined result (it's a texture after all :) ), as non-cage renders will usually shoot and miss their target on corners and such, but cages will usually do crazy renders on a big flat surface that has to have details rendered onto it.
Per, don't get downhearted.
What I said was that you should be modelling just enough detail to support the model you are making.
My point was that you shouldn't add more detail just for the sake of it. I didn't say anything about rendering speed.
Personally I would keep taking out loops until I thought it was degrading too much in quality. It's about common sense a lot of the time.
You guys seem to be talking mainly about high end, next gen stuff like unreal engine/doom engine.
What about MMOs or similar? I am sure that in the grand scale of things, polycount might have more of an impact.
CrazyButcher
11-26-2007, 08:05 AM
Yes, paintover. And Ruz, the discussion is more about sacrificing quality for a "few tris", which is not worth it. The discussion is about those very last ultra bits of optimizing. It should not imply that optimizing at all isn't needed; it's just that there is a grey zone where the amount of lowered quality, or time spent on it, isn't worth the benefit in speed. So it's not about "adding more", but "removing too much".
And those "few tris" are actually getting more and more with time. The hardware is still similar for MMOs as well; after all, the performance PDFs mentioned are like 3 years old, which should mean that's the PC low-end of today.
EarthQuake
11-26-2007, 08:45 AM
[ QUOTE ]
Per, don't get downhearted.
What I said was that you should be modelling just enough detail to support the model you are making.
My point was that you shouldn't add more detail just for the sake of it. I didn't say anything about rendering speed.
Personally I would keep taking out loops until I thought it was degrading too much in quality. It's about common sense a lot of the time.
You guys seem to be talking mainly about high end, next gen stuff like unreal engine/doom engine.
What about MMOs or similar? I am sure that in the grand scale of things, polycount might have more of an impact.
[/ QUOTE ]
The example was obviously not for a low-end MMO, it was for a current generation project. It would be too much work to cover every single platform, every single engine, every hardware level in one thread. We're talking about current tech here, mostly how current generation hardware handles rendering. Of course if you're making a model for Warcraft 3 you're not going to want to follow these guidelines, so take some of your own advice and use *common sense*.
Yeah, I sometimes kinda forget that you are 'removing' polys rather than adding them.
It just confused me in the example because the optimised version of the box still had a decent bevel along the edges, and I thought it would still look correct with a normal map on it.
To me that extra row of loops adds nothing to the silhouette, but what do I know, I am a character artist :)
TBH I would experiment, and if it looked ok I would trust my instinct to say yeah, that looks right, the silhouette's ok and there are no weird shading artefacts, which there shouldn't be because the box has beveled edges.
Interesting read. Do you guys think these principles could be true for online 3D? Director with Havok, Flash CS3, Java etc.? Or are those engines not powerful enough to experience benefits from good hardware? I'm just thinking of all the really low poly online games out there and was wondering whether this is due to performance or bandwidth or software or just the developers?
Some of this thread has just gone over my head, but I'm currently working on a Director 3D game and it's my first 3D game, so this is a relevant topic as there's just 2 of us making the assets.
EarthQuake
11-26-2007, 09:02 AM
I'm going to go out on a limb and say no, definitely not. Those sorts of projects are designed to run in a web browser, on a large range of hardware, and would likely not benefit from the optimizations of a current gen rendering engine.
CrazyButcher
11-26-2007, 09:03 AM
Ged, I think it's a mix of everything that results in low-end graphics. The last time I looked at Director3D, it was like a DirectX 7 or 8 renderer underneath, and it didn't use any "modern" features, read "modern" being 4 years old already... Those bigger content apps normally don't go with the modern hardware, do a lot more CPU stuff, and hit those batching limits earlier. They are mostly meant to run on "anything", that is, some integrated chipsets with age-old drivers. Though I am not sure how good Flash or the others are. I know certain Java libraries that make use of performance enhancing capabilities do exist, but I don't know how widespread the stuff is.
It would be good to just test the engine with dummy assets of different resolutions, and see how it behaves on target hardware.
Mark Dygert
11-26-2007, 09:25 AM
This reminds me of the color palette optimization discussions. Do we use 16 shades of brown and make the most of it, or do we pick stock colors and hope people like all-rainbow levels? Now no one cares what palette your textures use. We're breaking down the barriers that tie artists' hands.
I think it's important to keep the post as it was presented. It's not a license to waste, but approval to stop overworking something to the point it hurts the end result. It's also a call to take the game as a whole into account when modeling one tiny aspect of it. I think people in general (beginners especially) will overestimate the time that will be allowed per asset. Yes, you can make a loverly dumpster out of 250 tris and 2 months to work on it. Or you could make an entire alley with 25k tris and those same 2 months.
You want to be careful not to run the other way and never optimize. Being neat and tidy can be a boost to production time, especially if that asset is going to be worked on by other people. Passing on something that is easy to work on can be pretty critical when the bugs start rolling in. I always hate having to go back into other people's files, label materials and sleuth around a file for 20 min before I can start fixing things. Spend 20 min organizing up front to save someone else 20 min of headache. Technically it's a wash, but people won't mind working on your files if they aren't a nightmare. At that point it's not an issue of game resources but production time, which for me is king over all.
The market of games I work on is much lower than the low end mentioned in those PDFs, and as such we still have to keep to the old idea of optimize until it hurts, but just a little. It will be a few years before I can toss polys to the wind and not care. I thank Microsoft for pushing quality video cards and making them a centerpiece of a good Vista PC. It will only quicken the death of this timely tradition that keeps me from creating more.
rooster
11-26-2007, 12:03 PM
i think you made a great point in pnp vig, that polygons and draw calls aren't the only resource, time is THE resource
JKMakowka
11-26-2007, 01:53 PM
Ok, maybe a bit OT, but what about animation costs (e.g. transformation costs)? More vertices would certainly mean higher transformation costs, or is that mostly limited by the number of bones (and level of weights) anyway?
And what about vertex animations (.md3) and those new Dx10 geometry shaders?
Of course given the fact that the object isn't fillrate limited anyways.
Edit: hmm, to clarify: I think I read somewhere that DX9 and below hardware only does vertex animations and all the bones and vertex weighting is done on the CPU (and then transferred as vertex animations to the GPU), while on DX10 hardware with geometry shaders the GPU can do that. Is that right?
perna
11-26-2007, 02:49 PM
JM: Someone else can give you accurate data on what you want to know, but think about it in a practical way: How many of the polygons you see onscreen in your game are bone animated anyway? Like in a typical modern fps, not many. If you're gonna have a whole bunch of characters on-screen, well there's LOD for that.
Vig, while it is true that time is the most expensive thing, optimizations and such knowledge is a skill just like the art itself is.
A great artist with technical knowledge can do those optimizations quickly, and if those are done for each single prop then there's something to gain from it.
The optimizations I do for work don't take much extra time. It's nearly always just a quick plan on how to make the object, and a thought process while creating.
It even helps quite a lot with the artistic side too!
fonfa
11-26-2007, 03:12 PM
interesting material. but i think you guys are just getting each other wrong.
timing is part of a game artist's skill too, along with optimizing. you just gotta find the balance between it.
CrazyButcher
11-26-2007, 05:13 PM
JKM: since the first shader model 1 cards (GeForce3...) it has been possible to do skinning on the GPU. Basically, with higher shader models it became more efficient to do (can do more bones and more weights).
This in fact is done, so what you heard is wrong ;) Geometry shaders are mostly good for "generating vertices", which was not possible before. That can also be used to generate 6 copies of a mesh and render into all 6 cubemap faces at once, for example. Geometry shaders can also be used to generate shadow silhouettes for stencil shadows. For those shadows it was indeed necessary to use the CPU before, simply as only the CPU was able to detect silhouette edges. Hence Doom 3 is very stressful for the CPU as well. Most games however don't do stencil shadows, and benefit from GPU vertex processing. There are some workarounds that can do silhouette detection on older GPU hardware, too, but they are not so common I think.
For GPU skinning on sm1/2 and even most sm3 hardware, the bones are stored in the "uniform/constant" memory of a vertex shader. Typically sm1 had limits like 25 bones, and sm2 is like up to 75 bones. Then you must feed the bone index and a weight per vertex. Typically that is like 2 shorts per assignment. Vertex shaders will then be written for a certain maximum number of weights per vertex (say 2 or 3), and all vertices (regardless of the weights they actually use) will be transformed the same way. Hence if you know the weights per vertex the engine allows, there is no reason at all not to use them to their full extent. Typical would be 2 or 3 max weights.
The bone matrices are computed by the CPU beforehand and sent as those "constants". Fewer max weights per vertex = fewer instructions in the shader + less per-vertex data to be stored. Fewer bones in total = fewer "constants" to be sent every time the model is rendered.
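What the skinning shader does per vertex can be sketched in plain Python. This is a CPU illustration of the idea, not shader code from any engine; `MAX_WEIGHTS`, `skin_vertex` and the 3x4 row-major matrix layout are assumptions for the example.

```python
MAX_WEIGHTS = 2  # fixed by the (hypothetical) shader; every vertex pays for this many


def transform(matrix, p):
    """Apply a 3x4 row-major bone matrix to point p."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)


def skin_vertex(position, bone_indices, weights, bone_matrices):
    """Blend the position through MAX_WEIGHTS bones, used or not."""
    assert len(bone_indices) == len(weights) == MAX_WEIGHTS
    out = [0.0, 0.0, 0.0]
    for bone, w in zip(bone_indices, weights):
        moved = transform(bone_matrices[bone], position)
        for axis in range(3):
            out[axis] += w * moved[axis]
    return tuple(out)


# The "constants": one matrix per bone, computed on the CPU each frame.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
shift_x2 = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]]
bones = [identity, shift_x2]

print(skin_vertex((1.0, 0.0, 0.0), (0, 1), (0.5, 0.5), bones))  # blended between both bones
print(skin_vertex((1.0, 0.0, 0.0), (0, 0), (1.0, 0.0), bones))  # rigid vertex, same amount of work
```

Note the rigid vertex still goes through both slots, which is the point made above: the shader does the same work whether or not you use all the weights it allows.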
Vertex animation aka morphing is a bit of a different story, and requires another per-vertex attribute stream that is either also preloaded and "fixed" (think morph targets), or dynamically changed every frame (aka md3). The latter is particularly ugly as it means sending per-vertex data every frame, which is supposed to be avoided.
Skinning basically allows all mesh data to be preloaded and stored in vidmem, and only the bones' matrices must be resent. Hence it's the preferred way.
However there are several more advanced techniques possible for animation that store matrices in textures (sm3 vertex shaders can access textures, but kinda slowly), or use render-to-vertex-stream stuff and so on. However, not in the common case. UT3 and Crysis still use just the constants, as does nearly everyone else, I would say.
On consoles with dedicated vertex processing hardware (like what SSE was supposed to be for the CPU), skinning might be done in software for load balancing. For example, the PS3's Cell has 7 streaming units that can work with the GPU directly and "help out". The same goes when really complex vertex stuff is done (unlikely), or stencil shadowing...
Sectaurs
12-06-2007, 12:06 PM
this thread is incredibly informative.
thanks for taking the time and effort.
i'm learning!
Just so I may recap on a couple of points made early on:
Vert count between 2 polys, in-engine, will increase if:
-The 2 polys have separate smoothing groups
-The 2 polys are a part of two different UV islands
Vert count between 2 polys, in-engine, will stay 'the same' as the application's count if:
-The 2 polys share the same smoothing group
-The 2 polys are a part of the same UV island
Is this correct?
Rick Stirling
12-17-2007, 05:04 AM
Adam, I believe that is correct, and I believe you can also add in shaders/materials. If you apply a different shader to each polygon that will break it into 2 objects.
perna
12-17-2007, 05:12 AM
AdamBrome: The best way to look at it is one vertex can only contain one data entry of each type, be it position, normal, uv coordinate or color. Whenever you need two data entries (as with a smooth group break: You'll need two normals), you need two vertices.
So, if the "same" vert has 2 positions on a UV map (such as you get when there are seams) there needs to be 2 3D verts as well.
In max, when you're in UV edit mode, you can select geometry in the 3d viewport and the selection will be reflected in the UV viewport.
Sometimes you'll select an edge in 3d and it selects TWO in the UV viewport. You'll select a vert in 3d and it selects SEVERAL in the UV viewport. The highest count is always the real count. If it shows 4 verts selected in the UV window, then you actually have 4 verts in the 3d window as well, they're just "grouped" and handled as one.
Smoothing groups just control vertex normals. Whenever you make a hard break with different SGs you create more verts. You'll have one vert pointing one direction and the other pointing another direction. That's how you get the hard shading there.
So from this, you'll understand that things aren't split up twice. If your UV seam is in the same place as your smoothing group seams, that doesn't mean 4 times the verts.
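The "not split twice" point can be checked with a few lines of Python. This is only a counting sketch: `split_count` is a made-up helper, and the `("n0", "uv0")` labels stand in for the actual normal and UV values at each face corner.

```python
def split_count(corner_attributes):
    """Number of hardware vertices that one modelling-app vertex turns into."""
    return len(set(corner_attributes))


# Four faces meet at one vertex; each entry is (normal, uv) for that face's corner.
no_seams = [("n0", "uv0")] * 4
sg_seam_only = [("n0", "uv0"), ("n0", "uv0"), ("n1", "uv0"), ("n1", "uv0")]
seams_coincide = [("n0", "uv0"), ("n0", "uv0"), ("n1", "uv1"), ("n1", "uv1")]
seams_cross = [("n0", "uv0"), ("n0", "uv1"), ("n1", "uv1"), ("n1", "uv0")]

for name, corners in [("no seams", no_seams),
                      ("smoothing seam only", sg_seam_only),
                      ("seams in the same place", seams_coincide),
                      ("seams crossing", seams_cross)]:
    print(name, split_count(corners))
```

A UV seam placed on top of an existing smoothing break costs nothing extra (still 2 verts). It only gets worse when the two seams run along different edges.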
edit: Yeah plus what Rick says. That's handled differently than the above stuff though, as material isn't a per-vertex thing: you first set material, then you push the geo data, set another material, push other data. So basically separate material means separate model.
Rick Stirling
12-17-2007, 06:20 AM
I *think* that when it comes to the shader, that breaks the polygons into a different Drawcall, instead of a batch, but I'm willing to stand corrected.
As to the smoothing groups adding extra verts, if you get your normal maps nailed you can often forego the smoothing groups and set your entire object to a single SG. Also, in the past we'd use smoothing groups on hard edges to stop the polygon shading leaking round (cuffs, jacket hems, hard edged machinery). Since adding this group adds extra verts and (will usually) break your batching, it's (usually) cheaper just to chuck in those extra polygons that a bevel will give you.
Usually cheaper, but when you are dealing with deformable objects (skin/morph), you've got more transforms to compute, so it's a toss up there.
perna
12-17-2007, 07:27 AM
Rick: Transforms are on vertex-level
Eric Chadwick
12-17-2007, 08:26 AM
Adam this picture really helped me get my head around it.
http://www.ericchadwick.com/examples/provost/byf2_figure2.jpg (http://www.ericchadwick.com/examples/provost/byf2.html#wts)
Good thread dudes.
Per, Rick, and Eric.. thanks!
While I have definitions for the words in my head, can someone else state what these mean: Batch & Drawcall. I will put what I think they mean as I am sure its wrong.
"Batch"
-If a model is duplicated a number of times and its material isn't changed then they're all a part of the same 'batch' so long as they aren't grouped or defined by LODs. Guh, that make sense? Probably not..
"Drawcall"
-Not entirely sure. I want to say that it's when material layers (spec, diffuse, etc) are called to the frame buffer but I am not certain.
EDIT: Also, vertex normals. Aaargh, wtf! haha I always thought the normal of a triangle to be important and now I learn of vertex normals? Anyone have a handy picture demonstrating how a vertex's normal is defined?
Rick Stirling
12-18-2007, 05:05 AM
Not entirely sure...but I think I'm in the right area here:
Batch: a chunk of verts that are sent from the CPU to the GPU. Batches are part of a draw call, and you can have several batches in a single drawcall. Batches are chunks of continuous verts that don't get broken by smooth groups, uv boundaries etc.
perna
12-18-2007, 06:45 AM
AdamBrome: Strictly, there is no such thing as a polygon normal. Of course you can measure the normal of a polygon if you want, but it has no relevance in rendering. You know how on a non-flat object such as a sphere the shading goes gradually from light to dark? This shading goes from vertex to vertex, and the vertex normals determine how much influence the light should have.
It's important to realize that if light was calculated "realistically", light on an angular lowpoly object would never look smooth. Even hipoly FMV work has to use the fake gouraud style shading to get by.
There's a method to interpolate vertex normals on a curve as opposed to linearly, that is used in offline rendering and can be done in realtime shading as well.
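Since a picture was asked for, here is the same idea as a small Python sketch instead. It uses a plain unweighted average of the surrounding face normals, which is only one convention (tools often weight by face angle or area), and the function names are made up for this example.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def vertex_normal(face_normals):
    """Average the normals of the faces meeting at a vertex (same smoothing group)."""
    summed = [sum(n[axis] for n in face_normals) for axis in range(3)]
    return normalize(summed)


def lambert(normal, light_dir):
    """Diffuse light amount at a vertex: cosine between normal and light direction."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))


# A vertex on the edge between a face pointing up (+Z) and one pointing sideways (+X).
smooth = vertex_normal([(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)])
light_from_above = (0.0, 0.0, 1.0)

print(smooth)
print(lambert(smooth, light_from_above))         # shared, smoothed vertex
print(lambert((0.0, 0.0, 1.0), light_from_above))  # split vert belonging to the top face
print(lambert((1.0, 0.0, 0.0), light_from_above))  # split vert belonging to the side face
```

With one shared normal the light value at the edge sits halfway and gets interpolated across both faces. With a smoothing break you get two verts with two different values, hence the hard shading.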
CrazyButcher
12-18-2007, 07:48 AM
for coders mostly batch = drawcall.
drawcall:
typically consists of a "world matrix", fragment & vertex shader, render states (blend, alpha-, depth-test...), textures to be used and the geometry.
The geometry is a vertex buffer + indices which make the triangles (1,2,3, 2,3,4 ...). You don't need to use all vertices in the buffer, so it doesn't really matter...
It might be that the vertex data is made of different streams that reside in different buffers as well, but let's not overcomplicate things.
The reason those non-material "splits" are nasty is simply storage space. The more "duplicates", the larger the vertex buffer, and the less chance of reusing the same vertex.
The vertex chunk often resides in graphics card memory (static meshes). Sometimes it may be copied dynamically (lots of concatenated copies of the same stuff for simple instancing, or some manual vertices, say from rocket trails, particles, shadow silhouettes...). These kinds of copies are not super fast; most data in games is static.
It is simply the raw data for each vertex, i.e. it is already broken down to the "unique" vertex level. Several vertices of course can be indexed multiple times when they share a triangle that doesn't require splits.
batching:
being able to "render as much as possible" in a single drawcall. That means trying to maximize the triangles being drawn, as every time you "start a new drawcall" it's not as fast as rendering a lot with one call/batch.
So say we have that huge vertex data chunk, already stored in VRAM, so we don't need to send it per frame. Then batching basically means trying to make long series of "indices" into those vertices that we actually need now.
I can recommend reading the pdfs linked to in the very first post (the very first slide of the batchbatchbatch paper starts with "What Is a Batch?")
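A toy Python illustration of the batching idea. It assumes, purely for the example, that consecutive objects with the same material can be submitted as one drawcall (say, static geometry sitting in one shared vertex buffer); real engines have more conditions than that, and `count_draw_calls` is a made-up helper.

```python
def count_draw_calls(objects):
    """One drawcall per run of consecutive objects that share a material."""
    calls = 0
    previous = None
    for material, _triangles in objects:
        if material != previous:
            calls += 1  # state change: a new drawcall starts here
            previous = material
    return calls


# (material, triangle count) in the order the scene happens to list them.
scene = [("crate", 200), ("barrel", 150), ("crate", 200), ("barrel", 150), ("crate", 200)]

print(count_draw_calls(scene))                              # submitted as listed
print(count_draw_calls(sorted(scene, key=lambda o: o[0])))  # grouped by material first
```

Same 900 triangles either way, but 5 calls when submitted as listed versus 2 when grouped. That is where the time goes, not in the triangle count of a single crate.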
Another reason to avoid extreme optimization: LODs.
So a few months back, Okkun made a good point to our team as we were making a bunch of LOD models. Some of our artists were focused on reducing poly counts to exactly the target numbers, even if it meant massacring a model. He made the point that if performance was poor, what might be the first way to improve things?
They'd bring the LODs in much closer, where those craptacular models are right in plain view.
So rather than completely destroying an LOD model to save a whopping 34 polygons, it might be better to just leave them in and maintain the proper silhouette and UV borders. Even if several dozen of these models are on screen at the same time, an extra 5000 polys means very little to a modern graphics card. But as Per was pointing out, bad looking art is bad looking art no matter how you slice it. And in the case of some of these LODs, the degradation for a very minimal polygon savings was quite extreme.
cardigan
01-07-2009, 06:29 PM
I've had long long discussions with my lead engine coder about this, and he has raised something which I don't think has been mentioned here yet:
Having bevels along edges that could otherwise be hard (as in the first toolbox example on this thread), whilst not increasing the vertex count, does lead to quite a lot more small triangles on screen, especially if applied to everything in an environment.
My lead engine guy says that because GPUs can fill multiple pixels in parallel, but only within the same triangle, having lots of triangles that contain very few pixels leads to stalls in filling the pixels and thereby hits your fillrate.
Example - if your triangle is 2 pixels big on screen the maximum number of pixels that can be processed simultaneously is 2, when actually the GPU could be pushing a lot more through.
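As a rough back-of-the-envelope version of that example, the sketch below assumes a GPU that can fill a fixed number of pixels in parallel, but only within one triangle at a time. The `lanes` value of 16 is a made-up figure for illustration, not a number for any real card, and real rasterizers are more complicated than this.

```python
import math

def lane_utilization(pixels_in_triangle, lanes=16):
    """Fraction of parallel pixel lanes doing useful work for one triangle.

    `lanes` is a hypothetical count of pixels the GPU can fill at once;
    the default of 16 is only an assumption for this example.
    """
    if pixels_in_triangle <= 0:
        return 0.0
    # Number of rounds needed to fill this triangle, each round using up to `lanes` pixels.
    passes = math.ceil(pixels_in_triangle / lanes)
    return pixels_in_triangle / (passes * lanes)

for size in (2, 8, 16, 200):
    print(f"{size:4d} px triangle -> {lane_utilization(size):.1%} of lanes busy")
```

Under this simplified model a 2 pixel triangle keeps only 2 of the 16 lanes busy, while larger triangles get close to full use, which is the effect described above.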
As our engine was fillrate limited (I believe most are these days), he felt that this was a significant factor and therefore said that we should use hard edges where possible.
Has anyone else heard this? Any thoughts?
CrazyButcher
01-08-2009, 04:19 AM
yes that's true, but I think it was mentioned before (I put it into the first post now). Thin triangles, or triangles smaller than a pixel or just a few pixels in size, will be bad for the reasons he mentioned.
The boxmesh example here depends of course on the size of the box "on screen". Once it's just background, or always small on screen, the bevels would be overkill, and one can live with a few minor shading artefacts.
Ie it's a question of LOD and "how the model is used". If you have a game with corridors and the box will always be exposed at a reasonable size, there is no reason for a LOD/lower res model, as the bevel tris will be okay (unless you talk about ultra fine bevels, which would always be too thin).
cardigan
01-09-2009, 03:33 PM
I've been wanting to actually do a side by side performance comparison test and see what the impact is when dealing with a whole environment, unfortunately it requires building the environment twice. If I get round to it I'll report back!
00Zero
02-04-2009, 07:55 PM
perfect example: CoD WaW uses MODELS for their grass. not alphas.
Rob Galanakis
02-04-2009, 08:39 PM
lol wut?
Frankie
04-16-2009, 08:58 AM
The start of this thread is pretty old and I haven't spent much time keeping up to date with the cutting edge of 3d engines. Are there any big changes in the way things are done worth knowing about that haven't been mentioned in this thread?
CrazyButcher
04-16-2009, 09:27 AM
not much should have changed, basically this is more or less a hardware / principle rendering issue, rather than a matter of individual 3d engines' advancements.
We are pretty much fixed to "older" hardware anyway (especially with the consoles). Imo we won't see a "real change" until the all new consoles or all new gpu systems (larrabee or whatever advanced ati/nvi gpus in years to come) are mainstream, ie a few years to go still...
And it's hard to define "mainstream" with tons of wii, iphone, psp, ds or casual Pcs having "last-gen" hardware.
Tumerboy
04-16-2009, 10:48 AM
No, this was simply pointing out that technology is to a point where counting every last triangle isn't really a big deal any more. This is not an excuse to do sloppy work, or to not take appropriate optimization steps, but rather to say that it's better to leave in a few bevels to make the model look better.
Frankie
04-16-2009, 11:59 PM
Thanks for the reply CB, I'm also interested in how the new shadowing method works and how expensive it is (compared to stencil shadows), if you don't mind explaining :)
CrazyButcher
04-17-2009, 12:20 AM
You have to render the scene's depth from the lights frustum (classic shadow mapping). But as you only require depth, this is very fast (ie mostly vertex bound).
There are numerous tweaks to enhance quality, by rendering multiple (cascaded) shadowmaps for the sun, shearing... These just raise the number of times a single object might be rendered to depth.
Once the shadowmaps are created, "normal" shaders can sample them (again different methods exist, for soft shadows, removing aliasing... or whatever).
The setup for stencil shadows as such was more complicated (requiring stencil buffer, requiring volumes/silhouettes...). However for less complex geometry this is still practical as it gives you pixel-perfect shadows, while the shadowmap approach has to do more tricks to hide aliasing. Then again, shadowmapping is better suited for soft shadows.
Shadowmapping is more pipeline friendly, as you have a real "shadow texture" to sample and therefore can do more tricks. But it needs more tweaks and advanced methods to get really nice results.
So how expensive it is depends on the quality you want to achieve, but you can get "more" out of it than stencil. And silhouette generation for stencil stuff requires CPU (slow) interaction on pre SM4 hardware (ie the majority of hardware out there).
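The two passes described above (render depth from the light, then let normal shading sample it) can be sketched in a few lines of Python. This is a toy one-dimensional setup with a light looking straight down, not real shader code: the function names, the list-based "shadow map" and the `bias` value are all assumptions made for illustration.

```python
def render_depth_map(occluders, size, light_height):
    """Pass 1: per texel, store the distance from the light to the nearest surface.

    `occluders` is a list of (texel_x, height) pairs; the ground sits at height 0.
    """
    depth_map = [light_height] * size          # start with the ground plane everywhere
    for x, height in occluders:
        depth = light_height - height
        if depth < depth_map[x]:               # keep whatever is closest to the light
            depth_map[x] = depth
    return depth_map

def in_shadow(depth_map, x, height, light_height, bias=0.01):
    """Pass 2: a point is shadowed if something nearer to the light covers its texel.

    The small bias keeps a surface from shadowing itself due to precision.
    """
    depth = light_height - height
    return depth > depth_map[x] + bias

LIGHT_HEIGHT = 10.0
shadow_map = render_depth_map([(2, 5.0)], size=4, light_height=LIGHT_HEIGHT)

print(in_shadow(shadow_map, 2, 0.0, LIGHT_HEIGHT))   # ground under the occluder
print(in_shadow(shadow_map, 1, 0.0, LIGHT_HEIGHT))   # ground next to it
print(in_shadow(shadow_map, 2, 5.0, LIGHT_HEIGHT))   # top of the occluder itself
```

Pass 1 only writes depth, which is why it is cheap; the quality tweaks mentioned above (cascades, soft shadow filtering) all change how often pass 1 runs or how pass 2 samples the map.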
Richard Kain
04-17-2009, 09:07 AM
I am given to understand that a lot of outside-the-box coders are seriously considering a revolution in rendering. Apparently a lot of them think that the advent of multi-core processors is going to make the traditional GPU obsolete, and that the standard rendering pipeline will also be phased out in a few more years. As I understand it, the ability to multi-thread opens up new avenues for software-based rendering, avenues that will make software-based rendering comparable to, and in some cases superior to, traditional GPU-supported rendering.
Older rendering techniques that were discarded years ago are now being looked at anew. I believe there are some who are attempting to resurrect voxel rendering. It could be that in the future polygonal modeling will be rendered obsolete.
Of course, this is pure conjecture at this point. It will be years before multi-core processors are common enough to make such development financially viable. And current polygon-based tools and engines are so prevalent that such a major shift in methodology is sure to be slow.
Still, it's probably a good time to be a C programmer.
Frankie
04-19-2009, 12:05 AM
Thanks for the replies, interesting to read.
CrazyButcher
05-20-2009, 08:12 AM
removed the link, it was an artificial benchmark test (48 to 48 000 sided circle) to prove the adverse effect of thin triangles on performance
NOT suggesting to use certain layouts for "caps" of low poly, with so little geometry at hand interpolation artifacts are much more dominant...
crazy butcher, wow that is such a difference in speed. Definitely something to keep in mind :). thanks for that link.
Mark Dygert
05-20-2009, 08:38 AM
Hey, that's the crazy guy that documented the interior mapping pixel shader isn't it?
Interesting read, thanks for posting, but like you said I don't think I'll be redoing any cylinders anytime soon...
ArtsyFartsy
05-20-2009, 08:38 AM
added this link
http://www.humus.name/index.php?page=Comments&ID=228
to the first post
illustrates how triangulation strategies affect speed. Basically shows the micro thin triangles issue vs large area triangles. Bear in mind that the "number of sides" is really high in the stats, something you will not reach in regular modelling. Ie the low numbers you work with have same performance costs, so dont redo your cylinders ;)
That was a great little read.
I guess these are issues the engine programmer needs to deal with rather than the modeler, since the modeler will supply mostly quad based geometry.
However, if you're following the workflow of reimporting high poly geometry into a modeling app (3ds max/maya) and then optimizing it, then the optimizer should perform the recursive area subdivision which is, I think, the way 3ds max does it.
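For anyone curious what the difference between cap layouts looks like in numbers, here is a small Python sketch comparing a plain triangle fan with a recursive split of an n-sided circle. It is only an approximation of the idea: `recursive_split` is a hypothetical helper, not the exact scheme from the linked article or from 3ds Max, and the summed triangle perimeter is an assumed rough proxy for "lots of long thin triangles", not a performance measurement.

```python
import math

def circle(n):
    """n points evenly spaced on the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

def fan(n):
    """Classic fan: every triangle shares vertex 0."""
    return [(0, i, i + 1) for i in range(1, n - 1)]

def recursive_split(n):
    """Start with one big central triangle, then keep halving the remaining arcs."""
    tris = []

    def fill(i, j):
        # Triangulate the arc of vertices i..j; the chord i-j already exists.
        if j - i < 2:
            return
        mid = (i + j) // 2
        tris.append((i % n, mid % n, j % n))
        fill(i, mid)
        fill(mid, j)

    a, b = n // 3, (2 * n) // 3
    tris.append((0, a, b))
    fill(0, a)
    fill(a, b)
    fill(b, n)          # index n wraps around to vertex 0
    return tris

def total_perimeter(points, tris):
    """Sum of all triangle perimeters; long thin triangles push this up."""
    return sum(math.dist(points[a], points[b]) +
               math.dist(points[b], points[c]) +
               math.dist(points[c], points[a]) for a, b, c in tris)

for sides in (12, 48, 480):
    pts = circle(sides)
    print(sides, "sides: fan", round(total_perimeter(pts, fan(sides)), 1),
          "vs recursive", round(total_perimeter(pts, recursive_split(sides)), 1))
```

Both layouts produce the same number of triangles; the gap in total edge length only becomes large at side counts far beyond what a normal game cylinder uses.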
Mark Dygert
05-20-2009, 08:46 AM
I think you're right, this is the method 3ds uses to create cylinder caps, nice to know there's some logic behind what looked like chaos.
That was a great little read.
I guess these are issues the engine programmer needs to deal with rather than the modeler, since the modeler will supply mostly quad based geometry.
Modelers need to be aware of what their hidden edges are doing, and not think that engines work in quads but know that everything will be interpreted into triangles. In Max its pretty easy to flip hidden edges around, which sometimes falls on the rigging/animation guy. In Maya I think you have to force the edge by creating it or making it visible? It tends to re triangulate the hidden edges on its own... maybe there's a way to force it to stop and I haven't found it yet.
CrazyButcher
05-20-2009, 08:51 AM
@vig, yes humus did that
@artsy, I agree normally you dont run into extreme situations like that, ie most triangles in a gamemesh should have similar sized edges.
For reimport I think usability is more important, ie something like the last triangulation scheme would be sucky to work with.
Mark Dygert
05-20-2009, 09:05 AM
Would it be easily scriptable to recalculate the edges based on the last method, say on a selection or only on polys with more than 2 tris? Maybe evaluate the mesh kind of like STL Check, highlight any polys that might not be optimal and give the person the chance to deselect/select them before changing them around?
I'm not suggesting anyone get cracking on this, just wondering if it's possible... seems like it would be...
CrazyButcher
05-20-2009, 09:26 AM
well first of all we are talking about ngons with more than 48 sides, I very much doubt you will find those in the real world ;) I suspect those images of the 12 sided cylinders got burnt into your head, so that it seems layout would matter much even on that detail level... which I doubt ;)
I second the idea of highlighting triangles with extreme proportions, and whose area relative to the rest of the mesh is very small...
but I would leave it to the person fixing stuff
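A check like that could look roughly like the Python sketch below. It only flags triangles and leaves the fixing to the artist, as suggested. The function names and both thresholds (`max_aspect`, `min_relative_area`) are made-up values for illustration, and it works on plain vertex/index lists rather than any Max or Maya API.

```python
import math

def area_and_aspect(a, b, c):
    """Return (area, aspect) where aspect = longest edge / altitude onto that edge."""
    ab = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    ac = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    cross = (ab[1] * ac[2] - ab[2] * ac[1],
             ab[2] * ac[0] - ab[0] * ac[2],
             ab[0] * ac[1] - ab[1] * ac[0])
    area = 0.5 * math.sqrt(sum(x * x for x in cross))
    longest = max(math.dist(a, b), math.dist(b, c), math.dist(c, a))
    if area == 0.0:
        return 0.0, math.inf                  # degenerate triangle
    altitude = 2.0 * area / longest
    return area, longest / altitude

def flag_slivers(vertices, triangles, max_aspect=10.0, min_relative_area=0.1):
    """Indices of triangles that are both very thin and tiny compared to the mesh average."""
    stats = [area_and_aspect(vertices[i], vertices[j], vertices[k])
             for i, j, k in triangles]
    mean_area = sum(area for area, _ in stats) / len(stats)
    return [index for index, (area, aspect) in enumerate(stats)
            if aspect > max_aspect and area < min_relative_area * mean_area]

# Two regular triangles forming a 10x10 quad, plus one hair-thin sliver.
vertices = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (10, 10, 0), (10, 0.01, 0)]
triangles = [(0, 1, 2), (1, 3, 2), (0, 1, 4)]

print("suspicious triangles:", flag_slivers(vertices, triangles))
```

Requiring both conditions keeps long but large triangles (floors, walls) from being flagged, since those fill plenty of pixels anyway.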
Zwebbie
05-20-2009, 09:38 AM
added this link
http://www.humus.name/index.php?page=Comments&ID=228
to the first post
While the difference in speed is impressive, the Max Area method of triangulation completely ruins your normals. When I have to choose between a performance boost and correct normal maps, I prefer the latter.
CrazyButcher
05-20-2009, 02:51 PM
I will remove the link, it creates too many false impressions
zwebbie you will never gain a performance boost on regular "ingame" cylinders because they have too few sides... and yes with so little geometry interpolation issues are much more important.
EarthQuake
05-21-2009, 01:53 PM
Ok math dorks, since i saw some mention of how this applies to cones in that article, i was curious myself.
WHICH IS BEST
A = standard cone
B = Each loop has half the amount of edges
C = Most uniform method i could come up with
http://dl.getdropbox.com/u/499159/cones.jpg
Edit, actually im sure 64, 48, 32, 16 would have been more uniform for C, oh well.
Proxzee
06-01-2009, 02:12 PM
I think optimization will still be relevant, since most people do not own High end cards.
The casual market is still huge, and things like handhelds and phones still require low fidelity models.
[PB]Snoelk
09-01-2009, 03:28 PM
i think the b cone would be best.
cone a needs plenty of smoothing groups to look smoothed without over-smoothing the cone end. something like face a smoothes with b, face b with c and a, c with d and b.
uv space uses the same vertices as the normal mesh and 2 vertices more for the seam.
in cone b you can smooth all group 1 except the cone head. uv space uses same as mesh and 2 vertices more for the seam.
cone c works like cone b but uses more vertices.
I go with option D, just a tiny poly at the tip using different smoothgroup.
sama.van
12-21-2009, 06:05 AM
I never added that one here. but maybe it could help some?
It was an attempt for someone to create a detailed military box from... a box (=cube )... :)
This is not really original work but it could help some to understand how to delete some polygons with "good" diffuse and some other shader?
http://www.samavan.com/3D/Realistic/Box_A/samavan_Obj_Box_A001.jpg
http://fc02.deviantart.net/fs48/o/2009/337/9/6/965775fcac1a424cde129ca0fad29247.jpg
http://www.samavan.com/3D/Realistic/Box_A/samavan_Obj_Box_A002.jpg
Hitez
02-04-2010, 01:36 AM
I spent a solid few months optimizing polys, lightmap UV channels and collision meshes for everything in UT, and the act of stripping 2 million polys out of a level generally improved the FPS by 2 or 3 frames.
Polycount is not the huge issue people were rightly sure it was previously. The bigger issue now is texture resolution, because all assets carry 3 textures as standard (normal map, diffuse and spec), and that's before you have additional mask light textures for emissives and reflection or whatever other stuff you are supporting in the shader.
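To put a rough number on the "3 textures as standard" point, here is a small Python sketch of the memory arithmetic. It assumes uncompressed 4 bytes per pixel and a full mipmap chain; the 1024x1024 size is just an example, and real games would normally use compressed formats that are considerably smaller.

```python
def texture_bytes(width, height, bytes_per_pixel=4, mipmaps=True):
    """Memory for one uncompressed texture, optionally with its full mip chain."""
    total = 0
    while True:
        total += width * height * bytes_per_pixel
        if not mipmaps or (width == 1 and height == 1):
            break
        # Each mip level halves both dimensions, down to 1x1.
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total

maps = ["diffuse", "normal", "spec"]
per_asset = sum(texture_bytes(1024, 1024) for _ in maps)

print(f"one 1024x1024 map with mips: {texture_bytes(1024, 1024) / 2**20:.2f} MiB")
print(f"{len(maps)} maps per asset: {per_asset / 2**20:.2f} MiB")
```

Compare that with vertex data: a few hundred extra vertices cost on the order of kilobytes, which is part of why texture resolution weighs so much more than a few bevels.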
Shader complexity is also a bigger issue now because it requires longer rendering time.
Section counts are a bigger issue: meshes carrying 2 of each texture and thus requiring a 2nd rendering pass.
I can't explain things technically enough for people but the coders have explained to me a couple times that the problem with just beating everything down with optimization on things like polycount doesn't affect things as much because of different things being CPU or GPU bound.
Mesh count is a big issue now that everything is a static mesh rather than the majority being BSP. BSP is terribly inefficient compared to mesh rendering also.
A mesh is pretty much free for the 1st 600 polys, beyond that its cost can be reduced dramatically by using lightmaps for generating self shadowing and so on rather than vertex lighting.
The reason I was saying I wouldn't take out the horizontal spans on this piece was also largely because as an environment artist you have to be thinking about the crimes against scale the level designers will often make with your work to make a scene work.
Just because I know it's a box doesn't mean it won't get used as something else much larger, so I always try to make sure it can hold up, whatever it is, at 4 times the scale!
Butcher mentioned instancing, this is another feature that we relied upon much more heavily to gain performance.
Due to textures / BSP being more expensive now and polycounts cheaper we made things modular, very very modular.
For instance, I made a square 384/384 straight wall using a BSP tiling texture and generated about 60 modular lego pieces that use the same texture and all fit with themselves and each other to replace BSP shelling out of levels.
This led to lots of optimizations in general and quick, easy shelling of levels, and it gave our levels a baseline for the addition of new forms to the base geometry level.
I doubt I'm changing anyone's opinion here, maybe making a normal map driven next gen game will convince you though.
And by next gen, I just mean the current new technology like id's new engine, the Crysis engine, UE3 etc., because they are normal map driven and the press likes fancy names for simple progression of technology.
r.
DID ANYONE READ ALL THIS?! that fucking blew my mind thats so insane. all that crazy dedication, no wonder i still think UT is one of the best looking of the "next-gen" era.
fade1
02-04-2010, 03:36 AM
Even though I come from the lowspec game dev side (wii & ds), I can only underline kevin johnstone's post. Of course you shouldn't waste polys on senseless edge loops and details, but it's the shaders and sfx that slow your performance down to stop motion. To get our games running at 60fps on wii, our bottleneck is the shaders, which need to be simplified here and there. The vertex count is just a base issue and something comparatively easy to optimize.
C2S07
03-20-2012, 12:20 PM
In his article, Guillaume Provost suggested that before optimizing, first it should be determined whether the mesh is transform or fill bound. Question: How does one estimate the transform and fill cost of a mesh? Are there any tools, scripts or plugins for this purpose?
Impossible9
05-07-2013, 09:18 AM
Greetings. I'm a 3d modeller/level designer/game designer person, aspiring to be one of the above. I just want to make games, in general, so I'm interested in learning all associated trades.
Hence why I decided to become part of this community, where I hope to learn lots of new things, ask for, and in time, give advice.