StarMade v0.18999 - Some serious optimizations and balance changes
This update is something very special. While we are still working on a lot of new features, we took the time for an in-depth look at several systems, mainly the graphics and the network code.
A lot of this game’s choices in terms of style and design have been made to make the game as big and as scalable as possible. This is the whole reason we stayed with the block only system as opposed to detailed polygon graphics, as blocks present a unique set of optimizations and designs that would be lost in a conventional LoD (level of detail) polygon system.
To tell the results upfront, we managed to increase the performance of both graphics and network immensely.
In numbers, this means an almost doubled framerate (depending on hardware), and a decrease of average network traffic to about 10% of what it was previously (profiled by stress tests on the server).
Improvements of this kind are some of the most technical ones to make in any program. A vast number of hours went into analyzing and profiling, and in some areas, like the network, we even had to write additional modules to make the analysis possible at all.
A big thanks to the servers that gave us profiling information, as well as to the tester team and the players who created stress tests to verify and improve the changes.
For anyone interested, there is a more technical explanation of the methods and designs used at the end of this post.
before/after the optimization on the same planet with the same settings
But while we have a lot more features in the queue to come very soon, some additional little features have also been added to this version:
Custom Starter Gear
Servers are now able to define their own starter gear for new players. This includes credits, blocks, meta items, and blueprints (filled or not).
To edit the starter items, look for GameConfig.xml, which will appear in the StarMade base directory after the first start of this version. It should be fairly self-explanatory to edit.
The new door types added by kupu bring wonderful-looking new doors.
Faction permission changes
A lot more options for customization have been added to faction roles. There are now permissions to control relationship changes (declare war, personal enemies, etc.), faction news posting, taking/abandoning a homebase, and territory claiming/clearing. If you are a faction leader, be sure to check and adapt your roles accordingly.
Calbiri, Lancake, and the tester team made and gave feedback on a whole bunch of balance changes.
Hull blocks have been buffed in order to improve the survivability of ships:
Normal hull has 75 HP now.
Standard armor now has 60 armor and 100 HP, bringing it to 250 EHP.
Advanced armor now has 75 armor and 250 HP, bringing it to 1000 EHP.
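The EHP figures above are consistent with reading the armor value as the percentage of each hit that is absorbed, so that EHP = HP / (1 - armor/100). This relationship is inferred from the numbers in this post, not taken from the game code, and the function name below is made up for illustration:

```python
def effective_hp(hp, armor_percent=0):
    """EHP, assuming armor absorbs armor_percent % of every hit (an inference)."""
    return hp * 100 / (100 - armor_percent)


blocks = [
    ("Normal hull", 75, 0),
    ("Standard armor", 100, 60),
    ("Advanced armor", 250, 75),
]
for name, hp, armor in blocks:
    print(f"{name}: {effective_hp(hp, armor):.0f} EHP")
```

Under the same assumption, an EHP of 2500 on a 250 HP block would correspond to 90% absorption. That is only a back-calculation, not a stated config value.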
Power supply beams have been buffed to bring them back to near pre-rework levels.
They are slightly less effective than they used to be, but they still offer around 5 to 6 times more power regen than their onboard equivalent in pure power regen blocks.
Power supply/tick: 40 -> 240
Power consumption/tick: 50 -> 270
The piercing defensive effect cap has been decreased, but the number of blocks you need to reach the max percentage has been decreased as well. This will bring the max achievable EHP to 2500!
Shield capacitors now have 2 times more shield HP. This is to make fights last longer and to increase the chance of surviving high alpha damage.
Regen rate has remained unchanged.
Shield Capacity Total Mul: 55 -> 110
Warheads (dis-integrators) have doubled block damage.
Missile + Pulse
Radius has been nerfed to 48, with explosive you can get 58
Slightly faster, nerf: 4 -> 3
Does more damage, buff: 1 -> 2
Missile + Beam
Slower, so anti-missile turrets have less chance of missing, buff: 2 -> 1
Ingots and crystals are 2 times cheaper to make it easier to craft advanced and crystal armor.
Shield cap and rechargers are 2 times cheaper.
HP of the more expensive systems, mainly weapons and support tools, has been increased by a factor of 2-3 in most cases.
The punch through damage system now has an effective penetration depth of around 7 blocks.
It used to make the other hull damage effects more or less obsolete and was game-breaking in large groups. It is now equal to the piercing and explosive systems in terms of block destruction per hit.
The piercing effect blocks are now switched over to the punch through damage system. With its double damage against blocks, this is the ideal system for destroying heavily armored ships.
This is also great for low damage weapons, but with the drawback of doing little to no shield damage, depending on the effect ratio.
The punch through effect blocks now use the piercing damage system, which makes them capable of damaging blocks below armor without really destroying the armor itself first. The effect did lose its armor efficiency bonus though, so if you want to destroy armor, piercing is the way to go. It doesn’t have a shield debuff though.
Explosive has not changed. It doesn’t exactly penetrate but it can damage a wide surface area with a minimum of weapon groups.
To clear things up, these are the current 3 hull damage types (not blocks):
Piercing damage system: Damage applied to a block gets halved and goes on to the next block. This automatically leads to a softcap at around 7 blocks deep. It also makes it possible to destroy/damage blocks below armor plates without destroying the plates first.
Punchthrough damage system: Damage applied to a block is reduced by that block’s EHP, then it is halved and passed on to the next block. This also has a softcap at around 7 blocks.
Explosive damage system: ⅙ of damage gets applied to all touching blocks.
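The three rules can be sketched in a few lines. This is a toy model of the descriptions above, not the game's actual code: the function names are made up, and it assumes that a block takes the incoming damage before the remainder is halved, and that "touching blocks" means the six face neighbours.

```python
def piercing(damage, depth):
    """Damage arriving at each block in a line; half carries on each step."""
    hits = []
    for _ in range(depth):
        hits.append(damage)
        damage /= 2  # after 7 halvings less than 1% of the hit is left
    return hits


def punch_through(damage, block_ehps):
    """Each block absorbs up to its EHP; what is left is halved and passed on."""
    hits = []
    for ehp in block_ehps:
        if damage <= 0:
            break
        dealt = min(damage, ehp)
        hits.append(dealt)
        damage = (damage - dealt) / 2
    return hits


def explosive(damage, touching_blocks=6):
    """One sixth of the damage is applied to every touching block."""
    return [damage / 6] * touching_blocks


print(piercing(1000, 8))
print(punch_through(1000, [250, 250, 250, 250]))
print(explosive(600))
```

In this model the piercing hit keeps reaching deeper blocks regardless of their armor, while the punch through hit is used up by the EHP of the blocks in front.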
Unfortunately, these changes do require an overhaul in ship design. Because of the stronger shields and more durable hull, it is highly recommended to put more armor and weapon blocks on your ships and sacrifice some shields.
Any constructive feedback is appreciated.
Bug 1769: Nocx Charged Circuit Wedge (typo bug)
Bug 565: Pentas are actually Heptas
All fixed prices have been adjusted so that they are around 1.5-3 times higher than their dynamic price.
There have also been some bugfixes to combat the slowdown of played sounds, fixes for rarer problems like driver crashes from the new GUI, and a lot of smaller fixes for crashes and glitches.
Technical explanation of optimizations
To begin with, these optimizations will only work with occlusion culling off. Occlusion culling is a nice concept with a fatal drawback: it’s very hardware dependent and will be slower, cause glitches, or straight up crash on some systems.
For graphics, the main work was to find the bottleneck of the graphics system. A renderer is usually either CPU bound or GPU bound. StarMade’s graphics are not CPU bound right now, which means there is something in the GPU processing causing the longest wait per frame.
Since OpenGL is a pipeline design, waits cannot be identified just by checking how long the code needs to execute specific commands. The graphics card works asynchronously to the program execution, which means that all calls to OpenGL may be executed at an undefined time within the frame. If the program is GPU bound, the switch to the next frame at the end usually takes longer, as the CPU waits for all the instructions that haven’t finished yet.
To find out which part of the code is responsible, a deep analysis of the graphics processing is needed. That means switching off parts of the graphics processing one by one to identify exactly where a frame drop happens, while keeping the graphics card busy with large amounts of work.
In OpenGL (and other systems) there are 5 main bottlenecks to check: framebuffer fill rate, vertex processing, fragment processing, light processing and texture fetches.
By turning off specific parts of the graphics processing and observing which combinations make the graphics run faster, the bottleneck can be identified.
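The switch-off approach can be illustrated with a small toy model. Everything here is hypothetical: the stage costs are made-up numbers, and the frame time is modelled as the cost of the slowest enabled stage, which is a simplification of how a pipeline behaves.

```python
# Made-up per-stage costs in milliseconds, for illustration only.
STAGE_COST_MS = {
    "fill": 4.0,
    "vertex": 3.0,
    "fragment": 11.0,
    "lighting": 5.0,
    "texture": 6.0,
}


def fake_measure(disabled):
    """Toy frame time: the slowest stage that is still enabled."""
    active = [cost for stage, cost in STAGE_COST_MS.items() if stage not in disabled]
    return max(active, default=0.0)


def find_bottleneck(measure_frame_ms, stages):
    """Disable one stage at a time and report which saves the most frame time."""
    baseline = measure_frame_ms(disabled=frozenset())
    savings = {
        stage: baseline - measure_frame_ms(disabled=frozenset([stage]))
        for stage in stages
    }
    return max(savings, key=savings.get), savings


worst, savings = find_bottleneck(fake_measure, list(STAGE_COST_MS))
print(worst, savings)
```

In this model, disabling any stage other than the bottleneck saves nothing, which mirrors why timing individual calls is not enough to find it.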
In StarMade’s case, there was a severe bottleneck in fragment processing (putting the pixels into the polygons and onto the screen). This started the second tier of the search: finding the bottleneck within the fragment shader. It turned out to be the sheer load of passing interpolated variables like occlusion and normals to the fragment shader.
To combat that, I first changed the simpler per-vertex lighting to look exactly like the per-pixel lighting. That would have worked overall, but it would look worse at close distance as well as completely disable bump/normal mapping, which depends on per-pixel normals.
The solution that finally broke this was to use preprocessed shaders, which use simpler lighting at a distance and more advanced lighting up close. The result was immense, giving about 70% more fps on a planet of ~230 radius.
The next bottleneck identified was the fill rate of textures. Since high-res textures require a lot more pixel lookups and interpolation than low-res ones, another optimization was to use low-res textures at a distance. This also had the nice effect of eliminating texture noise when viewing objects at a distance.
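Both distance-based optimizations come down to picking cheaper settings for far-away geometry. The following is a minimal sketch of that idea only. The thresholds, texture sizes, and names are assumptions made up for this example, not StarMade's actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderSettings:
    lighting: str      # "advanced" up close, "simple" at a distance
    texture_size: int  # texture edge length in texels


def settings_for_distance(distance, near_limit=64.0,
                          full_texture_size=256, min_texture_size=16):
    """Pick lighting and texture resolution for a chunk at the given distance."""
    lighting = "advanced" if distance < near_limit else "simple"
    size = full_texture_size
    remaining = distance
    # Halve the texture resolution every time the distance doubles
    # beyond the near limit, down to a minimum size.
    while remaining >= near_limit and size > min_texture_size:
        size //= 2
        remaining /= 2
    return RenderSettings(lighting, size)


for d in (10, 64, 300, 10000):
    print(d, settings_for_distance(d))
```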
Lastly, the render queue has been optimized to greatly reduce draw calls and depth lookups by separating drawing into opaque and transparent parts, and by using chunked multi-drawing so that the chunks of a whole object are drawn with just one command.
Overall, the optimizations mean that a frame with large objects in it can be drawn in half the time it used to take, which essentially doubles the graphics performance.
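As a rough sketch of the render queue idea, the chunks of each object can be grouped into one batch per pass, so each object costs one draw command per pass instead of one per chunk. The tuple layout and function name are hypothetical, and drawing the opaque pass before the transparent one is the usual convention assumed here, not something stated above.

```python
from collections import defaultdict


def build_render_queue(chunks):
    """chunks: iterable of (object_id, chunk_id, is_transparent) tuples.

    Returns a list of (pass_name, object_id, chunk_ids) batches,
    all opaque batches first, then all transparent ones.
    """
    opaque = defaultdict(list)
    transparent = defaultdict(list)
    for object_id, chunk_id, is_transparent in chunks:
        target = transparent if is_transparent else opaque
        target[object_id].append(chunk_id)
    queue = [("opaque", obj, ids) for obj, ids in opaque.items()]
    queue += [("transparent", obj, ids) for obj, ids in transparent.items()]
    return queue


chunks = [
    ("ship", 0, False),
    ("ship", 1, False),
    ("ship", 2, True),
    ("station", 0, False),
]
for batch in build_render_queue(chunks):
    print(batch)
```

Here four chunks end up as three batched draw commands. On a large object with thousands of chunks the saving is correspondingly bigger.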
Furthermore these optimizations will also greatly benefit the shadow system, which also should be a lot faster now.
Network profiling is a lot trickier, as all you see on a basic level is individual bytes. Even listing packets is not very efficient: for one, it costs a lot of performance itself, and secondly, the output is very hard to interpret.
Thanks to the design of our network protocol, it was easy to aggregate traffic per class and field over time, which made it possible to analyze exactly how much network traffic there was and where it came from.
Furthermore, a profiling tool was built to help catch any fluctuation in the traffic sent and received. It can even save individual timeframes for later analysis. If you are interested, you can look at it with F12, as well as turn on the live graphs in the options.
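Aggregating traffic per class and field over time can be pictured roughly like this. It is a minimal sketch, not the in-game profiler: the class, method, and field names are invented for the example.

```python
from collections import Counter


class TrafficProfiler:
    """Sums sent bytes per (class, field) pair in fixed time windows."""

    def __init__(self, window_seconds=1.0):
        self.window_seconds = window_seconds
        self.windows = {}  # window index -> Counter of bytes per (class, field)

    def record(self, timestamp, class_name, field_name, num_bytes):
        index = int(timestamp // self.window_seconds)
        window = self.windows.setdefault(index, Counter())
        window[(class_name, field_name)] += num_bytes

    def top(self, index, n=3):
        """The n biggest traffic sources in one time window."""
        return self.windows.get(index, Counter()).most_common(n)


profiler = TrafficProfiler()
profiler.record(0.1, "Ship", "blockMods", 500)
profiler.record(0.7, "Ship", "blockMods", 300)
profiler.record(0.9, "Player", "position", 40)
profiler.record(1.2, "Ship", "blockMods", 10)
print(profiler.top(0))
```

Keeping only running sums per window is much cheaper than logging every packet, and the result points directly at the class and field responsible.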
There were several bottlenecks identified immediately:
Block modification: All modifications to blocks, be it by battle or by building, were sent to all players, which meant a huge load for players who have nothing to do with what happens in another part of the galaxy. These are now sent over private channels, so they only go to people in the area. Other players entering the area will get all the changes with the usual chunk requests. (Note that making private channels is not advisable for everything, as the cost of building and sending individual packets can outweigh what private channels would actually save.)
There were a lot of other small places where making things private saved a lot of basic bandwidth.
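The private channel idea boils down to filtering the recipients of an update by location. In the sketch below, "the area" is modelled as sectors within a small radius of the change. That definition, the radius, and all names are assumptions for illustration.

```python
def recipients(change_sector, players, radius=1):
    """Names of players within `radius` sectors of the change on every axis.

    players maps a player name to an (x, y, z) sector coordinate.
    """
    return [
        name
        for name, sector in players.items()
        if all(abs(a - b) <= radius for a, b in zip(sector, change_sector))
    ]


players = {
    "alice": (0, 0, 0),
    "bob": (1, 0, -1),
    "carol": (40, 2, 7),
}
print(recipients((0, 0, 0), players))
```

Only alice and bob would receive a block change in sector (0, 0, 0). Carol would get the result through the usual chunk requests if she ever flies there.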
The biggest optimization, however, concerned bigger ships and their control structure. In order for client and server to synchronize, the data of what block connects to what has to be transmitted as soon as a ship is loaded. This was already done on a private channel, but a bigger ship, even with compression and everything turned on, cost 2.8 MB of data per player in the area. That is of course not acceptable, as it caused servers to burst data and, depending on the provider, to enter a slow mode to compensate.
This was solved with a specially developed algorithm. By taking advantage of the block system once again and of how ship systems work, a 2D map based on the shortest two dimensions can be made and per-line regression can be applied. While this is a little too complicated to explain in detail, here is the result: that 2.8 MB ship now only takes 14 KB, which is an improvement by a factor of about 200 (2800 KB / 14 KB).
What comes next
As said, a lot of features are in the queue, foremost the rail system and the advanced chat (maybe even with an IRC interface).
Thanks for playing StarMade,
schema and the Schine Team