These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/


Dev Blog: Building a Balanced Universe

CCP Explorer
C C P
C C P Alliance
#121 - 2013-12-04 08:17:13 UTC
Suomi Khan wrote:
CCP Prism X wrote:
There's no sense of locality or proximity in WH space so they just get a very dumb but efficient method applied to them.

Read the Devblog and must say, looks like you guys put in a lot of work to make TiDi less frustrating, thank you a lot for that :)

Is it possible for you to make an addition to the Devblog explaining how the load is distributed in w-space by chance? We now know and understand in detail what is happening in known space, but it could be very cool to know how you solve w-space, even though it might be "dumb but efficient" :)
Greedy minimal-first without locality.
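
One plausible reading of "greedy minimal-first without locality" is sketched below: each system goes to whichever node currently carries the least load, with no regard for where the system sits on the map. This is an illustration only, not CCP's code. The system names, the load numbers, and the choice to place the heaviest systems first are all assumptions.

```python
import heapq

def greedy_min_first(system_loads, node_count):
    """Assign each system to whichever node currently carries the least load."""
    # Heap of (current_load, node_index); the least loaded node is always on top.
    heap = [(0.0, n) for n in range(node_count)]
    heapq.heapify(heap)
    assignment = {n: [] for n in range(node_count)}
    # Assumption: heaviest systems are placed first (a common greedy heuristic).
    ordered = sorted(system_loads.items(), key=lambda kv: kv[1], reverse=True)
    for system, load in ordered:
        node_load, node = heapq.heappop(heap)
        assignment[node].append(system)
        heapq.heappush(heap, (node_load + load, node))
    return assignment

# Hypothetical w-space systems with made-up load estimates.
loads = {"J100001": 8.0, "J100002": 5.0, "J100003": 4.0, "J100004": 3.0, "J100005": 2.0}
result = greedy_min_first(loads, 2)
node_totals = {n: sum(loads[s] for s in systems) for n, systems in result.items()}
```

Because no adjacency information is consulted, two neighbouring systems are as likely to share a node as two systems on opposite sides of the map, which fits the "no sense of locality" remark quoted above.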

Erlendur S. Thorsteinsson | Senior Development Director | EVE Online // CCP Games | @CCP_Explorer

CCP Explorer
C C P
C C P Alliance
#122 - 2013-12-04 08:18:07 UTC
Max Kolonko wrote:
CCP Explorer wrote:
One more thing to mention: as part of this change, the cluster is starting up 2 minutes faster. See here https://forums.eveonline.com/default.aspx?g=posts&m=3897467#post3897467 and here https://forums.eveonline.com/default.aspx?g=posts&m=3899297#post3899297 for details.
So we have like, what? 6 - 8 minutes of sleep left daily?
More during the week since deployment target is < 24 minutes, but on weekends you're down to 6-8, yes.

Erlendur S. Thorsteinsson | Senior Development Director | EVE Online // CCP Games | @CCP_Explorer

CCP Explorer
C C P
C C P Alliance
#123 - 2013-12-04 08:23:08 UTC
Kossaw wrote:
CCP Explorer wrote:
The final piece of this puzzle, the intra-node jumps vs. inter-node jumps we ultimately want to solve with Brain in a Box.
We're still waiting for that dev blog mate ;)
I know :) The timeline really is that it's going to be released when ready ;)

Erlendur S. Thorsteinsson | Senior Development Director | EVE Online // CCP Games | @CCP_Explorer

Mashie Saldana
V0LTA
WE FORM V0LTA
#124 - 2013-12-04 09:28:33 UTC
Would it be possible to have a node set the TiDi to 0 to permit a hot node remap without kicking out the players?
Rob Crowley
State War Academy
#125 - 2013-12-04 09:47:03 UTC  |  Edited by: Rob Crowley
LakeEnd wrote:
but my understanding is that you are still trying to balance the load geographically instead of doing it statistically?
It is balanced statistically, they just added geographic correlation to contain TiDi to the location which causes it.

Quote:
First of all, should you actually stop treating the problem for nullsec and highsec (+lowsec I guess) as similar?
Why? As of now they can't do load balancing more than once a day. And if that daily balancing works fairly well, why not apply it to low and high too?

Quote:
Related to the randomness of the nullsec load, my second concern is how fast you will iterate this node splitting or will it be set in stone and forgotten about in few months?
If by iterate you mean them reworking the algorithm I guess the answer is "never if it works", but I think you mean how often the system balancing is done and as I understand it the answer is "every day at DT".

Quote:
Won't this mean that hosts running the servers of nullsec regions of the north are almost idling, supporting the occasional ratter, and whichever host gets systems from current conflict regions (Immensea etc) is always going to be heavily overloaded?
No, cause in this scenario the "empty" nodes in the north would handle a lot more systems each than the "busy" ones where the fighting is happening. I think your misunderstanding is that you think they're doing the balancing purely geographically, but of course that's not the case, if it were they wouldn't need an algorithm, they could've just had a kid draw equally sized circles around systems on the map with a crayon.
TrouserDeagle
Beyond Divinity Inc
Shadow Cartel
#126 - 2013-12-04 10:35:21 UTC
When is lowsec being fixed? Fleet up more than 20 dudes and you get massive tidi and traffic control every jump/undock.
Sakuma Ogunuchi
#127 - 2013-12-04 12:19:30 UTC
Can we get a copy of that map with System names instead of IDs?
Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#128 - 2013-12-04 12:32:42 UTC
Sakuma Ogunuchi wrote:
Can we get a copy of that map with System names instead of IDs?



If all you want it for is a large star map, with system names: http://imgur.com/a/opwDm

(use the cog in the top left of each image to get the full resolution version)

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter

Lord Egger XIV
Doomheim
#129 - 2013-12-04 12:48:33 UTC
Mashie Saldana wrote:
Would it be possible to have a node set the TiDi to 0 to permit a hot node remap without kicking out the players?


At the risk of drunk posting.
I think that's what they're hoping for but have not quite got right yet.
Cordeaux en Cedoulain
State Protectorate
Caldari State
#130 - 2013-12-04 13:18:28 UTC
In the dev blog you suggested that there are many many new players. Is it too much to ask for actual numbers? Like where are we on the active subscribed accounts count at the moment? And how much does multi character training get used?
Mei ra'Zhault
Kimotoro Trading Company
#131 - 2013-12-04 13:53:51 UTC  |  Edited by: Mei ra'Zhault
LakeEnd wrote:
First of all, should you actually stop treating the problem for nullsec and highsec (+lowsec I guess) as similar? Because the way I see it, highsec system load should be rather predictable and mostly consistent due to tradehubs and missioning centers remaining largely the same. Nullsec however is rather more unpredictable, system load is generated by whim of player coalitions and where they choose to clash that day (timers and random acts of violence).


As the above is inarguably true, why not load-balance highsec by locality as outlined in the devblog, and balance null-sec by minimizing intra-node jumps?

(edited again) or, similarly:

a) partition space by repeatedly dividing it in two as in the devblog
b) categorize the resulting partitions as regions of either consistently high or consistently low activity
c) discard the low activity partitions and return all of their systems to a single set
d) stir vigorously, assign to remaining nodes
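
Steps a) to d) above can be sketched roughly as follows. Everything here is a simplification made for illustration: systems are hypothetical `(name, x, load)` tuples, the split is on a single coordinate as a stand-in for whatever the dev blog's bisection actually does, "consistently high activity" is reduced to one load number compared against a made-up threshold, and "stir vigorously" is read as a seeded shuffle followed by round-robin assignment.

```python
import random

def bisect_space(systems, depth):
    """Step a: repeatedly split (name, x, load) tuples in two along x."""
    if depth == 0 or len(systems) < 2:
        return [systems]
    ordered = sorted(systems, key=lambda s: s[1])
    mid = len(ordered) // 2
    return bisect_space(ordered[:mid], depth - 1) + bisect_space(ordered[mid:], depth - 1)

def partition_and_pool(systems, depth, busy_threshold):
    """Steps b and c: keep busy partitions intact, pool everything else."""
    busy, pool = [], []
    for part in bisect_space(systems, depth):
        if sum(s[2] for s in part) >= busy_threshold:
            busy.append(part)      # high activity: stays a geographic partition
        else:
            pool.extend(part)      # low activity: returned to one shared set
    return busy, pool

def stir_and_assign(pool, node_count, seed=0):
    """Step d: shuffle the pooled systems and deal them out round-robin."""
    shuffled = list(pool)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::node_count] for i in range(node_count)]

# Hypothetical data: two busy systems on one side, two quiet ones on the other.
systems = [("A", 0.0, 10.0), ("B", 1.0, 9.0), ("C", 2.0, 1.0), ("D", 3.0, 1.0)]
busy, pool = partition_and_pool(systems, depth=1, busy_threshold=5.0)
nodes = stir_and_assign(pool, node_count=2)
```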
War Kitten
Panda McLegion
#132 - 2013-12-04 13:56:01 UTC
Didn't you say it would be more efficient if adjacent systems were on different nodes though?

Could your optimizations be run down to the level of node pairs rather than single nodes, and then distribute those clusters of systems amongst the two nodes so that there are a minimum of same node connections?

It might be tricky, but it also might alleviate some of the gigantic-fleet-on-the-move lag since each jump would involve 2 nodes rather than usually only one.
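
The node-pair idea amounts to splitting a cluster of systems into two sides so that as few stargates as possible connect systems on the same side. A minimal sketch, assuming a hypothetical adjacency map called `gates`, is a breadth-first two-colouring. It ignores load entirely and is not something the dev blog describes.

```python
from collections import deque

def split_across_pair(gates):
    """Colour systems 0 or 1 by BFS so neighbours alternate where the graph allows."""
    side = {}
    for start in gates:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in gates[current]:
                if neighbour not in side:
                    side[neighbour] = 1 - side[current]
                    queue.append(neighbour)
    return side

def same_node_gates(gates, side):
    """Count gates whose two endpoints ended up on the same node."""
    return sum(1 for a in gates for b in gates[a] if a < b and side[a] == side[b])

# Hypothetical four-system pipe: A - B - C - D.
gates = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
side = split_across_pair(gates)
leftover = same_node_gates(gates, side)
```

On a simple pipe like this every jump crosses nodes. Wherever the gate network contains a loop with an odd number of systems, at least one gate has to stay on a single node, and nothing in this sketch keeps the two nodes evenly loaded.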

I don't judge people by their race, religion, color, size, age, gender, or ethnicity. I judge them by their grammar, spelling, syntax, punctuation, clarity of expression, and logical consistency.

warock
University of Caille
Gallente Federation
#133 - 2013-12-04 15:09:56 UTC
Awesome dev blog, really enjoyed the extra technical information to chew on.

moar! Moar! MOAR! Lol
TheSmokingHertog
Julia's Interstellar Trade Emperium
#134 - 2013-12-04 15:13:04 UTC
Steve Ronuken wrote:
Sakuma Ogunuchi wrote:
Can we get a copy of that map with System names instead of IDs?



If all you want it for is a large star map, with system names: http://imgur.com/a/opwDm

(use the cog in the top left of each image to get the full resolution version)



Nice

"Dogma is kind of like quantum physics, observing the dogma state will change it." ~ CCP Prism X

"Schrödinger's Missile. I dig it." ~ Makari Aeron

-= "Brain in a Box on Singularity" - April 2015 =-

CCP Prism X
C C P
C C P Alliance
#135 - 2013-12-04 15:23:12 UTC  |  Edited by: CCP Prism X
I'M BACK!

I just want to clear up some confusion I'm seeing first: this is not meant to be the Holy Grail of lag reduction. This is a static load balancer. It is in no way the end-all solution to our load problems, it's an initial step that's required before anything further can be done. Optimizing underused resources is just wasted work. Well it's not wasted but it makes more sense to do it the other way around.

There is no load reduction going on here. Today's total load pressure, with or without my code, would be exactly the same. But with my code it will be more evenly distributed between nodes, so the probability of a "wild" TiDi appearing has been reduced. TiDi in unreinforced systems with a massive fleet presence will still happen as it always has. We'll need to reduce the CPU footprint per user if we want to prevent Fleet TiDi (and my money is on that only increasing fleet sizes until TiDi becomes unbearable again).

So with that being said I'm going to try and answer some of the more frequent questions here.


"Adjacent systems" are bad for fleet movement / staging.

They always have been. They've actually been worse, because the old system was so aggressive about grouping systems of the same constellation together that it would, in so many cases that I was given time to work on this, choose to overload the node rather than split up the constellation.

If your Staging system is not reinforced, it's probably going to share its node with other systems. If the fleet is large enough to cause TiDi it will cause TiDi no matter what these other systems are. It's even possible that this staging system had enough load caused on it the day before to be reinforced on its own because it's simply too loaded to share its node with any other system. But that will not help you with TiDi if your fleet is large enough to overload that node.

If you control the space around your staging system, you can now command people to stay out of those systems to avoid TiDi-ing your fleet's staging system. I'd love to offer you a map of all node allocations so that you could discern whether or not that was needed... but I'm certain people would metagame TiDi into existence through that.

I'm not sure what more to say. Nobody likes being TiDi'd. I'm not going to try to convince you to like it. But you'll be able to anticipate it now. And in case it's not clear: Nullsec and Empire do not run on the same nodes, and they have not for many years. Fleets in Nullsec thus do not cause TiDi in Empire, or vice versa. They cause TiDi in other Nullsec systems. This means that under the old system a staging system from the north could be allocated to a node already running a staging system from the south. That can no longer happen. That's something, yeah?


This does not help with sudden escalations.


Absolutely not. This is a static load balancer that balances systems between nodes at server startup. I'm actually reading a paper from some people at the University of Bonn about predicting destinations based on previous system jumps. It's pretty interesting but my brain now hurts. But if we could hook something like that in we could detect staging systems forming. We'd still not detect sudden escalations. Sudden escalations need dynamic load balancing if we're to handle them gracefully under the current CPU per User fingerprint.
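
To give a feel for "predicting destinations based on previous system jumps", here is a deliberately naive sketch: count which system pilots jumped to next from each system, then guess the most frequent one. This is not the method from the Bonn paper, which the post does not describe. The routes and function names are hypothetical.

```python
from collections import Counter, defaultdict

def build_jump_model(routes):
    """Count observed origin -> destination jumps across all recorded routes."""
    model = defaultdict(Counter)
    for route in routes:
        for origin, destination in zip(route, route[1:]):
            model[origin][destination] += 1
    return model

def predict_next(model, system):
    """Return the most frequently observed next system, or None if unseen."""
    if system not in model:
        return None
    return model[system].most_common(1)[0][0]

# Hypothetical jump histories for three pilots.
routes = [["A", "B", "C"], ["A", "B", "D"], ["E", "B", "C"]]
model = build_jump_model(routes)
guess = predict_next(model, "B")
```

A model like this only learns from jumps that have already happened, which is in line with the point above that it might spot a staging system forming but would not catch a sudden escalation.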


What about reinforced nodes?

Reinforcing nodes means that we, usually, move that system to a single node that is running nothing other than that system. We currently have three nodes on standby for the premapper to use according to reinforcement requests. Any system marked for reinforcement, at startup, is completely excluded from this premapping process. They're effectively marked as "Fleet Fight Systems" rather than "Null Sec Systems" and thus the "Null Sec System" load balancing method will not include them.
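
The exclusion described above could look something like the sketch below: reinforced systems are pulled out first and each gets a standby node to itself, and only the remaining systems go through the balancer. The node names, loads, and the simple least-loaded balancing step are assumptions for illustration, not the actual premapper.

```python
def premap(system_loads, reinforced, general_nodes, standby_nodes):
    """Give reinforced systems dedicated nodes, then balance the rest."""
    if len(reinforced) > len(standby_nodes):
        raise ValueError("more reinforcement requests than standby nodes")
    mapping = {node: [] for node in list(general_nodes) + list(standby_nodes)}
    # Reinforced systems never enter the balancing step below.
    for node, system in zip(standby_nodes, sorted(reinforced)):
        mapping[node].append(system)
    node_load = {node: 0.0 for node in general_nodes}
    remaining = {s: l for s, l in system_loads.items() if s not in reinforced}
    # Assumed balancer: heaviest system first onto the least loaded node.
    for system, load in sorted(remaining.items(), key=lambda kv: kv[1], reverse=True):
        target = min(node_load, key=node_load.get)
        mapping[target].append(system)
        node_load[target] += load
    return mapping

# Hypothetical startup: one reinforced staging system, three standby nodes.
loads = {"Stage": 50.0, "A": 5.0, "B": 4.0, "C": 3.0}
mapping = premap(loads, {"Stage"}, ["n1", "n2"], ["f1", "f2", "f3"])
```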

Dynamic vs. Static Load Balancing.


As I mentioned, dynamic load balancing would solve a lot of our issues. But there are massive hurdles to that happening. I know that sounds weird to some people who work in an environment where it's easy as pie. But that's not our environment. We simply are not in a state where we can move a system between nodes without offlining everyone first. Would we like to be in that state? Of course! But we're not. So we're stuck with the static approach until that changes.

So instead we run three other spare nodes (that are not the fleet fight nodes mentioned above) that we can allocate systems to if we need to separate them from a system with a sudden escalation fight in it.


Why is Empire split from Null?


Because Empire has a completely different load fingerprint than Nullsec (Crimewatch has been mentioned). Players in Empire also behave fairly differently from players in Nullsec. Wormhole space is also separated from these two groups of systems for the same reasons.

Sadly I think we have too few WH space nodes allocated now. If Empire runs smoothly today and tomorrow then I'm thinking I'll move one or two Empire nodes into the WH space rotation for a better weekend experience.

And now I have to run to a meeting (probably already too late by the time I finish editing this on the forums). Sorry if these answers feel a bit abrupt, I was in a hurry. :D
Gilbaron
The Scope
Gallente Federation
#136 - 2013-12-04 15:35:20 UTC
wait,

Quote:
Absolutely not. This is a static load balancer that balances systems between nodes at server startup. I'm actually reading a paper from some people at the University of Bonn about predicting destinations based on previous system jumps. It's pretty interesting but my brain now hurts. But if we could hook something like that in we could detect staging systems forming. We'd still not detect sudden escalations. Sudden escalations need dynamic load balancing if we're to handle them gracefully under the current CPU per User fingerprint.


the university of bonn does papers about our jumps through TQ ?

is there any kind of support for university papers ? i might actually be interested (not on a technical level, but for markets or politics)
Verite Rendition
F.R.E.E. Explorer
#137 - 2013-12-04 15:50:58 UTC
Thank you for the delicious technical details, Prism X. These are by far my favorite type of dev blogs. :)
Vincent Athena
Photosynth
#138 - 2013-12-04 16:20:12 UTC
CCP Prism X wrote:
..............We'll need to reduce the CPU footprint per user if we want to prevent Fleet TiDi (and my money is on that only increasing fleet sizes until TiDi becomes unbearable again).

Actually that's not always true. There are a finite number of people who want to be in big fleet fights. My guess is it's around 10,000. If you could handle that many without TiDi then you would have solved the issue at this time. Keep improvements coming at the same rate as the game's population increases, and it will continue to be solved.

So what efforts are being done to reduce the CPU footprint per user?

Know a Frozen fan? Check this out

Frozen fanfiction

Sentient Blade
Crisis Atmosphere
Coalition of the Unfortunate
#139 - 2013-12-04 16:22:56 UTC
I've mentioned it elsewhere, but why are these machines not virtualised (or are they)? Surely something like vMotion would be able to move high-use systems onto dedicated hardware without the need to pause anything.
CCP Prism X
C C P
C C P Alliance
#140 - 2013-12-04 16:33:07 UTC
Vincent Athena wrote:
So what efforts are being done to reduce the CPU footprint per user?

I'm not privy to the plans of others, and have never been one to promise work on behalf of anyone other than myself. But I can totally tell you that I and RESCINDED just joined Gridlock to contribute to the Brain in a Box project.

That's of course a project outside our feature expansion release cycle. It's done when it's done. So I can't tell you anything more concrete than that. But feel free to berate me about it until then. I've got a hide as thick as my head. But I don't want you to do that to RESCINDED because that would be promising stuff on their behalf. ;)