These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/


Topic is locked indefinitely.

Dev Blog: Building a Balanced Universe

Haseo Antares

Production N Destruction INC.

F O R M I C I D A E

Likes received: 82

#161 - 2013-12-05 03:41:37 UTC

Magic, got it. We currently have the world's greatest linguists and scientists trying to decode what you just said.

Dersen Lowery

The Scope

Likes received: 1,782

#162 - 2013-12-05 04:44:58 UTC | Edited by: Dersen Lowery

Sentient Blade wrote:

The underlying VM has no idea at all it's been moved.

And since the move takes a nontrivial amount of time relative to the 1 Hz physics engine, the odds are very good that your half a second will cross a tick boundary. That means every move must be followed by a resync with adjacent systems to get everyone back on the same page, right? If one node is off by a server tick, how do you handle that?
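To put a rough number on the tick-boundary concern above, here is a minimal Python sketch. It assumes a fixed 1-second tick and a pause that begins at a uniformly random point within a tick; both are simplifying assumptions, and the function names are hypothetical, not anything from CCP's code.

```python
def tick_crossing_probability(pause_s: float, tick_s: float = 1.0) -> float:
    """Chance that a pause spans at least one tick boundary.

    Assumes the pause starts at a uniformly random offset within a tick.
    """
    if pause_s < 0 or tick_s <= 0:
        raise ValueError("pause_s must be >= 0 and tick_s must be > 0")
    return min(1.0, pause_s / tick_s)


def ticks_missed(start_offset_s: float, pause_s: float, tick_s: float = 1.0) -> int:
    """Number of tick boundaries that fall inside a pause.

    start_offset_s is how far into the current tick the pause begins.
    """
    before = int(start_offset_s // tick_s)
    after = int((start_offset_s + pause_s) // tick_s)
    return after - before
```

Under these assumptions a half-second pause crosses a boundary about half the time, while a 2-second pause always does, so the resync question applies to any design sized for the longer move.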

Proud founder and member of the Belligerent Desirables.

I voted in CSM X!

Abdiel Kavash

Deep Core Mining Inc.

Caldari State

Likes received: 2,736

#163 - 2013-12-05 04:52:16 UTC

Dersen Lowery wrote:

Sentient Blade wrote:

The underlying VM has no idea at all it's been moved.

During TiDi different systems are not running in sync either.

(I'm not saying this as a proof that this will be easy, rather as anecdotal evidence for it.)

Pak Narhoo

Splinter Foundation

Likes received: 1,770

#164 - 2013-12-05 04:59:49 UTC

CCP Prism X, is there any relation between the "balanced universe" and the perceived unresponsiveness from this thread?

NinjaTurtle

Pator Tech School

Minmatar Republic

Likes received: 115

#165 - 2013-12-05 06:01:49 UTC

Great dev blog! Thanks so much for giving us insight into how you balance the clusters, I for one had been wondering what your process was for some time. Can't wait to see the results

Rn Bonnet

Perkone

Caldari State

Likes received: 22

#166 - 2013-12-05 08:09:20 UTC

Dersen Lowery wrote:

Sentient Blade wrote:

The underlying VM has no idea at all it's been moved.

vMotion at least is truly transparent to the underlying VM. You will see a "pause" but incoming network packets etc. are not dropped, just queued while the machine is in motion afaik.

Steve Ronuken

Fuzzwork Enterprises

Vote Steve Ronuken for CSM

Likes received: 6,759

#167 - 2013-12-05 11:00:17 UTC

Rn Bonnet wrote:

Dersen Lowery wrote:

Sentient Blade wrote:

The underlying VM has no idea at all it's been moved.

vMotion at least is truly transparent to the underlying VM. You will see a "pause" but incoming network packets etc. are not dropped, just queued while the machine is in motion afaik.

Nope.

Set up a continuous ping of a VM, then vmotion it, and you'll see a couple of dropped packets.

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter

Cerulean Ice

Royal Amarr Reclamation

Likes received: 57

#168 - 2013-12-05 15:25:39 UTC

I noticed a typo in the 3rd to last image, detailing how the x/y split works to better facilitate the repeated splitting in half.

http://content.eveonline.com/www/newssystem/media/65499/1/wholePowerOfTwoSolution.jpg

In the blue text for the 1st split, 85/64 is not 75.3%. 64/85 is, however. ^^

Cerulean Ice, Professor, E-UNI
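The correction in post #168 is easy to check. A minimal Python sketch; the helper name `share` is hypothetical, and the only inputs are the two numbers quoted from the image:

```python
def share(part: int, total: int) -> str:
    """Format part/total as a percentage with one decimal place."""
    return f"{part / total:.1%}"


print(share(64, 85))  # 75.3%
print(share(85, 64))  # 132.8%
```

So the 75.3% figure matches 64/85, as the post says, while 85/64 is above 100%.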

Cygnet Lythanea

World Welfare Works Association

Likes received: 577

#169 - 2013-12-05 16:43:24 UTC

It's nice to see work done on high sec, even if it took the servers burning up before CCP would admit that highsec exists... LOL

Mioelnir

Brutor Tribe

Minmatar Republic

Likes received: 199

#170 - 2013-12-05 21:03:44 UTC | Edited by: Mioelnir

About the 2 seconds: that's straight from the vendor. So while, in practice, it may not take more than half a second, you still need to design your cluster to be able to handle a 2 second move. Better yet, a 4 second move. If every client disconnects because a move took .7 instead of .5 seconds, you gained nothing.

As for putting every solar system on its own VM: yes, that is rather easy to maintain - from the POV of the virtual infrastructure. But it means x30 more connections on the internal end of the session servers. It also means x30 more SQL sessions, which probably can't be scaled down by x30. It also means a larger memory footprint for the entire server (x30 more OS instances) and decreased cache efficiency. That's why I called it a workaround.

vMotion works nicely for applications where you can add redundancy via IP failover. For protocols with standing connections and high degree of time synchronization - let's just say it gets complicated fast.

Abdiel Kavash wrote:

Dersen Lowery wrote:

Sentient Blade wrote:

The underlying VM has no idea at all it's been moved.

During TiDi different systems are not running in sync either.

(I'm not saying this as a proof that this will be easy, rather as anecdotal evidence for it.)

The tick between different systems runs differently. It probably always has. While all TQ nodes will run with similar latencies against the same NTP to keep the cluster internal clocks sync'ed, I doubt CCP sync'ed the server tick. Unless they use a wallclock second to initialize the first tick after starting the process - which actually they might have, thinking about it.

But this is not really that important inside the cluster. There, only the wallclock really has to be sync'ed, so that timestamps mean the same thing to everyone involved. That can be handled; NTP solved that problem decades ago.

The move is much more likely to desync the tick-count between server and client dogma simulation. The clients would be some seconds ahead of the server.
Here the server could:
- skip forward to the clients, discarding input for the skipped ticks
- skip forward, (try to) apply the entire input queue to the next processed tick
- instruct all clients to roll back to its tick, discarding input
- signal the clients a higher TiDi level than the server actually runs at until it has caught up again
In any case, the server would have to be notified by the infrastructure that it has been moved, since the eve clients are untrusted terminals and the server cannot trust them even if every connected client agrees that the server-tick is off by the same offset.
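The first three options in the list above can be sketched as a small policy function. This is only an illustration in Python: the `Policy` names, the `resync` function, and the `(tick, command)` input format are all hypothetical, and the fourth option (reporting a higher TiDi level until the server catches up) is not modelled.

```python
from enum import Enum


class Policy(Enum):
    SKIP_DISCARD = 1      # jump to the client tick, drop input for skipped ticks
    SKIP_APPLY_ALL = 2    # jump to the client tick, apply the whole queue there
    ROLLBACK_CLIENTS = 3  # keep the server tick, roll clients back, drop input


def resync(server_tick, client_tick, queued, policy):
    """Return (new_server_tick, new_client_tick, inputs_to_apply).

    queued is a list of (tick, command) pairs the clients sent while the
    server was paused; every tick in it is later than server_tick.
    """
    if policy is Policy.SKIP_DISCARD:
        # Only input stamped for the tick we resume at survives.
        kept = [(t, cmd) for t, cmd in queued if t >= client_tick]
        return client_tick, client_tick, kept
    if policy is Policy.SKIP_APPLY_ALL:
        # Re-stamp everything so it lands on the next processed tick.
        kept = [(client_tick, cmd) for _, cmd in queued]
        return client_tick, client_tick, kept
    if policy is Policy.ROLLBACK_CLIENTS:
        # Clients are told to rewind; their queued input is thrown away.
        return server_tick, server_tick, []
    raise ValueError(f"unknown policy: {policy}")
```

For example, with the server at tick 10, clients at tick 13 and input queued for ticks 11 to 13, `SKIP_DISCARD` keeps only the tick-13 command, `SKIP_APPLY_ALL` keeps all three re-stamped to tick 13, and `ROLLBACK_CLIENTS` keeps none.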

Btw, I think it's awesome that we as players sit here talking about TQ's cluster architecture.

[Edit]
The most elegant solution would actually be to have the infrastructure send an "intent to move" message to the server. The server could then set TiDi to 100%, completely freezing the universe (similar to how it works at downtime), trigger some "Sol-Node is being moved" message on the client, and signal "ready to move" back to the infrastructure. After the move, the infrastructure would send a "move complete" message, and the server would lift the 100% TiDi and continue the game.
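The handshake proposed in the edit above can be sketched as a tiny state machine. This is a Python illustration only: the class `SolNode`, its method names, and the in-memory message list are assumptions made for the example, and the message strings are the ones quoted in the post, not a real API of the hypervisor or of the game server.

```python
from enum import Enum, auto


class State(Enum):
    RUNNING = auto()
    FROZEN = auto()  # stands in for "TiDi at 100%": no ticks are processed


class SolNode:
    def __init__(self):
        self.state = State.RUNNING
        self.tick = 0
        self.client_messages = []  # what connected clients would be shown

    def step(self):
        """Advance the simulation by one tick unless the node is frozen."""
        if self.state is State.RUNNING:
            self.tick += 1

    def on_intent_to_move(self) -> str:
        """Infrastructure announces a move: freeze, warn clients, acknowledge."""
        self.state = State.FROZEN
        self.client_messages.append("Sol-Node is being moved")
        return "ready to move"

    def on_move_complete(self):
        """Infrastructure reports the move finished: resume ticking."""
        if self.state is not State.FROZEN:
            raise RuntimeError("move complete received without intent to move")
        self.state = State.RUNNING
```

Because the tick counter does not advance between the two messages, the server and its clients resume from the same tick, which is the point of freezing before the move rather than resyncing after it.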

Diomedes Calypso

Aetolian Armada

Likes received: 245

#171 - 2013-12-06 06:11:34 UTC

These sorts of blog posts make me love the game even though I tend to think of a python as a snake in the Amazon or a pet snake around someone's neck at a park in Berkeley, California.

Respect for the intelligence and knowledge of the users.

Treating us like adults.

I love that the company has so firmly decided (lol yes, since the $1000 pants debacle) not to assume that people who don't really grasp more than the broad strokes will be put off by "too much detail".

Yes I do understand the clusters and understand deviations and balancing etc but get lost or glazed eyed a bit deeper. I love that I'm told more than I want to know on some topics but can suck in the details on topics I'm interested in (start talking the velocity of money and I get real interested)

And .. heck.. I can always start researching terms I don't understand and enjoy the whole thing and be more knowledgeable about computers from playing the game !

Blue Harrier

Likes received: 225

#172 - 2013-12-06 15:40:30 UTC

Can I just pop in and say having read all 9 pages of this thread I wish more threads were like this on the forums.

Constructive talking among a diverse group of some very and some not so very knowledgeable members, no one having tantrums, throwing teddies out of prams, nothing but reasoned arguments.

Some putting forward what if’s, others debating and showing why this would not be possible but leaving room for further debate in case they missed something.

Must be the spirit of Christmas or something, well done to all.

"You wait - time passes, Thorin sits down and starts singing about gold." from The Hobbit on ZX Spectrum 1982.

Katrina Bekers

A Blessed Bean

Pandemic Horde

Likes received: 272

#173 - 2013-12-06 17:13:17 UTC

Steve Ronuken wrote:

Nope.

Set up a continuous ping of a VM, then vmotion it, and you'll see a couple of dropped packets.

Ping is connectionless and has a timeout of 3 seconds.

A TCP connection is - duh! - connection based, and usually the timeout is at 30 seconds.

Perfect? No.

But a dropped ping doesn't necessarily mean a dropped connection.
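The distinction drawn above can be illustrated with a retransmission sketch. This Python example assumes a sender that retries with a 1-second initial timeout that doubles on each loss, and gives up after the 30 seconds mentioned in the post; real TCP stacks differ in their timers, and the function name is hypothetical.

```python
def first_delivery_time(outage_s, initial_rto_s=1.0, give_up_after_s=30.0):
    """Time at which a segment first sent at t=0 gets through.

    The path is down for outage_s seconds starting at t=0. Each lost attempt
    is retried after a timeout that doubles every time. Returns None if the
    sender gives up before the path comes back.
    """
    t = 0.0
    rto = initial_rto_s
    while t < outage_s:          # this attempt is lost: the path is still down
        t += rto                 # wait for the retransmission timer
        rto *= 2                 # exponential backoff
        if t > give_up_after_s:
            return None          # the connection would be dropped here
    return t
```

With these assumptions a half-second outage costs one retransmission (delivery at 1 second) and a 2-second outage costs two (delivery at 3 seconds), so the connection survives even though individual packets were lost.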

<< THE RABBLE BRIGADE >>

Steve Ronuken

Fuzzwork Enterprises

Vote Steve Ronuken for CSM

Likes received: 6,759

#174 - 2013-12-06 17:50:33 UTC

Katrina Bekers wrote:

Steve Ronuken wrote:

Nope.

Set up a continuous ping of a VM, then vmotion it, and you'll see a couple of dropped packets.

It does mean dropped packets though. /That/ is what I was saying.

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter

Rain6637

GoonWaffe

Goonswarm Federation

Likes received: 35,121

#175 - 2013-12-06 21:34:29 UTC

wormhole mass accumulation needs to be looked at, specifically: how it relates to traffic control. traffic control prevented a wormhole jump, giving me a "you will be cleared to jump within the next X seconds," but also counted my ship's mass against the remainder on the hole, subsequently shutting it down while I stared at a traffic control timer. if quiet systems = sisi-esque dropped jump attempts, the least consideration you could also make is preventing dropped jumps from contributing to wormhole mass limits.

Help, I can't download EVE

President of the Commissar Kate Fanclub

PLEX: A Giffen good? (It's 1B?)

Rain6636

GoonWaffe

Goonswarm Federation

Likes received: 3,242

#176 - 2013-12-07 00:43:40 UTC | Edited by: Rain6636

I've submitted a bug report, referencing the dev blog, and outlining the scenario in which traffic control will reject a wormhole jump while the ship's mass is still counted toward the hole's mass limit (as if the jump was successfully made). I can't find a bug report number to list here.

tell me if i'm wrong, and if traffic control does not affect mass limit totals under any node/load condition.

#1 Fan of the Commissar Kate Fanclub

Jessica Danikov

Network Danikov

Likes received: 454

#177 - 2013-12-07 13:20:31 UTC

Andy Koraka wrote:

Maybe I'm misunderstanding something, but as far as I can tell this will only have a negative effect on the quality of game play in regards to already painful fleet combat.

Frankly I don't remember the last time I was in a full fleet and there wasn't heavy Ti-Di. Every time a solitary 250 man fleet jumps a gate the system spikes to 10% tidi for 30-45 seconds. Even if every fleet fight was on an individual reinforced node (reinforced nodes are the exception, not the rule) the issue of gate Tidi is going to be exponentially worse under the new regional scheme since every individual fleet in the area traveling to (or from) the combat system is going to be sequentially triggering gate lag on the same node. It's going to be a particularly painful change given the recent quality of life hits to the majority of fleet ships, there's nothing fun or engaging about staring at a warp tunnel for 10 minutes per system the entire trip home.

As far as the metagame is concerned, even without a published node map it's going to be exploited. For example in a defensive Sov war, if most of a region is on the same node it's not going to be hard to find a linked system by trial and error and dock/undock repeatedly to cascade the entire node (most of a region in the current scheme) into a sustained 10% tidi to discourage siege fleets from grinding structures.

Yes the old system wasn't perfect, but the guy ratting in an empty system halfway across EvE could have just moved over to a different system and continued ratting. Maybe this is the right solution for Empire where loads are usually steady from day to day but it's the wrong approach in Nullsec.

The changes haven't changed this problem significantly: both systems create large areas of connected systems that are all on a single node. The new one just ignores constellation boundaries and balances the (predicted) load across nodes better, while also ensuring all solar systems on a node are fairly local to each other. At worst, it may make the contiguous spaces a little larger.

The static mapper could do a lot more for this issue by striping nodes if the difference between intra-node and inter-node jumps really is significant (especially when scaled up) and the efforts to do so should be fairly minimal. If not, the Brain in a Box is going to be the next big advance in that area.


Rain6636

GoonWaffe

Goonswarm Federation

Likes received: 3,242

#178 - 2013-12-07 20:10:36 UTC

still waiting for confirmation that failed wormhole jumps with traffic control messages count against the wormhole mass, but will be looked into. (meanwhile there will be support tickets, handled by uninformed customer service staff)

#1 Fan of the Commissar Kate Fanclub

Alex Logan

OK Researches And Inventions

Likes received: 16

#179 - 2013-12-07 23:06:04 UTC

I don't think we should trust CCP Prism X. I don't think libras are serious and trustworthy. Sorry but I won't read your stuff.

James Amril-Kesh

Viziam

Amarr Empire

Likes received: 11,532

#180 - 2013-12-07 23:22:00 UTC

Christ, these changes are awful.

"We noticed that inter-node jumps are less expensive than intra-node jumps"

And then proceeds to put adjacent systems on the same node to increase intra-node jumps.

Enjoying the rain today? ;)

