These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/

EVE Technology Lab

 
  • Topic is locked indefinitely.
 

How do EVE servers handle so much Net traffic? - help to write a report for a Uni class

First post
Author
TL Castiel
The Scope
Gallente Federation
#1 - 2011-11-04 14:57:11 UTC
Hello everyone,

This semester I took Computer Networks I, and our task right now is to examine an existing server application and report on how it works. Many chose Apache, some chose ProFTPd, and so on... We are then to look at different ways of handling TCP connections and later write a TCP server and client that follow one of those approaches.

So I decided to go a bit... more technical and try to look at some game server's server-side source code. I found none, so I will most likely write my report on some open-source application's design. However, I became curious:

Creating a blocking-IO TCP server just doesn't seem viable with so much net traffic going on. Imagine a blocking-IO TCP server listening on CCP's side: whenever something happens on one client's side, all the other clients have to wait for that client's network operation to finish. With a few hundred connections I can imagine that being fast enough, but with 40-50,000?

So I came to the conclusion that it must be a non-blocking, async TCP server on CCP's side. Whenever a client connects, it either gets a thread to handle its connection or gets a connection resource from a pool. I very much doubt that CCP would want 40-50,000 threads running on their servers, since that would mean a lot of context switches, which eat up CPU time and memory.
Another problem with async IO: the execution order of the threads does not depend on the time they were created, so there is no way to guarantee, say, FIFO order of execution; it depends on the OS scheduler. Imagine what would happen if you hit warp and a little later your opponent in PVP hits the scrambler. In a real-world scenario I'd imagine that the warp should be executed first and you stay alive, but it is entirely possible that client 2's scram request gets executed first. You are doomed O_o
If async IO is used, how do they ensure this FIFO ordering of events?

I heard some rumors that CCP is using microthreads with Stackless Python. Would a Python application have the power to serve so many connections? That would also mean the application runs on a single core rather than across multiple cores. Not very... efficient...
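For what it's worth, the microthread idea is easy to sketch with the standard asyncio library - this is purely an illustration of cooperative microthreads sharing one OS thread, not CCP's Stackless code, and `handle_client` is a made-up stand-in for real connection handling:

```python
import asyncio

# Illustration only: thousands of "microthreads" (coroutines) in ONE OS
# thread, cooperatively scheduled -- the same idea as Stackless tasklets,
# shown here with the standard asyncio library instead.

async def handle_client(client_id: int) -> str:
    # Each coroutine *looks* like blocking code, but awaiting yields
    # control back to the scheduler so other clients can run.
    await asyncio.sleep(0)          # stand-in for "wait for network IO"
    return f"client {client_id} served"

async def main() -> list:
    # 10,000 concurrent "connections" -- no OS threads, no context switches.
    tasks = [handle_client(i) for i in range(10_000)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(len(results))   # prints 10000
```

The point is that a "thread" here costs a few hundred bytes instead of a full OS stack, which is why tens of thousands of them are feasible in one process.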

Cluster to serve 1 universe

Another thing that just came to me while not related to my original problem but is still interesting:
Whenever there is an event on the client side - say you hit F1 through F8 to kill that SoB - the server gets notified of it. A lot of such events occur concurrently. One server would be unable to keep up with all that information, so they need a lot of servers in a cluster. This is actually what CCP has. I am wondering, though, how they solved the issue of dedicating a computer's core to a solar system.
There must be an entry server that deserializes the object that has been sent over to the cluster and forwards that "message" to the appropriate execution node, which in turn adds it to a "message" queue, does what needs to be done, and sends back the response.
As far as I know, CCP uses gzipped Python pickles for this purpose. I'd love a field trip to CCP's HQ. :) hehe
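The gzipped-pickle wire format mentioned here is trivial to sketch with the standard library (an illustration of the general technique only - the message fields below are invented, not CCP's actual protocol):

```python
import gzip
import pickle

# Illustrative only: serialize a "message" the way the post describes --
# pickle the object, then gzip the bytes before putting them on the wire.
message = {"call": "WarpTo", "args": (1234567890,), "session": 42}

wire_bytes = gzip.compress(pickle.dumps(message))

# Receiving side: gunzip, then unpickle.
received = pickle.loads(gzip.decompress(wire_bytes))
assert received == message
```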
scrambled
Doomheim
#2 - 2011-11-04 16:07:26 UTC
Not necessarily true, because there are different ways of handling a connection. For example, if you run Apache with the mpm_prefork module, it forks off a bunch of children; when a connection comes in, it's passed to one of the children to handle, and the parent process goes back to waiting for more connections.

Then there's the asynchronous way of dealing with things. One application sits and accepts connections, and uses either the select() call or one of the more advanced ones (epoll, kqueue, etc.) to check whether a given socket (connection) is ready for reading or writing. Ready for reading means there is some data waiting; ready for writing means we can send it data if we have any.

The asynchronous one is basically what most game servers do.

It ends up much like this:

Quote:

server_socket = set_up_listen_socket()
all_sockets = [server_socket]
while not_quitting:
    readable, _, _ = select(all_sockets, [], [], 0.0)
    for sock in readable:
        if sock is server_socket:
            all_sockets.append(accept_new_connection(sock))
        else:
            handle_data(sock)


Yes, rough sketch code :P

The thing here is that a language such as Python (Stackless or not) lets you do this relatively easily. Also, don't forget that EVE's entire server-side architecture is based heavily on RPC; if memory serves me right, the client connects to a proxy server (the one that handles the connection and bunches the raw data into packets), the proxy then sends packets to a Sol server (the actual node that runs the system), and the Sol server at that point can pass things on to the market service or other services.

A long time ago there was a nice diagram in a dev blog; not sure if the structure is still the same.

All this means, though, is that the proxy you connect to is in charge of dealing with the data. As you can imagine, that's all proxies do, so they're very fast at it - there are also a lot of them; I think the ratio of proxy servers vs. Sol servers is something like 10 proxies per Sol (if, again, memory and that diagram serve me right).
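The proxy-to-service forwarding shape described above boils down to simple dispatch-by-name. A toy sketch (the service names and packet fields are made up for illustration; this is not EVE's actual routing code):

```python
# Hypothetical sketch of "proxy forwards packet to the service that owns it".
# Each service is just a handler; the proxy only reads the routing key.
services = {
    "market": lambda args: f"market handled {args}",
    "chat":   lambda args: f"chat handled {args}",
}

def proxy_dispatch(packet: dict) -> str:
    # The proxy doesn't interpret the payload -- it only looks at the
    # service name and forwards the call to the matching handler.
    return services[packet["service"]](packet["args"])

print(proxy_dispatch({"service": "market", "args": ("buy", "Tritanium")}))
```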


TL Castiel
The Scope
Gallente Federation
#3 - 2011-11-04 16:58:32 UTC  |  Edited by: TL Castiel
Don't worry, your pseudo code was perfectly understandable.

I figured that EVE is using the latter, the async way. The only question remaining is how they maintain the FIFO ordering of things. I suppose that is the task of the proxy servers: to unmarshal or deserialize the received data and decide which request goes first?

Edit:
found a few dev blogs:
http://www.eveonline.com/devblog.asp?a=blog&bid=286
http://www.eveonline.com/devblog.asp?a=blog&bid=678
VheroKai
Vhero' Multipurpose Corp
#4 - 2011-11-04 17:05:21 UTC
You're looking in the wrong direction.
Each connection is handled separately.
More to that, http://www.eveonline.com/devblog.asp?a=blog&bid=584
TL Castiel
The Scope
Gallente Federation
#5 - 2011-11-04 17:12:36 UTC
VheroKai wrote:
You're looking in the wrong direction.
Each connection is handled separately.
More to that, http://www.eveonline.com/devblog.asp?a=blog&bid=584


That just says Stackless IO. Where do you get your information that each connection is handled separately?
Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#6 - 2011-11-04 17:26:06 UTC
I believe the source for Quake 3 has been released?

That would let you get your hands on the engine source code. (The game logic was released a long time before the engine.)

Nowhere near as complex as EVE, since it supports far fewer users, but it might help.

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter

CCP porkbelly
Pator Tech School
Minmatar Republic
#7 - 2011-11-07 10:11:44 UTC  |  Edited by: CCP porkbelly
Hullo.
You pose some good questions. As one of the architects of the EVE networking, erm, architecture, I think I can be of assistance:


  1. Connection model: There is a single client connection maintained between the client and the game server. As observed, a classic blocking model is untenable because of the sheer number of threads that would be required for this.
  2. Instead, we use asynchronous communications, with Stackless Python microthreads. Each microthread (or tasklet, as they are more correctly known) sees its communications as blocking, but behind the scenes an IO scheduler blocks it and wakes it up again when the IO is ready.
    The particular implementation we use is based on Windows IO completion ports. It does not make use of the select() system call, which would be very inefficient with thousands of connections being handled by a single process. Instead, the scheduler gets a callback on an arbitrary worker thread from the OS every time IO is ready. The scheduler then locates the tasklet that was blocked, prepares the results for it, and unblocks it.
    The end result is, from the application programmer's perspective, a blocking IO model.

  3. Synchronicity: The problem you describe is solved by careful application of Homer's law: ignore it and it goes away.
  4. Seriously, think about the problem you describe for a bit:
    "if say you hit warp and a little later you opponent in PVP hits on the scrambler."
    Realize that you and your friend are sitting in different locations. You are seeing, on your monitors, different representations of the state of the game. Even if you have synchronized atomic clock displays on your wall (ignoring for a bit relativistic effects), you might be in Essex while your friend could be in Tonga. So, according to your atomic clock, you would see everything slightly before him. But even more importantly: His percection of what you are doing is also skewed. He observes your actions through server latency, and sees them long after you initiated them, as measured by your atomic clocks. And the reverse is also true! You observe your friend's actions after the fact too. So, there is a biased symmetric delay mechanic in action between the pair of you.
    This makes any notion of simultaneity an feeble concept at best, and the phrase "a little later" becomes meaningless.

    On a different level, your intuition about IO scheduling is right, but useless, for a different reason: there are so many levels of delay and reordering present in the physical and virtual networks that it becomes pointless. Even assuming equal distances to both clients, the order of arrival of the network packets at the host hardware is completely arbitrary. What the network interface hardware then does with them is a black box. So is the operating system, which is beyond our control. Even with a blocking model, there is absolutely no guarantee that the first packet in on the wire will cause the corresponding blocked thread to be scheduled before the other. And even if it were, there is no reason why the threads in question could not be arbitrarily rescheduled by the preemptive scheduler before they could so much as acquire a critical section.
    The above should make it clear to you that there exists no inherent order between data arriving on separate TCP connections and therefore it is pointless to pretend that there is and try to somehow maintain it.

    Having said all that, though, we are careful to maintain whatever ordering is presented to our user-level program. Since we work with tasklets - explicitly scheduled microthreads - we do schedule them to run in the order that the IO callbacks from the OS are made to our application (barring, of course, preemptive thread scheduling of the already-executing callbacks). And we do go to certain lengths to ensure that the tasklets are executed in a round-robin fashion to guarantee fairness to all parties.
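The pattern described in this post - a microthread that sees blocking IO while a scheduler parks it and resumes it when a completion arrives, in arrival order - can be sketched with plain Python generators. This is a toy illustration of the idea only, not CCP's implementation (which uses Stackless tasklets and Windows IO completion ports):

```python
from collections import deque

# Toy illustration (NOT CCP's code): "tasklets" are generators that yield
# when they want IO; the scheduler parks them and resumes each one when its
# IO "completion" arrives, strictly in completion-arrival (FIFO) order.

def tasklet(name):
    data = yield f"read for {name}"   # looks like a blocking read inside
    return f"{name} got {data!r}"

class Scheduler:
    def __init__(self):
        self.ready = deque()           # (tasklet, completed-IO data) pairs
        self.results = []

    def spawn(self, gen):
        next(gen)                      # run until the tasklet blocks on IO
        return gen

    def io_completed(self, gen, data):
        # Called when the OS signals IO completion; queue the wake-up FIFO.
        self.ready.append((gen, data))

    def run(self):
        while self.ready:
            gen, data = self.ready.popleft()
            try:
                gen.send(data)         # resume the tasklet with its data
            except StopIteration as done:
                self.results.append(done.value)

sched = Scheduler()
a = sched.spawn(tasklet("A"))
b = sched.spawn(tasklet("B"))
# Completions arrive B first, then A -- tasklets resume in that order.
sched.io_completed(b, b"scram")
sched.io_completed(a, b"warp")
sched.run()
print(sched.results)   # B's tasklet finishes before A's
```

Note how the ordering guarantee is only "the order the completions were delivered to the application", exactly as the post says - nothing about which packet hit the wire first.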


Hope that clears things up a bit.
Tonto Auri
Vhero' Multipurpose Corp
#8 - 2011-11-08 00:21:02 UTC
TL Castiel wrote:
Where do you get your information from that each connection is handled separately?

From basic networking principles.
You could also google "the C10K problem". The top 3 results will give you some insight into the general issues and the different ways to deal with them.

Two most common elements in the universe are hydrogen and stupidity. -- Harlan Ellison

scrambled
Doomheim
#9 - 2011-11-18 17:10:36 UTC
Tonto Auri wrote:
TL Castiel wrote:
Where do you get your information from that each connection is handled separately?

From basic networking principles.
You could also google "the C10K problem". The top 3 results will give you some insight into the general issues and the different ways to deal with them.


If each connection were handled separately, we'd be talking about a forking daemon that spawns a child for every connection made. The C10K problem is usually solved with asynchronous IO, which means there is one single application running that handles every connection - separately, yes, in the sense that they're isolated from each other, but it's one process that does it.

I also want to clarify that my use of select() as an example is hideously old-fashioned. Most *nixes have some form of implementation that allows registering callbacks for when data is ready, and you can merrily build on that if needed.
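In Python the portable modern equivalent is the stdlib selectors module, which picks epoll, kqueue, or plain select under the hood depending on the platform. A minimal readiness-based echo loop as a sketch (not EVE code):

```python
import selectors
import socket

# Minimal readiness-based loop: the selectors module wraps
# epoll/kqueue/select depending on the platform.
sel = selectors.DefaultSelector()

server = socket.socket()
server.bind(("127.0.0.1", 0))        # ephemeral port for the demo
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ)

def serve_one_event():
    # One pass of the event loop: wait for ready sockets, handle each.
    for key, _ in sel.select(timeout=1.0):
        sock = key.fileobj
        if sock is server:
            conn, _ = sock.accept()   # new connection came in
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = sock.recv(4096)
            if data:
                sock.sendall(data)    # echo it straight back
            else:
                sel.unregister(sock)  # peer closed the connection
                sock.close()
```

To try it, connect a client to `server.getsockname()`, call `serve_one_event()` once to accept, send some bytes, and call it again to get them echoed back.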

Nari Neya
Vhero' Multipurpose Corp
#10 - 2011-11-18 18:27:50 UTC
scrambled wrote:
If each connection were handled separately, we'd be talking about a forking daemon

Your leap of logic is not understandable.
Nathan WAKE
Deep Core Mining Inc.
Caldari State
#11 - 2011-11-28 14:20:02 UTC
A late addition to this topic, but I stumbled upon this article and thought it might be of interest:

http://www.talkunafraid.co.uk/2010/01/eve-scalability-explained/

Cheers

Nathan

"I'm a very good housekeeper. Each time I get a divorce, I keep the house"

Zsa Zsa Gabor