These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/

EVE Information Portal

 
 

Dev blog: Tranquility Tech III

Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#41 - 2015-10-13 20:34:55 UTC
CCP Gun Show wrote:
Ix Method wrote:
Volcano-powered Singularity.

Yes.


we are thinking about renaming Singularity to Eyjafjallajökull Big smile

kidding



I'd just like to say: You are a large scary man Blink

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter

Haffsol
#42 - 2015-10-13 20:59:37 UTC
Quote:
[..... bla blah nerdy things....] what could possibly go wrong?!

Exactly Pirate
Bienator II
madmen of the skies
#43 - 2015-10-13 21:10:26 UTC
so you will have fewer solar system nodes but they will have more bandwidth and be better connected?

how to fix eve: 1) remove ECM 2) rename dampeners to ECM 3) add new anti-drone ewar for caldari 4) give offgrid boosters ongrid combat value

Nafensoriel
Brutor Tribe
Minmatar Republic
#44 - 2015-10-13 21:29:38 UTC
So... since your engineers have decided to stop purchasing our superior Minmatar duct tape...

Well actually that's it.. we're screwed. CCP engineers were 90% of our customer base. I guess we can start making server polish?

Seriously though, awesome. The old code's going out the door, and now the kludge hardware is too. This is an awesome day for EVE.

Though seriously.. the engineers convinced you to keep the old cluster so they could play Doom on it and have nerdgasms.. admit it.
virm pasuul
Imperial Academy
Amarr Empire
#45 - 2015-10-13 21:29:48 UTC
Bienator II wrote:
so you will have fewer solar system nodes but they will have more bandwidth and be better connected?


Hardware or software?
I think the nodes are probably virtualised, so divide the total hardware resources by the number of nodes.
Virtual infrastructure, when done properly, can be very efficient. For example, a big roaming gang hops from one node to another, but since these are virtual software nodes, if both live on the same host the net load on the underlying hardware stays unchanged even though the gang has moved node.

CCP will be able to provision new nodes and drop unused nodes automatically. Also see the load balancing presentation they did a few fanfests ago where they explained their node balancing algorithm in detail.
Moving nodes around to do hardware maintenance with virtualisation is a doddle. Nodes can be moved live from hardware host to hardware host whilst still doing active work for clients and not dropping a single packet mid move.

The hardware abstraction from virtualisation, the storage abstraction, along with all the hardware redundancy makes the setup described pretty bulletproof. The only point of failure left now is that little "feature" in the CCP automation system that no one thought could break. Amazon, Google, Microsoft, and pretty much every UK bank have all had unbreakable cloud setups break.

It is an amazing bit of kit that CCP is investing in. There's probably well over seven digits of new hardware there.

Now if only CCP could come up with multi threading server code.... :)
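The node-balancing idea mentioned above can be sketched roughly like this (a hypothetical greedy placement for illustration only; the system names and load figures are invented and this is not CCP's actual algorithm):

```python
# Hypothetical sketch: greedy placement of solar-system workloads across
# virtualised hosts. Heaviest systems are placed first so load spreads out.

def balance(systems, hosts):
    """Assign each (system, load) pair to the currently least-loaded host."""
    placement = {h: [] for h in hosts}
    load = {h: 0 for h in hosts}
    # Sort heaviest-first so big systems land on different hosts.
    for name, cost in sorted(systems, key=lambda s: -s[1]):
        target = min(hosts, key=lambda h: load[h])
        placement[target].append(name)
        load[target] += cost
    return placement, load

# Invented example loads:
systems = [("Jita", 90), ("Amarr", 40), ("Dodixie", 30), ("Rens", 25), ("Hek", 15)]
placement, load = balance(systems, ["host-a", "host-b"])
```

With live migration on top, a placement like this can be re-applied periodically and the nodes moved between hosts without dropping clients, as described above.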



CCP DeNormalized
C C P
C C P Alliance
#46 - 2015-10-13 21:40:59 UTC
Master Degree wrote:
As an IT pro, from experience I can tell you that a high-I/O SQL DB running in an MS failover cluster on VMware is not the best choice. Rather, go with SQL AlwaysOn: more storage is needed, agreed, but failovers are much easier (and much faster), and you can replicate more times, e.g. active, replica 1, replica 2, etc. You can use one of the replicas for reads and spare the active writing DB those operations. The only thing that can be a problem is switching the listener between nodes during a sudden HW crash or vMotion (MAC address conflict in VMware 5.0; hope they fix it in 6.0 when running vMotion on loaded hosts).

Eventually you could switch to Hyper-V (Core preferably, due to patching); the license is cheaper than ESX(i), but the downside is that Hyper-V is at least two releases behind VMware feature-wise (if you don't pay huge money for SCVMM).


Just my 5 cents; I assume you've done the math already :-)

PS: really nice HW, just the vendor is not one of my favorites :)


thx for the comment and info MD!

I hear you on VMware possibly not being the best choice, as there is definitely overhead involved (both in I/O resources and licensing costs!). We'll do some testing to see the impact it has, and if we don't get where we want with it, it's out! :)

In regards to AlwaysOn, we'll be using this on top of whichever route we go with for the cluster. This will be our primary replication method, both for keeping our DRS in sync and for offering live reporting services to internal users.

CCP DeNormalized - Database Administrator

CCP DeNormalized
C C P
C C P Alliance
#47 - 2015-10-13 21:49:41 UTC
Steve Ronuken wrote:
CCP Gun Show wrote:
Ix Method wrote:
Volcano-powered Singularity.

Yes.


we are thinking about renaming Singularity to Eyjafjallajökull Big smile

kidding



I'd just like to say: You are a large scary man Blink


This doesn't become really, really true until you've spent two days of heavy drinking in the middle of the Icelandic wilderness with the man...

"Don't wake the Balrog!" Is a slogan we force all new Operations team members to learn very early on :)

Ops Offsite best offsite!

CCP DeNormalized - Database Administrator

Gospadin
Bastard Children of Poinen
#48 - 2015-10-13 21:52:27 UTC
I'm shocked that a system designed to deploy in 2016 is even using rotating drives. That data must be REALLY cold.
TigerXtrm
KarmaFleet
Goonswarm Federation
#49 - 2015-10-13 22:03:50 UTC
No worries people. EVE is still dying on schedule. That's why they are pumping I don't even know how many hundreds of thousands of dollars into new server hardware. Because if it's going to die, it's going to die in style Cool

My YouTube Channel - EVE Tutorials & other game related things!

My Website - Blogs, Livestreams & Forums

xrev
Brutor Tribe
Minmatar Republic
#50 - 2015-10-13 22:10:42 UTC
Gospadin wrote:
I'm shocked that a system designed to deploy in 2016 is even using rotating drives. That data must be REALLY cold.

It's called auto-tiering. The hot storage blocks reside on the fast SSDs or the internal read cache. When blocks of data aren't touched, they move to slower disks, which are still more cost-effective if you look at capacity per buck. Compared to SSDs, hard disks suck at random I/O, but serial streams do just fine.
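The tiering decision described above boils down to something like this (a hypothetical sketch; the threshold, tier names, and block IDs are invented for illustration):

```python
# Hypothetical sketch of auto-tiering: blocks not read recently are demoted
# from the SSD tier to spinning disk; recently-touched HDD blocks are promoted.

COLD_AFTER = 3600  # seconds without access before a block counts as cold

def retier(blocks, now):
    """blocks: dict of block_id -> (tier, last_access_ts). Returns new dict."""
    out = {}
    for block_id, (tier, last_access) in blocks.items():
        if tier == "ssd" and now - last_access > COLD_AFTER:
            out[block_id] = ("hdd", last_access)   # demote cold block
        elif tier == "hdd" and now - last_access <= COLD_AFTER:
            out[block_id] = ("ssd", last_access)   # promote hot block
        else:
            out[block_id] = (tier, last_access)    # leave it where it is
    return out

now = 10_000
blocks = {"a": ("ssd", 9_500), "b": ("ssd", 1_000), "c": ("hdd", 9_900)}
blocks = retier(blocks, now)
```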
Bienator II
madmen of the skies
#51 - 2015-10-13 22:30:22 UTC  |  Edited by: Bienator II
virm pasuul wrote:
Bienator II wrote:
so you will have fewer solar system nodes but they will have more bandwidth and be better connected?


Hardware or software?

http://i.imgur.com/xCjjFc9.png



virm pasuul wrote:

Now if only CCP could come up with multi threading server code.... :)


EVE has 8k solar systems or so, which means there will be over 100 solar systems per physical server node. So parallelism is already possible without the actual server code being multithreaded. That's probably why CCP seems to see MT as low priority atm.

how to fix eve: 1) remove ECM 2) rename dampeners to ECM 3) add new anti-drone ewar for caldari 4) give offgrid boosters ongrid combat value

Cor'len
Doomheim
#52 - 2015-10-13 23:13:39 UTC
Bienator II wrote:
That's probably why CCP seems to see MT as low priority atm.


Actually, CCP would love to multithread the ~space code~ (can't remember the component name, haha). But it's practically impossible to get a consistent result; operations must be done in sequence, otherwise you get dead ships killing living ships, and other ~exciting~ edge cases.

This is the ultimate limiter on EVE performance. They might conceivably be able to MT the processing of different grids in a single system, but everything that happens on a single grid must execute in a deterministic fashion, and in the correct order.

Plus, even if that wasn't a problem, they run Stackless Python, with the beloved global interpreter lock which effectively prevents multithreading.


tl;dr CCP wants to multithread all the things, but it's so hard it's bordering on impossible. Hence the effort to not have big fights.
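The ordering constraint described above can be illustrated with a toy resolver (hypothetical; the event shape and ship names are invented). Events are applied in one deterministic sequence, so a ship that died earlier in the sequence can't keep dealing damage, which is exactly what breaks if you process them concurrently:

```python
# Hypothetical sketch of the ordering constraint: combat events must be
# applied in a single deterministic sequence, because the outcome of one
# event (a ship dying) changes the validity of later ones.

def resolve(events, hp):
    """events: list of (timestamp, attacker, target, damage). hp: ship -> hit points."""
    for ts, attacker, target, dmg in sorted(events):  # deterministic order
        if hp.get(attacker, 0) <= 0:
            continue  # a dead ship can't keep shooting
        hp[target] = hp.get(target, 0) - dmg
    return hp

hp = {"rifter": 100, "merlin": 30}
events = [
    (2, "merlin", "rifter", 50),  # fired after the merlin died: must be ignored
    (1, "rifter", "merlin", 40),  # kills the merlin first
]
hp = resolve(events, hp)
```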
Gospadin
Bastard Children of Poinen
#53 - 2015-10-13 23:16:18 UTC
xrev wrote:
Gospadin wrote:
I'm shocked that a system designed to deploy in 2016 is even using rotating drives. That data must be REALLY cold.

It's called auto-tiering. The hot storage blocks reside on the fast SSDs or the internal read cache. When blocks of data aren't touched, they move to slower disks, which are still more cost-effective if you look at capacity per buck. Compared to SSDs, hard disks suck at random I/O, but serial streams do just fine.


I know how it works.

It's just interesting to me that TQ's cold data store is satisfied with about 10K IOPS across those disk arrays. (Assuming 200/disk for 10K SAS and about 50% utilization given their expected multipath setup and/or redundancy/parity overhead)
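For what it's worth, the arithmetic behind that estimate, using the poster's own assumptions (~200 IOPS per 10K SAS drive, ~50% usable after multipath and redundancy/parity overhead; the drive count is a guess for illustration):

```python
# Back-of-the-envelope IOPS estimate using the assumptions stated above.

iops_per_disk = 200      # typical 10K RPM SAS figure
disks = 100              # assumed array size (illustrative)
utilization = 0.5        # overhead for redundancy/parity and multipath

usable_iops = iops_per_disk * disks * utilization
print(usable_iops)  # 10000.0, roughly the ~10K IOPS quoted above
```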
Bienator II
madmen of the skies
#54 - 2015-10-13 23:54:02 UTC
Cor'len wrote:
Bienator II wrote:
That's probably why CCP seems to see MT as low priority atm.


Actually, CCP would love to multithread the ~space code~ (can't remember the component name, haha). But it's practically impossible to get a consistent result; operations must be done in sequence, otherwise you get dead ships killing living ships, and other ~exciting~ edge cases.

Splitting tasks up is only one way of achieving parallelism. You can also distribute sequential tasks across different compute hardware via pipelining/layering etc.

But the thing is, CCP doesn't have to do that, since they can already get parallelism by simply running multiple processes on the same node. Again: they are running 100+ systems on a single node. All they have to do is run them in N processes instead of 1. (Would not surprise me if they run every system in its own process, tbh.)


Multithreading would only help in the worst-case scenario: the whole EVE population in the same system.
But according to CCP even that isn't certain, since the bottleneck seems to be memory bandwidth, not computing power.

how to fix eve: 1) remove ECM 2) rename dampeners to ECM 3) add new anti-drone ewar for caldari 4) give offgrid boosters ongrid combat value

Berahk
Lightweight Dynamics
#55 - 2015-10-14 00:13:40 UTC
So, a few questions

How much closer does this server setup bring us to never needing downtime?

Also

How much closer are we to being able to fail over a tremendously busy system onto one of the combat nodes without having to wait until the following downtime? (Or booking it in advance.)

Thanks


/b
Mara Rinn
Cosmic Goo Convertor
#56 - 2015-10-14 00:56:01 UTC
Berahk wrote:
How much closer does this server setup bring us to never needing downtime?


Most important question in the thread :D

http://community.eveonline.com/news/dev-blogs/death-to-downtimes/
Alundil
Rolled Out
#57 - 2015-10-14 01:45:01 UTC  |  Edited by: Alundil
Raphendyr Nardieu wrote:
OMG, amazing blog. Nice that you added so many specifics.

I hope you get the virtualization working. Would provide nice benefits :)

Came to say this. Excellent article. vMotion on terrific hardware is sweet, sweet, sweet. We use this in our 20,000-user environment to great effect.

Keep up the great work.

I'm right behind you

Shamwow Hookerbeater
Nine Inch Ninja Corp
#58 - 2015-10-14 03:26:38 UTC  |  Edited by: Shamwow Hookerbeater
Gospadin wrote:
xrev wrote:
Gospadin wrote:
I'm shocked that a system designed to deploy in 2016 is even using rotating drives. That data must be REALLY cold.

It's called auto-tiering. The hot storage blocks reside on the fast SSDs or the internal read cache. When blocks of data aren't touched, they move to slower disks, which are still more cost-effective if you look at capacity per buck. Compared to SSDs, hard disks suck at random I/O, but serial streams do just fine.


I know how it works.

It's just interesting to me that TQ's cold data store is satisfied with about 10K IOPS across those disk arrays. (Assuming 200/disk for 10K SAS and about 50% utilization given their expected multipath setup and/or redundancy/parity overhead)


Kinda funny in a way: at my last company we had some rather beefy 7420 ZFS appliances with RAM/SSD/15K disks, and we weren't happy when we were only getting approx 50-60K IOPS from pure disk operations across multiple pools. We could hit 200K+ on things that were cached... but we only needed that performance for some edge cases of ours. Then we tested an AFF on our extreme edge cases... and were like, crap, why didn't these things get cheaper faster?

The AFF was vastly faster than our 7420s in most cases, especially anything approaching high levels of random I/O (not surprising). It was so stark that a moderately powered VM (4 or 8 vCPUs and ~64 GB) was beating our 24-core 196 GB physical boxes in total transactions when running things like HammerOra.
Bienator II
madmen of the skies
#59 - 2015-10-14 04:56:12 UTC
Mara Rinn wrote:
Berahk wrote:
How much closer does this server setup bring us to never needing downtime?


Most important question in the thread :D

http://community.eveonline.com/news/dev-blogs/death-to-downtimes/


having DT only every second day would be a start :P

how to fix eve: 1) remove ECM 2) rename dampeners to ECM 3) add new anti-drone ewar for caldari 4) give offgrid boosters ongrid combat value

Raiz Nhell
State War Academy
Caldari State
#60 - 2015-10-14 05:21:32 UTC
Amazing stuff...

Wish I could convince the boss that we need a 10th of this stuff...

Keep up the good work...

P.S. Would like to see photos of Sisi's Volcano powered lair :)

There is no such thing as a fair fight...

If you're fighting fair, you have automatically put yourself at a disadvantage.