
Azaqui
|
Posted - 2007.09.11 23:19:00 -
[1]
This is of course a rather straightforward idea, but - as I understand it - right now every server in the cluster is responsible for a single star system.
Those star systems that are heavily populated are also rather heavily lagged.
Therefore, would it be possible to create a server buffer - say, 10 servers not assigned to any system - that would dynamically check the workload of the system servers and simply "help" (share the workload of) the overloaded ones, like the "top 10 with most ppl on"?
Of course I understand that the border areas would be a bit problematic (with the jump gates it's pretty obvious) - but I would say that relocating all the deadspace mission areas to one server (while of course sharing the data needed for system scans etc.) would lessen the workload and reduce lag tremendously.
On a side note - what got me into EVE was the idea of epic fleet engagements, like 300 ships on each side. Of course with the current server structure this seems rather... difficult to achieve. It seems similar to an RL problem that has already been solved: air traffic control. Maybe the solutions that worked for air traffic could also be adapted to the EVE structure (control zones, overlapping, handing a flight over from control point to control point, continuously, etc.).
Would love to have a reply from a dev who can actually tell me what is silly and what is not - maybe there is a solution to the growing problem of lagged systems?
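Roughly, in a toy Python sketch (all the names and numbers here are invented for illustration - this has nothing to do with CCP's actual server code):

```python
# Toy sketch of the "server buffer" idea: each tick, rank systems by player
# count and hand the busiest overloaded ones to idle standby nodes.
# SystemLoad, Node, the threshold - all invented for illustration.
from dataclasses import dataclass, field

@dataclass
class SystemLoad:
    name: str
    players: int

@dataclass
class Node:
    node_id: int
    systems: list = field(default_factory=list)

def rebalance(systems, standby_pool, top_n=10, threshold=500):
    """Give each overloaded system in the top_n a dedicated standby node."""
    hot = sorted(systems, key=lambda s: s.players, reverse=True)[:top_n]
    assignments = []
    for system in hot:
        if system.players >= threshold and standby_pool:
            node = standby_pool.pop()         # grab an idle node from the buffer
            node.systems.append(system.name)  # it now "helps" with this system
            assignments.append((system.name, node.node_id))
    return assignments

if __name__ == "__main__":
    loads = [SystemLoad("Jita", 900), SystemLoad("Motsu", 620), SystemLoad("Hek", 180)]
    standby = [Node(101), Node(102)]
    print(rebalance(loads, standby))   # [('Jita', 102), ('Motsu', 101)]
```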
|

ElfeGER
Black Eclipse Corp Band of Brothers
|
Posted - 2007.09.11 23:48:00 -
[2]
A solar system (or group of systems) can only switch server, aka node, when it is restarted (everyone in that system gets disconnected).
A solution for 0.0/low-sec could be to run the systems with big lag potential (fleet battles) as single processes (like Jita and others) on massive multi-core blades (8 or 16 cores). These 50-100 systems would then be distributed by the OS in real time, and if one system has a big fight coming, a single core would be available for that system alone while the other systems share the remaining cores. 32 GB should do the trick.
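To make the "one core for the hot system" part concrete, a toy sketch using Linux CPU affinity (os.sched_setaffinity is Linux-only, and the one-process-per-system layout and the pids are my assumptions, not how the cluster actually works):

```python
# Toy sketch: pin the node process of a system expecting a big fight to its
# own core, and restrict all other node processes to the remaining cores.
# Linux-only; the pid-per-system bookkeeping is invented for illustration.
import os

def dedicate_core(hot_pid, other_pids, reserved_core=0):
    all_cores = set(range(os.cpu_count()))
    os.sched_setaffinity(hot_pid, {reserved_core})   # fight system gets core 0
    for pid in other_pids:
        os.sched_setaffinity(pid, all_cores - {reserved_core})

if __name__ == "__main__":
    # pid 0 means "the calling process" - pin ourselves as a harmless demo.
    os.sched_setaffinity(0, {0})
    print(os.sched_getaffinity(0))   # {0}
```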
|

Mr Billybob
Caldari Rampage Eternal Ka-Tet
|
Posted - 2007.09.12 06:35:00 -
[3]
They could keep a few servers in standby (not assigned to anything/any system); then, when there is heavy load somewhere, a standby server kicks in. -------------------------------------------- grrrrr |

Azaqui
|
Posted - 2007.09.12 08:23:00 -
[4]
Of course the question is: what is the bottleneck - processing power or network bandwidth?
If it's the processing power, then creating "helping hand" servers that dynamically take some of the workload off the overloaded ones is a valid idea.
If it's the bandwidth (imagine two fleets, 300 players each, with all those events like enabling/disabling modules, targeting, changing course - a hefty load I'd say), then the idea of a helping hand server is still valid, but it requires some kind of emergency protocol for inter-server comms (as in: the helping hand server takes 200 players and processes the events created by them, supplying just the final data, not the individual events, to the overloaded system server which governs the battle).
Sounds feasible? :)
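The aggregation part could be as dumb as this Python sketch - the event fields and the message shape are invented for illustration:

```python
# Sketch of "supply just the final data, not individual events": the helping
# hand server folds its players' per-module fire events into one summary per
# target per tick before talking to the system server.
from collections import defaultdict

def aggregate_fire_events(events):
    """events: iterable of (target_id, missiles, damage) from individual players."""
    summary = defaultdict(lambda: {"missiles": 0, "damage": 0.0})
    for target_id, missiles, damage in events:
        summary[target_id]["missiles"] += missiles
        summary[target_id]["damage"] += damage
    return dict(summary)   # one compact record per target, not one packet per event

if __name__ == "__main__":
    tick = [(1, 5, 600.0), (1, 7, 840.0), (2, 4, 480.0)]
    print(aggregate_fire_events(tick))
    # {1: {'missiles': 12, 'damage': 1440.0}, 2: {'missiles': 4, 'damage': 480.0}}
```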
|

Mesoholy
|
Posted - 2007.09.12 09:11:00 -
[5]
Whilst I don't know much about the structure of the cluster, I would assume that a solution to this would require a rewrite of some of the server software.
The problem with something like EvE is that it is not a fully predictable system. Sure, traffic in systems can be predicted based on past events so that's ok. The problem comes when you have unexpected events such as raids or perhaps fleet battles.
If there is a sudden convergence of pilots on one system which is usually pretty quiet, the server is going to slow down and build up a queue.
Now as far as a solution is concerned, it rather depends on how CCP feel about where they put their resources.
They could invest in massive blade systems, but massive blade systems cost massive amounts of cash. This solution would also leave your massively cool and massively expensive blade system running at a third of capacity for 99% of its operational life.
This isn't really a very elegant solution and it kinda sucks imho. It's like sending in a steamroller to make cookies when all you need is a rolling pin (replace with a similar EVE analogy).
A much better and probably much cheaper solution would be to redesign the server software so that it supported "System Transfer" which would be an operation where a system is transferred from one normal server onto a big multicore blade server if a population surge is detected.
Now this operation would take a few seconds so unless someone at CCP is really clever ( :-) not saying you guys aren't smart, it's just a nice problem) and comes up with a smart solution for this, the system will have to be suspended for the transfer.
This could be explained in the story by saying that when gates are suddenly subjected to heavy use, they need to draw more power from subspace or something and create ripples in the space time continuum (who watches star trek?)
As far as I understand it this would be a pretty elegant solution if well implemented and the whole subspace rubbish could be said to affect a large portion of the galaxy and so the transfer could encompass say 100 systems based on some kind of predictive algorithm which guesses where the nastiness is likely to happen. This way, the battle itself wouldn't even be interrupted and all the transfers and interruption could take place before a shot was fired.
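The predictive part could start out as simple as this Python sketch - the window size, surge ratio and minimum population are numbers I made up:

```python
# Sketch of a surge detector for the proposed "System Transfer": flag a
# system whose population climbs well above its recent baseline, so it can
# be migrated to a beefier node before the fight starts.
from collections import deque

class SurgeDetector:
    def __init__(self, window=60, surge_ratio=3.0, min_players=50):
        self.history = deque(maxlen=window)   # recent population samples
        self.surge_ratio = surge_ratio
        self.min_players = min_players

    def observe(self, players):
        """Return True when this sample looks like a surge worth a transfer."""
        baseline = sum(self.history) / len(self.history) if self.history else players
        self.history.append(players)
        return players >= self.min_players and players > baseline * self.surge_ratio

if __name__ == "__main__":
    d = SurgeDetector()
    for sample in [10, 12, 11, 9, 10, 180]:   # quiet system, then a fleet arrives
        if d.observe(sample):
            print(f"surge at {sample} pilots - schedule a System Transfer")
```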
Anyway, that was my two pence.
|

CCP Lingorm

|
Posted - 2007.09.12 11:44:00 -
[6]
I know this seems like a very simple idea, but it is not.
Currently all services for a solar system must be on a single 'node', and a node cannot 'span' multiple CPUs or multiple cores. Most nodes run multiple solar systems (certain large solar systems get their own nodes: Jita, Salia etc.).
So the current limit is that a single node can only handle a certain amount of load, and there is no way to split that load if it all comes from one solar system.
We are investigating options for our next upgrade to remove this bottleneck, but it is not a trivial code change. It will be part of our investigations into InfiniBand and true cluster technology.
CCP Lingorm CCP Quality Assurance QA Engineering Team Leader
|

Claska
Amarr Rising Tibetan Star Knights Of the Southerncross
|
Posted - 2007.09.12 13:45:00 -
[7]
good luck with that, gonna be a nightmare to work it out.
|

UberL0rd
Minmatar Brotherhood of Polar Equation Mordus Angels
|
Posted - 2007.09.12 14:06:00 -
[8]
Originally by: CCP Lingorm It will be part of our investigations into InfiniBand and true cluster technology.
I dunno if you can use their tech or not, but look up these guys http://ainkaboot.co.uk/
The owner gave a talk at the past two Gentoo/UK conferences, and at the last one he demoed a working prototype of a 3D image processing cluster. Now you're not going to use image processing, but all the software used is OSS. He also gave a live demo on how to adapt a simple number cruncher from uniprocessor to threads to MPI for clusters. --- Gentoo/FreeBSD/Linux developer |

Tonto Auri
|
Posted - 2007.09.12 14:17:00 -
[9]
There's one additional small idea behind these "simple" solutions. EVE consists of at least three different parts:
1. Global navigation. No, not interstellar travel, but the movement of "static" objects: moons, roids, planets, starbases and whole star systems.
2. Local navigation, which controls ship flight and interaction.
3. The market and contracts system.
Of course, there may be some additional things, like drone control modules and player connectivity trunks.
If we split each module onto a separate server and differentiate the load, it could look like this:
1. A connectivity proxy receives players and connects them to the local navigation, market and drone/starbase AI servers.
2. Local navigation is connected to the proxy (players), global navigation and the AI server.
3. The AI server is connected to the proxy to receive orders and to local navigation to actually interact with the world.
4. The market server is connected to the proxy to trade.
This system might be much more stable and easier to monitor, I guess.
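In Python-flavoured pseudocode the wiring could look like this (all class and method names are invented, just to show the structure):

```python
# Structural sketch of the proposed split: each "server" is a logical service
# and the connectivity proxy routes player requests to the right one.
class MarketService:
    def place_order(self, player, order):
        return f"{player}: order accepted"

class LocalNavService:
    def move_ship(self, player, destination):
        return f"{player}: warping to {destination}"

class AIService:
    def command_drones(self, player, order):
        return f"{player}: drones {order}"

class ConnectivityProxy:
    """Single player-facing endpoint; fans requests out to the services."""
    def __init__(self):
        self.market = MarketService()
        self.nav = LocalNavService()
        self.ai = AIService()

    def handle(self, player, kind, payload):
        route = {"trade": self.market.place_order,
                 "move": self.nav.move_ship,
                 "drones": self.ai.command_drones}
        return route[kind](player, payload)

if __name__ == "__main__":
    proxy = ConnectivityProxy()
    print(proxy.handle("Azaqui", "move", "Jita IV - Moon 4"))
```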
As a note, "server" here does not actually mean separate hardware/software, but an idea of how to organise the structure. -- Thanks CCP for cu |

Grawshellar
|
Posted - 2007.09.12 15:32:00 -
[10]
Originally by: CCP Lingorm I know this seems like a very simple idea, but it is not.
Currently all services for a solar system must be on a single 'node', and a node cannot 'span' multiple CPUs or multiple cores. Most nodes run multiple solar systems (certain large solar systems get their own nodes: Jita, Salia etc.).
So the current limit is that a single node can only handle a certain amount of load, and there is no way to split that load if it all comes from one solar system.
We are investigating options for our next upgrade to remove this bottleneck, but it is not a trivial code change. It will be part of our investigations into InfiniBand and true cluster technology.
Out of curiosity, why is this limitation in place? Is it a result of the current overall design of the server code, or is there a hard and fast reason why a node cannot 'span' multiple CPUs or multiple cores?
|

Miranda Duvall
Gallente OPM Holdings
|
Posted - 2007.09.12 16:48:00 -
[11]
Edited by: Miranda Duvall on 12/09/2007 16:50:08 VMware ESX has a feature called VMotion where you can move a virtual "guest" OS from one physical host to another without skipping a beat: no restarts, no packet loss, no nothing, maybe 1 or 2 retransmits...
I've seen a live Windows Media server (guest) being "vmotioned" while, on another screen, watching a video stream served by that virtual server. The stream continued perfectly...
Perhaps an ESX cluster could be made where lots of virtual guest OSes host a few solar systems each, and when one guest gets sudden load, it could be vmotioned to another physical box with more power that has no other guest OSes on it yet.
My Skills -Invention HowTo |

Azaqui
|
Posted - 2007.09.12 16:48:00 -
[12]
This problem seems similar to a complex particle simulation - as in, when simulating a fluid that consists of many particles, there are some limitations:
- to compute the state at time t, you need complete knowledge of the state at time t-1;
- it is not easily split across independent CPUs, as the particles influence each other.
Commercial particle simulation packages like RealFlow do take advantage of multi-core systems, so obviously this can be done, probably by dividing the area into 3D cells, computing each cell separately, and then computing just the meta-interaction between the cells.
Translated to EVE - a rough example: CPU1 controls one fleet, CPU2 controls the other. All the individual module activations, boosters, skills etc. get computed by the appropriate CPU. CPU1 sends just the final, computed data product (as in: 20 missiles, velocity 300, explosion radius 600, are impacting target #1 - after summing up all the shooters, skills etc.).
CPU2 then sends its relevant computed data product (target #1 has signature radius 125, has 4000 shields and speed 200 - after taking all the gang modules and remote boosting into account).
The solar system CPU just takes the two data products, makes the ruling, and distributes it to both "helping hand" CPUs.
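As a toy Python sketch (the damage model and all the numbers are invented - this is just the shape of the exchange):

```python
# Sketch of the two-helper exchange: each CPU reduces its fleet's state to a
# compact "data product", and the system node rules on the pair. Not EVE's
# actual combat mechanics - invented fields and formulas for illustration.
def attacker_product(shooters):
    """CPU1: sum all missile volleys after per-pilot skills are applied."""
    return {
        "missiles": sum(s["missiles"] for s in shooters),
        "damage": sum(s["missiles"] * s["dmg_per_missile"] for s in shooters),
    }

def defender_product(target, gang_bonus=1.2):
    """CPU2: fold gang links and remote boosts into one defensive summary."""
    return {"shields": target["shields"] * gang_bonus, "sig_radius": target["sig"]}

def system_ruling(atk, dfn):
    """System node: apply the attack product to the defence product."""
    remaining = max(0.0, dfn["shields"] - atk["damage"])
    return {"shields_left": remaining, "shields_stripped": remaining == 0.0}

if __name__ == "__main__":
    atk = attacker_product([{"missiles": 12, "dmg_per_missile": 120.0},
                            {"missiles": 8, "dmg_per_missile": 150.0}])
    dfn = defender_product({"shields": 4000.0, "sig": 125.0})
    print(system_ruling(atk, dfn))   # {'shields_left': 2160.0, ...}
```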
I know that this is easily said and very difficult to do - because the EVE code is evolutionary, and this would involve a massive code rewrite...
Nevertheless, given the decision to run just a single global server (very brave, and very unique - again, praise for that!), this seems to be the greatest danger at the moment.
As with most things, EVE will reach a critical mass soon. All MMOs start from a low player base and then reach a threshold where either some players lose interest (and the MMO dies) or quite the opposite: the players start talking their friends into playing.
The latter scenario is happening now - EVE is growing rapidly. I think that sooner or later distributed, scalable game server code will simply be a life-or-death matter for EVE.
I hope that the discussion sparked by this topic will help :)
|

CCP Lingorm

|
Posted - 2007.09.12 16:57:00 -
[13]
Originally by: Grawshellar
Originally by: CCP Lingorm I know this seems like a very simple idea, but it is not.
Currently all services for a solar system must be on a single 'node', and a node cannot 'span' multiple CPUs or multiple cores. Most nodes run multiple solar systems (certain large solar systems get their own nodes: Jita, Salia etc.).
So the current limit is that a single node can only handle a certain amount of load, and there is no way to split that load if it all comes from one solar system.
We are investigating options for our next upgrade to remove this bottleneck, but it is not a trivial code change. It will be part of our investigations into InfiniBand and true cluster technology.
Out of curiosity, why is this limitation in place? Is it a result of the current overall design of the server code, or is there a hard and fast reason why a node cannot 'span' multiple CPUs or multiple cores?
I think this was mentioned in a dev blog or something, but from memory it goes like this.
When EVE was conceived, it was envisioned that the 'cluster' would be globally dispersed to reduce lag.
Multi-core CPUs were not really considered, and multi-CPU machines were big server iron. The expectation was that the speed and power of a single CPU would continue to increase as it had previously.
This of course did not all go to plan. Moore's Law still holds, but it holds for the number of transistors, not for clock speed: the heat generated by faster CPUs proved too much, and Intel and AMD went multi-core for further development.
So we have not seen a significant increase in the 'power' of a single core in recent years; instead we have seen the addition of extra cores. Thus the 'power' of an EVE node has not increased, and hence the limits we are experiencing in the number of people a node can support.
This means that some of our base assumptions are now false. It is not something that could easily have been predicted, but we are now working to adapt to the change and move forward.
As mentioned, we are looking at other technologies that will help us break these limits, but they are major changes to very low-level parts of our game architecture. This means that the impact on the game is critical, and it needs extensive planning and testing.
CCP Lingorm CCP Quality Assurance QA Engineering Team Leader
|

Mr Billybob
Caldari Rampage Eternal Ka-Tet
|
Posted - 2007.09.12 21:27:00 -
[14]
So a node is a CPU core, not a CPU? -------------------------------------------- grrrrr |

ElfeGER
Black Eclipse Corp Band of Brothers
|
Posted - 2007.09.12 23:46:00 -
[15]
Running something like 8 nodes on a blade with 8 cores is boring.
Running 32 nodes, with a quarter of the solar systems per node, would allow balancing on the blade itself, reduce the affected areas, and improve performance for the lagging systems, as fewer other systems run on each node.
This could be bumped up to one solar system per node, with the node processes then balanced by the OS to wherever CPU time is available.
|

Ilor Prophet
|
Posted - 2007.09.17 01:30:00 -
[16]
In my mind, the best way to ameliorate the problem is by changing the granularity from a system level to a grid level, though as Lingorm says, this is a major overhaul of the code. Not something to be taken lightly, and I'm actually impressed that CCP is even considering it.
|

solbright altaltaltalt
|
Posted - 2007.09.17 03:25:00 -
[17]
Originally by: Ilor Prophet In my mind, the best way to ameliorate the problem is by changing the granularity from a system level to a grid level
It won't help for fleet work, which is where it's needed most. Fleet combat all happens on a single grid, so it can't be fixed in this manner.
Jita is okay apart from the odd stuck-ship syndrome, which is probably just a bug. Places like Motsu can be spread out with some fine-tuning of where spawns are placed, so they don't really need finer CPU granularity.
Going for grid-level granularity won't really help at all, imho.
More optimising will help. Maybe the segmenting can be population-based rather than spatial. If there is a way to make a single node run multi-processor, then that will be the real winner.
|

LTcyberT1000
Caldari LDK
|
Posted - 2007.09.27 01:29:00 -
[18]
Originally by: CCP Lingorm So we have not seen a significant increase in the 'power' of a single core in recent years; instead we have seen the addition of extra cores. Thus the 'power' of an EVE node has not increased, and hence the limits we are experiencing in the number of people a node can support.
At this point there are two ways: code single-process load balancing over the entire cluster on the EVE server side, or leave that to the operating system.
On the OS side there are already projects which allow a program written for one CPU to be split over an entire cluster. This might be interesting for CCP: http://openssi.eu/openssi_features.html That project takes a single process and makes it run over an entire cluster as if on one big virtual computer.
---- T-1000, the old school gamer, started with 8286 machine, 11 years so far for playing games. ******************************************** Skill level: Freelancer Wolf in Moon day :) ******* |

solbright altalt
State War Academy
|
Posted - 2007.09.27 02:14:00 -
[19]
Originally by: LTcyberT1000 At this point there are two ways: code single-process load balancing over the entire cluster on the EVE server side, or leave that to the operating system.
To solve EVE's issue you would have to split a single OS thread across processors. That will never happen, so EVE has to change instead.
Quote: From OS side there is already projects which allows program (which runs for 1 CPU) to split over entire cluster. This might be interesting for CCP: http://openssi.eu/openssi_features.html That project takes single proccess and makes it run over entire cluster like just in 1 big virtual computer 
It won't directly help. Those features are for more transparent network IPC, and EVE obviously already has well-developed IPC between nodes.
|

Zaenar
|
Posted - 2007.09.27 09:47:00 -
[20]
IMHO, nodes need to handle players rather than star systems. Say one node handles one fleet (256 members max, as they are today), acts as an aggregator for fleet-to-fleet comms (sending fixed arrays of vectors representing player activity), and handles the game mechanics. It should be more scalable, since each node would deal with a known and limited load. Should "matrix-like" game mechanics handling be feasible, it might even benefit from GPGPU technologies.
Sure, this is a very rough idea that needs much more thinking than this, but dealing with "epic fleet battles" doesn't seem to fit with the current star-system-centric nodes.
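For example, the per-tick fleet message could be a fixed-size buffer like this Python sketch (the field layout is invented for illustration):

```python
# Sketch of the fleet-node idea: each node owns one fleet (capped at 256
# members) and exchanges a fixed-size array of state vectors per tick, so
# the inter-node message size is bounded however big the fight gets.
import struct

MAX_FLEET = 256
SLOT = struct.Struct("<I6f")   # pilot id, position xyz, velocity xyz

def pack_fleet_state(members):
    """Serialise up to MAX_FLEET member states into one fixed-size buffer."""
    buf = bytearray(SLOT.size * MAX_FLEET)   # unused slots stay zeroed
    for i, m in enumerate(members[:MAX_FLEET]):
        SLOT.pack_into(buf, i * SLOT.size, m["id"], *m["pos"], *m["vel"])
    return bytes(buf)

if __name__ == "__main__":
    fleet = [{"id": 7, "pos": (1.0, 2.0, 3.0), "vel": (0.0, 0.0, 10.0)}]
    wire = pack_fleet_state(fleet)
    print(len(wire))   # always 256 * 28 bytes, however many pilots are present
```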
This idea might be clever or incredibly stupid, only CCP devs will know though, so please no need to flame.  |

Tairon Usaro
The X-Trading Company Mostly Harmless
|
Posted - 2007.09.27 12:31:00 -
[21]
Originally by: CCP Lingorm ... When EVE was conceived, it was envisioned that the 'cluster' would be globally dispersed to reduce lag ...
Interesting insights into EVE's core game design considerations!
Step 0 (without hardware changes) would be to design an in-game procedure which enables a quasi on-the-fly switch of a system from a node covering multiple systems to a dedicated node. If this requires a node to be shut down, one could think of a shutdown-and-restart procedure which interrupts PvP engagements in a defined manner - for example, system-wide ECM bursts 5 minutes prior to shutdown and 5 minutes after restart, disrupting targeting. This does not solve issues arising from players entering a system, or eventual gate engagements, but it would be a nice help for large fleets doing POS warfare.
Step 1: a true on-the-fly node switch. If I have to deal with multiple seconds or minutes of server lag, I will not mind if a 5-second context/node switch to a dedicated node solves the issue completely.
Step 2: multiple CPUs for one system, with a context switch on grid change. This only solves problems like Jita but won't help with fleet blobbing, since everybody is on one grid.
Step 3: multiple CPUs for one grid.
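Step 0 in a toy Python sketch (the node API and the state format are invented; the suspend window is where the ECM burst would go):

```python
# Minimal sketch of the step-0 handover: pause a system, serialise its state,
# load it on the dedicated node, and resume. Everything here (the Node API,
# the state dict) is invented to illustrate the sequence only.
import pickle, time

class Node:
    def __init__(self, name):
        self.name = name
        self.systems = {}

    def suspend(self, system):
        """Freeze the system and return its serialised state."""
        return pickle.dumps(self.systems.pop(system))

    def resume(self, system, blob):
        """Load the serialised state and start ticking again."""
        self.systems[system] = pickle.loads(blob)

def handover(system, src, dst):
    t0 = time.monotonic()
    blob = src.suspend(system)    # players sit out the "ECM burst" window here
    dst.resume(system, blob)
    return time.monotonic() - t0  # downtime the switch actually cost

if __name__ == "__main__":
    shared, dedicated = Node("shared-07"), Node("dedicated-01")
    shared.systems["C3N-3S"] = {"pilots": 412, "pos_structures": 9}
    print(f"switched in {handover('C3N-3S', shared, dedicated):.4f}s")
```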
________________________________________________ Some days i loose, some days the others win ...
|