Pages: 1 [2] 3 4 5 :: one page |
|
Author |
Thread Statistics | Show CCP posts - 14 post(s) |
gtiness
Sick Tight Controlled Chaos
|
Posted - 2011.03.24 18:10:00 -
[31]
Edited by: gtiness on 24/03/2011 18:10:39 What is the size of the EVE database?
Edit: to properly snipe page2.
|
theocratis
|
Posted - 2011.03.24 18:11:00 -
[32]
Originally by: Ban Doga Edited by: Ban Doga on 24/03/2011 18:02:45
Originally by: CCP Yokai
Originally by: Ban Doga Edited by: Ban Doga on 24/03/2011 17:46:25 I remember that someone (CCP Explorer ?) said not so long ago, that the DB is not anywhere near its performance limit. So I'm really looking forward to seeing no performance improvements from this one.
I'm happy that you are happy, tho...
Check the disk busy... CPU... um and other graphs in the blog... We dug deeper and found some bottlenecks recently.
I didn't say the DB is not running better now (it certainly is). I was just repeating what a CCP official said: the DB is not a bottleneck itself (I'll try to find the original statement on eve-search).
If your SOL nodes hit 100% CPU load your improved DB performance won't matter that much...
*EDIT* Reduced stress for the machines is great (especially for the ones in charge of the health of those servers), but how does this translate into perceived performance for the users?
"We have also seen another positive, albeit unplanned, side effect of the increased performance of the new database systems. Previously if our SQL Server cluster needed to fail over to the redundant server, every node in the cluster died and all players were disconnected. We recently failed over to our secondary database server on the new system and only 3 nodes out of 208 died! This means with some tweaks we may be able to fail the servers, storage, and switching environment without a single disconnect!"
|
|
CCP Yokai
|
Posted - 2011.03.24 18:11:00 -
[33]
Originally by: gtiness What is the size of the EVE database?
Lots of "stuff" in there but somewhere around 1.3TB
|
|
EliteSlave
Minmatar Macabre Votum Morsus Mihi
|
Posted - 2011.03.24 18:14:00 -
[34]
Originally by: CCP Yokai Edited by: CCP Yokai on 24/03/2011 17:57:15
Originally by: EliteSlave
Originally by: CCP Yokai
Originally by: EliteSlave I wish i could post that screen cap of Stan after sneaking into that trailer with internet....
But, I have so many questions about the hardware... like specific model numbers, firmware. More so to do with possible integration here at my office as we have a pretty large database and are looking to scale up our hardware as we are approaching saturation of it.
so any **** is good ****
Ask and I'll try and answer... better yet if you are at Fanfest... I present all that tomorrow.
Wow, didn't expect a response like that..
Well, I guess the first question is this: we are currently doing FC/IP (Fibre Channel over IP) and we are kinda limited in the I/O factor of around 30 that we have currently attached to 12 Dell PowerVault 3610's (iSCSI / FC/IP and FC). We are already maxing out the hardware and are trying to get the best bang for our buck by going full FC, but since this will be our first foray into the "Enterprise" level we are reading "blah blah blah" and don't really understand what we should be looking for. Now I'm not expecting you to give me a Visio flowchart of the equipment or how it's set up. But if you can give broad terminology for something that will handle growth over the next 2-3 years and allows for 1000-1500 users concurrently hitting the database at any given time (ideally hardware that supports 2500-3000 to allow for growth), that would be mucho appreciated.
First off... nothing over anything if you can do it... FC is the best bet for "Enterprise" like you said because of the session based communications and the potential for synchronous protection to redundant SAN controllers. I am biased... this is my opinion but I avoid iSCSI unless forced by sharp object.
Next... direct connection or true SAN needs to be decided on early. Direct connecting the disks is faster (like nanoseconds only) and cheaper because you don't have to buy switches. BUT!!! When you run out of host ports and you need to get another system attached or accessing the LUNs, boy are you gonna miss those switches.
Next... SAS unless you know better. SAS is great! I love this stuff... cheaper than FC-SCSI and much better than SATA... best mid-ground disks. SSDs rock my socks but you will pay massively for them or you will get cheap ones and hate what you did later.
Last... Find the right performance metric. Concurrent users means lots of things to lots of people. I would suggest looking at your IOPS. A 12 drive tray of SAS usually nets 5-10K IOPS on normal workloads for a DB.
Hope that helps a little. All I can toss at you between beers at Fanfest :)
CCP Yokai
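The IOPS rule of thumb in the quoted reply can be turned into a quick sizing estimate. This is only a rough sketch: the 5-10K IOPS per 12-drive SAS tray range comes from the post, while the per-user IOPS figure and the 30% headroom are made-up assumptions you would replace with your own measurements.

```python
import math
from typing import Tuple

# Rule of thumb quoted above: a 12-drive SAS tray nets roughly 5-10K IOPS
# on a normal database workload.
TRAY_IOPS_LOW = 5_000
TRAY_IOPS_HIGH = 10_000


def trays_needed(target_iops: int, headroom: float = 0.3) -> Tuple[int, int]:
    """Return (best_case, worst_case) tray counts for a target IOPS figure.

    headroom is a hypothetical safety margin (30% by default), not a
    number taken from the thread.
    """
    required = target_iops * (1 + headroom)
    best = math.ceil(required / TRAY_IOPS_HIGH)   # every tray delivers 10K
    worst = math.ceil(required / TRAY_IOPS_LOW)   # every tray delivers 5K
    return best, worst


if __name__ == "__main__":
    # Hypothetical workload: 3000 concurrent users at 10 IOPS each.
    target = 3000 * 10
    best, worst = trays_needed(target)
    print(f"{target} IOPS needs between {best} and {worst} trays")
```

The wide gap between the two counts is the point: measure the real IOPS of the existing system first, since a "concurrent users" number on its own does not pin down the hardware.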
Hey, thanks for the tidbit of advice. (Give me a virtual server farm and I can do wonders... give me a database and make me cry... but I'm a masochist and would love the time to learn it.)
I definitely agree that anything over anything is going to be a hassle and we should try to avoid it, but the CTO before me, well... had X budget, and if he spent only Y he got a certain commission on that. The company finally learned that being cheap and rewarding cheap only got them deeper in the hole later on, finally fired him, and is now going into compliance and looking to stay ahead of the curve.
Can you recommend any classes to take, and which to avoid because you think they are a waste of time or there is just nothing to gain from them?
PS: If I find you at Fanfest I will throw a beer your way, plus my resume (even tho I know you prolly can't hire me) hahha
|
|
CCP Valar
|
Posted - 2011.03.24 18:17:00 -
[35]
Perhaps the perceived performance for the users won't change much with the new database hardware, but it gives us a lot of room to grow and makes us able to perform more online maintenance without affecting users and prevents us from having to schedule extended downtimes for things we needed to do offline before. Also, a major part of the decision to upgrade the hardware was to increase availability and options for disaster recovery.
---- Senior Virtual World Database Administrator Virtual World Operations CCP Games |
|
Leet Magician
Evolution IT Alliance
|
Posted - 2011.03.24 18:20:00 -
[36]
maybe now the logs will actually show something!!
|
AnonyTerrorNinja
Minmatar Atomic Geese
|
Posted - 2011.03.24 18:24:00 -
[37]
If I had to be told
"you are going to work in the server room"
I swear I would take a sleepingbag, one of those funny inch-thick mattress things, a perpetual coffee machine, a portable shower and never leave. Just being around that kind of awesome is enough. Besides all of this, you just might be reading my signature. |
Sarmatiko
|
Posted - 2011.03.24 18:25:00 -
[38]
*fap fap fap*
Thanks!
|
Ban Doga
|
Posted - 2011.03.24 18:27:00 -
[39]
Originally by: theocratis
Originally by: Ban Doga Edited by: Ban Doga on 24/03/2011 18:02:45
Originally by: CCP Yokai
Originally by: Ban Doga Edited by: Ban Doga on 24/03/2011 17:46:25 I remember that someone (CCP Explorer ?) said not so long ago, that the DB is not anywhere near its performance limit. So I'm really looking forward to seeing no performance improvements from this one.
I'm happy that you are happy, tho...
Check the disk busy... CPU... um and other graphs in the blog... We dug deeper and found some bottlenecks recently.
I didn't say the DB is not running better now (it certainly is). I was just repeating what a CCP official said: the DB is not a bottleneck itself (I'll try to find the original statement on eve-search).
If your SOL nodes hit 100% CPU load your improved DB performance won't matter that much...
*EDIT* Reduced stress for the machines is great (especially for the ones in charge of the health of those servers), but how does this translate into perceived performance for the users?
"We have also seen another positive, albeit unplanned, side effect of the increased performance of the new database systems. Previously if our SQL Server cluster needed to fail over to the redundant server, every node in the cluster died and all players were disconnected. We recently failed over to our secondary database server on the new system and only 3 nodes out of 208 died! This means with some tweaks we may be able to fail the servers, storage, and switching environment without a single disconnect!"
I didn't miss that, but it's not about performance. It's about stability. You don't get dropped, but you're not getting improved performance while staying online.
Also found the original statement I was referring to:
Originally by: CCP Atlas We only have a single database and it's easier to scale that up than the sol nodes and we're already ahead of the curve in terms of what the DB can deliver. We do cache very aggressively on the server though and consolidating these character node calls onto a half a dozen nodes rather than servicing them throughout the cluster does remove a bit of the DB load since we get more cache hits, but like I said, the DB is not a big issue in this regard today.
http://www.eveonline.com/ingameboard.asp?a=topic&threadID=1371750&page=2#39
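To make the cache-hit argument in that quote concrete, here is a toy simulation. It is a sketch under invented assumptions, not CCP's actual caching code: each node keeps its own small LRU cache of character data, a miss counts as a database read, and calls land on a random serving node. All sizes (`characters`, `capacity`, `requests`) are hypothetical.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Tiny per-node LRU cache; a miss stands in for a database read."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        """Return True on a hit; on a miss, load the key and return False."""
        if key in self.data:
            self.data.move_to_end(key)      # mark as most recently used
            return True
        self.data[key] = True               # pretend we fetched it from the DB
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used key
        return False


def hit_rate(node_count, requests=50_000, characters=5_000,
             capacity=2_000, seed=1):
    """Fraction of character calls served from cache across node_count nodes."""
    rng = random.Random(seed)
    nodes = [LRUCache(capacity) for _ in range(node_count)]
    hits = 0
    for _ in range(requests):
        character = rng.randrange(characters)
        node = nodes[rng.randrange(node_count)]  # call lands on any serving node
        hits += node.get(character)
    return hits / requests


if __name__ == "__main__":
    print("spread over 200 nodes:", hit_rate(200))
    print("consolidated on 6 nodes:", hit_rate(6))
```

With the same request stream, the handful of consolidated nodes each see the same characters again and again, so far fewer calls fall through to the database than when the calls are scattered across the whole cluster.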
|
Thunderf00t
|
Posted - 2011.03.24 18:32:00 -
[40]
Was wondering about the Windows side of things. Can/do you disable the FS buffer cache for a particular FS (don't know if it's possible) so you don't have the same data in memory twice?
Is there an option to disable the FS file locking mechanics in Windows, or maybe the SQL can open the database files with some option that disables the buffer cache for the particular file that the DB process opens, and maybe the locking of said file, so that if you have more SQL processes accessing the same file they can write to it concurrently?
Does the storage system support active-active mode, or is it passive-active to the storage processors/controllers?
What SAN switches do you use? Brocade...Cisco?
I suppose the DB cluster is some sort of active-active setup ( something like Oracle RAC maybe)?
|
|
J Kunjeh
Gallente
|
Posted - 2011.03.24 18:35:00 -
[41]
Yet another sultry Dev Blog for those of us who love the tech pron. So enlightening to read more in-depth about the Eve architecture. Just finished another article over at Gamasutra that went into some depth on the architecture as well (here for those who are interested). Keep up the good work CCP!
~Gnosis~ |
Batolemaeus
Caldari Free-Space-Ranger Morsus Mihi
|
Posted - 2011.03.24 18:41:00 -
[42]
I came.
|
Ariane VoxDei
|
Posted - 2011.03.24 18:41:00 -
[43]
Originally by: CCP Yokai Check the disk busy... CPU... um and other graphs in the blog... We dug deeper and found some bottlenecks recently.
Yes, the "disk before" graph is scary. Like really really scary. If that translates to something like the similar graph in Windows, it gives peen-shrinking shivers of diskwaits - making even the mightiest cpu/ram/gfx combo seem like stoneage implements choking any game into a stuttering slideshow while it desperately waits for IO requests to complete. (memories of logging into lagaran come to mind).
Interesting graph of online players you had about fall/fail-over to the redundant SQL server. That smaller spike coincides well with the recent mass disconnect many of us suffered, where we got repeated disconnects after logging back in for quite some time. Think it was about 2 weeks ago. Was that it or something else?
Quote: Ask and I'll try and answer... better yet if you are at Fanfest... I present all that tomorrow.
Looking forward to watching that. Anyway, if that failover was the cause, could you try to talk about that in the presentation? And talk about it anyway, so we know what to expect when it does happen.
The ugly thing about that problem was that it only dropped "some". If it drops everyone, well no worries, your enemies dropped too. Partial drops are nastier. I am not a titan pilot, but I think you can get the picture, and that's just one of the ugly scenarios. Match that with the revision of the reimbursement policy, as per the GM blog, and that can suddenly be very expensive.
|
Andrea Griffin
|
Posted - 2011.03.24 18:42:00 -
[44]
So, how much did this all cost (roughly)?
Being a nerd girl I love hearing about this stuff. I really enjoy CCP's tech blogs.
I'd love to play with that hardware, but being able to play ON it is almost as fun (and I don't get called at 4am when it breaks).
- "When I nerf something, it takes 2-3 months for your dreams to be crushed." - CCP Big Dumb Object |
Celebris Nexterra
Gallente Lowsec Static
|
Posted - 2011.03.24 18:46:00 -
[45]
Man, you guys have been churning out devblogs like it's your freaking JOB these past two weeks!
You- wait...yes...OK, I'm being told it is in fact your job to post devblogs. Nonetheless! It is still awesome! Keep up the great work, and give me more fapping material like the screen cap of 16 hyper-threaded CPU cores =D!!!
|
Ciaa
Gallente The Executives
|
Posted - 2011.03.24 18:48:00 -
[46]
Nice blog, good to see some tech love/**** :D Any chance of some photos around the office and server room? DON'T PANIC! |
Rambobinette
|
Posted - 2011.03.24 18:52:00 -
[47]
Is it really a single database? If it is, it means you have an active-passive cluster configuration, which is a waste of computing resources, because when you have 2 or more you can balance the databases across the nodes, giving you an active/active cluster configuration. You can also add more nodes, thus reducing the workload.
|
Charles37
|
Posted - 2011.03.24 18:54:00 -
[48]
Those are some incredibly sexy graphs. Thank you!
This also makes my trusty computer feel... rather inadequate. But then again, spec sheets aren't everything, right...? Right?
|
Aldariandra
Gallente MunsterMunch The 0rphanage
|
Posted - 2011.03.24 18:57:00 -
[49]
Edited by: Aldariandra on 24/03/2011 19:04:21 This is very interesting. At our company we are currently trying out a Whiptail SSD SAN (rated theoretically up to 250,000 IOPS) and we get about 65,000 IOPS out of it on account of being limited to 4Gb FC blade switches (Brocade). This is to run VMware storage on, btw.
Didn't you guys use blades as well (IBM)? If so, that seems like a lot of ports being taken up for both network and HBA?
What kind of SAN switches do you use?
Maybe I am picky, but an average disk queue of about 3 still seems high to me. The fact that your storage still seems to get to 100% disk use probably also explains the queuing. I would not be happy with ever seeing disks bottleneck at 100% use; it's something I would still see as a definite problem to solve.
I find it very interesting that you run Eve on MS SQL. We have a lot of performance problems with some of our SQL servers and our DBAs are quite inexperienced. I would very much like to know how it's set up and how you spread the load out. What kind of IOPS does the database eat?
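A back-of-the-envelope check on the "limited by 4Gb FC" point in this post: a link's bandwidth puts a hard ceiling on IOPS once you fix the I/O size. The sketch below assumes the nominal 400 MB/s per direction usually quoted for 4GFC; the block sizes and link counts are illustrative guesses, since the post does not state either.

```python
# Nominal 4GFC throughput per direction, per link (MB/s, decimal megabytes).
FC_4G_MB_PER_S = 400


def iops_ceiling(block_kib: float, links: int = 1,
                 link_mb_per_s: float = FC_4G_MB_PER_S) -> float:
    """Upper bound on IOPS when every I/O moves block_kib KiB over the links.

    Ignores protocol overhead and latency, so real numbers will be lower.
    """
    bytes_per_s = links * link_mb_per_s * 1_000_000
    return bytes_per_s / (block_kib * 1024)


if __name__ == "__main__":
    # Hypothetical block sizes; the thread does not say what the workload uses.
    for block in (4, 8, 64):
        print(f"{block:>2} KiB blocks, 1 link: {iops_ceiling(block):,.0f} IOPS max")
```

Under these assumptions a single link tops out well below the array's rated 250,000 IOPS for common block sizes, which fits the observation that the fabric, not the SSDs, is the limit.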
|
Myobi Rush
|
Posted - 2011.03.24 19:00:00 -
[50]
Does this affect every system or just Jita? :Unamused:
|
|
Rambobinette
Caldari Angels of Death Corp
|
Posted - 2011.03.24 19:05:00 -
[51]
Originally by: Aldariandra
I find it very interesting that you run Eve on MS SQL. We have a lot of performance problems with some of our SQL servers and our DBAs are quite inexperienced. I would very much like to know how it's set up and how you spread the load out. What kind of IOPS does the database eat?
There is a limit to what a DBA can fix. Even with the appropriate indexes and optimizations, if the app does a full table scan because the developer doesn't know the SQL rules, well, you will have problems. I have had great experiences with MS SQL.
Luc R
http://www.lucraymond.net MCSE&MCDBA |
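The full-table-scan point is easy to see with a query planner. The sketch below uses Python's built-in `sqlite3` purely as a stand-in (the thread is about MS SQL, whose plans look different), and the `items` table, its columns, and the index name are all hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail lines of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row has four columns; the last one is the readable description.
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "item_id INTEGER PRIMARY KEY, owner_id INTEGER, type_id INTEGER)"
)
conn.executemany(
    "INSERT INTO items (owner_id, type_id) VALUES (?, ?)",
    [(i % 1000, i % 50) for i in range(10_000)],
)

sql = "SELECT item_id, type_id FROM items WHERE owner_id = ?"

# Without an index on owner_id the planner has to read every row.
before = query_plan(conn, sql, (42,))

# With an index it can jump straight to the matching rows.
conn.execute("CREATE INDEX idx_items_owner ON items (owner_id)")
after = query_plan(conn, sql, (42,))

print("before:", before)
print("after: ", after)
```

The first plan reports a scan of the whole table and the second a search using the index. The flip side of the post's point also holds: if the application writes the query so the index cannot be used, no amount of DBA tuning brings the search plan back.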
Koshiko Murakami
|
Posted - 2011.03.24 19:20:00 -
[52]
So how many TPS does the current database hit? How do you see this expanding?
|
Soldarius
Caldari Northstar Cabal R.A.G.E
|
Posted - 2011.03.24 19:23:00 -
[53]
Wait. A business is using its income from subscribers to actually improve the business?
Impressive numbers. Great job, CCP. Keep it up.
Originally by: CCP Shadow ...I cannot guarantee (my) sobriety or decency.
|
ORCACommander
|
Posted - 2011.03.24 19:25:00 -
[54]
Originally by: Vuk Lau :fapfapfap:
this ^^^^
|
DiaBlo UK
ZDK
|
Posted - 2011.03.24 19:33:00 -
[55]
ball park figure on the cost of the upgrade???
Originally by: CCP Navigator Pretty sure someone is selling tinfoil hats. You should buy one
Originally by: CCP Zulupark Trollin' with my homies!
|
Ishina Fel
Caldari Terra Incognita Intrepid Crossing
|
Posted - 2011.03.24 19:42:00 -
[56]
Originally by: Ban Doga
Also found the original statement I was referring to:
Originally by: CCP Atlas We only have a single database and it's easier to scale that up than the sol nodes and we're already ahead of the curve in terms of what the DB can deliver. We do cache very aggressively on the server though and consolidating these character node calls onto a half a dozen nodes rather than servicing them throughout the cluster does remove a bit of the DB load since we get more cache hits, but like I said, the DB is not a big issue in this regard today.
http://www.eveonline.com/ingameboard.asp?a=topic&threadID=1371750&page=2#39
That quote is from August 2010... I'm sure it was completely true back then. But since then, they released Incursions, activated resource depletion on planets, overhauled the whole inventory system, and did other code improvements... I'm pretty sure that it is especially the latter two things that trouble the database.
Imagine - they just released a blog where they state that they can allow for Jita's maximum population to grow by over 1000 additional people, because the new efficient inventory code allows the node CPU to handle that many more inventory operations per second. But where do all these inventory operations go? Well, they hit the database. And now there's going to be a whole lot more of them in the same amount of time. Not only in Jita, but in every system that saves CPU cycles due to this coding change.
So the very database that ended up sitting around bored because TQ couldn't generate enough requests to saturate it, suddenly had to scramble to keep up, approaching its limits. So an upgrade made sense.
(This is of course pure guesswork, I have no idea what really happened. I only know that more often than not when you improve one part of a complex system, you end up stressing a different part without even meaning to.)
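The "fix one part, stress another" idea can be sketched as stages in series, where overall throughput is capped by the slowest stage. The capacities below are invented purely for illustration and are not TQ's real numbers.

```python
def bottleneck(capacities):
    """Return (stage_name, throughput) for stages connected in series."""
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


def utilisation(capacities):
    """How busy each stage is when the chain runs at its maximum throughput."""
    _, throughput = bottleneck(capacities)
    return {name: throughput / cap for name, cap in capacities.items()}


# Hypothetical operations-per-second capacities before and after a code change
# that makes the node CPU far more efficient at inventory operations.
before = {"sol_node_cpu": 4_000, "database": 10_000}
after = {"sol_node_cpu": 12_000, "database": 10_000}

if __name__ == "__main__":
    for label, caps in (("before", before), ("after", after)):
        print(label, bottleneck(caps), utilisation(caps))
```

In the "before" case the database idles at 40% while the node CPU is pinned; after the CPU-side improvement the database becomes the limiting stage, which is the scenario guessed at in this post.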
And on topic: that is a beautiful database server you got there. As a system integrator myself, consider me jealous - I can't find a single thing I would have done different! - Signature? What signature? |
Diomedes Calypso
|
Posted - 2011.03.24 19:50:00 -
[57]
Cool stuff.. just a thumbs up to let you know it's being read and enjoyed even by people like me who have little clue about what some of the stuff means and use it as a learning experience.
|
Dian'h Might
Minmatar Cash and Cargo Liberators Incorporated
|
Posted - 2011.03.24 20:14:00 -
[58]
Awesome blog. Technical details like that are great and give me an excuse to read eve forums at work - - - Dian'h Might - C&Ps resident "internet kleptomaniac" |
Mr LaForge
|
Posted - 2011.03.24 20:21:00 -
[59]
Whoah....Dude..
So I herd u got new hamsters...
|
Shandir
Minmatar EVE University Ivy League
|
Posted - 2011.03.24 20:22:00 -
[60]
Originally by: DiaBlo UK Edited by: DiaBlo UK on 24/03/2011 19:54:23 ball park figure on the cost of the upgrade???
I mean, do I need to win the Saturday night jackpot or wait around for a Euro millions double roll-over?
I suspect they already have to put a *lot* of effort into cooling, although this is an idea for reinforced nodes. If CCP currently is looking into multicore as they cannot push single-core processing as much as they'd like - what is stopping you from taking the highest clock-speed rating CPU commercially available, and then overclock it under heavy cooling for the max performance reinforced nodes? - Vote Trebor Daehdoow for CSM and Chairman of CSM. Trebor's Campaign Manifesto |