CCP Wrangler

|
Posted - 2007.09.15 16:17:00 -
[1]
We still haven't been able to identify the exact cause of the recent database issues on Tranquility. However, we're slowly moving closer, and each crash brings us one step further in diagnosing what is wrong. We know that something causes a random database thread to block other threads' access to tables (application locks), which then rapidly prevents more and more threads from accessing data, ultimately resulting in the database initiating a fail-over to its active stand-by server.
Even though the resulting database outage lasts only 5 seconds, the application servers running EVE cannot survive that long without database connections. This is due to the real-time nature of EVE: we require database access speeds measured in milliseconds. Thus, the fail-over causes a crash.
In addition to our server operations and core server team investigating and analyzing logs of this, we brought in IBM hardware, storage and server technicians, and just yesterday a Microsoft SQL Server expert was flown in to Iceland to further assist us in this effort. The fault could be in anything from our own software to the SQL Server, operating system, server hardware or storage. Nothing has been ruled out, so we brought in everyone we could get.
We are able to minimize the effects on Tranquility by a team constantly monitoring the SQL server for blocking threads, which we are usually able to clear up. However, when it's a thread that's working with a frequently accessed table, we are unable to keep up and the number of blocked threads finally overwhelms us, resulting in Tranquility going down. If it were not for these guys, we would be crashing several times a day.
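As an illustration only, and not CCP's actual tooling, the blocking chain described above can be sketched in a few lines of Python. The sketch assumes a hypothetical snapshot mapping each session id to the id of the session blocking it (0 meaning "not blocked"). On SQL Server such pairs could be read from a view like `sys.dm_exec_requests` and its `blocking_session_id` column, though the post does not say what the monitoring team really queries. The function name and the sample session ids are invented for the example.

```python
from collections import defaultdict


def find_head_blockers(blocked_by):
    """Return {head_blocker_id: number_of_sessions_stuck_behind_it}.

    blocked_by is a hypothetical snapshot: session id -> id of the session
    blocking it, with 0 (or None) meaning the session is not blocked.
    """
    # Invert the mapping: blocker -> list of sessions waiting directly on it.
    waiters = defaultdict(list)
    for session, blocker in blocked_by.items():
        if blocker:
            waiters[blocker].append(session)

    # A head blocker blocks at least one session but is not blocked itself.
    heads = [s for s in waiters if not blocked_by.get(s)]

    result = {}
    for head in heads:
        # Walk the chain to count everything queued behind this head.
        seen = set()
        stack = list(waiters[head])
        while stack:
            session = stack.pop()
            if session in seen:
                continue
            seen.add(session)
            stack.extend(waiters.get(session, []))
        result[head] = len(seen)
    return result


# Invented sample: 51 blocks 52 and 54, 52 in turn blocks 53; 55 blocks 56.
snapshot = {51: 0, 52: 51, 53: 52, 54: 51, 55: 0, 56: 55}
print(find_head_blockers(snapshot))
```

Clearing the head blocker releases everything queued behind it, which is presumably why a blocker on a frequently accessed table is so much harder to keep up with: its queue grows faster than a person can react.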
We can't stress enough how seriously we take this situation. We realize that not posting news daily or hourly about this might give the impression that we aren't taking it seriously, but we prefer to post news when we have something new to report.
Unfortunately, this time, we have no news, except that we have the crashes minimized and that we're sparing no manpower or expense in getting this fixed. We hope this gives you better insight into what is being done, and we assure you that we take this seriously and are doing everything we can to find a solution to this issue.
Wrangler Community Manager EVE Online
|

Price Watcher
|
Posted - 2007.09.15 16:22:00 -
[2]
Thanks, Wrangler.
An honest "We don't know yet" is far better than continual silence.
POST WITH YOUR ALT!
The Shame o' The Galaxy |

Stitcher
Caldari legion of qui Freelancer Alliance
|
Posted - 2007.09.15 16:24:00 -
[3]
yay for explanations!
Also, the monitoring team deserve medals. - The game is not the problem. The problem is that you are not adapting to the game.
|

Arngorf
Minmatar x13
|
Posted - 2007.09.15 16:25:00 -
[4]
Same procedure as with all problems.
SEARCH AND DESTROY
gl getting it fixed. Hurry it up btw. I'll be home in 1 hour  ________________________________________________ FORMER!!! I said FORMER Pirate...
|

General StarScream
Cybertronic Decepticons
|
Posted - 2007.09.15 16:26:00 -
[5]
Great efforts, I think people should understand that troubles happen sometimes.
It seems you're doing your best and then some more.
Flying in people shows how devoted you are to fixing such things.
Great thanks, and hope the problem is solved, and also how to keep it from happening again. |

Born Slippy
|
Posted - 2007.09.15 16:26:00 -
[6]
Nice one, keep up the good work!
- Your #1 fanboy  |

Fujiko MaXjolt
|
Posted - 2007.09.15 16:28:00 -
[7]
Ouch!
Having experienced this on somewhat big systems myself, I can say that I do not envy you your task of combatting this demon 
Thank you for your dedication in keeping eve up - on stilts if not her own feet 
Oh, and at least the problem doesn't just occur once every blue moon so you guys have a shot at clearing it up.
Good job and good luck guys !!!
|

Sicori Malaki
Caldari Thundercats RAZOR Alliance
|
Posted - 2007.09.15 16:32:00 -
[8]
Thanks for the update, this should hopefully reduce the numbers of "whine whine whine REFUND whine whine" threads. ______________ Only in the Tales that humans tell, do the hunters kill the wolf in the end.
|

DHB WildCat
BURN EDEN Terra Incognita.
|
Posted - 2007.09.15 16:41:00 -
[9]
Check for Vodka bottles behind the servers I bet one of your drunk employees spilled some on them
|

Lucky 8
Minmatar
|
Posted - 2007.09.15 16:42:00 -
[10]
So you guys have a few real people™ closing DB threads manually?
mental image: team of Dutch lads plugging dykes with various digits.
wow is your game broken or what! --
Originally by: Nicho Void This thread is like a chum slick for forum alt trolls.
|
|

Pirate Tom
|
Posted - 2007.09.15 16:43:00 -
[11]
Edited by: Pirate Tom on 15/09/2007 16:43:58 In order to save the whiners the time it would take for them to formulate obscure reasons founded on opinion and misinformation about why they are entitled to a refund, I propose a preemptive measure.
Originally by: 'lots of future threads by idiots'
I accepted the EULA But did not read it. Didn't even try.
I'm so pompous that I think threatening to deprive CCP of my lousy pittance of a subscription fee will make all the difference in the world.
You can't have my stuff because I'm not really quitting. I have no intention of quitting. It's all idle threats.
I will not be happy with less than perfect. By perfect, I mean that the game must be recoded entirely to suit my specific style of gameplay at the expense of all others.
I now end my argument by stating that I think CCP is a bunch of amateurs, Microsoft is the heart of all evil, and I could do a better job writing all the code myself because I saw an episode of [insert Primetime Television Drama Series here] where they talked about how servers worked so now I'm an expert.
|

Born Slippy
|
Posted - 2007.09.15 16:45:00 -
[12]
Originally by: Pirate Tom Edited by: Pirate Tom on 15/09/2007 16:43:58 In order to save the whiners the time it would take for them to formulate obscure reasons founded on opinion and misinformation about why they are entitled to a refund, I propose a preemptive measure.
Originally by: 'lots of future threads by idiots'
I accepted the EULA But did not read it. Didn't even try.
I'm so pompous that I think threatening to deprive CCP of my lousy pittance of a subscription fee will make all the difference in the world.
You can't have my stuff because I'm not really quitting. I have no intention of quitting. It's all idle threats.
I will not be happy with less than perfect. By perfect, I mean that the game must be recoded entirely to suit my specific style of gameplay at the expense of all others.
I now end my argument by stating that I think CCP is a bunch of amateurs, Microsoft is the heart of all evil, and I could do a better job writing all the code myself because I saw an episode of [insert Primetime Television Drama Series here] where they talked about how servers worked so now I'm an expert.
Let's not turn this into a flame fest but ya, i feel ya.  |

Scott Price
|
Posted - 2007.09.15 16:46:00 -
[13]
I appreciate this explanation. A concise "We don't know yet" is better than smoke up our butts.
Besides, I can amuse myself with other games while it's being worked on. Bioshock is pretty nice for the downtimes, and I'm sure that EVE will be fixed soon.
|

Pirate Tom
|
Posted - 2007.09.15 16:47:00 -
[14]
Think of it like a controlled burn to prevent a forest fire. |

Aleyah Dawnborn
Caldari SMASH Alliance
|
Posted - 2007.09.15 16:48:00 -
[15]
Originally by: Sicori Malaki Thanks for the update, this should hopefully reduce the numbers of "whine whine whine REFUND whine whine" threads.
When pigs can fly...  ---
My God! It's logic! Flee! |

Andrue
Amarr
|
Posted - 2007.09.15 16:54:00 -
[16]
Thanks for the update, Wrangler. I think those of us that have had to troubleshoot software and systems ourselves understand. I remember a couple of situations where I had a bug that only triggered after a process had been running for nearly a day. It was a data recovery process and of course we had a customer waiting for us to get their data back. Those kinds of faults are far and away the worst to track down.
You waste so much time just waiting for the bug to trigger, and after hours of waiting you often only get a small amount of information. Even after applying what you think is a fix you can't be sure of it until a lot more time has passed. -- (Battle hardened industrialist)
[Brackley, UK]
My budgie can say "ploppy bottom". You have been warned. |

Sister Impotentata
Caldari State War Academy
|
Posted - 2007.09.15 16:56:00 -
[17]
Well said. Out of curiosity, Does CCP directly foot the bill for all these techs? Or do they come with the service plan? ----- TANSTAAFL
When I engage these coils normally I do about 2x10^6 dps. But I try to avoid that because people, entire populations, like die. So I try to keep it to about 4x10^3 dps. |

Martineth
|
Posted - 2007.09.15 16:58:00 -
[18]
This game is massive. So, as we Greeks say, big ships get big storms. But tbh I think that once Revelations 3 is out, before rolling out another patch you should get all previous issues fixed, most notably the lag issues. If this game, as has been said, will open to Linux and Mac gamers, we need more than hardware to keep it tight. Use your brains. You got me hooked for 2 years using just that. Thumbs up CCP. Thanks for sharing too.
|

Pirate Tom
|
Posted - 2007.09.15 17:00:00 -
[19]
I'm guessing at the very least CCP is expected to provide accommodations for the outside techs while they're in Reykjavik. |

shibbymonkey
Minmatar Spartan Dynamics
|
Posted - 2007.09.15 17:01:00 -
[20]
ahh...the thread gremlin!
at least I am glad that he left my shop, cause I swear either he, or one of his friends, have been in my systems for a month!
GL, and thx for the FYI --------------------------------------- Ever in search of new ways to turn ISK into noise and smoke.
|
|

Andrue
Amarr
|
Posted - 2007.09.15 17:08:00 -
[21]
Originally by: shibbymonkey ahh...the thread gremlin!
at least I am glad that he left my shop, cause I swear either he, or one of his friends, have been in my systems for a month!
GL, and thx for the FYI
He'd bloody better not come to ours then. We are about to send our next build to QA and this time they are prolly going to actually try and connect fifty clients to our server at the same time. Okay, so it's nowhere near the scale of Eve, but we're relying on .NET remoting to be up to the challenge and the back-end is something we've coded ourselves entirely from scratch.
We don't want no stinkin' thread gremlins, please  -- (Battle hardened industrialist)
[Brackley, UK]
My budgie can say "ploppy bottom". You have been warned. |

Telkkar
|
Posted - 2007.09.15 17:11:00 -
[22]
Thanks for the update Wrangler, and good luck to all the people working on this issue :)
Good luck, and hope you find a solution soon.
|

Making stuff
|
Posted - 2007.09.15 17:35:00 -
[23]
Edited by: Making stuff on 15/09/2007 17:39:46 bah so i guess we expect more downtimes.
|

Phonotw
|
Posted - 2007.09.15 17:41:00 -
[24]
Thx for the info, it's for sure better than no communication at all. Btw, I would prefer accurate info on progress, even a "We don't know yet", over "TQ issues are being investigated". Plz keep us up to date (thanks to your message I've switched skill training from my alt's 2.5 h skill to my prime char's 4-day skill, so if TQ crashes I won't lose skill points) and tell us when TQ will be stable enough to rely on skill training timetables.
Btw, best regards to your technicians and programmers chasing the bug; you deserve premium pay or a bonus for your work. Good luck.
|

Damon Ra
Caldari
|
Posted - 2007.09.15 17:43:00 -
[25]
Edited by: Damon Ra on 15/09/2007 17:45:39 I would check all those "fixes" for Heat that were just made with the last patch.. Ever since CCP added Heat things have been less than reliable, and that is being about as kind as I can be considering that Heat is utter crap.
Current Tranquility status: SELECT production_code FROM SISI WHERE testers = 'players' AND testers <> 'ccp_staff' AND testing_duration <> 'sufficient'; |

Liang Nuren
The Refugees
|
Posted - 2007.09.15 17:43:00 -
[26]
It sounds like it might be a system process [like Autovacuum for Postgres] that's causing the lock contention. Good luck figuring it out ... that's a bear of a problem.
I haven't run MSSQL in ages, and I'm not even positive that an autovacuum daemon is available for MSSQL.
Liang
Yarr? |

Ian Graeme
|
Posted - 2007.09.15 17:47:00 -
[27]
Also having some experience with big systems, I would suspect this is something so simple as to be easily overlooked, which means it can be a stone cold pain to find.
|

Shari Vegas
Minmatar Ctrl Alt Elites
|
Posted - 2007.09.15 17:52:00 -
[28]
Being that this is an ongoing problem for around a week now, an explanation about what's happening and what's being done to remedy it is much appreciated.
Originally by: CCP Wrangler I have no clue.
|
|

CCP Explorer

|
Posted - 2007.09.15 17:58:00 -
[29]
We now believe that we have found the cause of the problem and fixed it. The problem has been reproduced on our test server, Singularity. More details later.
Erlendur S. Thorsteinsson Software Director EVE Online, CCP Games |
|

Shari Vegas
Minmatar Ctrl Alt Elites
|
Posted - 2007.09.15 18:00:00 -
[30]
Originally by: CCP Explorer We now believe that we have found the cause of the problem and fixed it. The problem has been reproduced on our test server, Singularity. More details later.
Outstanding.
Originally by: CCP Wrangler I have no clue.
|