
CCP Wrangler

Posted - 2007.09.11 18:40:00 - [1]
Recently we have been experiencing a lot of server issues. Please check this thread for news as it becomes available.
Update 1: The server failed over, again, and will be restarted.
Wrangler Community Manager EVE Online
Contact Support - Contact Moderators - Report Bug - Submit News Leads - Knowledge Base - Player Guide - Policies - Join ISD - Fan Submissions - DevFinder Lite™

CCP Wrangler

Posted - 2007.09.11 18:45:00 - [2]
The server failed over, again, and will be restarted.
Wrangler Community Manager EVE Online

CCP kieron

Posted - 2007.09.11 18:46:00 - [3]
The server ops team has been monitoring the server issues from both yesterday and today. We have log dumps that are being analyzed by both internal staff and outside resources in an effort to troubleshoot and resolve the lack of server stability. We hope to find the solution very soon, as we dislike seeing TQ drop as much as (or even more than) the community does.
kieron Director of Community Relations, EVE Online EVE Online, CCP Games Email/Netfang Look Ma, I'm in a Dev thread! Oh wait...

CCP Wrangler

Posted - 2007.09.11 18:58:00 - [4]
Originally by: Selene Fenestre
Originally by: CCP Wrangler Recently we have been experiencing a lot of server issues. Please check this thread for news as it becomes available.
Update 1: The server failed over, again, and will be restarted.
Update 3: The server ops team has been monitoring the server issues from both yesterday and today. We have log dumps that are being analyzed by both internal staff and outside resources in an effort to troubleshoot and resolve the lack of server stability. We hope to find the solution very soon, as we dislike seeing TQ drop as much as (or even more than) the community does.
Update 2: ????
Update 4: Profit
(sorry, im bored)
Oops, buttons were right next to each other. 
Wrangler Community Manager EVE Online

CCP Wrangler

Posted - 2007.09.11 19:30:00 - [5]
Tranquility has been restarted and is now open for business. Please keep in mind that a lot of people logging in at once can create some initial lag, so make sure you keep out of harm's way for the first 15 minutes.
Wrangler Community Manager EVE Online

CCP Wrangler

Posted - 2007.09.11 19:37:00 - [6]
Originally by: Arimus Darkhart
Is there any chance of having a slightly more technical explanation of: 1. What went wrong, 2. Cause or possible causes thereof (providing they're not exploitable), 3. What remedial plan is in place to fix 2.
I'll do my best.
1. As we posted earlier, there was a failover that caused the server to crash. 2. We are working with Microsoft to get it fixed. 3. Once we get the results from the techs we will do as they recommend and hopefully that will fix the problem.
Maybe not as detailed as you want, I'm afraid.
Wrangler Community Manager EVE Online

CCP Wrangler

Posted - 2007.09.11 19:44:00 - [7]
Originally by: Arimus Darkhart
Originally by: CCP Wrangler
1. As we posted earlier, there was a failover that caused the server to crash. 2. We are working with Microsoft to get it fixed. 3. Once we get the results from the techs we will do as they recommend and hopefully that will fix the problem.
Sorry, missed the earlier post amongst all the other posts.
Any timescales on an answer back from Microsoft?
We have both MS and our own guys working on it, and it's safe to say we all want to get this fixed as soon as we possibly can.
Wrangler Community Manager EVE Online

CCP Wrangler

Posted - 2007.09.11 19:48:00 - [8]
Originally by: DerArt1st I've found the error:
Originally by: CCP Wrangler
2. We are working with Microsoft to get it fixed.
Sorry if I wasn't clear. 
Wrangler Community Manager EVE Online

CCP John Proctor

Posted - 2007.09.11 20:58:00 - [9]
We believe we have traced this issue to a failed controller card on our old RAMSAN that we attached a few days ago.
Our hardware profile has not changed much during the past 30 days, other than adding the unit back into the SAN array.
The other failovers we have experienced since moving to the new SQL hardware had different causes. The first was due to insufficient memory resources being assigned to the OS; that failover happened the day the new hardware was installed and was corrected. Another error was fixed by the installation of Service Pack 2 on the SQL server. The final one was due to max degree of parallelism being set too high, so our queries used too many processors at once, leaving other requests to starve.
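To illustrate the starvation effect described above, here is a minimal toy model in Python. It is only a sketch: it is not how SQL Server actually schedules work, and the function name `free_processors` and the 16-processor figure are assumptions made for the example, not details from the post.

```python
def free_processors(total_cpus, maxdop, parallel_queries):
    """Toy model (hypothetical): each parallel query grabs up to `maxdop`
    processors, first come first served. Returns how many processors are
    left over for all other requests."""
    used = 0
    for _ in range(parallel_queries):
        # A query can only take processors that are still unclaimed.
        used += min(maxdop, total_cpus - used)
    return total_cpus - used


# Assumed 16-processor server, purely for illustration.
print(free_processors(16, 16, 1))  # one unrestricted query leaves 0 free
print(free_processors(16, 4, 2))   # two capped queries leave 8 free
```

In this model a single query allowed to use every processor leaves nothing for anyone else, while a lower cap keeps some capacity available for other requests.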
These failover problems we are now having are the ones we were having prior to the upgrade of the SQL hardware (same error codes and little evidence of errors in the logs).
Unfortunately, we are currently running on the affected hardware, but we have taken steps to reduce its use by shifting all I/O to the new RAMSAN. We will not be able to completely phase out the affected hardware until the next downtime. Later in the week we can switch the unit to a different controller and, once we have run a large battery of tests on it, try to re-integrate it into the cluster.
We are working to address these issues, and we care very deeply about the stability of TQ.
To give you a brief recap of our database and the amazing things it does: with the increase in the player base and the new features that have been rolled out via the API site and other tools, we are at over 8,500 transactions a second. You can see how going back through all those transactions and data to find the one transaction call that causes the server to initiate a failover can be a daunting task.
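As a rough sense of scale, the short Python sketch below turns the quoted rate into the number of transactions that would need to be searched for a given time window. Only the 8,500 per second figure comes from the post; the window lengths and the helper name `transactions_in_window` are assumptions for illustration, and since the post says "over 8,500" the results are lower bounds.

```python
TRANSACTIONS_PER_SECOND = 8_500  # rate quoted in the post (a lower bound)


def transactions_in_window(seconds, rate=TRANSACTIONS_PER_SECOND):
    """Number of transactions logged over `seconds` at a constant rate."""
    return rate * seconds


print(transactions_in_window(60))    # one minute: 510000
print(transactions_in_window(3600))  # one hour: 30600000
```

Even a one-hour window before a failover would hold more than 30 million transactions at this rate.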
But we are fully confident that we can fix this issue in a timely manner. Please have patience with us.
We are throwing our full resources into this problem and feel as helpless as you do.
Our apologies.