![CCP Masterplan CCP Masterplan](https://images.evetech.net/characters/901599088/portrait?size=64)
CCP Masterplan
C C P C C P Alliance
1720
|
Posted - 2015.08.07 18:27:26 -
[31] - Quote
Vincent Athena wrote:Legacy code? I'll make a prediction. The channel you used for campaign logging was used in the past for doing something else. You thought that code was removed, but some part of it still remains. When you started campaign logging, some old code woke up, tried to do something related to that channel, and "bad things" resulted.
Edit: CCP, you do not really answer one question we all kept asking over and over.
Why not roll back while you worked the issue? In the blog you stated "Our test of the rollback was confirmed to work, but we still didn't believe the code to be the issue". But, so what? Why did you let this belief stop you from doing a rollback and letting us get on the server?
I just do not see the link here. I see your thought: "we still didn't believe the code to be the issue", and the result: no rollback, but I do not understand your reasoning for letting that thought get that result. What was your reasoning?
When we said "Our test of the rollback was confirmed to work..." that was more referring to the fact that the rollback process would work, not that the rollback would fix the problem. So we verified that we could re-deploy the previous day's build to TQ without corrupting the game state in the DB, not that the previous day's build would manage to get past startup.
Sometimes when we deploy some new changes/feature, we have to mutate the data in the DB in some one-way fashion. Therefore such code updates cannot be rolled back in isolation without either writing an explicit revert mutation, or doing a full DB restore from backup (which can be done but takes time).
All that comment really means is that this particular code rollback would require no special DB operations to go alongside it.
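To illustrate the difference between a change that can be reverted and a one-way mutation, here is a minimal sketch using Python's built-in `sqlite3`. The table names, column names, and migration functions are hypothetical and chosen only for illustration; nothing here reflects CCP's actual schema or tooling.

```python
import sqlite3

# Hypothetical schema and migrations, for illustration only.

def apply_reversible(db):
    # Adding a table is easy to undo: the revert is a plain DROP.
    db.execute("CREATE TABLE campaign_log (id INTEGER PRIMARY KEY, note TEXT)")

def revert_reversible(db):
    db.execute("DROP TABLE campaign_log")

def apply_one_way(db):
    # Lower-casing names discards the original casing. No revert mutation
    # can bring it back; only a restore from backup can.
    db.execute("UPDATE pilots SET name = lower(name)")

def table_names(db):
    rows = db.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(r[0] for r in rows)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE pilots (name TEXT)")
db.execute("INSERT INTO pilots (name) VALUES ('Example Pilot')")

apply_reversible(db)
revert_reversible(db)
tables_after_revert = table_names(db)  # the schema is back where it started

apply_one_way(db)
pilot_row = db.execute("SELECT name FROM pilots").fetchone()
```

After `apply_one_way`, the stored name no longer carries its original casing, which is why a code rollback on its own would not be enough in that situation.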
"This one time, on patch day..."
@ccp_masterplan | Team Five-0: Rewriting the law
|
|
![Vincent Athena Vincent Athena](https://images.evetech.net/characters/1890350737/portrait?size=64)
Vincent Athena
V.I.C.E.
3582
|
Posted - 2015.08.07 18:33:24 -
[32] - Quote
CCP Masterplan, I get that. Thanks for the reply. But why not keep going and do the startup? Did you have some "one way" DB changes with this update that would have taken extra effort to revert or restore? So much effort that, at any given time, it looked better to just keep trying to fix the issue rather than get de-railed trying to roll back?
Know a Frozen fan? Check this out
Frozen fanfiction
|
![Ransu Asanari Ransu Asanari](https://images.evetech.net/characters/92552265/portrait?size=64)
Ransu Asanari
AQUILA INC Verge of Collapse
315
|
Posted - 2015.08.07 18:34:48 -
[33] - Quote
Pretty fascinating, thanks for the detailed explanation. |
![Ishtanchuk Fazmarai Ishtanchuk Fazmarai](https://images.evetech.net/characters/1414361813/portrait?size=64)
Ishtanchuk Fazmarai
3921
|
Posted - 2015.08.07 18:56:15 -
[34] - Quote
"See, I am a log. I MIGHT show you something, but then I MUST kill you... got it?"
73% of EVE characters stay in high security space. 62% of EVE subscribers barely PvP. 40% of all new accounts just "level up their Ravens". Probably that's why PvE content in EVE Online is sub-par and CCP is head over heels for PvP...
|
![Kerodan Alduin Kerodan Alduin](https://images.evetech.net/characters/93125235/portrait?size=64)
Kerodan Alduin
EVE University Ivy League
4
|
Posted - 2015.08.07 19:04:29 -
[35] - Quote
Thanks for the amusing writeup!
I totally know what it's like when you run code that in theory should work but in practice doesn't. Then again, the maximum number of users waiting for my bits of programming was around 3 ![Cool](https://forums-archive.eveonline.com/Images/Emoticons/ccp_cool.png) |
![Eli Stan Eli Stan](https://images.evetech.net/characters/94455078/portrait?size=64)
Eli Stan
Center for Advanced Studies Gallente Federation
326
|
Posted - 2015.08.07 20:06:04 -
[36] - Quote
Very interesting writeup, thank you!
How exactly is a log channel set up? I assume the default log channel and campaign log channel are both directed to the same storage space? Is it possible for a log entry to trigger a process elsewhere, rather than being explicitly called? A while ago I had to help some devs figure out some MSSQL performance issues that were caused by triggers. Hate those things... |
![elitatwo elitatwo](https://images.evetech.net/characters/267692542/portrait?size=64)
elitatwo
Eve Minions Poopstain Removal Team
788
|
Posted - 2015.08.07 20:34:00 -
[37] - Quote
Windoze memory management... Could it be possible that the second call caused the server nodes to use swap space, and the surge in memory requests made the hard drives go nuts?
Maybe the second call called the first one, creating 250*500*500 calls instead of 250*500 which would explain the behavior. Maybe rename a word in the second call so you can see which ones are displayed.
Or I am totally wrong and it's a MSSQL thing.
Tired of low and nullsec? Join Eve Minions and experience the beauty of wormholes!
|
![Kasli Catal Kasli Catal](https://images.evetech.net/characters/720135281/portrait?size=64)
Kasli Catal
SniggWaffe WAFFLES.
10
|
Posted - 2015.08.07 21:31:05 -
[38] - Quote
What the **** did I even just read? ![Shocked](https://forums-archive.eveonline.com/Images/Emoticons/ccp_shocked.png) |
![Iam Widdershins Iam Widdershins](https://images.evetech.net/characters/1888333448/portrait?size=64)
Iam Widdershins
Project Nemesis
889
|
Posted - 2015.08.07 21:59:16 -
[39] - Quote
CLIFFHANGER BOYS
Lobbying for your right to delete your signature
|
![Jasmine Cheryu Jasmine Cheryu](https://images.evetech.net/characters/93620257/portrait?size=64)
Jasmine Cheryu
Perkone Caldari State
16
|
Posted - 2015.08.07 22:41:33 -
[40] - Quote
CCP Goliath wrote:
Ezekiel Marr wrote:So... is castello.is a pizza place of choice for CCP?
We usually get pizza from Castellos yeah. It doesn't actually deliver to our area unless it's for us :p
Do you think they would deliver during fanfest next year??
If so I'm buying the entire Dev Team pizza on one of the fanfest days, they deserve it for putting in all this hard work for us players!! ![Cool](https://forums-archive.eveonline.com/Images/Emoticons/ccp_cool.png)
Sure, we pay your wages by playing the game and paying for PLEX/subscriptions, but we all (or well.. most of us) really do appreciate all you do to keep the game we love and enjoy online for us
Thanks for the blog outlining that terrible day ![Smile](https://forums-archive.eveonline.com/Images/Emoticons/ccp_smile.png)
Jas |
|
![Aeon Amadii Aeon Amadii](https://images.evetech.net/characters/95775705/portrait?size=64)
Aeon Amadii
Federal Navy Academy Gallente Federation
16
|
Posted - 2015.08.08 00:37:27 -
[41] - Quote
Thank you for writing this!
As someone just starting school for Computer Science, this was very exciting and enlightening ![Big smile](https://forums-archive.eveonline.com/Images/Emoticons/ccp_smile-big.png)
(This character is the Eve version of Aeon Amadi)
|
![Cor'len Cor'len](https://images.evetech.net/characters/1960383049/portrait?size=64)
Cor'len
Remnant of an Empire Independent Stars Allied Forces
9
|
Posted - 2015.08.08 00:44:24 -
[42] - Quote
Vincent Athena wrote:CCP Masterplan, I get that. Thanks for the reply. But why not keep going and do the startup? Did you have some "one way" DB changes with this update that would have taken extra effort to revert or restore? So much effort that, at any given time, it looked better to just keep trying to fix the issue rather than get de-railed trying to roll back?
Also, it looked like you found the temporary fix by experimenting on TQ, something you would not have been able to do if you had done the roll back.
I expect it was also a case of "We can't reproduce this reliably on our test servers, so we have to debug it in production". I think it was somewhat unclear whether the DB was modified in a way which would've prevented the rollback, and as Masterplan said, a DB restore takes time - I seem to recall a figure of multiple hours.
CCP: Thanks for fixing it, for the skillpoints, and also for the well-written report on the pizza. <3 |
![Dunkov Dunkov](https://images.evetech.net/characters/1312321878/portrait?size=64)
Dunkov
DeepSpace Resources DeepSpace.
0
|
Posted - 2015.08.08 01:19:06 -
[43] - Quote
As a Splunk SME at my workplace, I'm very happy, proud and a bit enthralled that CCP uses my favorite big data tool! Huzzah CCP Splunk Ninjas! |
![MeagerMiner MeagerMiner](https://images.evetech.net/characters/90032570/portrait?size=64)
MeagerMiner
10
|
Posted - 2015.08.08 01:23:50 -
[44] - Quote
Hell even I could follow along. Good job !
Thanks for your continued dedication to EVE ......... |
![Nevyn Auscent Nevyn Auscent](https://images.evetech.net/characters/91786526/portrait?size=64)
Nevyn Auscent
Broke Sauce
2348
|
Posted - 2015.08.08 01:34:50 -
[45] - Quote
10/10 Dev blog, would read again. Great explanation of what happened and how you go about such procedures. |
![Jonathan Yatolila Jonathan Yatolila](https://images.evetech.net/characters/93651888/portrait?size=64)
Jonathan Yatolila
APOCALYPTIC INFESTATION
3
|
Posted - 2015.08.08 04:04:28 -
[46] - Quote
Cor'len wrote:
Vincent Athena wrote:CCP Masterplan, I get that. Thanks for the reply. But why not keep going and do the startup? Did you have some "one way" DB changes with this update that would have taken extra effort to revert or restore? So much effort that, at any given time, it looked better to just keep trying to fix the issue rather than get de-railed trying to roll back?
Also, it looked like you found the temporary fix by experimenting on TQ, something you would not have been able to do if you had done the roll back.
I expect it was also a case of "We can't reproduce this reliably on our test servers, so we have to debug it in production". I think it was somewhat unclear whether the DB was modified in a way which would've prevented the rollback, and as Masterplan said, a DB restore takes time - I seem to recall a figure of multiple hours.
CCP: Thanks for fixing it, for the skillpoints, and also for the well-written report on the pizza. <3
As others have said - great job on the fix, and an even better huzzah on the report!!! From your write-up - the only way to fix it was to leave the system down and to troubleshoot it on the "live" system - since it was working on the test servers and such. Kudos to all of you. |
![Beta Maoye Beta Maoye](https://images.evetech.net/characters/93369638/portrait?size=64)
Beta Maoye
71
|
Posted - 2015.08.08 04:11:30 -
[47] - Quote
Feels like reading Sherlock Holmes. Nice job. |
![Raiz Nhell Raiz Nhell](https://images.evetech.net/characters/1843545211/portrait?size=64)
Raiz Nhell
Demon-War-Lords SpaceMonkey's Alliance
415
|
Posted - 2015.08.08 05:02:20 -
[48] - Quote
Best Dev Blog ever :)
Great explanation... Great solution :)
Situations like that are a developer's worst nightmare... but also the biggest rush... working under the pump, brainstorming, fiddling and then the Eureka!!! moment...
Then the inevitable "So whose code was it?" discussion :)
There is no such thing as a fair fight...
If you're fighting fair you have automatically put yourself at a disadvantage.
|
![Jenni Concarnadine Jenni Concarnadine](https://images.evetech.net/characters/565611144/portrait?size=64)
Jenni Concarnadine
SYNDIC Unlimited
6
|
Posted - 2015.08.08 09:20:35 -
[49] - Quote
Thank you very much for this.
It offered a clean account of what must have been chaos and much gnashing of teeth at the time.
While I wouldn't give back my SP, this is worth as much. |
![Richard TheLordOfDance Richard TheLordOfDance](https://images.evetech.net/characters/639511384/portrait?size=64)
Richard TheLordOfDance
Operation Fishbowl Inc.
13
|
Posted - 2015.08.08 09:42:36 -
[50] - Quote
Is it weird that this blog made me want to sit down and do some coding?
Extremely well written, almost like a short detective story complete with a cliffhanger at the end! :D |
|
![Flay Nardieu Flay Nardieu](https://images.evetech.net/characters/92142623/portrait?size=64)
Flay Nardieu
63
|
Posted - 2015.08.08 12:27:51 -
[51] - Quote
The candor about the incident and the insight into how it was handled were much appreciated. |
![Snape Dieboldmotor Snape Dieboldmotor](https://images.evetech.net/characters/1134293287/portrait?size=64)
Snape Dieboldmotor
Minotaur Congress
39
|
Posted - 2015.08.08 13:34:19 -
[52] - Quote
Great read. THANKS |
![Hel O'Ween Hel O'Ween](https://images.evetech.net/characters/1655827332/portrait?size=64)
Hel O'Ween
Men On A Mission
122
|
Posted - 2015.08.08 13:34:44 -
[53] - Quote
Raiz Nhell wrote:Best Dev Blog ever :)
I wouldn't say best (technical) blog ever, i.e. I remember a dev blog about TQ's hardware, which was also a very interesting read. But this one's definitely one of the most interesting blogs.
Thx, for the write-up, CCP. Now looking forward to the resolution's blog, once the investigation has revealed the culprit.
tl;dr
+1, would read again. ![Smile](https://forums-archive.eveonline.com/Images/Emoticons/ccp_smile.png)
EVEWalletAware - an offline wallet manager.
|
![Ezio di Firenze Ezio di Firenze](https://images.evetech.net/characters/91024670/portrait?size=64)
Ezio di Firenze
Original Sinners The Bastion
0
|
Posted - 2015.08.08 14:39:07 -
[54] - Quote
+1 great article. It also piqued my interest: the wiki says that the database servers run on Microsoft Server and SQL Server. What do the Sol layer servers run on? Is that also Windows or Linux, or maybe a custom CCP OS? What do you guys use for that grid computing orchestration? It sounds really awesome how you do that! |
![Stanislav Kolomnitcki Stanislav Kolomnitcki](https://images.evetech.net/characters/91502232/portrait?size=64)
Stanislav Kolomnitcki
The Scope Gallente Federation
0
|
Posted - 2015.08.08 14:53:14 -
[55] - Quote
maybe you use the "print" function as debug in "campaign_logger" and do not have default "stdout"? =) |
![Jessica Danikov Jessica Danikov](https://images.evetech.net/characters/91452926/portrait?size=64)
Jessica Danikov
Eternity INC. Goonswarm Federation
450
|
Posted - 2015.08.08 15:38:22 -
[56] - Quote
Logs can be a pain as, without them, you can have an issue that occurs only on your production servers that has no clear indication on how to reproduce, but if the logging isn't sufficient, sometimes all you can do is make prospective changes to the logging in the hope that the cause is better indicated next release cycle around.
Worse still, logging can be a performance bottleneck. Causes range from multiple loggers writing to the same file, which normally requires some degree of synchronization, to loggers that use reflection to give you nice, informative output at the cost of taking 100x longer per call. As a result, minimizing logging in production is usually desirable to keep everything from being slow, at the cost of never knowing what's wrong (the logs show nothing!).
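One common way to soften the synchronization cost is to have callers hand records to a queue and let a single background thread do the actual writing. The sketch below uses Python's standard `logging.handlers.QueueHandler` and `QueueListener`. The logger name and the in-memory `ListHandler` are stand-ins invented for this example, and this is not a description of how TQ's logging works.

```python
import logging
import logging.handlers
import queue

class ListHandler(logging.Handler):
    """Stand-in for a slow file handler; collects formatted messages in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))

log_queue = queue.Queue()
sink = ListHandler()

# The listener drains the queue on a background thread and is the only
# writer to the sink, so calling code never waits on the sink itself.
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

logger = logging.getLogger("campaign_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo's records out of the root logger
logger.addHandler(logging.handlers.QueueHandler(log_queue))

for i in range(3):
    logger.info("campaign event %d", i)

listener.stop()  # processes everything still queued, then joins the thread
```

The trade-off is that records are written slightly after the call that produced them, so a hard crash can lose whatever is still sitting in the queue.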
+1 for the article and another +1 for making more technical articles more of a habit (even if it means locking Devs up). |
|
![CCP DeNormalized CCP DeNormalized](https://images.evetech.net/characters/238488823/portrait?size=64)
CCP DeNormalized
C C P C C P Alliance
295
|
Posted - 2015.08.08 17:26:16 -
[57] - Quote
Cor'len wrote:
I expect it was also a case of "We can't reproduce this reliably on our test servers, so we have to debug it in production". I think it was somewhat unclear whether the DB was modified in a way which would've prevented the rollback, and as Masterplan said, a DB restore takes time - I seem to recall a figure of multiple hours.
CCP: Thanks for fixing it, for the skillpoints, and also for the well-written report on the pizza. <3
We take full backups prior to each DT, so it would have been a full backup in no recovery mode plus a few transaction log backups to bring us to just past DT.
Roughly 3 hours for the 3 TB+ restore.
Funny enough we can restore faster in our test env. due to having a massive pool of SAS disks (100's) on the SAN vs. the small pool of SSD disks that the TQ DB uses :)
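For a sense of scale, the figures quoted above imply a sustained restore rate of a few hundred megabytes per second. The small calculation below is only a ballpark: "3 TB+" and "roughly 3 hours" are both approximations, decimal units are assumed, and the function name is made up for this example.

```python
def restore_rate_mb_per_s(size_tb, hours):
    """Average throughput in MB/s, using decimal units (1 TB = 1,000,000 MB)."""
    return size_tb * 1_000_000 / (hours * 3600)

# 3 TB restored in about 3 hours, as quoted in the post above.
rate = restore_rate_mb_per_s(3, 3)
print(f"{rate:.0f} MB/s")
```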
CCP DeNormalized
DBA
Virtual World Operations
|
|
![Dradis Aulmais Dradis Aulmais](https://images.evetech.net/characters/94543022/portrait?size=64)
Dradis Aulmais
RW Vindicator Connection Phoebe Freeport Republic
989
|
Posted - 2015.08.08 17:38:38 -
[58] - Quote
Sounds like Ghost in the machine.
TQ is a very unique system. 12 years old, reborn several times. Code here, code there, its own little ecosystem. It's like the ultimate Capsuleer.
Dradis Aulmais, Federal Attorney Number 54896
Free The Scope Three
|
![Soldarius Soldarius](https://images.evetech.net/characters/268384296/portrait?size=64)
Soldarius
Naliao Inc. Test Alliance Please Ignore
1361
|
Posted - 2015.08.08 17:47:40 -
[59] - Quote
"IBM"
Found your problem. /sarcasm I used to work at IBM. Won't repeat.
lel, seriously though. Very interesting write-up.
http://youtu.be/YVkUvmDQ3HY
|
![KenFlorian KenFlorian](https://images.evetech.net/characters/95262986/portrait?size=64)
KenFlorian
Jednota Inc
24
|
Posted - 2015.08.08 17:57:15 -
[60] - Quote
Marc Callan wrote:Illuminating. But worryingly, I got the distinct impression that CCP figured out what was causing the problem but not why - unless the underlying cause of the logging issues has since been determined?
As a former software developer/IT guy, I can say this happens more often than most of us choose to publicly acknowledge. Hats off to CCP for telling us what happened as best they could sort it out. They, more than anybody, would like a perfectly coherent explanation... sometimes that's just not possible. |
|