Irma Bondis
|
Posted - 2008.12.28 15:43:00 -
[31]
Originally by: Roy Batty68 So you guys reindex the major tables every downtime?
I would expect so... MS SQL Server is in essence a fork of Sybase, and one of the major things with Sybase has always been indexes and the need to rebuild them after a large number of changes have been made. Certainly when one uses clustered indexes (improved speed over non-clustered indexes), the nature of the cluster structure means a lot of updates will result in a lot of empty space in the pages that make up the table. Thus once a day on a frequently updated table you will want to reorganize the table pages and reclaim some of that empty space, making the whole thing faster than it was before.
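To make the "empty space in the pages" point concrete, here is a toy simulation. It is not how SQL Server or Sybase store data; the page capacity of 8 rows and the split-in-half rule are assumptions made purely for illustration. Rows arrive in random key order, full pages split, and a rebuild repacks everything into full pages.

```python
import bisect
import random

PAGE_CAPACITY = 8  # hypothetical number of rows that fit on one page


def insert(pages, key):
    """Insert key into a sorted list of sorted pages, splitting a full page."""
    # Find the page whose key range should hold the new key.
    firsts = [page[0] for page in pages]
    idx = max(bisect.bisect_right(firsts, key) - 1, 0)
    page = pages[idx]
    bisect.insort(page, key)
    if len(page) > PAGE_CAPACITY:
        # Page split: the rows are divided over two pages, each roughly half empty.
        mid = len(page) // 2
        pages[idx:idx + 1] = [page[:mid], page[mid:]]


def rebuild(pages):
    """Repack all rows into completely full pages, in key order."""
    rows = [key for page in pages for key in page]
    return [rows[i:i + PAGE_CAPACITY] for i in range(0, len(rows), PAGE_CAPACITY)]


def fill_ratio(pages):
    """Fraction of the allocated page space that actually holds rows."""
    return sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)


random.seed(1)
keys = random.sample(range(1_000_000), 2_000)
pages = [[keys[0]]]
for key in keys[1:]:
    insert(pages, key)

packed = rebuild(pages)
print(f"before rebuild: {len(pages)} pages, {fill_ratio(pages):.0%} full")
print(f"after rebuild:  {len(packed)} pages, {fill_ratio(packed):.0%} full")
```

The rebuilt table needs fewer pages for the same rows, which is the space a daily reorganize would reclaim.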
For a non-clustered index you will still want to do something like that. Even though the used-space part is not so important, the amount of data changes can make the index less than accurate and you will want to recalculate it, which is pretty much the same as making the database have a new look at the data and decide upon the best way to index it.
In short any and all databases with a lot of changes on a daily basis will need to have their indexes poked around from time to time in order to keep the index working as fast as possible.
For example: say you decide to split a 10,000 row table into 10 parts, putting the first 1000 entries in the first part, the next 1000 in the next, etc. Then, assuming you are looking for data from the first 1000 entries, the index will let the database evaluate only the first set of data and ignore the rest. But when you do a couple of thousand insert/update/delete actions each day, the index will very soon be pointless, as the data that was originally stored in the 1st set of 1000 records might now belong (thanks to the updates) in the 10th set, and maybe you now have 20,000 records thanks to all the inserts or only 1000 thanks to all the deletes. So what you do is make the database have a fresh look at the table and recalculate the best way to index the data in it, basically rebuilding the index. In a very simple explanation, that is why you would rebuild/reindex your tables every downtime.
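A rough sketch of that example in code, assuming the "index" is nothing more than the first key of every 1000-row part (all function names here are made up for illustration, and a real database index is a good deal more involved):

```python
import bisect

ROWS_PER_PART = 1000  # part size taken from the example above


def build_parts(keys):
    """Sort the keys and cut them into parts of ROWS_PER_PART rows."""
    ordered = sorted(keys)
    return [ordered[i:i + ROWS_PER_PART] for i in range(0, len(ordered), ROWS_PER_PART)]


def build_index(parts):
    """The 'index' here is just the first key of every part."""
    return [part[0] for part in parts]


def indexed_lookup(parts, index, key):
    """Return (found, rows_considered), looking only at the one relevant part."""
    part_no = max(bisect.bisect_right(index, key) - 1, 0)
    part = parts[part_no]
    # len(part) is an upper bound on the rows that have to be looked at.
    return key in part, len(part)


def full_scan(parts, key):
    """Return (found, rows_examined) without an index: every row is a candidate."""
    examined = 0
    for part in parts:
        for row in part:
            examined += 1
            if row == key:
                return True, examined
    return False, examined


parts = build_parts(range(10_000))
index = build_index(parts)
print(indexed_lookup(parts, index, 9_999))
print(full_scan(parts, 9_999))
```

With the index only one part of 1000 rows is a candidate; without it the search walks the whole table to reach the same row.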
So if the process spoken about kills the reindexing of the table, or it is a process where the index is removed, the indexed column(s) updated and the index rebuilt, then stopping that process will cause the next step, which most likely relies on that index being there, to load the database to its absolute max as it tries to scan a huge table completely and most likely many times over. If I am even half right, CCP should be able to simply check if the index looks happy before running a process that relies on the index being there; if not, make the process scream bloody murder and wait till someone with a carbon-based brain can come and fix the issue.
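A minimal sketch of that "check before you run" idea, using SQLite from Python as a stand-in because it runs anywhere. The table name, index name and job are hypothetical; on SQL Server the equivalent lookup would go against catalog views such as sys.indexes rather than sqlite_master.

```python
import sqlite3


class MissingIndexError(RuntimeError):
    """Raised when an index a job depends on is absent."""


def index_exists(conn, index_name):
    """Check the SQLite catalog for an index with this name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None


def run_guarded(conn, required_indexes, job):
    """Run job(conn) only if every required index is present."""
    missing = [name for name in required_indexes if not index_exists(conn, name)]
    if missing:
        # Scream bloody murder and leave the fix to a human.
        raise MissingIndexError(f"refusing to run, missing indexes: {missing}")
    return job(conn)


def count_characters(conn):
    """Hypothetical job that would rely on the index in a real system."""
    return conn.execute("SELECT COUNT(*) FROM characters").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE characters (character_id INTEGER, user_id INTEGER)")
conn.execute("CREATE INDEX ix_characters_user ON characters (user_id)")
print(run_guarded(conn, ["ix_characters_user"], count_characters))

conn.execute("DROP INDEX ix_characters_user")
try:
    run_guarded(conn, ["ix_characters_user"], count_characters)
except MissingIndexError as exc:
    print(exc)
```

The check only proves the index exists by name; it says nothing about whether the query planner will actually choose it.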
|
Maria Kalista
|
Posted - 2008.12.28 15:55:00 -
[32]
Quote: In a very simple explanation that is why you would rebuild/reindex your tables every downtime
/Me nods (still don't understand what was going on), but why do my (and others) socks still get lost?
Originally by: AkRoYeR
...the beauty of EvE. You have to live on the edge all the time. If you don't stay frosty, you will die!
Best game ever!
|
Zhora Six
|
Posted - 2008.12.28 15:56:00 -
[33]
Originally by: Roy Batty68 Ooo! Moar database geekyness please!
Yes, I am fascinated as well! I'm currently studying SQL and Exchange, so I might know something of what they're talking about...
It sounds like they have a single-instanced cluster for better performance, but that comes with some downsides. I wonder if they would have been able to recover the job in a multi-instanced cluster, or if it was simply doomed from the start... _______________________________________ Pull a lever, push a button, have a banana, die.
Space Monkeys. |
Guilty Man
Minmatar Guilty People
|
Posted - 2008.12.28 16:17:00 -
[34]
Originally by: Maria Kalista but why do my (and others) socks still get lost?
my socks usually get lost when I come home drunk.
|
Irma Bondis
|
Posted - 2008.12.28 16:23:00 -
[35]
Socks get lost due to the fact that the big washing machine in the sky would be a lonely place without the occasional sock crossing over and camping out there for a while.
As to why you keep getting disconnected, there are two options: either the proxy you are logging onto goes belly up and you get dropped, or any number of things between your computer and the EVE cluster get messed up and your connection to the proxy is lost. Since the EVE client is supposed to handle quite significant drops in connection quality, I would say it is likely that one or more proxies are unhappy over at the CCP server farm. I have been connected with two clients for a few hours now without any issues, so I am sure that not all proxies are unhappy.
|
|
CCP Prism X
Gallente C C P
|
Posted - 2008.12.28 16:24:00 -
[36]
Originally by: Irma Bondis
Originally by: Roy Batty68 So you guys reindex the major tables every downtime?
Stuff <-- See! Quotes don't have to include an insane block of text. Yay for fewer characters stored!
Your tech speak is spot on but your assumption is aaaalmost half-right. The two different procedures have no step by step relation though as you described. One is a downtime job and the other is fetching character information on login. That's why the server made it up until people started logging in at which point the constant fullscans caused CPU to skyrocket, important cluster calls were not getting through and nodes started dying.
You are however right in assuming that we could code each and every procedure to check for the existence of those indexes we'd expect it to use although it would have to cover quite a lot as SQL Server can sometimes be a big black box of hate and do things that you'd never expect in its query plans. (Personal note to CCP Atlas: See, I wrote "its" rather than "it's". I get the paradigm! /personalJoke). But it's somewhat obvious that that is a lot of redundant overhead we can't really accept in a DB that needs to be as robust as possible. We also shouldn't need to accept it as we should be able to trust our indexes not to go *poof* on us.
However, it's an unacceptable risk. 135 minutes of extra downtime that could have been avoided is really not acceptable to anyone here in CCP. Sure we are now all aware of this possibility and know how to detect it but it's still (See Atlas! SEE!) an utterly unnecessary risk of increased downtime, even if it saves us some minutes of the daily downtime. So this will most likely change in the near future.
Lesson learned: Automatically dropping indexes ftl.
~ Prism X EvE Database Developer Relocating your character to a cozy, secure container since 2006. Relocating your cozy, secure container to the EVE cemetery since 2008. |
|
rValdez5987
Amarr 32nd Amarrian Imperial Navy Regiment.
|
Posted - 2008.12.28 16:30:00 -
[37]
Edited by: rValdez5987 on 28/12/2008 16:30:42
Originally by: Irma Bondis
Originally by: Roy Batty68 So you guys reindex the major tables every downtime?
Stuff <-- See! Quotes don't have to include an insane block of text. Yay for fewer characters stored!
I love it when you talk dirty to me!
|
Kazrm
Caldari Abundance Industries The Economy
|
Posted - 2008.12.28 16:40:00 -
[38]
Originally by: CCP Prism X
Originally by: Irma Bondis
Originally by: Roy Batty68 So you guys reindex the major tables every downtime?
Stuff <-- See! Quotes don't have to include an insane block of text. Yay for fewer characters stored!
Your tech speak is spot on but your assumption is aaaalmost half-right. The two different procedures have no step by step relation though as you described. One is a downtime job and the other is fetching character information on login. That's why the server made it up until people started logging in at which point the constant fullscans caused CPU to skyrocket, important cluster calls were not getting through and nodes started dying.
You are however right in assuming that we could code each and every procedure to check for the existence of those indexes we'd expect it to use although it would have to cover quite a lot as SQL Server can sometimes be a big black box of hate and do things that you'd never expect in its query plans. (Personal note to CCP Atlas: See, I wrote "its" rather than "it's". I get the paradigm! /personalJoke). But it's somewhat obvious that that is a lot of redundant overhead we can't really accept in a DB that needs to be as robust as possible. We also shouldn't need to accept it as we should be able to trust our indexes not to go *poof* on us.
However, it's an unacceptable risk. 135 minutes of extra downtime that could have been avoided is really not acceptable to anyone here in CCP. Sure we are now all aware of this possibility and know how to detect it but it's still (See Atlas! SEE!) an utterly unnecessary risk of increased downtime, even if it saves us some minutes of the daily downtime. So this will most likely change in the near future.
Lesson learned: Automatically dropping indexes ftl.
Now if only CCP could make a post like this every time there's a major issue to explain why there's a problem, admitting that it's unacceptable, and letting us know that they plan to fix it soon. |
Zhora Six
|
Posted - 2008.12.28 17:07:00 -
[39]
Originally by: CCP Prism X However, it's an unacceptable risk. 135 minutes of extra downtime that could have been avoided is really not acceptable to anyone here in CCP. Sure we are now all aware of this possibility and know how to detect it but it's still (See Atlas! SEE!) an utterly unnecessary risk of increased downtime, even if it saves us some minutes of the daily downtime. So this will most likely change in the near future.
Lesson learned: Automatically dropping indexes ftl.
Thanks for the detailed response. I'm amazed at the time you all take to respond to the community. In my experience, no other mmo offers that level of interaction. It is quite appreciated! _______________________________________ Pull a lever, push a button, have a banana, die.
Space Monkeys. |
Yoinx
Caldari Black Elite
|
Posted - 2008.12.28 17:33:00 -
[40]
Originally by: CCP Prism X However, it's an unacceptable risk. 135 minutes of extra downtime that could have been avoided is really not acceptable to anyone here in CCP. Sure we are now all aware of this possibility and know how to detect it but it's still (See Atlas! SEE!) an utterly unnecessary risk of increased downtime, even if it saves us some minutes of the daily downtime. So this will most likely change in the near future.
Sounds like a Downtime Nerf is coming!
- I wish I had something witty to put in a signature. - |
|
Roy Batty68
Caldari Immortal Dead
|
Posted - 2008.12.28 17:44:00 -
[41]
Originally by: Zhora Six
Thanks for the detailed response. I'm amazed at the time you all take to respond to the community. In my experience, no other mmo offers that level of interaction. It is quite appreciated!
Very much this!
----
≡v≡ |
Wy LinChow
|
Posted - 2008.12.28 18:45:00 -
[42]
Please tell us this problem was not as the result of tweaks being made during the holiday. The culprit or culprit DBAs should be made to wait outside the station for 20 minutes without a space suit!
|
Kuranta
Minmatar Pator Tech School
|
Posted - 2008.12.28 18:49:00 -
[43]
The logs - they showed nothing!!
|
|
CCP Explorer
|
Posted - 2008.12.28 18:56:00 -
[44]
Originally by: Wy LinChow Please tell us this problem was not as the result of tweaks being made during the holiday.
This was not the result of tweaks being made during the Holidays.
Erlendur S. Thorsteinsson Software Director EVE Online, CCP Games |
|
Dr Sheepbringer
Gallente
|
Posted - 2008.12.28 19:16:00 -
[45]
OK, now for the killer:
CCP can this happen again? Stop whining. |
Xaviar Onassis
Iyen-Oursta Salvage
|
Posted - 2008.12.28 19:28:00 -
[46]
Originally by: CCP Prism X
...(Personal note to CCP Atlas: See, I wrote "its" rather than "it's". I get the paradigm! /personalJoke)....it's still (See Atlas! SEE!) ....
Does CCP Atlas also plan to get the downtime warning message corrected so that it says "out of harm's way" instead of "out of harms way"?
I petitioned it once, and a GM told me that since the plural of "harm" is "more harm", not "harms", no apostrophe was needed to indicate possession (it being "the way of harm" that you're getting out of)....
|
Blane Xero
Amarr The Firestorm Cartel
|
Posted - 2008.12.28 19:32:00 -
[47]
Originally by: Dr Sheepbringer OK, now for the killer:
CCP can this happen again?
Jesus, Prism stated that it is extremely unlikely and, even then, they will soon be fixing it so that it cannot happen again. ______________________________________________ Haruhiist since December 2008
|
Shimrod Ombreflamme
Gallente French Empire Squad
|
Posted - 2008.12.28 19:36:00 -
[48]
Many thanks to all the CCP people who solved the problem this Sunday.
|
Glengrant
TOHA Heavy Industries
|
Posted - 2008.12.28 19:48:00 -
[49]
Originally by: Dr Sheepbringer OK, now for the killer:
CCP can this happen again?
Not only *can* something like this happen again - something *will* happen.
A big DB, a lot of daily updates & inserts, regular changes due to new features, refactoring and optimizations. There's simply no chance at all that this will work flawlessly all the time.
A couple hours here and there will get lost to "unscheduled downtime". Accept that as a fact of life, shrug it off and you'll live a much more relaxed life.
Re "skill change around downtime": Come on folks - so you lose a couple hours worth of SP. Repeat after me: "This is not the end of the world.". --- Save the forum: Think before you post. ISK BUYER = LOSER EVE TV- Bring it back!
|
Qordel
Caldari School of Applied Knowledge
|
Posted - 2008.12.28 20:21:00 -
[50]
Imagine that, a Microsoft SQL server having problems. *gasp*
-- What's your EVE New Year's Resolution for 2009? |
|
Qordel
Caldari School of Applied Knowledge
|
Posted - 2008.12.28 20:32:00 -
[51]
Edited by: Qordel on 28/12/2008 20:35:59
Originally by: Roy Batty68 Ooo! Moar database geekyness please!
So you guys reindex the major tables every downtime?
I've always assumed that's what the downtime was for, but it never really made any sense to me. Sure, it's faster to re-index and run database maintenance on a non-active system, but it's nearly 2009. Why not run them hot? My postgres database only serves about half as many people as CCP, but it NEVER goes down except temporarily for upgrades. Everything can be run with very limited impact on performance throughout the day.
The only thing I can think of is that perhaps *IF* things do go bad during maintenance, it's easier for them to handle and recover with the least damage if the entire cluster is down, in which case perhaps an hour of downtime every day is a necessary evil. Especially if MS SQL doesn't have point-in-time recovery, write-ahead logging, etc. -- (and again, my company owns MySQL and bankrolls a couple of the top postgres devs, so my experience, which is mostly OUTSIDE of a professional scope in the first place where databases are concerned, is primarily with these databases and not anything Microsoft related, so I'm a bit ignorant on that front -- just making assumptions where MS SQL is concerned).
Personally, I wish there was never any downtime. Having an hour a day of downtime impacts a greater range of time than that. You have to build any major activities (especially in groups) around it, which means easing things down toward downtime or ramping them up after downtime, so there are a few hours every day where it's sort of in a "meh" state. Perhaps they can work downtime into New Eden lore somehow? *snicker*
Then again, I'm sure if they could eliminate downtime altogether while mitigating potential disaster, they'd already do it. And perhaps it's a necessary evil in their situation since, outside of the database itself, they're also dealing with a lot of new technologies, and when any of them are on the fritz, it's easier to extend a downtime that is drilled into all of our heads than zap us with a downtime out of the blue. Mentally, I think people are more willing to accept it when a scheduled downtime is already part of our day than if we only had downtimes during emergencies. -- What's your EVE New Year's Resolution for 2009? |
Gane Green
Gallente Dominus Imperium
|
Posted - 2008.12.28 21:08:00 -
[52]
Skill queue when? If God was a number he would be over 9,000!!!!!!!!! |
AshenShugar01
|
Posted - 2008.12.28 22:44:00 -
[53]
So all is well now, though I DC'd twice this morning, and this is a good thing. However, what happens to yesterday's Faction War statistics? I mean, I ran a large number of plexes yesterday and I hope the Victory Points haven't just disappeared into the wormhole.....
|
Trillian Darkwater
|
Posted - 2008.12.29 07:34:00 -
[54]
Originally by: Xaviar Onassis
Originally by: CCP Prism X
...(Personal note to CCP Atlas: See, I wrote "its" rather than "it's". I get the paradigm! /personalJoke)....it's still (See Atlas! SEE!) ....
Does CCP Atlas also plan to get the downtime warning message corrected so that it says "out of harm's way" instead of "out of harms way"?
I petitioned it once, and a GM told me that since the plural of "harm" is "more harm", not "harms", no apostrophe was needed to indicate possession (it being "the way of harm" that you're getting out of)....
And the GM is right. Language-****s are only successful when they're right
|
Trillian Darkwater
|
Posted - 2008.12.29 07:36:00 -
[55]
Originally by: Qordel Stuff about DT
A lot of other stuff happens during DT that they can't/won't do on a live system, such as sovereignty changes, etc.
There are posts/devblogs about this if you search...
|
ollobrains2
Gallente New Eve Order Holdings
|
Posted - 2008.12.29 11:01:00 -
[56]
What work has been done today to minimize the chances of another delay today?
|
Darwin Duck
Caldari Provisions
|
Posted - 2008.12.29 11:49:00 -
[57]
Edited by: Darwin Duck on 29/12/2008 11:49:15
Still not loading after 50 minutes downtime. Let's hope there are no problems. Decided to start playing again today, and problems always arise in MMOs when I decide to return.
Edit: Yay, loading
|
ollobrains2
Gallente New Eve Order Holdings
|
Posted - 2008.12.29 12:26:00 -
[58]
3 CTDs already, and my memory usage keeps going up, which indicates a memory leak somewhere. It's not my system: I've got 8 GB of DDR2 RAM in a new system. So it must be server end.
|
Taloth Saldono
|
Posted - 2008.12.29 12:37:00 -
[59]
Originally by: Xaviar Onassis
Originally by: CCP Prism X
...(Personal note to CCP Atlas: See, I wrote "its" rather than "it's". I get the paradigm! /personalJoke)....it's still (See Atlas! SEE!) ....
Does CCP Atlas also plan to get the downtime warning message corrected so that it says "out of harm's way" instead of "out of harms way"?
I petitioned it once, and a GM told me that since the plural of "harm" is "more harm", not "harms", no apostrophe was needed to indicate possession (it being "the way of harm" that you're getting out of)....
Obviously isn't a huge issue :D still, it should be harm's way. Possessive always uses an apostrophe, harm is singular so "harm's", so is player thus "player's", but players is plural thus "players'". So petition again :D
On-topic again, I'm kinda surprised that checking for valid indexes isn't included in the DB health check before start-up. It may take a while to check everything, but it's definitely better than having 10+ people working on unexpected issues and another 30000+ people waiting and overloading the forums :D
As for the Skill queue, before you know it people will start crying for API import so they can import EveMon plan queue, and if I've grasped the policy correctly, that will never ever happen (third-party imports). The Skill queue is on the drawing board btw.
Fly safe, while you still can.
|
ollobrains2
Gallente New Eve Order Holdings
|
Posted - 2008.12.29 13:38:00 -
[60]
Still getting these 5-second lockups; every 4th one or so leads to a CTD. Memory leak.
|