CCP Guard
C C P C C P Alliance
2211
|
Posted - 2012.05.03 16:41:00 -
[1] - Quote
Our EVE 3rd party development community is simply awe inspiring and now that I've gotten that out of my system, I recommend all 3rd party developers go here and read this dev blog about some changes we're making to the EVE 3rd Party Toolkit. CCP Guard | EVE Community Developer | @ccp_guard |
|
Cathrine Kenchov
Ice Cold Ellites
2
|
Posted - 2012.05.03 16:46:00 -
[2] - Quote
cool stuff |
Steve Ronuken
Fuzzwork Enterprises
392
|
Posted - 2012.05.03 16:47:00 -
[3] - Quote
/me cries quietly about more conversion work
Yay! Time to learn something new. (that's actually a positive yay, rather than a sarcastic one)
Now to wait for the update so I can download it. FuzzWork Enterprises http://www.fuzzwork.co.uk/ Blueprint calculator, invention chance calculator, isk/m3 Ore chart and other 'useful' utilities. |
Akrasjel Lanate
Black Thorne Corporation Black Thorne Alliance
719
|
Posted - 2012.05.03 16:49:00 -
[4] - Quote
Cool but short |
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
0
|
Posted - 2012.05.03 16:54:00 -
[5] - Quote
Can you elaborate on the decision to change to YAML? My understanding of the format (having used it before) is that it's appropriate when the information needs to be both human- and machine-readable. Configuration files are an obvious candidate. Static data, on the other hand, seems more appropriately stored in a database or at least a table format (like CSV). YAML is a lot of key/value pairs and arrays. Why was it chosen to represent tabular data? |
Droxlyn
TOHA Heavy Industries TOHA Conglomerate
78
|
Posted - 2012.05.03 16:59:00 -
[6] - Quote
I'm confused as well. Why not keep the data in the database and export it at each release for the client to use?
For those that do not want to change, it'll just require ramming it back into the database somehow. |
ctx2007
Wychwood and Wells
61
|
Posted - 2012.05.03 17:05:00 -
[7] - Quote
sixth |
darmwand
Repo.
37
|
Posted - 2012.05.03 17:14:00 -
[8] - Quote
Aaah, this is awesome! Thanks a lot, this should make it much easier to deal with EVE data, using YAML is definitely the right thing to do. darmwand Repossession Agent http://www.repo-corp.net/ Recruitment is OPEN |
Droxlyn
TOHA Heavy Industries TOHA Conglomerate
78
|
Posted - 2012.05.03 17:40:00 -
[9] - Quote
Samples for those too lazy to open it up:
graphicsIDs.yaml wrote:
12:
  description: Stargate
  graphicFile: res:/Model/Jumpgate/Caldari/cj2/cj2.blue
  obsolete: true
typeIDs.yaml wrote:
5:
  graphicID: 6
6:
  graphicID: 1015
I hope they find more value in later YAMLs. These could have been done just as well with CSV files. |
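To make the CSV comparison concrete, here is a minimal sketch of flattening a mapping of this shape into CSV text. It assumes the YAML has already been parsed into a Python dict (for example with PyYAML's `yaml.safe_load`); the `to_csv` helper and the `graphicID` column name are hypothetical, chosen only for illustration.

```python
import csv
import io

# Hypothetical stand-in for the parsed graphicsIDs.yaml sample; in practice
# a YAML parser such as PyYAML's yaml.safe_load would produce this dict.
graphics = {
    12: {
        "description": "Stargate",
        "graphicFile": "res:/Model/Jumpgate/Caldari/cj2/cj2.blue",
        "obsolete": True,
    },
}

def to_csv(records, key_name):
    """Flatten {id: {field: value}} into CSV text with one row per id."""
    # Collect every field name that appears, since entries may be sparse.
    fields = sorted({f for rec in records.values() for f in rec})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=[key_name] + fields, lineterminator="\n")
    writer.writeheader()
    for key in sorted(records):
        # Missing fields are written as empty cells (DictWriter's default).
        writer.writerow({key_name: key, **records[key]})
    return out.getvalue()

print(to_csv(graphics, "graphicID"))
```

This works while every entry is a flat set of scalar fields; once entries contain lists or nested mappings, a single CSV file no longer holds them without extra conventions.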
Antihrist Pripravnik
Scorpion Road Industry
6
|
Posted - 2012.05.03 17:59:00 -
[10] - Quote
Never heard of YAML, but I've heard (and worked) a lot with JSON. Why YAML and not JSON? CCP Ytterbium: Yarrblblbgrlblbgrlblblblbblbgrlblblbgrblblyarrrrdrooooooolonthekeyboardlikealunatic |
|
Thebriwan
LUX Uls Xystus
41
|
Posted - 2012.05.03 18:08:00 -
[11] - Quote
I truly don't understand why anyone would use this Yam-something over XML but that is not my decision to make.
Where I really see a problem is here:
There are at least two totally different use cases for the static data:
a) Something like a stand alone application for manually processing some data.
b) A web app processing millions of datasets and calculating on the fly.
For a) it does not really matter if there are csv or xml or yam-something files.
But b) needs a database.
And for all the effort you are doing here there is no way to provide sql files anymore? |
Two step
Aperture Harmonics K162
1926
|
Posted - 2012.05.03 18:10:00 -
[12] - Quote
A suggestion:
Before you guys switch over to the new system, how about you release a *full* data dump in the new format? Giving us 2 sample files, especially when it doesn't sound like there will be a 1-1 mapping of old tables to the new files isn't enough for people to build tools to handle the new data.
Edit: I just re-read the blog, sounds like you are *only* changing those two files for Inferno. My request stands for when you change over more stuff.. :) CSM 7 Secretary CSM 6 Alternate Delegate @two_step_eve on Twitter My Blog
|
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
0
|
Posted - 2012.05.03 18:18:00 -
[13] - Quote
Antihrist Pripravnik wrote:Never heard of YAML, but I've heard (and worked) a lot with JSON. Why YAML and not JSON?
YAML is a superset of JSON. That said, I don't think either are appropriate. They can handle tabular data, but that's like saying you can send someone a picture as a Word document filled with comma-separated color values. Yes, the information is there, but it's in the wrong format.
(As far as I know,) EVE static data is tabular, not flexible key/value data like YAML is meant to represent. XML is sort of in between the two.
Honestly, I'm more concerned that CCP is choosing to use YAML internally than that they're forcing us to use it. I guess I shouldn't jump to conclusions before CCP can comment on it, though. |
Xarrg
Crushed Ambitions Reckless Ambition
3
|
Posted - 2012.05.03 18:43:00 -
[14] - Quote
How easy or hard will it be to transport them back to SQL format? I'm sure most of the 3rd party guys will stick with their MS SQL/MySQL, so this will just be an extra step for us.
|
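As a rough idea of the extra step, the sketch below loads a typeIDs.yaml-style mapping into a relational table. SQLite stands in for MS SQL/MySQL so the example is self-contained; the `typeGraphics` table name and the `load_types` helper are hypothetical, and the dict is assumed to come from a YAML parser.

```python
import sqlite3

# Hypothetical parsed form of the typeIDs.yaml sample (typeID -> attributes).
type_ids = {5: {"graphicID": 6}, 6: {"graphicID": 1015}}

def load_types(conn, records):
    """Create a small relational table and insert one row per typeID."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS typeGraphics ("
        "typeID INTEGER PRIMARY KEY, graphicID INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO typeGraphics (typeID, graphicID) VALUES (?, ?)",
        [(type_id, attrs.get("graphicID")) for type_id, attrs in records.items()],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load_types(conn, type_ids)
rows = conn.execute(
    "SELECT typeID, graphicID FROM typeGraphics ORDER BY typeID"
).fetchall()
print(rows)  # [(5, 6), (6, 1015)]
```

The same loop would apply to another database driver, with the placeholder syntax and column types adjusted for that engine.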
Zagdul
Clan Shadow Wolf Fatal Ascension
580
|
Posted - 2012.05.03 18:50:00 -
[15] - Quote
Item ID please stop using names.
It's not rocket surgery. |
Vessper
Eve Engineering Finance Eve Engineering
8
|
Posted - 2012.05.03 18:51:00 -
[16] - Quote
How long before we stop getting the MSSQL format?
|
Abdiel Kavash
Paladin Order Fidelas Constans
435
|
Posted - 2012.05.03 18:55:00 -
[17] - Quote
Because relational databases which have decades of research, optimizations, and development behind them are too mainstream. Let's throw XML at the problem.
Also, we want to make life easy for application programmers, so for the foreseeable future we will make you pull half of the data from a DB, and half of it from XML. You know, just to expand your horizons. |
|
Chribba
Otherworld Enterprises Otherworld Empire
3365
|
Posted - 2012.05.03 18:56:00 -
[18] - Quote
Vessper wrote:How long before we stop getting the MSSQL format?
This?
I'd probably be looking at converting it all back to db since that's what I prefer myself though.
/c
|
|
|
CCP Solomon
C C P C C P Alliance
142
|
Posted - 2012.05.03 19:01:00 -
[19] - Quote
As some have correctly noted, the reason for the split format delivery is due to an internal process change in how we manage static data, both during authoring and at run-time. This is a gradual migration effort that will see more and more portions of the static data dump delivered as YAML files. There is currently no date for when we will stop delivering the database dump, rather it will occur when there is no data left in it.
We will be delivering up to date samples of the YAML files and database dumps ahead of each major release (assuming the data has changed) to give 3rd party developers a chance to update their tools.
These will be posted on the EVE Toolkit page. Associate Technical Producer - Foundation Technology |
|
|
CCP Solomon
C C P C C P Alliance
142
|
Posted - 2012.05.03 19:03:00 -
[20] - Quote
Chribba wrote:Vessper wrote:How long before we stop getting the MSSQL format?
This? I'd probably be looking at converting it all back to db since that's what I prefer myself though. /c
Yes, we absolutely encourage this and anticipated it to a certain extent; the 3rd party developer community is a resourceful bunch.
Associate Technical Producer - Foundation Technology |
|
|
James Bryant
Deep Core Mining Inc. Caldari State
7
|
Posted - 2012.05.03 19:18:00 -
[21] - Quote
CCP Solomon wrote:As some have correctly noted, the reason for the split format delivery is due to an internal process change in how we manage static data, both during authoring and at run-time. This is a gradual migration effort that will see more and more portions of the static data dump delivered as YAML files.
Can you shed any light on what this process change is and why it was undertaken? That might help us select the correct tools to manage a hybrid YAML/SQL environment like what I'm assuming you guys will be doing.
|
Andrea Griffin
274
|
Posted - 2012.05.03 19:26:00 -
[22] - Quote
I'm a bit befuddled over the choice of YAML but hey, if it works for you guys then that's great. I know that a lot of guys where I work are moving a lot of our configs over from a simple key=value format to YAML.
I'm not sure I would want it for the massive data sets that Eve uses, but it's a standard format, it isn't XML (which is good and bad), and it's very readable. So, what'ev. : > I'm just happy that CCP is awesome enough to provide us with the data in the first place. CCP Sreegs is my favorite developer. |
Ciar Meara
Virtus Vindice
643
|
Posted - 2012.05.03 19:49:00 -
[23] - Quote
casual corpse collector (wait, what?)
I love my corpses, but now I can love em even more ... - [img]http://go-dl1.eve-files.com/media/corp/janus/ceosig.jpg[/img] [yellow]English only please. Zymurgist[/yellow] |
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
1
|
Posted - 2012.05.03 19:58:00 -
[24] - Quote
CCP Solomon wrote:As some have correctly noted, the reason for the split format delivery is due to an internal process change in how we manage static data, both during authoring and at run-time.
Yes, but why? In what way is the data changing such that YAML is the most appropriate format, and/or in what way is YAML the most appropriate format for the existing data?
There may well be a very good reason. I'm just interested in hearing it. |
Matthew
BloodStar Technologies
2
|
Posted - 2012.05.03 20:05:00 -
[25] - Quote
CCP Solomon wrote:As some have correctly noted, the reason for the split format delivery is due to an internal process change in how we manage static data, both during authoring and at run-time. This is a gradual migration effort that will see more and more portions of the static data dump delivered as YAML files.
The blog itself seems to suggest that only the client is currently using the static data in YAML format. Does this mean that the server (fed as it is through a lovely beast of an SQL Server), still has this data in a database format?
If so, what is the logic behind not providing all the data in both formats?
Or is the plan that all static data will exist only as YAML throughout both client and server?
My concern with this move is that while, as you point out, there are plenty of yaml readers for common programming languages, support for it in more off-the-shelf usage scenarios (ranging from an SQL Express install on someone's desktop, right down to someone knocking together their own home-brew spreadsheet) is rather less complete. Unfortunately, the latter group of data-dump users are unlikely to really even consider themselves 3rd party developers, so the first we'll probably hear from them is the wave of moans when the first of the really key data tables transitions over (the rest of invTypes, for example).
I guess I'd just be happier at accepting the higher barrier to entry that this creates if there was a bit more detail as to the advantages you expect in moving the data to YAML-only. Right now it looks like a shift of format of what is essentially perfectly happy, tabular, relational data, without any obvious benefit.
Chribba wrote: I'd probably be looking at converting it all back to db since that's what I prefer myself though.
If CCP are really going to push the data as YAML-only, then I can see this being a very popular service, particularly with myself! |
|
CCP Redundancy
C C P C C P Alliance
26
|
Posted - 2012.05.03 20:20:00 -
[26] - Quote
I figure I'll just answer some questions in an incomprehensible techy way.
As an organization, CCP has decided that we benefit from developers being able to work in branches (in a Perforce sense), and working in branches eventually means that you need to be able to change your data with your code and not affect other people. Binary formats and DBs can be perfect for the run-time data requirements that you have, but they're sucky from the point of view of understanding and merging data as a frail human doing an integration. So we want our data in files, and we want to be able to merge them.
So why not CSV? Yes, it would seem to be more appropriate to some of the data we've shown off so far, but at some point you also have to realize that the reason your data is tabular is because you've been storing it in a tabular storage medium. It's also really sucky to deal with foreign key relations between text files. As we progressively migrate more data, and deal with things like moons being the children of planets, we can decide to represent that in the structure of our data.
And why not JSON? Well, JSON is spiffy and all, but we already use YAML as an institution (if you've ever looked at our .red files for our ship assets etc), and it can deal with some things much more nicely than JSON (we use YAML reference support for some things). JSON somewhat suffers for having been built for a language that hasn't had a standard concept of a map until ECMAScript 6 (stringified attribute names on objects just don't count, and this issue carries over into the proposed JSON schema validation standard).
We can output our static data as JSON, it's just not what we want to work with. We asked about this at the fanfest, but I think the detail was probably missed in the noise of the orbital bombardment.
So we wanted to use YAML, and a no-sql-ish document-like data setup, but this isn't really appropriate for the runtime. In fact, we don't use it for the runtime... we use a structured binary format that's built from the data (think MessagePack) and attempts to minimize memory overhead and disk seek operations. In some cases, we might even use SQLite (woo, standard Python library!) as appropriate for the use-case.
The end result of what we're trying to achieve is better runtime memory overhead performance, with an easier time for our developers to add / remove and change data formats in a human-understandable base format that can be versioned with changes to the code that are associated with it and isolated in branches. Some of the data will get built straight back into relation tables in MS-SQL, but some of it will just sit in a structured format that means that all related data to a particular "thing" is just accessible right there without subsequent join operations or lookups being required. We get to be able to sync our source control and get our data as it was at that point.
An important thing for us is to just try this and see if it can work for us, which is why we're starting so small. Beyond the changes to the data are much bigger changes to our tools and methods of working, not so much based on what the format is, but more on the issue of the data being local, isolated and in files rather than a central authoring DB.
I would suggest that the best long-term solution *might* be to look at NoSQL type databases, of which there are a number of free and very good options, or you can choose to try and maintain scripts that process our data into relational structures. |
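For readers wondering how a build step can turn human-readable records into a compact binary form, here is a minimal sketch using Python's standard `struct` module. It is not CCP's format: the two-field schema, the `SCHEMA_FIELDS` tuple and the helper names are assumptions made for illustration. The point is only that when the field order and types are fixed up front, the packed bytes need no attribute names or type tags.

```python
import struct

# Hypothetical schema: field order and struct format are fixed once, so the
# packed records carry no attribute names or type information.
SCHEMA_FIELDS = ("typeID", "graphicID")
RECORD = struct.Struct("<II")  # two little-endian unsigned 32-bit integers

def pack_records(records):
    """Pack each record's fields in schema order into one bytes object."""
    return b"".join(RECORD.pack(*(rec[f] for f in SCHEMA_FIELDS)) for rec in records)

def unpack_records(blob):
    """Rebuild dicts by pairing unpacked values with the schema's field names."""
    return [dict(zip(SCHEMA_FIELDS, values)) for values in RECORD.iter_unpack(blob)]

records = [{"typeID": 5, "graphicID": 6}, {"typeID": 6, "graphicID": 1015}]
blob = pack_records(records)
print(len(blob))             # 16 bytes: two records of 8 bytes each
print(unpack_records(blob))
```

Fixed-width records like these only suit purely numeric data; strings and nested lists need length prefixes or an offset table on top.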
|
Dalmont Delantee
The Black Legionnares SpaceMonkey's Alliance
27
|
Posted - 2012.05.03 20:32:00 -
[27] - Quote
Wow, that is seriously nerdgasm speak...I understood about 1 out of 10000 words but still made me shiver :P
|
James Bryant
Deep Core Mining Inc. Caldari State
8
|
Posted - 2012.05.03 20:39:00 -
[28] - Quote
Thanks CCP Redundancy,
We too are dealing with the difficulties of maintaining some kind of versioning system with DDL and static data; it is really a problem that doesn't have a good solution yet.
I can understand going to files that are easily parseable by CVS/Git/Whatever. It also allows people to check out the files and make their own changes locally without affecting the test database, and without having to load a fresh database copy into their local test environment every time they need something.
VS2010 Premium actually has some good SQL versioning capabilities when used with Team Foundation Server, but I have to say that I'm intrigued by CCP's approach here. Kinda the best of both worlds, in a certain sense (for you guys, a bit less so for us). A bit unwieldy, for multiple join type queries, but that stuff can be handled in code instead of in the database, I suppose. |
|
CCP Nobody
Royal Amarr Institute Amarr Empire
0
|
Posted - 2012.05.03 21:02:00 -
[29] - Quote
Matthew wrote: The blog itself seems to suggest that only the client is currently using the static data in YAML format. Does this mean that the server (fed as it is through a lovely beast of an SQL Server), still has this data in a database format?
No, we will also change the server's static data to YAML because, as CCP Redundancy said, we are moving away from the central authoring DB solution.
The gain in this is that the client will not contain data that it doesn't need and neither will the server, which can't be bad |
|
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
1
|
Posted - 2012.05.03 21:28:00 -
[30] - Quote
Cheers. To clarify, does CCP have any plans to add structures which can't easily be represented in a database? (I'm having difficulty imagining what these might be, but YAML can do a lot.)
If there are any performance issues with YAML in third-party applications, I'm sure someone over at the Technology Lab will come up with a more useful package. (Something similar to the binary format that CCP Redundancy mentioned?)
[EDIT]
CCP Redundancy wrote:I figure I'll just answer some questions in an incomprehensible techy way.
This, please, more of this! I've recently come back to EVE after playing some other in-development games, and it's refreshing to once again chat with devs who respect the player base and are themselves respectable. |
|
Alx Warlord
SUPERNOVA SOCIETY Tribal Conclave
104
|
Posted - 2012.05.03 21:46:00 -
[31] - Quote
Yammy database !!! uhmmnnn tasty!!!
* oh it is not yammy it is yaml... D : |
|
CCP Redundancy
C C P C C P Alliance
30
|
Posted - 2012.05.03 21:51:00 -
[32] - Quote
Packtu'sa wrote:Cheers. To clarify, does CCP have any plans to add structures which can't easily be represented in a database? (I'm having difficulty imagining what these might be, but YAML can do a lot.) If there are any performance issues with YAML in third-party applications, I'm sure someone over at the Technology Lab will come up with a more useful package. (Something similar to the binary format that CCP Redundancy mentioned?) [EDIT] CCP Redundancy wrote:I figure I'll just answer some questions in an incomprehensible techy way. This, please, more of this! I've recently come back to EVE after playing some other in-development games, and it's refreshing to once again chat with devs who respect the player base and are themselves respectable.
I don't recommend YAML for anything where you worry about performance. Check out MongoDB (BSON) or MessagePack as a starting point (also NoSQL in general is an interesting thing to play with if you've been all-relational, but I won't pretend that it's a good solution to everything). If you need to use YAML, make sure you're using a native parser at least (PyYAML + LibYAML, for example).
We'll be sticking to lists and dicts and nested objects (pretty much JSON), and mainly focusing on working out how to convert our existing datasets (that are already in the DB) to this sort of thing without screwing things up for everyone at CCP. Python is very handy at working with this sort of data, so I personally recommend that for transforming it to whatever format you prefer.
This sort of structure: { 1: ['a', 'cat'], 2:['two','dogs'] } is a pain in the ass in a DB... do-able, but I don't want to insist that people build a relational version unless they need to.
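To show what the relational version of that structure involves, here is a minimal sketch that stores each list element as a row in a child table and rebuilds the mapping afterwards. The `words` table, its column names and the helper functions are hypothetical; SQLite is used only so the example runs on its own.

```python
import sqlite3

# The structure from the post: each key maps to an ordered list of values.
data = {1: ["a", "cat"], 2: ["two", "dogs"]}

def list_rows(mapping):
    """Yield (parent_key, position, value) rows, one per list element."""
    for key, values in mapping.items():
        for position, value in enumerate(values):
            yield key, position, value

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE words (parentID INTEGER, position INTEGER, value TEXT, "
    "PRIMARY KEY (parentID, position))"
)
conn.executemany("INSERT INTO words VALUES (?, ?, ?)", list_rows(data))

def rebuild(conn):
    """Reassemble the original mapping from the child table."""
    result = {}
    query = "SELECT parentID, value FROM words ORDER BY parentID, position"
    for parent_id, value in conn.execute(query):
        result.setdefault(parent_id, []).append(value)
    return result

print(rebuild(conn))  # {1: ['a', 'cat'], 2: ['two', 'dogs']}
```

The explicit `position` column is what preserves list order; without it the database is free to return the elements in any order.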
|
|
|
CCP Redundancy
C C P C C P Alliance
30
|
Posted - 2012.05.03 21:57:00 -
[33] - Quote
James Bryant wrote:VS2010 Premium actually has some good SQL versioning capabilities when used with Team Foundation Server, but I have to say that I'm intrigued by CCP's approach here. Kinda the best of both worlds, in a certain sense (for you guys, a bit less so for us). A bit unwieldy, for multiple join type queries, but that stuff can be handled in code instead of in the database, I suppose.
We evaluated that technology, but determined that it probably wasn't going to fit our needs.
There are a few ways to handle multiple joins in a NoSQL-y way: you can separate the data out into another document collection and do the lookup (like MongoDB document links) and you can also duplicate and pre-embed the data. That sounds wasteful, but if you're sensible about it, it's nowhere near as bad as the memory overhead that Python has (~10MB of data in terms of pure integer/float etc memory can easily blow up to 90MB, which starts to add up if you pickle and unpickle large data structures). We use schemas to omit type and attribute name information (like all of those "graphicID" strings in the raw data), which can be a big factor in more permissive structured data representations. [Side note - this is a big reason why we have typically seen a rise in the memory of the character selection screen each expansion as we add more/new data ]
Keep in mind that this stuff is heavily built towards static data that's immutable at runtime (do you know how difficult it is to find a key-value storage system library that's built for that particular requirement?). We can build all sorts of indices however we want - planets could be embedded inside of a solar system document, but we could still make efficient indices for looking up planets by ID within that. We can also load the data from disk in a cache-friendly manner if needed.
So in general, when dealing with static data, pre-bake your joins - funnily, we tend to already do this in performance critical databases by denormalizing data (only denormalized relational databases can't do that for lists or parent-child relations).
At least, that's the theory... |
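The idea of embedding children in a parent document while keeping lookups by ID cheap can be sketched as follows. The IDs, names and helper functions are invented for illustration and say nothing about how CCP's own indices are built; the sketch only assumes the data is immutable once loaded, so the index can be built a single time.

```python
# Hypothetical IDs and names, for illustration only.
solar_systems = {
    1: {"name": "System A", "planets": [
        {"planetID": 10, "name": "A I"},
        {"planetID": 11, "name": "A II"},
    ]},
    2: {"name": "System B", "planets": [
        {"planetID": 20, "name": "B I"},
    ]},
}

def build_planet_index(systems):
    """Map planetID -> (solarSystemID, position in that system's planet list)."""
    index = {}
    for system_id, system in systems.items():
        for position, planet in enumerate(system["planets"]):
            index[planet["planetID"]] = (system_id, position)
    return index

# Built once at load time; valid for as long as the static data is unchanged.
PLANET_INDEX = build_planet_index(solar_systems)

def find_planet(planet_id):
    """Return (system name, planet name) without scanning every system."""
    system_id, position = PLANET_INDEX[planet_id]
    system = solar_systems[system_id]
    return system["name"], system["planets"][position]["name"]

print(find_planet(11))  # ('System A', 'A II')
```

Because the planet lives inside its solar system document, the "join" to the parent is already done; the index just replaces the search.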
|
|
CCP Nobody
Royal Amarr Institute Amarr Empire
1
|
Posted - 2012.05.03 22:09:00 -
[34] - Quote
..*slow clap*... |
|
Zaotome
Schweine im Weltall.
1
|
Posted - 2012.05.03 22:58:00 -
[35] - Quote
slow clap? clap! clapclapclap! |
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
1
|
Posted - 2012.05.04 00:59:00 -
[36] - Quote
Alright, you've convinced me. A unit of Spirits to you! |
James Bryant
Deep Core Mining Inc. Caldari State
8
|
Posted - 2012.05.04 01:37:00 -
[37] - Quote
CCP Redundancy wrote:There are a few ways to handle multiple joins in a NoSQL-y way: you can separate the data out into another document collection and do the lookup (like MongoDB document links) and you can also duplicate and pre-embed the data.
The problem is, in YAML, that's the slowest part, and with a binary format like MessagePack, how the heck are you getting at a specific piece of data you want without unpacking the whole thing? If something as large as invTypes needs to be parsed (or unpacked), that's an awfully large piece of memory (and slow code). For the PHP and other web folks who have to load data for every page, that gets pretty nasty. I suppose that's probably just not the right tool for that particular job, but I'm just trying to flesh out the options.
I can see MongoDB or another No-SQL style as perhaps the weapon of choice for the web guys for that reason if the data ever gets to the point of being outside the realm of what can be handled by a traditional relational format.
I haven't read completely through the MessagePack docs (which are pretty bare), but I'm not seeing random access capability. It is entirely possible I'm completely missing something though.
There's an additional problem I see for the 3rd party folks, and that's for non-dynamically-typed languages. The YAML (or JSON, or BSON) can have any number of arbitrary data structures. I suppose, like for Java or C#, you could maybe just use a Hashmap.
Quote:So in general, when dealing with static data, pre-bake your joins - funnily, we tend to already do this in performance critical databases by denormalizing data (only denormalized relational databases can't do that for lists or parent-child relations).
True, and I tend to do the same, trying to denormalize when it makes performance sense, such as adding some information from invTypes to various other API pulls, like assets and the wallet transactions, to avoid having to join them every time. |
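One common way to get at a single record without unpacking everything is to serialise each record separately and keep an offset index beside the data. The sketch below is an assumption-laden illustration, not a description of MessagePack or of CCP's format: JSON stands in for the per-record encoding so that only the standard library is needed, and `build_blob` and `read_record` are hypothetical helpers.

```python
import json
import struct

def build_blob(records):
    """Serialise each record separately and remember where each one starts."""
    index, chunks, offset = {}, [], 0
    for key, record in records.items():
        payload = json.dumps(record).encode("utf-8")
        chunk = struct.pack("<I", len(payload)) + payload  # 4-byte length prefix
        index[key] = offset
        chunks.append(chunk)
        offset += len(chunk)
    return b"".join(chunks), index

def read_record(blob, index, key):
    """Decode only the requested record, using its stored offset."""
    start = index[key]
    (length,) = struct.unpack_from("<I", blob, start)
    body = blob[start + 4 : start + 4 + length]
    return json.loads(body.decode("utf-8"))

types = {5: {"graphicID": 6}, 6: {"graphicID": 1015}}
blob, index = build_blob(types)
print(read_record(blob, index, 6))  # {'graphicID': 1015}
```

With the blob kept on disk, the same offsets could be used with `seek` and `read`, so a per-request web script would touch only the few bytes it needs.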
SkillQueueMonitor
Pator Tech School Minmatar Republic
2
|
Posted - 2012.05.04 02:13:00 -
[38] - Quote
Bout time. That denormalized table inside SQL made my soul hurt.
AND
I never have to install MSSQL ever again. |
Lairel Dallocort
Dreddit Test Alliance Please Ignore
4
|
Posted - 2012.05.04 03:00:00 -
[39] - Quote
As a Linux user who has no access to an MSSQL server, this makes me super happy! |
Jinli mei
Dreddit Test Alliance Please Ignore
109
|
Posted - 2012.05.04 05:31:00 -
[40] - Quote
James Bryant wrote: For the PHP and other web folks who have to load data for every page, that gets pretty nasty. I suppose that's probably just not the right tool for that particular job, but I'm just trying to flesh out the options.
With web-based stuff you can cache it either using a nosql approach like mongo, or something sane people use like memcached. If you think about it hard enough, you realize that most data you're pulling from CCP should likely be in a cached state rather than pinging the database for it or parsing it anyway.
|
|
Khir
Het Kruidvat
3
|
Posted - 2012.05.04 05:53:00 -
[41] - Quote
RavenDB is a pretty nice no-sql database for the .net platform that is free if your project is open source. I was already thinking about trying that as my backend store with denormalized data migrated from MSSQL.
I had no problem whatsoever with MSSQL, but I can appreciate the new setup will be better for those that don't develop on Microsoft platforms.
Any chance you guys want to share what you think the schema for some of the other yaml documents will be like? Even if they will not be published as yaml data at this point? |
Jack Tronic
borkedLabs
43
|
Posted - 2012.05.04 05:58:00 -
[42] - Quote
Meh, JSON is more readable and friendlier than YAML |
Real Poison
Aura of Darkness Nulli Secunda
101
|
Posted - 2012.05.04 06:13:00 -
[43] - Quote
Jack Tronic wrote:Meh, JSON is more readable and friendlier than YAML
While I love JSON for its purposes, that is plain wrong. YAML is the least cluttered and easiest format to store arrays and hashed objects.
YAML Ain't Markup Language <- FTW! |
Matthew
BloodStar Technologies
2
|
Posted - 2012.05.04 07:51:00 -
[44] - Quote
Many thanks for the detailed explanation, sounds like it has the potential for significant benefits, which makes it easy to accept the additional work it'll need.
Though a 3rd party community project to script this back into SQL tables would be awesome (and I suspect far better than what I will otherwise kludge together on my own!). |
Risingson
20
|
Posted - 2012.05.04 08:08:00 -
[45] - Quote
even if it sounds like a state of the art move i hope there will be a mssql dump provided by ccp to have backward compatibility with existing tools. in my case doing a web for eve is a hobby not a job for a living. no mssql dump may make me quit it due to lack of time.... no crying, but panda. Eveeye.com - New Eden Bordcomputer Systems |
Freibuis
Legion of Lost Souls The Lego Cartel
1
|
Posted - 2012.05.04 08:18:00 -
[46] - Quote
Where do I start.. CCP.. thanks for making my day and ruining my day in the same dev blog. ;) Good one CCP. /me looks through all the Stored Procs I have made over the years. Shrugs and says.. I guess that I didn't need 'em anyway.
Moving to a NoSQL style is a great idea.. not sure about YAML tho.. never had it work properly. It ended up chewing more resources than it was worth.. but it's good to see a decentralized approach in the future.
Question: Are these tables being removed, or left in as well? If these tables are getting dropped, could you save us OLD timers and give us a SQL file with all the inserts for the table, so we could either use YAML or keep the SQL that we have grown up with?
will we have to write our own tools to put the data back into the SQL database?
|
|
CCP Nobody
Royal Amarr Institute Amarr Empire
3
|
Posted - 2012.05.04 11:36:00 -
[47] - Quote
With this data structure change we wanted to move over to a standardized way of retrieving static data. This is what YAML provides: it gives us a vast collection of parsers for the majority of programming languages and it is not tied to a particular OS (which apparently makes Lairel Dallocort super happy).
The process we in Team Core Graphics Tools are following is that after a system is ported to the new structure, we will drop the unneeded tables. And after we have finished porting a system we can give you a look at how that system's schema will look (because we won't know before we start porting it).
Currently we do not have any plans of creating tools that put the data back into a DB. However this is just us giving you the actual data that is used within the game (although the in-game data has been optimized to pieces) and the method of storing and reading that data is totally up to you guys/girls because your needs differ.
- If your application needs some sort of fast key-value lookup you could take a look at LevelDB
- I would personally recommend MongoDB, because it is schemaless and should be easily used with the YAML data structure. |
|
Vessper
Eve Engineering Finance Eve Engineering
9
|
Posted - 2012.05.04 11:52:00 -
[48] - Quote
CCP Nobody wrote:The process we in Team Core Graphics Tools are following is that after a system is ported to the new structure, we will drop the unneeded tables.
So just to confirm, future MSSQL data exports which have had some data converted to the new structure will be missing certain data tables (as these will be provided in the YAML files)? I guess I'm trying to establish if we will continue to get a full (as in, the pre-Inferno schema) SQL export until you've finished this project or we need to start working on partial conversions now. |
Thebriwan
LUX Uls Xystus
41
|
Posted - 2012.05.04 12:01:00 -
[49] - Quote
Yesterday I swallowed my comments - because they would be a bit bitter.
It seems to be more pointless now, but I'll do it anyway...
Thank you CCP Nobody for the deep insight in your whys and hows.
BUT:
There is still a standard in Web-hosting. It's called (X)AMP(P). That is what you get. No MongoDBs no nothing.
Yes, one can get his own virtual server and do what he pleases. But like someone wrote before me: This is just a hobby.
I can not spend eternity setting up unknown systems (and updating them every time a new zero-day exploit is found). I can not spend the money - because it is still just a hobby.
And I would like to see the NoSQL DB that calculates the gain on my sell orders for the last 5 years in a timely manner on the fly.
So. I need MySQL tables and I will be very thankful if someone can still provide them.
|
Freibuis
Legion of Lost Souls The Lego Cartel
1
|
Posted - 2012.05.04 12:16:00 -
[50] - Quote
CCP Nobody wrote:With this data structure change we wanted to move over to a standardized way of retrieving static data. This is what YAML provides, it gives us a vast collection of parsers for the majority of programming languages and it is not tied to a particular OS (which apparently makes Lariel Dallocort super happy ). The process we in Team Core Graphics Tools are following is that after a system is ported to the new structure, we will drop the unneeded tables. And after we have finished porting a system we can give you a look at how that systems schema will look (because we won't know before we start porting it). Currently we do not have any plans of creating tools that put the data back into a DB. However this is just us giving you the actual data that is used within the game (although the in-game data has been optimized to pieces) and the method of storing and reading that data is totally up to you guys/girls because your needs differ. - If your application needs some sort of fast key-value lookup you could take a look at level-db - I would personally recommend mongo-db, because it is schemaless and should be easily used with the yaml data structure.
Don't get me wrong. It's great what you are doing.. But (and there is always a butt!) until all the data is in the new format, this method is going to be a pain. Part of the data in one format and part in the other. Us old-timers will have to re-import (or god forbid not update) the YAML data into MS-SQL so that our functions/Stored Procs/SQL goodness will still work.
There would be no point moving to a new data structure until ALL static data is moved to YAML format. Also this will cause coding issues every time a new YAML port is released..
I would still release the complete SQL database whole until 100% of the static data is released as YAML. That way we won't have to do code changes EVERY TIME.. only once.
I would rather spend a week converting Stored Procs than spend a day here and there until 100% of the static data is converted to YAML.
I am not saying I don't want YAML... I am saying.. I would rather do it in one go than every month. Most people who have code like mine will have to import back into SQL to keep stuff working until the eventual day when there is no SQL at all
|
|
Hosedna
FumbleFamily Corp
8
|
Posted - 2012.05.04 12:35:00 -
[51] - Quote
The shared hosting I pay for only has MySQL / PostgreSQL options, as most do, so I guess it will become a bit tricky to do the requests for industry on YAML files. We'll lose the expressive power of SQL and have to do the joins "by hand"... Unless there is something along the lines of XPath for YAML? It's not as good as SQL but it could be a first step to help structure requests... |
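For anyone wondering what doing the joins "by hand" would involve, here is a minimal sketch in Python. It assumes the YAML files have already been parsed into lists of dictionaries; the field names and sample rows are made up for illustration and are not the actual dump schema.

```python
# Hand-written equivalent of:
#   SELECT t.typeName, g.groupName
#   FROM types t JOIN groups g ON t.groupID = g.groupID
# Field names and rows below are hypothetical.

types = [
    {"typeID": 1, "typeName": "Example Ore", "groupID": 10},
    {"typeID": 2, "typeName": "Example Frigate", "groupID": 20},
    {"typeID": 3, "typeName": "Orphan Item", "groupID": 99},  # no matching group
]
groups = [
    {"groupID": 10, "groupName": "Ores"},
    {"groupID": 20, "groupName": "Frigates"},
]


def inner_join(left, right, key):
    """Join two lists of dicts on a shared key (a simple hash join)."""
    # Build a lookup from key value to the matching rows on the right side.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        # Rows without a match are dropped, as in an SQL inner join.
        for match in index.get(row[key], []):
            merged = dict(match)
            merged.update(row)
            joined.append(merged)
    return joined


result = inner_join(types, groups, "groupID")
for row in result:
    print(row["typeName"], "->", row["groupName"])
```

This works for a single key, but every extra table, filter, or aggregate has to be coded separately, which is the loss of expressive power described above.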
|
CCP Nobody
Royal Amarr Institute Amarr Empire
3
|
Posted - 2012.05.04 14:07:00 -
[52] - Quote
The plan is to drop the migrated tables from the data dump with every release. We know that this is difficult, but insisting on backwards compatibility with a completely separate representation while we move to a more flexible data format would add a lot of overhead; in the end it would make us less flexible and less able to take advantage of the benefits of the new format.
Unfortunately there are so many systems and so much static data in Eve that any attempt to do them all at once would be a multi-month effort that would be doomed to failure because we wouldn't have worked through all the problems and issues while trying to apply the solution. We would also cause all feature development to stop, and break all of the tools that we use in day to day development, while likely introducing issues into every single game system. This just isn't a practical option for us or for you.
|
|
James Bryant
Deep Core Mining Inc. Caldari State
8
|
Posted - 2012.05.04 14:29:00 -
[53] - Quote
Hosedna wrote: The shared hosting I pay for only has MySQL / PostgreSQL options, as most do, so I guess it will become a bit tricky to do the requests for industry on YAML files. We'll lose the expressive power of SQL and have to do the joins "by hand"... Unless there is something along the lines of XPath for YAML? It's not as good as SQL but it could be a first step to help structure requests...
That is definitely something that is going to bite quite a few folks. I happen to have a virtual server for my hosting, so not a big deal for me, but I have a feeling that I'm in the minority of Eve dev hobbyists. Still, the solutions are out there, this just might push a few people past their commitment point, unfortunately. Still, my feeling is that somebody will step up to the plate and convert all this into SQL after each release anyhow. There's no way I'd be able to do one of the more join heavy queries I do now like getting the top ten profitable market categories out of all our trades for the month, or maybe the wackiest query ever, T2 build requirements (uf!).
I'll tell you where this hurts the most, and that is in Android land, where I also develop. Many devices, especially ones still running Gingerbread or earlier, don't have much in terms of heap space, usually only 16 MB (or less on junk devices, of which there are many). That ought to be fun trying to parse/unpack/unserialize something massive like the map data or invTypes. Combine that with statically typed Java, and you have a challenge.
Still, I like a challenge. We'll see how this shakes out. |
Xander Hunt
35
|
Posted - 2012.05.04 14:32:00 -
[54] - Quote
*sigh*
I don't even know where to begin...
First...
YAML: YAML Ain't Markup Language
... come on.. really? I'm seriously, physically rolling my eyes at this.
I've very quickly just skimmed over what the structure is about. So f'n not impressed.
Cons I see....
- First, looking at the "yaml.org" website, it looks like it was coded by a five year old with limited knowledge of anything to do with a computer, let alone design a new type of data structure. Designed in something pre-Netscape Designer. Doesn't ooze a lot of professionalism and confidence towards code base (If there is a "code base" behind specifications of a data structure) and functionality and theory behind the actual concept of the data format, really, nor does it raise any kind of confidence behind who the designers of this data format are when this has been around since 2001. (Yes it was a run-on sentence - sorta) However, I'll give credit where credit is due and note that they did use an external style sheet... which reading on down the code looks like the page was generated anyways. Makes me wonder if the site itself is read from a YAML file?
- Second, just like XML, JSON, and any other non-managed database system that doesn't rely on an index of sorts, one must read all data, or at least to the point where the data you want exists while assuming the data is sorted, from top to bottom, to get that bit (literal) of information to determine whether or not typeID 21471 is a Published object. What a waste. Don't get me wrong. Both have their place. Exchange of clear, described data to be put somewhere. I know massive XML documents float around at times, exchanging hands from one type of system to another, but that XML file isn't used as a "lookup information" source 99.995% of the time.
- The volume of data within EVE... Looking at the SQLite database conversion from Crucible, it's over 200 MB in size. That's with packed data (i.e. 10-character numbers in 4 bytes of data), indexing, structure definitions, page files, etc. A query to pull any data from anywhere in that file takes MILLISECONDS (just timed it: 19 ms to find out it is published). Text files? Too large to handle. I'd have to read thousands of lines to GET to that point.
- Not sure I'm too keen on the whole idea of just taking data out of the existing MS SQL backup and dropping it into text files to be re-consumed. I'd ask that all data stays in the MS SQL dump while you slowly roll out the new YAML files as you massage out the structure you want. Then, when all tables are done, drop MS SQL.
- Some of this structure looks similar to Windows INI files.... 'Cept, headers are marked with an identifier followed by a colon instead of a [identifier] type of ordeal. I do acknowledge there are advancements in comparison to the INI format, but not much more.
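The indexed-lookup point in the cons above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table name, the columns, and the sample rows are hypothetical, and only the typeID 21471 comes from the post.

```python
import sqlite3

# In-memory database; the table layout and the sample rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE types (typeID INTEGER PRIMARY KEY, typeName TEXT, published INTEGER)"
)
conn.executemany(
    "INSERT INTO types VALUES (?, ?, ?)",
    [(21470, "Placeholder A", 0), (21471, "Placeholder B", 1)],
)


def is_published(type_id):
    """Return True/False for a known typeID, or None if it is not in the table."""
    row = conn.execute(
        "SELECT published FROM types WHERE typeID = ?", (type_id,)
    ).fetchone()
    return None if row is None else bool(row[0])


# The PRIMARY KEY lets SQLite seek to the row instead of scanning the table;
# the query plan reports a SEARCH rather than a SCAN.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT published FROM types WHERE typeID = 21471"
).fetchall()
print(is_published(21471))
print(plan)
```

A flat text file offers no such index, so the same question means parsing from the top until the record turns up, unless the data is first loaded into a store like this one.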
Pros...
- No MSSql - Although I started off with training against MS SQL 2000 Enterprise, I've moved very far from it simply due to costs. Yes, I know it's free NOW, but it wasn't like that until recently, and I've never looked back. I might have been an MSSql fanboi if there were always free versions. I'm cheap.
- Take the generalized data and put it into a proprietary structure our applications work with. MSSql, MySQL, SQLite, CSV, our own structure of data (I'm looking at you EVEMon! {wink}) or whatever we want is a GREAT bonus.
Final thoughts
Of course, all we (us?) developers are going to have to follow your lead if we're going to keep developing our tools for your game, but honestly, I've never been, and never will be, a complete fan of single or multiple text files that are supposed to relay some sort of structured data. I avoid creating XML, I avoid CSV, I avoid plain text simply because repetitive reads of data slow the whole process down, ESPECIALLY when you get into thousands of lines.
With all the enhancements you ladies and gents at CCP have been putting into improving UI response times, I'm quite taken aback that you'd go to a text file to manage database-worthy information, static data or not.
{30 minutes later}
... come to think of it... YAML originated in 2001 and has had pretty much NO MOVEMENT since then... and you're using it for in-house processes and implementing it as a data store halfway through 2012?!?!?! |
Katrina Bekers
Rim Collection RC Test Alliance Please Ignore
93
|
Posted - 2012.05.04 15:33:00 -
[55] - Quote
Speaking of NoSQL:
Redis.
You will never go back. EVER. << THE RABBLE BRIGADE >> |
Kouryusei
The Bitter Sea Trading Company
28
|
Posted - 2012.05.04 16:15:00 -
[56] - Quote
Following up on Katrina, go play with Couchbase (not CouchDB), it's just as sexy as redis.
In other news, **** YAML. Royally. I'll convert it to a plethora of formats since, like I said - **** YAML. |
Steve Ronuken
Fuzzwork Enterprises
392
|
Posted - 2012.05.04 16:22:00 -
[57] - Quote
As long as the data remains in a form that can be easily represented in a tabular form, I'll be backporting it, along with the mysql conversions I've been doing.
Something I would love though: A separate file that specified the keys (as some are optional) and max lengths of values. Just reduces the amount of preprocessing I'll have to do on import.
It's not a biggy though. FuzzWork Enterprises http://www.fuzzwork.co.uk/ Blueprint calculator, invention chance calculator, isk/m3 Ore chart and other 'useful' utilities. |
Etil DeLaFuente
New Eclipse Initiative Mercenaries
6
|
Posted - 2012.05.04 17:32:00 -
[58] - Quote
So if i understood right, more and more data will be available on the client in YAML format ?
Or, will we still have to rely on the toolkit ? |
Lan Staz
Aperture Harmonics K162
16
|
Posted - 2012.05.04 18:41:00 -
[59] - Quote
I think you are going to have a perception problem due to the choice of initial samples which are far too simple to show the advantages of structured over relational data.
Maybe showing something more complex, even if it is just an indicator of how things might look, would be a good idea. Something that currently requires several tables and lots of joins between them that would collapse down to one list of structured objects, such as ship definitions or the map.
I'd post something here as an example except there doesn't appear to be a way to post code samples on these boards without losing the structure.
Oh, and as someone who has no access to MS SQL and works in Python anyway, yay for YAML!
|
Antihrist Pripravnik
Scorpion Road Industry
6
|
Posted - 2012.05.04 19:19:00 -
[60] - Quote
Big thanks to all devs that replied with a lot of technical stuff! I can see the future now CCP Ytterbium: Yarrblblbgrlblbgrlblblblbblbgrlblblbgrblblyarrrrdrooooooolonthekeyboardlikealunatic |
|
Dil'e Mahn
The Bastards
5
|
Posted - 2012.05.04 20:22:00 -
[61] - Quote
I'm in the "I'd like to see a somewhat more elaborate example of a datastructure you guys are toying with" camp.
Pros and cons of YAML versus other versionable data storage aside: it's a fairly simple to parse format, and there's going to be plenty of folks offering conversions, just like they do with MSSQL to MySQL/SQLite/whatever today. I'm not worried about that, and frankly I don't care that much whether it's YAML or something else, just as long as I (or someone else) can cook up a conversion to whatever format I'd want to use, it's all cool.
At least text-based formats mean I don't depend on someone else to do the conversion for me (no MSSQL here), so that's progress.
I don't think anyone in their right mind would want to run a webapp or a low-resources mobile app on big text files, but then again you don't have to. Just pick and choose the data you need and stash that in a format your environment is happy with. I imagine you have to do that today, as well, unless it's possible to run a few-hundred MB MSSQL database on an Android 2.1 device... =]
Come to think of it: for a web app, plenty of things (ship/module data, for example) could very well be stored in separate files containing a JSON blob / PHP serialized array / XML blob / pickle for that item. Name them by itemID, and you have your basic lookup system ready. No more table joins, everything there is to know about that item lives in that file. It might not be the most effective use of disk space, but space is relatively cheap these days, and it works a treat for the "shared hosting and limited to MySQL/PostgreSQL, NoSQL is not an option" crowd. For the things that can't be done that way, there'll be conversions as soon as the spec is out.
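As a rough sketch of the one-file-per-item idea, assuming the data has already been converted into Python dictionaries: the item IDs, field names, and values here are hypothetical, and a real app would use a fixed data directory rather than a temporary one.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical item records; in practice these would come from a converted dump.
items = {
    101: {"name": "Example Module", "slot": "high", "attributes": {"cpu": 25}},
    102: {"name": "Example Ship", "slot": None, "attributes": {"mass": 1200000}},
}


def write_items(directory, records):
    """Write one JSON file per item, named by its ID."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for item_id, record in records.items():
        path = directory / f"{item_id}.json"
        path.write_text(json.dumps(record), encoding="utf-8")


def load_item(directory, item_id):
    """Look up a single item by reading only its own file; None if missing."""
    path = Path(directory) / f"{item_id}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


# A temporary directory keeps the sketch self-contained.
data_dir = Path(tempfile.mkdtemp()) / "items"
write_items(data_dir, items)
print(load_item(data_dir, 101)["name"])
```

The lookup cost is one small file read per item, at the price of duplicating anything shared between items.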
Don't worry, people, we'll be fine.
Also: CCP devs, your ability and willingness to speak nerdy to your customers is highly appreciated. Keep rocking. Shooting people in the face for fun and profit. Well, for fun, mostly. |
Charlie Parker Sidrat
EntroPrelatial Industria
0
|
Posted - 2012.05.05 01:39:00 -
[62] - Quote
Hmm. Gulp.
Cries a bit with confusion and worry.
Cards on table - I'm not a coder. Never have been, though I've tried repeatedly since the days when line numbers were required.
I enjoy designing spreadsheets and messing about with SQL and then Access to get the queries and make tables to import and use as pivot tables in Excel.
Just when I've finally got Eve Industrial Organiser to the point where it's very very easy to keep it updated in a few key strokes (the hardest part is remembering how to restore the backup database each patch release as I only do it for the data dumps), you're going to change it to the point where it doesn't seem like querying lookup tables is going to be an option straight out of the box.
Perhaps it will make coding easier to understand? Maybe I'll lose the fear factor and just start 'getting' it - like I eventually did for excel and access.
Worst case scenario I stop updating EIO, best case scenario - I figure it all out and produce an exe version that doesn't require the use of Excel 2007 which will make more than a few potential users very happy.
|
Lan Staz
Aperture Harmonics K162
16
|
Posted - 2012.05.05 19:05:00 -
[63] - Quote
With the move to YAML for static data, is there any plan to support YAML in the API as well? |
Nirnaeth Ornoediad
Clan Shadow Wolf Fatal Ascension
106
|
Posted - 2012.05.05 20:20:00 -
[64] - Quote
CCP Redundancy wrote: So in general, when dealing with static data, pre-bake your joins - funnily, we tend to already do this in performance critical databases by denormalizing data (only denormalized relational databases can't do that for lists or parent-child relations).
At least, that's the theory...
They can: it's just that you're limited to either pre-defining the list length (or number of children) ahead of time, or living with a table definition which can change at runtime (which, of course, has its own set of challenges*).
* Hint: Views are your friend if you do this, as a View can effectively maintain a "static" view of a table even if it suddenly starts adding columns to itself at runtime.
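A minimal sketch of that hint, using SQLite through Python's sqlite3 module rather than MS-SQL; the table, view, and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT, child_1 TEXT)")
conn.execute("INSERT INTO parent VALUES (1, 'example', 'a')")

# The view names its columns explicitly, so it is pinned to this shape.
conn.execute("CREATE VIEW parent_v1 AS SELECT id, name, child_1 FROM parent")

# Later, the table grows another child column "at runtime".
conn.execute("ALTER TABLE parent ADD COLUMN child_2 TEXT")
conn.execute("UPDATE parent SET child_2 = 'b' WHERE id = 1")

# Compare the column lists seen through the table and through the view.
table_cols = [d[0] for d in conn.execute("SELECT * FROM parent").description]
view_cols = [d[0] for d in conn.execute("SELECT * FROM parent_v1").description]
print(table_cols)
print(view_cols)
```

Code that reads through the view keeps seeing the original three columns, while code that reads the table directly sees the new one as well.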
Just curious: one of the advantages of an MS-SQL dump is that it's trivial to import into Access or Excel and run some basic analytics. Does anyone know of a good YAML-to-relational database integration tool that's freeware? Even an industrial-grade integration tool like Informatica or Cast Iron would be good, so long as they were free. "The Mittani isn't even gone for a day and CCP's management is already making bad decisions."
THE MITTANI for CEO of CCP 1-800-273-8255 |
Freibuis
Legion of Lost Souls The Lego Cartel
1
|
Posted - 2012.05.06 00:08:00 -
[65] - Quote
I have had time to stew this over in my head for a couple of days.
If the data is not going to be kept in the SQL database and you are not going to load the data in there,
would it be possible to leave the empty tables in there?
That way I can automatically import via SSIS (SQL Server Integration Services) and then dump a SQL file with all the inserts in it for all the SQL users (MSSQL/MySQL) back to the community that needs it. |
LifeHatesMe
SKULLDOGS RED.OverLord
7
|
Posted - 2012.05.06 00:56:00 -
[66] - Quote
Thank you CCP for making my head spin xP
Now you got me thinking on Plain Text databases, MySQL & RDBMSs, anddd stuff like YAML, Redis, and MongoDB...
There are so many things that make this way too complicated for me to understand at face value. I wish they had an index of what makes a database good for management instead of "Hey you, go write in 15+ different database wrappers so you can figure out specifically which one is best suited for your specific application" xD |
Ein Spiegel
Fly-by-Night Industries LLC PTY LTD Drama Flakes
11
|
Posted - 2012.05.07 04:19:00 -
[67] - Quote
I love the geeky talk from both Nobody and Redundancy. I even understood some of it. But I wanted to ask about a different angle...
Is the move to YAML going to make integrating your branches back and forth across the depot structure in Perforce easier, or is it going to allow you to simply be able to keep the data out of a database structure and allow you to have versioned static data as a part of any specific label or changelist? Maybe a bit of both, I think... but will your QA still like you in the morning?
If the static data stops being, well, static, across all of the developers, changes to static data could happen which makes integrating a bear. Seeing how this change affects the current branch works, but how will it break across multiple client workspaces going through multiple integrations down to the release branch?
Also - you guys use Perforce? I can't imagine the kind of binary assets you've got in that SCM, I had enough problems with trying to version Word documents in it. (Don't ask. I don't work there anymore and I will not talk about it.) |
Zor'katar
Leeole's Legion
2
|
Posted - 2012.05.07 15:00:00 -
[68] - Quote
So who's going to be the hero of Weaksauce Developers such as myself by converting everything back to an SQLite database? |
Zifrian
Licentia Ex Vereor Intrepid Crossing
279
|
Posted - 2012.05.11 05:11:00 -
[69] - Quote
Zor'katar wrote: So who's going to be the hero of Weaksauce Developers such as myself by converting everything back to an SQLite database? Not me. I have no idea what any of this is but I guess I have to learn it. I will just convert the data into SQLite and use what I have already. I'm more interested in making my app work and not have to spend a lot of time on it. So I hope this is a permanent thing because I don't care about new tech stuff like this really lol Maximize your Industry Potential! - Get EVE Isk per Hour! |
Jognu
French Kiss Singularity Astromechanica Federatis
15
|
Posted - 2012.05.25 14:51:00 -
[70] - Quote
If that can help some people : https://forums.eveonline.com/default.aspx?g=posts&m=1362009 EveAI developer: https://forums.eveonline.com/default.aspx?g=posts&t=21803 YamlToSQL developer: https://forums.eveonline.com/default.aspx?g=posts&m=1362009 |
|
suenoni terracotta
Armada Ministry Defence Fidelas Constans
0
|
Posted - 2012.07.16 12:34:00 -
[71] - Quote
I may be blind or something, but I can't find which version of YAML is used. At present there are three to choose from. So which version is it? |
Allan Ahra
EVE University Ivy League
4
|
Posted - 2012.07.21 08:52:00 -
[72] - Quote
I have to vent some anger over this too :p
I totally understand AND applaud the move away from a database dump. The MSSQL datadump means it's basically only accessible to people running Windows. Moving away from that and to a "plain text" format is the right thing to do. For this same reason lots of interchange text formats are being used to make different systems communicate with each other.
However, I do have worries about yaml. The yaml format is now over 10 years old, and yet, it's not really seeing a whole lot of traction. The available tools for dealing with yaml are in a pretty sorry state. I tried several libs and most can't even handle the extended sample file. With the state of things, it looks like I'll have to make my own yaml parser just to get the features/performance/memory use I need. XML would have been my personal preference as an interchange format (much more mature, LOTS of support tools available, and this really nifty/elegant thing called XSLT), but I'm not really going to debate that issue.
What I AM worried about however is the complete lack of data description. With the DB, you at least could query the database for the table columns, types, and nullability, which in turn makes it easy to prepare your own internal structures. Yaml is completely form-free. And this is a HUGE disadvantage for what most people use the eve datadump for. Form-free data has its uses; tabular data ISN'T one of them.
With how YAML is now, you need to read the ENTIRE yaml file into memory, just to be able to figure out what all the columns are in the table. You also need to read the entire yaml file into memory just to have a rough idea of the datatypes. If you want actual datatypes, you need to iterate over the entire table and data and infer actual types from the basic types (int, float, string) yaml provides. And even then, we only know what IS there, not what COULD be there in some future version of the data dump.
So you read a yaml file, see that one field called humptyDumpty is an integer type in yaml, and by looking at all the values you find, you see no value is larger than 255. So you infer it's a 1 byte unsigned integer. Then in the next datadump more data is added and oops, your internal table no longer works. Is a string going to need unicode? Or will ANSI/ASCII suffice?
You stated it yourself, if you want performance, you don't want to use the yaml "as is" in a dictionary type associative array. It is too slow to load, too memory wasteful. If you need to convert, then knowing the correct "record" format up front is a huge benefit.
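The inference pass described above might look something like this minimal sketch in Python. It assumes the YAML has already been parsed into a list of dictionaries; the sample rows are invented, and humptyDumpty is just the field name from the example above.

```python
def infer_schema(records):
    """Scan every record and collect, per key: the observed type names,
    whether the key is ever missing or None, the longest string, and the
    integer range."""
    schema = {}
    for record in records:
        for key, value in record.items():
            col = schema.setdefault(
                key,
                {"types": set(), "count": 0, "max_len": 0, "min": None, "max": None},
            )
            if value is None:
                continue
            col["count"] += 1
            col["types"].add(type(value).__name__)
            if isinstance(value, str):
                col["max_len"] = max(col["max_len"], len(value))
            elif isinstance(value, int) and not isinstance(value, bool):
                col["min"] = value if col["min"] is None else min(col["min"], value)
                col["max"] = value if col["max"] is None else max(col["max"], value)
    total = len(records)
    for col in schema.values():
        # A key absent (or None) in at least one record is treated as nullable.
        col["nullable"] = col["count"] < total
    return schema


# Made-up rows for illustration only.
rows = [
    {"typeID": 1, "humptyDumpty": 12, "name": "abc"},
    {"typeID": 2, "humptyDumpty": 255},
    {"typeID": 3, "humptyDumpty": 7, "name": "abcdef"},
]
schema = infer_schema(rows)
print(schema["humptyDumpty"])
print(schema["name"])
```

Note that the result only describes this particular dump: nothing stops the next one from containing a humptyDumpty of 256 or a longer name, which is exactly why a published description of each table would help.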
So here are my 2 big questions in this issue... 1. Can we expect to get "easily computer-parsable" descriptions, with decent types, of each of the yaml tables?
2. With the poor state of yaml parsers out there, can we get a guarantee about what features of yaml you guys will be using? If you have to make your own parser (or pick one that has the features you're going to be using), then the plain "unordered key/values blocks" / associative array/dictionary thing as used in the current 2 yaml files is easy enough to handle. Some of the other yaml features are... quite complex to parse.
Semi-related question: Are bugs/inconsistencies in the datadump something we should report in the normal bug forum? Or is there another place for those?
|
Zifrian
Licentia Ex Vereor Intrepid Crossing
377
|
Posted - 2012.08.10 21:07:00 -
[73] - Quote
After going through 3 patches now that changed the file structure of the datadump, I really think we need a better rollout plan for this change. This onesey twosey stuff isn't going to cut it. I appreciate not wanting to change everything all at once but I would much rather have the full yaml files with the database dump for like a month or two and then just hard switching it over. That way I can update my updater to rebuild the MSSQL db first then run my already established SQLite conversion for IPH. I'm not re-writing all my stuff to use yaml and will just convert it to my current DB system anyway. I just don't see how deleting "radius" from the table and putting it in a yaml file is helping me any. All I do is look at the table and ask "is this something I need" No? OK, comment it out and move on. I'm pretty sure most other 3rd party devs are doing something similar. But every patch we are going to do this?
Is there something I'm missing here? Why can't we just do a hard cut over date on this and be done with it? I would rather have data I need and want to import, do the coding to get it, and then hopefully when there isn't a huge patch going in (Inferno 1.2), I update it.
So seriously, can we get some input here? Can we get a full transition day or over a month? Can we get that transition to happen NOT with an expansion/update/patch day? I haven't heard a peep on this since this dev blog was posted. Any updates? Maximize your Industry Potential! - Get EVE Isk per Hour! |
Galen Kamari
Pelican. Cascade Imminent
2
|
Posted - 2012.08.14 12:25:00 -
[74] - Quote
Lan Staz wrote:With the move to YAML for static data, is there any plan to support YAML in the API as well? CREST will return data in a JSON format, so yes (sort of). It's something for the future, though.
-GK |
|
|
|