New dev blog: Changes to Toolkit Exported Data

Khir
Het Kruidvat
#41 - 2012-05-04 05:53:31 UTC
RavenDB is a pretty nice NoSQL database for the .NET platform that is free if your project is open source. I was already thinking about trying it as my backend store, with denormalized data migrated from MSSQL.

I had no problem whatsoever with MSSQL, but I can appreciate the new setup will be better for those that don't develop on Microsoft platforms.

Any chance you guys want to share what you think the schema for some of the other yaml documents will be like? Even if they will not be published as yaml data at this point?
Jack Tronic
borkedLabs
#42 - 2012-05-04 05:58:56 UTC
Meh, JSON is more readable and friendlier than YAML
Real Poison
Sebiestor Tribe
Minmatar Republic
#43 - 2012-05-04 06:13:27 UTC
Jack Tronic wrote:
Meh, JSON is more readable and friendlier than YAML


While I love JSON for its purposes, that is plain wrong.
YAML is the least cluttered and easiest format for storing arrays and hashes.

YAML Ain't Markup Language <- FTW!
Matthew
BloodStar Technologies
#44 - 2012-05-04 07:51:29 UTC
Many thanks for the detailed explanation, sounds like it has the potential for significant benefits, which makes it easy to accept the additional work it'll need.

Though a 3rd party community project to script this back into SQL tables would be awesome (and I suspect far better than what I will otherwise kludge together on my own!).
Risingson
#45 - 2012-05-04 08:08:56 UTC
Even if it sounds like a state-of-the-art move, I hope CCP will still provide an MSSQL dump for backward compatibility with existing tools. In my case, building a website for EVE is a hobby, not a job for a living. No MSSQL dump may make me quit due to lack of time... no crying, but sad panda.
Freibuis
Legion of Lost Souls
#46 - 2012-05-04 08:18:58 UTC
where do I start..
CCP.. thanks for making my day and ruining my day in the same dev blog. ;) Good one, CCP.
/me looks through all the stored procs I have made over the years, shrugs and says.. I guess I didn't need 'em anyway.

Moving to a NoSQL style is a great idea.. not sure about YAML though.. I never had it work properly; it ended up chewing more resources than it was worth. But it's good to see a decentralized approach in the future.

Question: are these tables being removed, or left in as well? If they are getting dropped, could you save us old-timers and give us a SQL file with all the inserts for each table, so we can use YAML and/or keep the SQL we have grown up with?

will we have to write our own tools to put the data back into the SQL database?

CCP Nobody
C C P
C C P Alliance
#47 - 2012-05-04 11:36:06 UTC
With this data structure change we wanted to move over to a standardized way of retrieving static data. This is what YAML provides: it gives us a vast collection of parsers for the majority of programming languages, and it is not tied to a particular OS (which apparently makes Lariel Dallocort super happy).

The process we in Team Core Graphics Tools are following is that after a system is ported to the new structure, we will drop the unneeded tables. And after we have finished porting a system we can give you a look at how that system's schema will look (because we won't know before we start porting it).

Currently we do not have any plans of creating tools that put the data back into a DB. However this is just us giving you the actual data that is used within the game (although the in-game data has been optimized to pieces) and the method of storing and reading that data is totally up to you guys/girls because your needs differ.
- If your application needs some sort of fast key-value lookup, you could take a look at LevelDB.
- I would personally recommend MongoDB, because it is schemaless and should be easy to use with the YAML data structure.
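
For readers stuck on SQL-only hosting, the key-value and schemaless ideas above can be approximated with a plain SQL table holding one JSON document per key. This is only a minimal sketch: it assumes the YAML has already been parsed into Python dicts by some YAML parser, the record fields are made up for illustration, and SQLite stands in for whatever store you actually use.

```python
import json
import sqlite3

# Hypothetical records, shaped as if a YAML parser had already turned a
# static-data file into dicts keyed by ID. Field names are assumptions.
parsed = {
    100: {"name": "Example Item A", "published": True},
    101: {"name": "Example Item B", "published": False, "tags": ["x", "y"]},
}


def store_documents(conn, docs):
    """Store each record as a JSON document under its integer key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO docs (id, body) VALUES (?, ?)",
        [(key, json.dumps(value)) for key, value in docs.items()],
    )
    conn.commit()


def fetch_document(conn, key):
    """Return the record stored under key, or None if there is none."""
    row = conn.execute("SELECT body FROM docs WHERE id = ?", (key,)).fetchone()
    return json.loads(row[0]) if row else None


conn = sqlite3.connect(":memory:")
store_documents(conn, parsed)
print(fetch_document(conn, 101))
```

Records with different sets of fields can share the table because the database only sees the key and an opaque text body; the trade-off is that you cannot filter on fields inside the body with ordinary SQL.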
Vessper
Dark Mason Society
#48 - 2012-05-04 11:52:04 UTC  |  Edited by: Vessper
CCP Nobody wrote:
The process we in Team Core Graphics Tools are following is that after a system is ported to the new structure, we will drop the unneeded tables.

So just to confirm, future MSSQL data exports which have had some data converted to the new structure will be missing certain data tables (as these will be provided in the YAML files)? I guess I'm trying to establish if we will continue to get a full (as in, the pre-Inferno schema) SQL export until you've finished this project or we need to start working on partial conversions now.
Thebriwan
LUX Uls Xystus
#49 - 2012-05-04 12:01:59 UTC
Yesterday I swallowed my comments - because they would be a bit bitter.

It seems even more pointless now, but I'll do it anyway...

Thank you, CCP Nobody, for the deep insight into your whys and hows.

BUT:

There is still a standard in Web-hosting. It's called (X)AMP(P).
That is what you get. No MongoDBs no nothing.

Yes, one can get one's own virtual server and do as one pleases. But like someone wrote before me:
This is just a hobby.

I cannot spend an eternity setting up unknown systems (and updating them every time a new zero-day exploit is found).
I cannot spend the money, because it is still just a hobby.

And I would like to see the NoSQL DB that calculates the gain on my sell orders for the last 5 years in a timely manner on the fly.

So I need MySQL tables, and I will be very thankful if someone can still provide them.

Freibuis
Legion of Lost Souls
#50 - 2012-05-04 12:16:04 UTC
CCP Nobody wrote:
With this data structure change we wanted to move over to a standardized way of retrieving static data. This is what YAML provides: it gives us a vast collection of parsers for the majority of programming languages, and it is not tied to a particular OS (which apparently makes Lariel Dallocort super happy).

The process we in Team Core Graphics Tools are following is that after a system is ported to the new structure, we will drop the unneeded tables. And after we have finished porting a system we can give you a look at how that system's schema will look (because we won't know before we start porting it).

Currently we do not have any plans of creating tools that put the data back into a DB. However this is just us giving you the actual data that is used within the game (although the in-game data has been optimized to pieces) and the method of storing and reading that data is totally up to you guys/girls because your needs differ.
- If your application needs some sort of fast key-value lookup you could take a look at level-db
- I would personally recommend mongo-db, because it is schemaless and should be easily used with the yaml data structure.


Don't get me wrong. It's great what you are doing.. But (and there is always a butt!) until all the data is in the new format, this method is going to be a pain: part of the data in one format and part in the other. Us old-timers will have to re-import the YAML data into MS SQL (or, god forbid, not update) so that our functions/stored procs/SQL goodness will still work.

There would be no point moving to a new data structure until ALL static data is moved to YAML format. Also, this will cause coding issues every time a new YAML port is released..

I would still release the complete SQL database until 100% of the static data is released as YAML. That way we won't have to do code changes EVERY TIME.. only once.

I would rather spend a week converting stored procs than spend a day here and there until 100% of the static data is converted to YAML.

I am not saying I don't want YAML... I am saying I would rather do it in one go than every month. Most people who have code like mine will have to import back into SQL to keep stuff working until the eventual day when there is no SQL at all.

Hosedna
FumbleFamily Corp
#51 - 2012-05-04 12:35:02 UTC  |  Edited by: Hosedna
The shared hosting I pay for only has MySQL / PostgreSQL options, as most do, so I guess it will become a bit tricky to do the industry queries on YAML files. We'll lose the expressive power of SQL and have to do the joins "by hand"... Unless there is something along the lines of XPath for YAML? It's not as good as SQL, but it could be a first step to help structure queries...
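
To give a sense of what doing a join "by hand" involves, here is a minimal sketch of an inner hash join over lists of dicts, the shape parsed YAML records would typically have. The table contents and field names are hypothetical and only loosely modelled on the kind of type/group data discussed in this thread.

```python
from collections import defaultdict

# Hypothetical rows; names and fields are illustrative only.
types = [
    {"typeID": 1, "name": "Widget", "groupID": 10},
    {"typeID": 2, "name": "Gadget", "groupID": 10},
    {"typeID": 3, "name": "Gizmo", "groupID": 20},
]
groups = [
    {"groupID": 10, "groupName": "Tools"},
    {"groupID": 20, "groupName": "Toys"},
]


def hash_join(left, right, key):
    """Inner-join two lists of dicts on a shared key."""
    # Build phase: bucket the right-hand rows by join key.
    buckets = defaultdict(list)
    for row in right:
        buckets[row[key]].append(row)
    # Probe phase: look up each left-hand row's key in the buckets.
    joined = []
    for row in left:
        for match in buckets.get(row[key], []):
            joined.append({**row, **match})
    return joined


rows = hash_join(types, groups, "groupID")
for row in rows:
    print(row["name"], "->", row["groupName"])
```

Each table is read once, so the cost grows with the combined size of the two inputs rather than their product. Chaining several such joins is where SQL's expressiveness is most missed.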
CCP Nobody
C C P
C C P Alliance
#52 - 2012-05-04 14:07:08 UTC
The plan is to drop the migrated tables from the data dump with every release. We know that this is difficult, but it would add a lot of overhead to insist that, while moving over to a more flexible data format, we maintain backwards compatibility with a completely separate representation. In the end this would make us less flexible and less able to take advantage of the benefits of the new format.

Unfortunately there are so many systems and so much static data in Eve that any attempt to do them all at once would be a multi-month effort that would be doomed to failure because we wouldn’t have worked through all the problems and issues while trying to apply the solution. We would also cause all feature development to stop, and break all of the tools that we use in day to day development, while likely introducing issues into every single game system. This just isn’t a practical option for us or for you.
James Bryant
Deep Core Mining Inc.
Caldari State
#53 - 2012-05-04 14:29:55 UTC  |  Edited by: James Bryant
Hosedna wrote:
The shared hosting I pay for only has MySQL / PostgreSQL options, as most do, so I guess it will become a bit tricky to do the industry queries on YAML files. We'll lose the expressive power of SQL and have to do the joins "by hand"... Unless there is something along the lines of XPath for YAML? It's not as good as SQL, but it could be a first step to help structure queries...


That is definitely something that is going to bite quite a few folks. I happen to have a virtual server for my hosting, so not a big deal for me, but I have a feeling that I'm in the minority of Eve dev hobbyists. The solutions are out there; this just might push a few people past their commitment point, unfortunately. Still, my feeling is that somebody will step up to the plate and convert all this into SQL after each release anyhow. Without SQL, there's no way I'd be able to do one of the more join-heavy queries I do now, like getting the top ten profitable market categories out of all our trades for the month, or maybe the wackiest query ever, T2 build requirements (uf!).

I'll tell you where this hurts the most, and that is in Android land, where I also develop. Many devices, especially ones still running Gingerbread or earlier, don't have much in terms of heap space, usually only 16 MB (or less on junk devices, of which there are many). It ought to be fun trying to parse/unpack/unserialize something massive like the map data or invTypes. Combine that with statically typed Java, and you have a challenge.

Still, I like a challenge. We'll see how this shakes out. Don't get me wrong, for the platform CCP is developing for, the PC, which has memory to spare, in a dynamically typed language with the need for authoring version control, this is exactly a perfect fit. It just happens to also be a really terrible fit for anything that isn't that.

It really just means that instead of us all being able to talk SQL to each other about problems, everybody will turn the data into whatever format is most convenient for their particular application.
Xander Hunt
#54 - 2012-05-04 14:32:30 UTC
*sigh*

I don't even know where to begin...

First...

YAML: YAML Ain't Markup Language

... come on.. really? I'm seriously, physically rolling my eyes at this.

I've very quickly just skimmed over what the structure is about. So f'n not impressed.

Cons I see....

- First, looking at the "yaml.org" website, it looks like it was coded by a five year old with limited knowledge of anything to do with a computer, let alone design a new type of data structure. Designed in something pre-Netscape Designer. Doesn't ooze a lot of professionalism and confidence towards code base (If there is a "code base" behind specifications of a data structure) and functionality and theory behind the actual concept of the data format, really, nor does it raise any kind of confidence behind who the designers of this data format are when this has been around since 2001. (Yes it was a run-on sentence - sorta) However, I'll give credit where credit is due and note that they did use an external style sheet... which reading on down the code looks like the page was generated anyways. Makes me wonder if the site itself is read from a YAML file?

- Second, just like XML, JSON, and any other non-managed database system that doesn't rely on an index of sorts, one must read all data, or at least to the point where the data you want exists while assuming the data is sorted, from top to bottom, to get that bit (literal) of information to determine whether or not typeID 21471 is a Published object. What a waste. Don't get me wrong. Both have their place. Exchange of clear, described data to be put somewhere. I know massive XML documents float around at times, exchanging hands from one type of system to another, but that XML file isn't used as a "lookup information" source 99.995% of the time.

- The volume of data within EVE... Looking at the SQLite database conversion from Crucible, it's over 200 MB in size. That's with packed data (i.e. 10-character numbers in 4 bytes of data), indexing, structure definitions, page files, etc. A query to pull any data from anywhere in that file takes MILLISECONDS (just timed it: 19 ms to find out it is published). Text files? Too large to handle. I'd have to read thousands of lines to GET to that point.

- Not sure I'm too keen on the whole idea of just taking data out of the existing MS SQL backup and dropping it into text files to be re-consumed. I'd ask that all data stay in the MS SQL dump while you slowly roll out the new YAML files as you massage out the structure you want. Then, when all tables are done, drop MS SQL.

- Some of this structure looks similar to Windows INI files.... 'Cept, headers are marked with an identifier followed by a colon instead of a [identifier] type of ordeal. I do acknowledge there are advancements in comparison to the INI format, but not much more.
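
On the lookup-cost concern in the list above: the top-to-bottom read only has to happen once if the parsed records are then indexed in memory (or loaded into a database). A minimal sketch, assuming the file has already been parsed into a list of dicts; the IDs and the `published` values here are invented for illustration.

```python
# Hypothetical records; in practice these would come from parsing the file once.
records = [
    {"typeID": 1, "published": True},
    {"typeID": 2, "published": False},
    {"typeID": 3, "published": True},
]


def find_by_scan(rows, type_id):
    """Linear scan: walks the rows until it finds a match."""
    for row in rows:
        if row["typeID"] == type_id:
            return row
    return None


def build_index(rows):
    """One pass over the rows, producing a dict keyed by typeID."""
    return {row["typeID"]: row for row in rows}


index = build_index(records)           # paid once, at load time
is_published = index[2]["published"]   # each later lookup is a dict access
print(is_published)
```

This does not remove the up-front parse time or the memory cost of holding everything at once, which is the harder problem on constrained devices.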

Pros...

- No MSSql - Although I started off with training against MS SQL 2000 Enterprise, I've moved very far from it simply due to costs. Yes, I know it's free NOW, but it wasn't like that until recently, and I've never looked back. I might have been an MSSql fanboi if there were always free versions. I'm cheap.

- Take the generalized data and put it into a proprietary structure our applications work with. MSSql, MySQL, SQLite, CSV, our own structure of data (I'm looking at you EVEMon! {wink}) or whatever we want is a GREAT bonus.

Final thoughts

Of course, all we (us?) developers are going to have to follow your lead if we're going to keep developing our tools for your game, but honestly, I've never been, and never will be, a complete fan of single or multiple text files that are supposed to relay some sort of structured data. I avoid creating XML, I avoid CSV, I avoid plain text, simply because repetitive reads of data slow the whole process down, ESPECIALLY when you get into thousands of lines.

With all the enhancements you ladies and gents at CCP have been putting into improving UI response times, I'm quite taken aback that you'd go to a text file to manage database-worthy information, static data or not.

{30 minutes later}

... come to think of it... YAML originated in 2001 and has had pretty much NO MOVEMENT since then... and you're using it for in-house processes and implementing it as a data store halfway through 2012?!?!?!
Katrina Bekers
A Blessed Bean
Pandemic Horde
#55 - 2012-05-04 15:33:43 UTC
Speaking of NoSQL:

Redis.

You will never go back. EVER.

<< THE RABBLE BRIGADE >>

Kouryusei
Keizai Inc
#56 - 2012-05-04 16:15:25 UTC
Following up on Katrina, go play with Couchbase (not CouchDB), it's just as sexy as redis.

In other news, **** YAML. Royally. I'll convert it to a plethora of formats since, like I said - **** YAML.
Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#57 - 2012-05-04 16:22:19 UTC
As long as the data remains in a form that can be easily represented in a tabular form, I'll be backporting it, along with the mysql conversions I've been doing.

Something I would love though:
A separate file that specified the keys (as some are optional) and max lengths of values. Just reduces the amount of preprocessing I'll have to do on import.
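
Until such a file exists, the same information can be derived from the data itself at import time. A minimal sketch, assuming the records have already been parsed into a list of dicts; `summarize_keys` and the sample fields are hypothetical names, not part of any CCP tooling.

```python
def summarize_keys(records):
    """Return {key: {"optional": bool, "max_len": int}} for a list of dicts.

    A key is optional if at least one record lacks it; max_len is the
    longest string form of any value seen for that key.
    """
    total = len(records)
    seen = {}
    for record in records:
        for key, value in record.items():
            info = seen.setdefault(key, {"count": 0, "max_len": 0})
            info["count"] += 1
            info["max_len"] = max(info["max_len"], len(str(value)))
    return {
        key: {"optional": info["count"] < total, "max_len": info["max_len"]}
        for key, info in seen.items()
    }


# Made-up sample data for illustration.
sample = [
    {"id": 1, "name": "Alpha"},
    {"id": 2, "name": "Beta", "description": "Longer text"},
]
summary = summarize_keys(sample)
for key, info in summary.items():
    print(key, info)
```

The result is enough to choose column widths and nullability for a table, at the cost of one extra pass over the data before the import proper.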

It's not a biggy though.

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve

Etil DeLaFuente
Aliastra
Gallente Federation
#58 - 2012-05-04 17:32:04 UTC
So if I understood right, more and more data will be available on the client in YAML format?

Or will we still have to rely on the toolkit?
Lan Staz
Silver Technologies
Minmatar Fleet Associates
#59 - 2012-05-04 18:41:36 UTC
I think you are going to have a perception problem due to the choice of initial samples which are far too simple to show the advantages of structured over relational data.

Maybe showing something more complex, even if it is just an indicator of how things might look, would be a good idea. Something that currently requires several tables and lots of joins between them that would collapse down to one list of structured objects, such as ship definitions or the map.
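
As a rough illustration of that collapse, here is a sketch that folds two related flat tables into one structured object per ship. All table names, fields and values are invented for the example; the real schemas are not known at this point in the thread.

```python
# Hypothetical flat rows, as they might come out of two related SQL tables.
ship_rows = [
    {"shipID": 1, "shipName": "Example Frigate"},
    {"shipID": 2, "shipName": "Example Cruiser"},
]
attribute_rows = [
    {"shipID": 1, "attribute": "slots", "value": 3},
    {"shipID": 1, "attribute": "speed", "value": 400},
    {"shipID": 2, "attribute": "slots", "value": 5},
]


def nest(parents, children, key):
    """Fold child rows into their parent, giving one nested object per parent."""
    nested = {
        row[key]: {"name": row["shipName"], "attributes": {}} for row in parents
    }
    for child in children:
        nested[child[key]]["attributes"][child["attribute"]] = child["value"]
    return nested


ships = nest(ship_rows, attribute_rows, "shipID")
print(ships[1])
```

Everything about a ship then sits in one object, which is the shape a YAML document or a document store would hold directly, with no join needed at read time.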

I'd post something here as an example except there doesn't appear to be a way to post code samples on these boards without losing the structure.

Oh, and as someone who has no access to MS SQL and works in Python anyway, yay for YAML!

Antihrist Pripravnik
Scorpion Road Industry
#60 - 2012-05-04 19:19:55 UTC
Big thanks to all the devs that replied with a lot of technical stuff! I can see the future now.