These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/

New dev blog: Changes to Toolkit Exported Data

Dil'e Mahn
Savage and Average
#61 - 2012-05-04 20:22:30 UTC
I'm in the "I'd like to see a somewhat more elaborate example of a datastructure you guys are toying with" camp.

Pros and cons of YAML versus other versionable data storage aside: it's a fairly simple format to parse, and there are going to be plenty of folks offering conversions, just like they do with MSSQL to MySQL/SQLite/whatever today. I'm not worried about that, and frankly I don't care that much whether it's YAML or something else. As long as I (or someone else) can cook up a conversion to whatever format I'd want to use, it's all cool.

At least text-based formats mean I don't depend on someone else to do the conversion for me (no MSSQL here), so that's progress.

I don't think anyone in their right mind would want to run a webapp or a low-resources mobile app on big text files, but then again you don't have to. Just pick and choose the data you need and stash that in a format your environment is happy with. I imagine you have to do that today, as well, unless it's possible to run a few-hundred MB MSSQL database on an Android 2.1 device... =]

Come to think of it: for a web app, plenty of things (ship/module data, for example) could very well be stored in separate files containing a JSON blob / PHP serialized array / XML blob / pickle for that item. Name them by itemID, and you have your basic lookup system ready. No more table joins; everything there is to know about that item lives in that file. It might not be the most effective use of disk space, but space is relatively cheap these days, and it works a treat for the "shared hosting and limited to MySQL/PostgreSQL, NoSQL is not an option" crowd. For the things that can't be done that way, there'll be conversions as soon as the spec is out.
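
A rough sketch of that one-file-per-item idea, in Python with JSON. The record fields, the IDs and the function names (write_item_files, load_item) are made up for illustration; the real export may look quite different.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample records; IDs and field names are illustrative only.
ITEMS = {
    1001: {"typeName": "Example Frigate", "groupName": "Frigate", "highSlots": 3},
    1002: {"typeName": "Example Module", "groupName": "Shield Booster"},
}

def write_item_files(items, out_dir):
    """Write one JSON file per item, named by its itemID."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for item_id, record in items.items():
        (out_dir / f"{item_id}.json").write_text(json.dumps(record), encoding="utf-8")

def load_item(item_id, data_dir):
    """Look up one item by ID: a single file read, no joins."""
    path = Path(data_dir) / f"{int(item_id)}.json"
    if not path.is_file():
        return None  # unknown itemID
    return json.loads(path.read_text(encoding="utf-8"))

# Small demo in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    write_item_files(ITEMS, demo_dir)
    print(load_item(1001, demo_dir)["typeName"])
```

Casting the ID with int() before building the path also keeps a web request from smuggling a file path into the lookup.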

Don't worry, people, we'll be fine.

Also: CCP devs, your ability and willingness to speak nerdy to your customers is highly appreciated. Keep rocking.

Shooting people in the face for fun and profit. Well, for fun, mostly.

Charlie Parker Sidrat
KarmaFleet
Goonswarm Federation
#62 - 2012-05-05 01:39:44 UTC
Hmm. Gulp.

cries a bit with confusion and worry.

Cards on the table - I'm not a coder. Never have been, though I've tried repeatedly since the days when line numbers were required.

I enjoy designing spreadsheets and messing about with sql and then Access to get the queries and make tables to import and use as pivot tables on Excel.

Just when I've finally got Eve Industrial Organiser to the point where it's very, very easy to keep it updated in a few keystrokes (the hardest part is remembering how to restore the backup database each patch release, as I only do it for the data dumps), you're going to change it to the point where it doesn't seem like querying lookup tables is going to be an option straight out of the box.

Perhaps it will make coding easier to understand? Maybe I'll lose the fear factor and just start 'getting' it - like I eventually did for excel and access.

Worst case scenario I stop updating EIO, best case scenario - I figure it all out and produce an exe version that doesn't require the use of Excel 2007 which will make more than a few potential users very happy.
Lan Staz
Silver Technologies
Minmatar Fleet Associates
#63 - 2012-05-05 19:05:48 UTC
With the move to YAML for static data, is there any plan to support YAML in the API as well?
Nirnaeth Ornoediad
GoonWaffe
Goonswarm Federation
#64 - 2012-05-05 20:20:40 UTC
CCP Redundancy wrote:

So in general, when dealing with static data, pre-bake your joins - funnily, we tend to already do this in performance critical databases by denormalizing data (only denormalized relational databases can't do that for lists or parent-child relations).

At least, that's the theory...


They can: it's just that you're limited to either pre-defining the list length (or number of children) ahead of time, or living with a table definition which can change at runtime (which, of course, has its own set of challenges*).

* Hint: Views are your friend if you do this, as a View can effectively maintain a "static" view of a table even if it suddenly starts adding columns to itself at runtime.
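
To make that concrete, here is a minimal sketch using SQLite through Python's sqlite3 module (standing in for whatever database you actually run). The table, view and column names are invented for the example, and the list length is fixed at three slots up front.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized: a list of up to three children stored as fixed columns.
    CREATE TABLE ship_flat (
        ship_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        slot_1  TEXT,
        slot_2  TEXT,
        slot_3  TEXT
    );
    -- The view pins the column list that consumers see.
    CREATE VIEW ship_view AS
        SELECT ship_id, name, slot_1, slot_2, slot_3 FROM ship_flat;
    INSERT INTO ship_flat VALUES (1, 'Example Frigate', 'gun', 'gun', NULL);
""")

# The table grows a column at runtime; readers of the view are unaffected.
conn.execute("ALTER TABLE ship_flat ADD COLUMN slot_4 TEXT")

view_columns = [d[0] for d in conn.execute("SELECT * FROM ship_view").description]
table_columns = [row[1] for row in conn.execute("PRAGMA table_info(ship_flat)")]
print(view_columns)
print(table_columns)
```

Anything that reads through ship_view keeps seeing the same five columns, even though ship_flat now has six.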


Just curious: one of the advantages of an MS-SQL dump is that it's trivial to import into Access or Excel and run some basic analytics. Does anyone know of a good YAML-to-relational database integration tool that's freeware? Even an industrial-grade integration tool like Informatica or Cast Iron would be good, so long as they were free.

Fix POSes.  Every player should want one (even if all players can't have one).

Freibuis
Legion of Lost Souls
#65 - 2012-05-06 00:08:47 UTC
I have had time to "stew" this over in my head for a couple of days.

If the data is not going to be kept in the SQL database and you are not going to load it in there, would it be possible to leave the empty tables in there?

That way I can automatically import via SSIS (SQL Server Integration Services), and then dump a SQL file with all the inserts in it for all the SQL users (MSSQL/MySQL) back to the community that needs it.
LifeHatesMe
LifeHatesUsAll
#66 - 2012-05-06 00:56:28 UTC
Thank you CCP for making my head spin xP

Now you got me thinking on Plain Text databases, MySQL & RDBMSs, anddd stuff like YAML, Redis, and MongoDB...

There are so many things that make this way too complicated for me to understand at face value. I wish they had an index of what makes a database good for management instead of "Hey you, go write in 15+ different database wrappers so you can figure out specifically which one is best suited for your specific application" xD
Ein Spiegel
Fly-by-Night Industries LLC PTY LTD
#67 - 2012-05-07 04:19:22 UTC
I love the geeky talk from both Nobody and Redundancy. I even understood some of it. But I wanted to ask about a different angle...

Is the move to YAML going to make integrating your branches back and forth across the depot structure in Perforce easier, or is it going to allow you to simply be able to keep the data out of a database structure and allow you to have versioned static data as a part of any specific label or changelist? Maybe a bit of both, I think... but will your QA still like you in the morning?

If the static data stops being, well, static, across all of the developers, changes to static data could happen which makes integrating a bear. Seeing how this change affects the current branch works, but how will it break across multiple client workspaces going through multiple integrations down to the release branch?

Also - you guys use Perforce? I can't imagine the kind of binary assets you've got in that SCM, I had enough problems with trying to version Word documents in it. (Don't ask. I don't work there anymore and I will not talk about it.)
Zor'katar
Matari Recreation
#68 - 2012-05-07 15:00:30 UTC
So who's going to be the hero of Weaksauce Developers such as myself by converting everything back to an SQLite database? :D
Zifrian
The Frog Pond
Ribbit.
#69 - 2012-05-11 05:11:59 UTC
Zor'katar wrote:
So who's going to be the hero of Weaksauce Developers such as myself by converting everything back to an SQLite database? :D

Not me. I have no idea what any of this is but I guess I have to learn it. I will just convert the data into SQLite and use what I have already. I'm more interested in making my app work and not have to spend a lot of time on it. So I hope this is a permanent thing because I don't care about new tech stuff like this really lol
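
For what it's worth, the conversion step could look something like this rough Python sketch. It assumes a YAML library has already parsed a file into a dict of dicts; the PARSED data, the table and column names, and the load_into_sqlite helper are all hypothetical.

```python
import sqlite3

# Hypothetical stand-in for what a YAML loader might return for one table:
# a mapping of typeID -> attribute mapping. Field names are illustrative.
PARSED = {
    1001: {"typeName": "Example Frigate", "radius": 35.0},
    1002: {"typeName": "Example Module"},  # optional keys may be absent
}

def load_into_sqlite(conn, table, key_column, records):
    """Create `table` from a dict of dicts and insert one row per key."""
    # Column set is the union of every key seen in any record.
    columns = sorted({name for rec in records.values() for name in rec})
    defs = [f'"{key_column}" INTEGER PRIMARY KEY'] + [f'"{c}"' for c in columns]
    conn.execute(f'CREATE TABLE "{table}" ({", ".join(defs)})')
    placeholders = ", ".join("?" for _ in defs)
    # Missing keys become NULL via dict.get().
    rows = [(key, *(rec.get(c) for c in columns)) for key, rec in records.items()]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load_into_sqlite(conn, "invTypes", "typeID", PARSED)
result = conn.execute("SELECT typeName, radius FROM invTypes ORDER BY typeID").fetchall()
print(result)
```

The columns are left untyped, which SQLite allows; a stricter target database would need real column types.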

Maximize your Industry Potential! - Download EVE Isk per Hour!

Import CCP's SDE - EVE SDE Database Builder

Jognu
French Kiss Singularity
#70 - 2012-05-25 14:51:55 UTC
If that can help some people : https://forums.eveonline.com/default.aspx?g=posts&m=1362009

EveAI developer: https://forums.eveonline.com/default.aspx?g=posts&t=21803

suenoni terracotta
Republic Military School
Minmatar Republic
#71 - 2012-07-16 12:34:53 UTC
I may be blind or something, but I can't find which version of YAML is used. At present there are three to choose from. So which version is it?
Allan Ahra
The Scope
Gallente Federation
#72 - 2012-07-21 08:52:57 UTC
I have to vent some anger over this too :p


I totally understand AND applaud the move away from a database dump. The MSSQL datadump means it's basically only accessible to people running Windows. Moving away from that and to a "plain text" format is the right thing to do. For this same reason lots of interchange text formats are being used to make different systems communicate with each other.



However, I do have worries about yaml.
The yaml format is now over 10 years old, and yet, it's not really seeing a whole lot of traction. The available tools for dealing with yaml are in a pretty sorry state. I tried several libs and most can't even handle the extended sample file.
With the state of things, it looks like I'll have to make my own yaml parser just to get the features/performance/memory use I need.
XML would have been my personal preference as an interchange format (much more mature, LOTS of support tools available, and this really nifty/elegant thing called XSLT), but I'm not really going to debate that issue.

What I AM worried about however is the complete lack of data description. With the DB, you at least could query the database for the table columns, types and nullability, which in turn makes it easy to prepare your own internal structures. Yaml is completely form-free. And this is a HUGE disadvantage for what most people use the eve datadump for. Form-free data has its uses, tabular data ISN'T one of them.

With how YAML is now, you need to read the ENTIRE yaml file into memory, just to be able to figure out what all the columns are in the table. You also need to read the entire yaml file into memory just to have a rough idea of the datatypes. If you want actual datatypes, you need to iterate over the entire table and data and infer actual types from the basic types (int, float, string) yaml provides. And even then, we only know what IS there, not what COULD be there in some future version of the data dump.

So you read a yaml file, see that one field called humptyDumpty is an integer type in yaml, and by looking at all the values you find, you see no value is larger than 255. So you infer it's a 1 byte unsigned integer. Then in the next datadump more data is added and oops, your internal table no longer works. Is a string going to need unicode? Or will ANSI/ASCII suffice?
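
A small Python sketch of that inference problem, assuming the yaml has already been parsed into a dict of dicts. The records, the infer_schema and fits_uint8 helpers and every field except humptyDumpty are invented for illustration.

```python
# Hypothetical parsed records for two successive data dumps.
RECORDS_V1 = {
    1: {"humptyDumpty": 12, "name": "alpha"},
    2: {"humptyDumpty": 255, "name": "beta", "radius": 1.5},
}
RECORDS_V2 = {**RECORDS_V1, 3: {"humptyDumpty": 300, "name": "gamma"}}

def infer_schema(records):
    """Scan every record and summarise each column that appears."""
    schema = {}
    for rec in records.values():
        for column, value in rec.items():
            info = schema.setdefault(
                column, {"types": set(), "min": None, "max": None, "count": 0}
            )
            info["types"].add(type(value).__name__)
            info["count"] += 1
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                info["min"] = value if info["min"] is None else min(info["min"], value)
                info["max"] = value if info["max"] is None else max(info["max"], value)
    for info in schema.values():
        # A column missing from at least one record has to be nullable.
        info["nullable"] = info["count"] < len(records)
    return schema

def fits_uint8(info):
    """True if every observed value is an int in the range 0..255."""
    return info["types"] == {"int"} and info["min"] >= 0 and info["max"] <= 255

schema_v1 = infer_schema(RECORDS_V1)
schema_v2 = infer_schema(RECORDS_V2)
print(fits_uint8(schema_v1["humptyDumpty"]), fits_uint8(schema_v2["humptyDumpty"]))
```

The whole file has to be scanned before any column can be sized, and the answer for humptyDumpty flips as soon as the second dump adds one larger value.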

You stated it yourself, if you want performance, you don't want to use the yaml "as is" in a dictionary type associative array. It is too slow to load, too memory wasteful. If you need to convert, then knowing the correct "record" format up front is a huge benefit.



So here are my 2 big questions in this issue...
1° Can we expect to get "easily computer-parsable" descriptions with decent types of each of the yaml tables?

2° With the poor state of yaml parsers out there, can we get a guarantee about what features of yaml you guys will be using? If you have to make your own parser (or pick one that has the features you're going to be using), then the plain "unordered key/values blocks" / associative array/dictionary thing as used in the current 2 yaml files is easy enough to handle. Some of the other yaml features are... quite complex to parse.
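
To show how little is needed for just that plain key/value subset, here is a rough Python sketch. It is NOT a yaml parser: it assumes space-indented nested mappings with plain scalar values only (no lists, quoting, anchors, tags, flow style or multi-line strings), and the sample data and function names are made up.

```python
def parse_scalar(text):
    """Convert a plain scalar to int, float, or str (no quoting rules)."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def parse_simple_yaml(source):
    """Parse nested block mappings of plain scalars, indented with spaces."""
    root = {}
    stack = [(-1, root)]  # (indent of the key owning the mapping, mapping)
    for raw in source.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # skip blank lines and whole-line comments
        indent = len(raw) - len(raw.lstrip(" "))
        key, sep, value = raw.strip().partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {raw!r}")
        # Close every mapping that is at the same or a deeper indent.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        key = parse_scalar(key.strip())
        value = value.strip()
        if value:
            parent[key] = parse_scalar(value)
        else:
            child = {}  # a key with no value opens a nested mapping
            parent[key] = child
            stack.append((indent, child))
    return root

SAMPLE = """\
# illustrative data only
1001:
  typeName: Example Frigate
  radius: 35.0
1002:
  typeName: Example Module
"""

parsed = parse_simple_yaml(SAMPLE)
print(parsed)
```

Anything beyond this subset would break it, which is why a guarantee about the features in use matters.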



Semi-related question: Are bugs/inconsistencies in the datadump something we should report in the normal bug forum? Or is there another place for those?
Zifrian
The Frog Pond
Ribbit.
#73 - 2012-08-10 21:07:11 UTC  |  Edited by: Zifrian
After going through 3 patches now that changed the file structure of the datadump, I really think we need a better rollout plan for this change. This onesey twosey stuff isn't going to cut it. I appreciate not wanting to change everything all at once, but I would much rather have the full yaml files alongside the database dump for a month or two and then just hard switch over. That way I can update my updater to rebuild the MSSQL db first, then run my already established SQLite conversion for IPH. I'm not re-writing all my stuff to use yaml and will just convert it to my current DB system anyway. I just don't see how deleting "radius" from the table and putting it in a yaml file is helping me any. All I do is look at the table and ask "is this something I need?" No? OK, comment it out and move on. I'm pretty sure most other 3rd party devs are doing something similar. But are we going to do this every patch?

Edit: I re-read the previous post about how you need to do it this way but there has to be a better way no? I mean honestly, I could care less that you are using yaml. All you are doing is making me spend hours of development time over the next what? How many months? How long will this transition take? This is really a crappy situation because I have spent a lot of my own personal time developing my app for fun and this isn't fun for me. I'm sure other devs will feel the same when they have to convert the data they need for their apps. Right now the three fields you changed probably mean little to most of us.

Yeah we are a "resourceful bunch" but man, this is going to start sucking when you migrate things that matter. I thought CCP wanted to support our community. Can't we come up with another process for this transition?

Maximize your Industry Potential! - Download EVE Isk per Hour!

Import CCP's SDE - EVE SDE Database Builder

Galen Kamari
The Scope
Gallente Federation
#74 - 2012-08-14 12:25:44 UTC
Lan Staz wrote:
With the move to YAML for static data, is there any plan to support YAML in the API as well?

CREST will return data in a JSON format, so yes (sort of). It's something for the future, though.

-GK