
EVE Technology Lab

 
 

[MYSQL] XML files - Loading character / Corporation Assets into the database.

Troyd23
Repercussus
#1 - 2013-11-26 16:08:40 UTC
Hey Everyone,

For one of my ongoing projects, I'm required to use the EVE Online API to import corporation / character data. Typically I'm able to use the LOAD XML function for very quick and easy imports into a MySQL database. With some extra PHP scripting, I keep this table up to date, in sync with the EVE Online cache expiry timers, by downloading the file and running a LOAD XML command via MySQL (roughly like the sketch below the links).

MySQL LOAD XML documentation

http://dev.mysql.com/doc/refman/5.5/en/load-xml.html

A really easy-to-load XML file from the API export

http://wiki.eve-id.net/APIv2_Corp_MemberTracking_XML
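
For reference, the kind of thing I do for the flat feeds looks roughly like this. It's only a sketch: the table name, its columns, the key parameters and the connection details are placeholders, not my real setup, and the endpoint/params should be checked against the wiki page above.

<?php
// Sketch: fetch the MemberTracking XML, then bulk-load it with LOAD XML.
// Table/column names, credentials and API key values are placeholders.
$url  = 'https://api.eveonline.com/corp/MemberTracking.xml.aspx?keyID=KEY_ID&vCode=V_CODE';
$file = '/tmp/member_tracking.xml';
file_put_contents($file, file_get_contents($url));

$db = mysqli_init();
$db->options(MYSQLI_OPT_LOCAL_INFILE, true); // LOCAL needs local_infile enabled on both ends
$db->real_connect('localhost', 'user', 'pass', 'evedb');

// Every <row ... /> element becomes one table row; the attribute names have
// to match the column names of member_tracking for LOAD XML to map them.
$sql = "LOAD XML LOCAL INFILE '" . $db->real_escape_string($file) . "'
        REPLACE INTO TABLE member_tracking
        ROWS IDENTIFIED BY '<row>'";
$db->query($sql) or die($db->error);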


When it comes to an asset import, I run into a problem. Assets are often contained one within another, which requires one-to-many relationships across multiple tables to properly represent what belongs to what, i.e. asset A has five rows belonging to it in table B via some key.



View of the asset XML and relevant details (an abbreviated snippet follows the link).

http://wiki.eve-id.net/APIv2_Char_AssetList_XML
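
For anyone who doesn't want to click through, the nesting looks roughly like this (abbreviated from the page above, with made-up attribute values - check the page for the exact column list):

<rowset name="assets" key="itemID" columns="itemID,locationID,typeID,quantity,flag,singleton,rawQuantity">
  <row itemID="1000000000001" locationID="60003760" typeID="11019" quantity="1" flag="4" singleton="1">
    <rowset name="contents" key="itemID" columns="itemID,typeID,quantity,flag,singleton,rawQuantity">
      <row itemID="1000000000002" typeID="34" quantity="5000" flag="0" singleton="0" />
    </rowset>
  </row>
</rowset>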



Currently I resolve the issue using PHP's SimpleXML functions and a series of foreach loops to walk each row in the XML file. The rows are then inserted individually using a MySQL statement (roughly the shape of the sketch after the SimpleXML link below). As you can see, this is incredibly slow compared to using one or two LOAD XML commands.

SimpleXML documentation
http://php.net/manual/en/book.simplexml.php
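
Stripped down, my current approach is basically this. The assets table, its columns and the file path are made up for illustration, not my real schema:

<?php
// Rough shape of the current import: one INSERT per <row> element.
$db  = new mysqli('localhost', 'user', 'pass', 'evedb');
$xml = simplexml_load_file('/tmp/asset_list.xml');

foreach ($xml->result->rowset->row as $row) {
    $a = $row->attributes();
    $db->query(sprintf(
        "INSERT INTO assets (itemID, typeID, quantity, flag, singleton)
         VALUES (%d, %d, %d, %d, %d)",
        (int)$a['itemID'], (int)$a['typeID'], (int)$a['quantity'],
        (int)$a['flag'], (int)$a['singleton']
    ));
    // ...further nested foreach loops walk any child rowsets into their own
    // tables, issuing another INSERT per row - hence the multi-hour runtime.
}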


My question: is there a way to use the LOAD XML command across multiple tables for the hierarchical file presented by the asset XML, in a similar manner to the way it can be used to import a much simpler XML file, such as the member tracking XML?

Secondly, if not: how did you solve this issue when importing character / corporation asset data into a MySQL environment? Or were you forced to rely on excessive server-side scripting?

Thanks for your time! I appreciate any thoughts / answers!
Torana Norvana
Center for Advanced Studies
Gallente Federation
#2 - 2013-11-27 11:04:07 UTC
Given that asset cache timers are usually 5+ hours, is it really a problem, given that you already seem to have a working solution?
Troyd23
Repercussus
#3 - 2013-11-27 16:07:43 UTC
Torana Norvana wrote:
Given that asset cache timers are usually 5+ hours, is it really a problem, given that you already seem to have a working solution?


Right now, the sheer amount of data means the script runs for 2-3 hours. This is only going to increase moving forward and will eventually hit 6 hours. I'm trying to future-proof it for larger volumes as my exposure to asset data increases. I know the LOAD XML method is more efficient - I use it for most of the other EVE Online API imports and it literally takes seconds for thousands of rows.
Paka-Tegat Birshiri
Paragon Material Extraction and Processing
#4 - 2013-11-27 23:30:26 UTC  |  Edited by: Paka-Tegat Birshiri
I'd take a look at your code. I'm currently running a small (non-public... may change if I ever get my **** together to harden it) database operation for real-life friends and everyone in my corporation. The asset list currently updates 25 characters and six corporations every six hours, and does so almost instantly.

First, it sounds like you're creating multiple tables? That seems like an odd choice, since you lose a lot of indexing power in SQL. My assets table currently has 9 columns: The 7 from the API, 1 that's the characterID/corporationID of the owning individual, and 1 that's the parentItemID. Those last two are added in the loop that processes the XML.
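
Something along these lines, in other words. This is a sketch, not my production schema: the types and indexes are guesses, and the seven API columns should be double-checked against the asset list page.

<?php
// Sketch of the single-table layout described above: seven columns mirroring
// the attributes on each asset <row>, plus ownerID and parentItemID, which
// the processing loop fills in.
$db = new mysqli('localhost', 'user', 'pass', 'evedb');
$db->query("
    CREATE TABLE IF NOT EXISTS assets (
        itemID       BIGINT UNSIGNED NOT NULL,
        ownerID      BIGINT UNSIGNED NOT NULL,   -- characterID / corporationID
        parentItemID BIGINT UNSIGNED NULL,       -- NULL for top-level items
        locationID   BIGINT UNSIGNED NULL,
        typeID       INT UNSIGNED NOT NULL,
        quantity     INT UNSIGNED NOT NULL,
        rawQuantity  INT NULL,
        flag         INT NOT NULL,
        singleton    TINYINT(1) NOT NULL,
        PRIMARY KEY (itemID),
        KEY idx_owner (ownerID),
        KEY idx_parent (parentItemID)
    ) ENGINE=InnoDB
");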

Because of that, I don't think it's possible to do it with direct MySQL. But it's possible to write very fast code to deal with the size of the data you'll be receiving.

The main worker is a recursive function that drills down into each item. If it's a top-level item, it sets parentItemID to NULL. If there are child rows (detected via SimpleXML: count($row->children())), the function calls itself, passing the current itemID as the parentItemID of the children. Those children can have children of their own, and eventually the whole tangled mess of recursion ends when you're done.
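
The skeleton looks roughly like this. It's a simplified sketch, not the production code, and it assumes a table like the one sketched above:

<?php
// Simplified sketch of the recursive walk described above. $rows is the list
// of <row> elements from SimpleXML, $ownerID is the characterID/corporationID,
// and $parentItemID is NULL for top-level items.
function storeAssets(mysqli $db, $rows, $ownerID, $parentItemID = null) {
    foreach ($rows as $row) {
        $a      = $row->attributes();
        $itemID = (int)$a['itemID'];

        // In real code you'd prepare this once and reuse it.
        $stmt = $db->prepare(
            "INSERT INTO assets
                 (itemID, ownerID, parentItemID, locationID, typeID, quantity, flag, singleton)
             VALUES (?, ?, ?, ?, ?, ?, ?, ?)
             ON DUPLICATE KEY UPDATE
                 ownerID = VALUES(ownerID), parentItemID = VALUES(parentItemID),
                 locationID = VALUES(locationID), quantity = VALUES(quantity),
                 flag = VALUES(flag), singleton = VALUES(singleton)");

        $locationID = isset($a['locationID']) ? (int)$a['locationID'] : null;
        $typeID     = (int)$a['typeID'];
        $quantity   = (int)$a['quantity'];
        $flag       = (int)$a['flag'];
        $singleton  = (int)$a['singleton'];
        $stmt->bind_param('iiiiiiii', $itemID, $ownerID, $parentItemID,
            $locationID, $typeID, $quantity, $flag, $singleton);
        $stmt->execute();

        // A nested <rowset name="contents"> means this item contains others:
        // recurse with the current itemID as the children's parentItemID.
        if (count($row->children()) > 0) {
            storeAssets($db, $row->rowset->row, $ownerID, $itemID);
        }
    }
}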

Two things. First, depending on how you want to handle updates, you'll need to either flush the tables or write your queries using the INSERT ... ON DUPLICATE KEY UPDATE syntax. I'd only advise flushing if you're also using transactions, just in case something bad happens in the middle. Second, there is occasional weirdness when you're using mining crystals; I've not figured that edge case out, but it's little more than a single duplicate key error in a table that currently holds almost 50k records. (You should see the wallet journal table!)
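
If you go the flush route, the shape is roughly this. Sketch only: storeAssets is the function sketched above, and $xml / $ownerID are assumed to come from wherever you pull and parse the feed.

<?php
// Sketch of the flush-and-reload variant. Note that TRUNCATE would commit the
// transaction implicitly, so the flush has to be a DELETE.
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT); // make mysqli throw on errors
$db = new mysqli('localhost', 'user', 'pass', 'evedb');
$db->autocommit(false);
try {
    $db->query('DELETE FROM assets WHERE ownerID = ' . (int)$ownerID);
    storeAssets($db, $xml->result->rowset->row, $ownerID);
    $db->commit();   // old data is only replaced once the whole import succeeds
} catch (Exception $e) {
    $db->rollback(); // anything bad mid-import leaves the previous data intact
    throw $e;
}
$db->autocommit(true);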

I don't have access to the production code from where I am right now, but here's a precursor that worked just as well... just not as elegantly.
http://pastebin.com/SGzY83Tz