EVE Technology Lab

 
Killmail Parser Code

Khorkrak
KarmaFleet
Goonswarm Federation
#1 - 2012-04-22 19:17:24 UTC  |  Edited by: Khorkrak
I've pasted a kill mail parser along with an example of using it to iterate through the parsed results here:
Kill Mail Parser
Example Usage

It's implemented with a declarative set of grammar rules, plus a few functions that are triggered when certain tokens are parsed to provide some additional semantic checks.

The result is both sanitized and simple to work with for further processing, such as converting strings to ids for storing in your database. The parser uses the pyparsing module, which performs a recursive descent parse from the grammar you define.

Something I found quite useful while spending a couple of hours learning this parsing package is that each grammar rule can itself parse a string. So, for example, to verify that the datetime rule works as expected, you can test it directly:

g = KillMailGrammar()
r = g.datetime.parseString("2011.02.29 12:12:12")  # 2011 wasn't a leap year...
# raises pyparsing.ParseException: day is out of range for month

r = g.datetime.parseString("2011.02.28 12:12")  # valid day; seconds left off, as EDK does
print(r)
# ['2011.02.28', '12:12:00']

So it adds in the missing :00 for you too. I tested it out a bit with a few ship kills, a capsule, a POS, a customs office, etc. If you try it and find a glitch, let me know.
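
For anyone who wants to try the idea without the full grammar, here is a minimal standalone sketch of a rule plus a parse action that behaves like the datetime example above. It is not the code from the linked parser: the names timestamp and check_datetime are made up for illustration, and it assumes pyparsing is installed.

```python
from datetime import datetime

from pyparsing import ParseException, Regex


def check_datetime(s, loc, toks):
    """Parse action: validate the timestamp and fill in missing seconds."""
    date_part, time_part = toks[0].split()
    if time_part.count(":") == 1:  # "HH:MM" becomes "HH:MM:00"
        time_part += ":00"
    try:
        # strptime rejects impossible dates such as 2011.02.29
        datetime.strptime(date_part + " " + time_part, "%Y.%m.%d %H:%M:%S")
    except ValueError as exc:
        # Report the semantic error as an ordinary parse failure
        raise ParseException(s, loc, str(exc))
    return [date_part, time_part]


# Hypothetical rule: "YYYY.MM.DD HH:MM" with optional ":SS"
timestamp = Regex(r"\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}(:\d{2})?")
timestamp.setParseAction(check_datetime)

print(timestamp.parseString("2011.02.28 12:12"))
```

The rule only describes the shape of the text; the calendar check lives in the parse action, so a bad date surfaces as a ParseException just like a syntax error would.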

Developer of http://www.decloaked.com and http://sourceforge.net/projects/pykb/

Khorkrak
KarmaFleet
Goonswarm Federation
#2 - 2012-04-25 20:57:16 UTC
I've updated my kill mail parser to add the (drone bay) qualifier that I missed earlier and to correct defects with the -10.0 and 10.0 security status values. Much to my pleasant surprise, the developer of the awesome pyparsing package noticed my code and sent me some performance and cosmetic suggestions, which I've incorporated.
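
As a rough illustration of the security status boundary cases, a rule along these lines accepts both -10.0 and 10.0 and rejects anything outside that range. This is only a sketch under my own assumptions, not the fix that went into the parser; security and check_security are hypothetical names.

```python
from pyparsing import ParseException, Regex


def check_security(s, loc, toks):
    """Parse action: convert to float and enforce the -10.0..10.0 range."""
    value = float(toks[0])
    if not -10.0 <= value <= 10.0:
        raise ParseException(s, loc, "security status out of range")
    return value


# Optional sign, one or two digits, optional fractional part
security = Regex(r"[+-]?\d{1,2}(\.\d+)?").setParseAction(check_security)

print(security.parseString("-10.0")[0])
print(security.parseString("10.0")[0])
```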

The next step is to review the hundreds of lines of EDK PHP kill mail parsing code to see if I missed anything. I'd avoided reading it earlier because looking at PHP tends to give me a headache due to its toxicity. After that I'll add support for Russian and German text, which should be trivial. I have an idea for avoiding cluttering up the grammar with alternative literals that will hopefully keep it clean.

Where might I find some sample Russian or German kill mails to test with? There's also this coming up, apparently: Further Localization. It will add somewhat to the complexity of parsing this stuff because, along with the static text portions of the kill mail, the identifiers (that is, item type names) will also be localized, though with hints in there to help avoid lots of look-ups when converting them to their corresponding ids. Looks rather unappealing.


Khorkrak
KarmaFleet
Goonswarm Federation
#3 - 2012-04-26 02:07:22 UTC  |  Edited by: Khorkrak
Added support for Russian and German text. Wasn't too hard actually. The static text on the right-hand side of the kill mail gets translated to English now as well, except for faction names. I'll deal with fixing that tomorrow.

Check out a sample input file in Russian: russian.txt and the resulting print out of the parsed data including things like (laid the final blow) and (cargo) translated as well: Killmail Parse printout

Can't say enough about how fantastic the pyparsing module is for a task like this. It really helps to avoid having to write a bunch of hacky code.

Kill Mail Parser
Example Usage

What's left to do is get the mapping of faction names from English to Russian and German. It now properly converts the Russian and German equivalents, in their various spellings, of Unknown, None, Victim, Faction, In Container and Drone Bay, along with switching the decimal separator from comma to dot and allowing Customs Office kill mails with no victim.
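
One way to translate localized labels without scattering alternative literals through the grammar is a single lookup table feeding one rule. The sketch below shows that general idea only; it is not necessarily how the linked parser does it. TO_ENGLISH and label are hypothetical names, and the German words are ordinary dictionary translations used as placeholders, not strings copied from real kill mails.

```python
from pyparsing import oneOf

# Hypothetical mapping from any accepted spelling to the English label.
# The German entries are illustrative placeholders only.
TO_ENGLISH = {
    "Unknown": "Unknown",
    "Unbekannt": "Unknown",
    "None": "None",
    "Keine": "None",
    "Victim": "Victim",
    "Opfer": "Victim",
}

# One rule matches every known spelling; the parse action swaps in English.
label = oneOf(list(TO_ENGLISH)).setParseAction(lambda toks: TO_ENGLISH[toks[0]])

print(label.parseString("Unbekannt")[0])
```

Adding another language then means adding rows to the table rather than touching the grammar rules.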

Also, strangely, in Russian kill mails the destroyed items sometimes appear before the dropped items, and an extraneous decimal part (a comma followed by zeroes) appears on integer values like Damage Taken and Damage Done, so that's all dealt with too, along with the weird (No Data) that appears instead of None or Unknown in Russian.
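
The number clean-up can be pictured as two small parse actions: one that drops a trailing ",00" from integer fields and one that switches a decimal comma to a dot. Again this is a sketch with made-up names (integer_field, decimal_field), not the code from the linked parser.

```python
from pyparsing import Regex


def to_int(toks):
    """Parse action: drop an all-zero fractional part, e.g. "1234,00"."""
    whole = toks[0].replace(",", ".").split(".")[0]
    return int(whole)


def to_float(toks):
    """Parse action: accept either a comma or a dot as decimal separator."""
    return float(toks[0].replace(",", "."))


# Integer value, optionally followed by ",000..." or ".000..."
integer_field = Regex(r"\d+([.,]0+)?").setParseAction(to_int)
# Signed decimal value using either separator
decimal_field = Regex(r"[+-]?\d+[.,]\d+").setParseAction(to_float)

print(integer_field.parseString("1234,00")[0])
print(decimal_field.parseString("-0,5")[0])
```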
