EVE client log file format readability issues

Tarsas Phage
Sniggerdly
#1 - 2011-12-02 19:30:27 UTC  |  Edited by: Tarsas Phage
I'm trying to write a set of Windows/UNIX command-line utilities (Perl/ActiveState Perl) that parse the log files produced by the EVE client (chat logs, game/combat logs, etc.).

However, the "text" files that the client makes are not straight-up text files like one would assume, and the binary bits in them render them nigh-unreadable by Perl (and any other utility, such as grep, sed, awk, and so on) no matter what kind of formatting sorcery you try. Hell, running 'file' on one of these log files tells you it's 'data'.

Has anyone cracked this problem yet and found a straightforward way of getting Perl or any other similar UNIX-style parsing program to grok these things? For whatever reason, CCP can't just spit out straight-up CRLF text files, and it's terrible.
Dragonaire
Here there be Dragons
#2 - 2011-12-02 20:18:25 UTC
There's some Python stuff out there that most people are using, but the name of it is escaping me at the moment. Python is not my strong suit, but I believe if you do some searching around about 'pickled' files you'll find some more info on the format.

Finds camping stations from the inside much easier. Designer of Yapeal for the Eve API. Check out the Yapeal PHP API Library thread.

Tonto Auri
Vhero' Multipurpose Corp
#3 - 2011-12-02 21:56:13 UTC  |  Edited by: Tonto Auri
Tarsas Phage wrote:
However, the "text" files that the client makes are not straight-up text files like one would assume

They are straight up text files.
If you have any issues, go man iconv

iconv -s -f UTF-16LE -t UTF-8 logfile | yourscript

Sneak edit: But I'm sure Perl has enough tools to open them natively.
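
For anyone who would rather do the conversion inside a script than through an iconv pipe, here is a minimal sketch in Python. It assumes the log really is UTF-16LE, as the iconv command above implies, and that a byte-order mark may or may not be present. The function name read_utf16le_log is made up for this illustration.

```python
def read_utf16le_log(path):
    """Yield decoded lines from a UTF-16LE log file, without line endings.

    Assumption: the file is little-endian UTF-16, possibly starting
    with a byte-order mark (U+FEFF).
    """
    with open(path, "r", encoding="utf-16-le", errors="replace") as fh:
        for index, line in enumerate(fh):
            if index == 0:
                # Drop the BOM if the file starts with one.
                line = line.lstrip("\ufeff")
            # Text mode already maps CRLF to "\n"; strip what is left.
            yield line.rstrip("\r\n")
```

Each yielded line is an ordinary string, so the usual regex or split-based parsing can be applied to it. Undecodable bytes are replaced rather than raising an error, which is a choice made here for robustness and not something the thread prescribes.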

Two most common elements in the universe are hydrogen and stupidity. -- Harlan Ellison

Cassidy Asedya
Thalagat Industries
#4 - 2011-12-02 23:50:37 UTC
Well, since the patch I have also run into problems with my market-log parser. It used to run quite smoothly, but now it stops reading a line at the date entry:

165000.0,10.0,8101,32767,2362855464,10,1,False,2 0 1 1 - 1 2 - 0 2 1 0 : 4 2 : 1 9 , 1,60012316,10000002,30000202,16,
-->
165000.0,10.0,8101,32767,2362855464,10,1,False,2
Tonto Auri
Vhero' Multipurpose Corp
#5 - 2011-12-03 03:11:05 UTC
Cassidy Asedya wrote:
Well, since the patch I have also run into problems with my market-log parser. It used to run quite smoothly, but now it stops reading a line at the date entry:

165000.0,10.0,8101,32767,2362855464,10,1,False,2 0 1 1 - 1 2 - 0 2 1 0 : 4 2 : 1 9 , 1,60012316,10000002,30000202,16,
-->
165000.0,10.0,8101,32767,2362855464,10,1,False,2

Your issue has nothing to do with this thread.


Dragonaire
Here there be Dragons
#6 - 2011-12-03 03:20:22 UTC  |  Edited by: Dragonaire
Chalk my stupid post above up to not feeling well today, to the point that I actually couldn't go to work. The logs are of course in text format, but CCP has been playing around with their encoding, which, as Tonto Auri pointed out, you'll have to work around.

Edit:
I know you're using Perl etc., and I'm not sure if it was the incorrectly formed UTF-16 characters that were giving you problems, but I know that was Cassidy Asedya's problem, and I have a PHP solution for that one.

// Drop NUL bytes and line endings, then trim spaces and commas from both ends.
$trimmer = array("\0", "\r\n", "\n");
$line = trim(str_replace($trimmer, '', fgets($fp)), ' ,');

I'm assuming you're reading the file one line at a time with the above, but you can replace 'fgets($fp)' with any string that holds a line you need fixed. Note that it only works correctly on a single line, but it could be wrapped in a function and used with array_walk(). I'm sure this can be translated into Perl etc. fairly easily as well.
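
As a rough illustration of how that translates, here is the same idea sketched in Python. It assumes, as the PHP snippet does, that the stray bytes in the date field are NULs, and it simply drops them instead of decoding them, so it is only safe for ASCII content. The function name clean_market_line is made up for this example.

```python
def clean_market_line(raw_line):
    """Remove NULs and line endings, then trim spaces and commas.

    Assumption: the damage is limited to NUL bytes interleaved with
    ASCII characters, as in the market-log line quoted earlier.
    """
    for junk in ("\0", "\r\n", "\n"):
        raw_line = raw_line.replace(junk, "")
    # Equivalent of PHP's trim($s, ' ,'): strips both ends.
    return raw_line.strip(" ,")
```

It could be applied to every line of a file with something like `[clean_market_line(l) for l in fh]`, which plays the role array_walk() would in PHP.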


Tarsas Phage
Sniggerdly
#7 - 2011-12-05 03:16:35 UTC
Ah, so these text files are UTF-16?

Might explain some things, considering my C (shell) locales are set to UTF-8. I'll play around with iconv when I'm back at my $HOME computer and test this, and if it works I'm sure Perl's Encode module and its decode() function might come to the rescue.

Thanks for the pointers and I'll report back with any success.
Tonto Auri
Vhero' Multipurpose Corp
#8 - 2011-12-05 04:08:58 UTC
You can usually tell a lot about a file by looking at its first few bytes.
Or you may have file(1) look it up for you.
Quote:
[...\Documents\EVE\Logs\Chatlogs]$ file Planetology_20110504_125100.txt
Planetology_20110504_125100.txt: Little-endian UTF-16 Unicode text, with very long lines, with CRLF, CR, LF line terminators
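
Looking at the first few bytes by hand can be sketched in Python as below. This only checks for a byte-order mark, which is far less than what file(1) does, and whether every EVE log actually starts with one is an assumption that the thread does not confirm. The function name sniff_bom is made up for this example.

```python
import codecs


def sniff_bom(path):
    """Return a codec name guessed from the file's BOM, or None."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    # UTF-32 is checked first: its little-endian BOM begins with
    # the same two bytes as the UTF-16LE BOM.
    candidates = (
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    )
    for bom, name in candidates:
        if head.startswith(bom):
            return name
    return None
```

A return value of None means only that no BOM was found; the file may still be UTF-16 without one, in which case a tool with content heuristics is the better judge.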


Tarsas Phage
Sniggerdly
#9 - 2011-12-05 17:12:29 UTC
Tonto Auri wrote:
You can usually tell a lot about a file by looking at its first few bytes.
Or you may have file(1) look it up for you.
Quote:
[...\Documents\EVE\Logs\Chatlogs]$ file Planetology_20110504_125100.txt
Planetology_20110504_125100.txt: Little-endian UTF-16 Unicode text, with very long lines, with CRLF, CR, LF line terminators


As I mentioned in my OP, file(1) on my systems just saw it as 'data'.

This was on Mac OS X and Solaris, so it's likely that the magic-number database that file(1) uses on those systems isn't up to date enough to grok anything more bit-y than UTF-8.
Tonto Auri
Vhero' Multipurpose Corp
#10 - 2011-12-05 19:56:47 UTC
Probably. I tried it on Cygwin with file-5.05.
And I'm sure you can update the magic database yourself without much trouble.
