Proposed Changes to the Static Data Export (SDE)

Cwittofur Cesaille
Sniggerdly
Pandemic Legion
#41 - 2016-04-26 17:29:30 UTC
Desmont McCallock wrote:
CCP Tellus wrote:
Desmont McCallock wrote:
In my tool I have an algorithm that tries to detect whether the user has dropped the zipped file into the specific folder or has extracted its contents there. If a zipped file is detected, the tool unzips it automatically. The algorithm matches on the file name (we don't want to unzip just any zip file, only the SDE one).
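A minimal Python sketch of the detection step described above, not the tool's actual code. The function name extract_sde_archives and the filename pattern are assumptions made for illustration; the pattern follows the sde-(.+).zip suggestion in this thread, with the dot before "zip" escaped.

```python
import re
import zipfile
from pathlib import Path

# Assumed pattern for this sketch: "sde-<anything>.zip", with a literal dot.
SDE_ZIP = re.compile(r"sde-(.+)\.zip")


def extract_sde_archives(folder: Path) -> list[Path]:
    """Unzip every SDE-named archive found directly in `folder`.

    Returns the archives that were extracted; other zip files are left alone.
    """
    extracted = []
    for path in sorted(folder.glob("*.zip")):
        # Only touch files whose whole name matches and that really are zips.
        if SDE_ZIP.fullmatch(path.name) and zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                archive.extractall(folder)
            extracted.append(path)
    return extracted
```

If the user has already extracted the contents, no zip matches and the function simply returns an empty list.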

You can use the following regular expression:
sde-(.+).zip

We will probably not change the file naming scheme, but we may change where we host static data exports in the future.

I would rather avoid adding another regex search but you're the Boss.

Edit: Just for educational purposes, here is the regex I use to match the filenames of all SDE releases so far.
".*_\d+\.\d+[\.\d+]*_\d+[_db]*\.zip|sde-\d+-\w+[-legacy]*\.zip"


That's a gnarly regex. Why are you using square brackets everywhere? Those only match single characters. It may differ between languages, but I believe you should change your
[_db]*
to
(_db)?
and your
[-legacy]*
to
(-legacy)?
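To make the difference concrete, here is a small sketch using Python's re module (an assumption; the tool under discussion may use a different engine, though character classes behave the same way in the common ones). It compares only the tail of a filename; the "115480" prefix is just sample data and the compare helper is hypothetical.

```python
import re

# [_db]* : zero or more characters, each one of '_', 'd' or 'b', in any order.
CHAR_CLASS = re.compile(r"115480[_db]*\.zip")
# (_db)? : the literal text "_db", present once or not at all.
OPT_GROUP = re.compile(r"115480(_db)?\.zip")


def compare(name: str) -> tuple[bool, bool]:
    """Return (character-class match, optional-group match) for `name`."""
    return bool(CHAR_CLASS.fullmatch(name)), bool(OPT_GROUP.fullmatch(name))


for name in ["115480.zip", "115480_db.zip", "115480bd_.zip", "115480ddd.zip"]:
    print(name, compare(name))
```

The first two names pass both patterns; the last two pass only the character-class version.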
Desmont McCallock
#42 - 2016-04-26 17:32:43 UTC
No I don't, cause I don't want to group match. "_db" and "_legacy" may appear or not in the filename.
Cwittofur Cesaille
Sniggerdly
Pandemic Legion
#43 - 2016-04-26 17:45:01 UTC
Desmont McCallock wrote:
No I don't, cause I don't want to group match. "_db" and "_legacy" may appear or not in the filename.


That's what the ? is for, it makes it so it can appear but isn't required. I just read [_db]* as "zero or more of _ d or b"
Desmont McCallock
#44 - 2016-04-26 18:03:30 UTC
Cwittofur Cesaille wrote:
Desmont McCallock wrote:
No I don't, cause I don't want to group match. "_db" and "_legacy" may appear or not in the filename.


That's what the ? is for, it makes it so it can appear but isn't required. I just read [_db]* as "zero or more of _ d or b"

Parentheses are for grouping. I don't want that. Nevertheless, the regex works for me, so...
Cwittofur Cesaille
Sniggerdly
Pandemic Legion
#45 - 2016-04-26 18:14:55 UTC
Desmont McCallock wrote:
Cwittofur Cesaille wrote:
Desmont McCallock wrote:
No I don't, cause I don't want to group match. "_db" and "_legacy" may appear or not in the filename.


That's what the ? is for, it makes it so it can appear but isn't required. I just read [_db]* as "zero or more of _ d or b"

Parentheses are for grouping. I don't want that. Nevertheless, the regex works for me, so...


I know it's meant for grouping. I'm not being argumentative here, I'm trying to understand. I was taught to use parentheses when I needed a distinct pattern to appear, and adding the question mark at the end just makes the group optional. I don't see an issue with the grouping if you're not actively using it. Again, not trying to argue; I really just want to understand.
Desmont McCallock
#46 - 2016-04-26 18:19:44 UTC  |  Edited by: Desmont McCallock
OK. Here is some food for thought.

Let's say we have the following:

sde-20160426-TRANQUILITY.zip
sde-20160426-TRANQUILITY-legacy.zip
Parallax_1.0_115480_db.zip
YC-118-3_1.0_117575.zip
Aegis_1.1.1_114255_db.zip
someotherfilename.zip

Write a regex that matches only the first five (5) lines, without grouping.
Cwittofur Cesaille
Sniggerdly
Pandemic Legion
#47 - 2016-04-26 18:30:30 UTC  |  Edited by: Cwittofur Cesaille
Point made.

Why are you against grouping though?
Desmont McCallock
#48 - 2016-04-26 18:39:15 UTC
Cwittofur Cesaille wrote:
Point made.

Why are you against grouping though?

I'm not. It just doesn't serve any purpose in this case.
Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#49 - 2016-04-26 19:50:07 UTC
Something to try:

Rename a file to sde-20160426-TRANQUILITY-legcay.zip and see if it still matches.

It may just be the regex engine I'm used to (PCRE), but I think it'll still match.
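A quick way to run this test, sketched with Python's re module rather than PCRE (the two agree on how character classes work). The pattern is the bracket-based one posted earlier in the thread, copied verbatim; matches_original is a hypothetical helper name.

```python
import re

# The bracket-based pattern from earlier in the thread, copied verbatim.
ORIGINAL = re.compile(
    r".*_\d+\.\d+[\.\d+]*_\d+[_db]*\.zip|sde-\d+-\w+[-legacy]*\.zip"
)


def matches_original(name: str) -> bool:
    """True when the whole filename matches the bracket-based pattern."""
    return ORIGINAL.fullmatch(name) is not None


print(matches_original("sde-20160426-TRANQUILITY-legacy.zip"))
print(matches_original("sde-20160426-TRANQUILITY-legcay.zip"))
```

Both calls return True here, because [-legacy]* accepts any run of the characters '-', 'l', 'e', 'g', 'a', 'c' and 'y' regardless of order.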

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter

CCP Tellus
C C P
C C P Alliance
#50 - 2016-04-26 20:44:42 UTC
Desmont McCallock wrote:
3. Would you mind fixing the duplicate entries in blueprints.yaml file? (blueprintTypeID: 41590 has typeID: 38 for manufacturing specified two (2) times with different quantity)

This is how it appears in-game: "Required Input Materials" for that blueprint lists 3x Nocxium and 5x Nocxium.
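For anyone who wants to scan for this kind of duplicate, a minimal Python sketch follows. The dictionary layout and the key name manufacturing_materials are hypothetical stand-ins, not the real structure of blueprints.yaml, and the third material row is made up purely for contrast; only blueprintTypeID 41590 and the two typeID 38 quantities come from this thread.

```python
from collections import defaultdict

# Hypothetical in-memory shape for one blueprint entry (assumed, not the
# actual blueprints.yaml layout).
blueprint = {
    "blueprintTypeID": 41590,
    "manufacturing_materials": [
        {"typeID": 38, "quantity": 3},
        {"typeID": 38, "quantity": 5},
        {"typeID": 34, "quantity": 10},  # invented row, for contrast only
    ],
}


def duplicate_materials(materials):
    """Map each typeID listed more than once to all of its quantities."""
    quantities = defaultdict(list)
    for entry in materials:
        quantities[entry["typeID"]].append(entry["quantity"])
    return {type_id: qty for type_id, qty in quantities.items() if len(qty) > 1}


print(duplicate_materials(blueprint["manufacturing_materials"]))
```

Whether such a duplicate is a data error or intended game design is a separate question; the sketch only reports it.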
Desmont McCallock
#51 - 2016-04-26 21:37:29 UTC  |  Edited by: Desmont McCallock
Steve Ronuken wrote:
Something to try:

Rename a file to sde-20160426-TRANQUILITY-legcay.zip and see if it still matches.

It may just be the regex engine I'm used to (PCRE), but I think it'll still match.
It does, Steve, and in this case the only expression I could come up with to exclude that is ".*_\d+\.\d+(\.\d+)?_\d+(_db)?\.zip|sde-\d+-\w+(-legacy)?\.zip", which supports Cwittofur Cesaille's case, but there is still no need to capture the groups. Nevertheless, this whole regex thing isn't about which is right but merely about what suits one's needs.

Edit: And according to http://regexstorm.net/tester "(.*_\d+\.\d+(\.\d+)?_\d+(_db)?|sde-\d+-\w+(-legacy)?)\.zip" with 'explicit capture' regex option gives the desired result (http://regexstorm.net/tester?p=.*(.*_%5cd%2b%5c.%5cd%2b(%5c.%5cd%2b)%3f_%5cd%2b(_db)%3f%7csde-%5cd%2b-%5cw%2b(-legacy)%3f)%5c.zip&i=sde-20160426-TRANQUILITY.zip%0d%0asde-20160426-TRANQUILITY-legacy.zip%0d%0aParallax_1.0_115480_db.zip%0d%0aYC-118-3_1.0_117575.zip%0d%0aAegis_1.1.1_114255_db.zip%0d%0asomeotherfilename.zip%0d%0asde-20160426-TRANQUILITY-legcay.zip&o=inc)
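The same check can be sketched in Python. Python's re module has no "explicit capture" option, so this version writes each group as (?:...), which groups without capturing and should give an equivalent accept/reject result. The names SDE_NAME, SAMPLES and is_sde_archive are illustrative.

```python
import re

# Grouped pattern from the post above, with every group made non-capturing.
SDE_NAME = re.compile(
    r"(?:.*_\d+\.\d+(?:\.\d+)?_\d+(?:_db)?|sde-\d+-\w+(?:-legacy)?)\.zip"
)

# The six sample names from the thread plus the misspelled "legcay" one.
SAMPLES = [
    "sde-20160426-TRANQUILITY.zip",
    "sde-20160426-TRANQUILITY-legacy.zip",
    "Parallax_1.0_115480_db.zip",
    "YC-118-3_1.0_117575.zip",
    "Aegis_1.1.1_114255_db.zip",
    "someotherfilename.zip",
    "sde-20160426-TRANQUILITY-legcay.zip",
]


def is_sde_archive(name: str) -> bool:
    """True when the whole filename matches the grouped pattern."""
    return SDE_NAME.fullmatch(name) is not None


accepted = [name for name in SAMPLES if is_sde_archive(name)]
print(accepted)
```

Only the first five names are accepted; "someotherfilename.zip" and the "legcay" variant are rejected.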
Desmont McCallock
#52 - 2016-04-26 21:38:15 UTC
CCP Tellus wrote:
Desmont McCallock wrote:
3. Would you mind fixing the duplicate entries in blueprints.yaml file? (blueprintTypeID: 41590 has typeID: 38 for manufacturing specified two (2) times with different quantity)

This is how it appears in-game: "Required Input Materials" for that blueprint lists 3x Nocxium and 5x Nocxium.
And from your experience is this correct? Should it be such?
CCP Tellus
C C P
C C P Alliance
#53 - 2016-04-26 22:16:58 UTC
Desmont McCallock wrote:
And from your experience is this correct? Should it be such?

That's not for me to judge. I suggest you file a bug report and one of the game design teams will take a look at it. :)
Desmont McCallock
#54 - 2016-04-27 09:47:28 UTC  |  Edited by: Desmont McCallock
@CCP Tellus
Mate, in the legacy format, restoring the MSSQL data dump reveals that the dgmAttributeTypes table is empty.

Edit: Problem may be that displayName is defined as 'varchar(100)' and displayName for attributeID: 2445 exceeds that.

Edit2: Importing the dgmAttributeTypes table from the yaml file, I notice that a cleanup has been made. 994 entries have been removed, including some that we use in the EVEMon data files generator. Is this intentional?
The missing IDs can be found via:
SELECT DISTINCT a.attributeID
FROM [dbo].[dgmTypeAttributes] AS a
LEFT JOIN [dbo].[dgmAttributeTypes] AS b ON a.attributeID = b.attributeID
WHERE b.attributeID IS NULL
ORDER BY a.attributeID
IDs are missing from dgmAttributeTypes although they are used in dgmTypeAttributes.
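The query is a LEFT JOIN anti-join: rows from dgmTypeAttributes with no partner in dgmAttributeTypes come back with NULL on the right side. A self-contained sketch of the same idea using Python's sqlite3 module with an in-memory database follows; the table definitions are simplified assumptions and the rows are made-up sample data, not SDE contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed schemas; the real SDE tables have more columns.
    CREATE TABLE dgmAttributeTypes (attributeID INTEGER PRIMARY KEY, displayName TEXT);
    CREATE TABLE dgmTypeAttributes (typeID INTEGER, attributeID INTEGER);
    -- Made-up sample rows, only to exercise the query.
    INSERT INTO dgmAttributeTypes VALUES (1, 'first'), (2, 'second');
    INSERT INTO dgmTypeAttributes VALUES (100, 1), (100, 3), (101, 3), (102, 4);
""")

# Attribute IDs that are referenced but have no definition row.
missing = [row[0] for row in conn.execute("""
    SELECT DISTINCT a.attributeID
    FROM dgmTypeAttributes AS a
    LEFT JOIN dgmAttributeTypes AS b ON a.attributeID = b.attributeID
    WHERE b.attributeID IS NULL
    ORDER BY a.attributeID
""")]
print(missing)
```

With the sample rows above, the attribute IDs 3 and 4 are reported as missing.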
Pete Butcher
The Scope
Gallente Federation
#55 - 2016-04-27 13:16:45 UTC
And we're back to the integrity problems with the SDE. Someone should really clean it up.

http://evernus.com - the ultimate multiplatform EVE trade tool + nullsec Alliance Market tool

CCP Tellus
C C P
C C P Alliance
#56 - 2016-04-27 14:26:42 UTC
Desmont McCallock wrote:
Edit2: Importing the dgmAttributeTypes table from the yaml file, I notice that a cleanup has been made. 994 entries have been removed, including some that we use in the EVEMon data files generator. Is this intentional?

That was not intentional. I'll make a new build of the SDE with several fixes, should have it ready in a few hours. Thanks for your patience. :)
CCP Tellus
C C P
C C P Alliance
#57 - 2016-04-27 21:11:54 UTC
Hello everyone! I published a new SDE that should hopefully contain fixes for all the reported issues. Please let us know if there are any more issues.

https://cdn1.eveonline.com/data/sde/tranquility/sde-20160427-TRANQUILITY.zip
https://cdn1.eveonline.com/data/sde/tranquility/sde-20160427-TRANQUILITY-legacy.zip
Shish Tukay
Caldari Provisions
Caldari State
#58 - 2016-04-27 21:51:30 UTC
Constructive: If you really hate the idea that you'll get more groups in output, you can also do "(?:_db)?" to make the group non-capturing while still being correct :)

Criticism: Using square brackets really is incorrect though, not merely a matter of taste. Sure, it works, sort of, sometimes, if you ignore all the cases in which it doesn't work; but is it not better to work all the time? "(_db)?" = "may or may not contain '_db'", which is right; "[_db]*" = "can contain any number of '_', 'd', or 'b'", e.g. it allows 'bbbbbbbbb_b_b_b___ddbbdbb', which is wrong. Still really confused why you would even want to avoid groups though - that's like avoiding "else" statements and just writing "if(foo) {}; if(not foo) {}" all over the place because you don't like "if(foo) {} else {}" o___O
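A short sketch of what "non-capturing" changes in practice, using Python's re module as an assumed engine. Both patterns accept the same filenames; the only difference is whether the optional part shows up in the match's group list. The patterns are trimmed to the two-part version number for brevity.

```python
import re

name = "Parallax_1.0_115480_db.zip"

# Capturing group: the matched "_db" is recorded as group 1.
capturing = re.fullmatch(r".*_\d+\.\d+_\d+(_db)?\.zip", name)
# Non-capturing group: same matching behaviour, nothing recorded.
non_capturing = re.fullmatch(r".*_\d+\.\d+_\d+(?:_db)?\.zip", name)

print(capturing.groups())      # ('_db',)
print(non_capturing.groups())  # ()
```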
Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#59 - 2016-04-27 23:20:06 UTC
CCP Tellus wrote:
Hello everyone! I published a new SDE that should hopefully contain fixes for all the reported issues. Please let us know if there are any more issues.

https://cdn1.eveonline.com/data/sde/tranquility/sde-20160427-TRANQUILITY.zip
https://cdn1.eveonline.com/data/sde/tranquility/sde-20160427-TRANQUILITY-legacy.zip



\o/ <3

Desmont McCallock
#60 - 2016-04-28 08:07:40 UTC
Shish Tukay wrote:
Constructive: If you really hate the idea that you'll get more groups in output, you can also do "(?:_db)?" to make the group non-capturing while still being correct :)

Criticism: Using square brackets really is incorrect though, not merely a matter of taste. Sure, it works, sort of, sometimes, if you ignore all the cases in which it doesn't work; but is it not better to work all the time? "(_db)?" = "may or may not contain '_db'", which is right; "[_db]*" = "can contain any number of '_', 'd', or 'b'", e.g. it allows 'bbbbbbbbb_b_b_b___ddbbdbb', which is wrong. Still really confused why you would even want to avoid groups though - that's like avoiding "else" statements and just writing "if(foo) {}; if(not foo) {}" all over the place because you don't like "if(foo) {} else {}" o___O
Thanks for your input. The matter is closed. The explicit capture option resolves any mismatch in my case.