These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/

EVE Technology Lab

 
  • Topic is locked indefinitely.
 

CREST Market Data

First post
Author
Lucas Kell
Solitude Trading
S.N.O.T.
#1 - 2016-02-27 14:54:31 UTC
Hi
Looking at the way the market data is delivered, it's very much designed for a grab-as-needed approach. Knowing what EVE players are like, however, there's undoubtedly a desire to pull large amounts of data for aggregation, which requires hammering CREST for an extended period of time.

I was wondering how feasible it would be to instead have the servers generate a bulk file, say once per day, with all active orders in it, one file for each region, which could then be compressed and fetched by players. This would let most people get bulk data far faster, level the playing field for people with less programming knowledge (the Excel users, basically), and I imagine it would reduce load on the server, since it wouldn't have to answer the same requests for every item in every region and would send far less over the network.
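A feed like that would be trivial to produce and consume. A minimal sketch, assuming a gzipped JSON list of orders per region (the file layout is entirely an assumption; no such endpoint exists):

```python
import gzip
import json

# Sketch of the proposed per-region daily dump. The format (a gzipped
# JSON list of order dicts) is an assumption; no such CREST endpoint exists.

def pack_bulk_orders(orders):
    """Server side, once per day: serialize all active orders and gzip."""
    return gzip.compress(json.dumps(orders).encode("utf-8"))

def load_bulk_orders(blob):
    """Client side: decompress a region dump and return the order list."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

One fetch per region per day instead of one request per item type, and gzip does well on the highly repetitive order fields.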

The Indecisive Noob - EVE fan blog.

Wholesale Trading - The new bulk trading mailing list.

Pete Butcher
The Scope
Gallente Federation
#2 - 2016-02-29 06:55:23 UTC
I've been pushing for bulk import since CREST started: https://forums.eveonline.com/default.aspx?g=posts&m=6364870#post6364870
Join the party, but I don't think you should get your hopes up.

http://evernus.com - the ultimate multiplatform EVE trade tool + nullsec Alliance Market tool

William Kugisa
Perkone
Caldari State
#3 - 2016-03-01 00:25:08 UTC
I also just ran into this. Even with a "grab as needed" approach, some industry stuff I wanted quickly ran into needing several hundred, sometimes even thousands, of requests if I wanted more than what https://public-crest.eveonline.com/market/prices/ provides directly...

What about third party / player data sources? CCP set the cache timer to 5 minutes, and it would be nice to have access to fairly "live" data.

Given CCP's limit of 20 connections / 150 requests per second, you can maybe keep one region fully updated from a single server box (buy and sell orders for every SDE item with a market group gave me 24,738 requests, so at 150 requests/sec that's about 3 minutes).

But scaling that out to all regions seems expensive. The cheapest I can think of using AWS is about $40 per year per IP address (on the assumption that CCP's rate limit is per IP), plus all the storage, IO, bandwidth, etc. charges.
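For what it's worth, the arithmetic above checks out. A tiny helper, using only the figures from this post (24,738 requests, 150 req/s cap):

```python
# Back-of-envelope crawl time at CREST's rate cap. The 24,738 request
# count and the 150 req/s limit are the figures quoted in the post above.

def crawl_minutes(request_count, rate_per_sec=150):
    """Minutes needed to issue request_count requests at rate_per_sec."""
    return request_count / rate_per_sec / 60.0

# 24,738 requests at 150 req/s works out to just under 3 minutes.
```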
Lucas Kell
Solitude Trading
S.N.O.T.
#4 - 2016-03-01 16:03:10 UTC  |  Edited by: Lucas Kell
I think hammering them from multiple IP addresses should probably not be encouraged. The thing is, they know people will want all of the data at once and that they will pull it one way or another, so it would seem to benefit them as well as us to provide it in bulk, even if it was just a daily data set. Aggregation as most of us use it doesn't usually need up-to-the-minute data, but it does need very broad data sets.

Edit: We just need a way to coax FoxFour in here. Someone going to Fanfest this year, offer to get him a hotdog or something.

The Indecisive Noob - EVE fan blog.

Wholesale Trading - The new bulk trading mailing list.

William Kugisa
Perkone
Caldari State
#5 - 2016-03-03 12:58:30 UTC
Well, it looks to me like the relevant public CREST APIs are largely what the game client uses, but with an update rate that is many times slower, so I guess they didn't have time to make anything completely new.

EVE-Central has an API (see http://dev.eve-central.com/evec-api/start ), including daily dumps, so for non-live data sets there is that. I believe they use CREST for at least some of it now (though I couldn't find a clear "this is how we get data" page...); I'd guess they just run at the rate limit, looping over all the types and regions.
Lucas Kell
Solitude Trading
S.N.O.T.
#6 - 2016-03-03 13:20:52 UTC
William Kugisa wrote:
Well, it looks to me like the relevant public CREST APIs are largely what the game client uses, but with an update rate that is many times slower, so I guess they didn't have time to make anything completely new.

EVE-Central has an API (see http://dev.eve-central.com/evec-api/start ), including daily dumps, so for non-live data sets there is that. I believe they use CREST for at least some of it now (though I couldn't find a clear "this is how we get data" page...); I'd guess they just run at the rate limit, looping over all the types and regions.
Yeah, EVE-Central do their daily dumps, but they include a lot of duplication and quite a wide spread of update times. I tend to just hammer the ever living **** out of the CREST API at like 140 rq/s for every order in the entire game, region by region, instead, since it gives cleaner data. At the moment I'm infrequently and manually triggering the updates, but long term I plan on doing this at least once per day, and that's an awful lot of requests to be firing off. Even if they were to build the data by firing the same requests at their own server once per day, it would reduce the number of requests they receive as soon as at least two people started downloading that bulk data instead.
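For anyone crawling at a similar rate, a minimal client-side throttle keeps you safely under the cap. A sketch (the rate is illustrative; the real targets would be the per-region CREST order resources):

```python
import time

# Minimal fixed-rate throttle for a CREST crawl, e.g. ~140 req/s as
# described above. The rate is illustrative; pick something under the cap.

class Throttle:
    def __init__(self, per_sec):
        self.interval = 1.0 / per_sec
        self.next_at = 0.0  # monotonic time when the next call is allowed

    def wait(self):
        """Block until the next request slot, then reserve the one after."""
        now = time.monotonic()
        if now < self.next_at:
            time.sleep(self.next_at - now)
            now = self.next_at
        self.next_at = now + self.interval
```

Call `wait()` before each request; bursts are smoothed to one request per slot rather than averaged over a window.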

The Indecisive Noob - EVE fan blog.

Wholesale Trading - The new bulk trading mailing list.

William Kugisa
Perkone
Caldari State
#7 - 2016-03-03 13:52:56 UTC
If you look at, say, http://api.eve-central.com/api/quicklook?typeid=34&regionlimit=10000002 , you generally see the same "reported" time for all listed orders of a given type and region, which makes sense whether they're cache scraping or using CREST: that's how the data comes. It's only when you look at multiple types or regions that the times vary, but the same happens using CREST directly right now, since it takes hours to crawl all the types/regions.

So just using quicklook with all the types and no region filter would get nearly the same thing?

The only way I see this being clearly better is the "multiple IP" thing, aggregating on a central server (such that, say, EVE-Central's data set is always less than 30 minutes old, rather than up to around 4 hours).
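For reference, building those quicklook queries takes only the two parameters from the URL quoted above; the endpoint is the real eve-central one, but treat the sketch as illustrative:

```python
import urllib.parse

# Builds eve-central quicklook URLs like the one quoted above. Only the
# typeid and regionlimit parameters from that URL are assumed.

QUICKLOOK = "http://api.eve-central.com/api/quicklook"

def quicklook_url(type_id, region_id=None):
    """URL for one type, optionally filtered to a single region."""
    params = {"typeid": type_id}
    if region_id is not None:
        params["regionlimit"] = region_id
    return QUICKLOOK + "?" + urllib.parse.urlencode(params)
```

Dropping the `regionlimit` filter and looping over type IDs is the "all types, no region filter" crawl suggested above.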
Lucas Kell
Solitude Trading
S.N.O.T.
#8 - 2016-03-03 15:30:03 UTC
They use CREST. There's at least an 8-hour spread on EVE-Central data, and while there are no duplicates using quicklook, I doubt they'd like people hammering it for mass data either, which is why they provide bulk files. The issue is that those bulk files contain all the CREST data they have fetched, duplicates included, and there's no guarantee that all of it was fetched at the same rate. I imagine they prioritise regions like The Forge to keep them more up to date, which is good if you're looking for region-specific data but not so good if you're trying to aggregate data for roughly the same time period.

I mean if nothing changes, the best method will still be smashing CREST at close to the maximum requests per second just from a single IP, which works and is achievable in a reasonable time period. I'd not do multiple IPs because that's getting around their limits which would be bad. Just seems crazy that the best way to get bulk data is in thousands and thousands of individual requests.

The Indecisive Noob - EVE fan blog.

Wholesale Trading - The new bulk trading mailing list.

Pete Butcher
The Scope
Gallente Federation
#9 - 2016-03-03 16:08:41 UTC
Lucas Kell wrote:
They use CREST. There's at least an 8-hour spread on EVE-Central data, and while there are no duplicates using quicklook, I doubt they'd like people hammering it for mass data either, which is why they provide bulk files.


Evernus uses eve-central as an alternative and they haven't complained (yet), so maybe it's not a problem.

http://evernus.com - the ultimate multiplatform EVE trade tool + nullsec Alliance Market tool

William Kugisa
Perkone
Caldari State
#10 - 2016-03-03 18:38:32 UTC
Lucas Kell wrote:

I mean if nothing changes, the best method will still be smashing CREST at close to the maximum requests per second just from a single IP, which works and is achievable in a reasonable time period. I'd not do multiple IPs because that's getting around their limits which would be bad. Just seems crazy that the best way to get bulk data is in thousands and thousands of individual requests.

Well, my thought on that is: if n different tools/sites are going to hit CREST at the rate limit to get basically the same data every few hours, better that they cooperate and each use their "allowance" to get, say, different regions, then combine, so everyone has a more up-to-date set without hitting CREST with lots of duplicate crawling.
Lucas Kell
Solitude Trading
S.N.O.T.
#11 - 2016-03-04 07:46:47 UTC
William Kugisa wrote:
Lucas Kell wrote:

I mean if nothing changes, the best method will still be smashing CREST at close to the maximum requests per second just from a single IP, which works and is achievable in a reasonable time period. I'd not do multiple IPs because that's getting around their limits which would be bad. Just seems crazy that the best way to get bulk data is in thousands and thousands of individual requests.

Well, my thought on that is: if n different tools/sites are going to hit CREST at the rate limit to get basically the same data every few hours, better that they cooperate and each use their "allowance" to get, say, different regions, then combine, so everyone has a more up-to-date set without hitting CREST with lots of duplicate crawling.
Sure, but with this being EVE and all, there's no guarantee the data provided by all parties will be accurate. There are also differences in collection and storage methods, which make it more of a pain to combine, and the risk that one party would suddenly stop providing data, creating a gap. Like I say, from my perspective I'll just be getting the data straight from CCP one way or the other; it just seems it would be better for us and them if they bulk-fed it out too.

Pete Butcher wrote:
Lucas Kell wrote:
They use CREST. There's at least an 8-hour spread on EVE-Central data, and while there are no duplicates using quicklook, I doubt they'd like people hammering it for mass data either, which is why they provide bulk files.
Evernus uses eve-central as an alternative and they haven't complained (yet), so maybe it's not a problem.
Evernus fetches data on demand from clients, so it will be bursts of data from multiple IPs, which I imagine they are OK with. That's a bit different from one IP fetching their whole database through their API once or twice a day.

The Indecisive Noob - EVE fan blog.

Wholesale Trading - The new bulk trading mailing list.

Pete Butcher
The Scope
Gallente Federation
#12 - 2016-03-04 08:57:51 UTC  |  Edited by: Pete Butcher
Lucas Kell wrote:
Evernus fetches data on demand from clients, so it will be bursts of data from multiple IPs, which I imagine they are OK with. That's a bit different from one IP fetching their whole database through their API once or twice a day.


That's probably true. I wonder how many users have actually tried fetching the whole market from eve-central, and whether it works without errors. Time to check it out (sorry, eve-central, 836562 requests).

http://evernus.com - the ultimate multiplatform EVE trade tool + nullsec Alliance Market tool

Lucas Kell
Solitude Trading
S.N.O.T.
#13 - 2016-03-19 15:17:34 UTC
FOXFOURWHYDOYOUHATEME

The Indecisive Noob - EVE fan blog.

Wholesale Trading - The new bulk trading mailing list.

Cornbread Muffin
The Chosen - Holy Warriors of Bob the Unforgiving
#14 - 2016-03-25 23:22:29 UTC  |  Edited by: Cornbread Muffin
Ugh, yes, please. There's no reason at all why the market data API needs to be designed the way it is. Someone over there picked up a book that pours the REST/ROA kool-aid and this is what we ended up with.

The API doesn't fit the use case, which breaks the first rule of good API design. As a result, we all jump through hoops and their tech gets hammered by swarms of connection attempts. This leads to ever-increasing piles of error-correction code on our end and even more connection attempts on theirs.

This could be as simple as transferring one file per region (most people only want the 30ish high-sec regions anyway).
Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#15 - 2016-03-26 00:52:40 UTC
Cornbread Muffin wrote:
Ugh, yes, please. There's no reason at all why the market data API needs to be designed the way it is. Someone over there picked up a book that pours the REST/ROA kool-aid and this is what we ended up with.

The API doesn't fit the use case, which breaks the first rule of good API design. As a result, we all jump through hoops and their tech gets hammered by swarms of connection attempts. This leads to ever-increasing piles of error-correction code on our end and even more connection attempts on theirs.

This could be as simple as transferring one file per region (most people only want the 30ish high-sec regions anyway).



Actually, it was designed this way for Dust. They then opened it up so we can get at it.

I'd suggest, if people want a firehose of market data, take a look at EMDR.

It doesn't let you request specific things, but it does pull most things, over time.

Long term, maybe we'll get a region's complete order book as a single download, though given the size it would be, it would have to be a slice in time rather than 'live' data.

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter

Pete Butcher
The Scope
Gallente Federation
#16 - 2016-03-26 06:47:06 UTC
Steve Ronuken wrote:
Actually, it was designed this way for Dust. They then opened it up so we can get at it.


That further confirms that no one really thought about Eve use cases.

http://evernus.com - the ultimate multiplatform EVE trade tool + nullsec Alliance Market tool

Cornbread Muffin
The Chosen - Holy Warriors of Bob the Unforgiving
#17 - 2016-03-26 08:49:33 UTC  |  Edited by: Cornbread Muffin
Pete Butcher wrote:
That further confirms no one really thought about Eve use cases.

I take it as a (potential) good thing. Not that having a Dust-appropriate API means they can't also have an EVE-appropriate API, but at least there's some operational reason for things to be how they are. That's a superior alternative to "this is how we think an EVE-appropriate API should function". Smile


Reasonably up-to-date snapshots would be sufficient for many uses, and a bulk data endpoint could exist alongside the existing resources. File size and the bandwidth they want to support are a legitimate issue, but bandwidth usage could be kept the same and service levels massively improved with a bulk data endpoint.

The last snapshot I was able to get covered the 23 empire regions and 10,375 items (I took out stuff like Dust items from my requests, but that's most of the items in the game). That's 477,250 calls to the API. The response headers alone were 293MB and the payload (compressed on the wire) was 97.3MB. It took a little under 2 hours, so it transferred at ~0.5Mbps overall. This was at ~70 rq/s, roughly half the rate limit. I assume that's the bulk of the orders that exist in the entire game, but I've never requested null-sec markets so I don't know that for sure.
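Those figures are internally consistent. A quick sanity check using only the numbers from this post:

```python
# Sanity check on the snapshot figures above: 23 regions x 10,375 types,
# buy + sell per pair, crawled at ~70 req/s. All inputs come from the post.

def crawl_calls(regions, types, sides=2):
    """One request per (region, type, buy/sell) combination."""
    return regions * types * sides

def crawl_hours(calls, rate_per_sec):
    return calls / rate_per_sec / 3600.0

# 23 * 10375 * 2 = 477,250 calls; at 70 req/s that's a little under 2 hours.
```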

The same data gzipped as one file is 56.1MB, 10.5MB of which is The Forge. If I were adding a bulk data endpoint I'd pull out a lot of the redundant data (e.g. return nothing at all for itemTypeIds that have no orders, remove most of the bulky URLs), but even if the data were identical, it would still be about 1/8th the size of what they're sending me now, most of which is wasted on headers. I could get the same market data at the same ~0.5Mbps every ~15 minutes and it wouldn't take any more data transfer on their part. That's a worthwhile improvement IMO.
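The transfer comparison reduces to one ratio; all three figures below are the ones reported above:

```python
# Per-request traffic (headers + compressed payload) versus one gzipped
# bulk file, using the megabyte figures reported in the post above.

def transfer_ratio(header_mb, payload_mb, bulk_mb):
    """How many times larger the per-request transfer is than the bulk file."""
    return (header_mb + payload_mb) / bulk_mb

# (293 + 97.3) / 56.1 is roughly 7x: the single file is on the order of
# an eighth of the current traffic, most of the savings being headers.
```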
William Kugisa
Perkone
Caldari State
#18 - 2016-03-26 10:59:38 UTC
There's also a lot in those resources that people wanting large amounts of data don't want, so stripping it down to a minimal CSV results in far less data, and less processing for CCP if those strings live in a different table from the market orders.

As for bulk dumps, I'd be perfectly happy with such a dump even just once a week, as seed data, or crawling the current API just once, if there were then a global transaction log I could poll, say, once a minute to sync my data (so again, drastically cheaper than trying to poll thousands of market endpoints).

Such a log would just have a set of create/update/delete entries (similar to how master-slave databases sync); in theory it could detail each actual market transaction, or, as now, only show the changed order, leaving the rest to estimation.
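That sync model is easy to sketch: seed from a dump, then replay the polled log. The event shape below (op / order_id / order) is entirely an assumption, since CREST offers no such log:

```python
# Sketch of applying a polled change log of create/update/delete events
# to a locally seeded order book. The event shape is hypothetical; no
# such CREST log exists today.

def apply_log(book, events):
    """Apply change-log events to a local {order_id: order} book in place."""
    for ev in events:
        if ev["op"] in ("create", "update"):
            book[ev["order_id"]] = ev["order"]
        elif ev["op"] == "delete":
            book.pop(ev["order_id"], None)
    return book
```

One poll per minute replaces thousands of per-type endpoint polls, exactly as the post argues.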
Pete Butcher
The Scope
Gallente Federation
#19 - 2016-03-26 13:22:48 UTC
Cornbread Muffin wrote:

I take it as a (potential) good thing. Not that having a Dust-appropriate API means they can't also have an EVE-appropriate API, but at least there's some operational reason for things to be how they are. That's a superior alternative to "this is how we think an EVE-appropriate API should function". Smile


I disagree. This is just a bad excuse - given that the Dust market works like Eve's, the same issues still apply there. Bad design is still bad design; inconsistencies are still inconsistencies; unreliability is still unreliability. Saying it was metaphorically copy-pasted to Eve is just another argument that nobody thought CREST through.

http://evernus.com - the ultimate multiplatform EVE trade tool + nullsec Alliance Market tool

Steve Ronuken
Fuzzwork Enterprises
Vote Steve Ronuken for CSM
#20 - 2016-03-26 15:52:54 UTC
I have asked for snapshots of the data to be created.

Development time is, of course, an issue.

Woo! CSM XI!

Fuzzwork Enterprises

Twitter: @fuzzysteve on Twitter
